Transformer

A Transformer is a model architecture that eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. Before Transformers, the dominant sequence transduction models were based on complex recurrent or convolutional neural networks that included an encoder and a decoder. The Transformer also employs an encoder and a decoder, but removing recurrence in favor of attention mechanisms allows for significantly more parallelization than methods such as RNNs and CNNs.
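As a rough illustration of how attention captures dependencies between all positions in a single matrix operation (rather than step-by-step recurrence), here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the Transformer. The function name, toy dimensions, and random inputs are illustrative assumptions, not part of this entry.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q: (seq_len_q, d_k), K: (seq_len_k, d_k), V: (seq_len_k, d_v).
    Every query position attends to every key position at once, so
    long-range dependencies need no sequential recurrence and the
    whole computation parallelizes across positions.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # (seq_len_q, seq_len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # (seq_len_q, d_v)

# Toy self-attention example: 4 token positions, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

In the full architecture this operation is applied in parallel over several heads (multi-head attention) and stacked with feed-forward layers in both the encoder and the decoder.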
Related disciplines: BERT, ALBERT, Systems and Control, Hilbert Transform, Lambert, Electric, NLP, Herbert, RoBERTa, Machine Translation

Key Scholars

Yoshua Bengio: 429,868 citations, 1,063 papers
Georg Kresse: 234,910 citations, 479 papers
Albert-László Barabási: 214,997 citations, 510 papers
Andrew Zisserman: 195,560 citations, 885 papers
Yann LeCun: 175,383 citations, 366 papers
Ilya Sutskever: 165,856 citations, 113 papers
Michael I. Jordan: 150,356 citations, 1,056 papers
Peter M. Bentler: 145,839 citations, 408 papers
Herbert A. Simon: 145,136 citations, 873 papers
Terrence J. Sejnowski: 134,448 citations, 931 papers