Extended Transformer Construction (ETC)
Extended Transformer Construction, or ETC, is an extension of the Transformer architecture with a new attention mechanism that extends the original in two main ways: (1) it allows scaling the input length up from 512 tokens to several thousand; and (2) it can ingest structured inputs instead of just linear sequences. The key ideas that enable this are a new global-local attention mechanism coupled with relative position encodings. ETC also allows lifting weights from existing BERT models, saving computational resources during training.
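A minimal NumPy sketch of the global-local attention pattern described above: a small set of global tokens attends to everything, while long-input tokens attend only to the global tokens plus a local sliding window. The function names and the fixed window radius here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def etc_attention_mask(n_global: int, n_long: int, radius: int) -> np.ndarray:
    """Boolean attention mask for ETC-style global-local attention.

    Token order: [global tokens | long-input tokens]. Global tokens
    attend to every token; long tokens attend to all global tokens
    plus long tokens within `radius` positions. mask[i, j] == True
    means query i may attend to key j.
    """
    n = n_global + n_long
    mask = np.zeros((n, n), dtype=bool)
    mask[:n_global, :] = True          # global-to-all (g2g and g2l)
    mask[n_global:, :n_global] = True  # long-to-global (l2g)
    for i in range(n_long):            # local sliding window (l2l)
        lo = max(0, i - radius)
        hi = min(n_long, i + radius + 1)
        mask[n_global + i, n_global + lo:n_global + hi] = True
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted by the boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)   # block disallowed pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 2 global tokens, 6 long tokens, local radius 1.
m = etc_attention_mask(2, 6, 1)
```

Because each long token attends to only `n_global + 2 * radius + 1` keys rather than all of them, the cost of the long-to-long attention grows linearly with input length instead of quadratically, which is what makes inputs of several thousand tokens practical.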