Cross-Lingual Transfer Learning for Statistical Type Inference

Zhiming Li, Xiaofei Xie, Haoliang Li, Zhengzi Xu, Yi Li, Yang Liu
Jul 2021
Hitherto, statistical type inference systems have relied entirely on supervised learning approaches, which require laborious manual effort to collect and label large amounts of data. Most Turing-complete imperative languages share similar control- and data-flow structures, which makes it possible to transfer knowledge learned from one language to another. In this paper, we propose a cross-lingual transfer learning framework, PLATO, for statistical type inference, which allows us to leverage prior knowledge learned from the labeled dataset of one language and transfer it to others, e.g., Python to JavaScript, Java to JavaScript, etc. PLATO is powered by a novel kernelized attention mechanism that constrains the attention scope of the backbone Transformer model, such that the model is forced to base its predictions on features commonly shared among languages. In addition, we propose a syntax enhancement that augments learning on the feature overlap among language domains. Furthermore, PLATO can also be used to improve the performance of conventional supervised type inference by introducing cross-language augmentation, which enables the model to learn more general features across multiple languages. We evaluated PLATO under two settings: 1) under the cross-domain scenario, in which the target language data is unlabeled or only partially labeled, the results show that PLATO outperforms state-of-the-art domain transfer techniques by a large margin, e.g., it improves the Python to TypeScript baseline by +14.6%@EM and +18.6%@weighted-F1; and 2) under the conventional monolingual supervised scenario, PLATO improves the Python baseline by +4.10%@EM and +1.90%@weighted-F1 with the introduction of cross-lingual augmentation.
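The results above are reported as exact match (EM) and weighted F1. As a rough illustration of how such scores are commonly computed for type prediction, the following minimal sketch assumes the standard definitions: EM as the fraction of predictions identical to the ground-truth type, and weighted F1 as the per-type F1 averaged with weights proportional to each type's frequency in the ground truth. The function names and the toy labels are hypothetical, and the abstract does not confirm that the evaluation uses exactly these definitions.

```python
from collections import Counter


def exact_match(y_true, y_pred):
    """Fraction of predictions that equal the ground-truth type (assumed EM definition)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-type F1 averaged with weights equal to each type's share of y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)      # how often each type occurs in the ground truth
    predicted = Counter(y_pred)    # how often each type is predicted
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy, made-up labels purely for illustration (not data from the paper).
y_true = ["int", "int", "str", "bool"]
y_pred = ["int", "str", "str", "bool"]
em = exact_match(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent types dominate the score, which is one reason the two metrics can move by different amounts, as in the figures quoted above.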