
Efficient On-Device Session-Based Recommendation

Xin Xia, Junliang Yu, Qinyong Wang, Chaoqun Yang, Quoc Viet Hung Nguyen, Hongzhi Yin
Sep 2022
On-device session-based recommendation systems have been attracting increasing attention on account of their low energy/resource consumption and privacy protection while providing promising recommendation performance. To fit powerful neural session-based recommendation models into resource-constrained mobile devices, tensor-train decomposition and its variants have been widely applied to reduce the memory footprint by decomposing the embedding table into smaller tensors, showing great potential in compressing recommendation models. However, these model compression techniques significantly increase local inference time because of the complex process of generating index lists and the series of tensor multiplications needed to form item embeddings, so the resulting on-device recommender fails to provide real-time responses and recommendations. To improve online recommendation efficiency, we propose to learn compositional encoding-based compact item representations. Specifically, each item is represented by a compositional code that consists of several codewords, and we learn embedding vectors to represent each codeword instead of each item. The composition of the codeword embedding vectors from different embedding matrices (i.e., codebooks) then forms the item embedding. Since the codebooks can be extremely small, the recommender model fits in resource-constrained devices and can also store the codebooks for fast local inference. In addition, to prevent the loss of model capacity caused by compression, we propose a bidirectional self-supervised knowledge distillation framework. Extensive experimental results on two benchmark datasets demonstrate that, compared with existing methods, the proposed on-device recommender not only achieves an 8x inference speedup with a large compression ratio but also shows superior recommendation performance.
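
The compositional encoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the composition is an element-wise sum of codeword vectors (the abstract does not name the operator), and it assigns each item a fixed code from the base-K digits of its id, whereas the paper learns the codes. All function names and sizes here are hypothetical.

```python
import random


def make_codebooks(num_codebooks, codebook_size, dim, seed=0):
    """Create randomly initialised codebooks, each of shape codebook_size x dim."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(codebook_size)]
        for _ in range(num_codebooks)
    ]


def item_code(item_id, num_codebooks, codebook_size):
    """Hypothetical stand-in for a learned code: base-K digits of the item id."""
    code = []
    for _ in range(num_codebooks):
        code.append(item_id % codebook_size)
        item_id //= codebook_size
    return code


def item_embedding(code, codebooks):
    """Compose an item embedding by summing one codeword vector per codebook.

    Summation is an assumption made for this sketch.
    """
    dim = len(codebooks[0][0])
    emb = [0.0] * dim
    for book, word in zip(codebooks, code):
        for j in range(dim):
            emb[j] += book[word][j]
    return emb


def parameter_counts(num_items, num_codebooks, codebook_size, dim):
    """Return (full embedding table size, total codebook size) in parameters."""
    full = num_items * dim
    compact = num_codebooks * codebook_size * dim
    return full, compact


if __name__ == "__main__":
    books = make_codebooks(num_codebooks=3, codebook_size=4, dim=8)
    code = item_code(5, num_codebooks=3, codebook_size=4)
    print(code, item_embedding(code, books))
    # Illustrative sizes only, not taken from the paper.
    print(parameter_counts(num_items=40000, num_codebooks=4, codebook_size=32, dim=100))
```

In this sketch, forming an item embedding needs only a few table lookups and additions, which is the property the abstract contrasts with the chain of tensor multiplications in tensor-train approaches. The parameter count ignores the storage for the per-item codes themselves, which are small integers.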