DUAL TRANSFORMER ENCODERS FOR SESSION-BASED RECOMMENDATION
DOI: https://doi.org/10.15625/1813-9663/37/4/15844
Keywords: Recommender systems, Session-based recommendation, Self-attention, Dual Transformer
Abstract
When long-term user profiles are not available, session-based recommendation methods are used to predict the user's next actions from anonymous session-based data. Recent advances in session-based recommendation highlight the need to model not only the user's sequential behavior but also the user's main interest in a session, while avoiding the effect of unintended clicks that cause the user's interest to drift. In this work, we propose a Dual Transformer Encoder Recommendation model (DTER) to address this requirement. The idea is to combine two components: (1) a Transformer-based model with dual encoders capable of modeling both sequential patterns and the main interest of the user in a session; (2) a new recommendation model designed to learn richer session contexts by conditioning on all permutations of the session prefix. This approach provides a unified framework that leverages the Transformer's self-attention mechanism for modeling session sequences while taking into account the user's main interest in the session. We empirically evaluate the proposed method on two benchmark datasets. The results show that DTER outperforms state-of-the-art session-based recommendation methods on common evaluation metrics.
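To make the phrase "conditioning on all permutations of the session prefix" concrete, the following is a minimal sketch of how training pairs could be built from one anonymous session. It is an illustration under stated assumptions, not the authors' implementation: the function names `permuted_orders` and `training_examples`, the `max_orders` sampling cap, and the item identifiers are all hypothetical, and the abstract does not say whether permutations are enumerated, sampled, or realized through attention masks.

```python
import itertools
import math
import random


def permuted_orders(n, max_orders=None, seed=0):
    """Return orderings of positions 0..n-1: all n! of them, or a random subset.

    Assumption: capping with max_orders is our own addition, since n! grows
    too quickly to enumerate for long prefixes.
    """
    total = math.factorial(n)
    if max_orders is None or max_orders >= total:
        return [list(p) for p in itertools.permutations(range(n))]
    rng = random.Random(seed)
    seen = set()
    while len(seen) < max_orders:
        order = list(range(n))
        rng.shuffle(order)
        seen.add(tuple(order))  # positions are unique, so orders are distinct
    return [list(o) for o in sorted(seen)]


def training_examples(session, max_orders=None):
    """Pair every reordering of each session prefix with the next clicked item."""
    examples = []
    for t in range(1, len(session)):
        prefix, target = session[:t], session[t]
        for order in permuted_orders(len(prefix), max_orders):
            # Permute positions rather than values so repeated clicks are kept.
            examples.append(([prefix[i] for i in order], target))
    return examples


if __name__ == "__main__":
    # Hypothetical item ids for a three-click session.
    for context, target in training_examples(["i1", "i2", "i3"]):
        print(context, "->", target)
```

Permuting positions instead of item values keeps sessions with repeated clicks well defined. A real model would more likely sample a few orders per prefix than enumerate all of them.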
Pham Hoang Anh, Ngo Xuan Bach, Tu Minh Phuong
License
1. We hereby assign copyright of our article (the Work) in all forms of media, whether now known or hereafter developed, to the Journal of Computer Science and Cybernetics. We understand that the Journal of Computer Science and Cybernetics will act on my/our behalf to publish, reproduce, distribute and transmit the Work.
2. This assignment of copyright to the Journal of Computer Science and Cybernetics is done so on the understanding that permission from the Journal of Computer Science and Cybernetics is not required for me/us to reproduce, republish or distribute copies of the Work in whole or in part. We will ensure that all such copies carry a notice of copyright ownership and reference to the original journal publication.
3. We warrant that the Work is our results and has not been published before in its current or a substantially similar form and is not under consideration for another publication, does not contain any unlawful statements and does not infringe any existing copyright.
4. We also warrant that we have obtained the necessary permission from the copyright holder/s to reproduce in the article any materials including tables, diagrams or photographs not owned by me/us.