Xiao Lv

MARM: Unlocking the Future of Recommendation Systems through Memory Augmentation and Scalable Complexity

Nov 14, 2024

Xiao Lv, Jiangxia Cao, Shijie Guan, Xiaoyou Zhou, Zhiguang Qi, Yaqiang Zang, Ming Li, Ben Wang, Kun Gai, Guorui Zhou

Abstract: Scaling laws have guided language model design in recent years. However, the scaling laws of NLP cannot be directly applied to RecSys, for the following reasons: (1) The amount of training samples and model parameters is typically not the bottleneck for the model. Our recommendation system can generate over 50 billion user samples daily, and such a massive amount of training data can easily allow our model parameters to exceed 200 billion, surpassing many LLMs (about 100B). (2) To ensure the stability and robustness of the recommendation system, it is essential to carefully control computational complexity (FLOPs). Given these differences from LLMs, we conclude that, for a RecSys model, computational complexity (FLOPs) is a more expensive factor than model parameters and requires careful control. In this paper, we propose our milestone work, MARM (Memory Augmented Recommendation Model), which successfully explores a new cache scaling law.

* Work in progress

Inferring Tabular Analysis Metadata by Infusing Distribution and Knowledge Information

Sep 02, 2022

Xinyi He, Mengyu Zhou, Jialiang Xu, Xiao Lv, Tianle Li, Yijia Shao, Shi Han, Zejian Yuan, Dongmei Zhang

Abstract: Many data analysis tasks rely heavily on a deep understanding of tables (multi-dimensional data). Across these tasks, there exist commonly used metadata attributes of table fields / columns. In this paper, we identify four such analysis metadata: measure/dimension dichotomy, common field roles, semantic field type, and default aggregation function. However, inferring these metadata faces challenges of insufficient supervision signals, utilizing existing knowledge, and understanding distribution. To infer these metadata for a raw table, we propose our multi-tasking Metadata model, which fuses field distribution and knowledge graph information into pre-trained tabular models. For model training and evaluation, we collect a large corpus (~582k tables from private spreadsheet and public tabular datasets) of analysis metadata by using diverse smart supervisions from downstream tasks. Our best model has accuracy = 98%, hit rate at top-1 > 67%, accuracy > 80%, and accuracy = 88% for the four analysis metadata inference tasks, respectively. It outperforms a series of baselines that are based on rules, traditional machine learning methods, and pre-trained tabular models. Analysis metadata models are deployed in a popular data analysis product, helping downstream intelligent features such as insights mining, chart / pivot table recommendation, and natural language QA...

* 13 pages, 7 figures, 9 tables
