Yong Bai

BBQRec: Behavior-Bind Quantization for Multi-Modal Sequential Recommendation

Apr 09, 2025

Kaiyuan Li, Rui Xiang, Yong Bai, Yongxiang Tang, Yanhua Cheng, Xialong Liu, Peng Jiang, Kun Gai

Abstract: Multi-modal sequential recommendation systems leverage auxiliary signals (e.g., text, images) to alleviate data sparsity in user-item interactions. While recent methods exploit large language models to encode modalities into discrete semantic IDs for autoregressive prediction, we identify two critical limitations: (1) existing approaches adopt fragmented quantization, where modalities are independently mapped to semantic spaces misaligned with behavioral objectives, and (2) over-reliance on semantic IDs disrupts inter-modal semantic coherence, thereby weakening the expressive power of multi-modal representations for modeling diverse user preferences. To address these challenges, we propose Behavior-Bind multi-modal Quantization for Sequential Recommendation (BBQRec), featuring dual-aligned quantization and semantics-aware sequence modeling. First, our behavior-semantic alignment module disentangles modality-agnostic behavioral patterns from noisy modality-specific features through contrastive codebook learning, ensuring semantic IDs are inherently tied to recommendation tasks. Second, we design a discretized similarity reweighting mechanism that dynamically adjusts self-attention scores using quantized semantic relationships, preserving multi-modal synergies while avoiding invasive modifications to the sequence modeling architecture. Extensive evaluations across four real-world benchmarks demonstrate BBQRec's superiority over state-of-the-art baselines.
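
The reweighting idea in the abstract can be pictured with a minimal sketch. This is not BBQRec's implementation: the agreement-based similarity, the additive bias on the logits, and the names `code_similarity`, `reweighted_attention`, and `alpha` are all assumptions made for illustration.

```python
import math

def code_similarity(a, b):
    """Fraction of quantization levels on which two semantic IDs agree.
    (Hypothetical similarity; the abstract does not define one.)"""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def reweighted_attention(logits, semantic_ids, alpha=1.0):
    """Add a similarity bias to raw attention logits, then apply a row-wise softmax.

    logits:       n x n list of raw attention scores
    semantic_ids: one tuple of codebook indices per sequence position
    alpha:        assumed strength of the semantic bias
    """
    n = len(logits)
    weights = []
    for i in range(n):
        row = [
            logits[i][j] + alpha * code_similarity(semantic_ids[i], semantic_ids[j])
            for j in range(n)
        ]
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights

# Items 0 and 1 share both codes, item 2 shares none with them.
ids = [(1, 2), (1, 2), (3, 4)]
attn = reweighted_attention([[0.0] * 3 for _ in range(3)], ids)
print(attn[0])
```

Because the bias only changes the logits, the sequence model itself stays untouched, which is in the spirit of the "non-invasive" claim above.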

CHIME: A Compressive Framework for Holistic Interest Modeling

Apr 09, 2025

Yong Bai, Rui Xiang, Kaiyuan Li, Yongxiang Tang, Yanhua Cheng, Xialong Liu, Peng Jiang, Kun Gai

Abstract: Modeling holistic user interests is important for improving recommendation systems but is challenged by high computational cost and difficulty in handling diverse information with full behavior context. Existing search-based methods might lose critical signals during behavior selection. To overcome these limitations, we propose CHIME: A Compressive Framework for Holistic Interest Modeling. It uses adapted large language models to encode complete user behaviors with heterogeneous inputs. We introduce multi-granular contrastive learning objectives to capture both persistent and transient interest patterns and apply residual vector quantization to generate compact embeddings. CHIME demonstrates superior ranking performance across diverse datasets, establishing a robust solution for scalable holistic interest modeling in recommendation systems.
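
Residual vector quantization, mentioned in the abstract, can be sketched generically as follows. This is the textbook procedure with fixed toy codebooks, not CHIME's trained quantizer; the function names and the example codebooks are assumptions.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )

def rvq_encode(x, codebooks):
    """Quantize x stage by stage; each stage encodes the previous stage's residual."""
    residual = list(x)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes, residual

def rvq_decode(codes, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out

# Toy two-stage quantizer: a coarse codebook followed by a fine one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
codes, _ = rvq_encode([1.2, 0.8], codebooks)
print(codes, rvq_decode(codes, codebooks))
```

The embedding is then stored as a short list of codebook indices rather than a dense vector, which is where the compactness comes from.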

Group channel pruning and spatial attention distilling for object detection

Jun 02, 2023

Yun Chu, Pu Li, Yong Bai, Zhuhua Hu, Yongqing Chen, Jiafeng Lu

Abstract: Due to the over-parameterization of neural networks, many model compression methods based on pruning and quantization have emerged. They are remarkably effective at reducing the size, parameter count, and computational complexity of a model. However, most models compressed by such methods need the support of special hardware and software, which increases the deployment cost. Moreover, these methods are mainly used in classification tasks and are rarely applied directly to detection tasks. To address these issues, we introduce a three-stage model compression method for object detection networks: dynamic sparse training, group channel pruning, and spatial attention distilling. First, to select the unimportant channels in the network and maintain a good balance between sparsity and accuracy, we put forward a dynamic sparse training method that introduces a variable sparse rate, which changes as training progresses. Second, to reduce the effect of pruning on network accuracy, we propose a novel pruning method called group channel pruning. In particular, we divide the network into multiple groups according to the scales of the feature layers and the similarity of module structures in the network, and then use different pruning thresholds to prune the channels in each group. Finally, to recover the accuracy of the pruned network, we apply an improved knowledge distillation method. Specifically, we extract spatial attention information from the feature maps of specific scales in each group as knowledge for distillation. In the experiments, we use YOLOv4 as the object detection network and PASCAL VOC as the training dataset. Our method reduces the parameters of the model by 64.7% and the computation by 34.9%.

* Applied Intelligence (Appl Intell), 2022: 1-19
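
The group-wise thresholding step described in the abstract can be sketched as below. The sketch assumes each channel has a scalar importance score (for example a learned scaling factor) and that each group's threshold is derived from a per-group pruning ratio; neither detail is stated in the abstract, and the layer names, scores, and ratios are made up.

```python
def prune_by_group(scores, groups, prune_ratio):
    """Compute keep-masks using one pruning threshold per group.

    scores:      dict layer name -> list of channel importance scores (assumed)
    groups:      dict group name -> list of layer names in that group
    prune_ratio: dict group name -> fraction of the group's channels to prune
    Returns a dict layer name -> list of bool (True means the channel is kept).
    """
    masks = {}
    for group, layers in groups.items():
        pooled = sorted(abs(s) for layer in layers for s in scores[layer])
        k = int(len(pooled) * prune_ratio[group])
        # Channels scoring below the k-th smallest pooled score are pruned.
        threshold = pooled[k] if k < len(pooled) else float("inf")
        for layer in layers:
            keep = [abs(s) >= threshold for s in scores[layer]]
            if not any(keep):
                # Never empty a layer: retain its strongest channel.
                best = max(range(len(keep)), key=lambda i: abs(scores[layer][i]))
                keep[best] = True
            masks[layer] = keep
    return masks

scores = {
    "backbone.0": [0.9, 0.05, 0.4, 0.01],
    "head.0": [0.3, 0.2, 0.8, 0.6],
}
groups = {"backbone": ["backbone.0"], "head": ["head.0"]}
masks = prune_by_group(scores, groups, {"backbone": 0.5, "head": 0.25})
print(masks)
```

Using a separate threshold per group lets sensitive parts of the network be pruned more gently than redundant ones, instead of applying one global cut.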

Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach

May 22, 2023

Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, Xiang-Rong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu (+1 more)

Abstract: Conversion rate (CVR) prediction is one of the core components in online recommender systems, and various approaches have been proposed to obtain accurate and well-calibrated CVR estimation. However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions. This can be largely ascribed to data distribution shift, under which conventional methods no longer work. To this end, we seek to develop alternative modeling techniques for CVR prediction. Observing similar purchase patterns across different promotions, we propose reusing historical promotion data to capture promotional conversion patterns. Specifically, we propose a novel Historical Data Reuse (HDR) approach that first retrieves historically similar promotion data and then fine-tunes the CVR prediction model with the acquired data for better adaptation to the promotion mode. HDR consists of three components: an automated data retrieval module that seeks similar data from historical promotions, a distribution shift correction module that re-weights the retrieved data to better align with the target promotion, and a TransBlock module that quickly fine-tunes the original model for better adaptation to the promotion mode. Experiments conducted with real-world data demonstrate the effectiveness of HDR, as it improves both ranking and calibration metrics to a large extent. HDR has also been deployed on the display advertising system in Alibaba, bringing a 9% lift in RPM and a 16% lift in CVR during Double 11 Sales in 2022.

* Accepted at KDD 2023 (camera-ready version coming soon). This work has already been deployed on the display advertising system in Alibaba, bringing substantial economic gains

Adapting to Online Label Shift with Provable Guarantees

Jul 05, 2022

Yong Bai, Yu-Jie Zhang, Peng Zhao, Masashi Sugiyama, Zhi-Hua Zhou

Abstract: The standard supervised learning paradigm works effectively when training data shares the same distribution as the upcoming testing samples. However, this assumption is often violated in real-world applications, especially when testing data appear in an online fashion. In this paper, we formulate and investigate the problem of online label shift (OLaS): the learner trains an initial model from the labeled offline data and then deploys it to an unlabeled online environment where the underlying label distribution changes over time but the label-conditional density does not. The non-stationary nature of the environment and the lack of supervision make the problem challenging to tackle. To address this difficulty, we construct a new unbiased risk estimator that utilizes the unlabeled data and exhibits many benign properties, albeit with potential non-convexity. Building upon that, we propose novel online ensemble algorithms to deal with the non-stationarity of the environments. Our approach enjoys optimal dynamic regret, indicating that the performance is competitive with a clairvoyant who knows the online environments in hindsight and then chooses the best decision for each round. The obtained dynamic regret bound scales with the intensity and pattern of label distribution shift, hence exhibiting adaptivity in the OLaS problem. Extensive experiments are conducted to validate the effectiveness of our approach and support our theoretical findings.
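
To make the label shift assumption concrete, here is a minimal two-class sketch of one standard way to recover a label distribution from unlabeled predictions: inverting the classifier's confusion matrix measured on labeled offline data. The abstract does not say that the paper's risk estimator is built this way, so treat the technique, the function names, and the numbers as assumptions.

```python
def prediction_frequencies(preds, num_classes=2):
    """Empirical distribution of predicted labels in one unlabeled batch."""
    counts = [0] * num_classes
    for p in preds:
        counts[p] += 1
    return [c / len(preds) for c in counts]

def estimate_label_prior(confusion, pred_freq):
    """Solve confusion @ prior = pred_freq for two classes.

    confusion[i][j] = P(predict i | true label j), measured offline.
    Under label shift this matrix stays fixed while the prior moves.
    """
    (c00, c01), (c10, c11) = confusion
    det = c00 * c11 - c01 * c10
    if abs(det) < 1e-12:
        raise ValueError("confusion matrix is singular; classifier is uninformative")
    p0 = (c11 * pred_freq[0] - c01 * pred_freq[1]) / det  # Cramer's rule
    p0 = min(1.0, max(0.0, p0))  # clip to a valid probability
    return [p0, 1.0 - p0]

# Offline confusion matrix and one round of online predictions (made-up numbers).
confusion = [[0.9, 0.2],
             [0.1, 0.8]]
preds = [0] * 41 + [1] * 59
prior = estimate_label_prior(confusion, prediction_frequencies(preds))
print(prior)
```

With the label distribution for a round in hand, offline losses can be reweighted class by class, which is the general route by which unlabeled data can inform a risk estimate in this setting.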

Fake Generated Painting Detection via Frequency Analysis

Mar 05, 2020

Yong Bai, Yuanfang Guo, Jinjie Wei, Lin Lu, Rui Wang, Yunhong Wang

Abstract: With the development of deep neural networks, digital fake paintings can be generated by various style transfer algorithms. To detect the fake generated paintings, we analyze the fake generated and real paintings in the Fourier frequency domain and observe statistical differences and artifacts. Based on our observations, we propose Fake Generated Painting Detection via Frequency Analysis (FGPD-FA), which extracts three types of features in the frequency domain. We also introduce a digital fake painting detection database for assessing the proposed method. Experimental results demonstrate the effectiveness of the proposed method under different testing conditions.

* 5 pages, 6 figures
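
As a rough illustration of what a frequency-domain statistic looks like, the sketch below computes the share of spectral energy at high spatial frequencies using a naive 2D discrete Fourier transform. The abstract does not specify FGPD-FA's three feature types, so this ratio, the `cutoff` parameter, and the function names are hypothetical; the naive transform is only practical for tiny inputs.

```python
import cmath

def dft2_power(img):
    """Power spectrum |F(u, v)|^2 of a small grayscale image via a naive 2D DFT."""
    n, m = len(img), len(img[0])
    power = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            acc = 0j
            for x in range(n):
                for y in range(m):
                    angle = -2j * cmath.pi * (u * x / n + v * y / m)
                    acc += img[x][y] * cmath.exp(angle)
            power[u][v] = abs(acc) ** 2
    return power

def high_frequency_ratio(img, cutoff=0.25):
    """Fraction of spectral energy at or above `cutoff` cycles per pixel."""
    power = dft2_power(img)
    n, m = len(power), len(power[0])
    total = high = 0.0
    for u in range(n):
        for v in range(m):
            fu = min(u, n - u) / n  # wrapped frequency along rows, at most 0.5
            fv = min(v, m - v) / m  # wrapped frequency along columns
            total += power[u][v]
            if max(fu, fv) >= cutoff:
                high += power[u][v]
    return high / total if total > 0 else 0.0

flat = [[1.0] * 4 for _ in range(4)]                                # smooth patch
checker = [[(-1.0) ** (x + y) for y in range(4)] for x in range(4)]  # finest texture
print(high_frequency_ratio(flat), high_frequency_ratio(checker))
```

A smooth patch keeps nearly all of its energy at zero frequency, while a fine checkerboard pushes it to the highest frequency; periodic generation artifacts would show up as unusual energy somewhere in between.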

Recognition of Pyralidae Insects Using Intelligent Monitoring Autonomous Robot Vehicle in Natural Farm Scene

Mar 26, 2019

Boyi Liu, Zhuhua Hu, Yaochi Zhao, Yong Bai, Yu Wang

Abstract: Pyralidae pests, such as the corn borer and the rice leaf roller, are major pests of economic crops. The timely detection and identification of Pyralidae pests is a critical task for agriculturists and farmers. However, traditional identification of pests by humans is labor intensive and inefficient. To tackle these challenges, this paper presents a pest-monitoring autonomous robot vehicle and a method to recognize Pyralidae pests. First, the robot on the autonomous vehicle collects images with its camera in natural farm scenes. Second, the total probability image is obtained using inverse histogram mapping, and the object contours of Pyralidae pests are then extracted quickly and accurately with the constrained Otsu method. Finally, by employing Hu moments together with perimeter and area characteristics, the correct object contours can be drawn, and the recognition results are obtained by comparing them with reference templates of Pyralidae pests. Additionally, the moving speed of the mechanical arms on the vehicle can be adjusted adaptively by interacting with the recognition algorithm. The experimental results demonstrate that the robot vehicle can automatically capture pest images and achieves 94.3% recognition accuracy in natural farm planting scenes.
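
The thresholding step can be illustrated with the plain Otsu method, which picks the gray level that maximizes the between-class variance of the histogram. This is a minimal sketch of the standard algorithm only: the paper's constrained variant, the inverse histogram mapping, and the Hu-moment matching are not reproduced, and the pixel values are made up.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class so far
    sum0 = 0  # sum of gray levels in the low class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Toy image: dark background at level 10, bright object at level 200.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks object pixels
print(t, sum(mask))
```

The resulting binary mask is what a contour extractor would then trace before any shape descriptors are compared against templates.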
