Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ri Su

SCAR: A Characterization Scheme for Multi-Modal Dataset

Aug 27, 2025

Ri Su, Zhao Chen, Caleb Chen Cao, Nan Tang, Lei Chen

Abstract:Foundation models exhibit remarkable generalization across diverse tasks, largely driven by the characteristics of their training data. Recent data-centric methods like pruning and compression aim to optimize training but offer limited theoretical insight into how data properties affect generalization, especially the data characteristics in sample scaling. Traditional perspectives further constrain progress by focusing predominantly on data quantity and training efficiency, often overlooking structural aspects of data quality. In this study, we introduce SCAR, a principled scheme for characterizing the intrinsic structural properties of datasets across four key measures: Scale, Coverage, Authenticity, and Richness. Unlike prior data-centric measures, SCAR captures stable characteristics that remain invariant under dataset scaling, providing a robust and general foundation for data understanding. Leveraging these structural properties, we introduce Foundation Data-a minimal subset that preserves the generalization behavior of the full dataset without requiring model-specific retraining. We model single-modality tasks as step functions and estimate the distribution of the foundation data size to capture step-wise generalization bias across modalities in the target multi-modal dataset. Finally, we develop a SCAR-guided data completion strategy based on this generalization bias, which enables efficient, modality-aware expansion of modality-specific characteristics in multimodal datasets. Experiments across diverse multi-modal datasets and model architectures validate the effectiveness of SCAR in predicting data utility and guiding data acquisition. Code is available at https://github.com/McAloma/SCAR.

* 6 pages, 3 figures

Via

Access Paper or Ask Questions

DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning

Jul 24, 2023

Menglin Kong, Ri Su, Shaojie Zhao, Muzhou Hou

Figure 1 for DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning

Figure 2 for DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning

Figure 3 for DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning

Figure 4 for DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning

Abstract:Recommendation system algorithm based on multi-task learning (MTL) is the major method for Internet operators to understand users and predict their behaviors in the multi-behavior scenario of platform. Task correlation is an important consideration of MTL goals, traditional models use shared-bottom models and gating experts to realize shared representation learning and information differentiation. However, The relationship between real-world tasks is often more complex than existing methods do not handle properly sharing information. In this paper, we propose an Different Expression Parallel Heterogeneous Network (DEPHN) to model multiple tasks simultaneously. DEPHN constructs the experts at the bottom of the model by using different feature interaction methods to improve the generalization ability of the shared information flow. In view of the model's differentiating ability for different task information flows, DEPHN uses feature explicit mapping and virtual gradient coefficient for expert gating during the training process, and adaptively adjusts the learning intensity of the gated unit by considering the difference of gating values and task correlation. Extensive experiments on artificial and real-world datasets demonstrate that our proposed method can capture task correlation in complex situations and achieve better performance than baseline models\footnote{Accepted in IJCNN2023}.

Via

Access Paper or Ask Questions

FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Jul 24, 2023

Menglin Kong, Shaojie Zhao, Juan Cheng, Xingquan Li, Ri Su, Muzhou Hou, Cong Cao

Figure 1 for FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Figure 2 for FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Figure 3 for FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Figure 4 for FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Abstract:There are two fundamental problems in applying deep learning/machine learning methods to disease classification tasks, one is the insufficient number and poor quality of training samples; another one is how to effectively fuse multiple source features and thus train robust classification models. To address these problems, inspired by the process of human learning knowledge, we propose the Feature-aware Fusion Correlation Neural Network (FaFCNN), which introduces a feature-aware interaction module and a feature alignment module based on domain adversarial learning. This is a general framework for disease classification, and FaFCNN improves the way existing methods obtain sample correlation features. The experimental results show that training using augmented features obtained by pre-training gradient boosting decision tree yields more performance gains than random-forest based methods. On the low-quality dataset with a large amount of missing data in our setup, FaFCNN obtains a consistently optimal performance compared to competitive baselines. In addition, extensive experiments demonstrate the robustness of the proposed method and the effectiveness of each component of the model\footnote{Accepted in IEEE SMC2023}.

Via

Access Paper or Ask Questions

DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

May 20, 2023

Menglin Kong, Muzhou Hou, Shaojie Zhao, Feng Liu, Ri Su, Yinghao Chen

Figure 1 for DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

Figure 2 for DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

Figure 3 for DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

Figure 4 for DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

Abstract:Click-Through Rate (CTR) prediction is one of the main tasks of the recommendation system, which is conducted by a user for different items to give the recommendation results. Cross-domain CTR prediction models have been proposed to overcome problems of data sparsity, long tail distribution of user-item interactions, and cold start of items or users. In order to make knowledge transfer from source domain to target domain more smoothly, an innovative deep learning cross-domain CTR prediction model, Domain Adversarial Deep Interest Network (DADIN) is proposed to convert the cross-domain recommendation task into a domain adaptation problem. The joint distribution alignment of two domains is innovatively realized by introducing domain agnostic layers and specially designed loss, and optimized together with CTR prediction loss in a way of adversarial training. It is found that the Area Under Curve (AUC) of DADIN is 0.08% higher than the most competitive baseline on Huawei dataset and is 0.71% higher than its competitors on Amazon dataset, achieving the state-of-the-art results on the basis of the evaluation of this model performance on two real datasets. The ablation study shows that by introducing adversarial method, this model has respectively led to the AUC improvements of 2.34% on Huawei dataset and 16.67% on Amazon dataset.

Via

Access Paper or Ask Questions

PHN: Parallel heterogeneous network with soft gating for CTR prediction

Jun 18, 2022

Ri Su, Alphonse Houssou Hounye, Cong Cao, Muzhou Hou

Figure 1 for PHN: Parallel heterogeneous network with soft gating for CTR prediction

Figure 2 for PHN: Parallel heterogeneous network with soft gating for CTR prediction

Figure 3 for PHN: Parallel heterogeneous network with soft gating for CTR prediction

Figure 4 for PHN: Parallel heterogeneous network with soft gating for CTR prediction

Abstract:The Click-though Rate (CTR) prediction task is a basic task in recommendation system. Most of the previous researches of CTR models built based on Wide \& deep structure and gradually evolved into parallel structures with different modules. However, the simple accumulation of parallel structures can lead to higher structural complexity and longer training time. Based on the Sigmoid activation function of output layer, the linear addition activation value of parallel structures in the training process is easy to make the samples fall into the weak gradient interval, resulting in the phenomenon of weak gradient, and reducing the effectiveness of training. To this end, this paper proposes a Parallel Heterogeneous Network (PHN) model, which constructs a network with parallel structure through three different interaction analysis methods, and uses Soft Selection Gating (SSG) to feature heterogeneous data with different structure. Finally, residual link with trainable parameters are used in the network to mitigate the influence of weak gradient phenomenon. Furthermore, we demonstrate the effectiveness of PHN in a large number of comparative experiments, and visualize the performance of the model in training process and structure.

Via

Access Paper or Ask Questions