Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Bin Qin

CellCLAT: Preserving Topology and Trimming Redundancy in Self-Supervised Cellular Contrastive Learning

May 27, 2025

Bin Qin, Qirui Ji, Jiangmeng Li, Yupeng Wang, Xuesong Wu, Jianwen Cao, Fanjiang Xu

Abstract:Self-supervised topological deep learning (TDL) represents a nascent but underexplored area with significant potential for modeling higher-order interactions in simplicial complexes and cellular complexes to derive representations of unlabeled graphs. Compared to simplicial complexes, cellular complexes exhibit greater expressive power. However, the advancement in self-supervised learning for cellular TDL is largely hindered by two core challenges: \textit{extrinsic structural constraints} inherent to cellular complexes, and intrinsic semantic redundancy in cellular representations. The first challenge highlights that traditional graph augmentation techniques may compromise the integrity of higher-order cellular interactions, while the second underscores that topological redundancy in cellular complexes potentially diminish task-relevant information. To address these issues, we introduce Cellular Complex Contrastive Learning with Adaptive Trimming (CellCLAT), a twofold framework designed to adhere to the combinatorial constraints of cellular complexes while mitigating informational redundancy. Specifically, we propose a parameter perturbation-based augmentation method that injects controlled noise into cellular interactions without altering the underlying cellular structures, thereby preserving cellular topology during contrastive learning. Additionally, a cellular trimming scheduler is employed to mask gradient contributions from task-irrelevant cells through a bi-level meta-learning approach, effectively removing redundant topological elements while maintaining critical higher-order semantics. We provide theoretical justification and empirical validation to demonstrate that CellCLAT achieves substantial improvements over existing self-supervised graph learning methods, marking a significant attempt in this domain.

Via

Access Paper or Ask Questions

Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation

May 24, 2025

Xiangyu Zhang, Fuming Fang, Peng Gao, Bin Qin, Beena Ahmed, Julien Epps

Abstract:The tokenization of speech with neural speech codec models is a crucial aspect of AI systems designed for speech understanding and generation. While text-based systems naturally benefit from token boundaries between discrete symbols, tokenizing continuous speech signals is more complex due to the unpredictable timing of important acoustic variations. Most current neural speech codecs typically address this by using uniform processing at fixed time intervals, which overlooks the varying information density inherent in speech. In this paper, we introduce a distinctive feature-based approach that dynamically allocates tokens based on the perceptual significance of speech content. By learning to identify and prioritize distinctive regions in speech signals, our approach achieves a significantly more efficient speech representation compared with conventional frame-based methods. This work marks the first successful extension of traditional signal processing-based distinctive features into deep learning frameworks. Through rigorous experimentation, we demonstrate the effectiveness of our approach and provide theoretical insights into how aligning segment boundaries with natural acoustic transitions improves codebook utilization. Additionally, we enhance tokenization stability by developing a Group-wise Scalar Quantization approach for variable-length segments. Our distinctive feature-based approach offers a promising alternative to conventional frame-based processing and advances interpretable representation learning in the modern deep learning speech processing framework.

Via

Access Paper or Ask Questions

Rethinking the Bias of Foundation Model under Long-tailed Distribution

Jan 27, 2025

Jiahao Chen, Bin Qin, Jiangmeng Li, Hao Chen, Bing Su

Abstract:Long-tailed learning has garnered increasing attention due to its practical significance. Among the various approaches, the fine-tuning paradigm has gained considerable interest with the advent of foundation models. However, most existing methods primarily focus on leveraging knowledge from these models, overlooking the inherent biases introduced by the imbalanced training data they rely on. In this paper, we examine how such imbalances from pre-training affect long-tailed downstream tasks. Specifically, we find the imbalance biases inherited in foundation models on downstream task as parameter imbalance and data imbalance. During fine-tuning, we observe that parameter imbalance plays a more critical role, while data imbalance can be mitigated using existing re-balancing strategies. Moreover, we find that parameter imbalance cannot be effectively addressed by current re-balancing techniques, such as adjusting the logits, during training, unlike data imbalance. To tackle both imbalances simultaneously, we build our method on causal learning and view the incomplete semantic factor as the confounder, which brings spurious correlations between input samples and labels. To resolve the negative effects of this, we propose a novel backdoor adjustment method that learns the true causal effect between input samples and labels, rather than merely fitting the correlations in the data. Notably, we achieve an average performance increase of about $1.67\%$ on each dataset.

Via

Access Paper or Ask Questions

Retrieval-Augmented Generation by Evidence Retroactivity in LLMs

Jan 07, 2025

Liang Xiao, Wen Dai, Shuai Chen, Bin Qin, Chongyang Shi, Haopeng Jing, Tianyu Guo

Abstract:Retrieval-augmented generation has gained significant attention due to its ability to integrate relevant external knowledge, enhancing the accuracy and reliability of the LLMs' responses. Most of the existing methods apply a dynamic multiple retrieval-generating process, to address multi-hop complex questions by decomposing them into sub-problems. However, these methods rely on an unidirectional forward reasoning paradigm, where errors from insufficient reasoning steps or inherent flaws in current retrieval systems are irreversible, potentially derailing the entire reasoning chain. For the first time, this work introduces Retroactive Retrieval-Augmented Generation (RetroRAG), a novel framework to build a retroactive reasoning paradigm. RetroRAG revises and updates the evidence, redirecting the reasoning chain to the correct direction. RetroRAG constructs an evidence-collation-discovery framework to search, generate, and refine credible evidence. It synthesizes inferential evidence related to the key entities in the question from the existing source knowledge and formulates search queries to uncover additional information. As new evidence is found, RetroRAG continually updates and organizes this information, enhancing its ability to locate further necessary evidence. Paired with an Answerer to generate and evaluate outputs, RetroRAG is capable of refining its reasoning process iteratively until a reliable answer is obtained. Empirical evaluations show that RetroRAG significantly outperforms existing methods.

Via

Access Paper or Ask Questions

Teleporter Theory: A General and Simple Approach for Modeling Cross-World Counterfactual Causality

Jun 18, 2024

Jiangmeng Li, Bin Qin, Qirui Ji, Yi Li, Wenwen Qiang, Jianwen Cao, Fanjiang Xu

Abstract:Leveraging the development of structural causal model (SCM), researchers can establish graphical models for exploring the causal mechanisms behind machine learning techniques. As the complexity of machine learning applications rises, single-world interventionism causal analysis encounters theoretical adaptation limitations. Accordingly, cross-world counterfactual approach extends our understanding of causality beyond observed data, enabling hypothetical reasoning about alternative scenarios. However, the joint involvement of cross-world variables, encompassing counterfactual variables and real-world variables, challenges the construction of the graphical model. Twin network is a subtle attempt, establishing a symbiotic relationship, to bridge the gap between graphical modeling and the introduction of counterfactuals albeit with room for improvement in generalization. In this regard, we demonstrate the theoretical breakdowns of twin networks in certain cross-world counterfactual scenarios. To this end, we propose a novel teleporter theory to establish a general and simple graphical representation of counterfactuals, which provides criteria for determining teleporter variables to connect multiple worlds. In theoretical application, we determine that introducing the proposed teleporter theory can directly obtain the conditional independence between counterfactual variables and real-world variables from the cross-world SCM without requiring complex algebraic derivations. Accordingly, we can further identify counterfactual causal effects through cross-world symbolic derivation. We demonstrate the generality of the teleporter theory to the practical application. Adhering to the proposed theory, we build a plug-and-play module, and the effectiveness of which are substantiated by experiments on benchmarks.

Via

Access Paper or Ask Questions

Revisiting Spurious Correlation in Domain Generalization

Jun 17, 2024

Bin Qin, Jiangmeng Li, Yi Li, Xuesong Wu, Yupeng Wang, Wenwen Qiang, Jianwen Cao

Figure 1 for Revisiting Spurious Correlation in Domain Generalization

Figure 2 for Revisiting Spurious Correlation in Domain Generalization

Figure 3 for Revisiting Spurious Correlation in Domain Generalization

Figure 4 for Revisiting Spurious Correlation in Domain Generalization

Abstract:Without loss of generality, existing machine learning techniques may learn spurious correlation dependent on the domain, which exacerbates the generalization of models in out-of-distribution (OOD) scenarios. To address this issue, recent works build a structural causal model (SCM) to describe the causality within data generation process, thereby motivating methods to avoid the learning of spurious correlation by models. However, from the machine learning viewpoint, such a theoretical analysis omits the nuanced difference between the data generation process and representation learning process, resulting in that the causal analysis based on the former cannot well adapt to the latter. To this end, we explore to build a SCM for representation learning process and further conduct a thorough analysis of the mechanisms underlying spurious correlation. We underscore that adjusting erroneous covariates introduces bias, thus necessitating the correct selection of spurious correlation mechanisms based on practical application scenarios. In this regard, we substantiate the correctness of the proposed SCM and further propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator, which can be integrated into any existing OOD method as a plug-and-play module. The empirical results comprehensively demonstrate the effectiveness of our method on synthetic and large-scale real OOD datasets.

Via

Access Paper or Ask Questions

Relay Decoding: Concatenating Large Language Models for Machine Translation

May 05, 2024

Chengpeng Fu, Xiaocheng Feng, Yichong Huang, Wenshuai Huo, Baohang Li, Hui Wang, Bin Qin, Ting Liu

Abstract:Leveraging large language models for machine translation has demonstrated promising results. However, it does require the large language models to possess the capability of handling both the source and target languages in machine translation. When it is challenging to find large models that support the desired languages, resorting to continuous learning methods becomes a costly endeavor. To mitigate these expenses, we propose an innovative approach called RD (Relay Decoding), which entails concatenating two distinct large models that individually support the source and target languages. By incorporating a simple mapping layer to facilitate the connection between these two models and utilizing a limited amount of parallel data for training, we successfully achieve superior results in the machine translation task. Experimental results conducted on the Multi30k and WikiMatrix datasets validate the effectiveness of our proposed method.

* Work in progress

Via

Access Paper or Ask Questions

PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Jun 18, 2021

Hengyi Zheng, Rui Wen, Xi Chen, Yifan Yang, Yunyan Zhang, Ziheng Zhang, Ningyu Zhang, Bin Qin, Ming Xu, Yefeng Zheng

Figure 1 for PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Figure 2 for PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Figure 3 for PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Figure 4 for PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Abstract:Joint extraction of entities and relations from unstructured texts is a crucial task in information extraction. Recent methods achieve considerable performance but still suffer from some inherent limitations, such as redundancy of relation prediction, poor generalization of span-based extraction and inefficiency. In this paper, we decompose this task into three subtasks, Relation Judgement, Entity Extraction and Subject-object Alignment from a novel perspective and then propose a joint relational triple extraction framework based on Potential Relation and Global Correspondence (PRGC). Specifically, we design a component to predict potential relations, which constrains the following entity extraction to the predicted relation subset rather than all relations; then a relation-specific sequence tagging component is applied to handle the overlapping problem between subjects and objects; finally, a global correspondence component is designed to align the subject and object into a triple with low-complexity. Extensive experiments show that PRGC achieves state-of-the-art performance on public benchmarks with higher efficiency and delivers consistent performance gain on complex scenarios of overlapping triples.

* Accepted by ACL 2021

Via

Access Paper or Ask Questions

Partial FC: Training 10 Million Identities on a Single Machine

Oct 11, 2020

Xiang An, Xuhan Zhu, Yang Xiao, Lan Wu, Ming Zhang, Yuan Gao, Bin Qin, Debing Zhang, Ying Fu

Figure 1 for Partial FC: Training 10 Million Identities on a Single Machine

Figure 2 for Partial FC: Training 10 Million Identities on a Single Machine

Figure 3 for Partial FC: Training 10 Million Identities on a Single Machine

Figure 4 for Partial FC: Training 10 Million Identities on a Single Machine

Abstract:Face recognition has been an active and vital topic among computer vision community for a long time. Previous researches mainly focus on loss functions used for facial feature extraction network, among which the improvements of softmax-based loss functions greatly promote the performance of face recognition. However, the contradiction between the drastically increasing number of face identities and the shortage of GPU memories is gradually becoming irreconcilable. In this paper, we thoroughly analyze the optimization goal of softmax-based loss functions and the difficulty of training massive identities. We find that the importance of negative classes in softmax function in face representation learning is not as high as we previously thought. The experiment demonstrates no loss of accuracy when training with only 10\% randomly sampled classes for the softmax-based loss functions, compared with training with full classes using state-of-the-art models on mainstream benchmarks. We also implement a very efficient distributed sampling algorithm, taking into account model accuracy and training efficiency, which uses only eight NVIDIA RTX2080Ti to complete classification tasks with tens of millions of identities. The code of this paper has been made available https://github.com/deepinsight/insightface/tree/master/recognition/partial_fc.

* 8 pages, 9 figures

Via

Access Paper or Ask Questions

Multi-view Clustering with the Cooperation of Visible and Hidden Views

Aug 12, 2019

Zhaohong Deng, Ruixiu Liu, Te Zhang, Peng Xu, Kup-Sze Choi, Bin Qin, Shitong Wang

Figure 1 for Multi-view Clustering with the Cooperation of Visible and Hidden Views

Figure 2 for Multi-view Clustering with the Cooperation of Visible and Hidden Views

Figure 3 for Multi-view Clustering with the Cooperation of Visible and Hidden Views

Figure 4 for Multi-view Clustering with the Cooperation of Visible and Hidden Views

Abstract:Multi-view data are becoming common in real-world modeling tasks and many multi-view data clustering algorithms have thus been proposed. The existing algorithms usually focus on the cooperation of different views in the original space but neglect the influence of the hidden information among these different visible views, or they only consider the hidden information between the views. The algorithms are therefore not efficient since the available information is not fully excavated, particularly the otherness information in different views and the consistency information between them. In practice, the otherness and consistency information in multi-view data are both very useful for effective clustering analyses. In this study, a Multi-View clustering algorithm developed with the Cooperation of Visible and Hidden views, i.e., MV-Co-VH, is proposed. The MV-Co-VH algorithm first projects the multiple views from different visible spaces to the common hidden space by using the non-negative matrix factorization (NMF) strategy to obtain the common hidden view data. Collaborative learning is then implemented in the clustering procedure based on the visible views and the shared hidden view. The results of extensive experiments on UCI multi-view datasets and real-world image multi-view datasets show that the clustering performance of the proposed algorithm is competitive with or even better than that of the existing algorithms.

* This paper has been submitted to IEEE TKDE in Jun. 2019

Via

Access Paper or Ask Questions