Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Haoliang Wang

DistrAttention: An Efficient and Flexible Self-Attention Mechanism on Modern GPUs

Jul 23, 2025

Haolin Jin, Mengbai Xiao, Yuan Yuan, Xiao Zhang, Dongxiao Yu, Guanghui Zhang, Haoliang Wang

Abstract:The Transformer architecture has revolutionized deep learning, delivering the state-of-the-art performance in areas such as natural language processing, computer vision, and time series prediction. However, its core component, self-attention, has the quadratic time complexity relative to input sequence length, which hinders the scalability of Transformers. The exsiting approaches on optimizing self-attention either discard full-contextual information or lack of flexibility. In this work, we design DistrAttention, an effcient and flexible self-attention mechanism with the full context. DistrAttention achieves this by grouping data on the embedding dimensionality, usually referred to as $d$. We realize DistrAttention with a lightweight sampling and fusion method that exploits locality-sensitive hashing to group similar data. A block-wise grouping framework is further designed to limit the errors introduced by locality sensitive hashing. By optimizing the selection of block sizes, DistrAttention could be easily integrated with FlashAttention-2, gaining high-performance on modern GPUs. We evaluate DistrAttention with extensive experiments. The results show that our method is 37% faster than FlashAttention-2 on calculating self-attention. In ViT inference, DistrAttention is the fastest and the most accurate among approximate self-attention mechanisms. In Llama3-1B, DistrAttention still achieves the lowest inference time with only 1% accuray loss.

Via

Access Paper or Ask Questions

Feature-Space Semantic Invariance: Enhanced OOD Detection for Open-Set Domain Generalization

Nov 11, 2024

Haoliang Wang, Chen Zhao, Feng Chen

Abstract:Open-set domain generalization addresses a real-world challenge: training a model to generalize across unseen domains (domain generalization) while also detecting samples from unknown classes not encountered during training (open-set recognition). However, most existing approaches tackle these issues separately, limiting their practical applicability. To overcome this limitation, we propose a unified framework for open-set domain generalization by introducing Feature-space Semantic Invariance (FSI). FSI maintains semantic consistency across different domains within the feature space, enabling more accurate detection of OOD instances in unseen domains. Additionally, we adopt a generative model to produce synthetic data with novel domain styles or class labels, enhancing model robustness. Initial experiments show that our method improves AUROC by 9.1% to 18.9% on ColoredMNIST, while also significantly increasing in-distribution classification accuracy.

* IEEE BigData 2024, Ph.D. Forum

Via

Access Paper or Ask Questions

MADOD: Generalizing OOD Detection to Unseen Domains via G-Invariance Meta-Learning

Nov 02, 2024

Haoliang Wang, Chen Zhao, Feng Chen

Abstract:Real-world machine learning applications often face simultaneous covariate and semantic shifts, challenging traditional domain generalization and out-of-distribution (OOD) detection methods. We introduce Meta-learned Across Domain Out-of-distribution Detection (MADOD), a novel framework designed to address both shifts concurrently. MADOD leverages meta-learning and G-invariance to enhance model generalizability and OOD detection in unseen domains. Our key innovation lies in task construction: we randomly designate in-distribution classes as pseudo-OODs within each meta-learning task, simulating OOD scenarios using existing data. This approach, combined with energy-based regularization, enables the learning of robust, domain-invariant features while calibrating decision boundaries for effective OOD detection. Operating in a test domain-agnostic setting, MADOD eliminates the need for adaptation during inference, making it suitable for scenarios where test data is unavailable. Extensive experiments on real-world and synthetic datasets demonstrate MADOD's superior performance in semantic OOD detection across unseen domains, achieving an AUPR improvement of 8.48% to 20.81%, while maintaining competitive in-distribution classification accuracy, representing a significant advancement in handling both covariate and semantic shifts.

* IEEE International Conference on Big Data 2024

Via

Access Paper or Ask Questions

FEED: Fairness-Enhanced Meta-Learning for Domain Generalization

Nov 02, 2024

Kai Jiang, Chen Zhao, Haoliang Wang, Feng Chen

Abstract:Generalizing to out-of-distribution data while being aware of model fairness is a significant and challenging problem in meta-learning. The goal of this problem is to find a set of fairness-aware invariant parameters of classifier that is trained using data drawn from a family of related training domains with distribution shift on non-sensitive features as well as different levels of dependence between model predictions and sensitive features so that the classifier can achieve good generalization performance on unknown but distinct test domains. To tackle this challenge, existing state-of-the-art methods either address the domain generalization problem but completely ignore learning with fairness or solely specify shifted domains with various fairness levels. This paper introduces an approach to fairness-aware meta-learning that significantly enhances domain generalization capabilities. Our framework, Fairness-Enhanced Meta-Learning for Domain Generalization (FEED), disentangles latent data representations into content, style, and sensitive vectors. This disentanglement facilitates the robust generalization of machine learning models across diverse domains while adhering to fairness constraints. Unlike traditional methods that focus primarily on domain invariance or sensitivity to shifts, our model integrates a fairness-aware invariance criterion directly into the meta-learning process. This integration ensures that the learned parameters uphold fairness consistently, even when domain characteristics vary widely. We validate our approach through extensive experiments across multiple benchmarks, demonstrating not only superior performance in maintaining high accuracy and fairness but also significant improvements over existing state-of-the-art methods in domain generalization tasks.

* IEEE International Conference on Big Data 2024

Via

Access Paper or Ask Questions

Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Oct 17, 2024

Yu Xia, Junda Wu, Sungchul Kim, Tong Yu, Ryan A. Rossi, Haoliang Wang, Julian McAuley

Figure 1 for Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Figure 2 for Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Figure 3 for Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Figure 4 for Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Abstract:Large language models (LLMs) have been used to generate query expansions augmenting original queries for improving information search. Recent studies also explore providing LLMs with initial retrieval results to generate query expansions more grounded to document corpus. However, these methods mostly focus on enhancing textual similarities between search queries and target documents, overlooking document relations. For queries like "Find me a highly rated camera for wildlife photography compatible with my Nikon F-Mount lenses", existing methods may generate expansions that are semantically similar but structurally unrelated to user intents. To handle such semi-structured queries with both textual and relational requirements, in this paper we propose a knowledge-aware query expansion framework, augmenting LLMs with structured document relations from knowledge graph (KG). To further address the limitation of entity-based scoring in existing KG-based methods, we leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR). Extensive experiments on three datasets of diverse domains show the advantages of our method compared against state-of-the-art baselines on textual and relational semi-structured retrieval.

Via

Access Paper or Ask Questions

Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Mar 14, 2024

Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, Yuhang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu(+3 more)

Figure 1 for Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Figure 2 for Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Abstract:Causal inference has shown potential in enhancing the predictive accuracy, fairness, robustness, and explainability of Natural Language Processing (NLP) models by capturing causal relationships among variables. The emergence of generative Large Language Models (LLMs) has significantly impacted various NLP domains, particularly through their advanced reasoning capabilities. This survey focuses on evaluating and improving LLMs from a causal view in the following areas: understanding and improving the LLMs' reasoning capacity, addressing fairness and safety issues in LLMs, complementing LLMs with explanations, and handling multimodality. Meanwhile, LLMs' strong reasoning capacities can in turn contribute to the field of causal inference by aiding causal relationship discovery and causal effect estimations. This review explores the interplay between causal inference frameworks and LLMs from both perspectives, emphasizing their collective potential to further the development of more advanced and equitable artificial intelligence systems.

Via

Access Paper or Ask Questions

VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Dec 04, 2023

Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu

Figure 1 for VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Figure 2 for VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Figure 3 for VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Figure 4 for VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Abstract:Recent advancements in language-model-based video understanding have been progressing at a remarkable pace, spurred by the introduction of Large Language Models (LLMs). However, the focus of prior research has been predominantly on devising a projection layer that maps video features to tokens, an approach that is both rudimentary and inefficient. In our study, we introduce a cutting-edge framework, VaQuitA, designed to refine the synergy between video and textual information. At the data level, instead of sampling frames uniformly, we implement a sampling method guided by CLIP-score rankings, which enables a more aligned selection of frames with the given question. At the feature level, we integrate a trainable Video Perceiver alongside a Visual-Query Transformer (abbreviated as VQ-Former), which bolsters the interplay between the input question and the video features. We also discover that incorporating a simple prompt, "Please be critical", into the LLM input can substantially enhance its video comprehension capabilities. Our experimental results indicate that VaQuitA consistently sets a new benchmark for zero-shot video question-answering tasks and is adept at producing high-quality, multi-turn video dialogues with users.

Via

Access Paper or Ask Questions

Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

Nov 23, 2023

Chen Zhao, Kai Jiang, Xintao Wu, Haoliang Wang, Latifur Khan, Christan Grant, Feng Chen

Figure 1 for Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

Figure 2 for Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

Figure 3 for Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

Figure 4 for Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

Abstract:Achieving the generalization of an invariant classifier from source domains to shifted target domains while simultaneously considering model fairness is a substantial and complex challenge in machine learning. Existing domain generalization research typically attributes domain shifts to concept shift, which relates to alterations in class labels, and covariate shift, which pertains to variations in data styles. In this paper, by introducing another form of distribution shift, known as dependence shift, which involves variations in fair dependence patterns across domains, we propose a novel domain generalization approach that addresses domain shifts by considering both covariate and dependence shifts. We assert the existence of an underlying transformation model can transform data from one domain to another. By generating data in synthetic domains through the model, a fairness-aware invariant classifier is learned that enforces both model accuracy and fairness in unseen domains. Extensive empirical studies on four benchmark datasets demonstrate that our approach surpasses state-of-the-art methods.

Via

Access Paper or Ask Questions

Towards Effective Semantic OOD Detection in Unseen Domains: A Domain Generalization Perspective

Sep 18, 2023

Haoliang Wang, Chen Zhao, Yunhui Guo, Kai Jiang, Feng Chen

Abstract:Two prevalent types of distributional shifts in machine learning are the covariate shift (as observed across different domains) and the semantic shift (as seen across different classes). Traditional OOD detection techniques typically address only one of these shifts. However, real-world testing environments often present a combination of both covariate and semantic shifts. In this study, we introduce a novel problem, semantic OOD detection across domains, which simultaneously addresses both distributional shifts. To this end, we introduce two regularization strategies: domain generalization regularization, which ensures semantic invariance across domains to counteract the covariate shift, and OOD detection regularization, designed to enhance OOD detection capabilities against the semantic shift through energy bounding. Through rigorous testing on three standard domain generalization benchmarks, our proposed framework showcases its superiority over conventional domain generalization approaches in terms of OOD detection performance. Moreover, it holds its ground by maintaining comparable InD classification accuracy.

Via

Access Paper or Ask Questions

Measuring and Modeling Physical Intrinsic Motivation

May 24, 2023

Julio Martinez, Felix Binder, Haoliang Wang, Nicker Haber, Judith Fan, Daniel L. K. Yamins

Figure 1 for Measuring and Modeling Physical Intrinsic Motivation

Figure 2 for Measuring and Modeling Physical Intrinsic Motivation

Figure 3 for Measuring and Modeling Physical Intrinsic Motivation

Figure 4 for Measuring and Modeling Physical Intrinsic Motivation

Abstract:Humans are interactive agents driven to seek out situations with interesting physical dynamics. Here we formalize the functional form of physical intrinsic motivation. We first collect ratings of how interesting humans find a variety of physics scenarios. We then model human interestingness responses by implementing various hypotheses of intrinsic motivation including models that rely on simple scene features to models that depend on forward physics prediction. We find that the single best predictor of human responses is adversarial reward, a model derived from physical prediction loss. We also find that simple scene feature models do not generalize their prediction of human responses across all scenarios. Finally, linearly combining the adversarial model with the number of collisions in a scene leads to the greatest improvement in predictivity of human responses, suggesting humans are driven towards scenarios that result in high information gain and physical activity.

* 6 pages, 5 figures, accepted to CogSci 2023 with full paper publication in the proceedings

Via

Access Paper or Ask Questions