Qiao Liu

More diverse more adaptive: Comprehensive Multi-task Learning for Improved LLM Domain Adaptation in E-commerce

Apr 09, 2025

Tong Piao, Pei Tang, Zhipeng Zhang, Jiaqi Li, Qiao Liu, Zufeng Wu

Abstract:In recent years, Large Language Models (LLMs) have been widely applied across various domains due to their powerful domain adaptation capabilities. Previous studies have suggested that diverse, multi-modal data can enhance LLMs' domain adaptation performance. However, this hypothesis remains insufficiently validated in the e-commerce sector. To address this gap, we propose a comprehensive e-commerce multi-task framework and design empirical experiments to examine the impact of diverse data and tasks on LLMs from two perspectives: "capability comprehensiveness" and "task comprehensiveness." Specifically, we observe significant improvements in LLM performance by progressively introducing tasks related to new major capability areas and by continuously adding subtasks within different major capability domains. Furthermore, we observe that increasing model capacity amplifies the benefits of diversity, suggesting a synergistic relationship between model capacity and data diversity. Finally, we validate the best-performing model from our empirical experiments in the KDD Cup 2024, ranking 5th in Task 1. This outcome demonstrates the significance of our research for advancing LLMs in the e-commerce domain.

* Accepted by KDD workshop 2024

An AI-powered Bayesian generative modeling approach for causal inference in observational studies

Jan 01, 2025

Qiao Liu, Wing Hung Wong

Abstract:Causal inference in observational studies with high-dimensional covariates presents significant challenges. We introduce CausalBGM, an AI-powered Bayesian generative modeling approach that captures the causal relationship among covariates, treatment, and outcome variables. The core innovation of CausalBGM lies in its ability to estimate the individual treatment effect (ITE) by learning individual-specific distributions of a low-dimensional latent feature set (e.g., latent confounders) that drives changes in both treatment and outcome. This approach not only effectively mitigates confounding effects but also provides comprehensive uncertainty quantification, offering reliable and interpretable causal effect estimates at the individual level. CausalBGM adopts a Bayesian model and uses a novel iterative algorithm to update the model parameters and the posterior distribution of latent features until convergence. This framework leverages the power of AI to capture complex dependencies among variables while adhering to the Bayesian principles. Extensive experiments demonstrate that CausalBGM consistently outperforms state-of-the-art methods, particularly in scenarios with high-dimensional covariates and large-scale datasets. Its Bayesian foundation ensures statistical rigor, providing robust and well-calibrated posterior intervals. By addressing key limitations of existing methods, CausalBGM emerges as a robust and promising framework for advancing causal inference in modern applications in fields such as genomics, healthcare, and social sciences. CausalBGM is maintained at the website https://causalbgm.readthedocs.io/.

Sampling-guided Heterogeneous Graph Neural Network with Temporal Smoothing for Scalable Longitudinal Data Imputation

Nov 07, 2024

Zhaoyang Zhang, Ziqi Chen, Qiao Liu, Jinhan Xie, Hongtu Zhu

Abstract:In this paper, we propose a novel framework, the Sampling-guided Heterogeneous Graph Neural Network (SHT-GNN), to effectively tackle the challenge of missing data imputation in longitudinal studies. Unlike traditional methods, which often require extensive preprocessing to handle irregular or inconsistent missing data, our approach accommodates arbitrary missing data patterns while maintaining computational efficiency. SHT-GNN models both observations and covariates as distinct node types, connecting observation nodes at successive time points through subject-specific longitudinal subnetworks, while covariate-observation interactions are represented by attributed edges within bipartite graphs. By leveraging subject-wise mini-batch sampling and a multi-layer temporal smoothing mechanism, SHT-GNN efficiently scales to large datasets, while effectively learning node representations and imputing missing data. Extensive experiments on both synthetic and real-world datasets, including the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, demonstrate that SHT-GNN significantly outperforms existing imputation methods, even with high missing data rates. The empirical results highlight SHT-GNN's robust imputation capabilities and superior performance, particularly in the context of complex, large-scale longitudinal data.

DiffuTraj: A Stochastic Vessel Trajectory Prediction Approach via Guided Diffusion Process

Oct 12, 2024

Changlin Li, Yanglei Gan, Tian Lan, Yuxiang Cai, Xueyi Liu, Run Lin, Qiao Liu

Abstract:Maritime vessel maneuvers, characterized by their inherent complexity and indeterminacy, require a vessel trajectory prediction system capable of modeling the multi-modal nature of future motion states. Conventional stochastic trajectory prediction methods utilize latent variables to represent the multi-modality of vessel motion but tend to overlook the complexity and dynamics inherent in maritime behavior. In contrast, we explicitly simulate the transition of vessel motion from uncertainty towards a state of certainty, effectively handling future indeterminacy in dynamic scenes. In this paper, we present a novel framework (DiffuTraj) that conceptualizes the trajectory prediction task as a guided reverse process of motion pattern uncertainty diffusion, in which we progressively remove uncertainty from maritime regions to delineate the intended trajectory. Specifically, we encode the previous states of the target vessel, vessel-vessel interactions, and the environment context as guiding factors for trajectory generation. Subsequently, we devise a transformer-based conditional denoiser to capture spatio-temporal dependencies, enabling the generation of trajectories better aligned with the particular maritime environment. Comprehensive experiments on vessel trajectory prediction benchmarks demonstrate the superiority of our method.

* 14 pages, 9 figures and 3 tables; Submitted to IEEE Transactions on Intelligent Transportation Systems on 17-June-2024
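
The guided reverse process described in this abstract can be illustrated with a generic denoising-diffusion sampling loop. This is a minimal sketch, not the DiffuTraj implementation: the noise schedule, the function names, and `toy_denoiser` (a hypothetical stand-in for the transformer-based conditional denoiser) are all assumptions made for illustration.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.05):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_denoiser(x_t, t, context):
    """Hypothetical placeholder for a learned conditional noise predictor.

    A real model would be trained; this one simply treats the offset from
    the guiding context as the predicted noise.
    """
    return x_t - context

def sample_trajectory(context, denoiser=toy_denoiser, num_steps=50, seed=0):
    """Start from pure noise and remove uncertainty step by step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(context.shape)  # fully uncertain initial state
    for t in reversed(range(num_steps)):
        eps_hat = denoiser(x, t, context)
        # Standard ancestral sampling mean for a noise-prediction model.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No fresh noise is added at the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    # Guiding context: 12 future waypoints in 2-D (illustrative values only).
    waypoints = np.linspace([0.0, 0.0], [1.0, 2.0], 12)
    print(sample_trajectory(waypoints).shape)
```

Running the sampler several times with different seeds gives several candidate trajectories for the same context, which is how a stochastic predictor of this kind represents multiple plausible futures.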

Progressive Domain Adaptation for Thermal Infrared Object Tracking

Jul 28, 2024

Qiao Li, Kanlun Tan, Qiao Liu, Di Yuan, Xin Li, Yunpeng Liu

Abstract:Due to the lack of large-scale labeled Thermal InfraRed (TIR) training datasets, most existing TIR trackers are trained directly on RGB datasets. However, tracking methods trained on RGB datasets suffer a significant performance drop on TIR data due to domain shift. To address this, we propose a Progressive Domain Adaptation framework for TIR Tracking (PDAT), which transfers useful knowledge learned from RGB tracking to TIR tracking. The framework makes full use of large-scale labeled RGB datasets without requiring time-consuming and labor-intensive labeling of large-scale TIR data. Specifically, we first propose an adversarial-based global domain adaptation module to coarsely reduce the domain gap at the feature level. Second, we design a clustering-based subdomain adaptation method to further align the feature distributions of the RGB and TIR datasets at a finer level. These two domain adaptation modules gradually eliminate the discrepancy between the two domains, and thus learn domain-invariant fine-grained features through progressive training. Additionally, we collect a large-scale TIR dataset with over 1.48 million unlabeled TIR images for training the proposed domain adaptation framework. Experimental results on five TIR tracking benchmarks show that the proposed method achieves a gain of nearly 6% in success rate, demonstrating its effectiveness.

* 10 pages, 8 figures

Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction

Dec 24, 2023

Da Luo, Yanglei Gan, Rui Hou, Run Lin, Qiao Liu, Yuxiang Cai, Wannian Gao

Abstract:Few-shot Relation Extraction (FSRE) aims to extract relational facts from a sparse set of labeled corpora. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within the framework of supervised contrastive learning, which considers both instances and label facts. However, how to effectively harness massive instance-label pairs to encompass the learned representation with semantic richness in this learning paradigm is not fully explored. To address this gap, we introduce a novel synergistic anchored contrastive pre-training framework. This framework is motivated by the insight that the diverse viewpoints conveyed through instance-label pairs capture incomplete yet complementary intrinsic textual semantics. Specifically, our framework involves a symmetrical contrastive objective that encompasses both sentence-anchored and label-anchored contrastive losses. By combining these two losses, the model establishes a robust and uniform representation space. This space effectively captures the reciprocal alignment of feature distributions among instances and relational facts, simultaneously enhancing the maximization of mutual information across diverse perspectives within the same relation. Experimental results demonstrate that our framework achieves significant performance enhancements compared to baseline models in downstream FSRE tasks. Furthermore, our approach exhibits superior adaptability to handle the challenges of domain shift and zero-shot relation extraction. Our code is available online at https://github.com/AONE-NLP/FSRE-SaCon.
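
A symmetrical objective that combines a sentence-anchored and a label-anchored contrastive loss can be sketched as below. This is a minimal illustration and not the code in the linked repository: the function names, the temperature value, and the assumption of exactly one positive label per sentence (row i of each matrix forms a pair) are all choices made for this example.

```python
import numpy as np

def _log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(sent_emb, label_emb, temperature=0.1):
    """Average of a sentence-anchored and a label-anchored InfoNCE loss.

    sent_emb, label_emb: arrays of shape (batch, dim); row i of each is
    assumed to be a matching instance-label pair.
    """
    s = sent_emb / np.linalg.norm(sent_emb, axis=1, keepdims=True)
    l = label_emb / np.linalg.norm(label_emb, axis=1, keepdims=True)
    logits = s @ l.T / temperature  # cosine similarities, scaled
    idx = np.arange(len(s))
    # Sentence as anchor: contrast its label against all labels in the batch.
    sent_anchored = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # Label as anchor: contrast its sentence against all sentences in the batch.
    label_anchored = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (sent_anchored + label_anchored)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentences = rng.standard_normal((8, 16))
    labels = sentences + 0.05 * rng.standard_normal((8, 16))
    print(symmetric_contrastive_loss(sentences, labels))
```

Because the two terms are averaged, swapping the roles of the two inputs leaves the loss unchanged, which is the sense in which this sketch is symmetrical.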

Aspect-oriented Opinion Alignment Network for Aspect-Based Sentiment Classification

Aug 22, 2023

Xueyi Liu, Rui Hou, Yanglei Gan, Da Luo, Changlin Li, Xiaojun Shi, Qiao Liu

Abstract:Aspect-based sentiment classification is a crucial problem in fine-grained sentiment analysis, which aims to predict the sentiment polarity of the given aspect according to its context. Previous works have made remarkable progress in leveraging attention mechanisms to extract opinion words for different aspects. However, a persistent challenge is the effective management of semantic mismatches, which stem from attention mechanisms that fall short in adequately aligning opinion words with their corresponding aspect in multi-aspect sentences. To address this issue, we propose a novel Aspect-oriented Opinion Alignment Network (AOAN) to capture the contextual association between opinion words and the corresponding aspect. Specifically, we first introduce a neighboring span enhanced module which highlights various compositions of neighboring words and given aspects. In addition, we design a multi-perspective attention mechanism that aligns relevant opinion information with respect to the given aspect. Extensive experiments on three benchmark datasets demonstrate that our model achieves state-of-the-art results. The source code is available at https://github.com/AONE-NLP/ABSA-AOAN.

* 8 pages, 5 figures, ECAI 2023

Language-Enhanced Session-Based Recommendation with Decoupled Contrastive Learning

Jul 20, 2023

Zhipeng Zhang, Piao Tong, Yingwei Ma, Qiao Liu, Xujiang Liu, Xu Luo

Abstract:Session-based recommendation techniques aim to capture dynamic user behavior by analyzing past interactions. However, existing methods rely heavily on historical item ID sequences to extract user preferences, leading to challenges such as popularity bias and cold-start problems. In this paper, we propose a hybrid multimodal approach for session-based recommendation to address these challenges. Our approach combines different modalities, including textual content and item IDs, leveraging the complementary nature of these modalities using CatBoost. To learn universal item representations, we design a language representation-based item retrieval architecture that extracts features from the textual content utilizing pre-trained language models. Furthermore, we introduce a novel Decoupled Contrastive Learning method to enhance the effectiveness of the language representation. This technique decouples the sequence representation and item representation space, facilitating bidirectional alignment through dual-queue contrastive learning. Simultaneously, the momentum queue provides a large number of negative samples, further improving contrastive learning. Our approach yielded competitive results, securing 5th place in KDD CUP 2023 Task 1. We have released the source code and pre-trained models associated with this work.

CausalEGM: a general causal inference framework by encoding generative modeling

Dec 13, 2022

Qiao Liu, Zhongren Chen, Wing Hung Wong

Abstract:Although understanding and characterizing causal effects have become essential in observational studies, it is challenging when the confounders are high-dimensional. In this article, we develop a general framework CausalEGM for estimating causal effects by encoding generative modeling, which can be applied in both binary and continuous treatment settings. Under the potential outcome framework with unconfoundedness, we establish a bidirectional transformation between the high-dimensional confounder space and a low-dimensional latent space where the density is known (e.g., multivariate normal distribution). Through this, CausalEGM simultaneously decouples the dependencies of confounders on both treatment and outcome and maps the confounders to the low-dimensional latent space. By conditioning on the low-dimensional latent features, CausalEGM can estimate the causal effect for each individual or the average causal effect within a population. Our theoretical analysis shows that the excess risk for CausalEGM can be bounded through empirical process theory. Under an assumption on encoder-decoder networks, the consistency of the estimate can be guaranteed. In a series of experiments, CausalEGM demonstrates superior performance over existing methods for both binary and continuous treatments. Specifically, we find CausalEGM to be substantially more powerful than competing methods in the presence of large sample sizes and high-dimensional confounders. The software of CausalEGM is freely available at https://github.com/SUwonglab/CausalEGM.

* Corrected typos
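
The idea of conditioning on a low-dimensional latent feature to remove confounding can be illustrated on simulated data. This is a toy sketch, not the CausalEGM software: it assumes the latent confounder `z` has already been recovered, uses a linear outcome model with a constant treatment effect, and every function name and parameter below is hypothetical.

```python
import numpy as np

def simulate(n=5000, true_effect=2.0, seed=0):
    """Simulate a binary treatment confounded by a 1-D latent feature z."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)                            # latent confounder
    t = (z + rng.standard_normal(n) > 0).astype(float)   # treatment depends on z
    y = true_effect * t + 3.0 * z + rng.standard_normal(n)  # outcome depends on t and z
    return z, t, y

def naive_effect(t, y):
    """Difference in group means, ignoring the confounder (biased here)."""
    return y[t == 1].mean() - y[t == 0].mean()

def adjusted_effect(z, t, y):
    """Least-squares fit of y on [1, t, z]; the coefficient on t is the
    estimated average effect after conditioning on z."""
    design = np.column_stack([np.ones_like(t), t, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

if __name__ == "__main__":
    z, t, y = simulate()
    print("naive:", naive_effect(t, y))
    print("adjusted:", adjusted_effect(z, t, y))
```

In this simulation the naive estimate is inflated because treated units tend to have larger `z`, while the adjusted estimate stays close to the simulated effect of 2.0.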

Mutual Information Learned Regressor: an Information-theoretic Viewpoint of Training Regression Systems

Nov 23, 2022

Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao, Yusen He, Yaohua Wang

Abstract:As one of the central tasks in machine learning, regression finds many applications in different fields. An existing common practice for solving regression problems is the mean square error (MSE) minimization approach or its regularized variants which require prior knowledge about the models. Recently, Yi et al. proposed a mutual information based supervised learning framework where they introduced a label entropy regularization which does not require any prior knowledge. When applied to classification tasks and solved via a stochastic gradient descent (SGD) optimization algorithm, their approach achieved significant improvement over the commonly used cross entropy loss and its variants. However, they did not provide a theoretical convergence analysis of the SGD algorithm for the proposed formulation. Besides, applying the framework to regression tasks is nontrivial due to the potentially infinite support set of the label. In this paper, we investigate the regression under the mutual information based supervised learning framework. We first argue that the MSE minimization approach is equivalent to a conditional entropy learning problem, and then propose a mutual information learning formulation for solving regression problems by using a reparameterization technique. For the proposed formulation, we give the convergence analysis of the SGD algorithm for solving it in practice. Finally, we consider a multi-output regression data model where we derive the generalization performance lower bound in terms of the mutual information associated with the underlying data distribution. The result shows that the high dimensionality can be a blessing instead of a curse, which is controlled by a threshold. We hope our work will serve as a good starting point for further research on the mutual information based regression.

* 28 pages, 2 figures, presubmitted to AISTATS2023 for reviewing
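
One standard way to see the link between MSE minimization and conditional entropy is sketched below. It is not necessarily the authors' argument: it assumes a scalar label and a Gaussian predictive model $q_\theta(y \mid x)$ with mean $f_\theta(x)$ and fixed variance $\sigma^2$, and these symbols are introduced here for illustration only.

```latex
% Assumption: q_\theta(y \mid x) = \mathcal{N}\big(y;\, f_\theta(x),\, \sigma^2\big) with fixed \sigma.
\begin{align}
-\log q_\theta(y \mid x)
  &= \frac{\big(y - f_\theta(x)\big)^2}{2\sigma^2} + \frac{1}{2}\log\big(2\pi\sigma^2\big), \\
\mathbb{E}_{p(x,y)}\big[-\log q_\theta(Y \mid X)\big]
  &= \frac{1}{2\sigma^2}\,\mathbb{E}_{p(x,y)}\Big[\big(Y - f_\theta(X)\big)^2\Big]
     + \frac{1}{2}\log\big(2\pi\sigma^2\big), \\
\mathbb{E}_{p(x,y)}\big[-\log q_\theta(Y \mid X)\big]
  &= h(Y \mid X)
     + \mathbb{E}_{p(x)}\Big[\mathrm{KL}\big(p(\cdot \mid X)\,\big\|\,q_\theta(\cdot \mid X)\big)\Big]
  \;\ge\; h(Y \mid X).
\end{align}
```

Under this assumption, minimizing the MSE over $\theta$ minimizes a cross-entropy upper bound on the conditional differential entropy $h(Y \mid X)$, since the two differ only by a positive scale and an additive constant.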
