Baijiong Lin

PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model

May 06, 2025

Baijiong Lin, Weisen Jiang, Yuancheng Xu, Hao Chen, Ying-Cong Chen

Abstract: Multi-objective test-time alignment aims to adapt large language models (LLMs) to diverse multi-dimensional user preferences during inference while keeping LLMs frozen. Recently, GenARM (Xu et al., 2025) first independently trains Autoregressive Reward Models (ARMs) for each preference dimension without awareness of each other, then combines their outputs based on user-specific preference vectors during inference to achieve multi-objective test-time alignment. This leads to two key limitations: the need for multiple ARMs increases the inference cost, and the separate training of ARMs causes misalignment between the guided generation and the user preferences. To address these issues, we propose Preference-aware ARM (PARM), a single unified ARM trained across all preference dimensions. PARM uses our proposed Preference-Aware Bilinear Low-Rank Adaptation (PBLoRA), which employs a bilinear form to condition the ARM on preference vectors, enabling it to achieve precise control over preference trade-offs during inference. Experiments demonstrate that PARM reduces inference costs and achieves better alignment with preference vectors compared with existing methods. Additionally, PARM enables weak-to-strong guidance, allowing a smaller PARM to guide a larger frozen LLM without expensive training, making multi-objective alignment accessible with limited computing resources. The code is available at https://github.com/Baijiong-Lin/PARM.

* Accepted by ICML 2025
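
The PARM abstract says only that PBLoRA "employs a bilinear form to condition the ARM on preference vectors". One way such conditioning could look is sketched below. The parameterization is an assumption made for illustration and is not confirmed by the abstract: the low-rank update is B · M(alpha) · A, where the r x r core M(alpha) is a preference-weighted sum of per-objective cores. All function and variable names are hypothetical.

```python
# Illustrative sketch only: this parameterization is an assumption,
# not the PBLoRA definition from the paper.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def preference_core(alpha, cores):
    """Hypothetical r x r core: the sum over objectives of alpha_i * cores[i]."""
    r = len(cores[0])
    return [[sum(a * core[i][j] for a, core in zip(alpha, cores))
             for j in range(r)]
            for i in range(r)]

def bilinear_lora_delta(B, cores, A, alpha):
    """Weight update B @ M(alpha) @ A, conditioned on the preference vector."""
    return matmul(matmul(B, preference_core(alpha, cores)), A)

# Two objectives, rank 1, a 2 x 2 weight update.
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
cores = [[[1.0]], [[0.0]]]
delta = bilinear_lora_delta(B, cores, A, alpha=(0.5, 0.5))
```

In this sketch, changing `alpha` changes the update without retraining `A`, `B`, or the cores, which matches the abstract's idea of a single model steered by preference vectors at inference time.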

Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond

Jan 19, 2025

Weiyu Chen, Xiaoyuan Zhang, Baijiong Lin, Xi Lin, Han Zhao, Qingfu Zhang, James T. Kwok

Abstract: Multi-objective optimization (MOO) in deep learning aims to simultaneously optimize multiple conflicting objectives, a challenge frequently encountered in areas like multi-task learning and multi-criteria learning. Recent advancements in gradient-based MOO methods have enabled the discovery of diverse types of solutions, ranging from a single balanced solution to finite or even infinite Pareto sets, tailored to user needs. These developments have broad applications across domains such as reinforcement learning, computer vision, recommendation systems, and large language models. This survey provides the first comprehensive review of gradient-based MOO in deep learning, covering algorithms, theories, and practical applications. By unifying various approaches and identifying critical challenges, it serves as a foundational resource for driving innovation in this evolving field. A comprehensive list of MOO algorithms in deep learning is available at https://github.com/Baijiong-Lin/Awesome-Multi-Objective-Deep-Learning.

RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models

Sep 30, 2024

Shuhao Chen, Weisen Jiang, Baijiong Lin, James T. Kwok, Yu Zhang

Abstract: Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising method, which learns a router to select the most suitable LLM for each query. However, existing routing models are ineffective when multiple LLMs perform well for a query. To address this problem, in this paper, we propose a method called query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model consists of an encoder and LLM embeddings, and we propose two contrastive learning losses to train the RouterDC model. Experimental results show that RouterDC is effective in assembling LLMs and largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. Source code is available at https://github.com/shuhao02/RouterDC.

* Accepted by NeurIPS 2024
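
The RouterDC abstract describes a router built from an encoder and per-LLM embeddings. A minimal sketch of the selection step follows, under two assumptions that are not stated in the abstract: the query has already been encoded into a vector, and candidates are scored by cosine similarity with the highest score winning. The model names and embeddings are made up for illustration, and the two contrastive training losses are not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def route(query_embedding, llm_embeddings):
    """Return the name of the LLM whose embedding best matches the query,
    together with all similarity scores (assumed scoring rule)."""
    scores = {name: cosine(query_embedding, emb)
              for name, emb in llm_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical learned LLM embeddings and an already-encoded query.
llm_embeddings = {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]}
best, scores = route([0.9, 0.1], llm_embeddings)
```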

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Jul 02, 2024

Baijiong Lin, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, Ying-Cong Chen

Abstract: Multi-task dense scene understanding, which learns a model for multiple dense prediction tasks, has a wide range of application scenarios. Modeling long-range dependency and enhancing cross-task interactions are crucial to multi-task dense prediction. In this paper, we propose MTMamba, a novel Mamba-based architecture for multi-task scene understanding. It contains two types of core blocks: self-task Mamba (STM) block and cross-task Mamba (CTM) block. STM handles long-range dependency by leveraging Mamba, while CTM explicitly models task interactions to facilitate information exchange across tasks. Experiments on NYUDv2 and PASCAL-Context datasets demonstrate the superior performance of MTMamba over Transformer-based and CNN-based methods. Notably, on the PASCAL-Context dataset, MTMamba achieves improvements of +2.08, +5.01, and +4.90 over the previous best method in the tasks of semantic segmentation, human parsing, and object boundary detection, respectively. The code is available at https://github.com/EnVision-Research/MTMamba.

A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization

Jan 17, 2024

Feiyang Ye, Baijiong Lin, Xiaofeng Cao, Yu Zhang, Ivor Tsang

Abstract: In this paper, we study the Multi-Objective Bi-Level Optimization (MOBLO) problem, where the upper-level subproblem is a multi-objective optimization problem and the lower-level subproblem is a scalar optimization problem. Existing gradient-based MOBLO algorithms need to compute the Hessian matrix, which makes them computationally inefficient. To address this, we propose an efficient first-order multi-gradient method for MOBLO, called FORUM. Specifically, we reformulate MOBLO problems as a constrained multi-objective optimization (MOO) problem via the value-function approach. Then we propose a novel multi-gradient aggregation method to solve the challenging constrained MOO problem. Theoretically, we provide a complexity analysis to show the efficiency of the proposed method and a non-asymptotic convergence result. Empirically, extensive experiments demonstrate the effectiveness and efficiency of the proposed FORUM method in different learning problems. In particular, it achieves state-of-the-art performance on three multi-task learning benchmark datasets.

* Technical Report
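
The value-function reformulation mentioned in the FORUM abstract can be written out as follows. The notation is an assumption chosen for illustration (upper-level variable \alpha, lower-level variable \omega, vector-valued upper-level objective F, scalar lower-level objective f); the paper's own symbols and any relaxation of the constraint may differ.

```latex
% Bi-level form: the upper level is multi-objective, the lower level is scalar.
\min_{\alpha}\; F\bigl(\alpha, \omega^{*}(\alpha)\bigr)
  = \bigl(F_1(\alpha, \omega^{*}(\alpha)), \dots, F_m(\alpha, \omega^{*}(\alpha))\bigr)
\quad \text{s.t.} \quad
\omega^{*}(\alpha) \in \arg\min_{\omega} f(\alpha, \omega)

% Value function of the lower-level problem.
f^{*}(\alpha) = \min_{\omega} f(\alpha, \omega)

% Equivalent single-level constrained multi-objective problem.
\min_{\alpha,\, \omega}\; F(\alpha, \omega)
\quad \text{s.t.} \quad
f(\alpha, \omega) - f^{*}(\alpha) \le 0
```

Because f(\alpha, \omega) \ge f^{*}(\alpha) always holds, the constraint is satisfied only when \omega is a lower-level minimizer, so the lower-level problem is absorbed into a single inequality constraint.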

Effective and Parameter-Efficient Reusing Fine-Tuned Models

Oct 04, 2023

Weisen Jiang, Baijiong Lin, Han Shi, Yu Zhang, Zhenguo Li, James T. Kwok

Abstract: Many pre-trained large-scale models provided online have become highly effective in transferring to downstream tasks. At the same time, various task-specific models fine-tuned on these pre-trained models are available online for public use. In practice, as collecting task-specific data is labor-intensive and fine-tuning the large pre-trained models is computationally expensive, one can reuse task-specific fine-tuned models to deal with downstream tasks. However, using a model per task causes a heavy burden on storage and serving. Recently, many training-free and parameter-efficient methods have been proposed for reusing multiple fine-tuned task-specific models into a single multi-task model. However, these methods exhibit a large accuracy gap compared with using a fine-tuned model per task. In this paper, we propose Parameter-Efficient methods for ReUsing (PERU) fine-tuned models. For reusing Fully Fine-Tuned (FFT) models, we propose PERU-FFT, which injects a sparse task vector into a merged model by magnitude pruning. For reusing LoRA fine-tuned models, we propose PERU-LoRA, which uses a lower-rank matrix to approximate the LoRA matrix by singular value decomposition. Both PERU-FFT and PERU-LoRA are training-free. Extensive experiments conducted on computer vision and natural language processing tasks demonstrate the effectiveness and parameter-efficiency of the proposed methods. The proposed PERU-FFT and PERU-LoRA outperform existing methods for reusing models by a large margin and achieve comparable performance to using a fine-tuned model per task.

* Technical Report
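
The magnitude-pruning idea behind PERU-FFT can be illustrated with a small sketch. It assumes, without confirmation from the abstract, that the task vector is the element-wise difference between fine-tuned and pre-trained weights and that a fixed fraction of its largest-magnitude entries is kept. The function names and the `keep_ratio` parameter are hypothetical, and the SVD step of PERU-LoRA is not shown.

```python
def sparse_task_vector(pretrained, finetuned, keep_ratio):
    """Keep only the largest-magnitude entries of (finetuned - pretrained).

    Assumption: weights are flat lists of floats of equal length.
    """
    task_vector = [f - p for p, f in zip(pretrained, finetuned)]
    k = max(1, int(len(task_vector) * keep_ratio))
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]

def inject(merged, sparse_vector):
    """Add the sparse task vector to the shared merged model's weights."""
    return [m + s for m, s in zip(merged, sparse_vector)]

pretrained = [0.0, 0.0, 0.0, 0.0]
finetuned = [0.1, -2.0, 0.5, 3.0]
sparse = sparse_task_vector(pretrained, finetuned, keep_ratio=0.5)
task_model = inject([1.0, 1.0, 1.0, 1.0], sparse)
```

Only the non-zero entries of `sparse` need to be stored per task, which is where the parameter efficiency in this sketch comes from.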

Efficient Transfer Learning in Diffusion Models via Adversarial Noise

Aug 23, 2023

Xiyu Wang, Baijiong Lin, Daochang Liu, Chang Xu

Abstract: Diffusion Probabilistic Models (DPMs) have demonstrated substantial promise in image generation tasks but rely heavily on the availability of large amounts of training data. Previous works, such as those on GANs, have tackled the limited data problem by transferring pre-trained models learned with sufficient data. However, those methods are hard to apply to DPMs because of the distinct differences between DPM-based and GAN-based methods, namely the iterative denoising process integral to DPMs and their need for many timesteps with non-targeted noise. In this paper, we propose a novel DPM-based transfer learning method, TAN, to address the limited data problem. It includes two strategies: similarity-guided training, which boosts transfer with a classifier, and adversarial noise selection, which adaptively chooses targeted noise based on the input image. Extensive experiments in the context of few-shot image generation tasks demonstrate that our method is not only efficient but also excels in terms of image quality and diversity when compared to existing GAN-based and DDPM-based methods.

A Scale-Invariant Task Balancing Approach for Multi-Task Learning

Aug 23, 2023

Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu

Abstract: Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields. However, task balancing remains a significant challenge in MTL, with the disparity in loss/gradient scales often leading to performance compromises. In this paper, we propose a Scale-Invariant Multi-Task Learning (SI-MTL) method to alleviate the task-balancing problem from both loss and gradient perspectives. Specifically, SI-MTL contains a logarithm transformation, which is performed on all task losses to ensure scale invariance at the loss level, and a gradient balancing method, SI-G, which normalizes all task gradients to the same magnitude as the maximum gradient norm. Extensive experiments conducted on several benchmark datasets consistently demonstrate the effectiveness of SI-G and the state-of-the-art performance of SI-MTL.

* Technical Report
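
The two components named in the SI-MTL abstract can be sketched on plain Python lists. The sketch assumes strictly positive task losses and non-zero task gradients; the function names are hypothetical, and the way the balanced gradients are then combined into one update is left out because the abstract does not specify it.

```python
import math

def log_losses(losses):
    """Log-transform each task loss (assumes every loss is > 0).

    Scaling a loss by c > 0 only adds the constant log(c), so the
    gradient of the transformed loss does not depend on c.
    """
    return [math.log(loss) for loss in losses]

def balance_gradients(grads):
    """Rescale each task gradient to the largest gradient norm.

    Assumes every gradient has a non-zero norm.
    """
    norms = [math.sqrt(sum(g * g for g in grad)) for grad in grads]
    max_norm = max(norms)
    return [[g * max_norm / norm for g in grad]
            for grad, norm in zip(grads, norms)]

grads = [[3.0, 4.0], [0.6, 0.8]]      # norms of about 5 and 1
balanced = balance_gradients(grads)   # both now have norm of about 5
```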

LibMTL: A Python Library for Multi-Task Learning

Mar 27, 2022

Baijiong Lin, Yu Zhang

Abstract: This paper presents LibMTL, an open-source Python library built on PyTorch, which provides a unified, comprehensive, reproducible, and extensible implementation framework for Multi-Task Learning (MTL). LibMTL considers different settings and approaches in MTL, and it supports a large number of state-of-the-art MTL methods, including 12 loss weighting strategies, 7 architectures, and 84 combinations of different architectures and loss weighting methods. Moreover, the modular design of LibMTL makes it easy to use and extensible, so users can quickly develop new MTL methods, compare fairly with existing MTL methods, or apply MTL algorithms to real-world applications with the support of LibMTL. The source code and detailed documentation of LibMTL are available at https://github.com/median-research-group/LibMTL and https://libmtl.readthedocs.io, respectively.

A Closer Look at Loss Weighting in Multi-Task Learning

Nov 20, 2021

Baijiong Lin, Feiyang Ye, Yu Zhang

Abstract: Multi-Task Learning (MTL) has achieved great success in various fields; however, how to balance different tasks to avoid negative effects is still a key problem. To achieve task balancing, many works balance task losses or gradients. In this paper, we unify eight representative task balancing methods from the perspective of loss weighting and provide a consistent experimental comparison. Moreover, we surprisingly find that training an MTL model with random weights sampled from a distribution can achieve performance comparable to state-of-the-art baselines. Based on this finding, we propose a simple yet effective weighting strategy called Random Loss Weighting (RLW), which can be implemented in only one additional line of code over existing works. Theoretically, we analyze the convergence of RLW and reveal that RLW has a higher probability of escaping local minima than existing models with fixed task weights, resulting in a better generalization ability. Empirically, we extensively evaluate the proposed RLW method on six image datasets and four multilingual tasks from the XTREME benchmark to show the effectiveness of the proposed RLW strategy when compared with state-of-the-art strategies.
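
The random-weighting idea in the RLW abstract can be sketched as follows. The abstract says only that weights are "sampled from a distribution"; drawing from a standard normal and normalizing with a softmax is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
import random

def random_loss_weights(num_tasks, rng=random):
    """Draw fresh task weights: standard-normal samples pushed through a
    softmax so they are positive and sum to 1 (assumed distribution)."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(num_tasks)]
    shift = max(raw)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in raw]
    total = sum(exps)
    return [e / total for e in exps]

def rlw_total_loss(task_losses, rng=random):
    """Weighted sum of task losses with weights resampled on every call."""
    weights = random_loss_weights(len(task_losses), rng)
    return sum(w * loss for w, loss in zip(weights, task_losses))

rng = random.Random(0)  # seeded only to make the example repeatable
weights = random_loss_weights(3, rng)
total = rlw_total_loss([0.7, 1.2, 0.4], rng)
```

In a training loop, `rlw_total_loss` would be called once per iteration so that each step sees a different weighting.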
