Zhenan Fan

Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs

May 20, 2025

Morgan Lindsay Heisler, Linzi Xing, Ge Shi, Hanieh Sadri, Gursimran Singh, Weiwei Zhang, Tao Ye, Ying Xiong, Yong Zhang, Zhenan Fan

Abstract: Huawei Cloud users leverage LoRA (Low-Rank Adaptation) as an efficient and scalable method to fine-tune and customize large language models (LLMs) for application-specific needs. However, with typical decoding methods such as greedy or beam search, tasks that require complex reasoning or deep contextual understanding are often hindered by biases or interference from the base model. These biases can lead to generic or task-agnostic responses from the base model instead of leveraging the LoRA-specific adaptations. In this paper, we introduce Contrastive LoRA Decoding (CoLD), a novel decoding framework designed to maximize the use of task-specific knowledge in LoRA-adapted models, resulting in better downstream performance. CoLD uses contrastive decoding, scoring candidate tokens based on the divergence between the probability distributions of a LoRA-adapted expert model and the corresponding base model. This approach prioritizes tokens that better align with the LoRA's learned representations, enhancing performance for specialized tasks. While effective, a naive implementation of CoLD is computationally expensive because each decoding step requires evaluating multiple token candidates across both models. To address this, we developed an optimized kernel for Huawei's Ascend NPU. CoLD achieves up to a 5.54% increase in task accuracy while reducing end-to-end latency by 28% compared to greedy decoding. This work provides practical and efficient decoding strategies for fine-tuned LLMs in resource-constrained environments and has broad implications for applied data science in both cloud and on-premises settings.

* Accepted at ACM KDD 2025
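
The token-scoring idea in the abstract can be illustrated with a minimal sketch. This is not the paper's CoLD algorithm or its Ascend kernel: it assumes the commonly used contrastive decoding score (expert log-probability minus base log-probability, restricted to tokens the expert finds plausible). The function names, the `alpha` plausibility threshold, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_pick(expert_logits, base_logits, alpha=0.1):
    """Pick the next token by contrasting an expert with its base model.

    Assumption: score = log p_expert - log p_base, computed only for tokens
    whose expert probability is at least alpha times the expert's maximum.
    """
    p_expert = softmax(expert_logits)
    p_base = softmax(base_logits)
    cutoff = alpha * max(p_expert)
    best_token, best_score = None, -math.inf
    for tok, (pe, pb) in enumerate(zip(p_expert, p_base)):
        if pe < cutoff:
            continue  # implausible under the expert, so never selected
        score = math.log(pe) - math.log(pb)
        if score > best_score:
            best_token, best_score = tok, score
    return best_token


# Toy vocabulary of three tokens (hypothetical logits).
expert_logits = [2.0, 1.8, -3.0]  # stands in for the LoRA-adapted model
base_logits = [2.5, 0.5, -3.0]    # stands in for the base model

greedy_token = max(range(len(expert_logits)), key=lambda t: expert_logits[t])
contrastive_token = contrastive_pick(expert_logits, base_logits)
```

In this toy case greedy decoding on the expert picks token 0, which the base model already favours strongly, while the contrastive score picks token 1, where the expert assigns much more probability than the base model does.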

Learn2Aggregate: Supervised Generation of Chvátal-Gomory Cuts Using Graph Neural Networks

Sep 10, 2024

Arnaud Deza, Elias B. Khalil, Zhenan Fan, Zirui Zhou, Yong Zhang

Abstract: We present Learn2Aggregate, a machine learning (ML) framework for optimizing the generation of Chvátal-Gomory (CG) cuts in mixed integer linear programming (MILP). The framework trains a graph neural network to classify useful constraints for aggregation in CG cut generation. The ML-driven CG separator selectively focuses on a small set of impactful constraints, improving runtimes without compromising the strength of the generated cuts. Key to our approach is the formulation of a constraint classification task which favours sparse aggregation of constraints, consistent with empirical findings. This, in conjunction with a careful constraint labeling scheme and a hybrid of deep learning and feature engineering, results in enhanced CG cut generation across five diverse MILP benchmarks. On the largest test sets, our method closes roughly twice as much of the integrality gap as the standard CG method while running 40% faster. This performance improvement is due to our method eliminating 75% of the constraints prior to aggregation.

* 12 pages, 8 figures
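
For readers unfamiliar with constraint aggregation, the textbook Chvátal-Gomory construction can be sketched as follows. This is a minimal illustration, not the paper's separator or its learned classifier: it assumes a system Ax <= b with integer data and nonnegative integer variables x, and the function name `cg_cut` and the example data are hypothetical. Given nonnegative multipliers u, the cut is floor(u^T A) x <= floor(u^T b).

```python
from fractions import Fraction
from math import floor


def cg_cut(A, b, u):
    """Return (coeffs, rhs) of the CG cut floor(u^T A) x <= floor(u^T b).

    Assumes Ax <= b with x >= 0 and integer, and multipliers u >= 0.
    Exact rational arithmetic avoids floating-point errors in the floor.
    """
    if any(w < 0 for w in u):
        raise ValueError("multipliers must be nonnegative")
    rows, cols = len(A), len(A[0])
    coeffs = [
        floor(sum(Fraction(u[i]) * A[i][j] for i in range(rows)))
        for j in range(cols)
    ]
    rhs = floor(sum(Fraction(u[i]) * b[i] for i in range(rows)))
    return coeffs, rhs


# Three constraints; only the first receives a nonzero multiplier,
# which is a (tiny) instance of sparse aggregation.
A = [[2, 2], [1, 0], [0, 1]]
b = [3, 5, 5]
u = [Fraction(1, 2), 0, 0]

coeffs, rhs = cg_cut(A, b, u)  # encodes x1 + x2 <= 1
```

The fractional point (0.75, 0.75) satisfies 2*x1 + 2*x2 <= 3 but violates the resulting cut x1 + x2 <= 1, while every nonnegative integer point feasible for the original constraint still satisfies the cut.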

DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

Jun 12, 2024

Yuxi Feng, Raymond Li, Zhenan Fan, Giuseppe Carenini, Mohammadreza Pourreza, Weiwei Zhang, Yong Zhang

Abstract: While in-context learning (ICL) has proven to be an effective technique to improve the performance of Large Language Models (LLMs) in a variety of complex tasks, notably in translating natural language questions into Structured Query Language (NL2SQL), the question of how to select the most beneficial demonstration examples remains an open research problem. While prior works often adapted off-the-shelf encoders to retrieve examples dynamically, an inherent discrepancy exists in the representational capacities between the external retrievers and the LLMs. Further, optimizing the selection of examples is a non-trivial task, since there are no straightforward methods to assess the relative benefits of examples without performing pairwise inference. To address these shortcomings, we propose DeTriever, a novel demonstration retrieval framework that learns a weighted combination of LLM hidden states, where rich semantic information is encoded. To train the model, we propose a proxy score that estimates the relative benefits of examples based on the similarities between output queries. Experiments on two popular NL2SQL benchmarks demonstrate that our method significantly outperforms the state-of-the-art baselines on one-shot NL2SQL tasks.

SQL-Encoder: Improving NL2SQL In-Context Learning Through a Context-Aware Encoder

Mar 24, 2024

Mohammadreza Pourreza, Davood Rafiei, Yuxi Feng, Raymond Li, Zhenan Fan, Weiwei Zhang

Abstract: Detecting structural similarity between queries is essential for selecting examples in in-context learning models. However, assessing structural similarity based solely on the natural language expressions of queries, without considering SQL queries, presents a significant challenge. This paper explores the significance of this similarity metric and proposes a model for accurately estimating it. To achieve this, we leverage a dataset comprising 170k question pairs, meticulously curated to train a similarity prediction model. Our comprehensive evaluation demonstrates that the proposed model adeptly captures the structural similarity between questions, as evidenced by improvements in Kendall-Tau distance and precision@k metrics. Notably, our model outperforms strong competitive embedding models from OpenAI and Cohere. Furthermore, compared to these competitive models, our proposed encoder enhances the downstream performance of NL2SQL models in 1-shot in-context learning scenarios by 1-2% for GPT-3.5-turbo, 4-8% for CodeLlama-7B, and 2-3% for CodeLlama-13B.
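
As a reminder of what the precision@k metric mentioned above measures, here is a minimal sketch under the usual definition: the fraction of the top k retrieved items that are relevant. The question identifiers and the relevance set are made up for illustration and do not come from the paper's dataset or evaluation code.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that appear in relevant_ids.

    Uses the common convention of dividing by k even when fewer than
    k items were retrieved.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k


# Hypothetical retrieval result for one query question.
ranked = ["q7", "q2", "q9", "q4"]  # candidates ordered by predicted similarity
relevant = {"q2", "q4"}            # candidates that are truly structurally similar

p_at_1 = precision_at_k(ranked, relevant, 1)
p_at_2 = precision_at_k(ranked, relevant, 2)
p_at_4 = precision_at_k(ranked, relevant, 4)
```

Here the top-ranked candidate is not relevant, so precision@1 is zero, while one of the top two and two of the top four are relevant.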

Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

Jan 17, 2024

Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang (+16 more)

Abstract: In an era of digital ubiquity, efficient resource management and decision-making are paramount across numerous industries. To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances and to surpass the capabilities of traditional optimization techniques. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems. Furthermore, we introduce a training framework leveraging augmentation policies to maintain solvers' utility in dynamic environments. Besides data generation and augmentation, our proposed approaches also include novel ML-driven policies for personalized solver strategies, with an emphasis on applications like graph convolutional networks for initial basis selection and reinforcement learning for advanced presolving and cut selection. Additionally, we detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance. Compared with traditional solvers such as Cplex and SCIP, our ML-augmented OptVerse AI Solver demonstrates superior speed and precision across both established benchmarks and real-world scenarios, reinforcing the practical imperative and effectiveness of machine learning techniques in mathematical programming solvers.

Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process

Jan 06, 2024

Zhenan Fan, Bissan Ghaddar, Xinglu Wang, Linzi Xing, Yong Zhang, Zirui Zhou

Abstract: The rapid advancement of artificial intelligence (AI) techniques has opened up new opportunities to revolutionize various fields, including operations research (OR). This survey paper explores the integration of AI within the OR process (AI4OR) to enhance its effectiveness and efficiency across multiple stages, such as parameter generation, model formulation, and model optimization. By providing a comprehensive overview of the state-of-the-art and examining the potential of AI to transform OR, this paper aims to inspire further research and innovation in the development of AI-enhanced OR methods and tools. The synergy between AI and OR is poised to drive significant advancements and novel solutions in a multitude of domains, ultimately leading to more effective and efficient decision-making.

Knowledge-Injected Federated Learning

Aug 16, 2022

Zhenan Fan, Zirui Zhou, Jian Pei, Michael P. Friedlander, Jiajie Hu, Chengliang Li, Yong Zhang

Abstract: Federated learning is an emerging technique for training models from decentralized data sets. In many applications, data owners participating in the federated learning system hold not only data but also domain knowledge. Such knowledge includes human know-how and craftsmanship that can be extremely helpful to the federated learning task. In this work, we propose a federated learning framework that allows the injection of participants' domain knowledge, where the key idea is to refine the global model with knowledge locally. The scenario we consider is motivated by a real industry-level application, and we demonstrate the effectiveness of our approach on this application.

A dual approach for federated learning

Feb 04, 2022

Zhenan Fan, Huang Fang, Michael P. Friedlander

Abstract: We study the federated optimization problem from a dual perspective and propose a new algorithm termed federated dual coordinate descent (FedDCD), which is based on a type of coordinate descent method developed by Necoara et al. [Journal of Optimization Theory and Applications, 2017]. Additionally, we enhance the FedDCD method with inexact gradient oracles and Nesterov's acceleration. We demonstrate theoretically that our proposed approach achieves better convergence rates than the state-of-the-art primal federated optimization algorithms under certain situations. Numerical experiments on real-world datasets support our analysis.

Fair and efficient contribution valuation for vertical federated learning

Jan 07, 2022

Zhenan Fan, Huang Fang, Zirui Zhou, Jian Pei, Michael P. Friedlander, Yong Zhang

Abstract: Federated learning is a popular technology for training machine learning models on distributed data sources without sharing data. Vertical federated learning, or feature-based federated learning, applies to cases where different data sources share the same sample ID space but differ in feature space. To ensure the data owners' long-term engagement, it is critical to objectively assess the contribution from each data source and compensate them accordingly. The Shapley value (SV) is a provably fair contribution valuation metric originating from cooperative game theory. However, computing the SV requires extensively retraining the model on each subset of data sources, which causes prohibitively high communication costs in federated learning. We propose a contribution valuation metric called vertical federated Shapley value (VerFedSV) based on SV. We show that VerFedSV not only satisfies many desirable properties for fairness but is also efficient to compute, and can be adapted to both synchronous and asynchronous vertical federated learning algorithms. Both theoretical analysis and extensive experimental results verify the fairness, efficiency, and adaptability of VerFedSV.
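
To make the cost argument concrete, the classical Shapley value can be computed exactly for a toy game by enumerating every subset of the other participants. This sketch is the standard cooperative-game definition, not VerFedSV: the three data sources and their utility table (standing in for model accuracy after retraining on each subset) are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values by enumerating all subsets of the other players.

    `utility` maps a frozenset of players to a number. The loop evaluates
    it on exponentially many subsets, which is the cost the abstract
    describes as prohibitive when each evaluation means retraining.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical utility table: A and B are interchangeable, C adds nothing.
table = {
    frozenset(): 0.0,
    frozenset("A"): 0.6,
    frozenset("B"): 0.6,
    frozenset("C"): 0.0,
    frozenset("AB"): 0.8,
    frozenset("AC"): 0.6,
    frozenset("BC"): 0.6,
    frozenset("ABC"): 0.8,
}

values = shapley_values(["A", "B", "C"], lambda s: table[s])
```

The toy result reflects the fairness properties usually cited for the Shapley value: the two interchangeable sources receive equal value, the source that never changes the utility receives zero, and the values sum to the utility of the full coalition.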

Improving Fairness for Data Valuation in Federated Learning

Sep 19, 2021

Zhenan Fan, Huang Fang, Zirui Zhou, Jian Pei, Michael P. Friedlander, Changxin Liu, Yong Zhang

Abstract: Federated learning is an emerging decentralized machine learning scheme that allows multiple data owners to work collaboratively while ensuring data privacy. The success of federated learning depends largely on the participation of data owners. To sustain and encourage data owners' participation, it is crucial to fairly evaluate the quality of the data provided by the data owners and reward them correspondingly. Federated Shapley value, recently proposed by Wang et al. [Federated Learning, 2020], is a measure for data value under the framework of federated learning that satisfies many desired properties for data valuation. However, there are still factors of potential unfairness in the design of federated Shapley value because two data owners with the same local data may not receive the same evaluation. We propose a new measure called completed federated Shapley value to improve the fairness of federated Shapley value. The design depends on completing a matrix consisting of all the possible contributions by different subsets of the data owners. Leveraging concepts and tools from optimization, we show that under mild conditions this matrix is approximately low-rank. Both theoretical analysis and empirical evaluation verify that the proposed measure does improve fairness in many circumstances.
