Zhuo Huang

Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

May 17, 2025

Puning Yang, Qizhou Wang, Zhuo Huang, Tongliang Liu, Chengqi Zhang, Bo Han

Abstract: Loss reweighting has shown significant benefits for machine unlearning with large language models (LLMs). However, its exact functions remain unclear and the optimal strategy is still an open question, impeding the understanding and improvement of existing methodologies. In this paper, we identify two distinct goals of loss reweighting, namely Saturation and Importance: the former indicates that insufficiently optimized data should be emphasized, while the latter stresses the critical data that are most influential for loss minimization. To study their usefulness, we design specific reweighting strategies for each goal and evaluate their respective effects on unlearning. We conduct extensive empirical analyses on well-established benchmarks, and summarize some important observations as follows: (i) Saturation enhances efficacy more than importance-based reweighting, and their combination can yield additional improvements. (ii) Saturation typically allocates lower weights to data with lower likelihoods, whereas importance-based reweighting does the opposite. (iii) The efficacy of unlearning is also largely influenced by the smoothness and granularity of the weight distributions. Based on these findings, we propose SatImp, a simple reweighting method that combines the advantages of both saturation and importance. Empirical results on extensive datasets validate the efficacy of our method, potentially bridging existing research gaps and indicating directions for future research. Our code is available at https://github.com/Puning97/SatImp-for-LLM-Unlearning.
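
The opposite trends described in observation (ii) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the power-law weight shapes, the function names, the `beta` parameters, and the gradient-ascent-style objective are not taken from the paper, whose actual SatImp formulation lives in the linked repository.

```python
import math

def saturation_weight(p, beta=1.0):
    # Assumed shape: grows with the token likelihood p, so tokens the model
    # still predicts confidently (not yet unlearned) get more weight.
    return p ** beta

def importance_weight(p, beta=1.0):
    # Assumed shape: shrinks as p grows, i.e. the opposite trend.
    return (1.0 - p) ** beta

def combined_weight(p, beta_sat=1.0, beta_imp=1.0):
    # One simple way to combine both trends: multiply them.
    return saturation_weight(p, beta_sat) * importance_weight(p, beta_imp)

def reweighted_unlearning_loss(token_probs, weight_fn):
    # Assumed objective: the weighted mean log-likelihood of the forget
    # tokens, to be minimized. Weights are treated as constants.
    weights = [weight_fn(p) for p in token_probs]
    total = sum(w * math.log(p) for w, p in zip(weights, token_probs))
    return total / len(token_probs)

if __name__ == "__main__":
    probs = [0.9, 0.5, 0.1]  # hypothetical per-token likelihoods
    for p in probs:
        print(p, saturation_weight(p), importance_weight(p), combined_weight(p))
    print(reweighted_unlearning_loss(probs, combined_weight))
```

With equal exponents, the multiplied weight in this sketch is largest for mid-range likelihoods and vanishes at both extremes, which is one way a combination can differ from either goal alone.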

Towards Modality Generalization: A Benchmark and Prospective Analysis

Dec 24, 2024

Xiaohao Liu, Xiaobo Xia, Zhuo Huang, Tat-Seng Chua

Abstract: Multi-modal learning has achieved remarkable success by integrating information from various modalities, outperforming uni-modal approaches in tasks such as recognition and retrieval. However, real-world scenarios often present novel modalities that are unseen during training due to resource and privacy constraints, a challenge current methods struggle to address. This paper introduces Modality Generalization (MG), which focuses on enabling models to generalize to unseen modalities. We define two cases: weak MG, where both seen and unseen modalities can be mapped into a joint embedding space via existing perceptors, and strong MG, where no such mappings exist. To facilitate progress, we propose a comprehensive benchmark featuring multi-modal algorithms and adapt existing methods that focus on generalization. Extensive experiments highlight the complexity of MG, exposing the limitations of existing methods and identifying key directions for future research. Our work provides a foundation for advancing robust and adaptable multi-modal models, enabling them to handle unseen modalities in realistic scenarios.

Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning

Dec 05, 2023

Zhuo Huang, Chang Liu, Yinpeng Dong, Hang Su, Shibao Zheng, Tongliang Liu

Abstract: Although vision models such as Contrastive Language-Image Pre-Training (CLIP) show impressive generalization performance, their zero-shot robustness is still limited under Out-of-Distribution (OOD) scenarios without fine-tuning. Instead of resorting to human supervision, as is commonly done, it is possible to take advantage of Multi-modal Large Language Models (MLLMs) that hold powerful visual understanding abilities. However, MLLMs are shown to struggle with vision problems due to the incompatibility of tasks, which hinders their utilization. In this paper, we propose to leverage MLLMs to conduct Machine Vision Therapy, which aims to rectify the noisy predictions of vision models. By fine-tuning with the denoised labels, the performance of the learning model can be boosted in an unsupervised manner. To solve the incompatibility issue, we propose a novel Denoising In-Context Learning (DICL) strategy to align vision tasks with MLLMs. Concretely, by estimating a transition matrix that captures the probability of one class being confused with another, an instruction containing a correct exemplar and an erroneous one from the most probable noisy class can be constructed. Such an instruction can help any MLLM with ICL ability to detect and rectify incorrect predictions of vision models. Through extensive experiments on ImageNet, WILDS, DomainBed, and other OOD datasets, we carefully validate the quantitative and qualitative effectiveness of our method. Our code is available at https://github.com/tmllab/Machine_Vision_Therapy.

* 37 pages, 19 figures, and 13 tables
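
The transition matrix mentioned in the abstract above can be sketched as a row-normalized confusion table. This is only an illustration under stated assumptions: it presumes a small labeled set of (true label, predicted label) pairs is available, and the function names and the reading of "most probable noisy class" as the largest off-diagonal entry in a row are hypothetical, not the authors' procedure.

```python
from collections import Counter

def estimate_transition_matrix(true_labels, predicted_labels, classes):
    # matrix[c][k] estimates P(vision model predicts k | true class is c).
    counts = {c: Counter() for c in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = {}
    for c in classes:
        total = sum(counts[c].values())
        matrix[c] = {k: (counts[c][k] / total if total else 0.0) for k in classes}
    return matrix

def most_confused_class(matrix, cls):
    # The other class that `cls` is most often mistaken for.
    others = [k for k in matrix[cls] if k != cls]
    return max(others, key=lambda k: matrix[cls][k])

if __name__ == "__main__":
    classes = ["cat", "dog", "fox"]
    true = ["cat", "cat", "cat", "dog", "dog", "fox"]
    pred = ["cat", "dog", "dog", "dog", "dog", "cat"]
    T = estimate_transition_matrix(true, pred, classes)
    print(T["cat"])
    print(most_confused_class(T, "cat"))
```

An exemplar pair for an in-context instruction could then be drawn from `cls` (the correct exemplar) and from `most_confused_class(T, cls)` (the erroneous one).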

Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization

Oct 25, 2023

Zhuo Huang, Muyang Li, Li Shen, Jun Yu, Chen Gong, Bo Han, Tongliang Liu

Abstract: Out-of-Distribution (OOD) Generalization aims to learn robust models that generalize well to various environments without fitting to distribution-specific features. Recent studies based on the Lottery Ticket Hypothesis (LTH) address this problem by minimizing the learning target to find some of the parameters that are critical to the task. However, in OOD problems, such solutions are suboptimal as the learning task contains severe distribution noise, which can mislead the optimization process. Therefore, apart from finding the task-related parameters (i.e., invariant parameters), we propose Exploring Variant parameters for Invariant Learning (EVIL), which also leverages the distribution knowledge to find the parameters that are sensitive to distribution shift (i.e., variant parameters). Once the variant parameters are left out of invariant learning, a robust subnetwork that is resistant to distribution shift can be found. Additionally, the parameters that are relatively stable across distributions can be considered invariant ones to improve invariant learning. By fully exploring both variant and invariant parameters, EVIL can effectively identify a robust subnetwork to improve OOD generalization. In extensive experiments on the integrated testbed DomainBed, EVIL effectively and efficiently enhances many popular methods, such as ERM, IRM, and SAM.

* 27 pages, 9 figures

FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning

Oct 25, 2023

Zhuo Huang, Li Shen, Jun Yu, Bo Han, Tongliang Liu

Abstract: Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data. However, most SSL methods are based on instance-wise consistency between different data transformations. Therefore, the label guidance on labeled data is hard to propagate to unlabeled data. Consequently, the learning process on labeled data is much faster than on unlabeled data, and is likely to fall into a local minimum that does not favor unlabeled data, leading to sub-optimal generalization performance. In this paper, we propose FlatMatch, which minimizes a cross-sharpness measure to ensure consistent learning performance between the two datasets. Specifically, we increase the empirical risk on labeled data to obtain a worst-case model, which is a failure case that needs to be enhanced. Then, by leveraging the richness of unlabeled data, we penalize the prediction difference (i.e., cross-sharpness) between the worst-case model and the original model so that the learning direction is beneficial to generalization on unlabeled data. Therefore, we can calibrate the learning process without being limited to insufficient label information. As a result, the mismatched learning performance can be mitigated, further enabling the effective exploitation of unlabeled data and improving SSL performance. Through comprehensive validation, we show FlatMatch achieves state-of-the-art results in many SSL settings.

* NeurIPS 2023
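
The two steps described in the abstract above (build a worst-case model on labeled data, then penalize its prediction gap on unlabeled data) can be sketched on a toy linear model. The choices here are assumptions for illustration only: a single normalized gradient-ascent step of radius `rho`, a squared-error labeled loss, and a squared prediction difference as the penalty. None of these details come from the paper.

```python
import math

def predict(w, x):
    # Linear model: dot product of weights and features.
    return sum(wi * xi for wi, xi in zip(w, x))

def labeled_grad(w, xs, ys):
    # Gradient of the mean squared error on the labeled set.
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            g[i] += 2.0 * err * xi / len(xs)
    return g

def worst_case_weights(w, xs, ys, rho=0.1):
    # One ascent step of length rho along the labeled-loss gradient,
    # i.e. the direction that increases the empirical risk.
    g = labeled_grad(w, xs, ys)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)
    return [wi + rho * gi / norm for wi, gi in zip(w, g)]

def cross_sharpness(w, labeled_xs, labeled_ys, unlabeled_xs, rho=0.1):
    # Mean squared prediction gap on unlabeled data between the
    # worst-case model and the original model.
    w_adv = worst_case_weights(w, labeled_xs, labeled_ys, rho)
    gaps = [(predict(w_adv, x) - predict(w, x)) ** 2 for x in unlabeled_xs]
    return sum(gaps) / len(unlabeled_xs)

if __name__ == "__main__":
    w = [0.0, 0.0]
    labeled_xs, labeled_ys = [[1.0, 0.0]], [1.0]
    unlabeled_xs = [[1.0, 0.0], [0.0, 1.0]]
    print(cross_sharpness(w, labeled_xs, labeled_ys, unlabeled_xs))
```

In a training loop, a term like this would be added to the usual objective; how it is weighted and differentiated is left open in this sketch.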

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

Mar 23, 2023

Zhuo Huang, Miaoxi Zhu, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Bo Du, Tongliang Liu

Abstract: Robust generalization aims to tackle the most challenging data distributions, which are rare in the training set and contain severe noise, i.e., photon-limited corruptions. Common solutions such as distributionally robust optimization (DRO) focus on the worst-case empirical risk to ensure low training error on the uncommon noisy distributions. However, because the over-parameterized model is optimized on scarce worst-case data, DRO fails to produce a smooth loss landscape and thus struggles to generalize well to the test set. Therefore, instead of focusing on worst-case risk minimization, we propose SharpDRO, which penalizes the sharpness of the worst-case distribution, a measure of how the loss changes in the neighborhood of the learning parameters. Through worst-case sharpness minimization, the proposed method successfully produces a flat loss curve on the corrupted distributions, thus achieving robust generalization. Moreover, by considering whether the distribution annotation is available, we apply SharpDRO to two problem settings and design a worst-case selection process for robust generalization. Theoretically, we show that SharpDRO has a strong convergence guarantee. Experimentally, we simulate photon-limited corruptions using the CIFAR10/100 and ImageNet30 datasets and show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.

* CVPR 2023
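
The idea of penalizing sharpness on the worst-case distribution, as described in the abstract above, can be sketched with a one-parameter model. This is a hedged illustration, not the SharpDRO algorithm: the group names, the signed perturbation of size `rho`, and the way loss and sharpness are summed with a hypothetical weight `lam` are all assumptions.

```python
def group_loss(w, group):
    # Mean squared error of the 1-D model y ~ w * x on one group.
    return sum((w * x - y) ** 2 for x, y in group) / len(group)

def group_grad(w, group):
    # Derivative of group_loss with respect to w.
    return sum(2.0 * (w * x - y) * x for x, y in group) / len(group)

def worst_group(w, groups):
    # Name of the group with the highest current loss.
    return max(groups, key=lambda name: group_loss(w, groups[name]))

def sharpness(w, group, rho=0.05):
    # Loss increase after moving w by rho in the ascent direction.
    g = group_grad(w, group)
    if g == 0.0:
        return 0.0
    w_perturbed = w + rho * (1.0 if g > 0 else -1.0)
    return group_loss(w_perturbed, group) - group_loss(w, group)

def worst_case_sharpness_objective(w, groups, rho=0.05, lam=1.0):
    # Worst-group loss plus a sharpness penalty on that same group.
    name = worst_group(w, groups)
    return group_loss(w, groups[name]) + lam * sharpness(w, groups[name], rho)

if __name__ == "__main__":
    groups = {"clean": [(1.0, 1.0)], "noisy": [(1.0, 3.0)]}
    print(worst_group(1.0, groups))
    print(worst_case_sharpness_objective(1.0, groups))
```

The sketch assumes group annotations are given; the abstract notes that the authors also handle the setting where they are not.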

Harnessing Out-Of-Distribution Examples via Augmenting Content and Style

Jul 07, 2022

Zhuo Huang, Xiaobo Xia, Li Shen, Bo Han, Mingming Gong, Chen Gong, Tongliang Liu

Abstract: Machine learning models are vulnerable to Out-Of-Distribution (OOD) examples, a problem that has drawn much attention. However, current methods lack a full understanding of different types of OOD data: there are benign OOD data that can be properly adapted to enhance learning performance, while malign OOD data would severely degrade the classification result. To Harness OOD data, this paper proposes the HOOD method, which leverages the content and style of each image instance to identify benign and malign OOD data. In particular, we design a variational inference framework to causally disentangle content and style features by constructing a structural causal model. Subsequently, we augment the content and style through an intervention process to produce malign and benign OOD data, respectively. The benign OOD data contain novel styles but hold the contents of interest, and they can be leveraged to help train a style-invariant model. In contrast, the malign OOD data inherit unknown contents but carry familiar styles, and detecting them can improve model robustness against deceiving anomalies. Thanks to the proposed novel disentanglement and data augmentation techniques, HOOD can effectively deal with OOD examples in unknown and open environments; its effectiveness is empirically validated in three typical OOD applications: OOD detection, open-set semi-supervised learning, and open-set domain adaptation.

Prediction and optimization of NaV1.7 inhibitors based on machine learning methods

Nov 29, 2019

Weikaixin Kong, Xinyu Tu, Zhengwei Xie, Zhuo Huang

Abstract: Objectives: NaV1.7 is a key target related to pain. This study focused on predicting and optimizing inhibitors of NaV1.7 using machine learning methods, and on validating them at the cellular level using patch-clamp methods. Materials and Methods: We used ChEMBL, BindingDB, and data from the literature to establish classification models for inhibitors. The imbalanced data set test2 was used to select the best-performing model, which was then used to screen commercial compound libraries, and whole-cell voltage-clamp was used to validate inhibitors. We propose a molecular group optimization method using a combination of a Grammar Variational Autoencoder, a classification model, and a simulated annealing algorithm. Results and Conclusion: The RF-CDK model performed best on the imbalanced data set. Of the three compounds that may have inhibitory effects, Nortriptyline has been experimentally verified. In the molecular optimization method, the best optimization result for CHEMBL2325245 was MS = 1.052, PROB = 0.527, SA = 2.587, QED = 0.462. Forty molecules located in the applicability domain of RF-CDK were used for optimization, among which 34 molecules gave larger MS values.
