Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiaofeng Lin

ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Aug 06, 2025

Yihua Shao, Xiaofeng Lin, Xinwei Long, Siyu Chen, Minxi Yan, Yang Liu, Ziyang Yan, Ao Ma, Hao Tang, Jingcai Guo

Figure 1 for ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Figure 2 for ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Figure 3 for ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Figure 4 for ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Abstract:Enabling multi-task adaptation in pre-trained Low-Rank Adaptation (LoRA) models is crucial for enhancing their generalization capabilities. Most existing pre-trained LoRA fusion methods decompose weight matrices, sharing similar parameters while merging divergent ones. However, this paradigm inevitably induces inter-weight conflicts and leads to catastrophic domain forgetting. While incremental learning enables adaptation to multiple tasks, it struggles to achieve generalization in few-shot scenarios. Consequently, when the weight data follows a long-tailed distribution, it can lead to forgetting in the fused weights. To address this issue, we propose In-Context Meta LoRA Fusion (ICM-Fusion), a novel framework that synergizes meta-learning with in-context adaptation. The key innovation lies in our task vector arithmetic, which dynamically balances conflicting optimization directions across domains through learned manifold projections. ICM-Fusion obtains the optimal task vector orientation for the fused model in the latent space by adjusting the orientation of the task vectors. Subsequently, the fused LoRA is reconstructed by a self-designed Fusion VAE (F-VAE) to realize multi-task LoRA generation. We have conducted extensive experiments on visual and linguistic tasks, and the experimental results demonstrate that ICM-Fusion can be adapted to a wide range of architectural models and applied to various tasks. Compared to the current pre-trained LoRA fusion method, ICM-Fusion fused LoRA can significantly reduce the multi-tasking loss and can even achieve task enhancement in few-shot scenarios.

Via

Access Paper or Ask Questions

Risk In Context: Benchmarking Privacy Leakage of Foundation Models in Synthetic Tabular Data Generation

Jul 22, 2025

Jessup Byun, Xiaofeng Lin, Joshua Ward, Guang Cheng

Abstract:Synthetic tabular data is essential for machine learning workflows, especially for expanding small or imbalanced datasets and enabling privacy-preserving data sharing. However, state-of-the-art generative models (GANs, VAEs, diffusion models) rely on large datasets with thousands of examples. In low-data settings, often the primary motivation for synthetic data, these models can overfit, leak sensitive records, and require frequent retraining. Recent work uses large pre-trained transformers to generate rows via in-context learning (ICL), which needs only a few seed examples and no parameter updates, avoiding retraining. But ICL repeats seed rows verbatim, introducing a new privacy risk that has only been studied in text. The severity of this risk in tabular synthesis-where a single row may identify a person-remains unclear. We address this gap with the first benchmark of three foundation models (GPT-4o-mini, LLaMA 3.3 70B, TabPFN v2) against four baselines on 35 real-world tables from health, finance, and policy. We evaluate statistical fidelity, downstream utility, and membership inference leakage. Results show foundation models consistently have the highest privacy risk. LLaMA 3.3 70B reaches up to 54 percentage points higher true-positive rate at 1% FPR than the safest baseline. GPT-4o-mini and TabPFN are also highly vulnerable. We plot the privacy-utility frontier and show that CTGAN and GPT-4o-mini offer better tradeoffs. A factorial study finds that three zero-cost prompt tweaks-small batch size, low temperature, and using summary statistics-can reduce worst-case AUC by 14 points and rare-class leakage by up to 39 points while maintaining over 90% fidelity. Our benchmark offers a practical guide for safer low-data synthesis with foundation models.

* Accepted by Agentic & GenAI Evaluation KDD2025, poster presentation

Via

Access Paper or Ask Questions

Autonomous Robotic Bone Micro-Milling System with Automatic Calibration and 3D Surface Fitting

Mar 06, 2025

Enduo Zhao, Xiaofeng Lin, Yifan Wang, Kanako Harada

Abstract:Automating bone micro-milling using a robotic system presents challenges due to the uncertainties in both the external and internal features of bone tissue. For example, during a mouse cranial window creation, a circular path with a radius of 2 to 4 mm needs to be milled on the mouse skull using a microdrill. The uneven surface and non-uniform thickness of the mouse skull make it difficult to fully automate this process, requiring the system to possess advanced perceptual and adaptive capabilities. In this study, we propose an automatic calibration and 3D surface fitting method and integrate it into an autonomous robotic bone micro-milling system, enabling it to quickly, in real-time, and accurately perceive and adapt to the uneven surface and non-uniform thickness of the target without human assistance. Validation experiments on euthanized mice demonstrate that the improved system achieves a success rate of 85.7 % and an average milling time of 2.1 minutes, showing not only significant performance improvements over the previous system but also exceptional accuracy, speed, and stability compared to human operators.

* 8 pages, 8 figures, submitted to IROS 2025

Via

Access Paper or Ask Questions

Object State Estimation Through Robotic Active Interaction for Biological Autonomous Drilling

Mar 06, 2025

Xiaofeng Lin, Enduo Zhao, Saúl Alexis Heredia Pérez, Kanako Harada

Figure 1 for Object State Estimation Through Robotic Active Interaction for Biological Autonomous Drilling

Figure 2 for Object State Estimation Through Robotic Active Interaction for Biological Autonomous Drilling

Figure 3 for Object State Estimation Through Robotic Active Interaction for Biological Autonomous Drilling

Figure 4 for Object State Estimation Through Robotic Active Interaction for Biological Autonomous Drilling

Abstract:Estimating the state of biological specimens is challenging due to limited observation through microscopic vision. For instance, during mouse skull drilling, the appearance alters little when thinning bone tissue because of its semi-transparent property and the high-magnification microscopic vision. To obtain the object's state, we introduce an object state estimation method for biological specimens through active interaction based on the deflection. The method is integrated to enhance the autonomous drilling system developed in our previous work. The method and integrated system were evaluated through 12 autonomous eggshell drilling experiment trials. The results show that the system achieved a 91.7% successful ratio and 75% detachable ratio, showcasing its potential applicability in more complex surgical procedures such as mouse skull craniotomy. This research paves the way for further development of autonomous robotic systems capable of estimating the object's state through active interaction.

* The first and second authors contribute equally to this research. 6 pages, 5 figures, submitted to RA-L

Via

Access Paper or Ask Questions

A Cranial-Feature-Based Registration Scheme for Robotic Micromanipulation Using a Microscopic Stereo Camera System

Oct 24, 2024

Xiaofeng Lin, Saúl Alexis Heredia Pérez, Kanako Harada

Figure 1 for A Cranial-Feature-Based Registration Scheme for Robotic Micromanipulation Using a Microscopic Stereo Camera System

Figure 2 for A Cranial-Feature-Based Registration Scheme for Robotic Micromanipulation Using a Microscopic Stereo Camera System

Figure 3 for A Cranial-Feature-Based Registration Scheme for Robotic Micromanipulation Using a Microscopic Stereo Camera System

Figure 4 for A Cranial-Feature-Based Registration Scheme for Robotic Micromanipulation Using a Microscopic Stereo Camera System

Abstract:Biological specimens exhibit significant variations in size and shape, challenging autonomous robotic manipulation. We focus on the mouse skull window creation task to illustrate these challenges. The study introduces a microscopic stereo camera system (MSCS) enhanced by the linear model for depth perception. Alongside this, a precise registration scheme is developed for the partially exposed mouse cranial surface, employing a CNN-based constrained and colorized registration strategy. These methods are integrated with the MSCS for robotic micromanipulation tasks. The MSCS demonstrated a high precision of 0.10 mm $\pm$ 0.02 mm measured in a step height experiment and real-time performance of 30 FPS in 3D reconstruction. The registration scheme proved its precision, with a translational error of 1.13 mm $\pm$ 0.31 mm and a rotational error of 3.38$^{\circ}$ $\pm$ 0.89$^{\circ}$ tested on 105 continuous frames with an average speed of 1.60 FPS. This study presents the application of a MSCS and a novel registration scheme in enhancing the precision and accuracy of robotic micromanipulation in scientific and surgical settings. The innovations presented here offer automation methodology in handling the challenges of microscopic manipulation, paving the way for more accurate, efficient, and less invasive procedures in various fields of microsurgery and scientific research.

* Accepted by Advanced Robotics, Vol. 38, Issue 21

Via

Access Paper or Ask Questions

CTSyn: A Foundational Model for Cross Tabular Data Generation

Jun 07, 2024

Xiaofeng Lin, Chenheng Xu, Matthew Yang, Guang Cheng

Figure 1 for CTSyn: A Foundational Model for Cross Tabular Data Generation

Figure 2 for CTSyn: A Foundational Model for Cross Tabular Data Generation

Figure 3 for CTSyn: A Foundational Model for Cross Tabular Data Generation

Figure 4 for CTSyn: A Foundational Model for Cross Tabular Data Generation

Abstract:Generative Foundation Models (GFMs) have produced synthetic data with remarkable quality in modalities such as images and text. However, applying GFMs to tabular data poses significant challenges due to the inherent heterogeneity of table features. Existing cross-table learning frameworks are hindered by the absence of both a generative model backbone and a decoding mechanism for heterogeneous feature values. To overcome these limitations, we introduce the Cross-Table Synthesizer (CTSyn), a diffusion-based foundational model tailored for tabular data generation. CTSyn introduces three major components: an aggregator that consolidates heterogeneous tables into a unified latent space; a conditional latent diffusion model for sampling from this space; and type-specific decoders that reconstruct values of varied data types from sampled latent vectors. Extensive testing on real-world datasets reveals that CTSyn not only significantly outperforms existing table synthesizers in utility and diversity, but also uniquely enhances performances of downstream machine learning beyond what is achievable with real data, thus establishing a new paradigm for synthetic data generation.

Via

Access Paper or Ask Questions

Evaluating Saliency Explanations in NLP by Crowdsourcing

May 17, 2024

Xiaotian Lu, Jiyi Li, Zhen Wan, Xiaofeng Lin, Koh Takeuchi, Hisashi Kashima

Figure 1 for Evaluating Saliency Explanations in NLP by Crowdsourcing

Figure 2 for Evaluating Saliency Explanations in NLP by Crowdsourcing

Figure 3 for Evaluating Saliency Explanations in NLP by Crowdsourcing

Figure 4 for Evaluating Saliency Explanations in NLP by Crowdsourcing

Abstract:Deep learning models have performed well on many NLP tasks. However, their internal mechanisms are typically difficult for humans to understand. The development of methods to explain models has become a key issue in the reliability of deep learning models in many important applications. Various saliency explanation methods, which give each feature of input a score proportional to the contribution of output, have been proposed to determine the part of the input which a model values most. Despite a considerable body of work on the evaluation of saliency methods, whether the results of various evaluation metrics agree with human cognition remains an open question. In this study, we propose a new human-based method to evaluate saliency methods in NLP by crowdsourcing. We recruited 800 crowd workers and empirically evaluated seven saliency methods on two datasets with the proposed method. We analyzed the performance of saliency methods, compared our results with existing automated evaluation methods, and identified notable differences between NLP and computer vision (CV) fields when using saliency methods. The instance-level data of our crowdsourced experiments and the code to reproduce the explanations are available at https://github.com/xtlu/lreccoling_evaluation.

* 13 pages, 4 figures, Accepted for LREC-Coling 2024 (Oral)

Via

Access Paper or Ask Questions

Benefits of Transformer: In-Context Learning in Linear Regression Tasks with Unstructured Data

Feb 01, 2024

Yue Xing, Xiaofeng Lin, Namjoon Suh, Qifan Song, Guang Cheng

Abstract:In practice, it is observed that transformer-based models can learn concepts in context in the inference stage. While existing literature, e.g., \citet{zhang2023trained,huang2023context}, provide theoretical explanations on this in-context learning ability, they assume the input $x_i$ and the output $y_i$ for each sample are embedded in the same token (i.e., structured data). However, in reality, they are presented in two tokens (i.e., unstructured data \cite{wibisono2023role}). In this case, this paper conducts experiments in linear regression tasks to study the benefits of the architecture of transformers and provides some corresponding theoretical intuitions to explain why the transformer can learn from unstructured data. We study the exact components in a transformer that facilitate the in-context learning. In particular, we observe that (1) a transformer with two layers of softmax (self-)attentions with look-ahead attention mask can learn from the prompt if $y_i$ is in the token next to $x_i$ for each example; (2) positional encoding can further improve the performance; and (3) multi-head attention with a high input embedding dimension has a better prediction performance than single-head attention.

Via

Access Paper or Ask Questions

Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective

Jan 26, 2024

Yue Xing, Xiaofeng Lin, Qifan Song, Yi Xu, Belinda Zeng, Guang Cheng

Abstract:Pre-training is known to generate universal representations for downstream tasks in large-scale deep learning such as large language models. Existing literature, e.g., \cite{kim2020adversarial}, empirically observe that the downstream tasks can inherit the adversarial robustness of the pre-trained model. We provide theoretical justifications for this robustness inheritance phenomenon. Our theoretical results reveal that feature purification plays an important role in connecting the adversarial robustness of the pre-trained model and the downstream tasks in two-layer neural networks. Specifically, we show that (i) with adversarial training, each hidden node tends to pick only one (or a few) feature; (ii) without adversarial training, the hidden nodes can be vulnerable to attacks. This observation is valid for both supervised pre-training and contrastive learning. With purified nodes, it turns out that clean training is enough to achieve adversarial robustness in downstream tasks.

* To appear in AISTATS2024

Via

Access Paper or Ask Questions

AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing

Oct 24, 2023

Namjoon Suh, Xiaofeng Lin, Din-Yin Hsieh, Merhdad Honarkhah, Guang Cheng

Abstract:Diffusion model has become a main paradigm for synthetic data generation in many subfields of modern machine learning, including computer vision, language model, or speech synthesis. In this paper, we leverage the power of diffusion model for generating synthetic tabular data. The heterogeneous features in tabular data have been main obstacles in tabular data synthesis, and we tackle this problem by employing the auto-encoder architecture. When compared with the state-of-the-art tabular synthesizers, the resulting synthetic tables from our model show nice statistical fidelities to the real data, and perform well in downstream tasks for machine learning utilities. We conducted the experiments over 15 publicly available datasets. Notably, our model adeptly captures the correlations among features, which has been a long-standing challenge in tabular data synthesis. Our code is available upon request and will be publicly released if paper is accepted.

Via

Access Paper or Ask Questions