Matthew Chen

Low-Rank Adapting Models for Sparse Autoencoders

Jan 31, 2025

Matthew Chen, Joshua Engels, Max Tegmark

Abstract: Sparse autoencoders (SAEs) decompose language model representations into a sparse set of linear latent vectors. Recent work has improved SAEs using language model gradients, but these techniques require many expensive backward passes during training and still cause a significant increase in cross entropy loss when SAE reconstructions are inserted into the model. In this work, we address these limitations by taking a fundamentally different approach: we use low-rank adaptation (LoRA) to finetune the language model itself around a previously trained SAE. We analyze our method across SAE sparsity, SAE width, language model size, LoRA rank, and model layer on the Gemma Scope family of SAEs. In these settings, our method reduces the cross entropy loss gap by 30% to 55% when SAEs are inserted during the forward pass. We also find that compared to end-to-end (e2e) SAEs, our approach achieves the same downstream cross entropy loss 3$\times$ to 20$\times$ faster on Gemma-2-2B and 2$\times$ to 10$\times$ faster on Llama-3.2-1B. We further show that our technique improves downstream metrics and can adapt multiple SAEs at once. Our results demonstrate that improving model interpretability is not limited to post-hoc SAE training; Pareto improvements can also be achieved by directly optimizing the model itself.
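To make the headline metric concrete, here is a minimal sketch of how a "cross entropy loss gap" reduction could be computed. It assumes the gap is defined as the cross entropy with the SAE inserted minus the cross entropy of the unmodified model; that definition, the function names, and the numbers are illustrative assumptions, not taken from the paper or its code.

```python
def loss_gap(ce_with_sae: float, ce_original: float) -> float:
    """Extra cross entropy incurred when SAE reconstructions are inserted.

    Assumed definition: CE(model with SAE) - CE(original model).
    """
    return ce_with_sae - ce_original


def gap_reduction(ce_original: float, ce_sae_before: float, ce_sae_after: float) -> float:
    """Fraction of the loss gap closed after adapting the model around the SAE."""
    gap_before = loss_gap(ce_sae_before, ce_original)
    gap_after = loss_gap(ce_sae_after, ce_original)
    if gap_before <= 0:
        raise ValueError("the SAE must increase the loss for a gap to exist")
    return 1.0 - gap_after / gap_before


# Hypothetical losses (in nats), chosen only to illustrate the arithmetic.
reduction = gap_reduction(ce_original=2.0, ce_sae_before=2.5, ce_sae_after=2.3)
print(f"loss gap reduced by {reduction:.0%}")
```

With these made-up values the gap shrinks from 0.5 to 0.3, which falls inside the 30% to 55% range quoted in the abstract.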

* Code available at https://github.com/matchten/LoRA-Models-for-SAEs

Robotic Defect Inspection with Visual and Tactile Perception for Large-scale Components

Sep 08, 2023

Arpit Agarwal, Abhiroop Ajith, Chengtao Wen, Veniamin Stryzheus, Brian Miller, Matthew Chen, Micah K. Johnson, Jose Luis Susa Rincon, Justinian Rosca, Wenzhen Yuan

Abstract: In manufacturing processes, surface inspection is a key requirement for quality assessment and damage localization. As a result, automated surface anomaly detection has become a promising area of research in various industrial inspection systems. A particular challenge in industries with large-scale components, like aircraft and heavy machinery, is inspecting large parts with very small defect dimensions. Moreover, these parts can be curved. To address this challenge, we present a two-stage multi-modal inspection pipeline with visual and tactile sensing. Our approach combines the best of both visual and tactile sensing by identifying and localizing defects using a global view (vision) and then tactilely scanning the localized areas to identify the remaining defects. To benchmark our approach, we propose a novel real-world dataset with multiple metallic defect types per image, collected in production environments on real aerospace manufacturing parts, as well as online robot experiments in two environments. Our approach identifies 85% of defects in Stage I and 100% of defects after Stage II. The dataset is publicly available at https://zenodo.org/record/8327713

* This is a preprint of a publication at the International Conference on Intelligent Robots and Systems 2023
