Zhengfu He

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Oct 27, 2024

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Oct 10, 2024

Automatically Identifying Local and Global Circuits with Linear Computation Graphs

May 22, 2024

Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT

Feb 19, 2024

Can AI Assistants Know What They Don't Know?

Jan 28, 2024

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

Nov 30, 2022

Multi-Task Pre-Training of Modular Prompt for Few-Shot Learning

Oct 14, 2022

BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning

May 23, 2022