Difan Zou

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Jan 09, 2025
Via arXiv

Parallelized Autoregressive Visual Generation

Dec 19, 2024
Via arXiv

On the Feature Learning in Diffusion Models

Dec 02, 2024
Via arXiv

Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability

Nov 29, 2024
Via arXiv

An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models

Nov 26, 2024
Via arXiv

How Does Critical Batch Size Scale in Pre-training?

Oct 29, 2024
Via arXiv

Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers

Oct 24, 2024
Via arXiv

How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression

Aug 08, 2024
Via arXiv

Extracting Training Data from Unconditional Diffusion Models

Jun 18, 2024
Via arXiv

Explainable Bayesian Recurrent Neural Smoother to Capture Global State Evolutionary Correlations

Jun 17, 2024
Via arXiv