
Tianzhe Chu

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Jan 28, 2025

White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?

Nov 24, 2023

Emergence of Segmentation with Minimalistic White-Box Transformers

Aug 30, 2023

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

Jun 09, 2023

White-Box Transformers via Sparse Rate Reduction

Jun 01, 2023