Yuexiang Zhai

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Jan 28, 2025

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

May 17, 2024

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

Feb 24, 2024

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

Jan 11, 2024

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

Nov 30, 2023

White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?

Nov 24, 2023

RLIF: Interactive Imitation Learning as Reinforcement Learning

Nov 21, 2023

Investigating the Catastrophic Forgetting in Multimodal Large Language Models

Sep 26, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

Mar 09, 2023

Closed-Loop Transcription via Convolutional Sparse Coding

Feb 18, 2023