Hritik Bansal

MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Dec 17, 2024

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

Aug 29, 2024

Generative Verifiers: Reward Modeling as Next-Token Prediction

Aug 27, 2024

Towards a Holistic Framework for Multimodal Large Language Models in Three-dimensional Brain CT Report Generation

Jul 02, 2024

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024

VideoPhy: Evaluating Physical Commonsense for Video Generation

Jun 05, 2024

TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

May 07, 2024

GenEARL: A Training-Free Generative Framework for Multimodal Event Argument Role Labeling

Apr 07, 2024

Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation

Apr 02, 2024

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization

Mar 31, 2024