Tianle Li

BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries

Mar 16, 2025

On the Robustness of Transformers against Context Hijacking for Linear Classification

Feb 21, 2025

Prompt-to-Leaderboard

Feb 20, 2025

Project MPG: towards a generalized performance benchmark for LLM capabilities

Oct 28, 2024

How to Evaluate Reward Models for RLHF

Oct 18, 2024

Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development

Oct 15, 2024

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Jun 17, 2024

GenAI Arena: An Open Evaluation Platform for Generative Models

Jun 06, 2024

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Jun 04, 2024

Long-context LLMs Struggle with Long In-context Learning

Apr 04, 2024