Yifan Song

Harnessing Webpage UIs for Text-Rich Visual Understanding

Oct 17, 2024
Via arXiv

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Oct 17, 2024
Via arXiv

AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories

Oct 10, 2024
Via arXiv

Urban Region Pre-training and Prompting: A Graph-based Approach

Aug 12, 2024
Via arXiv

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism

Jul 15, 2024
Via arXiv

Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement

Jun 17, 2024
Via arXiv

LongEmbed: Extending Embedding Models for Long Context Retrieval

Apr 18, 2024
Via arXiv

VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?

Apr 09, 2024
Via arXiv

CoUDA: Coherence Evaluation via Unified Data Augmentation

Mar 31, 2024
Via arXiv

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Mar 04, 2024
Via arXiv