Wen-tau Yih

CRAG-MM: Multi-modal Multi-turn Comprehensive RAG Benchmark

Oct 30, 2025
Via arXiv

SCRIBES: Web-Scale Script-Based Semi-Structured Data Extraction with Reinforcement Learning

Oct 02, 2025
Via arXiv

Learning to Reason for Factuality

Aug 07, 2025
Via arXiv

MetaCLIP 2: A Worldwide Scaling Recipe

Jul 29, 2025
Via arXiv

FlexOlmo: Open Language Models for Flexible Data Use

Jul 09, 2025
Via arXiv

ConfQA: Answer Only If You Are Confident

Jun 08, 2025
Via arXiv

ReasonIR: Training Retrievers for Reasoning Tasks

Apr 29, 2025
Via arXiv

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

Feb 25, 2025
Via arXiv

Data-Efficient Pretraining with Group-Level Data Influence Modeling

Feb 20, 2025
Via arXiv

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Feb 13, 2025
Via arXiv