Tongshuang Wu

Orbit: A Framework for Designing and Evaluating Multi-objective Rankers

Nov 07, 2024
Via arXiv

HiMemFormer: Hierarchical Memory-Aware Transformer for Multi-Agent Action Anticipation

Nov 03, 2024
Via arXiv

What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing

Sep 14, 2024
Via arXiv

What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs

Sep 13, 2024
Via arXiv

SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning

Jul 16, 2024
Via arXiv

Synthetic Multimodal Question Generation

Jul 02, 2024
Via arXiv

WebCanvas: Benchmarking Web Agents in Online Environments

Jun 18, 2024
Via arXiv

Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness

May 04, 2024
Via arXiv

Better Synthetic Data by Retrieving and Transforming Existing Datasets

Apr 26, 2024
Via arXiv

Evaluating Mathematical Reasoning Beyond Accuracy

Apr 08, 2024
Via arXiv