Chenyan Xiong

Microsoft Research

RAGViz: Diagnose and Visualize Retrieval-Augmented Generation

Nov 04, 2024
Via arXiv

Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning

Oct 18, 2024
Via arXiv

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards

Oct 17, 2024
Via arXiv

Harnessing Webpage UIs for Text-Rich Visual Understanding

Oct 17, 2024
Via arXiv

Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation

Jul 21, 2024
Via arXiv

In-Context Probing Approximates Influence Function for Data Valuation

Jul 17, 2024
Via arXiv

ResearchArena: Benchmarking LLMs' Ability to Collect and Organize Information as Research Agents

Jun 13, 2024
Via arXiv

MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models

Jun 10, 2024
Via arXiv

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

May 13, 2024
Via arXiv

Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval

Apr 05, 2024
Via arXiv