Yun Luo

RefuteBench 2.0 -- Agentic Benchmark for Dynamic Evaluation of LLM Responses to Refutation Instruction

Feb 25, 2025

PerSphere: A Comprehensive Framework for Multi-Faceted Perspective Retrieval and Summarization

Dec 17, 2024

Task Calibration: Calibrating Large Language Models on Inference Tasks

Oct 24, 2024

Keys to Robust Edits: from Theoretical Insights to Practical Advances

Oct 12, 2024

OpenResearcher: Unleashing AI for Accelerated Scientific Research

Aug 13, 2024

RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models

May 23, 2024

Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers

Apr 18, 2024

RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models

Feb 22, 2024

XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners

Oct 09, 2023

Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information

Oct 08, 2023