Hyeonseok Moon

Publications

The Impact of Negated Text on Hallucination with Large Language Models (Oct 23, 2025)

Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models (Oct 09, 2025)

NeedleChain: Measuring Intact Long-Context Reasoning Capability of Large Language Models (Jul 30, 2025)

Cross-Lingual Optimization for Language Transfer in Large Language Models (May 20, 2025)

Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer (May 16, 2025)

MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation (Apr 23, 2025)

Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning (Apr 07, 2025)

FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models (Mar 25, 2025)

Call for Rigor in Reporting Quality of Instruction Tuning Data (Mar 04, 2025)

Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models (Dec 27, 2024)