Chenghao Yang

DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains

Nov 14, 2025

Optimizing Diversity and Quality through Base-Aligned Model Collaboration

Nov 07, 2025

FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning

Sep 16, 2025

Tokenized Bandit for LLM Decoding and Alignment

Jun 08, 2025

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation

May 27, 2025

Grounded Persuasive Language Generation for Automated Marketing

Feb 24, 2025

CryptoX : Compositional Reasoning Evaluation of Large Language Models

Feb 08, 2025

Hello Again! LLM-powered Personalized Agent for Long-term Dialogue

Jun 09, 2024

Equipping Transformer with Random-Access Reading for Long-Context Understanding

May 21, 2024

When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models

Apr 14, 2024