Jianhao Yan

Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels

Nov 21, 2024

Keys to Robust Edits: from Theoretical Insights to Practical Advances

Oct 12, 2024

ELICIT: LLM Augmentation via External In-Context Capability

Oct 12, 2024

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

Aug 16, 2024

GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

Jul 04, 2024

What Have We Achieved on Non-autoregressive Translation?

May 21, 2024

RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models

Feb 22, 2024

Potential and Challenges of Model Editing for Social Debiasing

Feb 21, 2024

Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace

Oct 30, 2023

Understanding In-Context Learning from Repetitions

Oct 10, 2023