Ethan Chern

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Nov 25, 2024
Via arXiv

Halu-J: Critique-Based Hallucination Judge

Jul 17, 2024
Via arXiv

ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

Jul 08, 2024
Via arXiv

BeHonest: Benchmarking Honesty of Large Language Models

Jun 19, 2024
Via arXiv

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Jun 18, 2024
Via arXiv

Reformatted Alignment

Feb 19, 2024
Via arXiv

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate

Jan 30, 2024
Via arXiv

Align on the Fly: Adapting Chatbot Behavior to Established Norms

Dec 26, 2023
Via arXiv

Alignment for Honesty

Dec 12, 2023
Via arXiv