Pengfei Liu

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Nov 25, 2024
Via arXiv

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

Oct 24, 2024
Via arXiv

Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework for LLMs

Oct 15, 2024
Via arXiv

ECon: On the Detection and Resolution of Evidence Conflicts

Oct 05, 2024
Via arXiv

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Sep 25, 2024
Via arXiv

RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation

Aug 15, 2024
Via arXiv

OpenResearcher: Unleashing AI for Accelerated Scientific Research

Aug 13, 2024
Via arXiv

Data Contamination Report from the 2024 CONDA Shared Task

Jul 31, 2024
Via arXiv

OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance

Jul 30, 2024
Via arXiv

SAFETY-J: Evaluating Safety with Critique

Jul 25, 2024
Via arXiv