Picture for Jiaheng Liu

Jiaheng Liu

Aligning Instruction Tuning with Pre-training

Add code
Jan 16, 2025
Viaarxiv icon

Molecular Graph Contrastive Learning with Line Graph

Add code
Jan 15, 2025
Viaarxiv icon

ProgCo: Program Helps Self-Correction of Large Language Models

Add code
Jan 02, 2025
Viaarxiv icon

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models

Add code
Dec 23, 2024
Viaarxiv icon

PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models

Add code
Dec 10, 2024
Viaarxiv icon

PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos

Add code
Dec 02, 2024
Figure 1 for PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Figure 2 for PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Figure 3 for PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Figure 4 for PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Viaarxiv icon

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models

Add code
Nov 13, 2024
Figure 1 for Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
Figure 2 for Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
Figure 3 for Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
Figure 4 for Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
Viaarxiv icon

MdEval: Massively Multilingual Code Debugging

Add code
Nov 04, 2024
Figure 1 for MdEval: Massively Multilingual Code Debugging
Figure 2 for MdEval: Massively Multilingual Code Debugging
Figure 3 for MdEval: Massively Multilingual Code Debugging
Figure 4 for MdEval: Massively Multilingual Code Debugging
Viaarxiv icon

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Add code
Oct 29, 2024
Figure 1 for AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Figure 2 for AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Figure 3 for AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Figure 4 for AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Viaarxiv icon

M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation

Add code
Oct 28, 2024
Figure 1 for M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Figure 2 for M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Figure 3 for M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Figure 4 for M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Viaarxiv icon