Yancheng He

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models

Dec 23, 2024

Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation

Dec 19, 2024

WiS Platform: Enhancing Evaluation of LLM-Based Multi-Agent Systems Through Game-Based Analysis

Dec 04, 2024

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models

Nov 13, 2024

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision

Oct 25, 2024

GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models

Jun 20, 2024

Aiming at the Target: Filter Collaborative Information for Cross-Domain Recommendation

Mar 29, 2024

MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues

Feb 22, 2024

Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition

Feb 27, 2023

Modeling User Repeat Consumption Behavior for Online Novel Recommendation

Sep 05, 2022