Lei Hou

MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark

Mar 10, 2025

Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models

Feb 27, 2025

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Feb 26, 2025

LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models

Feb 20, 2025

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Dec 19, 2024

EventSum: A Large-Scale Event-Centric Summarization Dataset for Chinese Multi-News Documents

Dec 16, 2024

AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning

Nov 25, 2024

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Oct 31, 2024

LongReward: Improving Long-context Large Language Models with AI Feedback

Oct 28, 2024

RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

Oct 21, 2024