Pinjia He

MicLog: Towards Accurate and Efficient LLM-based Log Parsing via Progressive Meta In-Context Learning

Jan 11, 2026
Via arXiv

CLEANet: Robust and Efficient Anomaly Detection in Contaminated Multivariate Time Series

Oct 26, 2025
Via arXiv

Scalable Supervising Software Agents with Patch Reasoner

Oct 26, 2025
Via arXiv

Curing Miracle Steps in LLM Mathematical Reasoning with Rubric Rewards

Oct 09, 2025
Via arXiv

UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench

Jun 10, 2025
Via arXiv

Towards Evaluating Proactive Risk Awareness of Multimodal Language Models

May 23, 2025
Via arXiv

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

May 19, 2025
Via arXiv

VisFactor: Benchmarking Fundamental Visual Cognition in Multimodal Large Language Models

Feb 23, 2025
Via arXiv

How Should I Build A Benchmark?

Jan 18, 2025
Via arXiv

Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs

Oct 15, 2024
Via arXiv