
Yue Huang

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Mar 24, 2025

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Mar 19, 2025

Bridging Synthetic-to-Real Gaps: Frequency-Aware Perturbation and Selection for Single-shot Multi-Parametric Mapping Reconstruction

Mar 05, 2025

Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis

Feb 19, 2025

Breaking Focus: Contextual Distraction Curse in Large Language Models

Feb 03, 2025

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Feb 03, 2025

STAMPsy: Towards SpatioTemporal-Aware Mixed-Type Dialogues for Psychological Counseling

Dec 21, 2024

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Nov 26, 2024

Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?

Oct 30, 2024

AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?

Oct 29, 2024