Siqi Yang

Proof-RM: A Scalable and Generalizable Reward Model for Math Proof

Feb 02, 2026
Via arXiv

V-FAT: Benchmarking Visual Fidelity Against Text-bias

Jan 08, 2026
Via arXiv

MeniMV: A Multi-view Benchmark for Meniscus Injury Severity Grading

Dec 20, 2025
Via arXiv

Learning When to Look: A Disentangled Curriculum for Strategic Perception in Multimodal Reasoning

Dec 19, 2025
Via arXiv

RecGPT-V2 Technical Report

Dec 16, 2025
Via arXiv

Audio-sync Video Instance Editing with Granularity-Aware Mask Refiner

Dec 11, 2025
Via arXiv

Metis-HOME: Hybrid Optimized Mixture-of-Experts for Multimodal Reasoning

Oct 23, 2025
Via arXiv

DocTron-Formula: Generalized Formula Recognition in Complex and Structured Scenarios

Aug 01, 2025
Via arXiv

PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms

May 28, 2025
Via arXiv

UniViTAR: Unified Vision Transformer with Native Resolution

Apr 02, 2025
Via arXiv