Xingyu Zhang

DexTac: Learning Contact-aware Visuotactile Policies via Hand-by-hand Teaching
Jan 29, 2026

Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition
Jan 18, 2026

Causality-Aware Temporal Projection for Video Understanding in Video-LLMs
Jan 05, 2026

Cross-platform Product Matching Based on Entity Alignment of Knowledge Graph with RAEA model
Dec 08, 2025

Beyond ReAct: A Planner-Centric Framework for Complex Tool-Augmented LLM Reasoning
Nov 13, 2025

CAIFormer: A Causal Informed Transformer for Multivariate Time Series Forecasting
May 22, 2025

SLOT: Sample-specific Language Model Optimization at Test-time
May 18, 2025

GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving
Mar 07, 2025

Don't Shake the Wheel: Momentum-Aware Planning in End-to-End Autonomous Driving
Mar 05, 2025

AVE Speech Dataset: A Comprehensive Benchmark for Multi-Modal Speech Recognition Integrating Audio, Visual, and Electromyographic Signals
Jan 28, 2025