Yuan Zhang

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations

Mar 15, 2025

TGP: Two-modal occupancy prediction with 3D Gaussian and sparse points for 3D Environment Awareness

Mar 13, 2025

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Mar 03, 2025

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete

Feb 28, 2025

Joint Registration and Conformal Prediction for Partially Observed Functional Data

Feb 20, 2025

A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models

Feb 11, 2025

Inverse Reinforcement Learning via Convex Optimization

Jan 27, 2025

EDNet: Edge-Optimized Small Target Detection in UAV Imagery -- Faster Context Attention, Better Feature Fusion, and Hardware Acceleration

Jan 10, 2025

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

Jan 03, 2025

DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images

Dec 25, 2024