Yan Lu

SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Apr 08, 2025
Via arXiv

GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Mar 24, 2025
Via arXiv

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations

Mar 15, 2025
Via arXiv

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Mar 08, 2025
Via arXiv

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Mar 03, 2025
Via arXiv

Towards Practical Real-Time Neural Video Compression

Feb 28, 2025
Via arXiv

UVRM: A Scalable 3D Reconstruction Model from Unposed Videos

Jan 16, 2025
Via arXiv

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Dec 23, 2024
Via arXiv

GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs

Dec 22, 2024
Via arXiv

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024
Via arXiv