Bohan Li

Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding

Oct 29, 2024

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models

Oct 23, 2024

TAPTRv2: Attention-based Position Update Improves Tracking Any Point

Jul 23, 2024

On the Effectiveness of Acoustic BPE in Decoder-Only TTS

Jul 04, 2024

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Jul 02, 2024

Extreme Video Compression with Pre-trained Diffusion Models

Feb 14, 2024

Closed-Loop Unsupervised Representation Disentanglement with β-VAE Distillation and Diffusion Probabilistic Feedback

Feb 04, 2024

Self-Supervised Dynamic Hypergraph Recommendation based on Hyper-Relational Knowledge Graph

Aug 15, 2023

One at A Time: Multi-step Volumetric Probability Distribution Diffusion for Depth Estimation

Jul 07, 2023

EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model

Jun 20, 2023