Long Ma

Simulating Human-like Daily Activities with Desire-driven Autonomy

Dec 09, 2024

FreeCodec: A disentangled neural speech codec with fewer tokens

Dec 02, 2024

HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning

Nov 29, 2024

PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors

Nov 28, 2024

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution

Nov 19, 2024

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Nov 01, 2024

Monge-Ampere Regularization for Learning Arbitrary Shapes from Point Clouds

Oct 24, 2024

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Aug 09, 2024

Seeing Text in the Dark: Algorithm and Benchmark

Apr 13, 2024