Long Ma

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Nov 01, 2024
Via arXiv

Monge-Ampere Regularization for Learning Arbitrary Shapes from Point Clouds

Oct 24, 2024
Via arXiv

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024
Via arXiv

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Aug 09, 2024
Via arXiv

Seeing Text in the Dark: Algorithm and Benchmark

Apr 13, 2024
Via arXiv

DeCoF: Generated Video Detection via Frame Consistency

Feb 06, 2024
Via arXiv

Fast Peer Adaptation with Context-aware Exploration

Feb 04, 2024
Via arXiv

From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Dec 31, 2023
Via arXiv

3D-GOI: 3D GAN Omni-Inversion for Multifaceted and Multi-object Editing

Nov 23, 2023
Via arXiv

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Sep 07, 2023
Via arXiv