Jinfa Huang

A Survey of Camouflaged Object Detection and Beyond

Aug 26, 2024
Via arXiv

MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval

Aug 20, 2024
Via arXiv

Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection

Jul 30, 2024
Via arXiv

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Jun 26, 2024
Via arXiv

LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference

Jun 26, 2024
Via arXiv

RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter

May 29, 2024
Via arXiv

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Apr 07, 2024
Via arXiv

LLMBind: A Unified Modality-Task Integration Framework

Mar 08, 2024
Via arXiv

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Feb 04, 2024
Via arXiv

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

Jan 28, 2024
Via arXiv