Minghan Li

Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion

Oct 19, 2024
Via arXiv

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

Aug 21, 2024
Via arXiv

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions

Jun 27, 2024
Via arXiv

Unifying Multimodal Retrieval via Document Screenshot Embedding

Jun 17, 2024
Via arXiv

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

May 29, 2024
Via arXiv

Domain Adaptation for Dense Retrieval and Conversational Dense Retrieval through Self-Supervision by Meticulous Pseudo-Relevance Labeling

Mar 13, 2024
Via arXiv

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

Feb 28, 2024
Via arXiv

OpenSD: Unified Open-Vocabulary Segmentation and Detection

Dec 10, 2023
Via arXiv

Generate, Filter, and Fuse: Query Expansion via Multi-Step Keyword Generation for Zero-Shot Neural Rankers

Nov 15, 2023
Via arXiv

BoxVIS: Video Instance Segmentation with Box Annotations

Mar 26, 2023
Via arXiv