Minghan Li

Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks

Dec 09, 2024
Via arXiv

KeyB2: Selecting Key Blocks is Also Important for Long Document Ranking with Large Language Models

Nov 09, 2024
Via arXiv

Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion

Oct 19, 2024
Via arXiv

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

Aug 21, 2024
Via arXiv

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions

Jun 27, 2024
Via arXiv

Unifying Multimodal Retrieval via Document Screenshot Embedding

Jun 17, 2024
Via arXiv

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

May 29, 2024
Via arXiv

Domain Adaptation for Dense Retrieval and Conversational Dense Retrieval through Self-Supervision by Meticulous Pseudo-Relevance Labeling

Mar 13, 2024
Via arXiv

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

Feb 28, 2024
Via arXiv

OpenSD: Unified Open-Vocabulary Segmentation and Detection

Dec 10, 2023
Via arXiv