
Yanyuan Qiao

FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks

Mar 18, 2025

SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation

Mar 13, 2025

Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments

Feb 26, 2025

General Scene Adaptation for Vision-and-Language Navigation

Jan 29, 2025

Effective Tuning Strategies for Generalist Robot Manipulation Policies

Oct 02, 2024

MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation

Oct 02, 2024

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation

Sep 27, 2024

Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs

Sep 27, 2024

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Jul 09, 2024

VL-Mamba: Exploring State Space Models for Multimodal Learning

Mar 20, 2024