
Shuaihang Yuan

A Chain-of-Thought Subspace Meta-Learning for Few-shot Image Captioning with Large Vision and Language Models

Feb 19, 2025

GAMap: Zero-Shot Object Goal Navigation with Multi-Scale Geometric-Affordance Guidance

Oct 31, 2024

Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation

Oct 29, 2024

Exploring the Reliability of Foundation Model-Based Frontier Selection in Zero-Shot Object Goal Navigation

Oct 28, 2024

Zero-shot Object Navigation with Vision-Language Models Reasoning

Oct 24, 2024

MultiTalk: Introspective and Extrospective Dialogue for Human-Environment-LLM Alignment

Sep 24, 2024

FairCLIP: Harnessing Fairness in Vision-Language Learning

Apr 05, 2024

How Secure Are Large Language Models for Navigation in Urban Environments?

Feb 14, 2024

VisPercep: A Vision-Language Approach to Enhance Visual Perception for People with Blindness and Low Vision

Oct 31, 2023

SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

Jul 11, 2022