Handong Zhao

GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration

Jan 27, 2025
Via arXiv

MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities

Jan 15, 2025
Via arXiv

DynaSaur: Large Language Agents Beyond Predefined Actions

Nov 04, 2024
Via arXiv

VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs

Jul 02, 2024
Via arXiv

Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags

Jun 16, 2024
Via arXiv

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

Apr 18, 2024
Via arXiv

Fine-tuning CLIP Text Encoders with Two-step Paraphrasing

Feb 23, 2024
Via arXiv

Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion

Jan 28, 2024
Via arXiv

Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations

Jan 11, 2024
Via arXiv

InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding

Jun 08, 2023
Via arXiv