Junyang Wang

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

Jun 03, 2024
Via arXiv

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Jan 29, 2024
Via arXiv

An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation

Nov 13, 2023
Via arXiv

Evaluation and Analysis of Hallucination in Large Vision-Language Models

Aug 29, 2023
Via arXiv

Overlap Bias Matching is Necessary for Point Cloud Registration

Aug 18, 2023
Via arXiv

Benign Shortcut for Debiasing: Fair Visual Recognition via Intervention with Shortcut Features

Aug 13, 2023
Via arXiv

From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping

May 08, 2023
Via arXiv

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Apr 27, 2023
Via arXiv

Improved Visual Fine-tuning with Natural Language Supervision

Apr 04, 2023
Via arXiv

Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment

Nov 14, 2022
Via arXiv