Jiaxing Huang

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Dec 24, 2024
Via arXiv

SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

Nov 28, 2024
Via arXiv

A Survey on Vision Autoregressive Model

Nov 13, 2024
Via arXiv

Historical Test-time Prompt Tuning for Vision Foundation Models

Oct 27, 2024
Via arXiv

Open-Vocabulary Object Detection via Language Hierarchy

Oct 27, 2024
Via arXiv

Foundation Models for Remote Sensing and Earth Observation: A Survey

Oct 22, 2024
Via arXiv

LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models

Oct 15, 2024
Via arXiv

A Survey on Evaluation of Multimodal Large Language Models

Aug 28, 2024
Via arXiv

Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures

Jul 20, 2024
Via arXiv

Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans

Mar 22, 2024
Via arXiv