Jiahao Wang

Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark

Mar 10, 2025

DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

Mar 09, 2025

FuzzyLight: A Robust Two-Stage Fuzzy Approach for Traffic Signal Control Works in Real Cities

Jan 27, 2025

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

Dec 12, 2024

GenEx: Generating an Explorable World

Dec 12, 2024

VP-MEL: Visual Prompts Guided Multimodal Entity Linking

Dec 10, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Dec 06, 2024

Towards Precise Scaling Laws for Video Diffusion Transformers

Nov 25, 2024

TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models

Nov 24, 2024

DMQR-RAG: Diverse Multi-Query Rewriting for RAG

Nov 20, 2024