Picture for Weiming Hu

Weiming Hu

PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments

Add code
Feb 24, 2025
Viaarxiv icon

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Add code
Feb 21, 2025
Viaarxiv icon

mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA

Add code
Nov 22, 2024
Figure 1 for mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA
Figure 2 for mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA
Figure 3 for mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA
Figure 4 for mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA
Viaarxiv icon

SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning

Add code
Nov 15, 2024
Figure 1 for SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
Figure 2 for SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
Figure 3 for SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
Figure 4 for SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
Viaarxiv icon

HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision

Add code
Nov 11, 2024
Figure 1 for HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision
Figure 2 for HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision
Figure 3 for HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision
Figure 4 for HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision
Viaarxiv icon

VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization

Add code
Nov 03, 2024
Figure 1 for VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Figure 2 for VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Figure 3 for VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Figure 4 for VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Viaarxiv icon

Token Caching for Diffusion Transformer Acceleration

Add code
Sep 27, 2024
Viaarxiv icon

MobileIQA: Exploiting Mobile-level Diverse Opinion Network For No-Reference Image Quality Assessment Using Knowledge Distillation

Add code
Sep 02, 2024
Viaarxiv icon

Content-decoupled Contrastive Learning-based Implicit Degradation Modeling for Blind Image Super-Resolution

Add code
Aug 10, 2024
Viaarxiv icon

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving

Add code
Jul 22, 2024
Figure 1 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 2 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 3 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 4 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Viaarxiv icon