Wentao Liu

KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension

Nov 04, 2024

PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging

Sep 26, 2024

CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models

Sep 04, 2024

CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications

Aug 07, 2024

TCFormer: Visual Recognition via Token Clustering Transformer

Jul 16, 2024

When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset

Jul 14, 2024

F-LMM: Grounding Frozen Large Multimodal Models

Jun 09, 2024

LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification

Jun 06, 2024

The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

May 14, 2024

UniFS: Universal Few-shot Instance Perception with Point Representations

Apr 30, 2024