Ziyuan Huang

Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight

Jul 22, 2024
Via arXiv

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

Dec 15, 2023
Via arXiv

Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone

Oct 30, 2023
Via arXiv

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning

Sep 14, 2023
Via arXiv

Towards Real-World Visual Tracking with Temporal Contexts

Aug 20, 2023
Via arXiv

Temporally-Adaptive Models for Efficient Video Understanding

Aug 10, 2023
Via arXiv

Rethinking Efficient Tuning Methods from a Unified Perspective

Mar 01, 2023
Via arXiv

Physically Plausible Animation of Human Upper Body from a Single Image

Dec 09, 2022
Via arXiv

Progressive Learning without Forgetting

Nov 28, 2022
Via arXiv

PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Nov 21, 2022
Via arXiv