Jianyuan Guo

ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning

Oct 23, 2024
Via arXiv

Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs

Oct 14, 2024
Via arXiv

Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

Aug 13, 2024
Via arXiv

GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer

Jun 04, 2024
Via arXiv

SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution

Feb 27, 2024
Via arXiv

Data-efficient Large Vision Models through Sequential Autoregression

Feb 07, 2024
Via arXiv

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Feb 06, 2024
Via arXiv

A Survey on Transformer Compression

Feb 05, 2024
Via arXiv

PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity Compensation

Dec 27, 2023
Via arXiv

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

Oct 30, 2023
Via arXiv