Aosong Cheng

[CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster

Dec 02, 2024

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Jul 30, 2024

Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation

May 26, 2024

DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing

Mar 21, 2024