Picture for Chao Zhang

Chao Zhang

College of Computing, Georgia Institute of Technology

Fast Disentangled Slim Tensor Learning for Multi-view Clustering

Add code
Nov 12, 2024
Viaarxiv icon

Matryoshka: Learning to Drive Black-Box LLMs with LLMs

Add code
Oct 28, 2024
Viaarxiv icon

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

Add code
Oct 09, 2024
Figure 1 for Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization
Figure 2 for Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization
Figure 3 for Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization
Figure 4 for Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization
Viaarxiv icon

BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models

Add code
Oct 09, 2024
Viaarxiv icon

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer

Add code
Oct 07, 2024
Viaarxiv icon

LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy

Add code
Oct 04, 2024
Viaarxiv icon

SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding

Add code
Sep 30, 2024
Figure 1 for SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding
Figure 2 for SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding
Figure 3 for SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding
Figure 4 for SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding
Viaarxiv icon

LW2G: Learning Whether to Grow for Prompt-based Continual Learning

Add code
Sep 27, 2024
Viaarxiv icon

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

Add code
Sep 25, 2024
Viaarxiv icon

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events

Add code
Sep 25, 2024
Figure 1 for MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Figure 2 for MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Figure 3 for MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Figure 4 for MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Viaarxiv icon