Shiming Xiang

Continuous Speculative Decoding for Autoregressive Image Generation

Nov 18, 2024

A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem

Oct 15, 2024

Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation

Oct 11, 2024

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Sep 10, 2024

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Aug 03, 2024

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

Jul 11, 2024

SegICL: A Universal In-context Learning Framework for Enhanced Segmentation in Medical Imaging

Apr 02, 2024

Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning

Mar 31, 2024

Reusable Architecture Growth for Continual Stereo Matching

Mar 30, 2024

Enhancing Visual Continual Learning with Language-Guided Supervision

Mar 24, 2024