Yongqin Xian

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Oct 30, 2024
Via arXiv

Toward a Diffusion-Based Generalist for Dense Vision Tasks

Jun 29, 2024
Via arXiv

LocCa: Visual Pretraining with Location-aware Captioners

Mar 28, 2024
Via arXiv

Text-Conditioned Resampler For Long Form Video Understanding

Dec 19, 2023
Via arXiv

LIME: Localized Image Editing via Attention Regularization in Diffusion Models

Dec 14, 2023
Via arXiv

LALM: Long-Term Action Anticipation with Language Models

Nov 29, 2023
Via arXiv

SILC: Improving Vision Language Pretraining with Self-Distillation

Oct 20, 2023
Via arXiv

Detecting Adversarial Faces Using Only Real Face Self-Perturbations

May 04, 2023
Via arXiv

Learning Prototype Classifiers for Long-Tailed Recognition

Feb 01, 2023
Via arXiv

Urban Scene Semantic Segmentation with Low-Cost Coarse Annotation

Dec 15, 2022
Via arXiv