
Rongyao Fang

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Oct 17, 2024

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Mar 19, 2024

InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation

Nov 30, 2023

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

Mar 09, 2023

FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation

Mar 02, 2023

Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification

Jul 19, 2022

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

May 28, 2022

RBGNet: Ray-based Grouping for 3D Object Detection

Apr 05, 2022

Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling

Nov 15, 2021

CLIP-Adapter: Better Vision-Language Models with Feature Adapters

Oct 09, 2021