Ziyong Feng

ORID: Organ-Regional Information Driven Framework for Radiology Report Generation

Nov 20, 2024

Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension

Oct 18, 2024

CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination

Aug 18, 2024

VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling

Aug 02, 2024

Multi-label Cluster Discrimination for Visual Representation Learning

Jul 24, 2024

High-Fidelity Facial Albedo Estimation via Texture Quantization

Jun 19, 2024

RWKV-CLIP: A Robust Vision-Language Representation Learner

Jun 11, 2024

1st Place Solution to the 1st SkatingVerse Challenge

Apr 22, 2024

Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models

Mar 28, 2024

IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models

Mar 21, 2024