Fangxiang Feng

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Oct 24, 2024

Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis

Aug 07, 2024

DiffHarmony: Latent Diffusion Model Meets Image Harmonization

Apr 09, 2024

Whether you can locate or not? Interactive Referring Expression Generation

Aug 19, 2023

GR-GAN: Gradual Refinement Text-to-image Generation

May 23, 2022

Question-Driven Graph Fusion Network For Visual Question Answering

Apr 03, 2022

Co-VQA : Answering by Interactive Sub Question Sequence

Apr 02, 2022

Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene

Mar 16, 2022

Multi-stage Pre-training over Simplified Multimodal Pre-training Models

Jul 22, 2021

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue

Oct 01, 2020