Fangxiang Feng

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Oct 24, 2024

Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis

Aug 07, 2024

DiffHarmony: Latent Diffusion Model Meets Image Harmonization

Apr 09, 2024

Whether you can locate or not? Interactive Referring Expression Generation

Aug 19, 2023

GR-GAN: Gradual Refinement Text-to-image Generation

May 23, 2022

Question-Driven Graph Fusion Network For Visual Question Answering

Apr 03, 2022

Co-VQA : Answering by Interactive Sub Question Sequence

Apr 02, 2022

Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene

Mar 16, 2022

Multi-stage Pre-training over Simplified Multimodal Pre-training Models

Jul 22, 2021

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue

Oct 01, 2020