Title: UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity

Aug 14, 2023

Weijian Mai, Zhijun Zhang

Abstract: Reconstructing images and generating captions from brain activity evoked by visual stimuli helps researchers understand the connection between the human brain and the visual perception system. Although deep generative models have recently been applied to this problem, producing realistic captions and images with both low-level detail and high semantic fidelity remains challenging. In this work, we propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity. For the first time, we unify image reconstruction and captioning from visual-evoked functional magnetic resonance imaging (fMRI) through a latent diffusion model termed Versatile Diffusion. Specifically, we transform fMRI voxels into text and image latents for low-level information, and guide the backward diffusion process with fMRI-based image and text conditions derived from CLIP to generate realistic captions and images. UniBrain outperforms current methods both qualitatively and quantitatively on image reconstruction and reports image captioning results for the first time on the Natural Scenes Dataset (NSD). Moreover, ablation experiments and functional region-of-interest (ROI) analysis further demonstrate the superiority of UniBrain and provide comprehensive insight into visual-evoked brain decoding.
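
The abstract describes mapping fMRI voxels to latent and conditioning vectors, but does not say which model performs that mapping. As a rough illustration only, the sketch below assumes a linear ridge regressor, a common choice in fMRI decoding, and fits it on synthetic data. The function names, array sizes, and `alpha` value are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict(X, W):
    """Map voxel patterns (rows of X) to predicted latent vectors."""
    return X @ W


# Synthetic stand-ins (assumption): real voxel counts and latent sizes are far larger.
rng = np.random.default_rng(0)
n_trials, n_voxels, latent_dim = 200, 50, 8
true_W = rng.normal(size=(n_voxels, latent_dim))
voxels = rng.normal(size=(n_trials, n_voxels))
latents = voxels @ true_W + 0.01 * rng.normal(size=(n_trials, latent_dim))

# Fit on the first 150 trials, then predict latents for the held-out 50.
W = fit_ridge(voxels[:150], latents[:150], alpha=1e-3)
pred = predict(voxels[150:], W)
rel_err = np.linalg.norm(pred - latents[150:]) / np.linalg.norm(latents[150:])
```

In a pipeline of the kind the abstract outlines, predicted vectors like `pred` would stand in for the fMRI-derived latents or CLIP conditions passed to the diffusion model. That step is omitted here because the abstract does not specify its interface.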
