Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kaifan Zhang

CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Dec 13, 2024

Kaifan Zhang, Lihuo He, Xin Jiang, Wen Lu, Di Wang, Xinbo Gao

Figure 1 for CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Figure 2 for CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Figure 3 for CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Figure 4 for CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Abstract:Electroencephalogram (EEG) signals have attracted significant attention from researchers due to their non-invasive nature and high temporal sensitivity in decoding visual stimuli. However, most recent studies have focused solely on the relationship between EEG and image data pairs, neglecting the valuable ``beyond-image-modality" information embedded in EEG signals. This results in the loss of critical multimodal information in EEG. To address this limitation, we propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Expert Encoders for each modality to extract cross-modal information from the EEG modality. Then, it introduces a diffusion prior to map the EEG embedding space to the CLIP embedding space, followed by using a pretrained generative model, the proposed framework can reconstruct visual stimuli with high semantic and structural fidelity. Notably, the framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively. Code: https://github.com/XiaoZhangYES/CognitionCapturer.

Via

Access Paper or Ask Questions

AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Nov 25, 2024

Jili Xia, Lihuo He, Fei Gao, Kaifan Zhang, Leida Li, Xinbo Gao

Figure 1 for AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Figure 2 for AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Figure 3 for AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Figure 4 for AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Abstract:Recently, AI-generated images (AIGIs) created by given prompts (initial prompts) have garnered widespread attention. Nevertheless, due to technical nonproficiency, they often suffer from poor perception quality and Text-to-Image misalignment. Therefore, assessing the perception quality and alignment quality of AIGIs is crucial to improving the generative model's performance. Existing assessment methods overly rely on the initial prompts in the task prompt design and use the same prompts to guide both perceptual and alignment quality evaluation, overlooking the distinctions between the two tasks. To address this limitation, we propose a novel quality assessment method for AIGIs named TSP-MGS, which designs task-specific prompts and measures multi-granularity similarity between AIGIs and the prompts. Specifically, task-specific prompts are first constructed to describe perception and alignment quality degrees separately, and the initial prompt is introduced for detailed quality perception. Then, the coarse-grained similarity between AIGIs and task-specific prompts is calculated, which facilitates holistic quality awareness. In addition, to improve the understanding of AIGI details, the fine-grained similarity between the image and the initial prompt is measured. Finally, precise quality prediction is acquired by integrating the multi-granularity similarities. Experiments on the commonly used AGIQA-1K and AGIQA-3K benchmarks demonstrate the superiority of the proposed TSP-MGS.

Via

Access Paper or Ask Questions

MsMorph: An Unsupervised pyramid learning network for brain image registration

Oct 23, 2024

Jiaofen Nan, Gaodeng Fan, Kaifan Zhang, Chen Zhao, Fubao Zhu, Weihua Zhou

Abstract:In the field of medical image analysis, image registration is a crucial technique. Despite the numerous registration models that have been proposed, existing methods still fall short in terms of accuracy and interpretability. In this paper, we present MsMorph, a deep learning-based image registration framework aimed at mimicking the manual process of registering image pairs to achieve more similar deformations, where the registered image pairs exhibit consistency or similarity in features. By extracting the feature differences between image pairs across various as-pects using gradients, the framework decodes semantic information at different scales and continuously compen-sates for the predicted deformation field, driving the optimization of parameters to significantly improve registration accuracy. The proposed method simulates the manual approach to registration, focusing on different regions of the image pairs and their neighborhoods to predict the deformation field between the two images, which provides strong interpretability. We compared several existing registration methods on two public brain MRI datasets, including LPBA and Mindboggle. The experimental results show that our method consistently outperforms state of the art in terms of metrics such as Dice score, Hausdorff distance, average symmetric surface distance, and non-Jacobian. The source code is publicly available at https://github.com/GaodengFan/MsMorph

* 18 pages, 10 figures

Via

Access Paper or Ask Questions