Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiayi Zhu

Concept Guided Co-saliency Objection Detection

Dec 21, 2024

Jiayi Zhu, Qing Guo, Felix Juefei-Xu, Yihao Huang, Yang Liu, Geguang Pu

Abstract:The task of co-saliency object detection (Co-SOD) seeks to identify common, salient objects across a collection of images by examining shared visual features. However, traditional Co-SOD methods often encounter limitations when faced with diverse object variations (e.g., different postures) and irrelevant background elements that introduce noise. To address these challenges, we propose ConceptCoSOD, a novel concept-guided approach that leverages text semantic information to enhance Co-SOD performance by guiding the model to focus on consistent object features. Through rethinking Co-SOD as an (image-text)-to-image task instead of an image-to-image task, ConceptCoSOD first captures shared semantic concepts within an image group and then uses them as guidance for precise object segmentation in complex scenarios. Experimental results on three benchmark datasets and six corruptions reveal that ConceptCoSOD significantly improves detection accuracy, especially in challenging settings with considerable background distractions and object variability.

Via

Access Paper or Ask Questions

CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection

Mar 27, 2024

Jiayi Zhu, Qing Guo, Felix Juefei-Xu, Yihao Huang, Yang Liu, Geguang Pu

Abstract:Co-salient object detection (CoSOD) aims to identify the common and salient (usually in the foreground) regions across a given group of images. Although achieving significant progress, state-of-the-art CoSODs could be easily affected by some adversarial perturbations, leading to substantial accuracy reduction. The adversarial perturbations can mislead CoSODs but do not change the high-level semantic information (e.g., concept) of the co-salient objects. In this paper, we propose a novel robustness enhancement framework by first learning the concept of the co-salient objects based on the input group images and then leveraging this concept to purify adversarial perturbations, which are subsequently fed to CoSODs for robustness enhancement. Specifically, we propose CosalPure containing two modules, i.e., group-image concept learning and concept-guided diffusion purification. For the first module, we adopt a pre-trained text-to-image diffusion model to learn the concept of co-salient objects within group images where the learned concept is robust to adversarial examples. For the second module, we map the adversarial image to the latent space and then perform diffusion generation by embedding the learned concept into the noise prediction function as an extra condition. Our method can effectively alleviate the influence of the SOTA adversarial attack containing different adversarial patterns, including exposure and noise. The extensive results demonstrate that our method could enhance the robustness of CoSODs significantly.

* 8 pages

Via

Access Paper or Ask Questions

Edge-aware Hard Clustering Graph Pooling for Brain Imaging Data

Sep 13, 2023

Cheng Zhu, Jiayi Zhu, Lijuan Zhang, Xi Wu, Shuqi Yang, Ping Liang, Honghan Chen, Ying Tan

Figure 1 for Edge-aware Hard Clustering Graph Pooling for Brain Imaging Data

Figure 2 for Edge-aware Hard Clustering Graph Pooling for Brain Imaging Data

Figure 3 for Edge-aware Hard Clustering Graph Pooling for Brain Imaging Data

Figure 4 for Edge-aware Hard Clustering Graph Pooling for Brain Imaging Data

Abstract:Graph Convolutional Networks (GCNs) can capture non-Euclidean spatial dependence between different brain regions, and the graph pooling operator in GCNs is key to enhancing the representation learning capability and acquiring abnormal brain maps. However, the majority of existing research designs graph pooling operators only from the perspective of nodes while disregarding the original edge features, in a way that not only confines graph pooling application scenarios, but also diminishes its ability to capture critical substructures. In this study, a clustering graph pooling method that first supports multidimensional edge features, called Edge-aware hard clustering graph pooling (EHCPool), is developed. EHCPool proposes the first 'Edge-to-node' score evaluation criterion based on edge features to assess node feature significance. To more effectively capture the critical subgraphs, a novel Iteration n-top strategy is further designed to adaptively learn sparse hard clustering assignments for graphs. Subsequently, an innovative N-E Aggregation strategy is presented to aggregate node and edge feature information in each independent subgraph. The proposed model was evaluated on multi-site brain imaging public datasets and yielded state-of-the-art performance. We believe this method is the first deep learning tool with the potential to probe different types of abnormal functional brain networks from data-driven perspective. Core code is at: https://github.com/swfen/EHCPool.

Via

Access Paper or Ask Questions

DC-Net: Divide-and-Conquer for Salient Object Detection

May 24, 2023

Jiayi Zhu, Xuebin Qin, Abdulmotaleb Elsaddik

Abstract:In this paper, we introduce Divide-and-Conquer into the salient object detection (SOD) task to enable the model to learn prior knowledge that is for predicting the saliency map. We design a novel network, Divide-and-Conquer Network (DC-Net) which uses two encoders to solve different subtasks that are conducive to predicting the final saliency map, here is to predict the edge maps with width 4 and location maps of salient objects and then aggregate the feature maps with different semantic information into the decoder to predict the final saliency map. The decoder of DC-Net consists of our newly designed two-level Residual nested-ASPP (ResASPP$^{2}$) modules, which have the ability to capture a large number of different scale features with a small number of convolution operations and have the advantages of maintaining high resolution all the time and being able to obtain a large and compact effective receptive field (ERF). Based on the advantage of Divide-and-Conquer's parallel computing, we use Parallel Acceleration to speed up DC-Net, allowing it to achieve competitive performance on six LR-SOD and five HR-SOD datasets under high efficiency (60 FPS and 55 FPS). Codes and results are available: https://github.com/PiggyJerry/DC-Net.

Via

Access Paper or Ask Questions

Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Apr 06, 2023

Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo

Figure 1 for Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Figure 2 for Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Figure 3 for Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Figure 4 for Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Abstract:The available evidence suggests that dynamic functional connectivity (dFC) can capture time-varying abnormalities in brain activity in resting-state cerebral functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia(SZ) patients. Hence, an advanced dynamic brain network analysis model called the temporal brain category graph convolutional network (Temporal-BCGCN) was employed. Firstly, a unique dynamic brain network analysis module, DSF-BrainNet, was designed to construct dynamic synchronization features. Subsequently, a revolutionary graph convolution method, TemporalConv, was proposed, based on the synchronous temporal properties of feature. Finally, the first modular abnormal hemispherical lateralization test tool in deep learning based on rs-fMRI data, named CategoryPool, was proposed. This study was validated on COBRE and UCLA datasets and achieved 83.62% and 89.71% average accuracies, respectively, outperforming the baseline model and other state-of-the-art methods. The ablation results also demonstrate the advantages of TemporalConv over the traditional edge feature graph convolution approach and the improvement of CategoryPool over the classical graph pooling approach. Interestingly, this study showed that the lower order perceptual system and higher order network regions in the left hemisphere are more severely dysfunctional than in the right hemisphere in SZ and reaffirms the importance of the left medial superior frontal gyrus in SZ. Our core code is available at: https://github.com/swfen/Temporal-BCGCN.

Via

Access Paper or Ask Questions

Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for Semi-Supervised Segmentation of MRI Scans

Mar 09, 2023

Jiayi Zhu, Bart Bolsterlee, Brian V. Y. Chow, Yang Song, Erik Meijering

Figure 1 for Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for Semi-Supervised Segmentation of MRI Scans

Figure 2 for Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for Semi-Supervised Segmentation of MRI Scans

Figure 3 for Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for Semi-Supervised Segmentation of MRI Scans

Figure 4 for Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for Semi-Supervised Segmentation of MRI Scans

Abstract:Semi-supervised learning has made significant progress in medical image segmentation. However, existing methods primarily utilize information acquired from a single dimensionality (2D/3D), resulting in sub-optimal performance on challenging data, such as magnetic resonance imaging (MRI) scans with multiple objects and highly anisotropic resolution. To address this issue, we present a Hybrid Dual Mean-Teacher (HD-Teacher) model with hybrid, semi-supervised, and multi-task learning to achieve highly effective semi-supervised segmentation. HD-Teacher employs a 2D and a 3D mean-teacher network to produce segmentation labels and signed distance fields from the hybrid information captured in both dimensionalities. This hybrid learning mechanism allows HD-Teacher to combine the `best of both worlds', utilizing features extracted from either 2D, 3D, or both dimensions to produce outputs as it sees fit. Outputs from 2D and 3D teacher models are also dynamically combined, based on their individual uncertainty scores, into a single hybrid prediction, where the hybrid uncertainty is estimated. We then propose a hybrid regularization module to encourage both student models to produce results close to the uncertainty-weighted hybrid prediction. The hybrid uncertainty suppresses unreliable knowledge in the hybrid prediction, leaving only useful information to improve network performance further. Extensive experiments of binary and multi-class segmentation conducted on three MRI datasets demonstrate the effectiveness of the proposed framework. Code is available at https://github.com/ThisGame42/Hybrid-Teacher.

Via

Access Paper or Ask Questions

Masked Faces with Faced Masks

Jan 17, 2022

Jiayi Zhu, Qing Guo, Felix Juefei-Xu, Yihao Huang, Yang Liu, Geguang Pu

Figure 1 for Masked Faces with Faced Masks

Figure 2 for Masked Faces with Faced Masks

Figure 3 for Masked Faces with Faced Masks

Figure 4 for Masked Faces with Faced Masks

Abstract:Modern face recognition systems (FRS) still fall short when the subjects are wearing facial masks, a common theme in the age of respiratory pandemics. An intuitive partial remedy is to add a mask detector to flag any masked faces so that the FRS can act accordingly for those low-confidence masked faces. In this work, we set out to investigate the potential vulnerability of such FRS, equipped with a mask detector, on large-scale masked faces. As existing face recognizers and mask detectors have high performance in their respective tasks, it is a challenge to simultaneously fool them and preserve the transferability of the attack. To this end, we devise realistic facial masks that exhibit partial face patterns (i.e., faced masks) and stealthily add adversarial textures that can not only lead to significant performance deterioration of the SOTA deep learning-based FRS, but also remain undetected by the SOTA facial mask detector, thus successfully fooling both systems at the same time. The proposed method unveils the vulnerability of the FRS when dealing with masked faces wearing faced masks.

* 8 pages

Via

Access Paper or Ask Questions

ALA: Adversarial Lightness Attack via Naturalness-aware Regularizations

Jan 16, 2022

Liangru Sun, Felix Juefei-Xu, Yihao Huang, Qing Guo, Jiayi Zhu, Jincao Feng, Yang Liu, Geguang Pu

Figure 1 for ALA: Adversarial Lightness Attack via Naturalness-aware Regularizations

Figure 2 for ALA: Adversarial Lightness Attack via Naturalness-aware Regularizations

Figure 3 for ALA: Adversarial Lightness Attack via Naturalness-aware Regularizations

Figure 4 for ALA: Adversarial Lightness Attack via Naturalness-aware Regularizations

Abstract:Most researchers have tried to enhance the robustness of deep neural networks (DNNs) by revealing and repairing the vulnerability of DNNs with specialized adversarial examples. Parts of the attack examples have imperceptible perturbations restricted by Lp norm. However, due to their high-frequency property, the adversarial examples usually have poor transferability and can be defensed by denoising methods. To avoid the defects, some works make the perturbations unrestricted to gain better robustness and transferability. However, these examples usually look unnatural and alert the guards. To generate unrestricted adversarial examples with high image quality and good transferability, in this paper, we propose Adversarial Lightness Attack (ALA), a white-box unrestricted adversarial attack that focuses on modifying the lightness of the images. The shape and color of the samples, which are crucial to human perception, are barely influenced. To obtain adversarial examples with high image quality, we craft a naturalness-aware regularization. To achieve stronger transferability, we propose random initialization and non-stop attack strategy in the attack procedure. We verify the effectiveness of ALA on two popular datasets for different tasks (i.e., ImageNet for image classification and Places-365 for scene recognition). The experiments show that the generated adversarial examples have both strong transferability and high image quality. Besides, the adversarial examples can also help to improve the standard trained ResNet50 on defending lightness corruption.

* 8 pages

Via

Access Paper or Ask Questions