Jonathan Cui

CS-Mixer: A Cross-Scale Vision MLP Model with Spatial-Channel Mixing

Aug 25, 2023

Jonathan Cui, David A. Araujo, Suman Saha, Md. Faisal Kabir

Abstract: Despite their simpler information fusion designs compared with Vision Transformers and Convolutional Neural Networks, Vision MLP architectures have demonstrated strong performance and high data efficiency in recent research. However, existing works such as CycleMLP and Vision Permutator typically model spatial information in equal-size spatial regions and do not consider cross-scale spatial interactions. Further, their token mixers only model 1- or 2-axis correlations, avoiding 3-axis spatial-channel mixing due to its computational demands. We therefore propose CS-Mixer, a hierarchical Vision MLP that learns dynamic low-rank transformations for spatial-channel mixing through cross-scale local and global aggregation. The proposed methodology achieves competitive results on popular image recognition benchmarks without incurring substantially more compute. Our largest model, CS-Mixer-L, reaches 83.2% top-1 accuracy on ImageNet-1k with 13.7 GFLOPs and 94M parameters.

* 8 pages, 5 figures, developed under Penn State University's Multi-Campus Research Experience for Undergraduates Symposium, 2023

Whole Brain Segmentation with Full Volume Neural Network

Oct 29, 2021

Yeshu Li, Jonathan Cui, Yilun Sheng, Xiao Liang, Jingdong Wang, Eric I-Chao Chang, Yan Xu

Abstract: Whole brain segmentation is an important neuroimaging task that segments the whole brain volume into anatomically labeled regions-of-interest. Convolutional neural networks have demonstrated good performance in this task. Existing solutions usually segment the brain image by classifying the voxels, or by labeling the slices or the sub-volumes separately. Their representation learning is based on parts of the whole volume, whereas their labeling result is produced by aggregating partial segmentations. Learning and inference with incomplete information could lead to a sub-optimal final segmentation result. To address these issues, we propose to adopt a full volume framework, which feeds the full volume brain image into the segmentation network and directly outputs the segmentation result for the whole brain volume. The framework makes use of the complete information in each volume and can be implemented easily. An effective instance of this framework is given subsequently. We adopt the 3D high-resolution network (HRNet) for learning spatially fine-grained representations and the mixed precision training scheme for memory-efficient training. Extensive experimental results on a publicly available 3D MRI brain dataset show that our proposed model advances the state-of-the-art methods in terms of segmentation performance. Source code is publicly available at https://github.com/microsoft/VoxHRNet.

* Computerized Medical Imaging and Graphics, Volume 93, October 2021, 101991
* Accepted to CMIG

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Mar 18, 2021

Shengyu Zhao, Jonathan Cui, Yilun Sheng, Yue Dong, Xiao Liang, Eric I Chang, Yan Xu

Abstract: Numerous task-specific variants of conditional generative adversarial networks have been developed for image completion. Yet, a serious limitation remains that all existing algorithms tend to fail when handling large-scale missing regions. To overcome this challenge, we propose a generic new approach that bridges the gap between image-conditional and recent modulated unconditional generative architectures via co-modulation of both conditional and stochastic style representations. Also, due to the lack of good quantitative metrics for image completion, we propose the new Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS), which robustly measures the perceptual fidelity of inpainted images compared to real images via linear separability in a feature space. Experiments demonstrate superior performance in terms of both quality and diversity over state-of-the-art methods in free-form image completion and easy generalization to image-to-image translation. Code is available at https://github.com/zsyzzsoft/co-mod-gan.

* ICLR 2021 (Spotlight). Code: https://github.com/zsyzzsoft/co-mod-gan. Demo: https://comodgan.ml/
