Lei Pang

Adaptive Split-Fusion Transformer

Apr 26, 2022

Zixuan Su, Hao Zhang, Jingjing Chen, Lei Pang, Chong-Wah Ngo, Yu-Gang Jiang

Abstract: Neural networks for visual content understanding have recently evolved from convolutional networks (CNNs) to transformers. The former (CNN) relies on small-windowed kernels to capture regional clues, demonstrating solid local expressiveness. In contrast, the latter (transformer) establishes long-range global connections between localities for holistic learning. Inspired by this complementary nature, there is growing interest in designing hybrid models that make the best use of each technique. Current hybrids merely use convolutions as simple approximations of linear projection or juxtapose a convolution branch with attention, without considering the relative importance of local and global modeling. To tackle this, we propose a new hybrid named Adaptive Split-Fusion Transformer (ASF-former) that treats the convolutional and attention branches differently with adaptive weights. Specifically, an ASF-former encoder splits the feature channels into two equal halves to fit dual-path inputs. Then, the outputs of the two paths are fused with weighting scalars calculated from visual cues. We also design the convolutional path compactly for efficiency. Extensive experiments on standard benchmarks, such as ImageNet-1K, CIFAR-10, and CIFAR-100, show that our ASF-former outperforms its CNN and transformer counterparts and hybrid pilots in terms of accuracy (83.9% on ImageNet-1K) under similar conditions (12.9G MACs/56.7M Params, without large-scale pre-training). The code is available at: https://github.com/szx503045266/ASF-former.
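The split-then-fuse idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (see the linked repository for that): the neighbour-averaging `local_path`, the projection-free `global_path`, the sigmoid `gate`, and the convex-combination fusion are all simplifying assumptions chosen only to show the data flow of "split channels in half, run two paths, fuse with a data-dependent scalar".

```python
import math

def split_channels(tokens):
    """Split each token's channels into two equal halves (one per path)."""
    half = len(tokens[0]) // 2
    return [t[:half] for t in tokens], [t[half:] for t in tokens]

def local_path(tokens):
    """Assumed stand-in for the convolutional path: average each token with its neighbours."""
    out = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - 1): i + 2]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def global_path(tokens):
    """Assumed stand-in for the attention path: softmax dot-product attention, no learned projections."""
    scale = math.sqrt(len(tokens[0]))
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) / total
                    for c in range(len(q))])
    return out

def gate(tokens, weight=1.0, bias=0.0):
    """Hypothetical gate: a scalar in (0, 1) computed from the globally averaged input."""
    mean = sum(sum(t) for t in tokens) / (len(tokens) * len(tokens[0]))
    return 1.0 / (1.0 + math.exp(-(weight * mean + bias)))

def split_fusion_block(tokens):
    """Split channels, run both paths, and fuse them with the adaptive scalar alpha."""
    conv_in, attn_in = split_channels(tokens)
    alpha = gate(tokens)
    local, glob = local_path(conv_in), global_path(attn_in)
    fused = [[alpha * l + (1.0 - alpha) * g for l, g in zip(lrow, grow)]
             for lrow, grow in zip(local, glob)]
    return fused, alpha

# Three tokens with four channels each (made-up values).
tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
fused, alpha = split_fusion_block(tokens)
```

In this sketch the fused output has half the input width, because each path only sees half of the channels. How the actual encoder restores or projects the channel dimension is not stated in the abstract and is left out here.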

Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS

May 07, 2019

Yang Jiang, Cong Zhao, Lei Pang

Abstract: Neural architecture search (NAS) was proposed to automate the architecture design process and has attracted overwhelming interest from both academia and industry. However, it suffers from overfitting due to the high-dimensional search space composed of the $operator$ selection and $skip$ connection of each layer. This paper analyzes the overfitting issue from a novel perspective, separating the primitives of the search space into architecture-overfitting-related and parameter-overfitting-related elements. The $operator$ of each layer, which mainly contributes to parameter-overfitting and is important for model acceleration, is selected as our optimization target based on a state-of-the-art architecture, while $skip$, which is related to architecture-overfitting, is ignored. With the greatly reduced search space, our proposed method is both quick to converge and practical to use in various tasks. Extensive experiments demonstrate that the proposed method achieves strong results on tasks including classification and face recognition.

* 9 pages, 1 figure, 5 tables
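To get a feel for how much the search space shrinks when only the $operator$ is searched, the sketch below counts candidate architectures under stated assumptions: every layer independently picks one of `num_operators` operators, and the $skip$ choice, when searched, is a binary on/off decision per layer. The layer and operator counts are hypothetical and are not taken from the paper.

```python
def search_space_size(num_layers, num_operators, search_skip):
    """Count candidate architectures for independent per-layer choices.

    Assumes one operator choice per layer and, if search_skip is True,
    an additional binary skip-connection choice per layer.
    """
    choices_per_layer = num_operators * (2 if search_skip else 1)
    return choices_per_layer ** num_layers

# Hypothetical setting: 20 layers, 6 candidate operators per layer.
full = search_space_size(num_layers=20, num_operators=6, search_skip=True)
reduced = search_space_size(num_layers=20, num_operators=6, search_skip=False)

# Fixing skip removes one binary choice per layer, i.e. a factor of 2 ** 20.
reduction_factor = full // reduced
```

Under these assumptions the reduction factor grows exponentially with depth, which is consistent with the abstract's claim that the reduced space is quicker to search.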
