Abstract:In this paper, we propose a simple and general approach to augment regular convolution operator by injecting extra group-wise transformation during training and recover it during inference. Extra transformation is carefully selected to ensure it can be merged with regular convolution in each group and will not change the topological structure of regular convolution during inference. Compared with regular convolution operator, our approach (AugConv) can introduce larger learning capacity to improve model performance during training but will not increase extra computational overhead for model deployment. Based on ResNet, we utilize AugConv to build convolutional neural networks named AugResNet. Result on image classification dataset Cifar-10 shows that AugResNet outperforms its baseline in terms of model performance.
Abstract:Based on ShuffleNetV2, we build a more powerful and efficient model family, termed as AugShuffleNets, by introducing higher frequency of cross-layer information communication for better model performance. Evaluated on the CIFAR-10 and CIFAR-100 datasets, AugShuffleNet consistently outperforms ShuffleNetV2 in terms of accuracy, with less computational cost, fewer parameter count.
Abstract:Convolutional networks (ConvNets) have shown impressive capability to solve various vision tasks. Nevertheless, the trade-off between performance and efficiency is still a challenge for a feasible model deployment on resource-constrained platforms. In this paper, we introduce a novel concept termed multi-path fully connected pattern (MPFC) to rethink the interdependencies of topology pattern, accuracy and efficiency for ConvNets. Inspired by MPFC, we further propose a dual-branch module named dynamic clone transformer (DCT) where one branch generates multiple replicas from inputs and another branch reforms those clones through a series of difference vectors conditional on inputs itself to produce more variants. This operation allows the self-expansion of channel-wise information in a data-driven way with little computational cost while providing sufficient learning capacity, which is a potential unit to replace computationally expensive pointwise convolution as an expansion layer in the bottleneck structure.