Kaiqiang Xu

Multi-view Contrastive Learning for Online Knowledge Distillation

Jun 07, 2020

Chuanguang Yang, Zhulin An, Xiaolong Hu, Hui Zhu, Kaiqiang Xu, Yongjun Xu

Abstract: Existing Online Knowledge Distillation (OKD) aims to perform collaborative and mutual learning among multiple peer networks in terms of probabilistic outputs, but ignores the representational knowledge. We thus introduce Multi-view Contrastive Learning (MCL) for OKD to implicitly capture correlations of representations encoded by multiple peer networks, which provide various views for understanding the input data samples. A contrastive loss is applied to maximize the consensus of positive data pairs while pushing negative data pairs apart in the embedding space across views. Benefiting from MCL, we can learn a more discriminative representation for classification than previous OKD methods. Experimental results on image classification and few-shot learning demonstrate that our MCL-OKD outperforms other state-of-the-art OKD and KD methods by large margins without incurring additional inference cost.
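
The abstract does not give the exact form of the contrastive loss, so the following is only a minimal sketch that assumes an InfoNCE-style objective between the embeddings of two peer networks. The function names, the `temperature` value, and the use of cosine similarity are all illustrative assumptions, not the authors' implementation. The positive pair is the same input seen by both peers; every other sample in the batch acts as a negative.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_view_contrastive_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss between two views (assumed form, not taken from the paper).

    view_a[i] and view_b[i] are embeddings of the same input produced by two
    hypothetical peer networks (the positive pair); view_b[j] with j != i
    serves as a negative for view_a[i].
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Similarity of the anchor to every embedding in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature for other in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the positive among all candidates.
        total += log_denominator - logits[i]
    return total / len(a)


peer_a = [[1.0, 0.0], [0.0, 1.0]]
agreeing_peer = [[0.9, 0.1], [0.1, 0.9]]
disagreeing_peer = [[0.1, 0.9], [0.9, 0.1]]
loss_agree = cross_view_contrastive_loss(peer_a, agreeing_peer)
loss_disagree = cross_view_contrastive_loss(peer_a, disagreeing_peer)
```

In this toy batch the loss is lower when the two peers embed the same input close together, which is the "consensus of positive pairs" behaviour the abstract describes.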

Towards More Efficient and Effective Inference: The Joint Decision of Multi-Participants

Jan 19, 2020

Hui Zhu, Zhulin An, Kaiqiang Xu, Xiaolong Hu, Yongjun Xu

Abstract: Existing approaches that improve the performance of convolutional neural networks by optimizing local architectures or deepening the networks tend to increase model size significantly. To deploy neural networks on edge devices, which are in great demand, reducing the scale of the networks is crucial. However, compressing the networks easily degrades image-processing performance. In this paper, we propose a method that is suitable for edge devices while improving the efficiency and effectiveness of inference. The joint decision of multi-participants, mainly comprising multi-layers and multi-networks, can achieve higher classification accuracy (by at most 0.26% on CIFAR-10 and 4.49% on CIFAR-100) with a similar total number of parameters for classical convolutional neural networks.
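
The abstract does not say how the participants' outputs are combined. As a rough illustration only, the sketch below assumes the simplest rule: average the softmax probabilities of several classifiers (which could be separate networks or auxiliary heads on intermediate layers) and pick the class with the highest combined probability. The names `joint_decision` and `weights` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_decision(participant_logits, weights=None):
    """Combine several classifiers by a weighted average of their probabilities.

    participant_logits: one list of class scores per participant.
    weights: optional per-participant weights; uniform if omitted.
    Averaging is an assumed combination rule, not the one from the paper.
    """
    count = len(participant_logits)
    if weights is None:
        weights = [1.0 / count] * count
    probabilities = [softmax(logits) for logits in participant_logits]
    num_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(num_classes)
    ]
    predicted = max(range(num_classes), key=combined.__getitem__)
    return predicted, combined


# Three hypothetical participants scoring the same image over three classes.
scores = [[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.5, 1.5, 0.0]]
predicted_class, combined_probs = joint_decision(scores)
```

Here the first participant alone would choose class 0, but the combined decision is class 1 because the other two participants agree on it with higher confidence.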

Gated Convolutional Networks with Hybrid Connectivity for Image Classification

Sep 08, 2019

Chuanguang Yang, Zhulin An, Hui Zhu, Xiaolong Hu, Kun Zhang, Kaiqiang Xu, Chao Li, Yongjun Xu

Abstract: We design a highly efficient architecture called Gated Convolutional Network with Hybrid Connectivity (HCGNet), which combines local residual and global dense connectivity to enjoy the advantages of both, together with an attention-based gate mechanism to assist feature recalibration. To adapt to our hybrid connectivity, we further propose a novel module that includes a squeeze cell for obtaining compact features from the input, followed by a multi-scale excitation cell with an attached update gate that models global context features to capture long-range dependencies based on multi-scale information. We also place a forget gate on the residual connectivity to decay the reused features, which can be aggregated with the new global context features to form the output, facilitating effective feature exploration as well as re-exploitation to some extent. Moreover, the number of our proposed modules under dense connectivity can be far fewer than in the classical DenseNet, thus reducing considerable redundancy while empirically achieving better performance. On the CIFAR-10/100 datasets, HCGNets significantly outperform both human-designed and auto-searched state-of-the-art networks with far fewer parameters. They also consistently obtain better performance and interpretability than networks widely applied in practice on the ImageNet dataset.

Rethinking the Number of Channels for the Convolutional Neural Network

Sep 04, 2019

Hui Zhu, Zhulin An, Chuanguang Yang, Xiaolong Hu, Kaiqiang Xu, Yongjun Xu

Abstract: The latest algorithms for automatic neural architecture search perform remarkably well, but few of them can effectively design the number of channels for convolutional neural networks while consuming less computational effort. In this paper, we propose a method for efficient automatic architecture search that is specific to the widths of networks rather than the connections of the neural architecture. Our method, functionally incremental search based on function-preserving transformations, explores the number of channels rapidly while controlling the number of parameters of the target network. On CIFAR-10 and CIFAR-100 classification, our method, using minimal computational resources (0.4 to 1.3 GPU-days), can discover more efficient rules for the widths of networks, improving accuracy by about 0.5% on CIFAR-10 and 2.33% on CIFAR-100 with fewer parameters. In particular, our method is suitable for rapidly exploring the number of channels of almost any convolutional neural network.
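
The abstract names function-preserving transformations but does not spell out the operation used. The sketch below shows one well-known transformation of this kind under stated assumptions: a hidden layer of a small fully connected ReLU network is widened by duplicating one unit and splitting its outgoing weights in half, so the network computes the same function with one more channel. The helper names are hypothetical, and a dense layer stands in for a convolutional one to keep the example short.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def forward(w_in, w_out, x):
    """Two-layer network: output = w_out @ relu(w_in @ x)."""
    return matvec(w_out, relu(matvec(w_in, x)))


def widen_hidden_layer(w_in, w_out, unit):
    """Add one hidden unit without changing the network's function.

    The new unit copies the incoming weights of `unit`, so it produces the
    same activation; the outgoing weight of `unit` is then split evenly
    between the original and the copy, so their summed contribution is
    unchanged.
    """
    new_w_in = [row[:] for row in w_in] + [w_in[unit][:]]
    new_w_out = []
    for row in w_out:
        half = row[unit] / 2.0
        new_row = row[:]
        new_row[unit] = half
        new_row.append(half)
        new_w_out.append(new_row)
    return new_w_in, new_w_out


rng = random.Random(0)
w_in = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]   # 3 inputs, 4 hidden units
w_out = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 4 hidden units, 2 outputs
sample = [0.3, -0.7, 1.1]

wide_in, wide_out = widen_hidden_layer(w_in, w_out, unit=1)
before = forward(w_in, w_out, sample)
after = forward(wide_in, wide_out, sample)
```

Because the widened network starts from the same function, a search procedure can grow the channel count step by step and continue training instead of retraining each candidate from scratch.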

EENA: Efficient Evolution of Neural Architecture

May 20, 2019

Hui Zhu, Zhulin An, Chuanguang Yang, Kaiqiang Xu, Yongjun Xu

Abstract: The latest algorithms for automatic neural architecture search perform remarkably well, but they are basically directionless in the search space and computationally expensive in the training of every intermediate architecture. In this paper, we propose a method for efficient architecture search called EENA (Efficient Evolution of Neural Architecture). Thanks to the elaborately designed mutation and crossover operations, the evolution process can be guided by the information that has already been learned. Therefore, less computational effort is required, and the searching and training time can be reduced significantly. On CIFAR-10 classification, EENA, using minimal computational resources (0.65 GPU-days), can design a highly effective neural architecture that achieves 2.56% test error with 8.47M parameters. Furthermore, the best architecture discovered is also transferable to CIFAR-100.
