Yanwen Zhu

Meta Feature Modulator for Long-tailed Recognition

Aug 08, 2020

Renzhen Wang, Kaiqin Hu, Yanwen Zhu, Jun Shu, Qian Zhao, Deyu Meng

Abstract: Deep neural networks often degrade significantly when the training data suffer from class imbalance. Existing approaches, e.g., re-sampling and re-weighting, commonly address this issue by rearranging the label distribution of the training data so that the networks fit the implicit balanced label distribution. However, most of them hinder the representative ability of the learned features because they make insufficient use of the intra/inter-sample information in the training data. To address this issue, we propose the meta feature modulator (MFM), a meta-learning framework that models the difference between the long-tailed training data and the balanced meta data from the perspective of representation learning. Concretely, we employ learnable hyper-parameters (dubbed modulation parameters) to adaptively scale and shift the intermediate features of classification networks; the modulation parameters are optimized together with the classification network parameters, guided by a small amount of balanced meta data. We further design a modulator network to guide the generation of the modulation parameters, and such a meta-learner can be readily adapted to train the classification network on other long-tailed datasets. Extensive experiments on benchmark vision datasets show that our approach outperforms other state-of-the-art methods on long-tailed recognition tasks.
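
The "scale and shift" operation on intermediate features can be illustrated with a minimal sketch. The function name `modulate` and the parameter names `gamma` (scale) and `beta` (shift) are assumptions made for illustration and are not taken from the paper; the values are fixed by hand here, whereas the abstract describes them as learned with guidance from balanced meta data.

```python
from typing import List


def modulate(features: List[float], gamma: List[float], beta: List[float]) -> List[float]:
    """Scale and shift each feature channel: out[c] = gamma[c] * features[c] + beta[c]."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma and beta must have the same length")
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


# One feature vector with three channels (illustrative values only).
features = [0.5, -1.0, 2.0]

# Identity modulation (gamma = 1, beta = 0) leaves the features unchanged.
identity = modulate(features, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])

# Hypothetical modulation parameters; a real system would learn these.
modulated = modulate(features, [2.0, 0.5, 1.0], [0.0, 1.0, -1.0])
print(modulated)  # [1.0, 0.5, 1.0]
```

With `gamma` fixed to ones and `beta` to zeros the classifier's features pass through untouched, so a modulation layer of this kind can only add flexibility on top of the unmodulated network.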

Meta-LR-Schedule-Net: Learned LR Schedules that Scale and Generalize

Jul 31, 2020

Jun Shu, Yanwen Zhu, Qian Zhao, Deyu Meng, Zongben Xu

Abstract: The learning rate (LR) is one of the most important hyper-parameters in stochastic gradient descent (SGD) for the training and generalization of deep neural networks (DNNs). However, current hand-designed LR schedules require the schedule and its extra hyper-parameters to be manually pre-specified, which limits their ability to adapt to non-convex optimization problems, where the training dynamics vary significantly. To address this issue, we propose a model capable of adaptively learning an LR schedule from data. Specifically, we design a meta-learner with an explicit mapping formulation to parameterize LR schedules, which can adjust the LR adaptively to comply with the current training dynamics by leveraging information from past training histories. Experiments on image and text classification benchmarks substantiate the capability of our method to achieve proper LR schedules compared with baseline methods. Moreover, we transfer the learned LR schedule to various other tasks, such as different training batch sizes, epochs, datasets, and network architectures, and especially the large-scale ImageNet dataset, showing stronger generalization capability than related methods. Finally, guided by a small clean validation set, we show that our method achieves better generalization error than baseline methods when the training data is biased with corrupted noise.

* 21 pages
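
The idea of an explicit mapping from training history to a learning rate can be sketched as follows. This is a toy stand-in and not the paper's meta-learner: the class `ToyLRScheduler`, its parameters `w`, `b`, `max_lr`, and `momentum`, and the choice of an exponential moving average of the loss as the "history" are all assumptions made for illustration. In the abstract's setting the mapping's parameters would be learned, whereas here they are set by hand.

```python
import math
from typing import List, Optional


class ToyLRScheduler:
    """Maps a running summary of past training losses to a learning rate (illustrative only)."""

    def __init__(self, w: float, b: float, max_lr: float, momentum: float = 0.9) -> None:
        self.w = w                # hypothetical mapping weight
        self.b = b                # hypothetical mapping bias
        self.max_lr = max_lr      # upper bound on the produced learning rate
        self.momentum = momentum  # how strongly past losses are remembered
        self.loss_ema: Optional[float] = None

    def step(self, loss: float) -> float:
        # Summarise the training history as an exponential moving average of the loss.
        if self.loss_ema is None:
            self.loss_ema = loss
        else:
            self.loss_ema = self.momentum * self.loss_ema + (1.0 - self.momentum) * loss
        # A sigmoid keeps the learning rate strictly between 0 and max_lr.
        return self.max_lr / (1.0 + math.exp(-(self.w * self.loss_ema + self.b)))


scheduler = ToyLRScheduler(w=1.0, b=-1.0, max_lr=0.1)
losses = [2.0, 1.0, 0.5, 0.25]  # made-up training losses
lrs: List[float] = [scheduler.step(loss) for loss in losses]
print(lrs)
```

With a positive `w`, the produced learning rate shrinks as the smoothed loss falls, which mimics a decaying schedule without fixing the decay points in advance.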
