Abstract:Age estimation from a single face image has been an essential task in the field of human-computer interaction and computer vision which has a wide range of practical application value. Concerning the problem that accuracy of age estimation of face images in the wild are relatively low for existing methods, where they take into account only the whole features of face image while neglecting the fine-grained features of age-sensitive area, we propose a method based on Attention LSTM network for Fine-Grained age estimation in the wild based on the idea of Fine-Grained categories and visual attention mechanism. This method combines ResNets or RoR models with LSTM unit to construct AL-ResNets or AL-RoR networks to extract age-sensitive local regions, which effectively improves age estimation accuracy. Firstly, ResNets or RoR model pre-trained on ImageNet dataset is selected as the basic model, which is then fine-tuned on the IMDB-WIKI-101 dataset for age estimation. Then, we fine-tune ResNets or RoR on the target age datasets to extract the global features of face images. To extract the local characteristics of age-sensitive areas, the LSTM unit is then presented to obtain the coordinates of the age-sensitive region automatically. Finally, the age group classification experiment is conducted directly on the Adience dataset, and age-regression experiments are performed by the Deep EXpectation algorithm (DEX) on MORPH Album 2, FG-NET and LAP datasets. By combining the global and local features, we got our final prediction results. Our experiments illustrate the effectiveness of AL-ResNets or AL-RoR for age estimation in the wild, where it achieves new state-of-the-art performance than all other CNN methods on the Adience, MORPH Album 2, FG-NET and LAP datasets.