Abstract: Training deep neural networks on large and sparse datasets remains challenging and can require large amounts of computation and memory. In this work, we address the task of semantic segmentation on large volumetric datasets, such as CT scans. Our contribution is threefold: 1) We propose a boosted sampling scheme that uses a posteriori error maps, generated throughout training, to focus sampling on difficult regions, resulting in a more informative loss. This yields a significant training speedup and improves learning performance for image segmentation. 2) We propose a novel algorithm for boosting the SGD learning rate schedule by adaptively increasing and lowering the learning rate, avoiding the need for extensive hyperparameter tuning. 3) We show that our method attains new state-of-the-art results on the VISCERAL Anatomy benchmark.
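The error-map-driven sampling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sample_patch_centers`, the `floor` constant, and the toy error map are assumptions made for illustration. The only idea taken from the abstract is that voxels with a higher recorded error are drawn more often.

```python
import random


def sample_patch_centers(error_map, n_samples, floor=1e-3, seed=None):
    """Draw voxel coordinates with probability proportional to their error.

    error_map: nested list indexed [z][y][x] of non-negative per-voxel errors.
    floor: hypothetical small constant so zero-error voxels stay reachable.
    """
    rng = random.Random(seed)
    coords, weights = [], []
    for z, plane in enumerate(error_map):
        for y, row in enumerate(plane):
            for x, err in enumerate(row):
                coords.append((z, y, x))
                weights.append(err + floor)
    # random.choices samples with replacement, weighted by `weights`.
    return rng.choices(coords, weights=weights, k=n_samples)


# Toy 1 x 2 x 2 volume in which a single voxel is "difficult".
error_map = [[[0.0, 0.0], [0.0, 5.0]]]
centers = sample_patch_centers(error_map, n_samples=1000, seed=0)
hard_fraction = centers.count((0, 1, 1)) / len(centers)
```

In a training loop, the error map would be refreshed from the network's current predictions, so that the sampling distribution follows the regions that are still segmented poorly.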
Abstract: In this work, we focus on the interpretability of classification results produced by a fully convolutional neural network. Motivated by the classification of oesophageal tissue for real-time detection of early squamous neoplasia, the most frequent kind of oesophageal cancer in Asia, we present a new dataset and a novel deep learning method that treats interpretability as a design constraint of the convolutional network, by means of deep supervision and a newly introduced concept, the embedded Class Activation Map (eCAM). We present a new approach to visualising attention that aims to give insight into the areas of the oesophageal tissue that lead the network to conclude that an image belongs to a particular class, and we compare these areas with the visual features that clinicians use to produce a clinical diagnosis. Compared with a baseline method that does not feature deep supervision but provides attention by grafting Class Activation Maps, we improve the F1-score from 87.3% to 92.7% and provide more detailed attention maps.
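For readers unfamiliar with the Class Activation Maps used by the baseline, the sketch below computes a standard CAM as a per-pixel weighted sum of the final convolutional feature maps, using the weights that connect each globally pooled map to one class. It does not implement the eCAM introduced in this abstract, whose details are not given here; the function name and the toy values are assumptions.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM: weighted sum of the last convolutional feature maps.

    feature_maps: list of K maps, each an [H][W] nested list.
    class_weights: list of K weights linking each pooled map to one class.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


# Two toy 2 x 2 feature maps and the weights for a single class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.5]
cam = class_activation_map(feature_maps, class_weights)
```

High values in `cam` mark the spatial locations that contribute most to the score of the chosen class; in practice the map is upsampled to the input resolution and overlaid on the image.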
Abstract: Deep convolutional neural networks (CNNs) have shown excellent performance in object recognition tasks and dense classification problems such as semantic segmentation. However, training deep neural networks on large and sparse datasets remains challenging and can require large amounts of computation and memory. In this work, we address the task of semantic segmentation on large datasets, such as three-dimensional medical images. We propose an adaptive sampling scheme that uses a posteriori error maps, generated throughout training, to focus sampling on difficult regions, resulting in improved learning. Our contribution is threefold: 1) We give a detailed description of the proposed sampling algorithm, which speeds up and improves learning performance on large images. 2) We propose a deep dual path CNN that captures information at fine and coarse scales, resulting in a network with a large field of view and high resolution outputs. 3) We show that our method attains new state-of-the-art results on the VISCERAL Anatomy benchmark.
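One way to picture the fine and coarse scales mentioned in this abstract is to cut two inputs around the same location: a dense patch for detail and a strided patch for context. The sketch below works in 2D for brevity and is an assumption about how such inputs could be prepared, not a description of the authors' dual path architecture; `crop`, its parameters, and the zero padding are hypothetical.

```python
def crop(image, center, half, stride=1):
    """Cut a (2 * half + 1)-wide square around `center`.

    Every `stride`-th pixel is read, so a larger stride covers a wider
    field of view with the same patch size. Out-of-bounds reads give 0.
    """
    cy, cx = center
    height, width = len(image), len(image[0])
    patch = []
    for dy in range(-half, half + 1):
        row = []
        for dx in range(-half, half + 1):
            y, x = cy + dy * stride, cx + dx * stride
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else 0)
        patch.append(row)
    return patch


# 9 x 9 toy image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(9)] for r in range(9)]
fine = crop(image, (4, 4), half=1, stride=1)    # 3 x 3 pixels, full resolution
coarse = crop(image, (4, 4), half=1, stride=3)  # 3 x 3 pixels spanning 7 x 7
```

Both patches have the same size, but the coarse one spans a larger neighbourhood. A network that processes the two in separate paths and merges them can combine a large field of view with high resolution outputs, which is the property the abstract attributes to the dual path design.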