Abstract:Neural Networks design is a complex and often daunting task, particularly for resource-constrained scenarios typical of mobile-sized models. Neural Architecture Search is a promising approach to automate this process, but existing competitive methods require large training time and computational resources to generate accurate models. To overcome these limits, this paper contributes with: i) a novel training-free metric, named Entropic Score, to estimate model expressivity through the aggregated element-wise entropy of its activations; ii) a cyclic search algorithm to separately yet synergistically search model size and topology. Entropic Score shows remarkable ability in searching for the topology of the network, and a proper combination with LogSynflow, to search for model size, yields superior capability to completely design high-performance Hybrid Transformers for edge applications in less than 1 GPU hour, resulting in the fastest and most accurate NAS method for ImageNet classification.
Abstract:Unsupervised Domain Adaptation (UDA) is a key field in visual recognition, as it enables robust performances across different visual domains. In the deep learning era, the performance of UDA methods has been driven by better losses and by improved network architectures, specifically the addition of auxiliary domain-alignment branches to pre-trained backbones. However, all the neural architectures proposed so far are hand-crafted, which might hinder further progress. The current copious offspring of Neural Architecture Search (NAS) only alleviates hand-crafting so far, as it requires labels for model selection, which are not available in UDA, and is usually applied to the whole architecture, while using pre-trained models is a strict requirement for high performance. No prior work has addressed these aspects in the context of NAS for UDA. Here we propose an Adversarial Branch Architecture Search (ABAS) for UDA, to learn the auxiliary branch network from data without handcrafting. Our main contribution include i. a novel data-driven ensemble approach for model selection, to circumvent the lack of target labels, and ii. a pipeline to automatically search for the best performing auxiliary branch. To the best of our knowledge, ABAS is the first NAS method for UDA to comply with a pre-trained backbone, a strict requirement for high performance. ABAS outputs both the optimal auxiliary branch and its trained parameters. When applied to two modern UDA techniques, DANN and ALDA, it improves performance on three standard CV datasets (Office31, Office-Home and PACS). In all cases, ABAS robustly finds the branch architectures which yield best performances. Code will be released.
Abstract:Unsupervised Domain Adaptation (DA) exploits the supervision of a label-rich source dataset to make predictions on an unlabeled target dataset by aligning the two data distributions. In robotics, DA is used to take advantage of automatically generated synthetic data, that come with "free" annotation, to make effective predictions on real data. However, existing DA methods are not designed to cope with the multi-modal nature of RGB-D data, which are widely used in robotic vision. We propose a novel RGB-D DA method that reduces the synthetic-to-real domain shift by exploiting the inter-modal relation between the RGB and depth image. Our method consists of training a convolutional neural network to solve, in addition to the main recognition task, the pretext task of predicting the relative rotation between the RGB and depth image. To evaluate our method and encourage further research in this area, we define two benchmark datasets for object categorization and instance recognition. With extensive experiments, we show the benefits of leveraging the inter-modal relations for RGB-D DA.