Abstract:The long-standing theory that a colour-naming system evolves under the dual pressure of efficient communication and perceptual mechanism is supported by more and more linguistic studies including the analysis of four decades' diachronic data from the Nafaanra language. This inspires us to explore whether artificial intelligence could evolve and discover a similar colour-naming system via optimising the communication efficiency represented by high-level recognition performance. Here, we propose a novel colour quantisation transformer, CQFormer, that quantises colour space while maintaining the accuracy of machine recognition on the quantised images. Given an RGB image, Annotation Branch maps it into an index map before generating the quantised image with a colour palette, meanwhile the Palette Branch utilises a key-point detection way to find proper colours in palette among whole colour space. By interacting with colour annotation, CQFormer is able to balance both the machine vision accuracy and colour perceptual structure such as distinct and stable colour distribution for discovered colour system. Very interestingly, we even observe the consistent evolution pattern between our artificial colour system and basic colour terms across human languages. Besides, our colour quantisation method also offers an efficient quantisation method that effectively compresses the image storage while maintaining a high performance in high-level recognition tasks such as classification and detection. Extensive experiments demonstrate the superior performance of our method with extremely low bit-rate colours. We will release the source code soon.
Abstract:Challenging illumination conditions (low light, underexposure and overexposure) in the real world not only cast an unpleasant visual appearance but also taint the computer vision tasks. Existing light adaptive methods often deal with each condition individually. What is more, most of them often operate on a RAW image or over-simplify the camera image signal processing (ISP) pipeline. By decomposing the light transformation pipeline into local and global ISP components, we propose a lightweight fast Illumination Adaptive Transformer (IAT) which comprises two transformer-style branches: local estimation branch and global ISP branch. While the local branch estimates the pixel-wise local components relevant to illumination, the global branch defines learnable quires that attend the whole image to decode the parameters. Our IAT could also conduct both object detection and semantic segmentation under various light conditions. We have extensively evaluated IAT on multiple real-world datasets on 2 low-level tasks and 3 high-level tasks. With only 90k parameters and 0.004s processing speed (excluding high-level module), our IAT has consistently achieved superior performance over SOTA. Code is available at https://github.com/cuiziteng/IlluminationAdaptive-Transformer.
Abstract:Deep learning methods exhibit outstanding performance in synthetic aperture radar (SAR) image interpretation tasks. However, these are black box models that limit the comprehension of their predictions. Therefore, to meet this challenge, we have utilized explainable artificial intelligence (XAI) methods for the SAR image classification task. Specifically, we trained state-of-the-art convolutional neural networks for each polarization format on OpenSARUrban dataset and then investigate eight explanation methods to analyze the predictions of the CNN classifiers of SAR images. These XAI methods are also evaluated qualitatively and quantitatively which shows that Occlusion achieves the most reliable interpretation performance in terms of Max-Sensitivity but with a low-resolution explanation heatmap. The explanation results provide some insights into the internal mechanism of black-box decisions for SAR image classification.