Abstract:Autism Spectrum Disorder (ASD) can affect individuals at varying degrees of intensity, from challenges in overall health, communication, and sensory processing, and this often begins at a young age. Thus, it is critical for medical professionals to be able to accurately diagnose ASD in young children, but doing so is difficult. Deep learning can be responsibly leveraged to improve productivity in addressing this task. The availability of data, however, remains a considerable obstacle. Hence, in this work, we introduce the Video ASD dataset--a dataset that contains video frame convolutional and attention map feature data--to foster further progress in the task of ASD classification. The original videos showcase children reacting to chemo-sensory stimuli, among auditory, touch, and vision This dataset contains the features of the frames spanning 2,467 videos, for a total of approximately 1.4 million frames. Additionally, head pose angles are included to account for head movement noise, as well as full-sentence text labels for the taste and smell videos that describe how the facial expression changes before, immediately after, and long after interaction with the stimuli. In addition to providing features, we also test foundation models on this data to showcase how movement noise affects performance and the need for more data and more complex labels.
Abstract:Parameterized Quantum Circuits (PQCs) have been acknowledged as a leading strategy to utilize near-term quantum advantages in multiple problems, including machine learning and combinatorial optimization. When applied to specific tasks, the parameters in the quantum circuits are trained to minimize the target function. Although there have been comprehensive studies to improve the performance of the PQCs on practical tasks, the errors caused by the quantum noise downgrade the performance when running on real quantum computers. In particular, when the quantum state is transformed through multiple quantum circuit layers, the effect of the quantum noise happens cumulatively and becomes closer to the maximally mixed state or complete noise. This paper studies the relationship between the quantum noise and the diffusion model. Then, we propose a novel diffusion-inspired learning approach to mitigate the quantum noise in the PQCs and reduce the error for specific tasks. Through our experiments, we illustrate the efficiency of the learning strategy and achieve state-of-the-art performance on classification tasks in the quantum noise scenarios.
Abstract:In precision agriculture, the detection and recognition of insects play an essential role in the ability of crops to grow healthy and produce a high-quality yield. The current machine vision model requires a large volume of data to achieve high performance. However, there are approximately 5.5 million different insect species in the world. None of the existing insect datasets can cover even a fraction of them due to varying geographic locations and acquisition costs. In this paper, we introduce a novel ``Insect-1M'' dataset, a game-changing resource poised to revolutionize insect-related foundation model training. Covering a vast spectrum of insect species, our dataset, including 1 million images with dense identification labels of taxonomy hierarchy and insect descriptions, offers a panoramic view of entomology, enabling foundation models to comprehend visual and semantic information about insects like never before. Then, to efficiently establish an Insect Foundation Model, we develop a micro-feature self-supervised learning method with a Patch-wise Relevant Attention mechanism capable of discerning the subtle differences among insect images. In addition, we introduce Description Consistency loss to improve micro-feature modeling via insect descriptions. Through our experiments, we illustrate the effectiveness of our proposed approach in insect modeling and achieve State-of-the-Art performance on standard benchmarks of insect-related tasks. Our Insect Foundation Model and Dataset promise to empower the next generation of insect-related vision models, bringing them closer to the ultimate goal of precision agriculture.
Abstract:Unsupervised visual clustering has recently received considerable attention. It aims to explain distributions of unlabeled visual images by clustering them via a parameterized appearance model. From a different perspective, the clustering algorithms can be treated as assignment problems, often NP-hard. They can be solved precisely for small instances on current hardware. Adiabatic quantum computing (AQC) offers a solution, as it can soon provide a considerable speedup on a range of NP-hard optimization problems. However, current clustering formulations are unsuitable for quantum computing due to their scaling properties. Consequently, in this work, we propose the first clustering formulation designed to be solved with AQC. We employ an Ising model representing the quantum mechanical system implemented on the AQC. Our approach is competitive compared to state-of-the-art optimization-based approaches, even using of-the-shelf integer programming solvers. Finally, we demonstrate that our clustering problem is already solvable on the current generation of real quantum computers for small examples and analyze the properties of the measured solutions.
Abstract:In quantum machine field, detecting two-dimensional (2D) materials in Silicon chips is one of the most critical problems. Instance segmentation can be considered as a potential approach to solve this problem. However, similar to other deep learning methods, the instance segmentation requires a large scale training dataset and high quality annotation in order to achieve a considerable performance. In practice, preparing the training dataset is a challenge since annotators have to deal with a large image, e.g 2K resolution, and extremely dense objects in this problem. In this work, we present a novel method to tackle the problem of missing annotation in instance segmentation in 2D quantum material identification. We propose a new mechanism for automatically detecting false negative objects and an attention based loss strategy to reduce the negative impact of these objects contributing to the overall loss function. We experiment on the 2D material detection datasets, and the experiments show our method outperforms previous works.
Abstract:Unsupervised domain adaptation is one of the challenging problems in computer vision. This paper presents a novel approach to unsupervised domain adaptations based on the optimal transport-based distance. Our approach allows aligning target and source domains without the requirement of meaningful metrics across domains. In addition, the proposal can associate the correct mapping between source and target domains and guarantee a constraint of topology between source and target domains. The proposed method is evaluated on different datasets in various problems, i.e. (i) digit recognition on MNIST, MNIST-M, USPS datasets, (ii) Object recognition on Amazon, Webcam, DSLR, and VisDA datasets, (iii) Insect Recognition on the IP102 dataset. The experimental results show that our proposed method consistently improves performance accuracy. Also, our framework could be incorporated with any other CNN frameworks within an end-to-end deep network design for recognition problems to improve their performance.