Abstract:Scalp diseases and alopecia affect millions of people around the world, underscoring the urgent need for early diagnosis and management of the disease. However, the development of a comprehensive AI-based diagnosis system encompassing these conditions remains an underexplored domain due to the challenges associated with data imbalance and the costly nature of labeling. To address these issues, we propose ScalpVision, an AI-driven system for the holistic diagnosis of scalp diseases and alopecia. In ScalpVision, effective hair segmentation is achieved using pseudo image-label pairs and an innovative prompting method in the absence of traditional hair masking labels. This approach is crucial for extracting key features such as hair thickness and count, which are then used to assess alopecia severity. Additionally, ScalpVision introduces DiffuseIT-M, a generative model adept at dataset augmentation while maintaining hair information, facilitating improved predictions of scalp disease severity. Our experimental results affirm ScalpVision's efficiency in diagnosing a variety of scalp conditions and alopecia, showcasing its potential as a valuable tool in dermatological care.
Abstract:Pretrained Large Language Models (LLMs) have gained significant attention for addressing open-domain Question Answering (QA). While they exhibit high accuracy in answering questions related to common knowledge, LLMs encounter difficulties in learning about uncommon long-tail knowledge (tail entities). Since manually constructing QA datasets demands substantial human resources, the types of existing QA datasets are limited, leaving us with a scarcity of datasets to study the performance of LLMs on tail entities. In this paper, we propose an automatic approach to generate specialized QA datasets for tail entities and present the associated research challenges. We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets, comparing their performance with and without external resources including Wikipedia and Wikidata knowledge graphs.
Abstract:Sign Language Translation (SLT) is a task that has not been studied relatively much compared to the study of Sign Language Recognition (SLR). However, the SLR is a study that recognizes the unique grammar of sign language, which is different from the spoken language and has a problem that non-disabled people cannot easily interpret. So, we're going to solve the problem of translating directly spoken language in sign language video. To this end, we propose a new keypoint normalization method for performing translation based on the skeleton point of the signer and robustly normalizing these points in sign language translation. It contributed to performance improvement by a customized normalization method depending on the body parts. In addition, we propose a stochastic frame selection method that enables frame augmentation and sampling at the same time. Finally, it is translated into the spoken language through an Attention-based translation model. Our method can be applied to various datasets in a way that can be applied to datasets without glosses. In addition, quantitative experimental evaluation proved the excellence of our method.
Abstract:We consider a type of constrained optimization problem, where the violation of a constraint leads to an irrevocable loss, such as breakage of a valuable experimental resource/platform or loss of human life. Such problems are referred to as safe optimization problems (SafeOPs). While SafeOPs have received attention in the machine learning community in recent years, there was little interest in the evolutionary computation (EC) community despite some early attempts between 2009 and 2011. Moreover, there is a lack of acceptable guidelines on how to benchmark different algorithms for SafeOPs, an area where the EC community has significant experience in. Driven by the need for more efficient algorithms and benchmark guidelines for SafeOPs, the objective of this paper is to reignite the interest of this problem class in the EC community. To achieve this we (i) provide a formal definition of SafeOPs and contrast it to other types of optimization problems that the EC community is familiar with, (ii) investigate the impact of key SafeOP parameters on the performance of selected safe optimization algorithms, (iii) benchmark EC against state-of-the-art safe optimization algorithms from the machine learning community, and (iv) provide an open-source Python framework to replicate and extend our work.
Abstract:Safe learning and optimization deals with learning and optimization problems that avoid, as much as possible, the evaluation of non-safe input points, which are solutions, policies, or strategies that cause an irrecoverable loss (e.g., breakage of a machine or equipment, or life threat). Although a comprehensive survey of safe reinforcement learning algorithms was published in 2015, a number of new algorithms have been proposed thereafter, and related works in active learning and in optimization were not considered. This paper reviews those algorithms from a number of domains including reinforcement learning, Gaussian process regression and classification, evolutionary algorithms, and active learning. We provide the fundamental concepts on which the reviewed algorithms are based and a characterization of the individual algorithms. We conclude by explaining how the algorithms are connected and suggestions for future research.
Abstract:It is well-known that GANs are difficult to train, and several different techniques have been proposed in order to stabilize their training. In this paper, we propose a novel training method called manifold-matching, and a new GAN model called manifold-matching GAN (MMGAN). MMGAN finds two manifolds representing the vector representations of real and fake images. If these two manifolds match, it means that real and fake images are statistically identical. To assist the manifold-matching task, we also use i) kernel tricks to find better manifold structures, ii) moving-averaged manifolds across mini-batches, and iii) a regularizer based on correlation matrix to suppress mode collapse. We conduct in-depth experiments with three image datasets and compare with several state-of-the-art GAN models. 32.4% of images generated by the proposed MMGAN are recognized as fake images during our user study (16% enhancement compared to other state-of-the-art model). MMGAN achieved an unsupervised inception score of 7.8 for CIFAR-10.