Abstract:In this demo, we present VirtualConductor, a system that can generate a conducting video from any given piece of music and a single image of the user. First, a large-scale conductor motion dataset is collected and constructed. Then, we propose the Audio Motion Correspondence Network (AMCNet) and adversarial-perceptual learning to learn the cross-modal relationship and generate diverse, plausible, music-synchronized motion. Finally, we combine 3D animation rendering and a pose transfer model to synthesize the conducting video from the single user image. As a result, any user can become a virtual conductor through the system.
Abstract:Computational-intelligence-based forecasting of ocean characteristics, such as Significant Wave Height (SWH) prediction, is crucial for avoiding social and economic loss in coastal cities. Compared with traditional empirical or numerical forecasting models, "soft computing" approaches, including machine learning and deep learning models, have shown considerable success in recent years. In this paper, we focus on enabling a deep learning model to learn both short-term and long-term spatial-temporal dependencies for SWH prediction. We propose a Wavelet Graph Neural Network (WGNN) approach that integrates the advantages of the wavelet transform and graph neural networks. Several parallel graph neural networks are trained separately on wavelet-decomposed data, and the final SWH prediction is reconstructed from the individual models' predictions. Experimental results show that the proposed WGNN approach outperforms other models, including numerical models, machine learning models, and several deep learning models.
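The decompose, predict per band, then reconstruct pattern described in the WGNN abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a one-level Haar transform stands in for the unspecified wavelet, and a hypothetical `naive_forecast` (repeat the last value) stands in for each graph neural network. It is not the authors' implementation.

```python
import numpy as np


def haar_components(signal):
    """Split an even-length 1-D signal into a smooth (low-frequency) and a
    detail (high-frequency) component using a one-level Haar transform.
    Both components are returned at full length, so smooth + detail == signal."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = x.reshape(-1, 2)
    mean = pairs.mean(axis=1)                       # approximation coefficients
    half_diff = (pairs[:, 0] - pairs[:, 1]) / 2.0   # detail coefficients
    smooth = np.repeat(mean, 2)
    detail = np.column_stack([half_diff, -half_diff]).ravel()
    return smooth, detail


def naive_forecast(component, horizon):
    """Hypothetical stand-in for a per-band model: repeat the last value."""
    return np.full(horizon, component[-1])


def banded_forecast(signal, horizon):
    """Forecast each band separately, then sum the per-band forecasts.
    Summation is valid here because the Haar reconstruction is linear."""
    components = haar_components(signal)
    return sum(naive_forecast(c, horizon) for c in components)


if __name__ == "__main__":
    wave_heights = [1.0, 1.4, 1.2, 1.8, 2.0, 1.6]   # made-up SWH readings
    print(banded_forecast(wave_heights, horizon=3))
```

In a real setting each band would get its own trained model, and a multi-level wavelet would yield more than two bands. The sketch only shows how the per-band predictions are recombined.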
Abstract:Offline reinforcement learning (RL) aims to learn a good policy from a batch of collected data, without extra interactions with the environment during training. However, current offline RL benchmarks commonly have a large reality gap, because they involve large datasets collected by highly exploratory policies, and the trained policy is directly evaluated in the environment. In real-world situations, running a highly exploratory policy is prohibited to ensure system safety, the data is commonly very limited, and a trained policy should be well validated before deployment. In this paper, we present a near real-world offline RL benchmark, named NeoRL, which contains datasets from various domains with controlled sizes, and extra test datasets for policy validation. We evaluate existing offline RL algorithms on NeoRL and argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward. The empirical results demonstrate that the tested offline RL algorithms become less competitive against the deterministic policy on many datasets, and that offline policy evaluation hardly helps. The NeoRL suite can be found at http://polixir.ai/research/neorl. We hope this work will shed some light on future research and draw more attention to deploying RL in real-world systems.
Abstract:The COVID-19 pandemic has caused millions of infections. Because of the false-negative rate and time cost of conventional RT-PCR tests, diagnosis based on X-ray and Computed Tomography (CT) images has become widely adopted. Consequently, computer vision researchers have developed many automatic diagnosis models to assist radiologists and improve diagnostic accuracy. In this paper, we review these recently emerging automatic diagnosis models, covering 62 models from February 14 to May 5, 2020. We analyze the models from the perspectives of preprocessing, feature extraction, classification, and evaluation. We then point out that domain adaptation in transfer learning and improved interpretability are possible future directions.
Abstract:Face recognition has been an active research area in the field of pattern recognition, especially since the rise of deep learning in recent years. However, in some practical situations, each identity in the training set has only a single sample. This situation is called Single Sample Per Person (SSPP), and it poses a great challenge to the effective training of deep models. To resolve this issue and to unleash the full potential of deep learning, many deep learning based SSPP face recognition methods have been proposed in recent years. There have been several comprehensive surveys of SSPP face recognition approaches based on traditional methods, but emerging deep learning based methods are rarely covered. In this paper, we focus on those deep methods, classifying them as virtual sample methods and generic learning methods. In virtual sample methods, virtual face images or virtual face features are generated to benefit the training of the deep model. In generic learning methods, an additional multi-sample generic set is used. Our analysis in the generic learning methods section covers efforts to combine traditional methods with deep features, improve loss functions, and improve network structures. Finally, we discuss existing problems of identity information retention in virtual sample methods and domain adaptation in generic learning methods. Further, we regard the semantic gap as an important future issue that needs to be considered in deep SSPP methods.
Abstract:The Convolutional Neural Network (CNN) is one of the most significant networks in the deep learning field. Because CNNs have achieved impressive results in many areas, including but not limited to computer vision and natural language processing, they have attracted much attention from both industry and academia in the past few years. Existing reviews mainly focus on the applications of CNNs in different scenarios without considering CNNs from a general perspective, and some recently proposed novel ideas are not covered. In this review, we aim to provide novel ideas and prospects in this fast-growing field as much as possible. Besides, not only two-dimensional convolution but also one-dimensional and multi-dimensional convolutions are covered. First, this review starts with a brief introduction to the history of CNN. Second, we provide an overview of CNN. Third, classic and advanced CNN models are introduced, with emphasis on the key points that enable them to reach state-of-the-art results. Fourth, through experimental analysis, we draw some conclusions and provide several rules of thumb for function selection. Fifth, the applications of one-dimensional, two-dimensional, and multi-dimensional convolution are covered. Finally, some open issues and promising directions for CNN are discussed to serve as guidelines for future work.
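As a small illustration of the one-dimensional and two-dimensional convolutions mentioned in the CNN review abstract, the sketch below implements both in plain NumPy. It assumes the convention common in CNN layers (cross-correlation, no padding, stride 1). The function names `conv1d_valid` and `conv2d_valid` are chosen here for illustration and do not come from the review.

```python
import numpy as np


def conv1d_valid(x, kernel):
    """Slide a 1-D kernel over a 1-D signal (no padding, stride 1)."""
    x = np.asarray(x, dtype=float)
    k = np.asarray(kernel, dtype=float)
    n_out = x.size - k.size + 1
    return np.array([np.dot(x[i:i + k.size], k) for i in range(n_out)])


def conv2d_valid(image, kernel):
    """Slide a 2-D kernel over a 2-D image (no padding, stride 1)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the current window.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


if __name__ == "__main__":
    print(conv1d_valid([1, 2, 3, 4], [1, 0, -1]))
    print(conv2d_valid(np.arange(9).reshape(3, 3), np.ones((2, 2))))
```

The same sliding-window idea extends to higher-dimensional inputs by adding one loop (or one window axis) per extra dimension.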