Senior Member, IEEE
Abstract:Automated vehicles are envisioned to navigate safely in complex mixed-traffic scenarios alongside human-driven vehicles. To promise a high degree of safety, accurately predicting the maneuvers of surrounding vehicles and their future positions is a critical task and attracts much attention. However, most existing studies focused on reasoning about positional information based on objective historical trajectories without fully considering the heterogeneity of driving behaviors. Therefore, this study proposes a trajectory prediction framework that combines Mixture Density Networks (MDN) and considers the driving heterogeneity to provide probabilistic and personalized predictions. Specifically, based on a certain length of historical trajectory data, the situation-specific driving preferences of each driver are identified, where key driving behavior feature vectors are extracted to characterize heterogeneity in driving behavior among different drivers. With the inputs of the short-term historical trajectory data and key driving behavior feature vectors, a probabilistic LSTMMD-DBV model combined with LSTM-based encoder-decoder networks and MDN layers is utilized to carry out personalized predictions. Finally, the SHapley Additive exPlanations (SHAP) method is employed to interpret the trained model for predictions. The proposed framework is tested based on a wide-range vehicle trajectory dataset. The results indicate that the proposed model can generate probabilistic future trajectories with remarkably improved predictions compared to existing benchmark models. Moreover, the results confirm that the additional input of driving behavior feature vectors representing the heterogeneity of driving behavior could provide more information and thus contribute to improving the prediction accuracy.
Abstract:Considering that Coupled Dictionary Learning (CDL) method can obtain a reasonable linear mathematical relationship between resource images, we propose a novel CDL-based Synthetic Aperture Radar (SAR) and multispectral pseudo-color fusion method. Firstly, the traditional Brovey transform is employed as a pre-processing method on the paired SAR and multispectral images. Then, CDL is used to capture the correlation between the pre-processed image pairs based on the dictionaries generated from the source images via enforced joint sparse coding. Afterward, the joint sparse representation in the pair of dictionaries is utilized to construct an image mask via calculating the reconstruction errors, and therefore generate the final fusion image. The experimental verification results of the SAR images from the Sentinel-1 satellite and the multispectral images from the Landsat-8 satellite show that the proposed method can achieve superior visual effects, and excellent quantitative performance in terms of spectral distortion, correlation coefficient, MSE, NIQE, BRISQUE, and PIQE.
Abstract:We propose a framework that can incrementally expand the explanatory temporal logic rule set to explain the occurrence of temporal events. Leveraging the temporal point process modeling and learning framework, the rule content and weights will be gradually optimized until the likelihood of the observational event sequences is optimal. The proposed algorithm alternates between a master problem, where the current rule set weights are updated, and a subproblem, where a new rule is searched and included to best increase the likelihood. The formulated master problem is convex and relatively easy to solve using continuous optimization, whereas the subproblem requires searching the huge combinatorial rule predicate and relationship space. To tackle this challenge, we propose a neural search policy to learn to generate the new rule content as a sequence of actions. The policy parameters will be trained end-to-end using the reinforcement learning framework, where the reward signals can be efficiently queried by evaluating the subproblem objective. The trained policy can be used to generate new rules in a controllable way. We evaluate our methods on both synthetic and real healthcare datasets, obtaining promising results.
Abstract:Learning first-order logic programs (LPs) from relational facts which yields intuitive insights into the data is a challenging topic in neuro-symbolic research. We introduce a novel differentiable inductive logic programming (ILP) model, called differentiable first-order rule learner (DFOL), which finds the correct LPs from relational facts by searching for the interpretable matrix representations of LPs. These interpretable matrices are deemed as trainable tensors in neural networks (NNs). The NNs are devised according to the differentiable semantics of LPs. Specifically, we first adopt a novel propositionalization method that transfers facts to NN-readable vector pairs representing interpretation pairs. We replace the immediate consequence operator with NN constraint functions consisting of algebraic operations and a sigmoid-like activation function. We map the symbolic forward-chained format of LPs into NN constraint functions consisting of operations between subsymbolic vector representations of atoms. By applying gradient descent, the trained well parameters of NNs can be decoded into precise symbolic LPs in forward-chained logic format. We demonstrate that DFOL can perform on several standard ILP datasets, knowledge bases, and probabilistic relation facts and outperform several well-known differentiable ILP models. Experimental results indicate that DFOL is a precise, robust, scalable, and computationally cheap differentiable ILP model.
Abstract:To better understand early brain growth patterns in health and disorder, it is critical to accurately segment infant brain magnetic resonance (MR) images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). Deep learning-based methods have achieved state-of-the-art performance; however, one of major limitations is that the learning-based methods may suffer from the multi-site issue, that is, the models trained on a dataset from one site may not be applicable to the datasets acquired from other sites with different imaging protocols/scanners. To promote methodological development in the community, iSeg-2019 challenge (http://iseg2019.web.unc.edu) provides a set of 6-month infant subjects from multiple sites with different protocols/scanners for the participating methods. Training/validation subjects are from UNC (MAP) and testing subjects are from UNC/UMN (BCP), Stanford University, and Emory University. By the time of writing, there are 30 automatic segmentation methods participating in iSeg-2019. We review the 8 top-ranked teams by detailing their pipelines/implementations, presenting experimental results and evaluating performance in terms of the whole brain, regions of interest, and gyral landmark curves. We also discuss their limitations and possible future directions for the multi-site issue. We hope that the multi-site dataset in iSeg-2019 and this review article will attract more researchers on the multi-site issue.
Abstract:In this paper, we propose the nonlinearity generation method to speed up and stabilize the training of deep convolutional neural networks. The proposed method modifies a family of activation functions as nonlinearity generators (NGs). NGs make the activation functions linear symmetric for their inputs to lower model capacity, and automatically introduce nonlinearity to enhance the capacity of the model during training. The proposed method can be considered an unusual form of regularization: the model parameters are obtained by training a relatively low-capacity model, that is relatively easy to optimize at the beginning, with only a few iterations, and these parameters are reused for the initialization of a higher-capacity model. We derive the upper and lower bounds of variance of the weight variation, and show that the initial symmetric structure of NGs helps stabilize training. We evaluate the proposed method on different frameworks of convolutional neural networks over two object recognition benchmark tasks (CIFAR-10 and CIFAR-100). Experimental results showed that the proposed method allows us to (1) speed up the convergence of training, (2) allow for less careful weight initialization, (3) improve or at least maintain the performance of the model at negligible extra computational cost, and (4) easily train a very deep model.