Abstract: Lane detection plays a pivotal role in autonomous vehicles and advanced driver assistance systems (ADAS). Over the years, numerous algorithms have emerged, ranging from rudimentary image processing techniques to sophisticated deep neural networks. The performance of deep learning-based models depends heavily on the quality of their training data. Consequently, these models often degrade in challenging scenarios such as extreme lighting conditions, partially visible lane markings, and sparse lane markings like Botts' dots. To address this, we present an end-to-end lane detection and classification system based on deep learning. We introduce a dataset, LVLane, curated to cover scenarios that pose significant challenges for state-of-the-art (SOTA) models, and we fine-tune selected models on it to improve localization accuracy. Moreover, we propose a CNN-based classification branch, integrated with the detector, that identifies distinct lane types. This architecture supports informed lane-changing decisions and more resilient ADAS capabilities. We also investigate the effect of mixed precision training and testing across different models and batch sizes. Experimental evaluations on the widely used TuSimple dataset, the Caltech lane dataset, and our LVLane dataset demonstrate the effectiveness of our model in detecting and classifying lanes in challenging scenarios. Our method achieves state-of-the-art classification results on the TuSimple dataset. The code will be published upon acceptance of the paper.
Abstract: Can performance on the task of action quality assessment (AQA) be improved by exploiting a description of the action and its quality? Current AQA and skills assessment approaches learn features that serve only one task: estimating the final score. In this paper, we propose to learn spatio-temporal features that explain three related tasks: fine-grained action recognition, commentary generation, and estimating the AQA score. A new multitask AQA dataset, the largest to date, comprising 1412 diving samples, was collected to evaluate our approach (http://rtis.oit.unlv.edu/datasets.html). We show that our multitask learning (MTL) approach outperforms the single-task learning (STL) approach using two different kinds of architectures: C3D-AVG and MSCADC. The C3D-AVG-MTL approach achieves new state-of-the-art performance with a rank correlation of 90.44%. Detailed experiments show that MTL generalizes better than STL, and that representations from action recognition models are not sufficient for the AQA task and should instead be learned.
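The abstract above reports performance as a rank correlation between predicted and judged scores. As a rough illustration of how such a number can be computed, the sketch below assumes Spearman's rank correlation (the abstract does not name the coefficient). The function names and the scores in the usage note are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman_rank_correlation(predicted, judged):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(predicted) != len(judged):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(predicted), average_ranks(judged))
```

For example, `spearman_rank_correlation([1, 2, 3, 4], [1, 3, 2, 4])` swaps one adjacent pair in the ordering and gives a value of about 0.8, which would be reported as 80% in the convention used above.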
Abstract: Can learning to measure the quality of an action help in measuring the quality of other actions? If so, can consolidated samples from multiple actions help improve the performance of current approaches? In this paper, we carry out experiments to see if knowledge transfer is possible in the action quality assessment (AQA) setting. Experiments are carried out on our newly released AQA dataset (http://rtis.oit.unlv.edu/datasets.html) consisting of 1106 action samples from seven actions with quality scores as measured by expert human judges. Our experimental results show that there is utility in learning a single model across multiple actions.
Abstract: Predicting the trajectories of pedestrians is essential for autonomous robots that share an environment with humans. To interact with humans effectively and safely, trajectory prediction needs to be both precise and computationally efficient. In this work, we propose a convolutional neural network (CNN) based human trajectory prediction approach. Unlike more recent LSTM-based models, which attend sequentially to each frame, our model supports increased parallelism and effective temporal representation. The proposed compact CNN model is faster than current approaches yet still yields competitive results.
Abstract: Estimating action quality, the process of assigning a "score" to the execution of an action, is crucial in areas such as sports and health care. Unlike action recognition, which has millions of examples to learn from, the action quality datasets that are currently available are small, typically comprising only a few hundred samples. This work presents three frameworks for evaluating Olympic sports which utilize spatiotemporal features learned using 3D convolutional neural networks (C3D) and perform score regression with i) SVR, ii) LSTM, and iii) LSTM followed by SVR. An efficient training mechanism for limited-data scenarios is presented for clip-based training with LSTM. The proposed systems show significant improvement over existing quality assessment approaches on the task of predicting scores of Olympic events (diving, vault, and figure skating). While the SVR-based frameworks yield better results, LSTM-based frameworks are more natural for describing an action and can be used for improvement feedback.
Abstract: This work explores the problem of exercise quality measurement, which is essential for effective management of diseases like cerebral palsy (CP). It examines automated quality assessment of large amplitude movement (LAM) exercises designed to treat CP. Exercise data was collected from trained participants to generate ideal examples for use as positive samples for machine learning. Following that, subjects were asked to deliberately make subtle errors during the exercise, such as restricting movements, as is commonly seen in patients suffering from CP. The quality measurement problem was then posed as a classification task: determining whether an example exercise was "good" or "bad". Popular machine learning techniques for classification, including support vector machines (SVM), single- and double-layered neural networks (NN), boosted decision trees, and dynamic time warping (DTW), were compared. The AdaBoosted tree performed best with an accuracy of 94.68%, demonstrating the feasibility of assessing exercise quality.
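Of the techniques compared above, dynamic time warping is the least self-explanatory as a classifier. The abstract does not say how DTW was turned into a "good"/"bad" decision, so the sketch below assumes one common option: label a new sequence with the label of its nearest template under DTW distance. The function names, the one-dimensional signals, and the templates are hypothetical illustrations, not the study's data or method.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j],
    using absolute difference as the per-step cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_by_nearest_template(sample, templates):
    """Return the label of the template closest to `sample` under DTW.

    `templates` is a list of (sequence, label) pairs (assumed format).
    """
    best_sequence, best_label = min(
        templates, key=lambda pair: dtw_distance(sample, pair[0])
    )
    return best_label


# Hypothetical joint-angle traces: a full-range movement and a restricted one.
TEMPLATES = [
    ([0, 1, 2, 3, 2, 1, 0], "good"),
    ([0, 1, 1, 1, 1, 1, 0], "bad"),
]
```

With these templates, `classify_by_nearest_template([0, 1, 2, 2, 3, 2, 1, 0], TEMPLATES)` selects "good": the sample is the full-range movement performed slightly slower, and DTW absorbs the timing difference where a point-by-point comparison would not.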