Abstract: Human action recognition in low-light environments is crucial for various real-world applications. However, existing approaches fail to fully exploit brightness information throughout the training phase, leading to suboptimal performance. To address this limitation, we propose OwlSight, a biomimetic framework with whole-stage illumination enhancement that interacts with action classification for accurate human action recognition in dark videos. Specifically, OwlSight incorporates a Time-Consistency Module (TCM) to capture shallow spatiotemporal features while maintaining temporal coherence; these features are then processed by a Luminance Adaptation Module (LAM), which dynamically adjusts brightness based on the input luminance distribution. Furthermore, a Reflect Augmentation Module (RAM) is introduced to maximize illumination utilization and simultaneously enhance action recognition via two interactive paths. Additionally, we build Dark-101, a large-scale dataset comprising 18,310 dark videos across 101 action categories, significantly surpassing existing datasets (e.g., ARID1.5 and Dark-48) in scale and diversity. Extensive experiments demonstrate that the proposed OwlSight achieves state-of-the-art performance across four low-light action recognition benchmarks. Notably, it outperforms the previous best approaches by 5.36% on ARID1.5 and 1.72% on Dark-101, highlighting its effectiveness in challenging dark environments.
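To make the TCM-LAM-RAM pipeline concrete, the following is a minimal, speculative sketch of how such a staged architecture could be wired together in PyTorch. The abstract only names the modules and their roles, so every module body, channel count, and the single classification head standing in for RAM's two interactive paths are placeholder assumptions, not the authors' implementation.

```python
# Speculative sketch of an OwlSight-style pipeline (TCM -> LAM -> head).
# All module internals are placeholders assumed for illustration only.
import torch
import torch.nn as nn

class TimeConsistencyModule(nn.Module):
    """Placeholder: shallow 3-D convolution meant to extract spatiotemporal
    features while keeping the temporal dimension intact."""
    def __init__(self, in_ch=3, out_ch=16):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, video):                     # video: (B, C, T, H, W)
        return torch.relu(self.conv(video))

class LuminanceAdaptationModule(nn.Module):
    """Placeholder: predicts a per-clip gain from pooled feature statistics
    and rescales the features, standing in for dynamic brightness adjustment
    based on the input luminance distribution."""
    def __init__(self, ch=16):
        super().__init__()
        self.gain = nn.Sequential(nn.AdaptiveAvgPool3d(1), nn.Flatten(),
                                  nn.Linear(ch, 1), nn.Sigmoid())

    def forward(self, feats):
        g = self.gain(feats).view(-1, 1, 1, 1, 1)  # gain in (0, 1)
        return feats * (1.0 + g)

class OwlSightSketch(nn.Module):
    def __init__(self, num_classes=101):
        super().__init__()
        self.tcm = TimeConsistencyModule()
        self.lam = LuminanceAdaptationModule()
        # The RAM's two interactive paths are not specified in the abstract;
        # a pooled classification head stands in for them here.
        self.head = nn.Sequential(nn.AdaptiveAvgPool3d(1), nn.Flatten(),
                                  nn.Linear(16, num_classes))

    def forward(self, video):
        return self.head(self.lam(self.tcm(video)))

# Example: one 8-frame 112x112 RGB clip -> 101 class logits.
logits = OwlSightSketch()(torch.randn(1, 3, 8, 112, 112))
```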
Abstract: Powered prosthetic legs must anticipate the user's intent when switching between different locomotion modes (e.g., level walking, stair ascent/descent, ramp ascent/descent). Numerous data-driven classification techniques have demonstrated promising results for predicting user intent, but the performance of these intent prediction models on novel subjects remains unsatisfactory. In other domains (e.g., image classification), transfer learning has improved classification accuracy by reusing features learned from a large dataset (i.e., a pre-trained model) and transferring the learned model to a new task where only a smaller dataset is available. In this paper, we develop a deep convolutional neural network with intra-subject (subject-dependent) and inter-subject (subject-independent) validations based on a human locomotion dataset. We then apply transfer learning to the subject-independent model using a small portion (10%) of the data from the left-out subject, and compare the performance of the three models. Our results indicate that the transfer learning (TL) model outperforms the subject-independent (IND) model and is comparable to the subject-dependent (DEP) model (DEP Error: 0.74 $\pm$ 0.002%, IND Error: 11.59 $\pm$ 0.076%, TL Error: 3.57 $\pm$ 0.02% with 10% data). Moreover, as expected, transfer learning accuracy increases with the availability of more data from the left-out subject. We also evaluate the performance of the intent prediction system across various sensor configurations that may be available in a prosthetic leg application. Our results suggest that a thigh IMU on the prosthesis is sufficient to predict locomotion intent in practice.
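The subject-independent pre-training followed by fine-tuning on 10% of the left-out subject's data can be sketched as below. The network architecture, window length, IMU channel count, and the choice to retrain only the classifier head are illustrative assumptions; the abstract does not specify these details.

```python
# Minimal sketch of the subject-independent -> transfer-learning workflow.
# Layer sizes, window length, and which layers are fine-tuned are assumptions.
import torch
import torch.nn as nn

NUM_MODES = 5      # level walking, stair ascent/descent, ramp ascent/descent
NUM_CHANNELS = 6   # e.g., one thigh IMU: 3-axis accelerometer + 3-axis gyroscope
WINDOW = 200       # samples per analysis window (assumed)

class IntentCNN(nn.Module):
    """1-D CNN over a window of IMU samples -> locomotion-mode logits."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(NUM_CHANNELS, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, NUM_MODES)

    def forward(self, x):             # x: (batch, channels, window)
        return self.classifier(self.features(x).flatten(1))

def transfer_learn(pretrained, x_new, y_new, epochs=20):
    """Fine-tune a subject-independent model on ~10% of the left-out subject's
    data. Here only the classifier head is retrained; unfreezing more layers
    is an equally plausible design choice."""
    for p in pretrained.features.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(pretrained.classifier.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(pretrained(x_new), y_new)
        loss.backward()
        opt.step()
    return pretrained

# Example with dummy data standing in for the left-out subject's 10% split.
model = IntentCNN()  # in practice, weights come from subject-independent training
x_new = torch.randn(64, NUM_CHANNELS, WINDOW)
y_new = torch.randint(0, NUM_MODES, (64,))
model = transfer_learn(model, x_new, y_new)
```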