Abstract: Ultrasound images of the forearm can be used to classify hand gestures towards developing human-machine interfaces. In our previous work, we demonstrated gesture classification using ultrasound on a single subject without removing the probe before evaluation. This approach has practical limitations: once the probe is removed and repositioned, accuracy declines because classifier performance is sensitive to the probe location on the arm. In this paper, we propose training a model on multiple data collection sessions to create a generalized model, using incremental learning through fine-tuning. Ultrasound data were acquired for 5 hand gestures both within a session (without removing and reattaching the probe) and across sessions. A convolutional neural network (CNN) with 5 cascaded convolution layers was used for this study. The pre-trained CNN was fine-tuned with the convolution blocks acting as a feature extractor, and the parameters of the remaining layers were updated incrementally. Fine-tuning was performed using different data splits within a session and across multiple sessions. We found that incremental fine-tuning improves classification accuracy as the number of fine-tuning sessions increases. After 2 fine-tuning sessions for each experiment, we observed an approximately 10% increase in classification accuracy. This work demonstrates that incremental learning through fine-tuning for ultrasound-based hand gesture classification improves accuracy while saving storage, processing power, and time. The approach can be extended to generalize across multiple subjects and towards developing personalized wearable devices.
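The abstract only outlines the method, so the sketch below illustrates one plausible form of the described scheme: a 5-block CNN whose convolution blocks are frozen as a feature extractor while the remaining (classifier) layers are fine-tuned on each new data collection session. PyTorch, the class and function names (`UltrasoundCNN`, `fine_tune_on_session`), the layer widths, input size, and optimizer settings are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed PyTorch implementation) of incremental fine-tuning
# with frozen convolution blocks; architecture details are hypothetical.
import torch
import torch.nn as nn

NUM_GESTURES = 5  # five hand gestures, per the abstract


class UltrasoundCNN(nn.Module):
    """CNN with 5 cascaded convolution blocks followed by a dense classifier head."""

    def __init__(self, num_classes: int = NUM_GESTURES):
        super().__init__()
        channels = [1, 16, 32, 64, 128, 128]  # assumed channel widths
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)      # convolution blocks (feature extractor)
        self.classifier = nn.Sequential(            # layers updated during fine-tuning
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 64),             # assumes 1x128x128 input images
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def fine_tune_on_session(model, session_loader, epochs=5, lr=1e-4):
    """Incrementally fine-tune the model on one new data collection session.

    The convolution blocks are frozen and act as a fixed feature extractor;
    only the classifier-head parameters are updated.
    """
    for p in model.features.parameters():
        p.requires_grad = False

    optimizer = torch.optim.Adam(model.classifier.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in session_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```

In this sketch, a model pre-trained on an initial session would be passed through `fine_tune_on_session` once per subsequent session, so only the small classifier head is retrained each time, which is what keeps the storage and processing cost of each incremental update low.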