Abstract: Efficient continual learning techniques have been a topic of significant research over the last few years. A fundamental problem with such learning is severe degradation of performance on previously learned tasks, also known as catastrophic forgetting. This paper introduces a novel method to reduce catastrophic forgetting in the context of incremental class learning called Gradient Correlation Subspace Learning (GCSL). The method detects a subspace of the weights that is least affected by previous tasks and projects the weights to train for the new task into said subspace. The method can be applied to one or more layers of a given network architecture, and the size of the subspace used can be altered from layer to layer and task to task. Code will be available at https://github.com/vgthengane/GCSL.
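The subspace idea in the abstract above can be illustrated with a minimal sketch. It assumes, since the abstract does not give the details, that the "least affected" subspace is taken from the eigenvectors of a correlation matrix of previous-task gradients with the smallest eigenvalues, and that a new-task gradient for one layer is projected onto that subspace. The function names `least_affected_subspace` and `project`, the toy data, and the choice of `k` are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def least_affected_subspace(prev_grads, k):
    """Return a (d, k) orthonormal basis for the k directions in which
    the previous-task gradients have the least energy.

    prev_grads: (n, d) array, one flattened gradient per row (assumed layout).
    """
    # Gradient correlation matrix, shape (d, d).
    corr = prev_grads.T @ prev_grads / prev_grads.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the first k eigenvectors are the least-used directions.
    _, eigvecs = np.linalg.eigh(corr)
    return eigvecs[:, :k]


def project(vec, basis):
    """Orthogonally project vec onto the span of the basis columns."""
    return basis @ (basis.T @ vec)


rng = np.random.default_rng(0)

# Toy previous-task gradients that only touch the first two of five weights.
prev_grads = np.zeros((20, 5))
prev_grads[:, :2] = rng.normal(size=(20, 2))

basis = least_affected_subspace(prev_grads, k=3)

# A new-task gradient is restricted to the least-affected subspace,
# so the weights that mattered for earlier tasks are left alone.
new_grad = rng.normal(size=5)
projected = project(new_grad, basis)
```

In this toy setting the projected gradient is zero on the two weights used by the earlier tasks and unchanged on the other three. Varying `k` per layer and per task corresponds to the adjustable subspace size mentioned in the abstract.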
Abstract: This paper introduces a new approach to sound source localization using head-related transfer function (HRTF) characteristics, which enable precise full-sphere localization from raw data. Previous research focused primarily on extensive microphone arrays in the frontal plane, an arrangement that often loses accuracy and robustness when applied to smaller microphone arrays. Our model uses both the time and frequency domains for sound source localization within a deep learning (DL) approach. The performance of our proposed model surpasses the current state-of-the-art results. Specifically, it achieves an average angular error of 0.24 degrees and an average Euclidean distance of 0.01 meters, while the known state of the art gives an average angular error of 19.07 degrees and an average Euclidean distance of 1.08 meters. This level of accuracy is of paramount importance for a wide range of applications, including robotics, virtual reality, and aiding individuals with cochlear implants (CI).
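The two error measures quoted in the abstract above can be made concrete with a short sketch. The abstract does not define how they are computed, so this assumes that the angular error is the angle between the predicted and true source position vectors (measured from the origin at the listener's head) and that the Euclidean distance is the straight-line distance between the two positions. The function names are hypothetical.

```python
import math


def angular_error_deg(pred, true):
    """Angle in degrees between two non-zero position vectors."""
    dot = sum(p * t for p, t in zip(pred, true))
    norms = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in true))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_angle = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cos_angle))


def euclidean_distance(pred, true):
    """Straight-line distance between two points, in the input units."""
    return math.dist(pred, true)


# Example: a prediction on the x-axis against a source on the y-axis.
err_deg = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
dist_m = euclidean_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```

Averaging these two quantities over a test set would give figures of the kind reported above, under the stated assumptions about how the metrics are defined.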
Abstract: This paper describes an updated interactive performance system for floor and aerial dance that controls visual and sonic aspects of the presentation via a depth-sensing camera (MS Kinect). To detect, measure, and track free movement in space, 3-degree-of-freedom (3-DOF) tracking (on the ground and in the air) is performed using IR markers. A method for multi-target tracking has been added and is described in detail. The paper also describes an improved gesture tracking and recognition system called Action Graph (AG). Action Graph is constructed efficiently and incrementally from a single long sequence of movement features, and it automatically captures repeated sub-segments in the movement from start to finish with no manual interaction needed. Other advanced capabilities are discussed as well. By using the new gesture model, we can unify an entire choreography piece by dynamically tracking and recognizing gestures and sub-portions of the piece. This gives the performer the freedom to improvise based on a set of recorded gestures or portions of the choreography and have the system respond dynamically to the performer within a set of related rehearsed actions, an ability that has not been seen in any other system to date.