Abstract: As autonomous robots move into complex, dynamic real-world environments, they must learn to navigate safely in real time, yet anticipating all possible behaviors is infeasible. We propose a composable, model-free reinforcement learning method that learns a value function and an optimal policy for each individual environment element (e.g., goal or obstacle) and composes them online to achieve goal reaching and collision avoidance. Assuming unknown, input-affine nonlinear dynamics that evolve in continuous time, we derive a continuous-time Hamilton-Jacobi-Bellman (HJB) equation for the value function and show that the corresponding advantage function is quadratic in the action and the optimal policy. Based on this structure, we introduce a model-free actor-critic algorithm that learns policies and value functions for static or moving obstacles using gradient descent. We then compose multiple reach/avoid models via a quadratically constrained quadratic program (QCQP), which yields formal obstacle-avoidance guarantees in terms of value-function level sets and provides a model-free alternative to controllers based on control Lyapunov functions and control barrier functions (CLF/CBF). Simulations demonstrate improved performance over a PPO baseline applied to a discrete-time approximation.
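
The quadratic structure mentioned in the abstract above can be illustrated with a short textbook-style derivation. This is a sketch under assumptions that the abstract does not state: an undiscounted cost, a running cost of the form $q(x) + u^\top R u$ with $R \succ 0$, and the symbols $f$, $g$, $q$, $R$, $V$, and $A$ chosen here for illustration. The paper's own formulation may differ.

```latex
% Assumed input-affine dynamics and running cost (illustrative notation)
\dot{x} = f(x) + g(x)\,u, \qquad r(x,u) = q(x) + u^\top R\, u, \quad R \succ 0

% Continuous-time HJB equation for the value function V (undiscounted case)
0 = \min_{u} \Big[\, q(x) + u^\top R\, u + \nabla V(x)^\top \big( f(x) + g(x)\,u \big) \Big]

% The minimizer follows from the first-order condition 2Ru + g^\top \nabla V = 0
u^*(x) = -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V(x)

% Substituting g^\top \nabla V = -2 R u^* into the bracket and subtracting its minimum
A(x,u) = \big(u - u^*(x)\big)^\top R\, \big(u - u^*(x)\big)
```

Under these assumptions the advantage depends on the action only through its distance to the optimal action, weighted by $R$, which is one way to read "quadratic in the action and the optimal policy".
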
Abstract: A two-wheeled self-balancing robot (TWSBR) is a nonlinear and unstable system. This study compares the performance of model-based and data-based control strategies for TWSBRs, with an explicitly practical, educational focus. Model-based control (MBC) algorithms such as Lead-Lag and PID control require proficiency in dynamic modeling and mathematical manipulation to derive the linearized equations of motion and design an appropriate controller. On the other hand, data-based control (DBC) methods, such as fuzzy control, provide a simpler and quicker approach to designing effective controllers without requiring an in-depth understanding of the system model. In this paper, the advantages and disadvantages of both MBC and DBC are illustrated using a TWSBR. All controllers were implemented and tested on the OSOYOO self-balancing kit, which includes an Arduino microcontroller, an MPU-6050 sensor, and DC motors. The control law and the user interface are constructed using the LabVIEW-LINX toolkit. A real-time hardware-in-the-loop experiment validates the results, highlighting controllers that can be implemented on a cost-effective platform.
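
As a rough illustration of the model-based side of the comparison above, the following is a minimal sketch of a discrete-time PID loop stabilizing a toy linearized tilt model. It is not the controller from the paper, which is built with LabVIEW-LINX; the plant coefficients `a` and `b`, the gains, and the function name `simulate_balance` are hypothetical values chosen only so that the example runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete-time PID controller (illustrative, not the paper's design)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term with a rectangular rule.
        self.integral += error * dt
        # Skip the derivative on the first call to avoid a start-up kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_balance(duration: float = 5.0, dt: float = 0.001) -> float:
    """Simulate theta'' = a*theta + b*u (hypothetical linearized tilt model).

    Returns the tilt angle in radians at the end of the simulation.
    """
    a, b = 20.0, 1.0          # assumed plant coefficients (unstable when u = 0)
    theta, omega = 0.1, 0.0   # initial tilt (rad) and tilt rate (rad/s)
    pid = PID(kp=100.0, ki=200.0, kd=20.0)  # assumed gains, hand-picked for this toy plant
    for _ in range(int(duration / dt)):
        u = pid.update(0.0 - theta, dt)   # error = upright setpoint minus measured tilt
        omega += (a * theta + b * u) * dt  # semi-implicit Euler integration
        theta += omega * dt
    return theta


if __name__ == "__main__":
    print(f"final tilt: {simulate_balance():.6f} rad")
```

Choosing the gains here required knowing `a` and `b`, which mirrors the modeling effort the abstract attributes to MBC; a fuzzy controller would replace the gain tuning with hand-written rules on the measured tilt and tilt rate.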




Abstract: Due to the increased demand for music streaming and recommender services and recent developments in music information retrieval frameworks, Music Genre Classification (MGC) has attracted the community's attention. However, convolution-based approaches are known to lack the ability to efficiently encode and localize temporal features. In this paper, we study broadcast-based neural networks, aiming to improve localization and generalizability with a small number of parameters (about 180k). We investigate twelve variants of broadcast networks and discuss the effects of block configuration, pooling method, activation function, normalization mechanism, label smoothing, channel interdependency, LSTM block inclusion, and variants of inception schemes. Our computational experiments on relevant datasets such as GTZAN, Extended Ballroom, HOMBURG, and Free Music Archive (FMA) show state-of-the-art classification accuracies in MGC. Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.