Johnny
Abstract:We have developed a high-performance Chinese Chess AI that operates without reliance on search algorithms. This AI has demonstrated the capability to compete at a level commensurate with the top 0.1\% of human players. By eliminating the search process typically associated with such systems, this AI achieves a Queries Per Second (QPS) rate that exceeds that of systems based on the Monte Carlo Tree Search (MCTS) algorithm by over a thousandfold and that of systems based on the AlphaBeta pruning algorithm by more than a hundredfold. The AI training system consists of two parts: supervised learning and reinforcement learning. Supervised learning provides an initial human-like Chinese chess AI, while reinforcement learning, building on the supervised model, elevates the strength of the AI to a new level. Using this training system, we carried out thorough ablation experiments and found that: 1. with the same number of parameters, the Transformer architecture outperforms CNNs on Chinese chess; 2. using the possible moves of both sides as features greatly improves the training process; 3. a selective opponent pool, compared to pure self-play training, yields a faster improvement curve and a higher strength ceiling; 4. Value Estimation with Cutoff (VECT) improves the training process of the original PPO algorithm, and we explain why.
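As a rough illustration of what playing without search can look like, the sketch below picks a move from a single set of policy scores by restricting a softmax to the legal moves. The move strings, the dictionary format, and the function name `select_move` are assumptions made for this example; the abstract does not describe the actual network outputs or move encoding.

```python
import math


def select_move(logits, legal_moves):
    """Pick a move from policy scores with one evaluation and no tree search.

    logits: dict mapping a move string to a raw policy score (hypothetical format).
    legal_moves: iterable of move strings that are legal in the current position.
    Returns the highest-probability legal move and the probabilities over legal moves.
    """
    legal = [m for m in legal_moves if m in logits]
    if not legal:
        raise ValueError("no legal move has a policy score")
    # Numerically stable softmax restricted to the legal moves only.
    peak = max(logits[m] for m in legal)
    weights = {m: math.exp(logits[m] - peak) for m in legal}
    total = sum(weights.values())
    probs = {m: w / total for m, w in weights.items()}
    best = max(probs, key=probs.get)
    return best, probs
```

Because the move comes from one evaluation rather than many simulated continuations, the cost per query is a single forward pass, which is consistent with the throughput advantage the abstract reports.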
Abstract:In this paper, we present a novel algorithm, uniINF, for the Heavy-Tailed Multi-Armed Bandits (HTMAB) problem, demonstrating robustness and adaptability in both stochastic and adversarial environments. Unlike the stochastic MAB setting where loss distributions are stationary with time, our study extends to the adversarial setup, where losses are generated from heavy-tailed distributions that depend on both arms and time. Our novel algorithm `uniINF` enjoys the so-called Best-of-Both-Worlds (BoBW) property, performing optimally in both stochastic and adversarial environments without knowing the exact environment type. Moreover, our algorithm also possesses a Parameter-Free feature, i.e., it operates without the need of knowing the heavy-tail parameters $(\sigma, \alpha)$ a-priori. To be precise, uniINF ensures nearly-optimal regret in both stochastic and adversarial environments, matching the corresponding lower bounds when $(\sigma, \alpha)$ is known (up to logarithmic factors). To our knowledge, uniINF is the first parameter-free algorithm to achieve the BoBW property for the heavy-tailed MAB problem. Technically, we develop innovative techniques to achieve BoBW guarantees for Parameter-Free HTMABs, including a refined analysis for the dynamics of log-barrier, an auto-balancing learning rate scheduling scheme, an adaptive skipping-clipping loss tuning technique, and a stopping-time analysis for logarithmic regret.
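For orientation, the block below writes out the heavy-tail moment condition and the benchmark regret rates as they are commonly stated in the heavy-tailed bandit literature. The notation ($K$ arms, horizon $T$, gaps $\Delta_i$, means $\mu_{t,i}$) is an assumption of this sketch, and the paper's exact definitions and constants may differ.

```latex
% Heavy-tail assumption with parameters (\sigma, \alpha), \alpha \in (1, 2]:
\mathbb{E}\bigl[\lvert \ell_{t,i} \rvert^{\alpha}\bigr] \le \sigma^{\alpha}
\quad \text{for every arm } i \in [K] \text{ and round } t \in [T].

% Pseudo-regret against the best fixed arm, with \mu_{t,i} = \mathbb{E}[\ell_{t,i}]:
\mathcal{R}_T = \mathbb{E}\Bigl[\sum_{t=1}^{T} \mu_{t,i_t}\Bigr]
              - \min_{i \in [K]} \sum_{t=1}^{T} \mu_{t,i}.

% Benchmark rates when (\sigma, \alpha) is known, up to logarithmic factors:
\text{adversarial: } \mathcal{R}_T = \Theta\bigl(\sigma K^{1 - 1/\alpha} T^{1/\alpha}\bigr),
\qquad
\text{stochastic: } \mathcal{R}_T = O\Bigl(\sum_{i : \Delta_i > 0}
  \bigl(\sigma^{\alpha} / \Delta_i\bigr)^{1/(\alpha - 1)} \log T\Bigr).
```

For $\alpha = 2$ the adversarial rate reduces to the familiar $\sigma\sqrt{KT}$ scaling of bounded-variance bandits.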
Abstract:Our team developed a riding ballbot (called PURE) that is dynamically stable, omnidirectional, and driven by lean-to-steer control. A hands-free admittance control scheme (HACS) was previously integrated to allow riders with different torso functions to control the robot's movements via torso leaning and twisting. Such an interface requires motor coordination skills and could result in collisions with obstacles due to low proficiency. Hence, a shared controller (SC) that limits the speed of PURE could be helpful to ensure the safety of riders. However, the self-balancing dynamics of PURE could result in a weak control authority of its motion, in which the torso motion of the rider could easily result in poor tracking of the command speed dictated by the shared controller. Thus, we proposed an interactive hands-free admittance control scheme (iHACS), which added two modules to HACS to improve the speed-tracking performance of PURE: control gain personalization module and interaction compensation module. Human riding tests of simple tasks, idle-keeping and speed-limiting, were conducted to compare the performance of HACS and iHACS. Two manual wheelchair users and two able-bodied individuals participated in this study. They were instructed to use "adversarial" torso motions that would tax the SC's ability to keep the ballbot idling or below a set speed. In the idle-keeping tasks, iHACS demonstrated minimal translational motion and low command speed tracking RMSE, even with significant torso lean angles. During the speed-limiting task with command speed saturated at 0.5 m/s, the system achieved an average maximum speed of 1.1 m/s with iHACS, compared with that of over 1.9 m/s with HACS. These results suggest that iHACS can enhance PURE's control authority over the rider, which enables PURE to provide physical interactions back to the rider and results in a collaborative rider-robot synergy.
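To make the idea of an admittance law with a speed limit concrete, here is a minimal sketch of a generic first-order admittance filter: a sensed rider force drives virtual mass-damper dynamics, and the resulting command speed is saturated. This is a textbook form, not the HACS or iHACS controller; the virtual mass, damping, and force values are hypothetical, and only the 0.5 m/s saturation level is taken from the abstract.

```python
def admittance_step(velocity, force, dt, mass=10.0, damping=20.0, v_max=0.5):
    """One explicit-Euler step of mass * dv/dt + damping * v = force, then clamp.

    mass and damping are hypothetical virtual parameters; v_max is the speed limit.
    """
    accel = (force - damping * velocity) / mass
    candidate = velocity + dt * accel
    # Saturate the command speed, as a shared controller might.
    return max(-v_max, min(v_max, candidate))


def simulate(force, steps=2000, dt=0.01, **kwargs):
    """Apply a constant force from rest and return the final command speed."""
    velocity = 0.0
    for _ in range(steps):
        velocity = admittance_step(velocity, force, dt, **kwargs)
    return velocity
```

With these defaults a constant 5 N input settles near force / damping = 0.25 m/s, while a 40 N input would settle at 2 m/s without the clamp and is instead held at 0.5 m/s. The sketch covers only the command side; the tracking problem the abstract describes arises because the physical ballbot may not follow this command under strong torso motion.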
Abstract:This study introduces the development of hands-free control schemes for a riding ballbot, designed to allow riders including manual wheelchair users to control its movement through torso leaning and twisting. The hardware platform, Personal Unique Rolling Experience (PURE), utilizes a ballbot drivetrain, a dynamically stable mobile robot that uses a ball as its wheel to provide omnidirectional maneuverability. To accommodate users with varying torso motion functions, the hands-free control scheme should be adjustable based on the rider's torso function and personal preferences. Therefore, concepts of (a) impedance control and (b) admittance control were integrated into the control scheme. A duo-agent optimization framework was utilized to assess the efficiency of this rider-ballbot system for a safety-critical task: braking from 1.4 m/s. The candidate control schemes were further implemented in the physical robot hardware and validated with two experienced users, demonstrating the efficiency and robustness of the hands-free admittance control scheme (HACS). This interface, which utilized physical human-robot interaction (pHRI) as the input, resulted in lower braking effort and shorter braking distance and time. Subsequently, 12 novice participants (six able-bodied users and six manual wheelchair users) with different levels of torso motion capability were recruited to benchmark the braking performance with HACS. The indoor navigation capability of PURE was further demonstrated with these participants in courses simulating narrow hallways, tight turns, and navigation through static and dynamic obstacles. By exploiting pHRI, the proposed admittance-style control scheme provided effective control of the ballbot via torso motions. This interface enables PURE to provide a personal unique rolling experience to manual wheelchair users for safe and agile indoor navigation.
Abstract:Sequential models such as recurrent neural networks or transformer-based models have become \textit{de facto} tools for probabilistic multivariate time series forecasting, with applications across a wide range of domains such as finance, biology, and medicine. Despite their adeptness at capturing dependencies and assessing prediction uncertainty, and their efficiency in training, challenges emerge in modeling high-dimensional complex distributions and cross-feature dependencies. To tackle these issues, recent works delve into generative modeling by employing diffusion or flow-based models. Notably, the integration of stochastic differential equations or probability flow successfully extends these methods to probabilistic time series imputation and forecasting. However, scalability issues necessitate a computationally friendly framework for large-scale generative model-based predictions. This work proposes a novel approach that blends the computational efficiency of recurrent neural networks with the high-quality probabilistic modeling of the diffusion model, which addresses these challenges and advances generative models' application in time series forecasting. Our method relies on the foundation of stochastic interpolants and the extension to a broader conditional generation framework with additional control features, offering insights for future developments in this dynamic field.
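For readers unfamiliar with stochastic interpolants, the sketch below draws one point on a path between a source sample and a target sample. It assumes the linear schedule $x_t = (1 - t)\,x_0 + t\,x_1 + \gamma(t)\,z$ with $\gamma(t) = \sqrt{2t(1-t)}$, which is one common choice in the stochastic interpolant literature; the abstract does not say which schedule the method uses, and the function name is hypothetical.

```python
import math
import random


def interpolant_sample(x0, x1, t, rng=random):
    """Sample x_t = (1 - t) * x0 + t * x1 + gamma(t) * z with z ~ N(0, I).

    gamma(t) = sqrt(2 t (1 - t)) is an assumed schedule that vanishes at t = 0 and t = 1,
    so the path starts exactly at x0 and ends exactly at x1.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    gamma = math.sqrt(2.0 * t * (1.0 - t))
    return [
        (1.0 - t) * a + t * b + gamma * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, x1)
    ]
```

In a forecasting setting, `x0` could be the current observation and `x1` the next one, so a learned model only has to transport between neighboring time steps; that pairing is an illustrative assumption rather than a description of the proposed method.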
Abstract:This study introduces a shared-control approach for collision avoidance in a self-balancing riding ballbot, called PURE, marked by its dynamic stability, omnidirectional movement, and hands-free interface. Integrated with a sensor array and a novel Passive Artificial Potential Field (PAPF) method, PURE provides intuitive navigation with deceleration assistance and haptic/audio feedback, effectively mitigating collision risks. This approach addresses the limitations of traditional APF methods, such as control oscillations and unnecessary speed reduction in challenging scenarios. A human-robot interaction experiment, with 20 manual wheelchair users and able-bodied individuals, was conducted to evaluate the performance of indoor navigation and obstacle avoidance with the proposed shared-control algorithm. Results indicated that shared-control significantly reduced collisions and cognitive load without affecting travel speed, offering intuitive and safe operation. These findings highlight the shared-control system's suitability for enhancing collision avoidance in self-balancing mobility devices, a relatively unexplored area in assistive mobility research.
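As background for the traditional APF methods the abstract refers to, the sketch below evaluates the classical repulsive term derived from the potential $U(d) = \tfrac{1}{2}\,\eta\,(1/d - 1/d_0)^2$ for $d < d_0$. It is the textbook formulation, not the proposed PAPF method, and the gain and influence radius are hypothetical values.

```python
def repulsive_force(distance, influence_radius=2.0, gain=1.0):
    """Magnitude of the classical APF repulsive force at a given obstacle distance.

    Derived from U(d) = 0.5 * gain * (1/d - 1/d0)^2 for d < d0, and zero otherwise.
    """
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    if distance >= influence_radius:
        return 0.0  # obstacles outside the influence radius exert no force
    return gain * (1.0 / distance - 1.0 / influence_radius) / distance ** 2
```

The term is zero at the influence radius and grows sharply as the distance shrinks: with the defaults it is 0.5 at 1 m and 6.0 at 0.5 m.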
Abstract:Several upper-limb exoskeleton robots have been developed for stroke rehabilitation, but their rather low level of individualized assistance typically limits their effectiveness and practicability. Individualized assistance involves an upper-limb exoskeleton robot continuously assessing feedback from a stroke patient and then meticulously adjusting interaction forces to suit specific conditions and online changes. This paper describes the development of a new upper-limb exoskeleton robot with a novel online generative capability that allows it to provide individualized assistance to support the rehabilitation training of stroke patients. Specifically, the upper-limb exoskeleton robot exploits generative models to customize a suitable, well-fitted trajectory for each patient, as medical conditions, responses, and comfort feedback during training generally differ between patients. This generative capability is integrated into the two working modes of the upper-limb exoskeleton robot: an active mirroring mode for patients who retain motor abilities on one side of the body and a passive following mode for patients who lack motor ability on both sides of the body. The performance of the upper-limb exoskeleton robot was illustrated in experiments involving healthy subjects and stroke patients.
Abstract:In Explainable AI, rule extraction translates model knowledge into logical rules, such as IF-THEN statements, which are crucial for understanding patterns learned by black-box models. This could significantly aid in fields like disease diagnosis, disease progression estimation, or drug discovery. However, such application domains often contain imbalanced data, with the class of interest underrepresented. Existing methods inevitably compromise the performance of rules for the minority class to maximise the overall performance. As the first attempt in this field, we propose a model-agnostic approach for extracting rules from specific subgroups of data, featuring automatic rule generation for numerical features. This method enhances the regional explainability of machine learning models and offers wider applicability compared to existing methods. We additionally introduce a new method for selecting features to compose rules, reducing computational costs in high-dimensional spaces. Experiments across various datasets and models demonstrate the effectiveness of our methods.
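To show what an IF-THEN rule over numerical features looks like in practice, the sketch below represents a rule as a set of half-open intervals and scores it on one target class. The representation, feature names, and function names are assumptions made for illustration; the abstract does not specify how the proposed method encodes or evaluates rules.

```python
def rule_matches(rule, sample):
    """Return True if every feature in the rule falls inside its [low, high) interval."""
    return all(low <= sample[feature] < high for feature, (low, high) in rule.items())


def evaluate_rule(rule, samples, labels, target):
    """Precision and recall of 'IF rule THEN target' on a labeled dataset."""
    covered = [y for x, y in zip(samples, labels) if rule_matches(rule, x)]
    hits = sum(1 for y in covered if y == target)
    positives = sum(1 for y in labels if y == target)
    precision = hits / len(covered) if covered else 0.0
    recall = hits / positives if positives else 0.0
    return precision, recall
```

Scoring a rule on the class of interest alone, rather than on overall accuracy, is one way to keep an underrepresented class from being traded away, which is the failure mode the abstract attributes to existing methods.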
Abstract:In this work, we introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory. Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images. This limits these methods to a low-resolution representation and makes it difficult to scale up to the dense views for better quality. GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms to effectively integrate image features into 3D representations. We implement this solution through a two-stage pipeline: initially, a lightweight proposal network generates a sparse set of 3D anchor points from the posed image inputs; subsequently, a specialized reconstruction transformer refines the geometry and retrieves textural details. Extensive experimental results demonstrate that GeoLRM significantly outperforms existing models, especially for dense view inputs. We also demonstrate the practical applicability of our model with 3D generation tasks, showcasing its versatility and potential for broader adoption in real-world applications.
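The explicit geometric relationship between 3D points and posed 2D images mentioned above can be illustrated with a standard pinhole projection. The sketch assumes a world-to-camera rotation and translation plus intrinsics `fx`, `fy`, `cx`, `cy`; these conventions and the function name are assumptions, and the abstract does not state how GeoLRM parameterizes its cameras.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    rotation: 3x3 world-to-camera matrix as nested lists; translation: length-3 list.
    Returns (u, v), or None when the point lies at or behind the camera plane.
    """
    # Camera-frame coordinates: X_c = R @ X_w + t.
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0.0:
        return None
    return (fx * xc / zc + cx, fy * yc / zc + cy)
```

A 3D anchor point projected this way gives, for each input view, a pixel location near which image features could be sampled; how the deformable cross-attention actually selects its sampling locations is not described in the abstract.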
Abstract:This paper investigates a constrained inverse kinematic (IK) problem that seeks a feasible configuration of an articulated robot under various constraints such as joint limits and obstacle collision avoidance. Due to the high dimensionality and complex constraints, this problem is often solved numerically via iterative local optimization. Classic local optimization methods take joint angles as the decision variables, and therefore suffer from non-linearity caused by the trigonometric constraints. Recently, distance-based IK methods have been developed as an alternative approach that formulates IK as an optimization over the distances among points attached to the robot and the obstacles. Although distance-based methods have demonstrated unique advantages, they still suffer from low computational efficiency, since these approaches usually ignore the chain structure in the kinematics of serial robots. This paper proposes a new method called propagative distance optimization for constrained inverse kinematics (PDO-IK), which captures and leverages the chain structure in the distance-based formulation and expedites the optimization by computing forward kinematics and the Jacobian propagatively along the kinematic chain. Test results show that PDO-IK runs up to two orders of magnitude faster than the existing distance-based methods under joint-limit and obstacle-avoidance constraints. It also achieves up to three times higher success rates than the conventional joint-angle-based optimization methods for IK problems. The high runtime efficiency of PDO-IK allows real-time computation (10$-$1500 Hz) and enables a simulated humanoid robot with 19 degrees of freedom (DoFs) to avoid moving obstacles, which is otherwise hard to achieve with the baselines.
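The idea of computing forward kinematics and the Jacobian by propagating along the chain can be seen in a simple planar example, where each joint position is obtained from the previous one in a single pass. This sketch uses joint angles on a planar revolute chain, so it is not the distance-based formulation of PDO-IK; the function names and the planar setting are assumptions chosen to keep the example short.

```python
import math


def planar_fk(joint_angles, link_lengths):
    """Positions of the base, every joint, and the end effector of a planar chain.

    Each point is computed from the previous one, so the cost is linear in chain length.
    """
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for angle, length in zip(joint_angles, link_lengths):
        heading += angle  # accumulate the relative joint angles
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


def planar_jacobian(points):
    """Columns d(end effector)/d(joint angle), reusing the points from planar_fk.

    For a revolute joint at (jx, jy) the column is (-(ey - jy), ex - jx).
    """
    end_x, end_y = points[-1]
    return [(-(end_y - jy), end_x - jx) for jx, jy in points[:-1]]
```

Reusing the propagated points means the Jacobian needs no additional trigonometric evaluations, which gives a small-scale sense of why exploiting the chain structure can reduce computation.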