Abstract: Deep Reinforcement Learning (DRL) has achieved remarkable success in solving complex decision-making problems by combining the representational capabilities of deep learning with the decision-making power of reinforcement learning. However, learning in sparse-reward environments remains challenging because the feedback is insufficient to guide the optimization of agents, especially in real-life environments with high-dimensional states. To tackle this issue, experience replay is commonly introduced to enhance learning efficiency through past experiences. Nonetheless, current experience replay methods, whether based on uniform or prioritized sampling, frequently suffer from suboptimal learning efficiency and insufficient utilization of samples. This paper proposes a novel approach, diversity-based experience replay (DBER), which leverages the determinantal point process to prioritize samples with diverse state realizations. We conducted extensive experiments on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. The results show that our method not only significantly improves learning efficiency but also demonstrates superior performance in sparse-reward environments with high-dimensional states, providing a simple yet effective solution for this field.
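As a rough sketch of the idea (not the authors' implementation), diversity-prioritized sampling with a determinantal point process can be approximated by greedily selecting the subset of buffered transitions that maximizes the log-determinant of a similarity kernel over state features. The feature dimensionality, RBF kernel, and batch size below are illustrative assumptions.

```python
import numpy as np

def greedy_dpp_select(states, k, gamma=1.0):
    """Greedily pick k diverse transitions from a replay buffer.

    `states` is an (N, d) array of state features; the RBF similarity kernel
    and the greedy MAP approximation of a determinantal point process (DPP)
    used here are illustrative choices, not the paper's exact procedure.
    """
    sq_dists = ((states[:, None, :] - states[None, :, :]) ** 2).sum(-1)
    L = np.exp(-gamma * sq_dists)            # similarity (likelihood) kernel
    selected = []
    remaining = list(range(len(states)))
    for _ in range(min(k, len(states))):
        best, best_gain = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            # log-determinant of the kernel restricted to the candidate set
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        remaining.remove(best)
    return selected  # indices of a diverse mini-batch

# Usage: sample a diverse batch of 32 transitions from 1,000 buffered states.
buffer_states = np.random.randn(1000, 8)
batch_idx = greedy_dpp_select(buffer_states, k=32)
```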
Abstract: Transformer-based Large Language Models (LLMs) for code generation tasks usually apply sampling-and-filtering pipelines. Because of the sparse-reward problem in code generation, where a single incorrect token can invalidate a program, these models sample redundant programs until a correct one is found, leading to low efficiency. To overcome this challenge, we incorporate Experience Replay (ER) in the fine-tuning phase, where the generated programs are stored and replayed, giving the LLM agent a chance to learn from past experiences. In the spirit of ER, we introduce a novel approach called the BTP pipeline, which consists of three phases: beam-search sampling, a testing phase, and a prioritized experience replay phase. The approach makes use of failed programs collected by code models and replays programs with a high Possibility and Pass-rate Prioritized value (P2Value) from the replay buffer to improve efficiency. P2Value jointly considers the likelihood of the transformer's output and its pass rate, and exploits the redundant resources arising from the fact that most programs collected by LLMs fail to pass any tests. We empirically apply our approach to several LLMs, demonstrating that it enhances their performance in code generation tasks and surpasses existing baselines.
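A minimal sketch of what a P2Value-style prioritized replay buffer could look like. The abstract does not give the exact P2Value formula, so the product of model likelihood and test pass rate below, as well as the class and function names, are assumptions for illustration only.

```python
import heapq
import math
from dataclasses import dataclass, field

@dataclass(order=True)
class ReplayEntry:
    # heapq is a min-heap, so store the negated priority for max-first popping
    neg_p2value: float
    program: str = field(compare=False)

def p2value(token_logprobs, passed, total):
    """Illustrative P2Value: the model's likelihood of the program times its
    test pass rate (assumed combination, not the paper's exact formula)."""
    likelihood = math.exp(sum(token_logprobs) / max(len(token_logprobs), 1))
    pass_rate = passed / max(total, 1)
    return likelihood * pass_rate

class PrioritizedReplayBuffer:
    def __init__(self):
        self._heap = []

    def add(self, program, token_logprobs, passed, total):
        heapq.heappush(
            self._heap,
            ReplayEntry(-p2value(token_logprobs, passed, total), program))

    def replay(self, n):
        """Return the n highest-P2Value programs for another fine-tuning pass."""
        return [heapq.heappop(self._heap).program
                for _ in range(min(n, len(self._heap)))]

# Usage: store a partially passing candidate, then replay the top program.
buf = PrioritizedReplayBuffer()
buf.add("def add(a, b):\n    return a + b",
        token_logprobs=[-0.1, -0.2, -0.05], passed=3, total=5)
print(buf.replay(1))
```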
Abstract: In recent times, there has been increased interest in the identification and re-identification of people at long distances, such as from rooftop cameras, UAV cameras, street cams, and others. Such recognition needs to go beyond the face and use whole-body markers such as gait. However, datasets to train and test such recognition algorithms are not widely available, and fewer still are labeled. This paper introduces DIOR -- a framework for data collection and semi-automated annotation -- and provides a dataset with 14 subjects and 1.649 million RGB frames with 3D/2D skeleton gait labels, including 200 thousand frames from a long-range camera. Our approach leverages advanced 3D computer vision techniques to attain pixel-level accuracy in indoor settings with motion capture systems. Additionally, for outdoor long-range settings, we remove the dependency on motion capture systems and adopt a low-cost, hybrid 3D computer vision and learning pipeline with only 4 low-cost RGB cameras, successfully achieving precise skeleton labeling on far-away subjects, even when their height is limited to a mere 20-25 pixels within an RGB frame. Upon publication, we will make our pipeline open for others to use.
Abstract: Implicit Neural Representations (INRs) have recently exhibited immense potential in the field of scientific visualization for both data generation and visualization tasks. However, these representations often consist of large multi-layer perceptrons (MLPs), necessitating millions of operations for a single forward pass and consequently hindering interactive visual exploration. While reducing the size of the MLPs and employing efficient parametric encoding schemes can alleviate this issue, doing so compromises generalizability for unseen parameters, rendering such approaches unsuitable for tasks such as temporal super-resolution. In this paper, we introduce HyperINR, a novel hypernetwork architecture capable of directly predicting the weights of a compact INR. By harnessing an ensemble of multiresolution hash encoding units in unison, the resulting INR attains state-of-the-art inference performance (up to 100x higher inference bandwidth) and can support interactive photo-realistic volume visualization. Additionally, by incorporating knowledge distillation, exceptional data and visualization generation quality is achieved, making our method valuable for real-time parameter exploration. We validate the effectiveness of the HyperINR architecture through a comprehensive ablation study and showcase its versatility across three distinct scientific domains: novel view synthesis, temporal super-resolution of volume data, and volume rendering with dynamic global shadows. By simultaneously achieving efficiency and generalizability, HyperINR paves the way for applying INRs in a wider array of scientific visualization applications.
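To make the hypernetwork-predicts-INR-weights idea concrete, here is a minimal sketch in which a small MLP hypernetwork maps a parameter vector to the flattened weights of a compact coordinate MLP. The layer sizes, the plain-MLP hypernetwork, and the conditioning input are illustrative assumptions and do not reproduce the paper's multiresolution-hash-encoding architecture.

```python
import torch
import torch.nn as nn

class HyperINRSketch(nn.Module):
    """Hypernetwork -> flattened weights of a tiny coordinate MLP (the INR)."""

    def __init__(self, param_dim=4, hidden=64, inr_in=3, inr_hidden=32, inr_out=1):
        super().__init__()
        self.shapes = [(inr_hidden, inr_in), (inr_hidden,),
                       (inr_out, inr_hidden), (inr_out,)]
        n_weights = sum(torch.Size(s).numel() for s in self.shapes)
        self.hyper = nn.Sequential(nn.Linear(param_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, n_weights))

    def forward(self, params, coords):
        flat = self.hyper(params)                      # predicted INR weights
        chunks, offset = [], 0
        for s in self.shapes:
            n = torch.Size(s).numel()
            chunks.append(flat[offset:offset + n].reshape(s))
            offset += n
        w1, b1, w2, b2 = chunks
        h = torch.relu(coords @ w1.t() + b1)           # compact INR forward pass
        return h @ w2.t() + b2

# Usage: predict an INR for one parameter setting and query 4096 coordinates.
model = HyperINRSketch()
out = model(torch.randn(4), torch.rand(4096, 3))
```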
Abstract: A fundamental challenge in robot perception is the coupling of the sensor pose and the robot pose. This has led to research in active vision, where the robot pose is changed to reorient the sensor to areas of interest for perception. Furthermore, egomotion such as jitter, and external effects such as wind, degrade perception and require additional effort in software, such as image stabilization. This effect is particularly pronounced in micro-air vehicles and micro-robots, which are typically lighter and subject to larger jitter but lack the computational capability to perform stabilization in real time. We present a novel microelectromechanical (MEMS) mirror LiDAR system that changes the field of view of the LiDAR independently of the robot motion. Our design has the potential for use on small, low-power systems, where the expensive components of the LiDAR can be placed external to the small robot. We show the utility of our approach in simulation and on prototype hardware mounted on a UAV. We believe that this LiDAR and its compact movable scanning design provide mechanisms to decouple robot and sensor geometry, allowing us to simplify robot perception. We also demonstrate examples of motion compensation using IMU and external odometry feedback in hardware.
Abstract: This study verified the effectiveness of Donald Trump's Twitter campaign in guiding agenda-setting and deflecting political risk, examined Trump's Twitter communication strategy, and explored the communication effects of his tweet content during the Covid-19 pandemic. We collected all tweets posted by Trump on the Twitter platform from January 1, 2020 to December 31, 2020. We used Ordinary Least Squares (OLS) regression with a fixed-effects model to test for the existence of the Twitter strategy. The correlation between the number of daily confirmed Covid-19 diagnoses and the number of tweets on particular themes was investigated using time series analysis. The empirical analysis revealed that the Twitter strategy was used to divert public attention from negative Covid-19 reports during the epidemic and that it produced a powerful political communication effect on Twitter. However, the findings suggest that Trump did not use false claims to divert political risk and shape public opinion.
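For readers unfamiliar with the estimation setup, a fixed-effects OLS regression of this kind might be specified as in the sketch below. The file name, column names, and model formula are hypothetical placeholders, not the study's actual specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical daily panel of tweet topics and engagement (illustrative only).
df = pd.read_csv("trump_tweets_2020.csv")

# OLS with topic fixed effects (C(topic)): does the number of tweets on a topic
# vary with daily confirmed Covid-19 cases, controlling for topic?
model = smf.ols("tweet_count ~ confirmed_cases + C(topic)", data=df).fit()
print(model.summary())
```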
Abstract: Sepsis is a life-threatening medical emergency, a major cause of death worldwide and the second highest cause of mortality in the United States. Identifying the optimal control treatment or intervention strategy for the comprehensive sepsis system is key to reducing mortality. For this purpose, this paper first improves a complex nonlinear sepsis model proposed in our previous work. Then, bifurcation analyses are conducted for each sepsis subsystem to study the model behaviors under selected system parameters. The bifurcation results further indicate the necessity of control treatment and intervention therapy: without any control, under certain parameter and initial-value settings, the system exhibits persistent inflammation over time. We therefore extend the improved nonlinear sepsis model into a sepsis optimal control model and use effective biomarkers recommended in existing clinical practice as the optimization objective function to measure the development of sepsis. In addition, a Bayesian optimization algorithm combined with a recurrent neural network (the RNN-BO algorithm) is introduced to predict the optimal control strategy for the studied sepsis optimal control system. What distinguishes the RNN-BO algorithm from other optimization algorithms is that, given any new initial system value (the initial value is associated with the initial condition of a patient), it can quickly predict the corresponding time-series optimal control from historical optimal control data for any new sepsis patient. To demonstrate the effectiveness and efficiency of the RNN-BO algorithm in solving the optimal control problem for the complex nonlinear sepsis system, numerical simulations comparing it with other optimization algorithms are presented in this paper.
Abstract: The Bayesian optimization algorithm has become a promising approach for nonlinear global optimization problems and many machine learning applications. Over the past few years, improvements and enhancements have been proposed and have shown promising results on complex dynamic problems and systems of ordinary differential equations in which the objective functions are computationally expensive to evaluate. However, a straightforward implementation of the Bayesian optimization algorithm performs well only for optimization problems with 10-20 dimensions. The study presented in this paper proposes a new high-dimensional Bayesian optimization algorithm combined with recurrent neural networks, which is expected to predict the optimal solution for global optimization problems with high-dimensional or time-series decision models. The proposed RNN-BO algorithm first solves the optimal control problem in a lower-dimensional space and then trains a recurrent neural network on the historical optimal solution data to predict the optimal control strategy for any new initial system value setting. Accurately and quickly providing the optimal control strategy is essential to effectively and efficiently controlling epidemic spread while minimizing the associated financial costs. Therefore, to verify the effectiveness of the proposed algorithm, computational experiments are carried out on a deterministic SEIR epidemic model and a stochastic SIS optimal control model. Finally, we also discuss the impacts of different numbers of RNN layers and training epochs on the trade-off between solution quality and computational effort.
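A minimal sketch of the RNN half of an RNN-BO style workflow: an LSTM is fit on (initial system value, time-series optimal control) pairs of the kind that earlier Bayesian optimization runs would produce, and then queried for a new initial condition. The synthetic data, network sizes, and training loop are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

T, state_dim, control_dim = 50, 3, 1  # assumed horizon and dimensions

class ControlPredictor(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, control_dim)

    def forward(self, init_value):
        # Repeat the initial condition over the horizon and decode a control sequence.
        seq = init_value.unsqueeze(1).expand(-1, T, -1)
        out, _ = self.lstm(seq)
        return self.head(out)                 # (batch, T, control_dim)

# Placeholder "historical" data standing in for BO-derived optimal controls.
x0 = torch.rand(128, state_dim)
u_opt = torch.rand(128, T, control_dim)

net = ControlPredictor()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(net(x0), u_opt)
    loss.backward()
    opt.step()

# Predict a control strategy for a new initial system value setting.
u_new = net(torch.rand(1, state_dim))
```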
Abstract: High-dimensional global optimization problems with time-series models have recently received much attention in engineering fields. Since it was proposed, Bayesian optimization has quickly become a popular and promising approach for solving global optimization problems. However, the standard Bayesian optimization algorithm is insufficient for finding the global optimal solution when the model is high-dimensional. Hence, this paper presents a novel high-dimensional Bayesian optimization algorithm that incorporates dimension reduction and different dimension fill-in strategies. Most of the existing literature on Bayesian optimization does not discuss sampling strategies for optimizing the acquisition function. This study proposes a new sampling method based on both multi-armed bandit and random search methods for optimizing the acquisition function. In addition, based on the time-dependent or dimension-dependent characteristics of the model, the proposed algorithm can reduce the dimension evenly. Five different dimension fill-in strategies are then discussed and compared. Finally, to increase the accuracy of the final solution, the proposed algorithm adds a local search based on a series of Adam-based steps at the final stage. Our computational experiments demonstrate that the proposed Bayesian optimization algorithm achieves reasonable solutions with excellent performance for high-dimensional global optimization problems with a time-series optimal control model.
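One way to combine a multi-armed bandit with random search when maximizing an acquisition function is sketched below: the search space is split into slabs that act as bandit arms, and an epsilon-greedy bandit decides where to draw random candidates. The slab partition, epsilon-greedy rule, and toy acquisition function are assumptions for illustration, not the paper's method.

```python
import numpy as np

def optimize_acquisition(acq, bounds, n_rounds=200, n_regions=4, eps=0.1, rng=None):
    """Illustrative bandit + random-search maximization of an acquisition function.

    The first dimension of the search space is split into `n_regions` slabs
    (the arms); an epsilon-greedy bandit chooses which slab to sample from.
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = bounds
    edges = np.linspace(lo[0], hi[0], n_regions + 1)
    counts, values = np.zeros(n_regions), np.zeros(n_regions)
    best_x, best_val = None, -np.inf
    for _ in range(n_rounds):
        if rng.random() < eps or counts.min() == 0:
            arm = int(rng.integers(n_regions))          # explore an arm
        else:
            arm = int(np.argmax(values / counts))       # exploit the best arm so far
        x = rng.uniform(lo, hi)                         # random-search candidate
        x[0] = rng.uniform(edges[arm], edges[arm + 1])  # confined to the arm's slab
        val = acq(x)
        counts[arm] += 1
        values[arm] += val
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Usage with a toy acquisition function on [0, 1]^5.
bounds = (np.zeros(5), np.ones(5))
x_star, v_star = optimize_acquisition(lambda x: -np.sum((x - 0.3) ** 2), bounds)
```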