Abstract:Humanoid robots are designed to perform diverse loco-manipulation tasks. However, they face challenges due to their high-dimensional and unstable dynamics, as well as the complex contact-rich nature of the tasks. Model-based optimal control methods offer precise and systematic control but are limited by high computational complexity and the need for accurate contact sensing. On the other hand, reinforcement learning (RL) provides robustness and handles high-dimensional spaces but suffers from inefficient learning, unnatural motion, and sim-to-real gaps. To address these challenges, we introduce Opt2Skill, an end-to-end pipeline that combines model-based trajectory optimization with RL to achieve robust whole-body loco-manipulation. We generate reference motions for the Digit humanoid robot using differential dynamic programming (DDP) and train RL policies to track these trajectories. Our results demonstrate that Opt2Skill outperforms pure RL methods in both training efficiency and task performance, and that optimal trajectories that account for torque limits improve trajectory tracking. We successfully transfer our approach to real-world applications.
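To make the idea of training an RL policy to track a reference trajectory more concrete, the sketch below shows one common way a tracking reward can be shaped: an exponential of the squared error between the robot state and the reference. This is an illustrative assumption, not the reward used in Opt2Skill; the function name `tracking_reward`, the `weight` parameter, and the joint values are all hypothetical.

```python
import math


def tracking_reward(state, reference, weight=5.0):
    """Reward in (0, 1] that peaks when the state matches the reference.

    Hypothetical shaping: exp(-weight * squared tracking error).
    """
    squared_error = sum((s - r) ** 2 for s, r in zip(state, reference))
    return math.exp(-weight * squared_error)


# Made-up joint positions (radians) for one time step of a reference motion.
reference = [0.10, -0.25, 0.40]
on_track = tracking_reward(reference, reference)
off_track = tracking_reward([0.30, -0.25, 0.10], reference)
```

A reward of this shape equals 1 for perfect tracking and decays smoothly as the policy drifts from the optimized reference, so the policy is rewarded for staying close to the reference motion.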
Abstract:Fine-tuning pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. Parameter-Efficient Fine-Tuning, a powerful and widely applied technique in natural language processing, could potentially enhance the performance of PLMs. However, the direct transfer to life science tasks is non-trivial due to the different training strategies and data forms. To address this gap, we introduce SES-Adapter, a simple, efficient, and scalable adapter method for enhancing the representation learning of PLMs. SES-Adapter incorporates PLM embeddings with structural sequence embeddings to create structure-aware representations. We show that the proposed method is compatible with different PLM architectures and across diverse tasks. Extensive evaluations are conducted on 2 types of folding structures with notable quality differences, 9 state-of-the-art baselines, and 9 benchmark datasets across distinct downstream tasks. Results show that, compared to vanilla PLMs, SES-Adapter improves downstream task performance by a maximum of 11% and an average of 3%, and accelerates training speed by a maximum of 1034% and an average of 362%. The convergence rate is also improved by approximately 2 times. Moreover, positive optimization is observed even with low-quality predicted structures. The source code for SES-Adapter is available at https://github.com/tyang816/SES-Adapter.
Abstract:The escalating prevalence of diabetes globally underscores the need for effective diabetes management. Recent research highlights the growing focus on digital biomarkers in diabetes management, with innovations in computational frameworks and noninvasive monitoring techniques using personalized glucose metrics. However, existing work predominantly focuses on insulin dosing and specific glucose values, with limited attention given to overall glycemic control. This leaves a gap in expanding the scope of digital biomarkers for overall glycemic control in diabetes management. To address this research gap, we propose GluMarker -- an end-to-end framework for modeling digital biomarkers using a broader set of factor sources to predict glycemic control. Through the assessment and refinement of various machine learning baselines, GluMarker achieves state-of-the-art performance on Anderson's dataset in predicting next-day glycemic control. Moreover, our research identifies key digital biomarkers for the next day's glycemic control prediction. These identified biomarkers are instrumental in illuminating the daily factors that influence glycemic management, offering vital insights for diabetes care.
Abstract:The global diabetes epidemic highlights the importance of maintaining good glycemic control. Glucose prediction is a fundamental aspect of diabetes management, facilitating real-time decision-making. Recent research has introduced models focusing on long-term glucose trend prediction, which are unsuitable for real-time decision-making and result in delayed responses. Conversely, models designed to respond to immediate glucose level changes cannot analyze glucose variability comprehensively. Moreover, contemporary research generally integrates various physiological parameters (e.g., insulin doses and food intake), which inevitably raises data privacy concerns. To bridge this research gap, we propose TimeGlu -- an end-to-end pipeline for short-term glucose prediction solely based on CGM time series data. We implement four baseline methods to conduct a comprehensive comparative analysis of the model's performance. Through extensive experiments on two contrasting datasets (the CGM Glucose and Colas datasets), TimeGlu achieves state-of-the-art performance without the need for additional personal data from patients, providing effective guidance for real-world diabetic glucose management.
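As a rough illustration of what short-term prediction from CGM readings alone involves, the sketch below frames a glucose series as sliding windows and scores a naive persistence baseline (predict the last observed value). It is not the TimeGlu pipeline or any of its baselines; the helper names, window sizes, and readings are made-up assumptions for illustration only.

```python
def make_windows(series, history, horizon):
    """Pair each `history`-long window with the value `horizon` steps after it ends."""
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        window = series[start:start + history]
        target = series[start + history + horizon - 1]
        pairs.append((window, target))
    return pairs


def persistence_forecast(window):
    """Naive baseline: assume glucose stays at the most recent reading."""
    return window[-1]


def mean_absolute_error(pairs, predict):
    """Average absolute gap between predictions and targets."""
    errors = [abs(predict(window) - target) for window, target in pairs]
    return sum(errors) / len(errors)


# Made-up CGM readings in mg/dL, one per sampling interval.
cgm = [110, 112, 115, 120, 126, 130, 133, 131]
pairs = make_windows(cgm, history=3, horizon=2)
baseline_mae = mean_absolute_error(pairs, persistence_forecast)
```

Only the glucose series itself enters the windows, which mirrors the privacy argument in the abstract: no insulin, meal, or demographic fields are needed to set up the prediction task.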
Abstract:The increasing number of diabetic patients is a serious issue in society today, with significant negative impacts on people's health and national financial expenditures. Because diabetes can develop into serious complications, early glucose prediction for diabetic patients is necessary for timely medical treatment. Existing glucose prediction methods typically utilize patients' private data (e.g., age, gender, ethnicity) and physiological parameters (e.g., blood pressure, heart rate) as reference features for glucose prediction, which inevitably leads to privacy protection concerns. Moreover, these models generally focus on either long-term (monthly-based) or short-term (minute-based) predictions. Long-term prediction methods are generally inaccurate because of the external uncertainties that can greatly affect the glucose values, while short-term ones fail to provide timely medical guidance. To address these issues, we propose CrossGP, a novel machine-learning framework for cross-day glucose prediction solely based on the patient's external activities without involving any physiological parameters. We also implement three baseline models for comparison. Extensive experiments on Anderson's dataset demonstrate the superior performance of CrossGP and show its potential for future real-life applications.
Abstract:In the landscape of spatio-temporal data analytics, effective trajectory representation learning is paramount. To bridge the gap of learning accurate representations with efficient and flexible mechanisms, we introduce Efflex, a comprehensive pipeline for transformative graph modeling and representation learning of large-volume spatio-temporal trajectories. Efflex pioneers the incorporation of a multi-scale k-nearest neighbors (KNN) algorithm with feature fusion for graph construction, marking a leap in dimensionality reduction techniques by preserving essential data features. Moreover, the groundbreaking graph construction mechanism and the high-performance lightweight GCN increase embedding extraction speed by up to 36 times. We further offer Efflex in two versions, Efflex-L for scenarios demanding high accuracy, and Efflex-B for environments requiring swift data processing. Comprehensive experimentation with the Porto and Geolife datasets validates our approach, positioning Efflex as the state-of-the-art in the domain. Such enhancements in speed and accuracy highlight the versatility of Efflex, underscoring its wide-ranging potential for deployment in time-sensitive and computationally constrained applications.
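The abstract does not spell out how the multi-scale KNN graph is built, so the following is only a minimal sketch of one plausible reading: run KNN at several neighborhood sizes and let each edge weight count the scales at which it appears. The function names, the weighting rule, and the toy points are assumptions, and the feature-fusion step mentioned in the abstract is omitted.

```python
import math


def knn_indices(points, k):
    """For each point, list the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        by_distance = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in by_distance[:k]])
    return neighbors


def multi_scale_knn_graph(points, scales=(1, 2)):
    """Weighted adjacency matrix; weights[i][j] counts the scales where j is a neighbor of i."""
    n = len(points)
    weights = [[0] * n for _ in range(n)]
    for k in scales:
        for i, nbrs in enumerate(knn_indices(points, k)):
            for j in nbrs:
                weights[i][j] += 1
    return weights


# Toy 1-D "trajectory features"; real inputs would be higher-dimensional.
points = [(0.0,), (1.0,), (3.0,), (7.0,), (15.0,)]
graph = multi_scale_knn_graph(points, scales=(1, 2))
```

Under this rule, the closest neighbors receive the largest weights because they are selected at every scale, while looser neighbors contribute only at the larger values of k. The resulting matrix could then serve as the input graph for a GCN.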
Abstract:Trajectory similarity search plays an essential role in autonomous driving, as it enables vehicles to analyze the information and characteristics of different trajectories to make informed decisions and navigate safely in dynamic environments. Existing work on the trajectory similarity search task primarily utilizes sequence-processing algorithms or Recurrent Neural Networks (RNNs), which suffer from the inevitable issues of complicated architecture and heavy training costs. Considering the intricate connections between trajectories, using Graph Neural Networks (GNNs) for data modeling is feasible. However, most methods directly use existing mathematical graph structures as the input instead of constructing task-specific graphs from the vehicle trajectory data itself, which ignores the unique and dynamic characteristics of such data. To bridge this research gap, we propose VeTraSS -- an end-to-end pipeline for Vehicle Trajectory Similarity Search. Specifically, VeTraSS models the original trajectory data into multi-scale graphs, and generates comprehensive embeddings through a novel multi-layer attention-based GNN. The learned embeddings can be used for searching similar vehicle trajectories. Extensive experiments on the Porto and Geolife datasets demonstrate the effectiveness of VeTraSS, where our model outperforms existing work and reaches state-of-the-art performance. This demonstrates the potential of VeTraSS for trajectory analysis and safe navigation in self-driving vehicles in the real world.
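Once each trajectory has an embedding, similarity search reduces to ranking stored embeddings against a query. The sketch below uses cosine similarity for that ranking; the abstract does not say which similarity measure VeTraSS uses, so the measure, the function names, and the two-dimensional toy embeddings are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, embeddings, k=2):
    """Return the names of the k stored embeddings most similar to the query."""
    scored = [
        (cosine_similarity(query, emb), name) for name, emb in embeddings.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical learned embeddings keyed by trajectory id.
embeddings = {
    "traj_a": [1.0, 0.0],
    "traj_b": [0.9, 0.1],
    "traj_c": [0.0, 1.0],
}
nearest = top_k_similar([1.0, 0.05], embeddings, k=2)
```

A linear scan like this is adequate for small collections; at the scale of datasets such as Porto, an approximate nearest-neighbor index would likely be needed instead.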
Abstract:Task and Motion Planning (TAMP) integrates high-level task planning and low-level motion planning to equip robots with the autonomy to effectively reason over long-horizon, dynamic tasks. Optimization-based TAMP focuses on hybrid optimization approaches that define goal conditions via objective functions and are capable of handling open-ended goals, robotic dynamics, and physical interaction between the robot and the environment. Therefore, optimization-based TAMP is particularly suited to solve highly complex, contact-rich locomotion and manipulation problems. This survey provides a comprehensive review of optimization-based TAMP, covering (i) planning domain representations, including action description languages and temporal logic, (ii) individual solution strategies for components of TAMP, including AI planning and trajectory optimization (TO), and (iii) the dynamic interplay between logic-based task planning and model-based TO. A particular focus of this survey is to highlight the algorithm structures for efficiently solving TAMP, especially hierarchical and distributed approaches. Additionally, the survey emphasizes the synergy between the classical methods and contemporary learning-based innovations such as large language models. Furthermore, future research directions for TAMP are discussed, highlighting both algorithmic and application-specific challenges.
Abstract:Existing benchmarks for fake news detection have significantly contributed to the advancement of models in assessing the authenticity of news content. However, these benchmarks typically focus solely on news pertaining to a single semantic topic or originating from a single platform, thereby failing to capture the diversity of multi-domain news in real scenarios. To understand fake news across various domains, external knowledge and fine-grained annotations are indispensable for providing precise evidence and uncovering the diverse underlying strategies for fabrication, yet both are ignored by existing benchmarks. To address this gap, we introduce a novel multi-domain knowledge-enhanced benchmark with fine-grained annotations, named FineFake. FineFake encompasses 16,909 data samples spanning six semantic topics and eight platforms. Each news item is enriched with multi-modal content, potential social context, semi-manually verified common knowledge, and fine-grained annotations that surpass conventional binary labels. Furthermore, we formulate three challenging tasks based on FineFake and propose a knowledge-enhanced domain adaptation network. Extensive experiments are conducted on FineFake under various scenarios, providing accurate and reliable benchmarks for future endeavors. The entire FineFake project is publicly accessible as an open-source repository at https://github.com/Accuser907/FineFake.
Abstract:This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.