Abstract: Modern orchards are planted in structured rows with distinct panel divisions to improve management. Accurate and efficient joint segmentation of point clouds from Panel to Tree and Branch (P2TB) is essential for robotic operations. However, most current segmentation methods focus on single instance segmentation and depend on a sequence of deep networks to perform joint tasks. This strategy hinders the use of hierarchical information embedded in the data, leading to both error accumulation and increased costs for annotation and computation, which limits scalability for real-world applications. In this study, we proposed a novel approach that combined a Real2Sim L-TreeGen for training data generation with a joint model (J-P2TB) designed for the P2TB task. The J-P2TB model, trained on the generated simulation dataset, was used for joint segmentation of real-world panel point clouds via zero-shot learning. Compared with representative methods, our model performed better on most segmentation metrics while using 40% fewer learnable parameters. This Sim2Real result highlighted the efficacy of L-TreeGen for model training and the performance of J-P2TB for joint segmentation, demonstrating strong accuracy, efficiency, and generalizability for real-world applications. These improvements would not only greatly benefit the development of robots for automated orchard operations but also advance digital twin technology.
Abstract: Small-scale robots offer significant potential in minimally invasive medical procedures. However, because biological tissues are soft, robots are exposed to complex environments with various locomotion challenges, which must be overcome to perform useful medical tasks. A single mini-robot often provides insufficient force on slippery biological surfaces to carry medical instruments, such as a fluid catheter or an electrical wire. Here, for the first time, we report a team of millirobots (TrainBot) that can generate around two times higher actuating force than a single TrainBot unit by forming a convoy to collaboratively carry long and heavy cargo. The feet of each unit are optimized to increase the propulsive force around three times so that it can effectively crawl on slippery biological surfaces. A human-scale permanent magnetic set-up is developed to wirelessly actuate and control the TrainBot to transport heavy and lengthy loads through narrow biological lumens, such as the intestine and the bile duct. We demonstrate the first electrocauterization performed by the TrainBot to relieve a biliary obstruction and open a tunnel for fluid drainage and drug delivery. The developed technology sheds light on the collaborative strategy of small-scale robots for future minimally invasive surgical procedures.
Abstract: Deep neural networks (DNNs) have numerous applications across various domains. Several optimization techniques, such as ResNet and SENet, have been proposed to improve model accuracy. These techniques improve model performance by adjusting or calibrating feature responses according to a uniform standard. However, they lack discriminative calibration for different features, which limits the model output. Therefore, we propose a method that discriminatively calibrates feature responses. Preliminary experimental results indicate that the neural feature response follows a Gaussian distribution. Consequently, we compute confidence values using the Gaussian probability density function and then integrate these values with the original response values. The objective of this integration is to improve the discriminability of the neural feature response. Based on the calibration values, we propose a plugin-based calibration module incorporated into a modified ResNet architecture, termed Response Calibration Networks (ResCNet). Extensive experiments on datasets such as CIFAR-10, CIFAR-100, SVHN, and ImageNet demonstrate the effectiveness of the proposed approach. The developed code is publicly available at https://github.com/tcmyxc/ResCNet.
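The calibration idea in the abstract above can be illustrated with a minimal sketch. This is not the ResCNet implementation: the function names, the normalization of the Gaussian density to the range (0, 1], and the multiplicative combination `r * (1 + confidence)` are all assumptions made for illustration, since the abstract does not state how confidence values are integrated with the original responses.

```python
import math
from statistics import fmean, pstdev


def gaussian_confidence(x, mu, sigma):
    """Gaussian density at x divided by its peak value, so the result lies in (0, 1]."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z)


def calibrate_responses(responses, eps=1e-8):
    """Hypothetical calibration: scale each response by (1 + confidence).

    The mean and standard deviation are estimated from the responses
    themselves; eps avoids division by zero when all responses are equal.
    The combination rule is an assumption, not the published method.
    """
    mu = fmean(responses)
    sigma = pstdev(responses) + eps
    return [r * (1.0 + gaussian_confidence(r, mu, sigma)) for r in responses]


if __name__ == "__main__":
    # Toy feature responses, chosen only for illustration.
    print(calibrate_responses([1.0, 2.0, 3.0]))
```

Under this assumed rule, responses close to the mean receive the largest confidence and are amplified most; a different integration rule would change which features are emphasized.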
Abstract: Magneto-oscillatory devices have recently been developed as very potent wireless miniature position trackers and sensors with exceptional accuracy and sensing distance for surgical and robotic applications. However, it is still unclear to what extent a mechanically resonating sub-millimeter magnet interacts with external magnetic fields or gradients, which induce frequency shifts of sub-mHz to several Hz and therefore affect the sensing accuracy. Here, we investigate this effect experimentally on a cantilever-based magneto-oscillatory wireless sensor (MOWS) and build an analytical model of the magnetic and mechanical interactions. The millimeter-scale MOWS can detect magnetic fields with sub-µT resolution up to at least ±5 mT, and simultaneously detects magnetic field gradients with a resolution of 65 µT/m up to at least ±50 mT/m. The magnetic field sensitivity allows direct calculation of mechanical device properties, and by rotating the device, the individual contributions of the magnetic field and gradient can be analyzed. The derived model is general and can be applied to other magneto-oscillatory systems interacting with magnetic environments.
Abstract: Magnetism is widely used for the wireless localization and actuation of robots and devices for medical procedures. However, current static magnetic localization methods require large magnets and are limited to only five degrees of freedom due to a fundamental constraint of the rotational symmetry around the magnetic axis. We present the small-scale magneto-oscillatory localization (SMOL) method, which is capable of wirelessly localizing a millimeter-scale tracker with full six degrees of freedom in deep biological tissues. The SMOL device uses the temporal oscillation of a mechanically resonant cantilever with a magnetic dipole to break the rotational symmetry, and exploits the frequency response to achieve a high signal-to-noise ratio with sub-millimeter accuracy over a large distance of up to 12 centimeters and quasi-continuous refresh rates up to 200 Hz. Integration into real-time closed-loop controlled robots and minimally invasive surgical tools is demonstrated to reveal the vast potential of the SMOL method.
Abstract: Relation extraction (RE) in complex scenarios faces challenges such as diverse relation types and ambiguous relations between entities within a single sentence, leading to the poor performance of pure "text-in, text-out" language models (LMs). To address these challenges, in this paper, we propose an agent-based RE framework, namely AgentRE, which fully leverages the potential of large language models (LLMs), including memory, retrieval and reflection, to achieve RE in complex scenarios. Specifically, three major modules are built in AgentRE, serving as tools that help the agent acquire and process various useful information, thereby improving RE performance. Our extensive experimental results on two datasets in English and Chinese demonstrate AgentRE's superior performance, especially in low-resource scenarios. Additionally, the trajectories generated by AgentRE can be refined to construct a high-quality training dataset incorporating different reasoning methods, which can be used to fine-tune smaller models. Code is available at https://github.com/Lightblues/AgentRE.
Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general-purpose LLMs often fall short. In this study, we introduce PharmGPT, a suite of multilingual LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus of hundreds of billions of tokens tailored to the Bio-Pharmaceutical and Chemical sectors. Our evaluation shows that PharmGPT matches or surpasses existing general models on key benchmarks, such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. This advancement establishes a new benchmark for LLMs in the Bio-Pharmaceutical and Chemical fields, addressing the existing gap in specialized language modeling. Furthermore, this suggests a promising path for enhanced research and development in these specialized areas, paving the way for more precise and effective applications of NLP in specialized domains.
Abstract: In recent years, large language models have attracted significant attention due to their exceptional performance across a multitude of natural language processing tasks, and have been widely applied in various fields. However, the application of large language models in the Intellectual Property (IP) space is challenging due to the strong need for specialized knowledge, privacy protection, and the processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. Using this standard process, we have trained the PatentGPT series models based on open-source pretrained models. When evaluated on the open-source IP-oriented benchmark MOZIP, our domain-specific LLMs outperform GPT-4, indicating the effectiveness of the proposed training procedure and the expertise of the PatentGPT models in the IP domain. Notably, our model significantly outperformed GPT-4 on the 2019 China Patent Agent Qualification Examination, achieving a score of 65 and reaching the level of human experts. Additionally, the PatentGPT model that utilizes the SMoE architecture achieves performance comparable to that of GPT-4 in the IP domain and demonstrates a better cost-performance ratio on long-text tasks, potentially serving as an alternative to GPT-4 within the IP domain.
Abstract: Robotic branch pruning is a rapidly growing research area that aims to cope with the labor shortage in agriculture. One fundamental requirement in robotic pruning is the perception of the detailed geometry and topology of branches. However, the point clouds obtained in agricultural settings are often incomplete due to several constraints, thereby restricting the accuracy of downstream robotic pruning. In this work, we addressed the issue of point cloud quality through a simulation-based deep neural network, leveraging a Real-to-Simulation (Real2Sim) data generation pipeline that not only eliminates the need for manual parameterization but also guarantees the realism of simulated data. The simulation-based neural network was applied to jointly perform point cloud completion and skeletonization on real-world partial branches, without additional real-world training. The Sim2Real qualitative completion and skeletonization results showed the model's remarkable capability for geometry reconstruction and topology prediction. Additionally, we quantitatively evaluated the Sim2Real performance by comparing branch-level trait characterization errors using raw incomplete data and complete data. The Mean Absolute Error (MAE) was reduced by 75% and 8% for branch diameter and branch angle estimation, respectively, using the best complete data, which indicates the effectiveness of the Real2Sim data in a zero-shot generalization setting. The characterization improvements contributed to the precision and efficacy of robotic branch pruning.
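For readers unfamiliar with the metric in the abstract above, the following minimal sketch shows how a Mean Absolute Error and its relative reduction can be computed. The function names and all numeric values are hypothetical and chosen only for illustration; they are not data from the study.

```python
def mean_absolute_error(estimates, references):
    """Average absolute difference between estimated and reference trait values."""
    if not estimates or len(estimates) != len(references):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def relative_reduction(mae_before, mae_after):
    """Percentage by which the MAE decreased from mae_before to mae_after."""
    return 100.0 * (mae_before - mae_after) / mae_before


if __name__ == "__main__":
    # Hypothetical branch diameters in millimeters (not from the paper).
    reference = [10.0, 12.0, 8.0]
    from_raw = [14.0, 16.0, 12.0]       # estimates from incomplete point clouds
    from_completed = [11.0, 13.0, 9.0]  # estimates from completed point clouds

    mae_raw = mean_absolute_error(from_raw, reference)
    mae_completed = mean_absolute_error(from_completed, reference)
    print(mae_raw, mae_completed, relative_reduction(mae_raw, mae_completed))
```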