Abstract: In complex missions such as search and rescue, robots must make intelligent decisions in unknown environments, relying on their ability to perceive and understand their surroundings. High-quality, real-time reconstruction enhances situational awareness and is crucial for intelligent robotics. Traditional methods often suffer from poor scene representation or are too slow for real-time use. Inspired by the efficacy of 3D Gaussian Splatting (3DGS), we propose a hierarchical planning framework for fast and high-fidelity active reconstruction. Our method evaluates completion and quality gain to adaptively guide reconstruction, integrating global and local planning for efficiency. Experiments in simulated and real-world environments show that our approach outperforms existing real-time methods.
Abstract: In recent years, the development of robots capable of operating in both aerial and aquatic environments has gained significant attention. This study presents the design and fabrication of a novel aerial-aquatic locomotion robot (AALR). Inspired by the diving beetle, the AALR incorporates a biomimetic propulsion mechanism with power and recovery strokes. The variable stiffness propulsion module (VSPM) uses low melting point alloy (LMPA) and variable stiffness joints (VSJ) to achieve efficient aquatic locomotion while reducing harm to marine life. The AALR's design integrates the VSPM into the arms of a traditional quadrotor, allowing for effective aerial-aquatic locomotion. The VSPM adjusts joint stiffness through temperature control, meeting locomotion requirements in both aerial and aquatic modes. A dynamic model of the VSPM was developed, and its dimensional parameters were optimized to increase propulsion force. Experiments focused on aquatic mode analysis and demonstrated the AALR's swimming capability, achieving a maximum swimming speed of 77 mm/s underwater. The results confirm the AALR's effective performance in aquatic environments, highlighting its potential for versatile, eco-friendly operations.
Abstract: Recently, many works have proposed financial large language models (FinLLMs), obtained by pre-training from scratch or by fine-tuning open-source LLMs on financial corpora. However, existing FinLLMs exhibit unsatisfactory performance in understanding financial text when numeric variables are involved in questions. In this paper, we propose a novel LLM, called numeric-sensitive large language model (NumLLM), for Chinese finance. We first construct a financial corpus from financial textbooks, which is essential for improving the numeric capability of LLMs during fine-tuning. After that, we train two individual low-rank adaptation (LoRA) modules by fine-tuning on our constructed financial corpus. One module adapts general-purpose LLMs to the financial domain, and the other enhances the ability of NumLLM to understand financial text with numeric variables. Lastly, we merge the two LoRA modules into the foundation model to obtain NumLLM for inference. Experiments on a financial question-answering benchmark show that NumLLM boosts the performance of the foundation model and achieves the best overall performance among all baselines, on both numeric and non-numeric questions.
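The merging step in the NumLLM abstract above can be illustrated with a minimal sketch. The abstract does not state the merge rule, so this assumes the common additive LoRA formulation, W' = W + (alpha / r) * B * A, applied once per module. The names merge_lora and lora_delta, the per-module alpha values, and the toy matrices are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
# One LoRA module: (B of shape d_out x r, A of shape r x d_in, alpha scaling factor).
LoraModule = Tuple[Matrix, Matrix, float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def lora_delta(B: Matrix, A: Matrix, alpha: float) -> Matrix:
    """Low-rank weight update (alpha / r) * B @ A, where r is the LoRA rank."""
    rank = len(A)
    scale = alpha / rank
    return [[scale * value for value in row] for row in matmul(B, A)]


def merge_lora(W: Matrix, modules: Sequence[LoraModule]) -> Matrix:
    """Return a copy of W with every module's update added (assumed additive merge)."""
    merged = [row[:] for row in W]  # copy so the base weights are not mutated
    for B, A, alpha in modules:
        delta = lora_delta(B, A, alpha)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += value
    return merged


# Toy 2x2 base weight and two rank-1 modules (stand-ins for the
# domain-adaptation module and the numeric-understanding module).
base = [[1.0, 0.0], [0.0, 1.0]]
domain_module = ([[1.0], [0.0]], [[0.0, 1.0]], 1.0)
numeric_module = ([[0.0], [1.0]], [[1.0, 0.0]], 2.0)
print(merge_lora(base, [domain_module, numeric_module]))
```

Under this assumption, the merged matrix replaces the base weight, so inference needs no extra adapter computation.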
Abstract: Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required. Recently, neural implicit representation has provided a promising direction for online interactive mapping. However, integrating open-vocabulary scene understanding into online neural implicit mapping still faces three challenges: a lack of local scene updating ability, blurry spatial hierarchical semantic segmentation, and difficulty in maintaining multi-view consistency. To this end, we propose O2V-mapping, which utilizes voxel-based language and geometric features to create an open-vocabulary field, thus allowing for local updates during the online training process. Additionally, we leverage a foundation model for image segmentation to extract language features on object-level entities, achieving clear segmentation boundaries and hierarchical semantic features. To preserve consistency in 3D object properties across different viewpoints, we propose a spatial adaptive voxel adjustment mechanism and a multi-view weight selection method. Extensive experiments on open-vocabulary object localization and semantic segmentation demonstrate that O2V-mapping achieves online construction of language scenes while enhancing accuracy, outperforming the previous SOTA method.
Abstract: Online dense mapping of urban scenes is a cornerstone of scene understanding and navigation for autonomous vehicles. Recent advancements in mapping methods are mainly based on NeRF, whose rendering speed is too slow to meet online requirements. 3D Gaussian Splatting (3DGS), with a rendering speed hundreds of times faster than NeRF, holds greater potential in online dense mapping. However, integrating 3DGS into a street-view dense mapping framework still faces two challenges: incomplete reconstruction due to the absence of geometric information beyond the LiDAR coverage area, and extensive computation for reconstruction in large urban scenes. To this end, we propose HGS-Mapping, an online dense mapping framework for unbounded large-scale scenes. To attain complete reconstruction, our framework introduces Hybrid Gaussian Representation, which models different parts of the entire scene using Gaussians with distinct properties. Furthermore, we employ a hybrid Gaussian initialization mechanism and an adaptive update method to achieve high-fidelity and rapid reconstruction. To the best of our knowledge, we are the first to integrate Gaussian representation into online dense mapping of urban scenes. Our approach achieves SOTA reconstruction accuracy while employing only 66% of the number of Gaussians, leading to 20% faster reconstruction.
Abstract: Anatomy-specific RF receive coil arrays, routinely adopted in magnetic resonance imaging (MRI) for signal acquisition, are commonly burdened by their bulky, fixed, and rigid configurations, which may cause patient discomfort, cumbersome positioning, and suboptimal sensitivity in certain situations. Herein, leveraging the inherent flexibility and electric field confining property of coaxial cables, we present, for the first time, wireless, ultra-lightweight, coaxially-shielded MRI coils that achieve a signal-to-noise ratio (SNR) comparable to or surpassing that of commercially available cutting-edge receive coil arrays, with the potential for improved patient comfort, ease of implementation, and significantly reduced costs. The proposed coils demonstrate versatility by functioning both independently, in form-fitting configurations that closely adapt to relatively small anatomical sites, and collectively, by inductively coupling together as metamaterials, which extends the field-of-view of their coverage to encompass larger anatomical regions without compromising coil sensitivity. The wireless, coaxially-shielded MRI coils reported herein pave the way toward next-generation MRI coils.
Abstract: Recent advancements in metamaterials have yielded the possibility of a wireless solution to improve signal-to-noise ratio (SNR) in magnetic resonance imaging (MRI). Unlike traditional closely packed local coil arrays with rigid designs and numerous components, these lightweight, cost-effective metamaterials eliminate the need for radio frequency (RF) cabling, baluns, adapters, and interfaces. However, their clinical adoption has been limited by their low sensitivity, bulky physical footprint, and narrow range of use cases. Herein, we introduce a wearable metamaterial developed using commercially available coaxial cable, designed for a 3.0 T MRI system. This metamaterial inherits the coaxially-shielded structure of its constituent coaxial cable, effectively containing the electric field within the cable, thereby mitigating the electric coupling to its loading while ensuring safer clinical adoption, lower signal loss, and resistance to frequency shifts. Weighing only 50 g, the metamaterial maximizes its sensitivity by conforming to the anatomical region of interest. MRI images acquired using this metamaterial with various pulse sequences demonstrate up to a 2-fold SNR enhancement compared to a state-of-the-art 16-channel knee coil. This work introduces a novel paradigm for constructing metamaterials in the MRI environment, paving the way for the development of next-generation wireless MRI technology.
Abstract: One challenge in speech translation is that much spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs) to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We overcome the tendency of LLMs to hallucinate by incorporating finite-state constraints during decoding; these eliminate invalid outputs without requiring additional training. We find that LLMs can be adapted to transcripts containing ASR errors through prompt-tuning or fine-tuning. Relative to a state-of-the-art automatic punctuation baseline, our best LLM improves the average BLEU by 2.9 points for English-German, English-Spanish, and English-Arabic TED talk translation across 9 test sets, just by improving segmentation.
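To make the idea of finite-state constraints during decoding in the abstract above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a constraint under which the decoder may only copy the next transcript token or emit a segment boundary, so the output can never contain words that are absent from the input. The boundary symbol, the function names, and the toy scorer that stands in for an LLM are all hypothetical.

```python
from typing import Callable, List, Sequence

BOUNDARY = "<seg>"  # hypothetical boundary symbol, assumed absent from the transcript


def allowed_next(transcript: Sequence[str], pos: int, last_was_boundary: bool) -> List[str]:
    """Finite-state transitions: copy the next input token, or emit a boundary
    if at least one token has been copied since the previous boundary."""
    options = [transcript[pos]]
    if pos > 0 and not last_was_boundary:
        options.append(BOUNDARY)
    return options


def constrained_segment(transcript: Sequence[str],
                        score: Callable[[List[str], str], float]) -> List[List[str]]:
    """Greedy decoding in which the scorer only ever chooses among allowed symbols."""
    output: List[str] = []
    pos = 0
    last_was_boundary = False
    while pos < len(transcript):
        options = allowed_next(transcript, pos, last_was_boundary)
        best = max(options, key=lambda symbol: score(output, symbol))
        output.append(best)
        if best == BOUNDARY:
            last_was_boundary = True
        else:
            pos += 1
            last_was_boundary = False
    # Split the decoded symbol sequence into segments at the boundary symbols.
    segments: List[List[str]] = []
    current: List[str] = []
    for symbol in output:
        if symbol == BOUNDARY:
            segments.append(current)
            current = []
        else:
            current.append(symbol)
    if current:
        segments.append(current)
    return segments


def toy_score(prefix: List[str], symbol: str) -> float:
    """Stand-in for an LLM: favors a boundary right after certain words."""
    if symbol == BOUNDARY:
        return 1.0 if prefix and prefix[-1] in {"everyone", "today"} else -1.0
    return 0.0


words = "hello everyone thanks for coming today let us begin".split()
for segment in constrained_segment(words, toy_score):
    print(" ".join(segment))
```

Because every emitted symbol comes from the allowed set, concatenating the segments always reproduces the input transcript, whatever the scorer prefers.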
Abstract: Ongoing effort has been devoted to applying metamaterials to boost the performance of magnetic resonance imaging (MRI), owing to their unique capacity for electromagnetic field confinement and enhancement. However, major obstacles to widespread clinical adoption of conventional metamaterials remain due to several notable restrictions, namely their typically bulky and rigid structures, deviations in their optimal resonance frequency, and their inevitable interference with the transmission RF field in MRI. Herein, we address these restrictions and report a conformal, smart metamaterial that can be readily tuned by a controlling circuit to achieve the desired, precise frequency match with MRI. It is also capable of selectively amplifying the magnetic field during the RF reception phase by passively sensing the excitation signal strength, so that it remains off during the RF transmission phase, ensuring optimal performance when applied to MRI as an additive technology. By addressing a host of current technological challenges, the metamaterial presented herein paves the way toward the wide-ranging utilization of metamaterials in clinical MRI, thereby translating this promising technology to the MRI bedside.
Abstract: Deep learning models have demonstrated impressive performance in various domains. However, the prolonged training time of these models remains a critical problem. Manually designed parallel training strategies can enhance efficiency but require considerable time and offer little flexibility. Hence, automatic parallelism has been proposed to automate the search for a parallel strategy. Even so, existing approaches suffer from a sub-optimal strategy space because they treat automatic parallelism as two independent stages, namely inter- and intra-layer parallelism. To address this issue, we propose UniAP, which utilizes mixed integer quadratic programming to unify inter- and intra-layer automatic parallelism. To the best of our knowledge, UniAP is the first work to unify these two categories to search for a globally optimal strategy. The experimental results show that UniAP outperforms state-of-the-art methods by up to 1.70$\times$ in throughput and reduces strategy searching time by up to 16$\times$ across four Transformer-like models.