Abstract: In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in understanding and solving mathematical problems, leading to advancements in various fields. We propose an LLM-embodied path planning framework for mobile agents that addresses both high-level coverage path planning and low-level control. The proposed multi-layer architecture uses prompted LLMs in the path planning phase and integrates them with the mobile agents' low-level actuators. To evaluate various LLMs as embodied models, we propose a coverage-weighted path planning metric. By leveraging the natural language understanding and generative capabilities of LLMs, the multi-layer framework significantly enhances the efficiency and accuracy of these tasks; our experiments show that it improves LLMs' 2D spatial reasoning abilities and enables them to complete coverage path planning tasks. We tested three LLM kernels: gpt-4o, gemini-1.5-flash, and claude-3.5-sonnet. The experimental results show that claude-3.5-sonnet completes the coverage planning task across different scenarios and outperforms the other models on our metrics.
Abstract: Quantum-inspired Machine Learning (QiML) is a burgeoning field, receiving global attention from researchers for its potential to leverage principles of quantum mechanics within classical computational frameworks. However, current review literature often presents a superficial exploration of QiML, focusing instead on the broader Quantum Machine Learning (QML) field. In response to this gap, this survey provides an integrated and comprehensive examination of QiML, exploring its diverse research domains, including tensor network simulations, dequantized algorithms, and others, showcasing recent advancements and practical applications, and illuminating potential future research avenues. Further, a concrete definition of QiML is established by analyzing various prior interpretations of the term and their inherent ambiguities. As QiML continues to evolve, we anticipate a wealth of future developments drawing from quantum mechanics, quantum computing, and classical machine learning, enriching the field further. This survey serves as a guide for researchers and practitioners alike, providing a holistic understanding of QiML's current landscape and future directions.
Abstract: Effective connectivity can describe the causal patterns among brain regions. These patterns have the potential to reveal pathological mechanisms and to promote early diagnosis and effective drug development for cognitive diseases. However, current studies mainly focus on using empirical functional time series to calculate effective connections, which may not comprehensively capture the complex causal relationships between brain regions. In this paper, a novel Multi-resolution Spatiotemporal Enhanced Transformer Denoising (MSETD) network with an adversarial functional diffusion model is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment (MCI) analysis. Specifically, the denoising framework leverages a conditional diffusion process that progressively translates the noise and conditioning fMRI into effective connectivity in an end-to-end manner. To ensure reverse-diffusion quality and diversity, a multi-resolution enhanced transformer generator is designed to extract local and global spatiotemporal features. Furthermore, a multi-scale diffusive transformer discriminator is devised to capture temporal patterns at different scales for generation stability. Evaluations on the ADNI dataset demonstrate the feasibility and efficacy of the proposed model. The proposed model not only achieves superior prediction performance compared with other competing methods but also identifies MCI-related causal connections that are consistent with clinical studies.
Abstract: It is valuable to achieve domain adaptation for abdominal multi-organ segmentation by transferring the knowledge learned from a labeled source CT dataset to an unlabeled target MR dataset. Meanwhile, it is highly desirable to avoid the high annotation cost of the target dataset and to protect the privacy of the source dataset. We therefore propose an effective source-free unsupervised domain adaptation method for cross-modality abdominal multi-organ segmentation that does not access the source dataset. The proposed framework proceeds in two stages. In the first stage, a feature-map statistics loss is used to align the distributions of source and target features in the top segmentation network, and an entropy minimization loss is used to encourage high-confidence segmentations. The pseudo-labels output by the top segmentation network are used to guide the style compensation network to generate source-like images, and the pseudo-labels output by the middle segmentation network are used to supervise the learning of the desired model (the bottom segmentation network). In the second stage, circular learning and pixel-adaptive mask refinement are used to further improve the performance of the desired model. With this approach, we achieve satisfactory performance in segmenting the liver, right kidney, left kidney, and spleen, with Dice similarity coefficients of 0.884, 0.891, 0.864, and 0.911, respectively. In addition, the proposed approach can be easily extended to the setting where annotated target data are available. With only one labeled MR volume, the average Dice similarity coefficient improves from 0.888 to 0.922, close to that of supervised learning (0.929).
Abstract: Liver segmentation on images acquired using computed tomography (CT) and magnetic resonance imaging (MRI) plays an important role in the clinical management of liver diseases. Compared to MRI, CT images of the liver are more abundant and readily available; however, MRI can provide richer quantitative information about the liver than CT. It is therefore desirable to achieve unsupervised domain adaptation for transferring knowledge learned from a source domain containing labeled CT images to a target domain containing unlabeled MR images. In this work, we report a novel unsupervised domain adaptation framework for cross-modality liver segmentation via joint adversarial learning and self-learning. We propose joint semantic-aware and shape-entropy-aware adversarial learning with a post-situ identification manner to implicitly align the distribution of task-related features extracted from the target domain with those from the source domain. In the proposed framework, a network is trained with the above two adversarial losses in an unsupervised manner, and a mean completer of pseudo-label generation is then employed to produce pseudo-labels to train the next network (the desired model). Additionally, semantic-aware adversarial learning and two self-learning methods, pixel-adaptive mask refinement and student-to-partner learning, are proposed to train the desired model. To improve the robustness of the desired model, a low-signal augmentation function is proposed to transform MR images as the input of the desired model to handle hard samples. Using public datasets, our experiments demonstrate that the proposed unsupervised domain adaptation framework outperforms four supervised learning methods, with a Dice score of 0.912 ± 0.037 (mean ± standard deviation).