Abstract:This paper proposes a unified tree-reweighted belief propagation (BP) and mean field (MF) approach for scalable detection and tracking of extended targets within a factor graph framework. The factor graph is partitioned into a BP region and an MF region, and the messages in each region are updated according to that region's rules. The BP region uses tree-reweighted BP, which converges better than standard BP on graphs with many cycles, to resolve data association. The MF region approximates the posterior densities of the measurement rate, kinematic state and extent. For linear Gaussian target models and a gamma Gaussian inverse Wishart state density, the unified approach provides a closed-form recursion for the state density. Hence, the proposed algorithm is more efficient than particle-based BP algorithms for extended target tracking. The method also avoids measurement clustering and gating, since it solves the data association problem probabilistically. We compare the proposed approach with algorithms such as the Poisson multi-Bernoulli mixture filter and the BP-based Poisson multi-Bernoulli filter. Simulation results demonstrate that the proposed algorithm achieves enhanced tracking performance.
Abstract:Quadrotor motion planning is critical for autonomous flight in complex environments, such as rescue operations. Traditional methods often employ trajectory generation optimization and passive time allocation strategies, which can limit the exploitation of the quadrotor's dynamic capabilities and introduce delays and inaccuracies. To address these challenges, we propose a novel motion planning framework that integrates visibility path searching and reinforcement learning (RL) motion generation. Our method constructs collision-free paths using heuristic search and visibility graphs, which are then refined by an RL policy to generate low-level motion commands. We validate our approach in simulated indoor environments, demonstrating better performance than traditional methods in terms of time span.
Abstract:Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multi-modal large language models, such as GPT-4Vision (GPT-4V) from OpenAI. GPT-RadPlan is made aware of planning protocols as context and acts as an expert human planner, capable of guiding a treatment planning process. Via in-context learning, we incorporate clinical protocols for various disease sites as prompts to enable GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan agent is integrated into our in-house inverse treatment planning system through an API. The efficacy of the automated planning system is showcased using multiple prostate and head & neck cancer cases, where we compared GPT-RadPlan results to clinical plans. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and organ-at-risk sparing. Consistently satisfying the dosimetric objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving remarkable results in automating the treatment planning process without the need for additional training.
Abstract:Object detection in urban scenarios is crucial for autonomous driving in intelligent traffic systems. However, unlike conventional object detection tasks, urban-scene images vary greatly in style. For example, images taken on sunny days differ significantly from those taken on rainy days. Therefore, models trained on sunny-day images may not generalize well to rainy-day images. In this paper, we aim to solve the single-domain generalizable object detection task in urban scenarios, meaning that a model trained on images from one weather condition should perform well on images from any other weather condition. To address this challenge, we propose a novel Double AUGmentation (DoubleAUG) method that includes image- and feature-level augmentation schemes. In the image-level augmentation, we consider the variation in color information across different weather conditions and propose a Color Perturbation (CP) method that randomly exchanges the RGB channels to generate various images. In the feature-level augmentation, we propose a Dual-Style Memory (DSM) that explores the diverse style information across the entire dataset, further enhancing the model's generalization capability. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art methods. Furthermore, ablation studies confirm the effectiveness of each module in our proposed method. Moreover, our method is plug-and-play and can be integrated into existing methods to further improve model performance.
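The Color Perturbation step described in the abstract above, randomly exchanging the RGB channels, can be illustrated with a minimal sketch. The function name `color_perturbation`, the H x W x 3 array layout and the use of NumPy are assumptions made for illustration only; the abstract does not specify the sampling scheme or any further processing, so this is not the authors' implementation.

```python
import numpy as np

def color_perturbation(image, rng):
    """Randomly permute the channels of an H x W x 3 RGB image.

    Hypothetical helper: a uniform random permutation of the three
    channels is an assumption, since the abstract only says the RGB
    channels are randomly exchanged.
    """
    perm = rng.permutation(3)       # e.g. [2, 0, 1]; may also be the identity
    return image[..., perm], perm   # fancy indexing returns a new array

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
augmented, perm = color_perturbation(image, rng)
```

Because only the channel order changes, bounding-box annotations remain valid for the augmented image without any adjustment.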
Abstract:The amplitude information of target returns has been incorporated into many tracking algorithms to improve performance. One limitation of using the amplitude feature is that the signal-to-noise ratio (SNR) of the target, i.e., the parameter of the amplitude likelihood, is usually assumed to be known and constant. In practice, the target SNR is unknown and depends on the aspect angle, so it fluctuates. In this paper, we propose a hybrid labeled multi-Bernoulli (LMB) filter that introduces the signal amplitude into the LMB filter for tracking targets with unknown and fluctuating SNR. The fluctuation of the target SNR is modeled by an autoregressive gamma process, and amplitude likelihoods for Swerling 1 and 3 targets are considered. Under Rao-Blackwell decomposition, an approximate Gamma estimator based on the Laplace transform and a Markov chain Monte Carlo method is proposed to estimate the target SNR, and the kinematic state is estimated by a Gaussian mixture filter conditioned on the target SNR. The performance of the proposed hybrid filter is analyzed in a tracking scenario with three crossing targets. Simulation results verify the efficacy of the proposed SNR estimator and quantify the benefits of incorporating amplitude information for multi-target tracking.
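To make the role of the SNR as the parameter of the amplitude likelihood concrete, the sketch below evaluates a commonly used textbook form of the Swerling 1 amplitude density. It assumes an envelope detector with Rayleigh noise normalized to unit power and an expected SNR `snr` given in linear scale (not dB). The function names are hypothetical, and the parameterization used in the paper may differ.

```python
import numpy as np

def noise_amplitude_pdf(a):
    """Rayleigh density of a noise-only amplitude (unit noise power assumed)."""
    a = np.asarray(a, dtype=float)
    return a * np.exp(-0.5 * a**2)

def swerling1_amplitude_pdf(a, snr):
    """Amplitude density of a Swerling 1 target with expected SNR `snr` (linear)."""
    a = np.asarray(a, dtype=float)
    return a / (1.0 + snr) * np.exp(-0.5 * a**2 / (1.0 + snr))

def amplitude_likelihood_ratio(a, snr):
    """Target-to-noise likelihood ratio, written analytically to avoid 0/0 at a = 0."""
    a = np.asarray(a, dtype=float)
    return np.exp(0.5 * a**2 * snr / (1.0 + snr)) / (1.0 + snr)

# A strong return favors the target hypothesis; a weak one favors noise.
strong = amplitude_likelihood_ratio(3.0, snr=10.0)
weak = amplitude_likelihood_ratio(0.1, snr=10.0)
```

With a known, constant SNR this ratio can be used directly to weight measurements. When the SNR is unknown and fluctuating, as in the abstract, `snr` becomes a quantity that must itself be estimated.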