Abstract: Large language models (LLMs) have demonstrated strong potential and impressive performance in automating the generation and optimization of workflows. However, existing approaches are marked by limited reasoning capabilities, high computational demands, and significant resource requirements. To address these issues, we propose DebFlow, a framework that employs a debate mechanism to optimize workflows and integrates reflexion to improve based on previous experiences. We evaluated our method on six benchmark datasets, including HotpotQA, MATH, and ALFWorld. Our approach achieved a 3% average performance improvement over the latest baselines, demonstrating its effectiveness across diverse problem domains. Notably, during training, our framework reduces resource consumption by 37% compared to the state-of-the-art baselines. In ablation studies, removing the debate component resulted in a 4% performance drop across two benchmark datasets, twice the 2% drop observed when the reflexion component was removed. These findings demonstrate the critical role of debate in enhancing framework performance, while also highlighting the auxiliary contribution of reflexion to overall optimization.
Abstract: Visual Salient Object Detection (SOD) and Camouflaged Object Detection (COD) are two interrelated yet distinct tasks. Both model the human visual system's ability to perceive the presence of objects. Traditional SOD datasets and methods are designed for scenes where only salient objects are present; similarly, COD datasets and methods are designed for scenes where only camouflaged objects are present. However, scenes where both salient and camouflaged objects coexist, or where neither is present, are not considered, which simplifies the problem addressed by existing SOD and COD research. In this paper, to explore a more generalized approach to SOD and COD, we introduce a benchmark called Unconstrained Salient and Camouflaged Object Detection (USCOD), which supports the simultaneous detection of salient and camouflaged objects in unconstrained scenes, regardless of their presence. To this end, we construct a large-scale dataset, CS12K, that encompasses four distinct scene types: only salient objects, only camouflaged objects, both, and neither. In our benchmark experiments, we identify a major challenge in USCOD: distinguishing between salient and camouflaged objects within the same scene. To address this challenge, we propose USCNet, a baseline model for USCOD that decouples the learning of attribute distinction from mask reconstruction. The model incorporates an APG module, which learns both sample-generic and sample-specific features to enhance the attribute differentiation between salient and camouflaged objects. Furthermore, to evaluate a model's ability to distinguish between salient and camouflaged objects, we design a metric called the Camouflage-Saliency Confusion Score (CSCS). The proposed method achieves state-of-the-art performance on the newly introduced USCOD task. The code and dataset will be publicly available.
Abstract: With the advancement of autonomous driving, ensuring safety during motion planning and navigation is becoming increasingly important. However, most end-to-end planning methods do not ensure safety. This research addresses the safety issue in the control optimization problem of autonomous driving, formulated as a Constrained Markov Decision Process (CMDP). We propose a novel, model-based approach to policy optimization that uses a Conditional Value-at-Risk (CVaR)-based Soft Actor-Critic to manage constraints effectively in complex, high-dimensional state spaces. Our method introduces a worst-case actor to guide safe exploration, ensuring rigorous adherence to safety requirements even in unpredictable scenarios. The policy optimization employs the Augmented Lagrangian method and leverages latent diffusion models to predict and simulate future trajectories. This dual approach not only aids in navigating environments safely but also refines the policy's performance by integrating distribution modeling to account for environmental uncertainties. Empirical evaluations conducted in both simulated and real environments demonstrate that our approach outperforms existing methods in terms of safety, efficiency, and decision-making capabilities.
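For readers unfamiliar with the risk measure named in this abstract, the following is a minimal sketch of an empirical Conditional Value-at-Risk estimate over sampled trajectory costs. It is an illustration only, not the authors' implementation: the function name `empirical_cvar`, the confidence level `alpha`, and the sample costs are assumptions introduced here.

```python
import math


def empirical_cvar(costs, alpha=0.9):
    """Estimate CVaR as the mean of the worst (1 - alpha) fraction of costs.

    `costs` is a hypothetical list of sampled safety costs (higher is worse);
    `alpha` is an assumed confidence level in [0, 1).
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(costs, reverse=True)  # worst (largest) costs first
    # Size of the tail; rounding guards against floating-point noise such as
    # (1 - 0.7) * 10 evaluating slightly above 3. Always keep one sample.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Illustrative costs only; not data from the paper.
sample_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
tail_risk = empirical_cvar(sample_costs, alpha=0.8)  # mean of the two worst costs
```

A risk-sensitive constraint of the kind the abstract describes would then compare such a tail estimate, rather than the plain mean cost, against a safety budget, so that rare but severe outcomes are penalized.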
Abstract: With recent advances in Multi-agent Reinforcement Learning (MARL), its application has extended to various safety-critical scenarios. However, most methods focus on online learning, which presents substantial risks when deployed in real-world settings. To address this challenge, we introduce a framework that integrates diffusion models within the MARL paradigm. This approach enhances the safety of actions taken by multiple agents through risk mitigation while modeling coordinated action. Our framework is grounded in the Centralized Training with Decentralized Execution (CTDE) architecture, augmented by a diffusion model for predictive trajectory generation. Additionally, we incorporate a specialized algorithm to further ensure operational safety. We evaluate our model against baselines on the DSRL benchmark. Experimental results demonstrate that our model not only adheres to stringent safety constraints but also achieves superior performance compared to existing methods. This underscores the potential of our approach for advancing the safety and efficacy of MARL in real-world applications.