Abstract:Diffusion models have achieved remarkable success in image generation, particularly with the various applications of classifier-free guidance in conditional diffusion models. While many diffusion models perform well when controlling a particular aspect such as style, character, or interaction, they struggle with fine-grained control due to dataset limitations and intricate model architecture design. This paper introduces a novel algorithm, Aggregation of Multi Diffusion Models (AMDM), which synthesizes features from multiple diffusion models into a specified model, enhancing its learned representations to activate specific features for fine-grained control. AMDM consists of two key components: spherical aggregation and manifold optimization. Spherical aggregation merges intermediate variables from different diffusion models with minimal manifold deviation, while manifold optimization refines these variables to align with the intermediate data manifold, enhancing sampling quality. Experimental results demonstrate that AMDM significantly improves fine-grained control without additional training or inference time. The results also reveal that diffusion models initially focus on features such as position, attributes, and style, with later stages improving generation quality and consistency. AMDM offers a new perspective for tackling the challenges of fine-grained conditional control generation in diffusion models: we can fully utilize existing conditional diffusion models that control specific aspects, or develop new ones, and then aggregate them using the AMDM algorithm. This eliminates the need for constructing complex datasets, designing intricate model architectures, and incurring high training costs. Code is available at: https://github.com/Hammour-steak/AMDM
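The abstract above does not spell out the spherical aggregation rule, so the following is only a minimal sketch of the general idea, assuming plain spherical linear interpolation (slerp) between two intermediate variables. The function names `slerp` and `norm` and the weight `w` are hypothetical and are not taken from the AMDM code. The sketch shows why a spherical merge can stay closer to the original norm than a plain weighted average does.

```python
import math

def norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))

def slerp(a, b, w):
    """Spherical linear interpolation between flat vectors a and b.

    w = 0 returns a, w = 1 returns b. This is a generic slerp, used here
    as an assumed stand-in for the paper's spherical aggregation step.
    """
    dot = sum(x * y for x, y in zip(a, b))
    cos_theta = max(-1.0, min(1.0, dot / (norm(a) * norm(b))))
    theta = math.acos(cos_theta)
    s = math.sin(theta)
    if s < 1e-6:
        # Nearly parallel (or antiparallel) vectors: fall back to a linear blend.
        return [(1 - w) * x + w * y for x, y in zip(a, b)]
    coef_a = math.sin((1 - w) * theta) / s
    coef_b = math.sin(w * theta) / s
    return [coef_a * x + coef_b * y for x, y in zip(a, b)]

# Two unit-norm "latents" pointing in different directions.
latent_a = [1.0, 0.0]
latent_b = [0.0, 1.0]
merged = slerp(latent_a, latent_b, 0.5)
averaged = [0.5 * x + 0.5 * y for x, y in zip(latent_a, latent_b)]
```

For inputs of equal norm, the slerp result keeps that norm, whereas the plain average shrinks toward the origin.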
Abstract:In this correspondence, we investigate an intelligent reflective surface (IRS) assisted downlink ultra-reliable and low-latency communication (URLLC) system, where an access point (AP) sends short packets to multiple devices with the help of an IRS. Specifically, we compare the performance of frequency division multiple access (FDMA) and time division multiple access (TDMA) for the considered system from the perspective of the average age of information (AoI). We aim to minimize the maximum average AoI among all devices by jointly optimizing the resource allocation and passive beamforming. However, the formulated problem is difficult to solve due to the non-convex objective function and coupled variables. Thus, we propose an alternating optimization based algorithm that divides the original problem into two sub-problems, each of which can be solved efficiently. Simulation results show that TDMA can achieve lower AoI by exploiting the time-selective passive beamforming of the IRS to maximize the signal-to-noise ratio (SNR) of each device consecutively. Moreover, they also show that when the number of information bits becomes sufficiently large relative to the available bandwidth, the proposed FDMA transmission scheme becomes more favorable instead, due to its more effective utilization of bandwidth.
Abstract:Generating low-level robot task plans from high-level natural language instructions remains a challenging problem. Although large language models have shown promising results in generating plans, the accuracy of the output remains unverified. Furthermore, the lack of domain-specific language data limits the applicability of these models. In this paper, we propose CLAIRIFY, a novel approach that combines automatic iterative prompting with program verification to ensure that programs written in a data-scarce domain-specific language are syntactically valid and incorporate environment constraints. Our approach provides effective guidance to the language model on generating structured-like task plans by incorporating any errors as feedback, while the verifier ensures the syntactic accuracy of the generated plans. We demonstrate the effectiveness of CLAIRIFY in planning chemistry experiments by achieving state-of-the-art results. We also show that the generated plans can be executed on a real robot by integrating them with a task and motion planner.
Abstract:In this paper, we study an unmanned aerial vehicle (UAV) enabled data collection system, where an intelligent reflecting surface (IRS) is deployed to assist in the communication from a cluster of Internet-of-Things (IoT) devices to a UAV in the presence of a jammer. We aim to improve the energy efficiency (EE) via the joint design of UAV trajectory, IRS passive beamforming, device power allocation, and communication scheduling. However, the formulated non-linear fractional programming problem is challenging to solve due to its non-convexity and coupled variables. To overcome the difficulty, we propose an alternating optimization based algorithm to solve it sub-optimally by leveraging Dinkelbach's algorithm, the successive convex approximation (SCA) technique, and the block coordinate descent (BCD) method. Extensive simulation results show that the proposed design can significantly improve the anti-jamming performance. In particular, for the remote jammer case, the proposed design can largely shorten the flight path and thus decrease the energy consumption via signal enhancement. For the local jammer case, which is deemed highly challenging in conventional systems without an IRS since the strategy of retreating from the jammer becomes ineffective, our proposed design achieves an even higher performance gain owing to efficient jamming signal mitigation.
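Dinkelbach's algorithm, mentioned above, turns a fractional objective f(x)/g(x) into a sequence of subtractive problems f(x) - λ g(x). The following is a minimal sketch on a toy single-link energy-efficiency problem, not the paper's joint UAV/IRS design. The channel gain of 5.0, the circuit power of 1.0, the power grid, and the function name `dinkelbach` are all illustrative assumptions, and the inner problem is solved by enumeration over the grid rather than by SCA.

```python
import math

def dinkelbach(candidates, f, g, tol=1e-12, max_iter=100):
    """Maximize f(x) / g(x) over a finite candidate set.

    Assumes f(x) > 0 for some candidate and g(x) > 0 for every candidate.
    Returns the selected candidate and the final ratio estimate lam.
    """
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Inner step: maximize the parametric objective f(x) - lam * g(x).
        best = max(candidates, key=lambda x: f(x) - lam * g(x))
        gap = f(best) - lam * g(best)
        if gap <= tol:
            break  # lam is (numerically) the optimal ratio
        lam = f(best) / g(best)  # Dinkelbach update
    return best, lam

# Toy energy efficiency: rate(p) / (p + circuit power), all values assumed.
def rate(p):
    return math.log2(1.0 + 5.0 * p)

def total_power(p):
    return p + 1.0

powers = [0.01 * k for k in range(1, 1001)]  # transmit powers 0.01 .. 10.0
p_star, ee_star = dinkelbach(powers, rate, total_power)
brute_force = max(rate(p) / total_power(p) for p in powers)
```

On a finite candidate set the loop stops after a handful of iterations, and `ee_star` agrees with the brute-force maximum of the ratio.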
Abstract:In this paper, we study an unmanned aerial vehicle (UAV) communication system, where a ground node (GN) communicates with a UAV assisted by an intelligent reflecting surface (IRS) in the presence of a jammer with imperfect location information. We aim to improve the achievable average rate via the joint robust design of the UAV trajectory, IRS passive beamforming, and the GN's power allocation. However, the formulated optimization problem is challenging to solve due to its non-convexity and coupled variables. To overcome the difficulty, we propose an alternating optimization (AO) based algorithm to solve it sub-optimally by leveraging semidefinite relaxation (SDR), successive convex approximation (SCA), and S-procedure methods. Simulation results show that by deploying the IRS near the GN, our proposed algorithm always improves the uplink achievable average rate significantly compared with the benchmark algorithms, while deploying the IRS near the jammer is effective only when the jammer's location is perfectly known.
Abstract:In this letter, we investigate an unmanned aerial vehicle (UAV) communication system, where an intelligent reflecting surface (IRS) is deployed to assist in the transmission from a ground node (GN) to the UAV in the presence of a jammer. We aim to maximize the average rate of the UAV communication by jointly optimizing the GN's transmit power, the IRS's passive beamforming, and the UAV's trajectory. However, the formulated problem is difficult to solve due to the non-convex objective function and the coupled optimization variables. To tackle it, we propose an alternating optimization (AO) based algorithm that exploits the successive convex approximation (SCA) and semidefinite relaxation (SDR) techniques. Simulation results show that the proposed algorithm can significantly improve the average rate compared with the benchmark algorithms. Moreover, they also show that when the jamming power is large and the number of IRS elements is relatively small, deploying the IRS near the jammer outperforms deploying it near the GN, and vice versa.
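The alternating optimization (AO) pattern used in the abstracts above can be illustrated on a toy problem with coupled variables. The following is only a sketch of the general technique, not the authors' algorithm: the objective (x*y - 2)^2 + 0.1*(x^2 + y^2) and all function names are assumptions chosen so that each block update has a closed form, standing in for the SCA and SDR sub-problem solvers.

```python
def objective(x, y):
    """Toy non-convex objective with coupled variables (assumed for illustration)."""
    return (x * y - 2.0) ** 2 + 0.1 * (x ** 2 + y ** 2)

def best_given_other(other):
    """Exact minimizer over v of (v * other - 2)^2 + 0.1 * v^2 with `other` fixed.

    Setting the derivative 2 * other * (v * other - 2) + 0.2 * v to zero
    gives v = 2 * other / (other^2 + 0.1).
    """
    return 2.0 * other / (other ** 2 + 0.1)

def alternating_minimize(x, y, iters=50):
    """Update x with y fixed, then y with x fixed, and record the objective."""
    history = [objective(x, y)]
    for _ in range(iters):
        x = best_given_other(y)  # block 1: optimize x, hold y
        history.append(objective(x, y))
        y = best_given_other(x)  # block 2: optimize y, hold x
        history.append(objective(x, y))
    return x, y, history

x_opt, y_opt, history = alternating_minimize(1.0, 1.0)
```

Because each block update solves its sub-problem exactly, the recorded objective never increases. That property is what AO relies on, although it guarantees convergence only to a stationary point and not to a global optimum.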