Abstract: Recent advancements in text-to-image generation models have dramatically enhanced the generation of photorealistic images from textual prompts, leading to an increased interest in personalized text-to-image applications, particularly in multi-subject scenarios. However, these advances are hindered by two main challenges: first, the need to accurately maintain the details of each referenced subject in accordance with the textual descriptions; and second, the difficulty of achieving a cohesive representation of multiple subjects in a single image without introducing inconsistencies. To address these concerns, our research introduces the MS-Diffusion framework for layout-guided zero-shot image personalization with multiple subjects. This approach integrates grounding tokens with the feature resampler to maintain detail fidelity among subjects. With the layout guidance, MS-Diffusion further improves the cross-attention to adapt to the multi-subject inputs, ensuring that each subject condition acts on specific areas. The proposed multi-subject cross-attention orchestrates harmonious inter-subject compositions while preserving the control exerted by the text. Comprehensive quantitative and qualitative experiments affirm that this method surpasses existing models in both image and text fidelity, promoting the development of personalized text-to-image generation.
Abstract: In geology, a key activity is the characterisation of geological structures (surface formation topology and rock units) using Planar Orientation measurements such as Strike, Dip and Dip Direction. These measurements are generally collected manually with basic equipment, usually a compass/clinometer and a backboard, and recorded on a map by hand. Various computing techniques and technologies, such as Lidar, have been used to automate this process and update the collection paradigm for these types of measurements. Techniques such as Structure from Motion (SfM) reconstruct scenes and objects by generating a point cloud from input images, with detailed reconstruction possible on the decimetre scale. SfM-type techniques provide advantages in cost and in usability across more varied environmental conditions, while sacrificing the extreme levels of data fidelity. We present a data acquisition methodology and a Machine Learning-based software system, GeoStructure, developed to automate the collection of orientation measurements. Rather than deriving measurements from the input images with a method such as the Hough Transform, this approach takes measurements directly from the reconstructed point cloud surfaces. Point cloud noise is mitigated using a Mahalanobis distance implementation. Significant structure is characterised using a k-nearest neighbour region growing algorithm, and final surface orientations are quantified using the plane and normal direction cosines.
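As a rough illustration of the last step, the sketch below converts a patch of point cloud coordinates into dip, dip direction and strike via the direction cosines of the fitted plane normal. It is not GeoStructure's implementation: the least-squares fit, the function names `fit_plane_normal` and `orientation_from_normal`, and the axis convention (x = east, y = north, z = up) are all assumptions made for the example, and the fit cannot represent vertical planes.

```python
import math

def fit_plane_normal(points):
    """Least-squares fit of z = a*x + b*y + c; returns the upward unit normal.

    The three components of the unit normal are its direction cosines
    with respect to the x, y and z axes. Assumes a non-vertical plane.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    mz = sum(p[2] for p in points) / n
    # Centred second moments for the 2x2 normal equations.
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxz = sum((p[0] - mx) * (p[2] - mz) for p in points)
    syz = sum((p[1] - my) * (p[2] - mz) for p in points)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("points are collinear in the horizontal plane")
    a = (sxz * syy - sxy * syz) / det
    b = (sxx * syz - sxy * sxz) / det
    norm = math.sqrt(a * a + b * b + 1.0)
    return (-a / norm, -b / norm, 1.0 / norm)

def orientation_from_normal(normal):
    """Return (dip, dip_direction, strike) in degrees from a unit normal.

    Assumed convention: x = east, y = north, z = up; azimuths are measured
    clockwise from north; strike follows the right-hand rule.
    """
    nx, ny, nz = normal
    if nz < 0:  # work with the upward-pointing normal
        nx, ny, nz = -nx, -ny, -nz
    dip = math.degrees(math.acos(nz))
    # The horizontal part of the upward normal points down-dip.
    dip_direction = math.degrees(math.atan2(nx, ny)) % 360.0
    strike = (dip_direction - 90.0) % 360.0
    return dip, dip_direction, strike
```

For example, four points sampled from the surface z = -x (a plane dipping 45 degrees towards the east) can be passed to `fit_plane_normal`, and the resulting normal to `orientation_from_normal`, to recover that orientation.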
Abstract: Distributed machine learning (DML) techniques, such as federated learning, partitioned learning, and distributed reinforcement learning, have been increasingly applied to wireless communications. This is due to the improved capabilities of terminal devices, explosively growing data volume, congestion in the radio interfaces, and growing concern over data privacy. The unique features of wireless systems, such as large scale, geographically dispersed deployment, user mobility, and massive amounts of data, give rise to new challenges in the design of DML techniques. There is a clear gap in the existing literature in that DML techniques are yet to be systematically reviewed for their applicability to wireless systems. This survey bridges the gap by providing a contemporary and comprehensive review of DML techniques with a focus on wireless networks. Specifically, we review the latest applications of DML in power control, spectrum management, user association, and edge cloud computing. The optimality, scalability, convergence rate, computation cost, and communication overhead of DML are analyzed. We also discuss the potential adversarial attacks faced by DML applications, and describe state-of-the-art countermeasures to preserve privacy and security. Last but not least, we point out a number of key issues yet to be addressed, and collate potentially interesting and challenging topics for future research.
Abstract: This work proposes a diffusion normalized least mean M-estimate algorithm based on the modified Huber function, which can equip distributed networks with robust learning capability in the presence of impulsive interference. To exploit the system's underlying sparsity and further improve the learning performance, a sparse-aware variant is also developed by incorporating the $l_0$-norm of the estimates into the update process. We then analyze the transient, steady-state and stability behaviors of the algorithms in a unified framework. In particular, we present an analytical method for dealing with the score function that is simpler than conventional approaches, since it removes the need for integrals and Price's theorem. Simulations in various impulsive noise scenarios show that the proposed algorithms are superior to some existing diffusion algorithms and that the theoretical results are verifiable.
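To make the robustness mechanism concrete, the following is a minimal sketch of one adapt step and one combine step, assuming the commonly used form of the modified Huber score function (the error is passed through below a threshold and zeroed above it). The abstract does not give the recursions, so the function names, the fixed threshold `xi`, and the plain weighted-average combine rule are illustrative assumptions; in practice the threshold is usually estimated adaptively, and the sparse-aware $l_0$-norm term is omitted here.

```python
def modified_huber_score(e, xi):
    """Score function: pass the error through if |e| < xi, else reject it."""
    return e if abs(e) < xi else 0.0

def nlmm_adapt(w, x, d, mu, xi, eps=1e-8):
    """One normalized least mean M-estimate step at a single node.

    w: current weights, x: regressor, d: desired sample,
    mu: step size, xi: outlier threshold (assumed fixed here).
    """
    e = d - sum(wi * xk for wi, xk in zip(w, x))
    gain = mu * modified_huber_score(e, xi) / (eps + sum(xk * xk for xk in x))
    return [wi + gain * xk for wi, xk in zip(w, x)]

def combine(estimates, weights):
    """Diffusion step: convex combination of the neighbours' estimates."""
    dim = len(estimates[0])
    return [sum(c * est[k] for c, est in zip(weights, estimates))
            for k in range(dim)]
```

With `xi = 2.0`, a sample whose error is 1 updates the weights as in ordinary NLMS, whereas an impulsive sample whose error is 100 leaves them unchanged, which is the behaviour that protects the network estimate from outliers.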
Abstract: Space-time adaptive processing (STAP) algorithms with coprime arrays can provide good clutter suppression potential at low cost in airborne radar systems compared with their uniform linear array counterparts. However, the performance of these algorithms is limited by the training sample support in practical applications. To address this issue, a robust two-stage reduced-dimension (RD) sparsity-aware STAP algorithm is proposed in this work. In the first stage, an RD virtual snapshot is constructed using all spatial channels but only $m$ adjacent Doppler channels around the target Doppler frequency to reduce the slow-time dimension of the signal. In the second stage, an RD sparse measurement model is formulated based on the constructed RD virtual snapshot, where the sparsity of clutter and the prior knowledge of the clutter ridge are exploited to formulate an RD overcomplete dictionary. Moreover, an orthogonal matching pursuit (OMP)-like method is proposed to recover the clutter subspace. To set the stopping parameter of the OMP-like method, a robust clutter rank estimation approach is developed. Compared with recently developed sparsity-aware STAP algorithms, the size of the proposed sparse representation dictionary is much smaller, resulting in low complexity. Simulation results show that the proposed algorithm is robust to prior knowledge errors and can provide good clutter suppression performance in low sample support.
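For readers unfamiliar with OMP, the sketch below shows the textbook greedy recovery loop on a small real-valued dictionary. It is a generic illustration, not the OMP-like method of the paper: STAP dictionaries are built from complex space-time steering vectors, and the names `omp`, `solve` and `n_iter` are assumptions for the example. The iteration count `n_iter` plays the role of the stopping parameter, which the abstract says is set from an estimate of the clutter rank.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def omp(atoms, y, n_iter):
    """Orthogonal matching pursuit over a list of non-zero dictionary atoms.

    Returns (support, coeffs): the selected atom indices in selection order
    and their least-squares coefficients.
    """
    support, coeffs = [], []
    residual = list(y)
    for _ in range(n_iter):
        # Select the unused atom most correlated with the current residual.
        candidates = [j for j in range(len(atoms)) if j not in support]
        best = max(candidates,
                   key=lambda j: abs(dot(atoms[j], residual))
                   / math.sqrt(dot(atoms[j], atoms[j])))
        support.append(best)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], y) for i in support]
        coeffs = solve(gram, rhs)
        residual = [yk - sum(c * atoms[j][k] for c, j in zip(coeffs, support))
                    for k, yk in enumerate(y)]
    return support, coeffs
```

Because each iteration solves a least-squares problem whose size equals the current support, a smaller dictionary and a well-chosen stopping parameter both reduce the cost of the loop, which is consistent with the complexity argument made in the abstract.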