Abstract:Federated Learning (FL) that extracts data knowledge while protecting the privacy of multiple clients has achieved remarkable results in distributed privacy-preserving IoT systems, including smart traffic flow monitoring, smart grid load balancing, and so on. Since most data collected from edge devices are unlabeled, unsupervised Federated Clustering (FC) is becoming increasingly popular for exploring pattern knowledge from complex distributed data. However, due to the lack of label guidance, the common Non-Independent and Identically Distributed (Non-IID) issue of clients have greatly challenged FC by posing the following problems: How to fuse pattern knowledge (i.e., cluster distribution) from Non-IID clients; How are the cluster distributions among clients related; and How does this relationship connect with the global knowledge fusion? In this paper, a more tricky but overlooked phenomenon in Non-IID is revealed, which bottlenecks the clustering performance of the existing FC approaches. That is, different clients could fragment a cluster, and accordingly, a more generalized Non-IID concept, i.e., Non-ICD (Non-Independent Completely Distributed), is derived. To tackle the above FC challenges, a new framework named GOLD (Global Oriented Local Distribution Learning) is proposed. GOLD first finely explores the potential incomplete local cluster distributions of clients, then uploads the distribution summarization to the server for global fusion, and finally performs local cluster enhancement under the guidance of the global distribution. Extensive experiments, including significance tests, ablation studies, scalability evaluations, qualitative results, etc., have been conducted to show the superiority of GOLD.
Abstract:Multivariate Time-Series (MTS) clustering is crucial for signal processing and data analysis. Although deep learning approaches, particularly those leveraging Contrastive Learning (CL), are prominent for MTS representation, existing CL-based models face two key limitations: 1) neglecting clustering information during positive/negative sample pair construction, and 2) introducing unreasonable inductive biases, e.g., destroying time dependence and periodicity through augmentation strategies, compromising representation quality. This paper, therefore, proposes a Temporal-Frequency Enhanced Contrastive (TFEC) learning framework. To preserve temporal structure while generating low-distortion representations, a temporal-frequency Co-EnHancement (CoEH) mechanism is introduced. Accordingly, a synergistic dual-path representation and cluster distribution learning framework is designed to jointly optimize cluster structure and representation fidelity. Experiments on six real-world benchmark datasets demonstrate TFEC's superiority, achieving 4.48% average NMI gains over SOTA methods, with ablation studies validating the design. The code of the paper is available at: https://github.com/yueliangy/TFEC.
Abstract:Driven by the growth of Web-scale decentralized services, Federated Clustering (FC) aims to extract knowledge from heterogeneous clients in an unsupervised manner while preserving the clients' privacy, which has emerged as a significant challenge due to the lack of label guidance and the Non-Independent and Identically Distributed (non-IID) nature of clients. In real scenarios such as personalized recommendation and cross-device user profiling, the global cluster may be fragmented and distributed among different clients, and the clusters may exist at different granularities or even nested. Although Hierarchical Clustering (HC) is considered promising for exploring such distributions, the sophisticated recursive clustering process makes it more computationally expensive and vulnerable to privacy exposure, thus relatively unexplored under the federated learning scenario. This paper introduces an efficient one-shot hierarchical FC framework that performs client-end distribution exploration and server-end distribution aggregation through one-way prototype-level communication from clients to the server. A fine partition mechanism is developed to generate successive clusterlets to describe the complex landscape of the clients' clusters. Then, a multi-granular learning mechanism on the server is proposed to fuse the clusterlets, even when they have inconsistent granularities generated from different clients. It turns out that the complex cluster distributions across clients can be efficiently explored, and extensive experiments comparing state-of-the-art methods on ten public datasets demonstrate the superiority of the proposed method.




Abstract:Human motion transfer aims at animating a static source image with a driving video. While recent advances in one-shot human motion transfer have led to significant improvement in results, it remains challenging for methods with 2D body landmarks, skeleton and semantic mask to accurately capture correspondences between source and driving poses due to the large variation in motion and articulation complexity. In addition, the accuracy and precision of DensePose degrade the image quality for neural-rendering-based methods. To address the limitations and by both considering the importance of appearance and geometry for motion transfer, in this work, we proposed a unified framework that combines multi-scale feature warping and neural texture mapping to recover better 2D appearance and 2.5D geometry, partly by exploiting the information from DensePose, yet adapting to its inherent limited accuracy. Our model takes advantage of multiple modalities by jointly training and fusing them, which allows it to robust neural texture features that cope with geometric errors as well as multi-scale dense motion flow that better preserves appearance. Experimental results with full and half-view body video datasets demonstrate that our model can generalize well and achieve competitive results, and that it is particularly effective in handling challenging cases such as those with substantial self-occlusions.