Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jianfeng Lu

DaringFed: A Dynamic Bayesian Persuasion Pricing for Online Federated Learning under Two-sided Incomplete Information

May 09, 2025

Yun Xin, Jianfeng Lu, Shuqin Cao, Gang Li, Haozhao Wang, Guanghui Wen

Abstract:Online Federated Learning (OFL) is a real-time learning paradigm that sequentially executes parameter aggregation immediately for each random arriving client. To motivate clients to participate in OFL, it is crucial to offer appropriate incentives to offset the training resource consumption. However, the design of incentive mechanisms in OFL is constrained by the dynamic variability of Two-sided Incomplete Information (TII) concerning resources, where the server is unaware of the clients' dynamically changing computational resources, while clients lack knowledge of the real-time communication resources allocated by the server. To incentivize clients to participate in training by offering dynamic rewards to each arriving client, we design a novel Dynamic Bayesian persuasion pricing for online Federated learning (DaringFed) under TII. Specifically, we begin by formulating the interaction between the server and clients as a dynamic signaling and pricing allocation problem within a Bayesian persuasion game, and then demonstrate the existence of a unique Bayesian persuasion Nash equilibrium. By deriving the optimal design of DaringFed under one-sided incomplete information, we further analyze the approximate optimal design of DaringFed with a specific bound under TII. Finally, extensive evaluation conducted on real datasets demonstrate that DaringFed optimizes accuracy and converges speed by 16.99%, while experiments with synthetic datasets validate the convergence of estimate unknown values and the effectiveness of DaringFed in improving the server's utility by up to 12.6%.

Via

Access Paper or Ask Questions

Bi-Lipschitz Ansatz for Anti-Symmetric Functions

Mar 06, 2025

Nadav Dym, Jianfeng Lu, Matan Mizrachi

Abstract:Motivated by applications for simulating quantum many body functions, we propose a new universal ansatz for approximating anti-symmetric functions. The main advantage of this ansatz over previous alternatives is that it is bi-Lipschitz with respect to a naturally defined metric. As a result, we are able to obtain quantitative approximation results for approximation of Lipschitz continuous antisymmetric functions. Moreover, we provide preliminary experimental evidence to the improved performance of this ansatz for learning antisymmetric functions.

Via

Access Paper or Ask Questions

Multi-Label Test-Time Adaptation with Bound Entropy Minimization

Feb 06, 2025

Xiangyu Wu, Feng Yu, Qing-Guo Chen, Yang Yang, Jianfeng Lu

Figure 1 for Multi-Label Test-Time Adaptation with Bound Entropy Minimization

Figure 2 for Multi-Label Test-Time Adaptation with Bound Entropy Minimization

Figure 3 for Multi-Label Test-Time Adaptation with Bound Entropy Minimization

Figure 4 for Multi-Label Test-Time Adaptation with Bound Entropy Minimization

Abstract:Mainstream test-time adaptation (TTA) techniques endeavor to mitigate distribution shifts via entropy minimization for multi-class classification, inherently increasing the probability of the most confident class. However, when encountering multi-label instances, the primary challenge stems from the varying number of labels per image, and prioritizing only the highest probability class inevitably undermines the adaptation of other positive labels. To address this issue, we investigate TTA within multi-label scenario (ML--TTA), developing Bound Entropy Minimization (BEM) objective to simultaneously increase the confidence of multiple top predicted labels. Specifically, to determine the number of labels for each augmented view, we retrieve a paired caption with yielded textual labels for that view. These labels are allocated to both the view and caption, called weak label set and strong label set with the same size k. Following this, the proposed BEM considers the highest top-k predicted labels from view and caption as a single entity, respectively, learning both view and caption prompts concurrently. By binding top-k predicted labels, BEM overcomes the limitation of vanilla entropy minimization, which exclusively optimizes the most confident class. Across the MSCOCO, VOC, and NUSWIDE multi-label datasets, our ML--TTA framework equipped with BEM exhibits superior performance compared to the latest SOTA methods, across various model architectures, prompt initialization, and varying label scenarios. The code is available at https://github.com/Jinx630/ML-TTA.

* Accepted for publication at ICLR 2025; 17 pages; 3 figures

Via

Access Paper or Ask Questions

Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control

Feb 05, 2025

Xianghui Ze, Zhenbo Song, Qiwei Wang, Jianfeng Lu, Yujiao Shi

Abstract:Generating street-view images from satellite imagery is a challenging task, particularly in maintaining accurate pose alignment and incorporating diverse environmental conditions. While diffusion models have shown promise in generative tasks, their ability to maintain strict pose alignment throughout the diffusion process is limited. In this paper, we propose a novel Iterative Homography Adjustment (IHA) scheme applied during the denoising process, which effectively addresses pose misalignment and ensures spatial consistency in the generated street-view images. Additionally, currently, available datasets for satellite-to-street-view generation are limited in their diversity of illumination and weather conditions, thereby restricting the generalizability of the generated outputs. To mitigate this, we introduce a text-guided illumination and weather-controlled sampling strategy that enables fine-grained control over the environmental factors. Extensive quantitative and qualitative evaluations demonstrate that our approach significantly improves pose accuracy and enhances the diversity and realism of generated street-view images, setting a new benchmark for satellite-to-street-view generation tasks.

Via

Access Paper or Ask Questions

Convergence of two-timescale gradient descent ascent dynamics: finite-dimensional and mean-field perspectives

Jan 29, 2025

Jing An, Jianfeng Lu

Abstract:The two-timescale gradient descent-ascent (GDA) is a canonical gradient algorithm designed to find Nash equilibria in min-max games. We analyze the two-timescale GDA by investigating the effects of learning rate ratios on convergence behavior in both finite-dimensional and mean-field settings. In particular, for finite-dimensional quadratic min-max games, we obtain long-time convergence in near quasi-static regimes through the hypocoercivity method. For mean-field GDA dynamics, we investigate convergence under a finite-scale ratio using a mixed synchronous-reflection coupling technique.

* v2: fixing some minor tex issues

Via

Access Paper or Ask Questions

A Unified Blockwise Measurement Design for Learning Quantum Channels and Lindbladians via Low-Rank Matrix Sensing

Jan 23, 2025

Quanjun Lang, Jianfeng Lu

Abstract:Quantum superoperator learning is a pivotal task in quantum information science, enabling accurate reconstruction of unknown quantum operations from measurement data. We propose a robust approach based on the matrix sensing techniques for quantum superoperator learning that extends beyond the positive semidefinite case, encompassing both quantum channels and Lindbladians. We first introduce a randomized measurement design using a near-optimal number of measurements. By leveraging the restricted isometry property (RIP), we provide theoretical guarantees for the identifiability and recovery of low-rank superoperators in the presence of noise. Additionally, we propose a blockwise measurement design that restricts the tomography to the sub-blocks, significantly enhancing performance while maintaining a comparable scale of measurements. We also provide a performance guarantee for this setup. Our approach employs alternating least squares (ALS) with acceleration for optimization in matrix sensing. Numerical experiments validate the efficiency and scalability of the proposed methods.

Via

Access Paper or Ask Questions

FedCross: Intertemporal Federated Learning Under Evolutionary Games

Dec 22, 2024

Jianfeng Lu, Ying Zhang, Riheng Jia, Shuqin Cao, Jing Liu, Hao Fu

Figure 1 for FedCross: Intertemporal Federated Learning Under Evolutionary Games

Figure 2 for FedCross: Intertemporal Federated Learning Under Evolutionary Games

Figure 3 for FedCross: Intertemporal Federated Learning Under Evolutionary Games

Figure 4 for FedCross: Intertemporal Federated Learning Under Evolutionary Games

Abstract:Federated Learning (FL) mitigates privacy leakage in decentralized machine learning by allowing multiple clients to train collaboratively locally. However, dynamic mobile networks with high mobility, intermittent connectivity, and bandwidth limitation severely hinder model updates to the cloud server. Although previous studies have typically addressed user mobility issue through task reassignment or predictive modeling, frequent migrations may result in high communication overhead. Overcoming this obstacle involves not only dealing with resource constraints, but also finding ways to mitigate the challenges posed by user migrations. We therefore propose an intertemporal incentive framework, FedCross, which ensures the continuity of FL tasks by migrating interrupted training tasks to feasible mobile devices. Specifically, FedCross comprises two distinct stages. In Stage 1, we address the task allocation problem across regions under resource constraints by employing a multi-objective migration algorithm to quantify the optimal task receivers. Moreover, we adopt evolutionary game theory to capture the dynamic decision-making of users, forecasting the evolution of user proportions across different regions to mitigate frequent migrations. In Stage 2, we utilize a procurement auction mechanism to allocate rewards among base stations, ensuring that those providing high-quality models receive optimal compensation. This approach incentivizes sustained user participation, thereby ensuring the overall feasibility of FedCross. Finally, experimental results validate the theoretical soundness of FedCross and demonstrate its significant reduction in communication overhead.

Via

Access Paper or Ask Questions

TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning

Dec 17, 2024

Gangqiang Hu, Jianfeng Lu, Jianmin Han, Shuqin Cao, Jing Liu, Hao Fu

Figure 1 for TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning

Figure 2 for TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning

Figure 3 for TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning

Abstract:Due to the sensitivity of data, federated learning (FL) is employed to enable distributed machine learning while safeguarding data privacy and accommodating the requirements of various devices. However, in the context of semi-decentralized federated learning (SD-FL), clients' communication and training states are dynamic. This variability arises from local training fluctuations, heterogeneous data distributions, and intermittent client participation. Most existing studies primarily focus on stable client states, neglecting the dynamic challenges present in real-world scenarios. To tackle this issue, we propose a trust-aware client scheduling mechanism (TRAIL) that assesses client states and contributions, enhancing model training efficiency through selective client participation. Our focus is on a semi-decentralized federated learning framework where edge servers and clients train a shared global model using unreliable intra-cluster model aggregation and inter-cluster model consensus. First, we develop an adaptive hidden semi-Markov model (AHSMM) to estimate clients' communication states and contributions. Next, we address a client-server association optimization problem to minimize global training loss. Using convergence analysis, we propose a greedy client scheduling algorithm. Finally, our experiments conducted on real-world datasets demonstrate that TRAIL outperforms state-of-the-art baselines, achieving an improvement of 8.7\% in test accuracy and a reduction of 15.3\% in training loss.

Via

Access Paper or Ask Questions

Towards characterizing the value of edge embeddings in Graph Neural Networks

Oct 13, 2024

Dhruv Rohatgi, Tanya Marwah, Zachary Chase Lipton, Jianfeng Lu, Ankur Moitra, Andrej Risteski

Abstract:Graph neural networks (GNNs) are the dominant approach to solving machine learning problems defined over graphs. Despite much theoretical and empirical work in recent years, our understanding of finer-grained aspects of architectural design for GNNs remains impoverished. In this paper, we consider the benefits of architectures that maintain and update edge embeddings. On the theoretical front, under a suitable computational abstraction for a layer in the model, as well as memory constraints on the embeddings, we show that there are natural tasks on graphical models for which architectures leveraging edge embeddings can be much shallower. Our techniques are inspired by results on time-space tradeoffs in theoretical computer science. Empirically, we show architectures that maintain edge embeddings almost always improve on their node-based counterparts -- frequently significantly so in topologies that have ``hub'' nodes.

* 25 pages, 2 figures

Via

Access Paper or Ask Questions

Posterior sampling via Langevin dynamics based on generative priors

Oct 02, 2024

Vishal Purohit, Matthew Repasky, Jianfeng Lu, Qiang Qiu, Yao Xie, Xiuyuan Cheng

Figure 1 for Posterior sampling via Langevin dynamics based on generative priors

Figure 2 for Posterior sampling via Langevin dynamics based on generative priors

Figure 3 for Posterior sampling via Langevin dynamics based on generative priors

Figure 4 for Posterior sampling via Langevin dynamics based on generative priors

Abstract:Posterior sampling in high-dimensional spaces using generative models holds significant promise for various applications, including but not limited to inverse problems and guided generation tasks. Despite many recent developments, generating diverse posterior samples remains a challenge, as existing methods require restarting the entire generative process for each new sample, making the procedure computationally expensive. In this work, we propose efficient posterior sampling by simulating Langevin dynamics in the noise space of a pre-trained generative model. By exploiting the mapping between the noise and data spaces which can be provided by distilled flows or consistency models, our method enables seamless exploration of the posterior without the need to re-run the full sampling chain, drastically reducing computational overhead. Theoretically, we prove a guarantee for the proposed noise-space Langevin dynamics to approximate the posterior, assuming that the generative model sufficiently approximates the prior distribution. Our framework is experimentally validated on image restoration tasks involving noisy linear and nonlinear forward operators applied to LSUN-Bedroom (256 x 256) and ImageNet (64 x 64) datasets. The results demonstrate that our approach generates high-fidelity samples with enhanced semantic diversity even under a limited number of function evaluations, offering superior efficiency and performance compared to existing diffusion-based posterior sampling techniques.

Via

Access Paper or Ask Questions