Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Christopher Brinton

Parameter Tracking in Federated Learning with Adaptive Optimization

Feb 04, 2025

Evan Chen. Jianing Zhang, Shiqiang Wang, Chaoyue Liu, Christopher Brinton

Abstract:In Federated Learning (FL), model training performance is strongly impacted by data heterogeneity across clients. Gradient Tracking (GT) has recently emerged as a solution which mitigates this issue by introducing correction terms to local model updates. To date, GT has only been considered under Stochastic Gradient Descent (SGD)-based model training, while modern FL frameworks increasingly employ adaptive optimizers for improved convergence. In this work, we generalize the GT framework to a more flexible Parameter Tracking (PT) paradigm and propose two novel adaptive optimization algorithms, {\tt FAdamET} and {\tt FAdamGT}, that integrate PT into Adam-based FL. We provide a rigorous convergence analysis of these algorithms under non-convex settings. Our experimental results demonstrate that both proposed algorithms consistently outperform existing methods when evaluating total communication cost and total computation cost across varying levels of data heterogeneity, showing the effectiveness of correcting first-order information in federated adaptive optimization.

Via

Access Paper or Ask Questions

Deep Learning Aided Broadcast Codes with Feedback

Oct 22, 2024

Jacqueline Malayter, Christopher Brinton, David Love

Figure 1 for Deep Learning Aided Broadcast Codes with Feedback

Figure 2 for Deep Learning Aided Broadcast Codes with Feedback

Figure 3 for Deep Learning Aided Broadcast Codes with Feedback

Figure 4 for Deep Learning Aided Broadcast Codes with Feedback

Abstract:Deep learning aided codes have been shown to improve code performance in feedback codes in high noise regimes due to the ability to leverage non-linearity in code design. In the additive white Gaussian broadcast channel (AWGN-BC), the addition of feedback may allow the capacity region to extend far beyond the capacity region of the channel without feedback, enabling higher data rates. On the other hand, there are limited deep-learning aided implementations of broadcast codes. In this work, we extend two classes of deep-learning assisted feedback codes to the AWGN-BC channel; the first being an RNN-based architecture and the second being a lightweight MLP-based architecture. Both codes are trained using a global model, and then they are trained using a more realistic vertical federated learning based framework. We first show that in most cases, using an AWGN-BC code outperforms a linear-based concatenated scheme. Second, we show in some regimes, the lightweight architecture far exceeds the RNN-based code, but in especially unreliable conditions, the RNN-based code dominates. The results show the promise of deep-learning aided broadcast codes in unreliable channels, and future research directions are discussed.

Via

Access Paper or Ask Questions

SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Oct 07, 2024

Daoan Zhang, Guangchen Lan, Dong-Jun Han, Wenlin Yao, Xiaoman Pan, Hongming Zhang, Mingxiao Li, Pengcheng Chen, Yu Dong, Christopher Brinton(+1 more)

Figure 1 for SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Figure 2 for SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Figure 3 for SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Figure 4 for SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Abstract:Reinforcement learning from human feedback (RLHF) methods are emerging as a way to fine-tune diffusion models (DMs) for visual generation. However, commonly used on-policy strategies are limited by the generalization capability of the reward model, while off-policy approaches require large amounts of difficult-to-obtain paired human-annotated data, particularly in visual generation tasks. To address the limitations of both on- and off-policy RLHF, we propose a preference optimization method that aligns DMs with preferences without relying on reward models or paired human-annotated data. Specifically, we introduce a Semi-Policy Preference Optimization (SePPO) method. SePPO leverages previous checkpoints as reference models while using them to generate on-policy reference samples, which replace "losing images" in preference pairs. This approach allows us to optimize using only off-policy "winning images." Furthermore, we design a strategy for reference model selection that expands the exploration in the policy space. Notably, we do not simply treat reference samples as negative examples for learning. Instead, we design an anchor-based criterion to assess whether the reference samples are likely to be winning or losing images, allowing the model to selectively learn from the generated reference samples. This approach mitigates performance degradation caused by the uncertainty in reference sample quality. We validate SePPO across both text-to-image and text-to-video benchmarks. SePPO surpasses all previous approaches on the text-to-image benchmarks and also demonstrates outstanding performance on the text-to-video benchmarks. Code will be released in https://github.com/DwanZhang-AI/SePPO.

Via

Access Paper or Ask Questions

Unlocking the Potential of Model Calibration in Federated Learning

Sep 07, 2024

Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher Brinton

Figure 1 for Unlocking the Potential of Model Calibration in Federated Learning

Figure 2 for Unlocking the Potential of Model Calibration in Federated Learning

Figure 3 for Unlocking the Potential of Model Calibration in Federated Learning

Figure 4 for Unlocking the Potential of Model Calibration in Federated Learning

Abstract:Over the past several years, various federated learning (FL) methodologies have been developed to improve model accuracy, a primary performance metric in machine learning. However, to utilize FL in practical decision-making scenarios, beyond considering accuracy, the trained model must also have a reliable confidence in each of its predictions, an aspect that has been largely overlooked in existing FL research. Motivated by this gap, we propose Non-Uniform Calibration for Federated Learning (NUCFL), a generic framework that integrates FL with the concept of model calibration. The inherent data heterogeneity in FL environments makes model calibration particularly difficult, as it must ensure reliability across diverse data distributions and client conditions. Our NUCFL addresses this challenge by dynamically adjusting the model calibration objectives based on statistical relationships between each client's local model and the global model in FL. In particular, NUCFL assesses the similarity between local and global model relationships, and controls the penalty term for the calibration loss during client-side local training. By doing so, NUCFL effectively aligns calibration needs for the global model in heterogeneous FL settings while not sacrificing accuracy. Extensive experiments show that NUCFL offers flexibility and effectiveness across various FL algorithms, enhancing accuracy as well as model calibration.

Via

Access Paper or Ask Questions

Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Jun 27, 2024

Byunghyun Lee, Anindya Bijoy Das, David Love, Christopher Brinton, James Krogmeier

Figure 1 for Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Figure 2 for Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Figure 3 for Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Figure 4 for Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Abstract:Dual-function radar-communication (DFRC) is a key enabler of location-based services for next-generation communication systems. In this paper, we investigate the problem of designing constant modulus waveforms for DFRC systems. For high-precision radar sensing, we consider joint optimization of the correlation properties and spatial beam pattern. For communication, we employ constructive interference-based block-level precoding (CI-BLP) to leverage distortion induced by multiuser multiple-input multiple-output (MU-MIMO) and radar transmission on a block level. We propose two solution algorithms based on the alternating direction method of multipliers (ADMM) and majorization-minimization (MM) principles, which are effective for small and large block sizes, respectively. The proposed ADMM-based solution decomposes the nonconvex formulated problem into multiple tractable subproblems, each of which admits a closed-form solution. To accelerate convergence of the MM-based solution, we propose an improved majorizing function that leverages a novel diagonal matrix structure. After majorization, we decompose the approximated problem into independent subproblems for parallelization, mitigating the complexity that increases with block size. We then evaluate the performance of the proposed algorithms through a series of numerical experiments. Simulation results demonstrate that the proposed methods can substantially enhance spatial/temporal sidelobe suppression through block-level optimization.

* arXiv admin note: text overlap with arXiv:2310.10804

Via

Access Paper or Ask Questions

Differential Privacy in Hierarchical Federated Learning: A Formal Analysis and Evaluation

Jan 21, 2024

Frank Po-Chen Lin, Christopher Brinton

Abstract:While federated learning (FL) eliminates the transmission of raw data over a network, it is still vulnerable to privacy breaches from the communicated model parameters. In this work, we formalize Differentially Private Hierarchical Federated Learning (DP-HFL), a DP-enhanced FL methodology that seeks to improve the privacy-utility tradeoff inherent in FL. Building upon recent proposals for Hierarchical Differential Privacy (HDP), one of the key concepts of DP-HFL is adapting DP noise injection at different layers of an established FL hierarchy -- edge devices, edge servers, and cloud servers -- according to the trust models within particular subnetworks. We conduct a comprehensive analysis of the convergence behavior of DP-HFL, revealing conditions on parameter tuning under which the model training process converges sublinearly to a stationarity gap, with this gap depending on the network hierarchy, trust model, and target privacy level. Subsequent numerical evaluations demonstrate that DP-HFL obtains substantial improvements in convergence speed over baselines for different privacy budgets, and validate the impact of network configuration on training.

Via

Access Paper or Ask Questions

Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy

May 09, 2023

Andrew Katz, Siqing Wei, Gaurav Nanda, Christopher Brinton, Matthew Ohland

Abstract:Teamwork is a critical component of many academic and professional settings. In those contexts, feedback between team members is an important element to facilitate successful and sustainable teamwork. However, in the classroom, as the number of teams and team members and frequency of evaluation increase, the volume of comments can become overwhelming for an instructor to read and track, making it difficult to identify patterns and areas for student improvement. To address this challenge, we explored the use of generative AI models, specifically ChatGPT, to analyze student comments in team based learning contexts. Our study aimed to evaluate ChatGPT's ability to accurately identify topics in student comments based on an existing framework consisting of positive and negative comments. Our results suggest that ChatGPT can achieve over 90\% accuracy in labeling student comments, providing a potentially valuable tool for analyzing feedback in team projects. This study contributes to the growing body of research on the use of AI models in educational contexts and highlights the potential of ChatGPT for facilitating analysis of student comments.

* 22 pages, 7 tables, 1 figure

Via

Access Paper or Ask Questions

Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Apr 25, 2023

Junghoon Kim, Taejoon Kim, David Love, Christopher Brinton

Figure 1 for Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Figure 2 for Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Figure 3 for Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Figure 4 for Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Abstract:The design of codes for feedback-enabled communications has been a long-standing open problem. Recent research on non-linear, deep learning-based coding schemes have demonstrated significant improvements in communication reliability over linear codes, but are still vulnerable to the presence of forward and feedback noise over the channel. In this paper, we develop a new family of non-linear feedback codes that greatly enhance robustness to channel noise. Our autoencoder-based architecture is designed to learn codes based on consecutive blocks of bits, which obtains de-noising advantages over bit-by-bit processing to help overcome the physical separation between the encoder and decoder over a noisy channel. Moreover, we develop a power control layer at the encoder to explicitly incorporate hardware constraints into the learning optimization, and prove that the resulting average power constraint is satisfied asymptotically. Numerical experiments demonstrate that our scheme outperforms state-of-the-art feedback codes by wide margins over practical forward and feedback noise regimes, and provide information-theoretic insights on the behavior of our non-linear codes. Moreover, we observe that, in a long blocklength regime, canonical error correction codes are still preferable to feedback codes when the feedback noise becomes high.

* To appear in International Conference on Machine Learning (ICML) 2023

Via

Access Paper or Ask Questions

Delay-Aware Hierarchical Federated Learning

Mar 23, 2023

Frank Po-Chen Lin, Seyyedali Hosseinalipour, Christopher Brinton, Nicolò Michelusi

Figure 1 for Delay-Aware Hierarchical Federated Learning

Figure 2 for Delay-Aware Hierarchical Federated Learning

Figure 3 for Delay-Aware Hierarchical Federated Learning

Figure 4 for Delay-Aware Hierarchical Federated Learning

Abstract:Federated learning has gained popularity as a means of training models distributed across the wireless edge. The paper introduces delay-aware federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training by addressing communication delays between edge and cloud. DFL employs multiple stochastic gradient descent iterations on device datasets during each global aggregation interval and intermittently aggregates model parameters through edge servers in local subnetworks. The cloud server synchronizes the local models with the global deployed model computed via a local-global combiner at global synchronization. The convergence behavior of DFL is theoretically investigated under a generalized data heterogeneity metric. A set of conditions is obtained to achieve the sub-linear convergence rate of O(1/k). Based on these findings, an adaptive control algorithm is developed for DFL, implementing policies to mitigate energy consumption and edge-to-cloud communication latency while aiming for a sublinear convergence rate. Numerical evaluations show DFL's superior performance in terms of faster global model convergence, reduced resource consumption, and robustness against communication delays compared to existing FL algorithms. In summary, this proposed method offers improved efficiency and satisfactory results when dealing with both convex and non-convex loss functions.

* A condensed version of this paper was presented at IEEE Globecom 2020

Via

Access Paper or Ask Questions

Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Dec 05, 2022

Yun-Wei Chu, Seyyedali Hosseinalipour, Elizabeth Tenorio, Laura Cruz, Kerrie Douglas, Andrew Lan, Christopher Brinton

Figure 1 for Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Figure 2 for Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Figure 3 for Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Figure 4 for Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Abstract:Traditional learning-based approaches to student modeling (e.g., predicting grades based on measured activities) generalize poorly to underrepresented/minority student groups due to biases in data availability. In this paper, we propose a Multi-Layer Personalized Federated Learning (MLPFL) methodology which optimizes inference accuracy over different layers of student grouping criteria, such as by course and by demographic subgroups within each course. In our approach, personalized models for individual student subgroups are derived from a global model, which is trained in a distributed fashion via meta-gradient updates that account for subgroup heterogeneity while preserving modeling commonalities that exist across the full dataset. To evaluate our methodology, we consider case studies of two popular downstream student modeling tasks, knowledge tracing and outcome prediction, which leverage multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums) in model training. Experiments on three real-world datasets from online courses demonstrate that our approach obtains substantial improvements over existing student modeling baselines in terms of increasing the average and decreasing the variance of prediction quality across different student subgroups. Visual analysis of the resulting students' knowledge state embeddings confirm that our personalization methodology extracts activity patterns which cluster into different student subgroups, consistent with the performance enhancements we obtain over the baselines.

* arXiv admin note: substantial text overlap with arXiv:2208.01182

Via

Access Paper or Ask Questions