Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Carlee Joe-Wong

Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments

May 25, 2025

Jingyuan Liu, Zeyu Zhang, Xuchuang Wang, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili, Carlee Joe-Wong

Abstract:Contextual linear multi-armed bandits are a learning framework for making a sequence of decisions, e.g., advertising recommendations for a sequence of arriving users. Recent works have shown that clustering these users based on the similarity of their learned preferences can significantly accelerate the learning. However, prior work has primarily focused on the online setting, which requires continually collecting user data, ignoring the offline data widely available in many applications. To tackle these limitations, we study the offline clustering of bandits (Off-ClusBand) problem, which studies how to use the offline dataset to learn cluster properties and improve decision-making across multiple users. The key challenge in Off-ClusBand arises from data insufficiency for users: unlike the online case, in the offline case, we have a fixed, limited dataset to work from and thus must determine whether we have enough data to confidently cluster users together. To address this challenge, we propose two algorithms: Off-C$^2$LUB, which we analytically show performs well for arbitrary amounts of user data, and Off-CLUB, which is prone to bias when data is limited but, given sufficient data, matches a theoretical lower bound that we derive for the offline clustered MAB problem. We experimentally validate these results on both real and synthetic datasets.

Via

Access Paper or Ask Questions

TACO: Tackling Over-correction in Federated Learning with Tailored Adaptive Correction

Apr 24, 2025

Weijie Liu, Ziwei Zhan, Carlee Joe-Wong, Edith Ngai, Jingpu Duan, Deke Guo, Xu Chen, Xiaoxi Zhang

Abstract:Non-independent and identically distributed (Non-IID) data across edge clients have long posed significant challenges to federated learning (FL) training in edge computing environments. Prior works have proposed various methods to mitigate this statistical heterogeneity. While these works can achieve good theoretical performance, in this work we provide the first investigation into a hidden over-correction phenomenon brought by the uniform model correction coefficients across clients adopted by existing methods. Such over-correction could degrade model performance and even cause failures in model convergence. To address this, we propose TACO, a novel algorithm that addresses the non-IID nature of clients' data by implementing fine-grained, client-specific gradient correction and model aggregation, steering local models towards a more accurate global optimum. Moreover, we verify that leading FL algorithms generally have better model accuracy in terms of communication rounds rather than wall-clock time, resulting from their extra computation overhead imposed on clients. To enhance the training efficiency, TACO deploys a lightweight model correction and tailored aggregation approach that requires minimum computation overhead and no extra information beyond the synchronized model parameters. To validate TACO's effectiveness, we present the first FL convergence analysis that reveals the root cause of over-correction. Extensive experiments across various datasets confirm TACO's superior and stable performance in practice.

* 11 pages, 7 figures, accepted by ICDCS 2025

Via

Access Paper or Ask Questions

Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training

Apr 13, 2025

Yi Hu, Jinhang Zuo, Eddie Zhang, Bob Iannucci, Carlee Joe-Wong

Abstract:Recent advancements in machine learning (ML) have enabled its deployment on resource-constrained edge devices, fostering innovative applications such as intelligent environmental sensing. However, these devices, particularly microcontrollers (MCUs), face substantial challenges due to limited memory, computing capabilities, and the absence of dedicated floating-point units (FPUs). These constraints hinder the deployment of complex ML models, especially those requiring lifelong learning capabilities. To address these challenges, we propose Tin-Tin, an integer-based on-device training framework designed specifically for low-power MCUs. Tin-Tin introduces novel integer rescaling techniques to efficiently manage dynamic ranges and facilitate efficient weight updates using integer data types. Unlike existing methods optimized for devices with FPUs, GPUs, or FPGAs, Tin-Tin addresses the unique demands of tiny MCUs, prioritizing energy efficiency and optimized memory utilization. We validate the effectiveness of Tin-Tin through end-to-end application examples on real-world tiny devices, demonstrating its potential to support energy-efficient and sustainable ML applications on edge platforms.

Via

Access Paper or Ask Questions

Towards Optimal Heterogeneous Client Sampling in Multi-Model Federated Learning

Apr 08, 2025

Haoran Zhang, Zejun Gong, Zekai Li, Marie Siew, Carlee Joe-Wong, Rachid El-Azouzi

Abstract:Federated learning (FL) allows edge devices to collaboratively train models without sharing local data. As FL gains popularity, clients may need to train multiple unrelated FL models, but communication constraints limit their ability to train all models simultaneously. While clients could train FL models sequentially, opportunistically having FL clients concurrently train different models -- termed multi-model federated learning (MMFL) -- can reduce the overall training time. Prior work uses simple client-to-model assignments that do not optimize the contribution of each client to each model over the course of its training. Prior work on single-model FL shows that intelligent client selection can greatly accelerate convergence, but na\"ive extensions to MMFL can violate heterogeneous resource constraints at both the server and the clients. In this work, we develop a novel convergence analysis of MMFL with arbitrary client sampling methods, theoretically demonstrating the strengths and limitations of previous well-established gradient-based methods. Motivated by this analysis, we propose MMFL-LVR, a loss-based sampling method that minimizes training variance while explicitly respecting communication limits at the server and reducing computational costs at the clients. We extend this to MMFL-StaleVR, which incorporates stale updates for improved efficiency and stability, and MMFL-StaleVRE, a lightweight variant suitable for low-overhead deployment. Experiments show our methods improve average accuracy by up to 19.1% over random sampling, with only a 5.4% gap from the theoretical optimum (full client participation).

* 10 pages, solved a hyperlink problem

Via

Access Paper or Ask Questions

Interference-Aware Edge Runtime Prediction with Conformal Matrix Completion

Mar 09, 2025

Tianshu Huang, Arjun Ramesh, Emily Ruppel, Nuno Pereira, Anthony Rowe, Carlee Joe-Wong

Abstract:Accurately estimating workload runtime is a longstanding goal in computer systems, and plays a key role in efficient resource provisioning, latency minimization, and various other system management tasks. Runtime prediction is particularly important for managing increasingly complex distributed systems in which more sophisticated processing is pushed to the edge in search of better latency. Previous approaches for runtime prediction in edge systems suffer from poor data efficiency or require intensive instrumentation; these challenges are compounded in heterogeneous edge computing environments, where historical runtime data may be sparsely available and instrumentation is often challenging. Moreover, edge computing environments often feature multi-tenancy due to limited resources at the network edge, potentially leading to interference between workloads and further complicating the runtime prediction problem. Drawing from insights across machine learning and computer systems, we design a matrix factorization-inspired method that generates accurate interference-aware predictions with tight provably-guaranteed uncertainty bounds. We validate our method on a novel WebAssembly runtime dataset collected from 24 unique devices, achieving a prediction error of 5.2% -- 2x better than a naive application of existing methods.

* To appear at MLSys 2025

Via

Access Paper or Ask Questions

LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning

Feb 08, 2025

Hanqing Yang, Jingdi Chen, Marie Siew, Tania Lorido-Botran, Carlee Joe-Wong

Abstract:Developing intelligent agents for long-term cooperation in dynamic open-world scenarios is a major challenge in multi-agent systems. Traditional Multi-agent Reinforcement Learning (MARL) frameworks like centralized training decentralized execution (CTDE) struggle with scalability and flexibility. They require centralized long-term planning, which is difficult without custom reward functions, and face challenges in processing multi-modal data. CTDE approaches also assume fixed cooperation strategies, making them impractical in dynamic environments where agents need to adapt and plan independently. To address decentralized multi-agent cooperation, we propose Decentralized Adaptive Knowledge Graph Memory and Structured Communication System (DAMCS) in a novel Multi-agent Crafter environment. Our generative agents, powered by Large Language Models (LLMs), are more scalable than traditional MARL agents by leveraging external knowledge and language for long-term planning and reasoning. Instead of fully sharing information from all past experiences, DAMCS introduces a multi-modal memory system organized as a hierarchical knowledge graph and a structured communication protocol to optimize agent cooperation. This allows agents to reason from past interactions and share relevant information efficiently. Experiments on novel multi-agent open-world tasks show that DAMCS outperforms both MARL and LLM baselines in task efficiency and collaboration. Compared to single-agent scenarios, the two-agent scenario achieves the same goal with 63% fewer steps, and the six-agent scenario with 74% fewer steps, highlighting the importance of adaptive memory and structured communication in achieving long-term goals. We publicly release our project at: https://happyeureka.github.io/damcs.

Via

Access Paper or Ask Questions

Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy

Jan 17, 2025

Ishank Juneja, Carlee Joe-Wong, Osman Yağan

Abstract:Multi-armed bandits (MAB) are commonly used in sequential online decision-making when the reward of each decision is an unknown random variable. In practice however, the typical goal of maximizing total reward may be less important than minimizing the total cost of the decisions taken, subject to a reward constraint. For example, we may seek to make decisions that have at least the reward of a reference ``default'' decision, with as low a cost as possible. This problem was recently introduced in the Multi-Armed Bandits with Cost Subsidy (MAB-CS) framework. MAB-CS is broadly applicable to problem domains where a primary metric (cost) is constrained by a secondary metric (reward), and the rewards are unknown. In our work, we address variants of MAB-CS including ones with reward constrained by the reward of a known reference arm or by the subsidized best reward. We introduce the Pairwise-Elimination (PE) algorithm for the known reference arm variant and generalize PE to PE-CS for the subsidized best reward variant. Our instance-dependent analysis of PE and PE-CS reveals that both algorithms have an order-wise logarithmic upper bound on Cost and Quality Regret, making our policies the first with such a guarantee. Moreover, by comparing our upper and lower bound results we establish that PE is order-optimal for all known reference arm problem instances. Finally, experiments are conducted using the MovieLens 25M and Goodreads datasets for both PE and PE-CS revealing the effectiveness of PE and the superior balance between performance and reliability offered by PE-CS compared to baselines from the literature.

Via

Access Paper or Ask Questions

FedTLU: Federated Learning with Targeted Layer Updates

Dec 23, 2024

Jong-Ik Park, Carlee Joe-Wong

Abstract:Federated learning (FL) addresses privacy concerns in language modeling by enabling multiple clients to contribute to training language models. However, non-IID (identically and independently distributed) data across clients often limits FL's performance. This issue is especially challenging during model fine-tuning, as noise due to variations in clients' data distributions can harm model convergence near the optimum. This paper proposes a targeted layer update strategy for fine-tuning in FL. Instead of randomly updating layers of the language model, as often done in practice, we use a scoring mechanism to identify and update the most critical layers, avoiding excessively noisy or even poisoned updates by freezing the parameters in other layers. We show in extensive experiments that our method improves convergence and performance in non-IID settings, offering a more efficient approach to fine-tuning federated language models.

Via

Access Paper or Ask Questions

FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks

Dec 20, 2024

Siddharth Ambekar, Yuhang Yao, Ryan Li, Carlee Joe-Wong

Figure 1 for FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks

Figure 2 for FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks

Figure 3 for FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks

Figure 4 for FedGAT: A Privacy-Preserving Federated Approximation Algorithm for Graph Attention Networks

Abstract:Federated training methods have gained popularity for graph learning with applications including friendship graphs of social media sites and customer-merchant interaction graphs of huge online marketplaces. However, privacy regulations often require locally generated data to be stored on local clients. The graph is then naturally partitioned across clients, with no client permitted access to information stored on another. Cross-client edges arise naturally in such cases and present an interesting challenge to federated training methods, as training a graph model at one client requires feature information of nodes on the other end of cross-client edges. Attempting to retain such edges often incurs significant communication overhead, and dropping them altogether reduces model performance. In simpler models such as Graph Convolutional Networks, this can be fixed by communicating a limited amount of feature information across clients before training, but GATs (Graph Attention Networks) require additional information that cannot be pre-communicated, as it changes from training round to round. We introduce the Federated Graph Attention Network (FedGAT) algorithm for semi-supervised node classification, which approximates the behavior of GATs with provable bounds on the approximation error. FedGAT requires only one pre-training communication round, significantly reducing the communication overhead for federated GAT training. We then analyze the error in the approximation and examine the communication overhead and computational complexity of the algorithm. Experiments show that FedGAT achieves nearly the same accuracy as a GAT model in a centralised setting, and its performance is robust to the number of clients as well as data distribution.

Via

Access Paper or Ask Questions

HumekaFL: Automated Detection of Neonatal Asphyxia Using Federated Learning

Dec 02, 2024

Pamely Zantou, Blessed Guda, Bereket Retta, Gladys Inabeza, Carlee Joe-Wong, Assane Gueye

Abstract:Birth Apshyxia (BA) is a severe condition characterized by insufficient supply of oxygen to a newborn during the delivery. BA is one of the primary causes of neonatal death in the world. Although there has been a decline in neonatal deaths over the past two decades, the developing world, particularly sub-Saharan Africa, continues to experience the highest under-five (<5) mortality rates. While evidence-based methods are commonly used to detect BA in African healthcare settings, they can be subject to physician errors or delays in diagnosis, preventing timely interventions. Centralized Machine Learning (ML) methods demonstrated good performance in early detection of BA but require sensitive health data to leave their premises before training, which does not guarantee privacy and security. Healthcare institutions are therefore reluctant to adopt such solutions in Africa. To address this challenge, we suggest a federated learning (FL)-based software architecture, a distributed learning method that prioritizes privacy and security by design. We have developed a user-friendly and cost-effective mobile application embedding the FL pipeline for early detection of BA. Our Federated SVM model outperformed centralized SVM pipelines and Neural Networks (NN)-based methods in the existing literature

* Poster at ACM compass 2024

Via

Access Paper or Ask Questions