Abstract: Federated learning (FL) is now recognized as a key framework for communication-efficient collaborative learning. Most theoretical and empirical studies, however, assume that clients have access to pre-collected data sets, and scenarios where clients continuously collect data have received limited attention. In many real-world applications, particularly when data is generated by physical or biological processes, client data streams are often modeled by non-stationary Markov processes. Unlike in the standard i.i.d. sampling setting, the performance of FL with Markovian data streams remains poorly understood because of the statistical dependencies between client samples over time. In this paper, we investigate whether FL can still support collaborative learning with Markovian data streams. Specifically, we analyze the performance of Minibatch SGD, Local SGD, and a variant of Local SGD with momentum. The answer is affirmative under standard assumptions and smooth non-convex client objectives: the sample complexity is inversely proportional to the number of clients, and the communication complexity is comparable to that of the i.i.d. scenario. However, the sample complexity for Markovian data streams remains higher than for i.i.d. sampling.
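As a rough illustration of the setting described above, the following sketch runs Local SGD where each client draws its samples from its own Markov chain rather than from an i.i.d. source. It is a toy under stated assumptions and not the paper's setup: the chain is a stationary two-state chain (the abstract considers non-stationary processes), the objective is the convex quadratic 0.5*(w - x)^2 rather than a non-convex one, and all names (`markov_stream`, `local_sgd`, `p_stay`, and the default parameter values) are hypothetical.

```python
import random


def markov_stream(rng, p_stay=0.9, values=(0.0, 2.0)):
    """Yield samples from a two-state Markov chain (hypothetical toy data source).

    Consecutive samples are dependent: the chain keeps its current state with
    probability p_stay and switches otherwise.
    """
    state = rng.randrange(2)
    while True:
        yield values[state]
        if rng.random() > p_stay:
            state = 1 - state


def local_sgd(num_clients=16, rounds=200, local_steps=5, lr=0.02, seed=0):
    """Local SGD on the toy objective 0.5 * (w - x)^2 with Markovian samples."""
    rng = random.Random(seed)
    # One independent data stream per client.
    streams = [markov_stream(rng) for _ in range(num_clients)]
    w = 0.0  # shared model held by the server
    for _ in range(rounds):
        local_models = []
        for stream in streams:
            w_local = w  # each client starts the round from the shared model
            for _ in range(local_steps):
                x = next(stream)
                w_local -= lr * (w_local - x)  # gradient of 0.5 * (w - x)^2
            local_models.append(w_local)
        # Communication step: the server averages the local models.
        w = sum(local_models) / num_clients
    return w


if __name__ == "__main__":
    print(local_sgd())
```

In this toy the symmetric chain has a uniform stationary distribution over the two values, so the averaged model should settle near their mean of 1.0. Raising `p_stay` strengthens the temporal dependence and makes the estimate noisier for the same number of samples.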
Abstract: Grant Free Random Access (GFRA) is a popular protocol in the Internet of Things (IoT) for reducing control signaling. GFRA is a framed protocol in which each frame is split into two parts: a device identification part and a data transmission part, the latter of which can be viewed as a form of Frame Slotted ALOHA (FSA). A common assumption in FSA is device homogeneity; that is, the probability that a device seeks to transmit data in a particular frame is the same for all devices and independent of the other devices. Recent work has investigated the possibility of tuning the FSA protocol to the statistics of the network by changing the probability that a particular device accesses a particular slot. However, power control with a successive interference cancellation (SIC) receiver has not yet been considered as a way to further improve the performance of tuned FSA protocols. In this paper, we propose algorithms that jointly optimize the slot selection and the transmit power of the devices to minimize the outage of the devices in the network. We show via a simulation study that our algorithms can outperform baselines (including slotted ALOHA) in terms of the expected number of devices transmitting without outage and in terms of transmit power.
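To make the homogeneous FSA baseline mentioned above concrete, the following sketch simulates a frame in which every device is active with the same probability and picks a slot uniformly at random. It is only an illustration under simplifying assumptions: a transmission succeeds only if the device is alone in its slot (no SIC receiver, no power control, no channel model), which is not the outage model of the paper. The function names and default parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_fsa(num_devices=20, num_slots=10, p_active=0.3, frames=5000, seed=0):
    """Average number of collision-free devices per frame (Monte Carlo estimate)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(frames):
        # Each device is active with probability p_active and, if active,
        # chooses one of the slots uniformly at random.
        choices = [
            rng.randrange(num_slots)
            for _ in range(num_devices)
            if rng.random() < p_active
        ]
        counts = Counter(choices)
        # A device succeeds only if no other device chose the same slot.
        total += sum(1 for c in counts.values() if c == 1)
    return total / frames


def expected_successes(num_devices, num_slots, p_active):
    """Closed form for the same collision model: N * p * (1 - p/M)^(N-1)."""
    return num_devices * p_active * (1 - p_active / num_slots) ** (num_devices - 1)


if __name__ == "__main__":
    print(simulate_fsa(), expected_successes(20, 10, 0.3))
```

The closed form follows because a given device succeeds when it is active and each of the other N-1 devices independently avoids its slot, which happens with probability 1 - p/M per device. A tuned protocol of the kind described in the abstract would replace the uniform slot choice with device-specific slot probabilities.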
Abstract: This document serves as a technical report on the analysis of an on-demand transport dataset. Moreover, we show how the dataset can be used to develop a market formation algorithm based on machine learning. The data used in this work comes from Liftago, a Prague-based company that connects taxi drivers and customers through a smartphone app. The dataset is analysed from the machine-learning perspective: we give an overview of the available features as well as the results of feature ranking. We then propose the SImple Data-driven MArket Formation (SIDMAF) algorithm, which aims to improve relevance when connecting customers with drivers. We compare the heuristics currently used by Liftago with SIDMAF using two key performance indicators.