Abstract:In class-incremental learning (class-IL), models must classify all previously seen classes at test time without task-IDs, leading to task confusion. Despite being a key challenge, task confusion lacks a theoretical understanding. We present a novel mathematical framework for class-IL and prove the Infeasibility Theorem, showing optimal class-IL is impossible with discriminative modeling due to task confusion. However, we establish the Feasibility Theorem, demonstrating that generative modeling can achieve optimal class-IL by overcoming task confusion. We then assess popular class-IL strategies, including regularization, bias-correction, replay, and generative classifier, using our framework. Our analysis suggests that adopting generative modeling, either for generative replay or direct classification (generative classifier), is essential for optimal class-IL.
Abstract:The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team. These challenges aim to advance research across multiple themes in football, including broadcast video understanding, field understanding, and player understanding. This year, the challenges encompass four vision-based tasks. (1) Ball Action Spotting, focusing on precisely localizing when and which soccer actions related to the ball occur, (2) Dense Video Captioning, focusing on describing the broadcast with natural language and anchored timestamps, (3) Multi-View Foul Recognition, a novel task focusing on analyzing multiple viewpoints of a potential foul incident to classify whether a foul occurred and assess its severity, (4) Game State Reconstruction, another novel task focusing on reconstructing the game state from broadcast videos onto a 2D top-view map of the field. Detailed information about the tasks, challenges, and leaderboards can be found at https://www.soccer-net.org, with baselines and development kits available at https://github.com/SoccerNet.
Abstract:In conventional multi-user multiple-input multiple-output (MU-MIMO) systems with frequency division duplexing (FDD), channel acquisition and precoder optimization processes have been designed separately although they are highly coupled. This paper studies an end-to-end design of downlink MU-MIMO systems which include pilot sequences, limited feedback, and precoding. To address this problem, we propose a novel deep learning (DL) framework which jointly optimizes the feedback information generation at users and the precoder design at a base station (BS). Each procedure in the MU-MIMO systems is replaced by intelligently designed multiple deep neural networks (DNN) units. At the BS, a neural network generates pilot sequences and helps the users obtain accurate channel state information. At each user, the channel feedback operation is carried out in a distributed manner by an individual user DNN. Then, another BS DNN collects feedback information from the users and determines the MIMO precoding matrices. A joint training algorithm is proposed to optimize all DNN units in an end-to-end manner. In addition, a training strategy which can avoid retraining for different network sizes for a scalable design is proposed. Numerical results demonstrate the effectiveness of the proposed DL framework compared to classical optimization techniques and other conventional DNN schemes.
Abstract:Federated Learning (FL) has recently received a lot of attention for large-scale privacy-preserving machine learning. However, high communication overheads due to frequent gradient transmissions decelerate FL. To mitigate the communication overheads, two main techniques have been studied: (i) local update of weights characterizing the trade-off between communication and computation and (ii) gradient compression characterizing the trade-off between communication and precision. To the best of our knowledge, studying and balancing those two trade-offs jointly and dynamically while considering their impacts on convergence has remained unresolved even though it promises significantly faster FL. In this paper, we first formulate our problem to minimize learning error with respect to two variables: local update coefficients and sparsity budgets of gradient compression who characterize trade-offs between communication and computation/precision, respectively. We then derive an upper bound of the learning error in a given wall-clock time considering the interdependency between the two variables. Based on this theoretical analysis, we propose an enhanced FL scheme, namely Fast FL (FFL), that jointly and dynamically adjusts the two variables to minimize the learning error. We demonstrate that FFL consistently achieves higher accuracies faster than similar schemes existing in the literature.