Abstract:Deep Learning (DL) is increasingly being integrated into Web applications through a method known as "in-browser inference", where the DL processes occur directly within Web browsers. However, the actual performance of this method and its effect on quality of experience (QoE) are not well understood. This gap in knowledge necessitates new forms of QoE measurement, going beyond traditional metrics such as page load time. To address this, we conducted the first extensive performance evaluation of in-browser inference. We introduced new metrics for this purpose: responsiveness, smoothness, and inference accuracy. Our thorough study included 9 widely-used DL models and tested them across 50 popular PC Web browsers. The findings show a significant latency issue with in-browser inference: it is on average 16.9 times slower on CPU and 4.9 times slower on GPU than native inference methods. Several factors contribute to this latency, including underused hardware instruction sets, inherent delays in the runtime environment, resource competition within the browser, and inefficiencies in software libraries and GPU abstractions. Moreover, in-browser inference demands a lot of memory, sometimes up to 334.6 times more than the size of the DL models themselves. This excessive memory usage is partly due to suboptimal memory management. Additionally, we noticed that in-browser inference increases the time it takes for graphical user interface (GUI) components to load in Web browsers by a significant 67.2\%, which severely impacts the overall QoE for users of Web applications that depend on this technology.
Abstract:Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment. However, the substantial advancements in versatility and performance these models offer come at a significant cost in terms of hardware resources. To support the growth of these large models in a scalable and environmentally sustainable way, there has been a considerable focus on developing resource-efficient strategies. This survey delves into the critical importance of such research, examining both algorithmic and systemic aspects. It offers a comprehensive analysis and valuable insights gleaned from existing literature, encompassing a broad array of topics from cutting-edge model architectures and training/serving algorithms to practical system designs and implementations. The goal of this survey is to provide an overarching understanding of how current approaches are tackling the resource challenges posed by large foundation models and to potentially inspire future breakthroughs in this field.
Abstract:Based on the signals received across its antennas, a multi-antenna base station (BS) can apply the classic multiple signal classification (MUSIC) algorithm for estimating the angles of arrival (AOAs) of its incident signals. This method can be leveraged to localize the users if their line-of-sight (LOS) paths to the BS are available. In this paper, we consider a more challenging AOA estimation setup in the intelligent reflecting surface (IRS) assisted integrated sensing and communication (ISAC) system, where LOS paths do not exist between the BS and the users, while the users' signals can be transmitted to the BS merely via their LOS paths to the IRS as well as the LOS path from the IRS to the BS. Specifically, we treat the IRS as the anchor and are interested in estimating the AOAs of the incident signals from the users to the IRS. Note that we have to achieve the above goal based on the signals received by the BS, because the passive IRS cannot process its received signals. However, the signals received across different antennas of the BS only contain AOA information of its incident signals via the LOS path from the IRS to the BS. To tackle this challenge arising from the spatial-domain received signals, we propose an innovative approach to create temporal-domain multi-dimension received signals for estimating the AOAs of the paths from the users to the IRS. Specifically, via a proper design of the user message pattern and the IRS reflecting pattern, we manage to show that our designed temporal-domain multi-dimension signals can, surprisingly, be expressed as a function of the virtual steering vectors of the IRS towards the users. This result implies that the classic MUSIC algorithm can be applied to our designed temporal-domain multi-dimension signals for accurately estimating the AOAs of the signals from the users to the IRS.
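For readers unfamiliar with the classic MUSIC step that the abstract above builds on, the following is a minimal sketch of textbook spatial-domain MUSIC for a uniform linear array. It does not reproduce the paper's temporal-domain signal design or its IRS reflecting pattern; the half-wavelength spacing, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def steering_vector(theta_rad, num_antennas, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths, assumed 0.5)."""
    n = np.arange(num_antennas)
    return np.exp(-2j * np.pi * spacing * n * np.sin(theta_rad))


def music_spectrum(snapshots, num_sources, grid_rad, spacing=0.5):
    """MUSIC pseudo-spectrum from an (antennas x snapshots) complex matrix."""
    num_antennas, num_snapshots = snapshots.shape
    covariance = snapshots @ snapshots.conj().T / num_snapshots
    # eigh returns eigenvalues in ascending order, so the leading columns
    # of the eigenvector matrix span the noise subspace.
    _, eigvecs = np.linalg.eigh(covariance)
    noise_subspace = eigvecs[:, : num_antennas - num_sources]
    spectrum = np.empty(len(grid_rad))
    for i, theta in enumerate(grid_rad):
        a = steering_vector(theta, num_antennas, spacing)
        projection = noise_subspace.conj().T @ a
        spectrum[i] = 1.0 / np.real(np.vdot(projection, projection))
    return spectrum


def estimate_aoas(snapshots, num_sources, grid_rad, spacing=0.5):
    """Return the grid angles of the num_sources largest local maxima."""
    p = music_spectrum(snapshots, num_sources, grid_rad, spacing)
    peaks = [i for i in range(1, len(p) - 1) if p[i] > p[i - 1] and p[i] >= p[i + 1]]
    peaks.sort(key=lambda i: p[i], reverse=True)
    return np.sort(grid_rad[peaks[:num_sources]])


def simulate_snapshots(aoas_rad, num_antennas=8, num_snapshots=200, noise_std=0.1, seed=0):
    """Hypothetical test data: uncorrelated unit-power sources plus white noise."""
    rng = np.random.default_rng(seed)
    k = len(aoas_rad)
    A = np.stack([steering_vector(t, num_antennas) for t in aoas_rad], axis=1)
    S = (rng.standard_normal((k, num_snapshots))
         + 1j * rng.standard_normal((k, num_snapshots))) / np.sqrt(2)
    W = noise_std * (rng.standard_normal((num_antennas, num_snapshots))
                     + 1j * rng.standard_normal((num_antennas, num_snapshots))) / np.sqrt(2)
    return A @ S + W
```

Calling `estimate_aoas(simulate_snapshots(np.deg2rad([-20.0, 30.0])), 2, grid)` with a grid such as `np.deg2rad(np.arange(-90.0, 90.5, 0.5))` returns the two grid angles nearest the pseudo-spectrum peaks. In the setting of the abstract, the same subspace step would presumably be applied to the designed temporal-domain signals rather than to antenna-domain snapshots.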
Abstract:In the future 6G integrated sensing and communication (ISAC) cellular systems, networked sensing is a promising technique that can leverage the cooperation among the base stations (BSs) to perform high-resolution localization. However, a dense deployment of BSs to fully reap the networked sensing gain is not a cost-efficient solution in practice. Motivated by the advance in the intelligent reflecting surface (IRS) technology for 6G communication, this paper examines the feasibility of deploying the low-cost IRSs to enhance the anchor density for networked sensing. Specifically, we propose a novel heterogeneous networked sensing architecture, which consists of both the active anchors, i.e., the BSs, and the passive anchors, i.e., the IRSs. Under this framework, the BSs emit the orthogonal frequency division multiplexing (OFDM) communication signals in the downlink for localizing the targets based on their echoes reflected via/not via the IRSs. However, there are two challenges for using passive anchors in localization. First, it is impossible to utilize the round-trip signal between a passive IRS and a passive target for estimating their distance. Second, before localizing a target, we do not know which IRS is closest to it and serves as its anchor. In this paper, we show that the distance between a target and its associated IRS can be indirectly estimated based on the lengths of the BS-target-BS path and the BS-target-IRS-BS path. Moreover, we propose an efficient data association method to match each target to its associated IRS. Numerical results are given to validate the feasibility and effectiveness of our proposed heterogeneous networked sensing architecture with both active and passive anchors.
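The indirect distance estimate described in the abstract above can be illustrated with a short geometric identity. This is a sketch under stated assumptions rather than the paper's derivation: the same BS is assumed to transmit and receive on both paths, the BS-IRS distance $d_{\mathrm{IB}}$ is assumed known from deployment, and the symbols $\ell_1$, $\ell_2$, $d_{\mathrm{BT}}$, $d_{\mathrm{TI}}$ are introduced here for illustration. Let $\ell_1$ be the length of the BS-target-BS path and $\ell_2$ the length of the BS-target-IRS-BS path. Then

\[
\ell_1 = 2 d_{\mathrm{BT}}, \qquad
\ell_2 = d_{\mathrm{BT}} + d_{\mathrm{TI}} + d_{\mathrm{IB}}
\quad\Longrightarrow\quad
d_{\mathrm{TI}} = \ell_2 - \frac{\ell_1}{2} - d_{\mathrm{IB}} .
\]

For example, with $\ell_1 = 100$ m, $\ell_2 = 130$ m, and $d_{\mathrm{IB}} = 40$ m, the target-IRS distance would be $130 - 50 - 40 = 40$ m, obtained without any round-trip measurement between the passive IRS and the passive target.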
Abstract:The classic trilateration technique can localize each target based on its distances to three anchors with known coordinates. Usually, this technique requires all the anchors and targets, e.g., the satellites and the mobile phones in Global Navigation Satellite System (GNSS), to actively transmit/receive radio signals such that the delay of the one-way radio signal propagated between each anchor and each target can be measured. Excitingly, this paper will show that the trilateration technique can be generalized to the scenario where one of the three anchors and all the targets merely reflect the radio signals passively as in radar networks, even if the propagation delay between the passive IRS and the passive targets is difficult to measure directly, and the data association issue for multi-sensor multi-target tracking arises. Specifically, we consider device-free sensing in a cellular network consisting of two base stations (BSs), one passive intelligent reflecting surface (IRS), and multiple passive targets, to realize integrated sensing and communication (ISAC). The two BSs transmit the orthogonal frequency division multiplexing (OFDM) signals in the downlink and estimate the locations of the targets based on their reflected signals via/not via the IRS. We propose an efficient trilateration-based strategy that can first estimate the distances of each target to the two BSs and the IRS and then localize the targets. Numerical results show that the considered networked sensing architecture with heterogeneous anchors can outperform its counterpart with three BSs.
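The final localization step mentioned in the abstract above, recovering a position from distances to three anchors, can be sketched with the standard linearized trilateration. This is a generic textbook formulation in 2D, not the paper's estimator; the function name and the anchor coordinates in the usage note are hypothetical.

```python
import numpy as np


def trilaterate(anchors, distances):
    """Estimate a 2D position from distances to K >= 3 non-collinear anchors.

    anchors: array-like of shape (K, 2); distances: array-like of shape (K,).
    """
    anchors = np.asarray(anchors, dtype=float)
    distances = np.asarray(distances, dtype=float)
    ref, d_ref = anchors[0], distances[0]
    # Subtracting the first circle equation |p - a_0|^2 = d_0^2 from the others
    # cancels the quadratic term |p|^2 and leaves a linear system A p = b.
    A = 2.0 * (anchors[1:] - ref)
    b = (d_ref**2 - distances[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(ref**2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position
```

As a usage example, one could place two BSs at `(0, 0)` and `(10, 0)` and the IRS at `(0, 10)`, compute the exact distances from a target at `(3, 4)`, and pass them to `trilaterate`. With noisy distance estimates the least-squares solve returns the best linear fit instead of an exact intersection.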
Abstract:This paper considers the joint device activity detection and channel estimation problem in a massive Internet of Things (IoT) connectivity system, where a large number of IoT devices exist but merely a random subset of them become active for short-packet transmission in each coherence block. In particular, we propose to leverage the temporal correlation in device activity, e.g., a device active in the previous coherence block is more likely to be still active in the current coherence block, to improve the detection and estimation performance. However, it is challenging to utilize this temporal correlation as side information (SI), which relies on the knowledge about the exact statistical relation between the estimated activity pattern for the previous coherence block (which may be imperfect with unknown error) and the true activity pattern in the current coherence block. To tackle this challenge, we establish a novel SI-aided multiple measurement vector approximate message passing (MMV-AMP) framework. Specifically, thanks to the state evolution of the MMV-AMP algorithm, the correlation between the activity pattern estimated by the MMV-AMP algorithm in the previous coherence block and the real activity pattern in the current coherence block is quantified explicitly. Based on the well-defined temporal correlation, we further manage to embed this useful SI into the denoiser design under the MMV-AMP framework. Specifically, the SI-based soft-thresholding denoisers with binary thresholds and the SI-based minimum mean-squared error (MMSE) denoisers are characterized for the cases without and with the knowledge of the channel distribution, respectively. Numerical results are given to show the significant gain in device activity detection and channel estimation performance brought by our proposed SI-aided MMV-AMP framework.
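To make the idea of a soft-thresholding denoiser with binary thresholds more concrete, the sketch below applies row-wise (group) soft thresholding with one of two thresholds per device, chosen from the previous block's activity estimate. This is only one reading of the abstract above: the paper derives its denoisers from the MMV-AMP state evolution, whereas the threshold values, the selection rule, and the function names here are assumptions for illustration.

```python
import numpy as np


def row_soft_threshold(X, thresholds):
    """Shrink each row of X toward zero by its own threshold (group soft thresholding)."""
    X = np.asarray(X, dtype=complex)
    norms = np.linalg.norm(X, axis=1)
    # Rows with norm below the threshold are set to zero (device declared inactive).
    scale = np.maximum(0.0, 1.0 - thresholds / np.maximum(norms, 1e-12))
    return X * scale[:, None]


def si_aided_denoise(X, prev_active, tau_active, tau_inactive):
    """Hypothetical SI rule: a lower threshold for devices estimated active previously."""
    thresholds = np.where(np.asarray(prev_active, dtype=bool), tau_active, tau_inactive)
    return row_soft_threshold(X, thresholds)
```

Each row of `X` holds one device's effective channel observation across the BS antennas. A device flagged active in the previous coherence block gets the smaller threshold `tau_active`, so weaker evidence is enough to keep it in the current estimate.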
Abstract:This paper considers joint device activity detection and channel estimation in Internet of Things (IoT) networks, where a large number of IoT devices exist but merely a random subset of them become active for short-packet transmission at each time slot. In particular, we propose to leverage the temporal correlation in user activity, i.e., a device active at the previous time slot is more likely to be still active at the current moment, to improve the detection performance. Despite the temporally correlated user activity in consecutive time slots, it is challenging to unveil the connection between the activity pattern estimated previously, which is imperfect but the only available side information (SI), and the true activity pattern at the current moment due to the unknown estimation error. In this work, we manage to tackle this challenge under the framework of approximate message passing (AMP). Specifically, thanks to the state evolution, the correlation between the activity pattern estimated by AMP at the previous time slot and the real activity patterns at the previous and current moments is quantified explicitly. Based on the well-defined temporal correlation, we further manage to embed this useful SI into the design of the minimum mean-squared error (MMSE) denoisers and log-likelihood ratio (LLR) test based activity detectors under the AMP framework. Theoretical comparison between the SI-aided AMP algorithm and its counterpart without utilizing temporal correlation is provided. Moreover, numerical results are given to show the significant gain in activity detection accuracy brought by the SI-aided algorithm.
Abstract:We study the combinatorial sleeping multi-armed semi-bandit problem with long-term fairness constraints~(CSMAB-F). To address the problem, we adopt Thompson Sampling~(TS) to maximize the total rewards and use virtual queue techniques to handle the fairness constraints, and design an algorithm called \emph{TS with beta priors and Bernoulli likelihoods for CSMAB-F~(TSCSF-B)}. Further, we prove that TSCSF-B can satisfy the fairness constraints, and that the time-averaged regret is upper bounded by $\frac{N}{2\eta} + O\left(\frac{\sqrt{mNT\ln T}}{T}\right)$, where $N$ is the total number of arms, $m$ is the maximum number of arms that can be pulled simultaneously in each round~(the cardinality constraint), and $\eta$ is the parameter trading off fairness for rewards. By relaxing the fairness constraints (i.e., letting $\eta \rightarrow \infty$), the bound boils down to the first problem-independent bound of TS algorithms for combinatorial sleeping multi-armed semi-bandit problems. Finally, we perform numerical experiments and use a high-rating movie recommendation application to show the effectiveness and efficiency of the proposed algorithm.
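The limiting behavior of the bound stated in the abstract above follows from elementary algebra, spelled out here using only the symbols the abstract defines. Since $\frac{\sqrt{mNT\ln T}}{T} = \sqrt{\frac{mN\ln T}{T}}$, letting $\eta \rightarrow \infty$ removes the fairness term:

\[
\lim_{\eta \rightarrow \infty} \left[ \frac{N}{2\eta} + O\left(\frac{\sqrt{mNT\ln T}}{T}\right) \right]
= O\left(\sqrt{\frac{mN\ln T}{T}}\right),
\]

which vanishes as $T \rightarrow \infty$. Multiplying this time-averaged quantity by $T$ gives a cumulative regret of order $O\left(\sqrt{mNT\ln T}\right)$ in the relaxed case.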