Abstract:Optical link tomography (OLT) is a rapidly evolving field that allows the multi-span, end-to-end visualization of optical power along fiber links in multiple dimensions from network endpoints, solely by processing signals received at coherent receivers. This paper has two objectives: (1) to report the first field trial of OLT, using a commercial transponder under standard DWDM transmission, and (2) to extend its capability to visualize across 4D (distance, time, frequency, and polarization), allowing for locating and measuring multiple QoT degradation causes, including time-varying power anomalies, spectral anomalies, and excessive polarization dependent loss. We also address a critical aspect of OLT, i.e., its need for high fiber launch power, by improving power profile signal-to-noise ratio through averaging across all available dimensions. Consequently, multiple loss anomalies in a field-deployed link are observed even at launch power lower than the system-optimal level. The applications and use cases of OLT from network commissioning to provisioning and operation for current and near-term network scenarios are also discussed.
Abstract:Next-generation cellular networks are envisioned to integrate sensing capabilities with communication, particularly in the millimeter-wave (mmWave) spectrum, where beamforming using large-scale antenna arrays enables directional signal transmissions for improved spatial multiplexing. In current 5G networks, however, beamforming is typically designed either for communication or sensing (e.g., beam training during link establishment). In this paper, we present Chameleon, a novel framework that augments and rapidly switches beamformers during each demodulation reference signal (DMRS) symbol to achieve integrated sensing and communication (ISAC) in 5G mmWave networks. Each beamformer introduces an additional sensing beam toward target angles while maintaining the communication beams toward multiple users. We implement Chameleon on a 28 GHz software-defined radio testbed supporting over-the-air 5G physical downlink shared channel (PDSCH) transmissions. Extensive experiments in open environments show that Chameleon achieves multi-user communication with a sum data rate of up to 0.80 Gbps across two users. Simultaneously, Chameleon employs a beamformer switching interval of only 0.24 {\mu}s, therefore producing a 31x31-point 2D imaging within just 0.875 ms. Leveraging machine learning, Chameleon further enables object localization with median errors of 0.14 m (distance) and 0.24{\deg} (angle), and material classification with 99.0% accuracy.
Abstract:The coexistence between incumbent radar signals and commercial 5G signals necessitates a versatile and ubiquitous radar sensing for efficient and adaptive spectrum sharing. In this context, leveraging the densely deployed 5G base stations (BS) for radar sensing is particularly promising, offering both wide coverage and immediate feedback to 5G scheduling. However, the targeting radar signals are superimposed with concurrent 5G uplink transmissions received by the BS, and practical deployment also demands a lightweight, portable radar sensing model. This paper presents BatStation, a lightweight, in-situ radar sensing framework seamlessly integrated into 5G BSs. BatStation leverages uplink resource grids to extract radar signals through three key components: (i) radar signal separation to cancel concurrent 5G transmissions and reveal the radar signals, (ii) resource grid reshaping to align time-frequency resolution with radar pulse characteristics, and (iii) zero-shot template correlation based on a portable model trained purely on synthetic data that supports detection, classification, and localization of radar pulses without fine-tuning using experimental data. We implement BatStation on a software-defined radio (SDR) testbed and evaluate its performance with real 5G traffic in the CBRS band. Results show robust performance across diverse radar types, achieving detection probabilities of 97.02% (PUCCH) and 79.23% (PUSCH), classification accuracy up to 97.00%, and median localization errors of 2.68-6.20 MHz (frequency) and 24.6-32.4 microseconds (time). Notably, BatStation achieves this performance with a runtime latency of only 0.11/0.94 ms on GPU/CPU, meeting the real-time requirement of 5G networks.
Abstract:Low-cost indoor mobile robots have gained popularity with the increasing adoption of automation in homes and commercial spaces. However, existing lidar and camera-based solutions have limitations such as poor performance in visually obscured environments, high computational overhead for data processing, and high costs for lidars. In contrast, mmWave radar sensors offer a cost-effective and lightweight alternative, providing accurate ranging regardless of visibility. However, existing radar-based localization suffers from sparse point cloud generation, noise, and false detections. Thus, in this work, we introduce RaGNNarok, a real-time, lightweight, and generalizable graph neural network (GNN)-based framework to enhance radar point clouds, even in complex and dynamic environments. With an inference time of just 7.3 ms on the low-cost Raspberry Pi 5, RaGNNarok runs efficiently even on such resource-constrained devices, requiring no additional computational resources. We evaluate its performance across key tasks, including localization, SLAM, and autonomous navigation, in three different environments. Our results demonstrate strong reliability and generalizability, making RaGNNarok a robust solution for low-cost indoor mobile robots.
Abstract:Deep neural network (DNN) inference on power-constrained edge devices is bottlenecked by costly weight storage and data movement. We introduce MIWEN, a radio-frequency (RF) analog architecture that ``disaggregates'' memory by streaming weights wirelessly and performing classification in the analog front end of standard transceivers. By encoding weights and activations onto RF carriers and using native mixers as computation units, MIWEN eliminates local weight memory and the overhead of analog-to-digital and digital-to-analog conversion. We derive the effective number of bits of radio-frequency analog computation under thermal noise, quantify the energy--precision trade-off, and demonstrate digital-comparable MNIST accuracy at orders-of-magnitude lower energy, unlocking real-time inference on low-power, memory-free edge devices.
Abstract:To accommodate ever-increasing model complexity, modern machine learning (ML) systems have to scale to large GPU clusters. Changes in ML model architecture, ML system implementation, and cluster configuration can significantly affect overall ML system performance. However, quantifying the performance impact before deployment is challenging. Existing performance estimation methods use performance modeling or static workload simulation. These techniques are not general: they requires significant human effort and computation capacity to generate training data or a workload. It is also difficult to adapt ML systems to use these techniques. This paper introduces, Phantora, a live GPU cluster simulator for performance estimation. Phantora runs minimally modified ML models and frameworks, intercepting and simulating GPU-related operations to enable high-fidelity performance estimation. Phantora overcomes several research challenges in integrating an event-driven network simulator with live system execution, and introduces a set of techniques to improve simulation speed, scalability, and accuracy. Our evaluation results show that Phantora can deliver similar estimation accuracy to the state-of-the-art workload simulation approach with only one GPU, while reducing human effort and increasing generalizability.
Abstract:Modern edge devices, such as cameras, drones, and Internet-of-Things nodes, rely on deep learning to enable a wide range of intelligent applications, including object recognition, environment perception, and autonomous navigation. However, deploying deep learning models directly on the often resource-constrained edge devices demands significant memory footprints and computational power for real-time inference using traditional digital computing architectures. In this paper, we present WISE, a novel computing architecture for wireless edge networks designed to overcome energy constraints in deep learning inference. WISE achieves this goal through two key innovations: disaggregated model access via wireless broadcasting and in-physics computation of general complex-valued matrix-vector multiplications directly at radio frequency. Using a software-defined radio platform with wirelessly broadcast model weights over the air, we demonstrate that WISE achieves 95.7% image classification accuracy with ultra-low operation power of 6.0 fJ/MAC per client, corresponding to a computation efficiency of 165.8 TOPS/W. This approach enables energy-efficient deep learning inference on wirelessly connected edge devices, achieving more than two orders of magnitude improvement in efficiency compared to traditional digital computing.




Abstract:Next generation wireless and mobile networks will utilize millimeter-wave (mmWave) communication to achieve significantly increased data rates. However, since mmWave radio signals experience high path loss, the operation of mmWave networks will require accurate channel models designed for specific deployment sites. In this paper, we focus on the deployment area of the PAWR COSMOS testbed in New York City and report extensive 28 GHz channel measurements. These include over 46 million power measurements collected from over 3,000 links on 24 sidewalks at 4 different sites and in different settings. Using these measurements, we study the effects of the setup and environments (e.g., transmitter height and seasonal effects). We then discuss the obtained path gain values and their fitted lines, and the resulting effective azimuth beamforming gain. Based on these results, we also study the link SNR values that can be supported on individual sidewalks and the corresponding theoretically achievable data rates. Finally, we develop a process to transform the measurements and generate Spectrum Consumption Models (SCMs) based on the IEEE 1900.5.2 standard. The generated SCMs facilitate the evaluation of spectrum sharing and interference management scenarios since they capture all the directional propagation effects reflected in the measurements and provide a way to easily share the main propagation characterization results derived from the measurements. We believe that the results can inform the COSMOS testbed deployment process and provide a benchmark for other deployment efforts in dense urban areas.




Abstract:The complexity of large language model (LLM) serving workloads has substantially increased due to the integration with external tool invocations, such as ChatGPT plugins. In this paper, we identify a new opportunity for efficient LLM serving for requests that trigger tools: tool partial execution alongside LLM decoding. To this end, we design Conveyor, an efficient LLM serving system optimized for handling requests involving external tools. We introduce a novel interface for tool developers to expose partial execution opportunities to the LLM serving system and a request scheduler that facilitates partial tool execution. Our results demonstrate that tool partial execution can improve request completion latency by up to 38.8%.




Abstract:In this work, we present RadCloud, a novel real time framework for directly obtaining higher-resolution lidar-like 2D point clouds from low-resolution radar frames on resource-constrained platforms commonly used in unmanned aerial and ground vehicles (UAVs and UGVs, respectively); such point clouds can then be used for accurate environmental mapping, navigating unknown environments, and other robotics tasks. While high-resolution sensing using radar data has been previously reported, existing methods cannot be used on most UAVs, which have limited computational power and energy; thus, existing demonstrations focus on offline radar processing. RadCloud overcomes these challenges by using a radar configuration with 1/4th of the range resolution and employing a deep learning model with 2.25x fewer parameters. Additionally, RadCloud utilizes a novel chirp-based approach that makes obtained point clouds resilient to rapid movements (e.g., aggressive turns or spins), which commonly occur during UAV flights. In real-world experiments, we demonstrate the accuracy and applicability of RadCloud on commercially available UAVs and UGVs, with off-the-shelf radar platforms on-board.