Stephen Lee

University of Pittsburgh

PCAP-Backdoor: Backdoor Poisoning Generator for Network Traffic in CPS/IoT Environments

Jan 26, 2025

Ajesh Koyatan Chathoth, Stephen Lee

Abstract: The rapid expansion of connected devices has made them prime targets for cyberattacks. To address these threats, deep learning-based, data-driven intrusion detection systems (IDSs) have emerged as powerful tools for detecting and mitigating such attacks. These IDSs analyze network traffic to identify unusual patterns and anomalies that may indicate security breaches. However, prior research has shown that deep learning models are vulnerable to backdoor attacks, where attackers inject triggers into the model to manipulate its behavior and cause misclassifications of network traffic. In this paper, we explore the susceptibility of deep learning-based IDSs to backdoor attacks in the context of network traffic analysis. We introduce PCAP-Backdoor, a novel technique that facilitates backdoor poisoning attacks on PCAP datasets. Our experiments on real-world Cyber-Physical Systems (CPS) and Internet of Things (IoT) network traffic datasets demonstrate that attackers can effectively backdoor a model by poisoning 1% or less of the entire training dataset. Moreover, we show that an attacker can introduce a trigger into benign traffic during model training yet cause the backdoored model to misclassify malicious traffic when the trigger is present. Finally, we highlight the difficulty of detecting this trigger-based backdoor, even when using existing backdoor defense techniques.

BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos

May 29, 2024

Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Dan Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline(+2 more)

Abstract: Using drones to track multiple individuals simultaneously in their natural environment is a powerful approach for better understanding group primate behavior. Previous studies have demonstrated that it is possible to automate the classification of primate behavior from video data, but these studies have been carried out in captivity or from ground-based cameras. To understand group behavior and the self-organization of a collective, the whole troop needs to be seen at a scale where behavior can be related to the natural environment in which ecological decisions are made. This study presents a novel dataset from drone videos for baboon detection, tracking, and behavior recognition. The baboon detection dataset was created by manually annotating all baboons in drone videos with bounding boxes. A tiling method was subsequently applied to create a pyramid of images at various scales from the original 5.3K resolution images, resulting in approximately 30K images used for baboon detection. The tracking dataset is derived from the detection dataset, where all bounding boxes are assigned the same ID throughout the video. This process resulted in half an hour of very dense tracking data. The behavior recognition dataset was generated by converting tracks into mini-scenes, a video subregion centered on each animal; each mini-scene was manually annotated with 12 distinct behavior types, resulting in over 20 hours of data. Benchmark results show mean average precision (mAP) of 92.62% for the YOLOv8-X detection model, multiple object tracking accuracy (MOTA) of 63.81% for the BotSort tracking algorithm, and micro top-1 accuracy of 63.97% for the X3D behavior recognition model. Using deep learning to classify wildlife behavior from drone footage facilitates non-invasive insight into the collective behavior of an entire group.
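
For readers unfamiliar with the tracking metric quoted above, MOTA is conventionally computed from three error counts summed over all frames and normalized by the number of ground-truth objects. The following is a minimal sketch of that standard formula; the function name and the example counts are hypothetical and are not taken from the BaboonLand benchmark.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_ground_truth: int) -> float:
    """Standard MOTA: 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth


# Hypothetical counts, for illustration only.
score = mota(false_negatives=20, false_positives=10,
             id_switches=5, num_ground_truth=100)
print(f"MOTA = {score:.2%}")
```

Because errors are only normalized by the ground-truth count, MOTA can be negative when a tracker makes more errors than there are ground-truth objects.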

* Dataset will be published shortly

Streaming Video Analytics On The Edge With Asynchronous Cloud Support

Oct 04, 2022

Anurag Ghosh, Srinivasan Iyengar, Stephen Lee, Anuj Rathore, Venkat N Padmanabhan

Abstract: Emerging Internet of Things (IoT) and mobile computing applications are expected to support latency-sensitive deep neural network (DNN) workloads. To realize this vision, the Internet is evolving towards an edge-computing architecture, where computing infrastructure is located closer to the end device to help achieve low latency. However, edge infrastructure may have limited resources compared to cloud environments and thus cannot run the large DNN models that often have high accuracy. In this work, we develop REACT, a framework that uses cloud resources to execute large, more accurate DNN models in order to improve the accuracy of models running on edge devices. To do so, we propose a novel edge-cloud fusion algorithm that fuses edge and cloud predictions, achieving low latency and high accuracy. We extensively evaluate our approach and show that it can significantly improve accuracy compared to baseline approaches. We focus specifically on object detection in videos (applicable in many video analytics scenarios) and show that the fused edge-cloud predictions can outperform the accuracy of edge-only and cloud-only scenarios by as much as 50%. We also show that REACT can achieve good performance across tradeoff points by choosing a wide range of system parameters to satisfy use-case specific constraints, such as limited network bandwidth or GPU cycles.

* 12 pages

sat2pc: Estimating Point Cloud of Building Roofs from 2D Satellite Images

May 25, 2022

Yoones Rezaei, Stephen Lee

Abstract: Three-dimensional (3D) urban models have gained interest because of their applications in use cases such as urban planning and virtual reality. However, generating these 3D representations requires LiDAR data, which are not always readily available. Thus, the applicability of automated 3D model generation algorithms is limited to a few locations. In this paper, we propose sat2pc, a deep learning architecture that predicts the point cloud of a building roof from a single 2D satellite image. Our architecture combines Chamfer distance and Earth Mover's Distance (EMD) losses, resulting in better 2D-to-3D performance. We extensively evaluate our model and perform ablation studies on a building roof dataset. Our results show that sat2pc outperforms existing baselines by at least 18.6%. Further, we show that the predicted point cloud captures more detail and geometric characteristics than other baselines.
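
As a rough illustration of one of the two loss terms named above, the sketch below computes a symmetric Chamfer distance between two small point clouds. It assumes the common variant that averages squared nearest-neighbor distances in both directions; the abstract does not say which variant sat2pc uses, and the function names and sample points are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def _squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (assumed variant: mean of squared
    nearest-neighbor distances from a to b, plus the same from b to a)."""
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")
    a_to_b = sum(min(_squared_distance(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(_squared_distance(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


# Hypothetical predicted and reference roof points.
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.5)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
print(chamfer_distance(predicted, reference))
```

This brute-force version is quadratic in the number of points; it is meant to show the definition rather than to serve as a training loss.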

Energy-Efficient Parking Analytics System using Deep Reinforcement Learning

Feb 15, 2022

Yoones Rezaei, Stephen Lee, Daniel Mosse

Abstract: Advances in deep vision techniques and the ubiquity of smart cameras will drive the next generation of video analytics. However, video analytics applications consume vast amounts of energy, as both deep learning techniques and cameras are power-hungry. In this paper, we focus on a parking video analytics platform and propose RL-CamSleep, a deep reinforcement learning-based technique that actuates the cameras to reduce the energy footprint while retaining the system's utility. Our key insight is that many video-analytics applications do not always need to be operational, and we can design policies to activate video analytics only when necessary. Moreover, our work is complementary to existing work that focuses on improving hardware and software efficiency. We evaluate our approach on a city-scale parking dataset covering 76 streets spread across the city. Our analysis demonstrates that streets have varied parking patterns, highlighting the importance of an adaptive policy. Our approach can learn such an adaptive policy, which can reduce the average energy consumption by 76.38% and achieve an average accuracy of more than 98% in performing video analytics.

* Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, November 2021, pages 81-90

Federated Intrusion Detection for IoT with Heterogeneous Cohort Privacy

Jan 25, 2021

Ajesh Koyatan Chathoth, Abhyuday Jagannatha, Stephen Lee

Abstract: Internet of Things (IoT) devices are becoming increasingly popular and are influencing many application domains such as healthcare and transportation. These devices are used for real-world applications such as sensor monitoring and real-time control. In this work, we look at differentially private (DP) neural network (NN) based network intrusion detection systems (NIDS) to detect intrusion attacks on networks of such IoT devices. Existing NN training solutions in this domain either ignore privacy considerations or assume that the privacy requirements are homogeneous across all users. We show that the performance of existing differentially private stochastic methods degrades for clients with non-identical data distributions when clients' privacy requirements are heterogeneous. We define a cohort-based $(\epsilon,\delta)$-DP framework that models the more practical setting of IoT device cohorts with non-identical clients and heterogeneous privacy requirements. We propose two novel continual-learning-based DP training methods that are designed to improve model performance in the aforementioned setting. To the best of our knowledge, ours is the first system that employs a continual learning-based approach to handle heterogeneity in client privacy requirements. We evaluate our approach on real datasets and show that our techniques outperform the baselines. We also show that our methods are robust to hyperparameter changes. Lastly, we show that one of our proposed methods can easily adapt to post-hoc relaxations of client privacy requirements.
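
To make "differentially private stochastic methods" concrete, the sketch below shows the generic clip-and-noise gradient step used by DP-SGD-style training: each per-example gradient is clipped to a maximum L2 norm, Gaussian noise scaled to that norm is added to the sum, and the result is averaged. This is the well-known baseline mechanism, not the cohort-based or continual-learning methods the abstract proposes; all function and parameter names are hypothetical.

```python
import math
import random
from typing import List


def clip_gradient(grad: List[float], max_norm: float) -> List[float]:
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads: List[List[float]],
                          max_norm: float,
                          noise_multiplier: float,
                          rng: random.Random) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise with
    standard deviation noise_multiplier * max_norm, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical gradients for a batch of two examples.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.1, -0.2]]
print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

A larger noise multiplier corresponds to a stricter privacy budget, which is one plausible way heterogeneous client requirements could translate into different noise levels; how the paper actually handles this is not specified in the abstract.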

WattScale: A Data-driven Approach for Energy Efficiency Analytics of Buildings at Scale

Jul 02, 2020

Srinivasan Iyengar, Stephen Lee, David Irwin, Prashant Shenoy, Benjamin Weil

Abstract: Buildings consume over 40% of the total energy in modern societies, and improving their energy efficiency can significantly reduce our energy footprint. In this paper, we present WattScale, a data-driven approach to identify the least energy-efficient buildings from a large population of buildings in a city or a region. Unlike previous methods such as least-squares that use point estimates, WattScale uses Bayesian inference to capture the stochasticity in the daily energy usage by estimating the distribution of parameters that affect a building. Further, it compares them with similar homes in a given population. WattScale also incorporates a fault detection algorithm to identify the underlying causes of energy inefficiency. We validate our approach using ground truth data from different geographical locations, which showcases its applicability in various settings. WattScale has two execution modes, (i) individual and (ii) region-based, which we highlight using two case studies. For the individual execution mode, we present results from a city containing more than 10,000 buildings and show that more than half of the buildings are inefficient in one way or another, indicating a significant potential for energy improvement measures. Additionally, we provide the probable cause of inefficiency and find that 41%, 23.73%, and 0.51% of homes have poor building envelopes, heating system faults, and cooling system faults, respectively. For the region-based execution mode, we show that WattScale can be extended to millions of homes in the US due to the recent availability of representative energy datasets.

* This paper appeared in the journal ACM Transactions on Data Science

Making Contextual Decisions with Low Technical Debt

May 09, 2017

Alekh Agarwal, Sarah Bird, Markus Cozowicz, Luong Hoang, John Langford, Stephen Lee, Jiaji Li, Dan Melamed, Gal Oshri, Oswaldo Ribas(+2 more)

Abstract: Applications and systems are constantly faced with decisions that require picking from a set of actions based on contextual information. Reinforcement-based learning algorithms such as contextual bandits can be very effective in these settings, but applying them in practice is fraught with technical debt, and no general system exists that supports them completely. We address this and create the first general system for contextual learning, called the Decision Service. Existing systems often suffer from technical debt that arises from issues like incorrect data collection and weak debuggability, issues we systematically address through our ML methodology and system abstractions. The Decision Service enables all aspects of contextual bandit learning using four system abstractions that connect in a loop: explore (the decision space), log, learn, and deploy. Notably, our new explore and log abstractions ensure the system produces correct, unbiased data, which our learner uses for online learning and to enable real-time safeguards, all in a fully reproducible manner. The Decision Service has a simple user interface and works with a variety of applications: we present two live production deployments for content recommendation that achieved click-through improvements of 25-30%, another with 18% revenue lift in the landing page, and ongoing applications in tech support and machine failure handling. The service makes real-time decisions and learns continuously and scalably, while significantly lowering technical debt.
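
The explore and log steps described above matter because recording the probability of each chosen action is what allows unbiased offline evaluation later. The sketch below illustrates that idea with epsilon-greedy exploration and an inverse propensity scoring (IPS) estimate. It is a generic contextual-bandit illustration under assumed names (`LoggedDecision`, `explore`, `ips_value`), not the Decision Service's actual interface.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoggedDecision:
    context: str
    action: int
    probability: float  # probability with which the action was chosen
    reward: float


def explore(greedy_action: int, num_actions: int, epsilon: float,
            rng: random.Random) -> Tuple[int, float]:
    """Epsilon-greedy choice that also returns the action's propensity."""
    if rng.random() < epsilon:
        action = rng.randrange(num_actions)
    else:
        action = greedy_action
    probability = epsilon / num_actions
    if action == greedy_action:
        probability += 1.0 - epsilon
    return action, probability


def ips_value(log: List[LoggedDecision],
              policy: Callable[[str], int]) -> float:
    """IPS estimate of a candidate policy's average reward from logged data."""
    total = sum(d.reward / d.probability
                for d in log if policy(d.context) == d.action)
    return total / len(log)


# Hypothetical log: two decisions, each taken with probability 0.5.
log = [
    LoggedDecision("morning", action=1, probability=0.5, reward=1.0),
    LoggedDecision("evening", action=0, probability=0.5, reward=1.0),
]
print(ips_value(log, policy=lambda context: 1))
```

If the logged probabilities are wrong or missing, the reweighting in `ips_value` no longer corrects for how the data was collected, which is the kind of data-collection error the abstract attributes to existing systems.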
