Abstract: RF fingerprinting leverages circuit-level variability of transmitters to identify them from the signals they send. The signals used for identification are also affected by the wireless channel and the receiver circuitry, which introduce additional impairments that can confuse transmitter identification. Eliminating these impairments, or even just evaluating them, requires data captured over a prolonged period of time using many spatially separated transmitters and receivers. In this paper, we present WiSig, a large-scale WiFi dataset containing 10 million packets captured from 174 off-the-shelf WiFi transmitters and 41 USRP receivers over 4 captures spanning a month. WiSig is publicly available, not just as raw captures, but also as conveniently pre-processed subsets of limited size, along with scripts and examples. A preliminary evaluation performed using WiSig shows that changing receivers or using signals captured on a different day can significantly degrade a trained classifier's performance. While capturing data over more days or with more receivers limits the degradation, doing so is not always feasible, and novel data-driven approaches are needed. WiSig provides the data to develop and evaluate these approaches toward channel- and receiver-agnostic transmitter fingerprinting.
Abstract: As the Internet of Things (IoT) continues to grow, ensuring the security of systems that rely on wireless IoT devices has become critically important. Deep learning-based passive physical-layer transmitter authorization systems have recently been introduced for this purpose, as they accommodate the limited computational and power budgets of such devices. These systems have been shown to offer excellent outlier detection accuracy when trained and tested on a fixed set of authorized transmitters. However, in a real-life deployment, transmitters may need to be added and removed as the authorized set changes. In such cases, the system could experience long downtimes, as retraining the underlying deep learning model is often a time-consuming process. In this paper, we draw inspiration from information retrieval to address this problem: by utilizing feature vectors as RF fingerprints, we first demonstrate that training could be simplified to indexing those feature vectors into a database using locality-sensitive hashing (LSH). We then show that approximate nearest neighbor search on the database could perform transmitter authorization that matches the accuracy of deep learning models, while allowing for more than 100x faster retraining. Furthermore, dimensionality reduction techniques are applied to the feature vectors to show that the authorization latency of our technique could be reduced to approach that of traditional deep learning-based systems.
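The indexing-then-search idea described in this abstract can be illustrated with a minimal sketch. It assumes random-hyperplane LSH over cosine similarity, which is one common LSH family and not necessarily the scheme the paper uses. The class name `HyperplaneLSHIndex`, the parameters `n_bits`, `n_tables`, and `threshold`, and the toy 4-dimensional vectors are all hypothetical; in practice the feature vectors would come from a trained feature extractor, which is not shown here.

```python
import math
import random
from collections import defaultdict


def _unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class HyperplaneLSHIndex:
    """Toy random-hyperplane LSH index (hypothetical, for illustration only)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.vectors = []  # stored unit-norm fingerprints
        self.labels = []   # transmitter label for each stored fingerprint

    def _keys(self, v):
        # The sign pattern of the projections is the bucket key in each table.
        return [tuple(_dot(p, v) > 0 for p in table_planes)
                for table_planes in self.planes]

    def add(self, v, label):
        """Enroll a fingerprint: indexing only, no model retraining."""
        v = _unit(v)
        idx = len(self.vectors)
        self.vectors.append(v)
        self.labels.append(label)
        for table, key in zip(self.tables, self._keys(v)):
            table[key].append(idx)

    def query(self, v, threshold=0.9):
        """Return the label of the closest candidate, or None to reject."""
        v = _unit(v)
        candidates = set()
        for table, key in zip(self.tables, self._keys(v)):
            candidates.update(table.get(key, []))
        if not candidates:
            return None  # no stored fingerprint shares a bucket
        best = max(candidates, key=lambda i: _dot(self.vectors[i], v))
        if _dot(self.vectors[best], v) >= threshold:
            return self.labels[best]
        return None  # nearest candidate is too dissimilar: treat as outlier


# Usage: enroll two transmitters, then look one of them up.
index = HyperplaneLSHIndex(dim=4)
index.add([1.0, 0.0, 0.0, 0.0], "tx_a")
index.add([0.0, 1.0, 0.0, 0.0], "tx_b")
print(index.query([1.0, 0.0, 0.0, 0.0]))
```

In this sketch, enrolling a transmitter is a single `add` call, which mirrors the abstract's point that training reduces to indexing. The similarity threshold plays the role of the outlier detector; how the paper actually sets its rejection rule is not stated in the abstract.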
Abstract: RF devices can be identified by RF fingerprints: unique imperfections embedded in the signals they transmit. The closed set classification of such devices, where the identification must be made among an authorized set of transmitters, has been well explored. However, the much more difficult problem of open set classification, where the classifier needs to reject unauthorized transmitters while recognizing authorized transmitters, has only recently been addressed. So far, efforts at open set classification have largely relied on signal samples captured from a known set of unauthorized transmitters to help the classifier learn unauthorized transmitter fingerprints. Since acquiring new transmitters to use as known transmitters is expensive, we propose to use generative deep learning methods to emulate unauthorized signal samples for the augmentation of training datasets. We develop two different data augmentation techniques: one that exploits a limited number of known unauthorized transmitters, and another that does not require any unauthorized transmitters. Experiments conducted on a dataset captured from a WiFi testbed indicate that data augmentation allows for significant increases in open set classification accuracy, especially when the authorized set is small.