Mark McLean

Learning with Holographic Reduced Representations

Sep 05, 2021

Ashwinkumar Ganesan, Hang Gao, Sunil Gandhi, Edward Raff, Tim Oates, James Holt, Mark McLean

Abstract: Holographic Reduced Representations (HRR) are a method for performing symbolic AI on top of real-valued vectors (Plate, 1995) by associating each vector with an abstract concept, and providing mathematical operations to manipulate vectors as if they were classic symbolic objects. This method has seen little use outside of older symbolic AI work and cognitive science. Our goal is to revisit this approach to understand if it is viable for enabling a hybrid neural-symbolic approach to learning as a differentiable component of a deep learning architecture. HRRs today are not effective in a differentiable solution due to numerical instability, a problem we solve by introducing a projection step that forces the vectors to exist in a well-behaved point in space. In doing so we improve the concept retrieval efficacy of HRRs by over $100\times$. Using multi-label classification we demonstrate how to leverage the symbolic HRR properties to develop an output layer and loss function that is able to learn effectively, and allows us to investigate some of the pros and cons of an HRR neuro-symbolic learning approach.
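As a rough illustration of the binding and retrieval operations the abstract refers to, the sketch below implements HRR binding as circular convolution with NumPy. The abstract does not spell out the projection step; the `project` function here assumes one plausible form (rescaling every frequency component to unit magnitude), and all function names are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def project(x):
    # ASSUMPTION: the projection normalizes each frequency component to
    # unit magnitude; the abstract only says "a projection step".
    f = np.fft.fft(x)
    return np.real(np.fft.ifft(f / np.abs(f)))

def bind(a, b):
    # Circular convolution, computed as a product in the frequency domain.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inverse(a):
    # Exact inverse under circular convolution (reciprocal spectrum).
    return np.real(np.fft.ifft(1.0 / np.fft.fft(a)))

def unbind(s, a):
    # Retrieve whatever was bound to key `a` inside the superposition `s`.
    return bind(s, inverse(a))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
d = 256
key_a, key_b, val_x, val_y = (project(rng.normal(size=d)) for _ in range(4))

# Superpose two key-value pairs in a single d-dimensional vector.
memory = bind(key_a, val_x) + bind(key_b, val_y)

# Querying with key_a returns a noisy copy of val_x.
retrieved = unbind(memory, key_a)
print("similarity to val_x:", cosine(retrieved, val_x))
print("similarity to val_y:", cosine(retrieved, val_y))
```

The retrieved vector should be clearly closer to `val_x` than to `val_y`; the residual noise comes from the other bound pair and grows as more pairs are superposed.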

Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection

Dec 17, 2020

Edward Raff, William Fleshman, Richard Zak, Hyrum S. Anderson, Bobby Filar, Mark McLean

Abstract: Recent works within machine learning have been tackling inputs of ever-increasing size, with cybersecurity presenting sequence classification problems of particularly extreme lengths. In the case of Windows executable malware detection, inputs may exceed $100$ MB, which corresponds to a time series with $T=100,000,000$ steps. To date, the closest approach to handling such a task is MalConv, a convolutional neural network capable of processing up to $T=2,000,000$ steps. The $\mathcal{O}(T)$ memory of CNNs has prevented further application of CNNs to malware. In this work, we develop a new approach to temporal max pooling that makes the required memory invariant to the sequence length $T$. This makes MalConv $116\times$ more memory efficient, and up to $25.8\times$ faster to train on its original dataset, while removing the input length restrictions to MalConv. We re-invest these gains into improving the MalConv architecture by developing a new Global Channel Gating design, giving us an attention mechanism capable of learning feature interactions across 100 million time steps in an efficient manner, a capability lacked by the original MalConv CNN. Our implementation can be found at https://github.com/NeuromorphicComputationResearchProgram/MalConv2

* To appear in AAAI 2021
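To make the idea of length-invariant memory in temporal max pooling concrete, here is a minimal forward-pass sketch in NumPy. It is not the MalConv2 implementation and does not cover gradients: it only shows that a global max over convolution outputs can be computed chunk by chunk, keeping just a running maximum and its position per channel. The function names, the chunk size, and the plain 1-D convolution are all assumptions made for illustration.

```python
import numpy as np

def conv_features(x, kernels):
    # Valid 1-D cross-correlation: x is (T,), kernels is (C, W).
    # Returns an array of shape (T - W + 1, C).
    windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1])
    return windows @ kernels.T

def chunked_global_max(x, kernels, chunk=1024):
    # Streams over x so that only `chunk` window outputs are alive at once.
    C, W = kernels.shape
    best = np.full(C, -np.inf)             # running max per channel
    best_pos = np.zeros(C, dtype=np.int64) # window index of that max
    for start in range(0, len(x) - W + 1, chunk):
        # Overlap by W - 1 samples so no window is lost at chunk borders.
        piece = x[start:start + chunk + W - 1]
        feats = conv_features(piece, kernels)
        local_pos = feats.argmax(axis=0)
        local_max = feats[local_pos, np.arange(C)]
        better = local_max > best
        best[better] = local_max[better]
        best_pos[better] = start + local_pos[better]
    return best, best_pos

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)          # stand-in for an embedded byte sequence
kernels = rng.normal(size=(8, 16))   # 8 channels, window width 16

pooled, positions = chunked_global_max(x, kernels, chunk=4096)
reference = conv_features(x, kernels).max(axis=0)  # O(T) memory baseline
print(np.allclose(pooled, reference))
```

Recording the winning positions is what would let a training procedure recompute only those few windows with gradients enabled, which is one way such a scheme could keep memory independent of $T$.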

A New Burrows Wheeler Transform Markov Distance

Dec 30, 2019

Edward Raff, Charles Nicholas, Mark McLean

Abstract: Prior work inspired by compression algorithms has described how the Burrows Wheeler Transform can be used to create a distance measure for bioinformatics problems. We describe issues with this approach that were not widely known, and introduce our new Burrows Wheeler Markov Distance (BWMD) as an alternative. The BWMD avoids the shortcomings of earlier efforts, and allows us to tackle problems in variable length DNA sequence clustering. BWMD is also more adaptable to other domains, which we demonstrate on malware classification tasks. Unlike other compression-based distance metrics known to us, BWMD works by embedding sequences into a fixed-length feature vector. This allows us to provide significantly improved clustering performance on larger malware corpora, a weakness of prior methods.

* To appear in: The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

KiloGrams: Very Large N-Grams for Malware Classification

Aug 01, 2019

Edward Raff, William Fleming, Richard Zak, Hyrum Anderson, Bill Finlayson, Charles Nicholas, Mark McLean

Abstract: N-grams have been a common tool for information retrieval and machine learning applications for decades. In nearly all previous works, only a few values of $n$ are tested, with $n > 6$ being exceedingly rare. Larger values of $n$ are not tested due to computational burden or the fear of overfitting. In this work, we present a method to find the top-$k$ most frequent $n$-grams that is 60$\times$ faster for small $n$, and can tackle large $n\geq1024$. Despite the unprecedented size of $n$ considered, we show how these features still have predictive ability for malware classification tasks. More importantly, large $n$-grams provide benefits in producing features that are interpretable by malware analysts, and can be used to create general purpose signatures compatible with industry standard tools like Yara. Furthermore, the counts of common $n$-grams in a file may be added as features to publicly available human-engineered features that rival the efficacy of professionally-developed features when used to train gradient-boosted decision tree models on the EMBER dataset.

* Appearing in LEMINCS @ KDD'19, August 5th, 2019, Anchorage, Alaska, United States

Non-Negative Networks Against Adversarial Attacks

Jun 15, 2018

William Fleshman, Edward Raff, Jared Sylvester, Steven Forsyth, Mark McLean

Abstract: Adversarial attacks against Neural Networks are a problem of considerable importance, for which effective defenses are not yet readily available. We make progress toward this problem by showing that non-negative weight constraints can be used to improve resistance in specific scenarios. In particular, we show that they can provide an effective defense for binary classification problems with asymmetric cost, such as malware or spam detection. We also show how non-negativity can be leveraged to reduce an attacker's ability to perform targeted misclassification attacks in other domains such as image processing.
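One way to see why non-negative weights can help in an asymmetric setting such as malware detection is the toy logistic model below. It is only a sketch under stated assumptions: the features are non-negative indicators, the attacker can add features but not remove them, and the weights are made non-negative by taking absolute values (a real training loop might instead clip weights after each update). None of the names or values come from the paper.

```python
import numpy as np

def malicious_score(x, w, b):
    # Logistic score; higher means "more likely malicious".
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

rng = np.random.default_rng(0)
n_features = 16

# ASSUMPTION: non-negativity obtained by abs() purely for illustration.
w = np.abs(rng.normal(size=n_features))
b = -2.0

# A sample described by binary indicator features.
x = rng.integers(0, 2, size=n_features).astype(float)

# An additive attack: the adversary switches on extra features
# (for example by appending benign-looking content) but removes nothing.
x_attacked = np.maximum(x, rng.integers(0, 2, size=n_features))

before = malicious_score(x, w, b)
after = malicious_score(x_attacked, w, b)
print(after >= before)
```

Because every weight is non-negative and the attack only increases feature values, the score cannot go down, so this kind of additive modification cannot push a malicious sample toward the benign class in this toy model.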

Static Malware Detection & Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus

Jun 12, 2018

William Fleshman, Edward Raff, Richard Zak, Mark McLean, Charles Nicholas

Abstract: As machine-learning (ML) based systems for malware detection become more prevalent, it becomes necessary to quantify the benefits compared to the more traditional anti-virus (AV) systems widely used today. It is not practical to build an agreed-upon test set to benchmark malware detection systems on pure classification performance. Instead, we tackle the problem by creating a new testing methodology, where we evaluate the change in performance on a set of known benign & malicious files as adversarial modifications are performed. The change in performance combined with the evasion techniques then quantifies a system's robustness against that approach. Through these experiments we are able to show in a quantifiable way how purely ML-based systems can be more robust than AV products at detecting malware that attempts evasion through modification, but may be slower to adapt in the face of significantly novel attacks.
