John Brock

Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization

Jun 17, 2019

Ari Azarafrooz, John Brock

Abstract: We describe a novel extension of soft actor-critic for hierarchical Deep Q-Network (HDQN) architectures using a mutual information metric. The proposed extension provides a suitable framework for encouraging exploration in such hierarchical networks. A natural use of this framework is an adversarial setting, where the meta-controller and controller play a minimax game over the mutual information objective but cooperate on maximizing expected rewards.

* Presented at the ICML 2019 workshop on Imitation, Intent, and Interaction, Long Beach, CA, USA

Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding

Dec 17, 2018

Ari Azarafrooz, John Brock

Abstract: Measuring the similarity of two files is an important task in malware analysis, with fuzzy hash functions being a popular approach. Traditional fuzzy hash functions are data agnostic: they do not learn from a particular dataset how to determine similarity; their behavior is fixed across all datasets. In this paper, we demonstrate that fuzzy hash functions can be learned in a novel minimax training framework and that these learned fuzzy hash functions outperform traditional fuzzy hash functions at the file similarity task for Portable Executable files. In our approach, hash digests can be extracted from the kernel embeddings of two kernel networks, trained in a minimax framework, where the roles of the players during training (i.e., adversary versus generator) alternate along with the input data. We refer to this new minimax architecture as perturbation-consistent. The similarity score for a pair of files is the utility of the minimax game in equilibrium. Our experiments show that learned fuzzy hash functions generalize well and can determine that two files are similar even when one of those files was generated using insertion and deletion operations.
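
To make the notion of a "data agnostic" similarity measure concrete, here is a minimal sketch of one fixed, non-learned baseline: Jaccard similarity over byte n-grams. This is not the paper's method and not any particular fuzzy hashing tool; the function names, the n-gram size, and the synthetic byte strings are all assumptions chosen for illustration. It only shows how a fixed rule can still score two inputs as similar after a small insertion and deletion.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all length-n byte windows in data (hypothetical helper)."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Fixed, data-agnostic similarity in [0, 1]: shared n-grams over all n-grams."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to have n-grams are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Synthetic stand-ins for file contents (assumption: real inputs would be file bytes).
original = bytes(range(256)) * 4
# Insert three bytes near the start and delete twenty bytes further on.
modified = original[:100] + b"\x90\x90\x90" + original[100:500] + original[520:]
# A byte sequence with a different local structure, sharing no 4-grams with original.
unrelated = bytes((i * 7 + 3) % 256 for i in range(1024))

sim_modified = jaccard_similarity(original, modified)
sim_unrelated = jaccard_similarity(original, unrelated)
print(f"original vs modified:  {sim_modified:.3f}")
print(f"original vs unrelated: {sim_unrelated:.3f}")
```

The rule above behaves the same on every dataset, which is the limitation the abstract points to: a learned fuzzy hash would instead adapt its notion of similarity to the files it is trained on.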
