Remi Eyraud

Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach

Sep 28, 2020

Remi Eyraud, Stephane Ayache

Abstract: This paper attempts to bridge the gap between deep learning and grammatical inference. It provides an algorithm to extract a (stochastic) formal language from any recurrent neural network (RNN) trained for language modelling. The algorithm uses the already trained network as an oracle, and thus does not require access to the inner representation of the black box, and applies a spectral approach to infer a weighted automaton (WA). As weighted automata compute linear functions, they are computationally more efficient than neural networks, so the approach amounts to knowledge distillation. We detail experiments on 62 data sets (both synthetic and from real-world applications) that allow an in-depth study of the abilities of the proposed algorithm. The results show that the WA we extract are good approximations of the RNN, validating the approach. Moreover, we show how the process provides interesting insights into the behavior of RNNs learned on data, extending the scope of this work to the explainability of deep learning models.
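The statement that weighted automata compute linear functions can be illustrated with a minimal sketch. It uses the standard definition of a WA (an initial vector, one transition matrix per symbol, a final vector) and is not taken from the paper; the function name `wa_value` and the two-state automaton below are hypothetical, chosen only for illustration.

```python
import numpy as np

def wa_value(alpha, transitions, beta, word):
    """Value assigned to `word`: alpha^T A_{w1} ... A_{wn} beta."""
    state = alpha  # row vector of initial weights
    for symbol in word:
        # Reading one symbol is one vector-matrix product (a linear map).
        state = state @ transitions[symbol]
    return float(state @ beta)

# Hypothetical 2-state stochastic WA over the alphabet {a, b}.
# In each state, the stopping weight plus the outgoing weights sum to 1.
alpha = np.array([1.0, 0.0])
transitions = {
    "a": np.array([[0.5, 0.0], [0.0, 0.0]]),
    "b": np.array([[0.0, 0.25], [0.0, 0.5]]),
}
beta = np.array([0.25, 0.5])

p_empty = wa_value(alpha, transitions, beta, "")    # weight of the empty word
p_ab = wa_value(alpha, transitions, beta, "ab")     # weight of "ab"
p_ba = wa_value(alpha, transitions, beta, "ba")     # this WA gives "ba" weight 0
```

Evaluating a word costs one small matrix-vector product per symbol, with no nonlinearity, which is the sense in which a WA is cheaper to run than an RNN.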

Learning with Partially Ordered Representations

Jun 23, 2019

Jane Chandlee, Remi Eyraud, Jeffrey Heinz, Adam Jardine, Jonathan Rawski

Abstract: This paper examines the characterization and learning of grammars defined with enriched representational models. Model-theoretic approaches to formal language theory traditionally assume that each position in a string belongs to exactly one unary relation. We consider unconventional string models where positions can have multiple, shared properties, which are arguably useful in many applications. We show that the structures given by these models are partially ordered, and present a learning algorithm that exploits this ordering relation to effectively prune the hypothesis space. We prove that this learning algorithm, which takes positive examples as input, finds the most general grammar that covers the data.

* To appear in Proceedings of Mathematics of Language (ACL SIGMOL 2019)

Explaining Black Boxes on Sequential Data using Weighted Automata

Oct 12, 2018

Stephane Ayache, Remi Eyraud, Noe Goudian

Abstract: Understanding how a learned black box works is of crucial interest for the future of Machine Learning. In this paper, we pioneer the question of the global interpretability of learned black box models that assign numerical values to symbolic sequential data. To tackle this task, we propose a spectral algorithm for the extraction of weighted automata (WA) from such black boxes. This algorithm does not require access to a dataset or to the inner representation of the black box: the inferred model can be obtained solely by querying the black box, feeding it with inputs and analyzing its outputs. Experiments using Recurrent Neural Networks (RNN) trained on a wide collection of 48 synthetic datasets and 2 real datasets show that the obtained approximation is of great quality.
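The query-only extraction described in this abstract can be sketched with the textbook spectral method for weighted automata: fill a Hankel matrix by querying the black box, take a truncated SVD, and read off the automaton parameters. This is a minimal sketch under stated assumptions, not the authors' implementation. The basis choice (all words up to a fixed length), the names `extract_wa`, `words_up_to`, `wa_value`, and the closed-form `toy_black_box` standing in for a trained RNN are all hypothetical.

```python
import itertools
import numpy as np

def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length <= max_len, empty word first."""
    return ["".join(t) for n in range(max_len + 1)
            for t in itertools.product(alphabet, repeat=n)]

def extract_wa(black_box, alphabet, basis_len, rank):
    """Infer a WA using only queries black_box(word) -> float."""
    prefixes = suffixes = words_up_to(alphabet, basis_len)

    def query(middle):
        # Hankel block with entries f(prefix + middle + suffix).
        return np.array([[black_box(p + middle + s) for s in suffixes]
                         for p in prefixes])

    hankel = query("")
    # Rank-truncated SVD: keep the top `rank` right singular vectors.
    _, _, vt = np.linalg.svd(hankel, full_matrices=False)
    v = vt[:rank].T
    forward_pinv = np.linalg.pinv(hankel @ v)
    alpha = hankel[0] @ v               # row of the empty prefix
    beta = forward_pinv @ hankel[:, 0]  # column of the empty suffix
    transitions = {a: forward_pinv @ query(a) @ v for a in alphabet}
    return alpha, transitions, beta

def wa_value(alpha, transitions, beta, word):
    """Value the extracted WA assigns to `word`."""
    state = alpha
    for symbol in word:
        state = state @ transitions[symbol]
    return float(state @ beta)

def toy_black_box(word):
    # Hypothetical oracle standing in for a trained RNN: weight
    # 0.25 * 0.5^len(word) on words of the form a...ab...b, else 0.
    return 0.0 if "ba" in word else 0.25 * 0.5 ** len(word)

alpha, transitions, beta = extract_wa(toy_black_box, "ab", basis_len=1, rank=2)
approx = wa_value(alpha, transitions, beta, "aab")
```

Because the toy oracle is itself computable by a two-state WA, a rank-2 extraction reproduces it up to floating-point error. With a real RNN as the oracle the Hankel matrix is only approximately low rank, so the basis and the rank become hyperparameters that trade model size against approximation quality.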

* Published in the Proceedings of the International Conference on Grammatical Inference, September 2018
