Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Illia Polosukhin

NAPS: Natural Program Synthesis Dataset

Jul 06, 2018

Maksym Zavershynskyi, Alex Skidanov, Illia Polosukhin

Figure 1 for NAPS: Natural Program Synthesis Dataset

Figure 2 for NAPS: Natural Program Synthesis Dataset

Figure 3 for NAPS: Natural Program Synthesis Dataset

Figure 4 for NAPS: Natural Program Synthesis Dataset

Abstract:We present a program synthesis-oriented dataset consisting of human written problem statements and solutions for these problems. The problem statements were collected via crowdsourcing and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for the program synthesis tasks aimed for working with real user-generated data. As a baseline we present few models, with the best model achieving 8.8% accuracy, showcasing both the complexity of the dataset and large room for future research.

* 4 pages, 5 tables in 2nd Workshop on Neural Abstract Machines & Program Induction (NAMPI), @ICML 2018

Via

Access Paper or Ask Questions

Neural Program Search: Solving Programming Tasks from Description and Examples

Feb 12, 2018

Illia Polosukhin, Alexander Skidanov

Figure 1 for Neural Program Search: Solving Programming Tasks from Description and Examples

Figure 2 for Neural Program Search: Solving Programming Tasks from Description and Examples

Figure 3 for Neural Program Search: Solving Programming Tasks from Description and Examples

Figure 4 for Neural Program Search: Solving Programming Tasks from Description and Examples

Abstract:We present a Neural Program Search, an algorithm to generate programs from natural language description and a small number of input/output examples. The algorithm combines methods from Deep Learning and Program Synthesis fields by designing rich domain-specific language (DSL) and defining efficient search algorithm guided by a Seq2Tree model on it. To evaluate the quality of the approach we also present a semi-synthetic dataset of descriptions with test examples and corresponding programs. We show that our algorithm significantly outperforms a sequence-to-sequence model with attention baseline.

* 9 pages, 3 figures, ICLR workshop

Via

Access Paper or Ask Questions

Attention Is All You Need

Dec 06, 2017

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

Abstract:The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.

* 15 pages, 5 figures

Via

Access Paper or Ask Questions

TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Aug 08, 2017

Heng-Tze Cheng, Zakaria Haque, Lichan Hong, Mustafa Ispir, Clemens Mewald, Illia Polosukhin, Georgios Roumpos, D Sculley, Jamie Smith, David Soergel(+5 more)

Figure 1 for TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Figure 2 for TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Figure 3 for TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Figure 4 for TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Abstract:We present a framework for specifying, training, evaluating, and deploying machine learning models. Our focus is on simplifying cutting edge machine learning for practitioners in order to bring such technologies into production. Recognizing the fast evolution of the field of deep learning, we make no attempt to capture the design space of all possible model architectures in a domain- specific language (DSL) or similar configuration language. We allow users to write code to define their models, but provide abstractions that guide develop- ers to write models in ways conducive to productionization. We also provide a unifying Estimator interface, making it possible to write downstream infrastructure (e.g. distributed training, hyperparameter tuning) independent of the model implementation. We balance the competing demands for flexibility and simplicity by offering APIs at different levels of abstraction, making common model architectures available out of the box, while providing a library of utilities designed to speed up experimentation with model architectures. To make out of the box models flexible and usable across a wide range of problems, these canned Estimators are parameterized not only over traditional hyperparameters, but also using feature columns, a declarative specification describing how to interpret input data. We discuss our experience in using this framework in re- search and production environments, and show the impact on code health, maintainability, and development speed.

* 8 pages, Appeared at KDD 2017, August 13--17, 2017, Halifax, NS, Canada

Via

Access Paper or Ask Questions

WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

Mar 15, 2017

Daniel Hewlett, Alexandre Lacoste, Llion Jones, Illia Polosukhin, Andrew Fandrianto, Jay Han, Matthew Kelcey, David Berthelot

Figure 1 for WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

Figure 2 for WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

Figure 3 for WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

Figure 4 for WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

Abstract:We present WikiReading, a large-scale natural language understanding task and publicly-available dataset with 18 million instances. The task is to predict textual values from the structured knowledge base Wikidata by reading the text of the corresponding Wikipedia articles. The task contains a rich variety of challenging classification and extraction sub-tasks, making it well-suited for end-to-end models such as deep neural networks (DNNs). We compare various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering. We find that models supporting a rich answer space, such as word or character sequences, perform best. Our best-performing model, a word-level sequence to sequence model with a mechanism to copy out-of-vocabulary words, obtains an accuracy of 71.8%.

* Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, pp. 1535-1545

Via

Access Paper or Ask Questions

Hierarchical Question Answering for Long Documents

Feb 08, 2017

Eunsol Choi, Daniel Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant

Figure 1 for Hierarchical Question Answering for Long Documents

Figure 2 for Hierarchical Question Answering for Long Documents

Figure 3 for Hierarchical Question Answering for Long Documents

Figure 4 for Hierarchical Question Answering for Long Documents

Abstract:We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models. While most successful approaches for reading comprehension rely on recurrent neural networks (RNNs), running them over long documents is prohibitively slow because it is difficult to parallelize over sequences. Inspired by how people first skim the document, identify relevant parts, and carefully read these parts to produce an answer, we combine a coarse, fast model for selecting relevant sentences and a more expensive RNN for producing the answer from those sentences. We treat sentence selection as a latent variable trained jointly from the answer only using reinforcement learning. Experiments demonstrate the state of the art performance on a challenging subset of the Wikireading and on a new dataset, while speeding up the model by 3.5x-6.7x.

Via

Access Paper or Ask Questions