Michael Pust

Rethinking Reflection in Pre-Training

Apr 05, 2025

Essential AI: Darsh J Shah, Peter Rushton, Somanshu Singla, Mohit Parmar, Kurt Smith, Yash Vanjani, Ashish Vaswani, Adarsh Chaluvaraju (+19 more)

Abstract: A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually begins to emerge much earlier, during the model's pre-training. To study this, we introduce deliberate errors into chains-of-thought and test whether the model can still arrive at the correct answer by recognizing and correcting these mistakes. By tracking performance across different stages of pre-training, we observe that this self-correcting ability appears early and improves steadily over time. For instance, an OLMo2-7B model pre-trained on 4 trillion tokens displays self-correction on our six self-reflection tasks.

Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words

Aug 16, 2018

Nelson F. Liu, Jonathan May, Michael Pust, Kevin Knight

Abstract: Most statistical machine translation systems cannot translate words that are unseen in the training data. However, humans can translate many classes of out-of-vocabulary (OOV) words (e.g., novel morphological variants, misspellings, and compounds) without context by using orthographic clues. Following this observation, we describe and evaluate several general methods for OOV translation that use only subword information. We pose the OOV translation problem as a standalone task and intrinsically evaluate our approaches on fourteen typologically diverse languages across varying resource levels. Adding OOV translators to a statistical machine translation system yields consistent BLEU gains (0.5 points on average, and up to 2.0) for all fourteen languages, especially in low-resource scenarios.

* 7 pages

Using Syntax-Based Machine Translation to Parse English into Abstract Meaning Representation

Apr 28, 2015

Michael Pust, Ulf Hermjakob, Kevin Knight, Daniel Marcu, Jonathan May

Abstract: We present a parser for Abstract Meaning Representation (AMR). We treat English-to-AMR conversion within the framework of string-to-tree, syntax-based machine translation (SBMT). To make this work, we transform the AMR structure into a form suitable for the mechanics of SBMT and useful for modeling. We introduce an AMR-specific language model and add data and features drawn from semantic resources. Our resulting AMR parser improves upon state-of-the-art results by 7 Smatch points.

* 10 pages, 8 figures
