Andrew Gordon

Prevalence and prevention of large language model use in crowd work

Oct 24, 2023

Veniamin Veselovsky, Manoel Horta Ribeiro, Philip Cozzolino, Andrew Gordon, David Rothschild, Robert West

Abstract: We show that the use of large language models (LLMs) is prevalent among crowd workers, and that targeted mitigation strategies can significantly reduce, but not eliminate, LLM use. On a text summarization task where workers were not directed in any way regarding their LLM use, the estimated prevalence of LLM use was around 30%, but was reduced by about half by asking workers to not use LLMs and by raising the cost of using them, e.g., by disabling copy-pasting. Secondary analyses give further insight into LLM use and its prevention: LLM use yields high-quality but homogeneous responses, which may harm research concerned with human (rather than model) behavior and degrade future models trained with crowdsourced data. At the same time, preventing LLM use may be at odds with obtaining high-quality responses; e.g., when requesting workers not to use LLMs, summaries contained fewer keywords carrying essential information. Our estimates will likely change as LLMs increase in popularity or capabilities, and as norms around their usage change. Yet, understanding the co-evolution of LLM-based tools and users is key to maintaining the validity of research done using crowdsourcing, and we provide a critical baseline before widespread adoption ensues.

* VV and MHR equal contribution. 14 pages, 1 figure, 1 table

Rows from Many Sources: Enriching row completions from Wikidata with a pre-trained Language Model

Apr 14, 2022

Carina Negreanu, Alperen Karaoglu, Jack Williams, Shuang Chen, Daniel Fabian, Andrew Gordon, Chin-Yew Lin

Abstract: Row completion is the task of augmenting a given table of text and numbers with additional, relevant rows. The task divides into two steps: subject suggestion, the task of populating the main column; and gap filling, the task of populating the remaining columns. We present state-of-the-art results for subject suggestion and gap filling measured on a standard benchmark (WikiTables). Our idea is to solve this task by harmoniously combining knowledge base table interpretation and free text generation. We interpret the table using the knowledge base to suggest new rows and generate metadata like headers through property linking. To improve candidate diversity, we synthesize additional rows using free text generation via GPT-3, and crucially, we exploit the metadata we interpret to produce better prompts for text generation. Finally, we verify that the additional synthesized content can be linked to the knowledge base or a trusted web source such as Wikipedia.
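
The two-step split described in the abstract (subject suggestion, then gap filling) can be illustrated with a minimal sketch. This is not the authors' system: the in-memory `TOY_KB`, the `instance_of` type check, and the function names are all hypothetical stand-ins for a real knowledge base such as Wikidata, the column properties are supplied by hand rather than inferred by property linking, and no text generation or verification step is included.

```python
# Hypothetical toy knowledge base: entity -> {property: value}.
# A real system would query a knowledge base such as Wikidata instead.
TOY_KB = {
    "France": {"instance_of": "country", "capital": "Paris"},
    "Italy": {"instance_of": "country", "capital": "Rome"},
    "Spain": {"instance_of": "country", "capital": "Madrid"},
    "Paris": {"instance_of": "city"},
}


def suggest_subjects(table, kb):
    """Step 1 (subject suggestion): propose new main-column entities
    that share a type with the subjects already in the table."""
    existing = {row[0] for row in table}
    types = {kb[s]["instance_of"] for s in existing if s in kb}
    return sorted(
        entity
        for entity, props in kb.items()
        if props.get("instance_of") in types and entity not in existing
    )


def fill_gaps(subject, properties, kb):
    """Step 2 (gap filling): populate the remaining columns from the
    knowledge base, leaving None where no value is known."""
    return [subject] + [kb[subject].get(prop) for prop in properties]


def complete_rows(table, properties, kb):
    """Run both steps and return the suggested additional rows."""
    return [fill_gaps(s, properties, kb) for s in suggest_subjects(table, kb)]


table = [["France", "Paris"]]
new_rows = complete_rows(table, ["capital"], TOY_KB)
```

Here the second column is assumed to correspond to the hypothetical `capital` property; "Paris" is not suggested as a subject because its type differs from that of the existing rows.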

The Wreath Process: A totally generative model of geometric shape based on nested symmetries

Jun 09, 2015

Diana Borsa, Thore Graepel, Andrew Gordon

Abstract: We consider the problem of modelling noisy but highly symmetric shapes that can be viewed as hierarchies of whole-part relationships in which higher level objects are composed of transformed collections of lower level objects. To this end, we propose the stochastic wreath process, a fully generative probabilistic model of drawings. Following Leyton's "Generative Theory of Shape", we represent shapes as sequences of transformation groups composed through a wreath product. This representation emphasizes the maximization of transfer, the idea that the most compact and meaningful representation of a given shape is achieved by maximizing the re-use of existing building blocks or parts. The proposed stochastic wreath process extends Leyton's theory by defining a probability distribution over geometric shapes in terms of noise processes that are aligned with the generative group structure of the shape. We propose an inference scheme for recovering the generative history of given images in terms of the wreath process using reversible jump Markov chain Monte Carlo methods and Approximate Bayesian Computation. In the context of sketching we demonstrate the feasibility and limitations of this approach on model-generated and real data.
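
The idea of building a shape from nested transformation groups can be sketched with a square: an inner translation sweeps a point into one side, and an outer group of quarter-turn rotations transfers that side to the other three. This is only a deterministic illustration of nested symmetry and re-use of parts, under assumed function names and a discretized translation; it contains no noise process and is not the stochastic wreath process or the inference scheme from the paper.

```python
import math


def rotate(point, angle):
    """Rotate a 2D point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def translate_along_side(start, direction, steps):
    """Inner level: sweep a point along one side by repeated translation
    (a discretized stand-in for a continuous translation group)."""
    sx, sy = start
    dx, dy = direction
    return [(sx + dx * k / steps, sy + dy * k / steps) for k in range(steps)]


def square(steps=4):
    """Outer level: four quarter-turn rotations re-use the same side."""
    side = translate_along_side((1.0, -1.0), (0.0, 2.0), steps)
    return [rotate(p, k * math.pi / 2) for k in range(4) for p in side]


points = square(steps=4)
```

Only one side is ever described explicitly; the remaining three are obtained by transferring it with the outer group, which is the sense in which the representation re-uses existing parts.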

* 10 pages (double-column), 60+ figures
