Guillermo Moncecchi

A Crowd-Annotated Spanish Corpus for Humor Analysis

Jul 19, 2018

Santiago Castro, Luis Chiruzzo, Aiala Rosá, Diego Garat, Guillermo Moncecchi

Abstract: Computational Humor involves several tasks, such as humor recognition, humor generation, and humor scoring, for which it is useful to have human-curated data. In this work we present a corpus of 27,000 tweets written in Spanish and crowd-annotated by their humor value and funniness score, with about four annotations per tweet, tagged by 1,300 people over the Internet. It is equally divided between tweets coming from humorous and non-humorous accounts. The inter-annotator agreement Krippendorff's alpha value is 0.5710. The dataset is available for general use and can serve as a basis for humor detection and as a first step to tackle subjectivity.

* Camera-ready version of the paper submitted to SocialNLP 2018, with a fixed typo

Is This a Joke? Detecting Humor in Spanish Tweets

Mar 28, 2017

Santiago Castro, Matías Cubero, Diego Garat, Guillermo Moncecchi

Abstract: While humor has been historically studied from a psychological, cognitive and linguistic standpoint, its study from a computational perspective is an area yet to be explored in Computational Linguistics. There exist some previous works, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to its humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%.

* Presented at Iberamia 2016. The final publication is available at link.springer.com: https://link.springer.com/chapter/10.1007%2F978-3-319-47955-2_12
* Preprint version, without referral
