Oliver Johnson

Saved You A Click: Automatically Answering Clickbait Titles

Dec 15, 2022

Oliver Johnson, Beicheng Lou, Janet Zhong, Andrey Kurenkov

Abstract: Clickbait articles often have a title phrased as a question or vague teaser that entices the user to click on the link and read the article to find the explanation. We developed a system that automatically finds the answer to, or explanation of, the clickbait hook in the website text, so that the user does not need to read through the text themselves. We fine-tune an extractive question-answering model (RoBERTa) and an abstractive one (T5), using data scraped from the 'StopClickbait' Facebook pages and Reddit's 'SavedYouAClick' subforum. We find that both the extractive and abstractive models improve significantly after fine-tuning. The extractive model performs slightly better according to ROUGE scores, while the abstractive one has a slight edge in terms of BERTScore.
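For readers unfamiliar with the ROUGE metric mentioned in the abstract, the following is a minimal sketch of a ROUGE-1 style F1 score, computed as unigram overlap between a candidate answer and a reference answer. It is a simplified illustration only: the function name `rouge1_f1` is hypothetical, tokenization is plain whitespace splitting, and nothing here is taken from the authors' evaluation code.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two strings.

    Assumption: lowercase + whitespace tokenization, no stemming.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A short extractive answer that is fully contained in a longer reference gets perfect precision but reduced recall, which is one reason ROUGE and embedding-based metrics such as BERTScore can rank systems differently.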


A strong converse bound for multiple hypothesis testing, with applications to high-dimensional estimation

Apr 04, 2018

Ramji Venkataramanan, Oliver Johnson

Abstract: In statistical inference problems, we wish to obtain lower bounds on the minimax risk, that is to bound the performance of any possible estimator. A standard technique to obtain risk lower bounds involves the use of Fano's inequality. In an information-theoretic setting, it is known that Fano's inequality typically does not give a sharp converse result (error lower bound) for channel coding problems. Moreover, recent work has shown that an argument based on binary hypothesis testing gives tighter results. We adapt this technique to the statistical setting, and argue that Fano's inequality can always be replaced by this approach to obtain tighter lower bounds that can be easily computed and are asymptotically sharp. We illustrate our technique in three applications: density estimation, active learning of a binary classifier, and compressed sensing, obtaining tighter risk lower bounds in each case.

* Electronic Journal of Statistics, Vol. 12, No. 1, pp. 1126-1149, 2018
* In the latest version, the value of $\lambda$ in the statements of Lemma 4.1 and Proposition 4.2 is restricted to the interval $(0,1]$. This is the correct condition, rather than the condition $\lambda>0$ stated in the journal version below.
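For context, the Fano-based bound that the abstract refers to is usually stated in the following textbook form. This is a standard statement given here as background, not the paper's own result; the symbols $V$, $Y$, $\hat{V}$ and $M$ are notation assumed for this illustration. If $V$ is uniformly distributed over $M$ hypotheses and $\hat{V}$ is any estimate of $V$ computed from the observation $Y$, then

```latex
\[
  \mathbb{P}\bigl(\hat{V} \neq V\bigr)
  \;\geq\;
  1 - \frac{I(V;Y) + \log 2}{\log M},
\]
```

where $I(V;Y)$ is the mutual information between $V$ and $Y$. The bound is only informative when $I(V;Y)$ is small relative to $\log M$, and it never forces the error probability to approach 1, which is the sense in which it fails to give a strong converse.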
