Title: (Almost) All of Entity Resolution

Aug 10, 2020

Olivier Binette, Rebecca C. Steorts

Abstract: Whether the goal is to estimate the number of people who live in a congressional district, to estimate the number of individuals who have died in an armed conflict, or to disambiguate individual authors using bibliographic data, all of these applications share a common theme: integrating information from multiple sources. Before such questions can be answered, databases must be cleaned and integrated in a systematic and accurate way, a task commonly known as record linkage, de-duplication, or entity resolution. In this article, we review motivational applications and seminal papers that have led to the growth of this area. Specifically, we review the foundational work that began in the 1940s and 1950s and that has led to modern probabilistic record linkage. We review clustering approaches to entity resolution, semi- and fully supervised methods, and canonicalization, which are used throughout industry and academia in applications such as human rights, official statistics, medicine, and citation networks, among others. Finally, we discuss current research topics of practical importance.

* 53 pages, includes supplementary materials
