Tzvika Hartman

A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy

May 22, 2019

Genady Beryozkin, Yoel Drori, Oren Gilon, Tzvika Hartman, Idan Szpektor

Abstract: We study a variant of domain adaptation for named-entity recognition where multiple, heterogeneously tagged training sets are available. Furthermore, the test tag-set is not identical to any individual training tag-set. Yet, the relations between all tags are provided in a tag hierarchy, covering the test tags as a combination of training tags. This setting occurs when various datasets are created using different annotation schemes. This is also the case of extending a tag-set with a new tag by annotating only the new tag in a new dataset. We propose to use the given tag hierarchy to jointly learn a neural network that shares its tagging layer among all tag-sets. We compare this model to combining independent models and to a model based on the multitasking approach. Our experiments show the benefit of the tag-hierarchy model, especially when facing non-trivial consolidation of tag-sets.

* Accepted at ACL 2019
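
The idea of a tag hierarchy that covers one tag-set as a combination of others can be illustrated with a small sketch. It is not the paper's model: the hierarchy, the tag names, and the `leaves` and `project` helpers are all hypothetical, and the sketch only assumes that a shared tagging layer yields one probability per fine-grained tag, which is then summed into whichever tag-set a dataset uses.

```python
from typing import Dict, List

# Hypothetical hierarchy (an assumption for illustration): each coarse tag
# maps to the fine-grained tags it covers. Tags absent from the keys are leaves.
HIERARCHY: Dict[str, List[str]] = {
    "NAME": ["DOCTOR", "PATIENT"],
    "LOCATION": ["CITY", "STREET"],
}


def leaves(tag: str) -> List[str]:
    """Return the fine-grained (leaf) tags covered by `tag`."""
    children = HIERARCHY.get(tag)
    if not children:
        return [tag]
    covered: List[str] = []
    for child in children:
        covered.extend(leaves(child))
    return covered


def project(fine_probs: Dict[str, float], tag_set: List[str]) -> Dict[str, float]:
    """Sum fine-grained probabilities into each tag of the given tag-set."""
    return {
        tag: sum(fine_probs.get(leaf, 0.0) for leaf in leaves(tag))
        for tag in tag_set
    }


# One token's fine-grained distribution, scored against a mixed tag-set
# that no single hypothetical training set uses directly.
fine_probs = {"DOCTOR": 0.5, "PATIENT": 0.25, "CITY": 0.125, "STREET": 0.125}
projected = project(fine_probs, ["NAME", "CITY", "STREET"])
```

Under these assumptions, the same fine-grained scores can be projected onto any tag-set described by the hierarchy, which is the property the abstract relies on when the test tag-set differs from every training tag-set.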

Audio De-identification: A New Entity Recognition Task

Mar 17, 2019

Ido Cohn, Itay Laish, Genady Beryozkin, Gang Li, Izhak Shafran, Idan Szpektor, Tzvika Hartman, Avinatan Hassidim, Yossi Matias

Abstract: Named Entity Recognition (NER) has been mostly studied in the context of written text. Specifically, NER is an important step in de-identification (de-ID) of medical records, many of which are recorded conversations between a patient and a doctor. In such recordings, audio spans with personal information should be redacted, similar to the redaction of sensitive character spans in de-ID for written text. The application of NER in the context of audio de-identification has yet to be fully investigated. To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected. We then present our pipeline for this task, which involves Automatic Speech Recognition (ASR), NER on the transcript text, and text-to-audio alignment. Finally, we introduce a novel metric for audio de-ID and a new evaluation benchmark consisting of a large labeled segment of the Switchboard and Fisher audio datasets and detail our pipeline's results on it.

* Accepted to NAACL 2019 Industry Track
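
The last stage of the pipeline described in the abstract, text-to-audio alignment, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the ASR stage returns per-word timestamps and the NER stage returns character spans over a space-joined transcript. The `Word` type and both helper functions are hypothetical names, and the ASR and NER stages themselves are not implemented here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One ASR output word with assumed start/end times in seconds."""
    text: str
    start: float
    end: float


def build_transcript(words: List[Word]) -> Tuple[str, List[Tuple[int, int]]]:
    """Join words with single spaces and record each word's character offsets."""
    offsets: List[Tuple[int, int]] = []
    pos = 0
    for word in words:
        offsets.append((pos, pos + len(word.text)))
        pos += len(word.text) + 1  # +1 for the separating space
    return " ".join(word.text for word in words), offsets


def char_spans_to_audio(
    words: List[Word],
    offsets: List[Tuple[int, int]],
    char_spans: List[Tuple[int, int]],
) -> List[Tuple[float, float]]:
    """Map each [start, end) character span to the audio span of the words it overlaps."""
    audio_spans: List[Tuple[float, float]] = []
    for span_start, span_end in char_spans:
        hits = [
            word
            for word, (w_start, w_end) in zip(words, offsets)
            if w_start < span_end and w_end > span_start
        ]
        if hits:
            audio_spans.append((hits[0].start, hits[-1].end))
    return audio_spans


# Made-up ASR output; the entity span (5, 15) stands in for an NER prediction.
words = [
    Word("call", 0.0, 0.5),
    Word("john", 0.5, 1.0),
    Word("smith", 1.0, 1.5),
    Word("today", 1.5, 2.0),
]
transcript, offsets = build_transcript(words)
redact = char_spans_to_audio(words, offsets, [(5, 15)])
```

Expanding each entity to whole-word boundaries is a design choice of this sketch: it errs toward redacting slightly more audio rather than leaving part of a name audible.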
