Abstract:HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in English and in the document languages, and graded relevance judgments. New test collections are needed because existing CLIR test collections, built by pooling traditional CLIR runs, have systematic gaps in their relevance judgments when used to evaluate neural CLIR methods. The HC4 collections contain 60 topics and about half a million documents for each of Chinese and Persian, and 54 topics and five million documents for Russian. Active learning, seeded with interactive search and judgment, was used to determine which documents to annotate. Documents were judged on a three-grade relevance scale. This paper describes the design and construction of the new test collections and provides baseline results that demonstrate their utility for evaluating systems.
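The annotation workflow described above (seed judgments from interactive search, then active learning to choose which documents to judge next on a three-grade scale) can be illustrated with a minimal uncertainty-sampling loop. This is a hypothetical sketch, not the HC4 builders' actual pipeline: the TF-IDF features, logistic-regression classifier, and the select_for_annotation helper are all assumptions made for illustration.

```python
# Hypothetical sketch of an active-learning annotation loop (uncertainty
# sampling); NOT the actual HC4 construction pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def select_for_annotation(seed_docs, seed_labels, pool_docs, batch_size=20):
    """Return indices of pool documents whose relevance grade is most uncertain.

    seed_docs/seed_labels come from interactive search and judgment;
    labels are the three relevance grades (0, 1, 2).
    """
    vectorizer = TfidfVectorizer(max_features=50_000)
    X_seed = vectorizer.fit_transform(seed_docs)
    X_pool = vectorizer.transform(pool_docs)

    # Fit a simple multiclass relevance classifier on the seed judgments.
    clf = LogisticRegression(max_iter=1000).fit(X_seed, seed_labels)

    # Uncertainty = 1 - max class probability; judge the least certain first.
    probs = clf.predict_proba(X_pool)
    uncertainty = 1.0 - probs.max(axis=1)
    return np.argsort(-uncertainty)[:batch_size]
```

In practice such a loop alternates selection and judging: the newly judged batch is added to the seed set and the classifier is refit before the next selection.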
Abstract:The records of a clinical encounter can be extensive and complex, thus placing a premium on tools that can extract and summarize relevant information. This paper introduces the task of generating discharge summaries for a clinical encounter. Summaries in this setting need to be faithful and traceable, and to scale to multiple long documents, motivating the use of extract-then-abstract summarization cascades. We introduce two new measures, faithfulness and hallucination rate, for evaluation in this task, which complement existing measures of fluency and informativeness. Results across seven medical sections and five models show that a summarization architecture supporting traceability yields promising results, and that a sentence-rewriting approach performs consistently on the faithfulness measure (faithfulness-adjusted $F_3$) over a diverse range of generated sections.
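For reference, $F_\beta$ scores of this kind conventionally follow the standard formula below, with $\beta = 3$ weighting recall three times as heavily as precision; that the paper's faithfulness-adjusted $F_3$ instantiates this form with faithfulness-based precision $P$ and recall $R$ is an assumption here, not a definition taken from the abstract.

$$F_\beta = (1 + \beta^2)\,\frac{P \cdot R}{\beta^2\,P + R}, \qquad F_3 = \frac{10\,P\,R}{9\,P + R}.$$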