Abstract:Textual documents need to be of good quality to ensure effective asynchronous communication in remote areas, especially during the COVID-19 pandemic. However, defining a preferred document structure (content and arrangement) for improving lay readers' decision-making is challenging. First, the types of useful content for various readers cannot be determined simply by gathering expert knowledge. Second, methodologies to evaluate the document's usefulness from the user's perspective have not been established. This study proposed the experimental framework to identify useful contents of documents by aggregating lay readers' insights. This study used 200 online recipes as research subjects and recruited 1,340 amateur cooks as lay readers. The proposed framework identified six useful contents of recipes. Multi-level modeling then showed that among the six identified contents, suitable ingredients or notes arranged with a subheading at the end of each cooking step significantly increased recipes' usefulness. Our framework contributes to the communication design via documents.
Abstract:Pre-editing is the process of modifying the source text (ST) so that it can be translated by machine translation (MT) in a better quality. Despite the unpredictability of black-box neural MT (NMT), pre-editing has been deployed in various practical MT use cases. Although many studies have demonstrated the effectiveness of pre-editing methods for particular settings, thus far, a deep understanding of what pre-editing is and how it works for black-box NMT is lacking. To elicit such understanding, we extensively investigated human pre-editing practices. We first implemented a protocol to incrementally record the minimum edits for each ST and collected 6,652 instances of pre-editing across three translation directions, two MT systems, and four text domains. We then analysed the instances from three perspectives: the characteristics of the pre-edited ST, the diversity of pre-editing operations, and the impact of the pre-editing operations on NMT outputs. Our findings include the following: (1) enhancing the explicitness of the meaning of an ST and its syntactic structure is more important for obtaining better translations than making the ST shorter and simpler, and (2) although the impact of pre-editing on NMT is generally unpredictable, there are some tendencies of changes in the NMT outputs depending on the editing operation types.