DFKI Saarbruecken
Abstract: Despite recent advances in speech recognition, accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments remains difficult. This poses a particular challenge in the search and rescue (SAR) domain, where transcribing conversations among rescue team members is crucial to support real-time decision-making. The scarcity of speech data and the background noise typical of SAR scenarios make it difficult to deploy robust speech recognition systems. To address this issue, we have created and made publicly available a German speech dataset called RescueSpeech. This dataset includes real speech recordings from simulated rescue exercises. Additionally, we have released competitive training recipes and pre-trained models. Our study indicates that the performance of current state-of-the-art methods is still far from acceptable.
Abstract: We present VOnDA, a framework for implementing dialogue management functionality in dialogue systems. Although domain-independent, VOnDA is tailored towards dialogue systems with a focus on social communication, which implies the need for long-term memory and high user adaptivity. For these systems, which are used in health environments or elderly care, the margin for error is very small and control over the dialogue process is of the utmost importance. The same holds for commercial applications, where customer trust is at risk. VOnDA's specification and memory layer relies upon (extended) RDF/OWL, which provides a universal and uniform representation and facilitates interoperability with external data sources, e.g., physical sensors.
Abstract: We present an implemented compilation algorithm that translates HPSG into lexicalized feature-based TAG, relating concepts of the two theories. While HPSG has a more elaborate principle-based theory of possible phrase structures, TAG provides the means to represent lexicalized structures more explicitly. Our objectives are met by giving clear definitions that determine the projection of structures from the lexicon and identify maximal projections, auxiliary trees, and foot nodes.
Abstract: The natural language system DISCO is described. It combines: a powerful and flexible grammar development system; linguistic competence for German, including morphology, syntax and semantics; new methods for linguistic performance modelling on the basis of high-level competence grammars; new methods for modelling multi-agent dialogue competence; and an interesting sample application for appointment scheduling and calendar management.