Abstract:Extraction of concepts and entities of interest from non-formal texts such as social media posts and informal communication is an important capability for decision support systems in many domains, including healthcare, customer relationship management, and others. Despite the recent advances in training large language models for a variety of natural language processing tasks, the developed models and techniques have mainly focused on formal texts and do not perform as well on colloquial data, which is characterized by a number of distinct challenges. In our research, we focus on the healthcare domain and investigate the problem of symptom recognition from colloquial texts by designing and evaluating several training strategies for BERT-based model fine-tuning. These strategies are distinguished by the choice of the base model, the training corpora, and application of term perturbations in the training data. The best-performing models trained using these strategies outperform the state-of-the-art specialized symptom recognizer by a large margin. Through a series of experiments, we have found specific patterns of model behavior associated with the training strategies we designed. We present design principles for training strategies for effective entity recognition in colloquial texts based on our findings.
Abstract:We consider the problem of reasoning and planning with incomplete knowledge and deterministic actions. We introduce a knowledge representation scheme called PSIPLAN that can effectively represent incompleteness of an agent's knowledge while allowing for sound, complete and tractable entailment in domains where the set of all objects is either unknown or infinite. We present a procedure for state update resulting from taking an action in PSIPLAN that is correct, complete and has only polynomial complexity. State update is performed without considering the set of all possible worlds corresponding to the knowledge state. As a result, planning with PSIPLAN is done without direct manipulation of possible worlds. PSIPLAN representation underlies the PSIPOP planning algorithm that handles quantified goals with or without exceptions that no other domain independent planner has been shown to achieve. PSIPLAN has been implemented in Common Lisp and used in an application on planning in a collaborative interface.