Abstract:The COVID-19 pandemic has put immense pressure on health systems which are further strained due to the misinformation surrounding it. Under such a situation, providing the right information at the right time is crucial. There is a growing demand for the management of information spread using Artificial Intelligence. Hence, we have exploited the potential of Natural Language Processing for identifying relevant information that needs to be disseminated amongst the masses. In this work, we present a novel Cross-lingual Natural Language Processing framework to provide relevant information by matching daily news with trusted guidelines from the World Health Organization. The proposed pipeline deploys various techniques of NLP such as summarizers, word embeddings, and similarity metrics to provide users with news articles along with a corresponding healthcare guideline. A total of 36 models were evaluated and a combination of LexRank based summarizer on Word2Vec embedding with Word Mover distance metric outperformed all other models. This novel open-source approach can be used as a template for proactive dissemination of relevant healthcare information in the midst of misinformation spread associated with epidemics.
Abstract:Proactive management of an Infodemic that grows faster than the underlying epidemic is a modern-day challenge. This requires raising awareness and sensitization with the correct information in order to prevent and contain outbreaks such as the ongoing COVID-19 pandemic. Therefore, there is a fine balance between continuous awareness-raising by providing new information and the risk of misinformation. In this work, we address this gap by creating a life-long learning application that delivers authentic information to users in Hindi and English, the most widely used languages in India. It does this by matching sources of verified and authentic information such as the WHO reports against daily news by using machine learning and natural language processing. It delivers the narrated content in Hindi by using state-of-the-art text to speech engines. Finally, the approach allows user input for continuous improvement of news feed relevance daily. We demonstrate this approach for Water, Sanitation, Hygiene information for containment of the COVID-19 pandemic. Thirteen combinations of pre-processing strategies, word-embeddings, and similarity metrics were evaluated by eight human users via calculation of agreement statistics. The best performing combination achieved a Cohen's Kappa of 0.54 and was deployed as On AIr, WashKaro's AI-powered back-end. We introduced a novel way of contact tracing, deploying the Bluetooth sensors of an individual's smartphone and automatic recording of physical interactions with other users. Additionally, the application also features a symptom self-assessment tool based on WHO-approved guidelines, human-curated and vetted information to reach out to the community as audio-visual content in local languages. WashKaro - http://tiny.cc/WashKaro