Abstract: In this work, we modify and apply self-supervision techniques to the domain of medical health insurance claims. We model a patient's healthcare claims history as analogous to a free-text narrative and introduce pre-trained 'prior knowledge', which is later used for patient outcome prediction on a challenging task: predicting Covid-19 hospitalization given a patient's pre-Covid-19 insurance claims history. Results suggest that pre-training on insurance claims not only yields better prediction performance but, more importantly, improves the model's 'clinical trustworthiness' and its stability and reliability.
Abstract: We are losing biodiversity at an unprecedented scale, and in many cases we do not even have basic data on the species. Traditional methods for wildlife monitoring are inadequate. The development of new computer vision tools enables the use of images as a source of information about wildlife. Social media is a rich source of wildlife images, but these images come with a large bias that thwarts traditional approaches to population size estimation. Here, we present a new framework that accounts for social media bias when using this data source to estimate wildlife population sizes. We show that, surprisingly, this is a learnable and potentially solvable problem.
Abstract: Activity recognition and, more generally, behavior inference tasks are attracting a lot of interest. Much of this work is in the context of human behavior. Newly available tracking technologies for wild animals are generating datasets that may indirectly provide information about animal behavior. In this work, we propose a method for classifying these data into behavioral annotations, particularly for the collective behavior of a social group. Our method is based on sequence analysis with a direct encoding of the interactions among a group of wild animals. We evaluate our approach on a real-world dataset, showing significant accuracy improvements over baseline methods.