Abstract: In this research, we continuously collect data from the RSS feeds of traditional news sources. We apply several pre-trained implementations of named entity recognition (NER) tools and quantify the success of each implementation. We also perform sentiment analysis of each news article at the document, paragraph, and sentence levels, with the goal of creating a corpus of tagged news articles that is made available to the public through a web interface. Finally, we show how the data in this corpus could be used to identify bias in news reporting.
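To make the three levels of sentiment analysis concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a hypothetical toy word lexicon (`POSITIVE`, `NEGATIVE`) in place of a trained sentiment model, and naive regular-expression splitting into paragraphs and sentences; all function names are illustrative.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would use a trained sentiment model instead.
POSITIVE = {"good", "great", "success", "win"}
NEGATIVE = {"bad", "poor", "failure", "loss"}


def score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 if empty."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def split_paragraphs(article):
    """Split on blank lines and drop empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", article) if p.strip()]


def split_sentences(paragraph):
    """Naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def analyze(article):
    """Score one article at the document, paragraph, and sentence levels."""
    return {
        "document": score(article),
        "paragraphs": [
            {"score": score(p), "sentences": [score(s) for s in split_sentences(p)]}
            for p in split_paragraphs(article)
        ],
    }
```

Scoring each level separately matters because a document-level score can average out: an article with one positive and one negative paragraph may look neutral overall while its paragraphs are clearly polarized.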
Abstract: Leveraging over 30,000 images, each with up to 89 labels, collected by Recology (an integrated resource recovery company with both residential and commercial trash, recycling, and composting services), the authors develop ContamiNet, a convolutional neural network, to identify contaminating material in residential recycling and compost bins. When the model is trained on the subset of labels that meet a minimum frequency threshold, ContamiNet performs almost as well as human experts in detecting contamination (0.86 versus 0.88 AUC). Recology is actively piloting ContamiNet in its daily municipal solid waste (MSW) collection to identify contaminants in recycling and compost bins and subsequently inform and educate customers about best sorting practices.
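The two quantitative ideas in this abstract, keeping only labels that meet a minimum frequency and comparing detectors by AUC, can be sketched as follows. This is an illustrative sketch under assumptions: the abstract does not state the threshold value or how AUC was computed, and the label names and function names below are hypothetical. The AUC here uses the pairwise (Mann-Whitney) identity, in which ties count as one half.

```python
from collections import Counter


def frequent_labels(image_labels, min_count):
    """Return the labels that appear on at least `min_count` images.

    `image_labels` is a list with one collection of label names per image.
    """
    counts = Counter(label for labels in image_labels for label in set(labels))
    return {label for label, n in counts.items() if n >= min_count}


def auc(y_true, y_score):
    """ROC AUC as the probability that a random positive example outscores
    a random negative example, with ties counted as 1/2."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical label names, for illustration only.
labels = [["plastic_bag", "glass"], ["plastic_bag"], ["foam"]]
kept = frequent_labels(labels, min_count=2)

# 1 = contaminated bin, 0 = clean bin; scores are made-up model outputs.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.86 (model) and 0.88 (human experts) should be read.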