Abstract: Expanding a dictionary of pre-selected keywords is crucial for tasks in information retrieval, such as database query and online data collection. Here we propose Local Graph-based Dictionary Expansion (LGDE), a method that uses tools from manifold learning and network science for the data-driven discovery of keywords starting from a seed dictionary. At the heart of LGDE lies the creation of a word similarity graph derived from word embeddings and the application of local community detection based on graph diffusion to discover semantic neighbourhoods of pre-defined seed keywords. The diffusion in the local graph manifold allows the exploration of the complex nonlinear geometry of word embeddings and can capture word similarities based on paths of semantic association. We validate our method on a corpus of hate speech-related posts from Reddit and Gab and show that LGDE enriches the list of keywords and achieves significantly better performance than threshold methods based on direct word similarities. We further demonstrate the potential of our method through a real-world use case from communication science, where LGDE is evaluated quantitatively on data collected and analysed by domain experts by expanding a conspiracy-related dictionary.
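The pipeline described in the abstract above (similarity graph from embeddings, then diffusion outward from the seed keywords) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the graph construction or the community detection procedure, so this sketch assumes a plain k-nearest-neighbour cosine graph and substitutes personalised PageRank for the local community detection step. The function names, the parameters `k`, `alpha` and `top`, and the toy two-dimensional "embeddings" are all hypothetical.

```python
import numpy as np

def knn_similarity_graph(X, k):
    """Symmetric k-nearest-neighbour graph weighted by cosine similarity."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)            # exclude self-matches
    A = np.zeros_like(S)
    for i in range(len(S)):
        nbrs = np.argsort(S[i])[-k:]        # indices of the k most similar words
        A[i, nbrs] = np.clip(S[i, nbrs], 0.0, None)
    return np.maximum(A, A.T)               # symmetrise the adjacency matrix

def diffuse_from_seeds(A, seeds, alpha=0.85, n_iter=100):
    """Personalised PageRank by power iteration, restarting at the seeds.

    Assumption: this stands in for the diffusion-based local community
    detection mentioned in the abstract; it is not the same procedure.
    """
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                     # avoid division by zero
    P = A / deg[:, None]                    # row-stochastic transition matrix
    r = np.zeros(len(A))
    r[seeds] = 1.0 / len(seeds)             # restart distribution on the seeds
    p = r.copy()
    for _ in range(n_iter):
        p = alpha * (P.T @ p) + (1 - alpha) * r
    return p

def expand_dictionary(words, X, seed_words, k=3, top=2):
    """Return the `top` non-seed words with the largest diffusion mass."""
    seeds = [words.index(w) for w in seed_words]
    p = diffuse_from_seeds(knn_similarity_graph(X, k), seeds)
    ranked = [words[i] for i in np.argsort(-p) if i not in seeds]
    return ranked[:top]

# Toy data (hypothetical): two groups of unit vectors at different angles.
angles = np.deg2rad([0, 10, 20, 30, 90, 100, 110, 120])
words = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]
X = np.column_stack([np.cos(angles), np.sin(angles)])

expanded = expand_dictionary(words, X, ["a0"], k=2, top=2)
print(expanded)
```

In this toy setting the two groups form disconnected components of the graph, so the diffusion started at `a0` never reaches the `b` words. A direct similarity threshold around `a0` would only see words close to the seed itself, whereas the diffusion can also reach words connected to it through chains of neighbours.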
Abstract: Automating pavement maintenance suggestions is challenging, especially for actionable recommendations such as patching location, depth and priority. It is common practice among State agencies to manually inspect road segments of interest and decide maintenance requirements based on the pavement condition index (PCI). However, standalone PCI only evaluates the pavement surface condition, which, coupled with the variability in human perception of pavement distress, limits the accuracy and quality of current pavement maintenance practices. This calls for multi-sensor data integrated with standardized pavement distress condition ratings. This study explores the possibility of estimating the appropriate pavement patching strategy (i.e., patching location, depth, and quantity) by integrating pavement structural and surface condition assessment with pavement-specific ratings of distress. Specifically, it combines a pavement structural condition assessment parameter (falling weight deflectometer deflections) with surface condition assessment parameters (international roughness index and cracking density) for a better representation of the overall pavement distress condition. Then, a pavement-specific threshold-based patching suggestion algorithm is implemented to convert the overall pavement distress condition into a priority-based patching suggestion. The novelty of the pavement-specific thresholds lies in their data-driven nature: threshold values are determined from current road condition measurements using a reliability concept validated by the theoretical pavement condition rating, the pavement structural number. A web-based patching manager tool (PMT) was implemented to automate the patching suggestion procedure and visualize the results. Validated with road surface images obtained from three-dimensional laser sensors, PMT could successfully capture localized distresses in existing pavements.
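The idea of data-driven thresholds feeding a priority-based suggestion can be sketched as follows. The abstract does not give the algorithm, so everything here is an assumption made for illustration: the "reliability concept" is approximated by taking a quantile of the measured values as the threshold, the priority is simply the number of indicators that exceed their threshold, and the function names, the `reliability` parameter and the per-segment values are hypothetical, not taken from the study.

```python
import numpy as np

def reliability_thresholds(measurements, reliability=0.8):
    """Per-indicator threshold, taken as the `reliability` quantile of the data.

    Assumption: a quantile is used as a stand-in for the reliability concept
    mentioned in the abstract.
    """
    return {name: float(np.quantile(vals, reliability))
            for name, vals in measurements.items()}

def patching_priority(measurements, thresholds):
    """Count, for each segment, how many indicators exceed their threshold."""
    n_segments = len(next(iter(measurements.values())))
    score = np.zeros(n_segments, dtype=int)
    for name, vals in measurements.items():
        score += np.asarray(vals) > thresholds[name]
    return score

# Synthetic values for five road segments (illustrative only).
measurements = {
    "fwd_deflection": [10, 12, 11, 30, 13],    # falling weight deflectometer
    "iri":            [80, 90, 85, 200, 95],   # international roughness index
    "crack_density":  [2, 3, 2, 15, 20],       # cracking density
}

thresholds = reliability_thresholds(measurements, reliability=0.8)
priority = patching_priority(measurements, thresholds)
print(thresholds)
print(priority)
```

A higher score marks a segment where structural and surface indicators agree that the pavement is distressed, which is the kind of combined signal a standalone surface index cannot provide. How the actual tool maps such scores to patching depth or quantity is not stated in the abstract and is left out here.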