Abstract: One of the critical challenges in forensics today is analyzing the enormous amounts of unstructured digital evidence, such as images, which often contains valuable information for investigations. A retrieval system that can effectively identify forensically relevant images is therefore essential. In this work, we explored the effectiveness of interactive learning in improving image retrieval performance in the forensic domain by proposing Excalibur, a zero-shot cross-modal image retrieval system extended with interactive learning. Excalibur was evaluated using both simulations and a user study. The simulations reveal that interactive learning is highly effective in improving retrieval performance in the forensic domain. Furthermore, user study participants could effectively leverage the power of interactive learning. Finally, they considered Excalibur effective and straightforward to use and expressed interest in using it in their daily practice.
Abstract: Expert search aims to find and rank experts based on a user's query. In academia, retrieving experts is an efficient way to navigate a large body of academic knowledge. Here, we study how different distributed representations of academic papers (i.e., embeddings) impact academic expert retrieval. We use the Microsoft Academic Graph dataset and experiment with different configurations of a document-centric voting model for retrieval. In particular, we explore the impact of contextualized embeddings on search performance. We also present results for paper embeddings that incorporate citation information through retrofitting. Additionally, we conduct experiments with different techniques for assigning author weights based on author order. We observe that contextual embeddings produced by a transformer model trained for sentence similarity tasks yield the most effective paper representations for document-centric expert retrieval. However, retrofitting the paper embeddings and using elaborate author contribution weighting strategies did not improve retrieval performance.
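To make the idea of a document-centric voting model concrete, the following is a minimal sketch and not the configuration evaluated in the work above. It assumes papers are already embedded as plain vectors, scores each paper by cosine similarity to the query, and lets the top-ranked papers "vote" for their authors. The function names (`cosine`, `author_weights`, `rank_experts`), the `top_k` cutoff, the sum-of-scores aggregation, and the "harmonic" author-order weighting are all illustrative assumptions.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def author_weights(n_authors, scheme="uniform"):
    """Hypothetical weighting schemes based on author order."""
    if scheme == "uniform":
        return [1.0] * n_authors
    if scheme == "harmonic":
        # Earlier authors get more credit; weights are normalized to sum to 1.
        raw = [1.0 / (i + 1) for i in range(n_authors)]
        total = sum(raw)
        return [w / total for w in raw]
    raise ValueError(f"unknown scheme: {scheme}")


def rank_experts(query_vec, papers, top_k=10, scheme="uniform"):
    """Rank authors by summing the weighted scores of their top-k papers."""
    scored = sorted(
        ((cosine(query_vec, p["embedding"]), p) for p in papers),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]
    votes = defaultdict(float)
    for score, paper in scored:
        weights = author_weights(len(paper["authors"]), scheme)
        for author, weight in zip(paper["authors"], weights):
            votes[author] += score * weight
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: two-dimensional "embeddings" stand in for real paper embeddings.
papers = [
    {"embedding": [1.0, 0.0], "authors": ["alice", "bob"]},
    {"embedding": [0.9, 0.1], "authors": ["alice"]},
    {"embedding": [0.0, 1.0], "authors": ["carol"]},
]
ranking = rank_experts([1.0, 0.0], papers, top_k=3, scheme="uniform")
print([name for name, _ in ranking])
```

In this toy setup the query embedding would come from the same model as the paper embeddings, so that query and papers live in one vector space. Swapping `scheme` changes only how a paper's score is split among its authors, which is the kind of variation the author-weighting experiments compare.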