Abstract:With the growing success of Large Language models (LLMs) in information-seeking scenarios, search engines are now adopting generative approaches to provide answers along with in-line citations as attribution. While existing work focuses mainly on attributed question answering, in this paper, we target information-seeking scenarios which are often more challenging due to the open-ended nature of the queries and the size of the label space in terms of the diversity of candidate-attributed answers per query. We propose a reproducible framework to evaluate and benchmark attributed information seeking, using any backbone LLM, and different architectural designs: (1) Generate (2) Retrieve then Generate, and (3) Generate then Retrieve. Experiments using HAGRID, an attributed information-seeking dataset, show the impact of different scenarios on both the correctness and attributability of answers.
Abstract:This paper discusses the problem of marrying structural similarity with semantic relatedness for Information Extraction from text. Aiming at accurate recognition of relations, we introduce local alignment kernels and explore various possibilities of using them for this task. We give a definition of a local alignment (LA) kernel based on the Smith-Waterman score as a sequence similarity measure and proceed with a range of possibilities for computing similarity between elements of sequences. We show how distributional similarity measures obtained from unlabeled data can be incorporated into the learning task as semantic knowledge. Our experiments suggest that the LA kernel yields promising results on various biomedical corpora outperforming two baselines by a large margin. Additional series of experiments have been conducted on the data sets of seven general relation types, where the performance of the LA kernel is comparable to the current state-of-the-art results.