Abstract:Information retrieval systems have historically relied on explicit query formulation, requiring users to translate their information needs into text. This process is particularly disruptive during reading tasks, where users must interrupt their natural flow to formulate queries. We present DEEPER (Dense Electroencephalography Passage Retrieval), a novel framework that enables direct retrieval of relevant passages from users' neural signals during naturalistic reading without intermediate text translation. Building on dense retrieval architectures, DEEPER employs a dual-encoder approach with specialised components for processing neural data, mapping EEG signals and text passages into a shared semantic space. Through careful architecture design and cross-modal negative sampling strategies, our model learns to align neural patterns with their corresponding textual content. Experimental results on the ZuCo dataset demonstrate that direct brain-to-passage retrieval significantly outperforms current EEG-to-text baselines, achieving a 571% improvement in Precision@1. Our ablation studies reveal that the model successfully learns aligned representations between EEG and text modalities (0.29 cosine similarity), while our hard negative sampling strategy contributes to overall performance increases.
Abstract:One of the foundational goals of Information Retrieval (IR) is to satisfy searchers' Information Needs (IN). Understanding how INs physically manifest has long been a complex and elusive process. However, recent studies utilising Electroencephalography (EEG) data have provided real-time insights into the neural processes associated with INs. Unfortunately, they have yet to demonstrate how this insight can practically benefit the search experience. As such, within this study, we explore the ability to predict the realisation of IN within EEG data across 14 subjects whilst partaking in a Question-Answering (Q/A) task. Furthermore, we investigate the combinations of EEG features that yield optimal predictive performance, as well as identify regions within the Q/A queries where a subject's realisation of IN is more pronounced. The findings from this work demonstrate that EEG data is sufficient for the real-time prediction of the realisation of an IN across all subjects with an accuracy of 73.5\% (SD 2.6\%) and on a per-subject basis with an accuracy of 90.1\% (SD 22.1\%). This work helps to close the gap by bridging theoretical neuroscientific advancements with tangible improvements in information retrieval practices, paving the way for real-time prediction of the realisation of IN.