Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ankita Sharma

Hierarchical corpus encoder: Fusing generative retrieval and dense indices

Feb 26, 2025

Tongfei Chen, Ankita Sharma, Adam Pauls, Benjamin Van Durme

Abstract:Generative retrieval employs sequence models for conditional generation of document IDs based on a query (DSI (Tay et al., 2022); NCI (Wang et al., 2022); inter alia). While this has led to improved performance in zero-shot retrieval, it is a challenge to support documents not seen during training. We identify the performance of generative retrieval lies in contrastive training between sibling nodes in a document hierarchy. This motivates our proposal, the hierarchical corpus encoder (HCE), which can be supported by traditional dense encoders. Our experiments show that HCE achieves superior results than generative retrieval models under both unsupervised zero-shot and supervised settings, while also allowing the easy addition and removal of documents to the index.

Via

Access Paper or Ask Questions

Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Oct 14, 2021

Ji Xin, Chenyan Xiong, Ashwin Srinivasan, Ankita Sharma, Damien Jose, Paul N. Bennett

Figure 1 for Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Figure 2 for Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Figure 3 for Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Figure 4 for Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Abstract:Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search. This requires strong locality properties from the representation space, i.e, the close allocations of each small group of relevant texts, which are hard to generalize to domains without sufficient training data. In this paper, we aim to improve the generalization ability of DR models from source training domains with rich supervision signals to target domains without any relevant labels, in the zero-shot setting. To achieve that, we propose Momentum adversarial Domain Invariant Representation learning (MoDIR), which introduces a momentum method in the DR training process to train a domain classifier distinguishing source versus target, and then adversarially updates the DR encoder to learn domain invariant representations. Our experiments show that MoDIR robustly outperforms its baselines on 10+ ranking datasets from the BEIR benchmark in the zero-shot setup, with more than 10% relative gains on datasets with enough sensitivity for DR models' evaluation. Source code of this paper will be released.

Via

Access Paper or Ask Questions

Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling(LARS) Optimizer

Feb 05, 2021

Kanchan Chowdhury, Ankita Sharma, Arun Deepak Chandrasekar

Figure 1 for Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling(LARS) Optimizer

Figure 2 for Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling(LARS) Optimizer

Figure 3 for Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling(LARS) Optimizer

Figure 4 for Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling(LARS) Optimizer

Abstract:Increasing the batch size of a deep learning model is a challenging task. Although it might help in utilizing full available system memory during training phase of a model, it results in significant loss of test accuracy most often. LARS solved this issue by introducing an adaptive learning rate for each layer of a deep learning model. However, there are doubts on how popular distributed machine learning systems such as SystemML or MLlib will perform with this optimizer. In this work, we apply LARS optimizer to a deep learning model implemented using SystemML.We perform experiments with various batch sizes and compare the performance of LARS optimizer with \textit{Stochastic Gradient Descent}. Our experimental results show that LARS optimizer performs significantly better than Stochastic Gradient Descent for large batch sizes even with the distributed machine learning framework, SystemML.

Via

Access Paper or Ask Questions