Title: One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text

Sep 12, 2022

Abhinav Java, Shripad Deshmukh, Milan Aggarwal, Surgan Jandial, Mausoom Sarkar, Balaji Krishnamurthy

Abstract: Active consumption of digital documents has yielded scope for research in various applications, including search. Traditionally, searching within a document has been cast as a text-matching problem, ignoring the rich layout and visual cues commonly present in structured documents, forms, etc. To that end, we ask a mostly unexplored question: "Can we search for other similar snippets present in a target document page given a single query instance of a document snippet?" We propose MONOMER to solve this as a one-shot snippet detection task. MONOMER fuses context from the visual, textual, and spatial modalities of snippets and documents to find the query snippet in target documents. We conduct extensive ablations and experiments showing that MONOMER outperforms several baselines from one-shot object detection (BHRL), template matching, and document understanding (LayoutLMv3). Due to the scarcity of relevant data for the task at hand, we train MONOMER on programmatically generated data containing many visually similar query snippet and target document pairs from two datasets: Flamingo Forms and PubLayNet. We also conduct a human study to validate the generated data.
