Title: SpotEM: Efficient Video Search for Episodic Memory

Jun 28, 2023

Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman

Abstract: The goal in episodic memory (EM) is to search a long egocentric video to answer a natural language query (e.g., "where did I leave my purse?"). Existing EM methods exhaustively extract expensive fixed-length clip features to look everywhere in the video for the answer, which is infeasible for long wearable-camera videos that span hours or even days. We propose SpotEM, an approach to achieve efficiency for a given EM method while maintaining good accuracy. SpotEM consists of three key ideas: 1) a novel clip selector that learns to identify promising video regions to search conditioned on the language query; 2) a set of low-cost semantic indexing features that capture the context of rooms, objects, and interactions that suggest where to look; and 3) distillation losses that address the optimization issues arising from end-to-end joint training of the clip selector and EM model. Our experiments on 200+ hours of video from the Ego4D EM Natural Language Queries benchmark and three different EM models demonstrate the effectiveness of our approach: computing only 10%-25% of the clip features, we preserve 84%-97% of the original EM model's accuracy. Project page: https://vision.cs.utexas.edu/projects/spotem

* Published in ICML 2023
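
The budgeted-search idea described in the abstract can be illustrated with a minimal Python sketch. It assumes a per-clip relevance score is already available from some cheap, query-conditioned scorer. The function name `select_clips`, the `budget_fraction` parameter, and the simple top-k rule are illustrative assumptions, not the paper's learned clip selector or its training procedure.

```python
import math


def select_clips(scores, budget_fraction):
    """Pick the highest-scoring clips under a compute budget.

    scores: one relevance score per clip (hypothetical output of a cheap,
        query-conditioned scorer; not the SpotEM selector itself).
    budget_fraction: fraction of clips whose expensive features may be
        computed, e.g. 0.25 for a 25% budget.
    Returns the chosen clip indices in temporal order.
    """
    if not 0.0 < budget_fraction <= 1.0:
        raise ValueError("budget_fraction must be in (0, 1]")
    num_clips = len(scores)
    if num_clips == 0:
        return []
    # Keep at least one clip, and never more than the video contains.
    k = min(num_clips, max(1, math.ceil(budget_fraction * num_clips)))
    # Rank clip indices from most to least promising.
    ranked = sorted(range(num_clips), key=lambda i: scores[i], reverse=True)
    # Restore temporal order so a downstream model sees clips in sequence.
    return sorted(ranked[:k])


# Toy example: 8 clips, 25% budget, so only 2 clips get expensive features.
toy_scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.05, 0.4, 0.6]
chosen = select_clips(toy_scores, budget_fraction=0.25)
```

In this toy setup, only the clips in `chosen` would be passed to the expensive feature extractor, and the remaining clips would be skipped. This corresponds to the 10%-25% compute regime reported in the abstract only in spirit; how SpotEM actually scores and selects clips is described in the paper.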