Title: S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

Aug 07, 2018

Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang

Abstract: In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network. Our architecture, named S3D, encodes the entire video stream and discretizes the output space of temporal activity spans into a set of default spans over different temporal locations and scales. At prediction time, S3D predicts scores for the presence of activity categories in each default span and produces temporal adjustments relative to the span location to predict the precise activity duration. Unlike many state-of-the-art systems that require separate proposal and classification stages, S3D is intrinsically simple and designed specifically for single-shot, end-to-end temporal activity detection. When evaluated on the THUMOS'14 detection benchmark, S3D achieves state-of-the-art performance and is highly efficient, operating at 1271 FPS.
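The default-span idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact parameterization, so everything below is an assumption: spans are stored as normalized (center, width) pairs, the function names `default_spans` and `decode` are hypothetical, and the offset encoding (center shift scaled by width, log-scaled width) is borrowed from SSD-style box regression rather than taken from the paper.

```python
import math


def default_spans(num_locations, scales):
    """Tile (center, width) default spans over a timeline normalized to [0, 1].

    Assumed layout: one span per scale at each evenly spaced temporal location.
    """
    spans = []
    for i in range(num_locations):
        center = (i + 0.5) / num_locations  # midpoint of the i-th temporal cell
        for width in scales:
            spans.append((center, width))
    return spans


def decode(span, offsets):
    """Apply predicted (d_center, d_width) adjustments to one default span.

    Assumed SSD-style encoding: the center shift is relative to the span
    width, and the width is rescaled by exp(d_width). Returns (start, end)
    clipped to the normalized timeline.
    """
    center, width = span
    d_center, d_width = offsets
    new_center = center + d_center * width
    new_width = width * math.exp(d_width)
    start = max(0.0, new_center - new_width / 2)
    end = min(1.0, new_center + new_width / 2)
    return start, end


if __name__ == "__main__":
    spans = default_spans(num_locations=4, scales=[0.25, 0.5])
    print(len(spans), "default spans")
    print(decode(spans[0], (0.1, 0.2)))
```

In a full detector, each default span would also carry per-category scores, and overlapping decoded spans would typically be filtered (for example with non-maximum suppression); neither step is shown here.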

* BMVC 2018 Oral
