Abstract:Detection of illegal and threatening items in baggage is one of the utmost security concern nowadays. Even for experienced security personnel, manual detection is a time-consuming and stressful task. Many academics have created automated frameworks for detecting suspicious and contraband data from X-ray scans of luggage. However, to our knowledge, no framework exists that utilizes temporal baggage X-ray imagery to effectively screen highly concealed and occluded objects which are barely visible even to the naked eye. To address this, we present a novel temporal fusion driven multi-scale residual fashioned encoder-decoder that takes series of consecutive scans as input and fuses them to generate distinct feature representations of the suspicious and non-suspicious baggage content, leading towards a more accurate extraction of the contraband data. The proposed methodology has been thoroughly tested using the publicly accessible GDXray dataset, which is the only dataset containing temporally linked grayscale X-ray scans showcasing extremely concealed contraband data. The proposed framework outperforms its competitors on the GDXray dataset on various metrics.