https://github.com/BasitAlawode/WCEBleedGen.
Informed by the success of the transformer model in various computer vision tasks, we design an end-to-end trainable model for the automatic detection and classification of bleeding and non-bleeding frames extracted from Wireless Capsule Endoscopy (WCE) videos. Based on the DETR model, our model uses the Resnet50 for feature extraction, the transformer encoder-decoder for bleeding and non-bleeding region detection, and a feedforward neural network for classification. Trained in an end-to-end approach on the Auto-WCEBleedGen Version 1 challenge training set, our model performs both detection and classification tasks as a single unit. Our model achieves an accuracy, recall, and F1-score classification percentage score of 98.28, 96.79, and 98.37 respectively, on the Auto-WCEBleedGen version 1 validation set. Further, we record an average precision (AP @ 0.5), mean-average precision (mAP) of 0.7447 and 0.7328 detection results. This earned us a 3rd place position in the challenge. Our code is publicly available via