Abstract: Pneumonia, a severe respiratory disease, poses significant diagnostic challenges, especially in underdeveloped regions. Chest X-rays, the traditional diagnostic tool, are subject to variability in interpretation among radiologists, which motivates reliable automated tools. In this study, we propose a novel approach that combines deep learning with transformer-based attention mechanisms to improve pneumonia detection from chest X-rays. Our method begins with lung segmentation using a TransUNet model that integrates our specialized transformer module, which has fewer parameters than common transformers while maintaining performance. The segmentation model is trained on the "Chest Xray Masks and Labels" dataset and then applied to the Kermany and Cohen datasets to isolate lung regions before classification. For classification, we use pre-trained ResNet models (ResNet-50 and ResNet-101) to extract multi-scale feature maps, which are then processed by our modified transformer module. With this specialized transformer, we attain superior results with significantly fewer parameters than common transformer models. Our approach achieves accuracies of 92.79% on the Kermany dataset and 95.11% on the Cohen dataset, offering robust and efficient performance suitable for resource-constrained environments. Code: https://github.com/amirrezafateh/Multi-Scale-Transformer-Pneumonia
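To illustrate the lung-isolation step mentioned in the abstract, the following is a minimal sketch of how a predicted segmentation mask might be applied to an X-ray before classification. The abstract does not specify how the mask is applied, so the thresholding at 0.5, the zero-fill of non-lung pixels, and the function name `apply_lung_mask` are all assumptions made for illustration. Plain nested lists stand in for image arrays to keep the example dependency-free.

```python
from typing import List

Image = List[List[float]]


def apply_lung_mask(image: Image, mask_probs: Image, threshold: float = 0.5) -> Image:
    """Keep pixels whose predicted lung probability is >= threshold; zero the rest.

    Hypothetical helper: the threshold and zero-fill are assumptions,
    not details taken from the paper.
    """
    same_rows = len(image) == len(mask_probs)
    same_cols = all(len(r) == len(m) for r, m in zip(image, mask_probs))
    if not (same_rows and same_cols):
        raise ValueError("image and mask must have the same shape")
    return [
        [pix if prob >= threshold else 0.0 for pix, prob in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask_probs)
    ]


# Toy 3x3 "X-ray" and a made-up segmentation output (per-pixel lung probability).
xray = [[0.2, 0.8, 0.1], [0.5, 0.9, 0.4], [0.3, 0.7, 0.6]]
probs = [[0.1, 0.9, 0.2], [0.6, 0.95, 0.3], [0.05, 0.8, 0.7]]
masked = apply_lung_mask(xray, probs)
print(masked)
```

In a pipeline of the kind described, the masked image, rather than the raw X-ray, would then be passed to the classification backbone.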