We suggested a unified system with core components of data augmentation, ImageNet-pretrained ResNet-50, cost-sensitive loss, deep ensemble learning, and uncertainty estimation to quickly and consistently detect COVID-19 using acoustic evidence. To increase the model's capacity to identify a minority class, data augmentation and cost-sensitive loss are incorporated (infected samples). In the COVID-19 detection challenge, ImageNet-pretrained ResNet-50 has been found to be effective. The unified framework also integrates deep ensemble learning and uncertainty estimation to integrate predictions from various base classifiers for generalisation and reliability. We ran a series of tests using the DiCOVA2021 challenge dataset to assess the efficacy of our proposed method, and the results show that our method has an AUC-ROC of 85.43 percent, making it a promising method for COVID-19 detection. The unified framework also demonstrates that audio may be used to quickly diagnose different respiratory disorders.