Skin cancer (SC) stands out as one of the most life-threatening forms of cancer, with its danger amplified if not diagnosed and treated promptly. Early intervention is critical, as it allows for more effective treatment approaches. In recent years, Deep Learning (DL) has emerged as a powerful tool in the early detection and skin cancer diagnosis (SCD). Although the DL seems promising for the diagnosis of skin cancer, still ample scope exists for improving model efficiency and accuracy. This paper proposes a novel approach to skin cancer detection, utilizing optimization techniques in conjunction with pre-trained networks and wavelet transformations. First, normalized images will undergo pre-trained networks such as Densenet-121, Inception, Xception, and MobileNet to extract hierarchical features from input images. After feature extraction, the feature maps are passed through a Discrete Wavelet Transform (DWT) layer to capture low and high-frequency components. Then the self-attention module is integrated to learn global dependencies between features and focus on the most relevant parts of the feature maps. The number of neurons and optimization of the weight vectors are performed using three new swarm-based optimization techniques, such as Modified Gorilla Troops Optimizer (MGTO), Improved Gray Wolf Optimization (IGWO), and Fox optimization algorithm. Evaluation results demonstrate that optimizing weight vectors using optimization algorithms can enhance diagnostic accuracy and make it a highly effective approach for SCD. The proposed method demonstrates substantial improvements in accuracy, achieving top rates of 98.11% with the MobileNet + Wavelet + FOX and DenseNet + Wavelet + Fox combination on the ISIC-2016 dataset and 97.95% with the Inception + Wavelet + MGTO combination on the ISIC-2017 dataset, which improves accuracy by at least 1% compared to other methods.