Squamous Cell Carcinoma (SCC) is the most common cancer type of the epithelium and is often detected at a late stage. Besides invasive diagnosis of SCC by means of biopsy and histopathologic assessment, Confocal Laser Endomicroscopy (CLE) has emerged as a noninvasive method that has been used successfully to diagnose SCC in vivo. Interpreting CLE images, however, requires extensive training, which limits the applicability of the method and its use in clinical practice. To aid diagnosis of SCC on a broader scale, automatic detection methods have been proposed. This work compares two methods with regard to their applicability in a transfer learning sense, i.e., training on one tissue type (from one clinical team) and applying the learnt classification system to another entity (different anatomy, different clinical team). Besides a previously proposed patch-based method using convolutional neural networks, a novel image-level classification method (based on a pre-trained Inception v3 network with dedicated preprocessing and interpretation of class activation maps) is proposed and evaluated. The newly presented approach improves recognition performance, yielding accuracies of 91.63% on the first data set (oral cavity) and 92.63% on a joint data set. Generalization from the oral cavity to the second data set (vocal folds) led to area-under-the-ROC-curve values similar to those obtained by training directly on the vocal folds data set, indicating good generalization.
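The comparison of cross-domain and in-domain training described above rests on the area under the ROC curve. The following is a minimal sketch of how such a comparison could be computed, not the evaluation code used in this work. The function name `roc_auc` and all labels and scores are hypothetical toy values chosen for illustration; they are not results from either data set.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for the positive class (e.g. carcinoma), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data standing in for a held-out target-domain test set.
labels = [1, 1, 0, 0, 1, 0]
# Scores from a model trained on the source domain only (assumed values).
cross_domain_scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.65]
# Scores from a model trained directly on the target domain (assumed values).
in_domain_scores = [0.95, 0.8, 0.3, 0.1, 0.7, 0.5]

auc_cross = roc_auc(labels, cross_domain_scores)
auc_in = roc_auc(labels, in_domain_scores)
print(f"cross-domain AUC: {auc_cross:.3f}, in-domain AUC: {auc_in:.3f}")
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a sketch; for larger test sets a rank-based implementation or an established library routine would be preferable.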