As Earth science enters the era of big data, artificial intelligence (AI) not only offers great potential for solving geoscience problems, but also plays a critical role in accelerating the understanding of the complex, interactive, and multiscale processes of Earth's behavior. As geoscience AI models are progressively utilized for significant predictions in crucial situations, geoscience researchers are increasingly demanding their interpretability and versatility. This study proposes an interpretable geoscience artificial intelligence (XGeoS-AI) framework to unravel the mystery of image recognition in the Earth sciences, and its effectiveness and versatility is demonstrated by taking computed tomography (CT) image recognition as an example. Inspired by the mechanism of human vision, the proposed XGeoS-AI framework generates a threshold value from a local region within the whole image to complete the recognition. Different kinds of artificial intelligence (AI) methods, such as Support Vector Regression (SVR), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), can be adopted as the AI engines of the proposed XGeoS-AI framework to efficiently complete geoscience image recognition tasks. Experimental results demonstrate that the effectiveness, versatility, and heuristics of the proposed framework have great potential in solving geoscience image recognition problems. Interpretable AI should receive more and more attention in the field of the Earth sciences, which is the key to promoting more rational and wider applications of AI in the field of Earth sciences. In addition, the proposed interpretable framework may be the forerunner of technological innovation in the Earth sciences.