This paper proposes a novel regularization method, named Spatio-Spectral Structure Tensor Total Variation (S3TTV), for denoising and destriping hyperspectral (HS) images. HS images are inevitably contaminated by various types of noise during the acquisition process, owing to the measurement equipment and the environment. For HS image denoising and destriping, Spatio-Spectral Total Variation (SSTV), defined using second-order spatio-spectral differences, is widely known as a powerful regularization approach that models the underlying spatio-spectral properties. However, since SSTV refers only to adjacent pixels and bands, semi-local spatial structures are not preserved during the denoising process. To address this problem, we design S3TTV, defined as the sum of the nuclear norms of matrices consisting of second-order spatio-spectral differences in small spectral blocks (we call these matrices spatio-spectral structure tensors). The proposed regularization method simultaneously models the spatial piecewise-smoothness, the spatial similarity between adjacent bands, and the spectral correlation across all bands in small spectral blocks, leading to effective noise removal while preserving the semi-local spatial structures. Furthermore, we formulate the HS image denoising and destriping problem as a convex optimization problem involving S3TTV and develop an algorithm based on a preconditioned primal-dual splitting method to solve this problem efficiently. Finally, we demonstrate the effectiveness of S3TTV by comparing it with existing methods, including state-of-the-art ones, through denoising and destriping experiments.
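As a rough illustration of the two regularizers described above, the following NumPy sketch evaluates an SSTV-style value and an S3TTV-style value for an HS cube of shape (height, width, bands). It is only one reading of the description, not the authors' implementation: the function names, the zero-padded boundary handling, the interpretation of a block as a small spatial window containing all bands, and the `block` and `stride` parameters are all assumptions made here for illustration.

```python
import numpy as np


def spatio_spectral_differences(u):
    """Vertical and horizontal spatial differences of the spectral difference.

    u has shape (H, W, B). Differences that would cross the boundary are set
    to zero (an assumed boundary convention).
    """
    u = np.asarray(u, dtype=float)
    db = np.zeros_like(u)
    db[:, :, :-1] = u[:, :, 1:] - u[:, :, :-1]    # spectral difference
    dv = np.zeros_like(u)
    dv[:-1, :, :] = db[1:, :, :] - db[:-1, :, :]  # vertical difference of db
    dh = np.zeros_like(u)
    dh[:, :-1, :] = db[:, 1:, :] - db[:, :-1, :]  # horizontal difference of db
    return dv, dh


def sstv(u):
    """SSTV-style value: l1 norm of the second-order differences."""
    dv, dh = spatio_spectral_differences(u)
    return float(np.abs(dv).sum() + np.abs(dh).sum())


def s3ttv_sketch(u, block=2, stride=1):
    """S3TTV-style value: sum of nuclear norms over local difference matrices.

    Assumption: each block is a block-by-block spatial window with all bands.
    Its matrix has one row per (direction, pixel) pair and one column per band.
    """
    dv, dh = spatio_spectral_differences(u)
    H, W, B = dv.shape
    total = 0.0
    for i in range(0, H - block + 1, stride):
        for j in range(0, W - block + 1, stride):
            pv = dv[i:i + block, j:j + block, :].reshape(-1, B)
            ph = dh[i:i + block, j:j + block, :].reshape(-1, B)
            m = np.vstack([pv, ph])               # shape (2 * block**2, B)
            total += np.linalg.norm(m, ord="nuc")  # sum of singular values
    return float(total)
```

Under these assumptions, a cube whose bands are all identical gives zero for both values, since the spectral difference vanishes. With `block=1` the nuclear norm of each 2-by-B matrix is bounded above by the sum of the absolute values of its entries, so the S3TTV-style value never exceeds the SSTV-style value in that case.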