In this paper, we propose a new convolutional layer, the Depthwise-STFT Separable layer, that can serve as an alternative to the standard depthwise separable convolutional layer. Its construction is motivated by the fact that Fourier coefficients can accurately represent important image features such as edges. The layer obtains its feature maps from Fourier coefficients computed channelwise in a 2D local neighborhood (e.g., 3x3) around each position of the input map. These coefficients are computed with the 2D Short-Time Fourier Transform (STFT) at multiple fixed low-frequency points. The feature maps obtained at the different frequency points are then linearly combined using trainable pointwise (1x1) convolutions. We show that models built on the proposed layer outperform models based on the standard depthwise separable layer on the CIFAR-10 and CIFAR-100 image classification datasets, with reduced space-time complexity.
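
The two stages described above (fixed channelwise STFT filtering followed by a trainable 1x1 combination) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The four frequency points, the zero padding, the use of real and imaginary parts as separate feature maps, and the names `stft_filters`, `depthwise_stft`, and `pointwise` are assumptions made for illustration; the abstract only states that the frequency points are fixed and low.

```python
import numpy as np

def stft_filters(n=3):
    """Build fixed (non-trainable) n x n filters from complex exponentials."""
    # Assumed low-frequency points (fy, fx); the abstract does not list them.
    freqs = [(1.0 / n, 0.0), (0.0, 1.0 / n), (1.0 / n, 1.0 / n), (1.0 / n, -1.0 / n)]
    r = n // 2
    ys, xs = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    filters = []
    for fy, fx in freqs:
        phase = 2.0 * np.pi * (fy * ys + fx * xs)
        filters.append(np.cos(phase))   # real part of exp(-j * phase)
        filters.append(-np.sin(phase))  # imaginary part of exp(-j * phase)
    return np.stack(filters)            # shape (2 * len(freqs), n, n)

def depthwise_stft(x, filters):
    """Apply every fixed filter to every channel: (C, H, W) -> (C * K, H, W)."""
    C, H, W = x.shape
    K, n, _ = filters.shape
    r = n // 2
    xp = np.pad(x, ((0, 0), (r, r), (r, r)))  # zero padding keeps H x W (assumption)
    out = np.zeros((C, K, H, W))
    # Cross-correlation accumulated one filter tap at a time.
    for dy in range(n):
        for dx in range(n):
            patch = xp[:, dy:dy + H, dx:dx + W]                  # (C, H, W)
            taps = filters[None, :, dy, dx, None, None]          # (1, K, 1, 1)
            out += patch[:, None] * taps
    return out.reshape(C * K, H, W)

def pointwise(z, weights):
    """1x1 convolution: weights of shape (C_out, C_in) mix channels per pixel."""
    return np.einsum("oc,chw->ohw", weights, z)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))         # toy input: 3 channels, 8 x 8
filters = stft_filters(3)
z = depthwise_stft(x, filters)             # fixed stage, no parameters
w = rng.standard_normal((16, z.shape[0]))  # stands in for the trainable weights
y = pointwise(z, w)
print(z.shape, y.shape)                    # (24, 8, 8) (16, 8, 8)
```

In this sketch only `w` would be learned; the spatial filters stay fixed, which is where a reduction in trainable parameters relative to a learned depthwise stage would come from.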