Low-channel EEG devices are crucial for portable and entertainment applications. However, the low spatial resolution of low-channel EEG makes motor imagery decoding challenging. This study introduces TSFF-Net, a novel network architecture that integrates time-space-frequency features and effectively compensates for the limitations of single-mode feature-extraction networks based solely on time-series or time-frequency modalities. TSFF-Net comprises four main components: time-frequency representation, time-frequency feature extraction, time-space feature extraction, and feature fusion and classification. The time-frequency representation and feature-extraction stages transform raw EEG signals into time-frequency spectrograms and extract the relevant features. The time-space network takes time-series EEG trials as input and extracts time-space features. Feature fusion uses a maximum mean discrepancy (MMD) loss to constrain the distributions of the time-frequency and time-space features in a reproducing kernel Hilbert space, and then combines them with a weighted fusion scheme to obtain effective time-space-frequency features. Moreover, few studies have explored the decoding of three-channel motor imagery based on time-frequency spectrograms. This study therefore also proposes a shallow, lightweight decoding architecture (TSFF-img) based on time-frequency spectrograms and compares its classification performance on low-channel motor imagery with that of other methods using two publicly available datasets. Experimental results demonstrate that TSFF-Net not only compensates for the shortcomings of single-mode feature-extraction networks in EEG decoding, but also outperforms other state-of-the-art methods. Overall, TSFF-Net offers considerable advantages for decoding low-channel motor imagery and provides valuable insights for algorithmically enhancing low-channel EEG decoding.
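
To make the fusion step concrete, the sketch below illustrates (in PyTorch) how an MMD loss with a Gaussian kernel can align two batches of branch features in a reproducing kernel Hilbert space, followed by a simple weighted fusion. This is a minimal illustration of the idea described above, not the authors' released implementation; the kernel bandwidth `sigma`, the fusion weight `alpha`, and the feature dimensions are illustrative assumptions (in practice the fusion weight could also be learned).

```python
# Minimal sketch of MMD-constrained feature fusion (illustrative, not the paper's code).
import torch


def gaussian_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Pairwise Gaussian (RBF) kernel matrix between rows of x and y."""
    dist2 = torch.cdist(x, y, p=2).pow(2)      # squared Euclidean distances
    return torch.exp(-dist2 / (2.0 * sigma ** 2))


def mmd_loss(f_tf: torch.Tensor, f_ts: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Empirical (biased) squared MMD between two feature batches."""
    k_xx = gaussian_kernel(f_tf, f_tf, sigma).mean()
    k_yy = gaussian_kernel(f_ts, f_ts, sigma).mean()
    k_xy = gaussian_kernel(f_tf, f_ts, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


def fuse(f_tf: torch.Tensor, f_ts: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Weighted fusion of time-frequency and time-space features (alpha is assumed fixed here)."""
    return alpha * f_tf + (1.0 - alpha) * f_ts


if __name__ == "__main__":
    batch, dim = 16, 64                         # illustrative batch size and feature dimension
    f_tf = torch.randn(batch, dim)              # time-frequency branch output
    f_ts = torch.randn(batch, dim)              # time-space branch output
    print("MMD loss:", mmd_loss(f_tf, f_ts).item())
    print("Fused feature shape:", fuse(f_tf, f_ts).shape)
```

In such a setup, the MMD term would typically be added to the classification loss so that the two branches produce distributionally compatible features before the weighted fusion and the final classifier.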