Abstract:Background: Epilepsy is a neurological illness affecting the brain that makes people more likely to experience frequent, spontaneous seizures. There has to be an accurate automated method for measuring seizure frequency and severity in order to assess the efficacy of pharmacological therapy for epilepsy. The drug quantities are often derived from patient reports which may cause significant issues owing to inadequate or inaccurate descriptions of seizures and their frequencies. Methods and materials: This study proposes a novel deep learning architecture based lightweight convolution transformer (LCT). The transformer is able to learn spatial and temporal correlated information simultaneously from the multi-channel electroencephalogram (EEG) signal to detect seizures at smaller segment lengths. In the proposed model, the lack of translation equivariance and localization of ViT is reduced using convolution tokenization, and rich information from the transformer encoder is extracted by sequence pooling instead of the learnable class token. Results: Extensive experimental results demonstrate that the proposed model of cross-patient learning can effectively detect seizures from the raw EEG signals. The accuracy and F1-score of seizure detection in the cross-patient case on the CHB-MIT dataset are shown to be 96.31% and 96.32%, respectively, at 0.5 sec segment length. In addition, the performance metrics show that the inclusion of inductive biases and attention-based pooling in the model enhances the performance and reduces the number of transformer encoder layers, which significantly reduces the computational complexity. In this research work, we provided a novel approach to enhance efficiency and simplify the architecture for multi-channel automated seizure detection.