Electrocardiogram (ECG) is an essential signal in monitoring human heart activities. Researchers have achieved promising results in leveraging ECGs in clinical applications with deep learning models. However, the mainstream deep learning approaches usually neglect the periodic and formative attribute of the ECG heartbeat waveform. In this work, we propose a novel ECG-Segment based Learning (ECG-SL) framework to explicitly model the periodic nature of ECG signals. More specifically, ECG signals are first split into heartbeat segments, and then structural features are extracted from each of the segments. Based on the structural features, a temporal model is designed to learn the temporal information for various clinical tasks. Further, due to the fact that massive ECG signals are available but the labeled data are very limited, we also explore self-supervised learning strategy to pre-train the models, resulting significant improvement for downstream tasks. The proposed method outperforms the baseline model and shows competitive performances compared with task-specific methods in three clinical applications: cardiac condition diagnosis, sleep apnea detection, and arrhythmia classification. Further, we find that the ECG-SL tends to focus more on each heartbeat's peak and ST range than ResNet by visualizing the saliency maps.