Photoplethysmography (PPG) sensors are widely used in consumer wearable devices to monitor heart rate (HR) and heart rate variability (HRV). Despite their prevalence, PPG signals can be contaminated by motion artifacts induced by daily activities. Existing approaches mainly rely on amplitude information to perform PPG peak detection. However, these approaches cannot identify peaks accurately, because motion artifacts can introduce random and significant amplitude variations. Time information can be used to improve PPG peak detection. Specifically, heart rate exhibits temporal consistency: consecutive heartbeat intervals in a healthy person vary only within a limited range. To leverage this temporal consistency, we propose the Temporal Attentive U-Net (TAU) to accurately detect peaks in PPG signals. In TAU, we design a time module that encodes temporal consistency in temporal embeddings. We then integrate the amplitude information with the temporal embeddings through an attention mechanism to estimate peak labels. Our experimental results show that TAU outperforms eleven baselines on heart rate estimation by more than 22.4%. TAU also achieves the best performance across various Signal-to-Noise Ratio (SNR) levels. Moreover, it achieves Pearson correlation coefficients higher than 0.9 (p < 0.01) when estimating HRV features from low-noise-level PPG signals.
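To make the notion of temporal consistency concrete, the following minimal Python sketch computes inter-beat intervals from a list of detected peak timestamps and flags intervals that change sharply relative to the preceding one. It is an illustration only and is not the TAU time module, which learns temporal embeddings rather than applying a fixed rule. The function names and the 20% threshold `max_rel_change` are hypothetical choices made for this example.

```python
def inter_beat_intervals(peak_times):
    """Return successive differences (in seconds) between peak timestamps.

    Assumes peak_times is strictly increasing.
    """
    intervals = [b - a for a, b in zip(peak_times, peak_times[1:])]
    if any(iv <= 0 for iv in intervals):
        raise ValueError("peak_times must be strictly increasing")
    return intervals


def flag_inconsistent_intervals(peak_times, max_rel_change=0.2):
    """Flag intervals whose length differs from the previous interval
    by more than max_rel_change (a hypothetical, illustrative threshold).

    Returns (intervals, flags); the first interval has no predecessor
    and is therefore never flagged.
    """
    intervals = inter_beat_intervals(peak_times)
    flags = [False] * len(intervals)
    for i in range(1, len(intervals)):
        prev, cur = intervals[i - 1], intervals[i]
        flags[i] = abs(cur - prev) / prev > max_rel_change
    return intervals, flags


def mean_heart_rate_bpm(peak_times):
    """Estimate the mean heart rate (beats per minute) from peak timestamps."""
    intervals = inter_beat_intervals(peak_times)
    if not intervals:
        raise ValueError("need at least two peaks")
    return 60.0 * len(intervals) / sum(intervals)


# Clean peaks at a steady 0.8 s spacing, and the same sequence with one
# spurious peak at 1.9 s, as a motion artifact might produce.
clean_peaks = [0.0, 0.8, 1.6, 2.4, 3.2]
noisy_peaks = [0.0, 0.8, 1.6, 1.9, 2.4, 3.2]

_, clean_flags = flag_inconsistent_intervals(clean_peaks)
_, noisy_flags = flag_inconsistent_intervals(noisy_peaks)
```

In this toy input, the spurious peak breaks the regular spacing, so the intervals around it are flagged while the clean sequence has none flagged. An amplitude-only detector has no such signal, which is the motivation for combining amplitude with timing.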