Autonomous vehicles (AVs) are envisioned to revolutionize our lives by providing safe, relaxing, and convenient ground transportation. The computing systems in such vehicles must interpret data from various sensors and respond to the environment in a timely manner to ensure driving safety. However, such timing-related safety requirements are largely unexplored in prior work. In this paper, we conduct a systematic study of the timing requirements of AV systems, focusing on investigating and mitigating the sources of tail latency in Level-4 AV computing systems. We observe that the latency of AV algorithms is not uniform; instead, it varies with fluctuations in the vehicle's environment, such as traffic density. These fluctuations cause bursts of computation and memory access, which in turn produce tail latency in the system. Furthermore, we observe that tail latency also arises from a mismatch between the pre-configured AV computation pipeline and the dynamic latency requirements of real-world driving scenarios. Based on these observations, we propose a set of system designs to mitigate AV tail latency. We demonstrate our design on two widely used industrial Level-4 AV systems, Baidu Apollo and Autoware. The evaluation shows that our design achieves a 1.65× improvement in worst-case latency and a 1.3× improvement in average latency, and avoids 93% of accidents on Apollo.
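To make the distinction between average and tail latency concrete, the following minimal Python sketch shows how a short burst of dense traffic can leave the mean latency almost unchanged while dominating the tail. It is an illustration only, not the measurement method used in this paper: the linear cost model, its parameters (`base_ms`, `per_object_ms`), the object counts, and the function names are all hypothetical assumptions.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile of samples, for q in (0, 100]."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # 1-based nearest rank; multiply before dividing to stay exact for integer q.
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def frame_latency_ms(num_objects, base_ms=20.0, per_object_ms=1.5):
    """Hypothetical cost model: per-frame latency grows with tracked objects."""
    return base_ms + per_object_ms * num_objects


# Hypothetical trace: 95 frames in light traffic, then a 5-frame dense burst.
object_counts = [5] * 95 + [60] * 5
latencies = [frame_latency_ms(n) for n in object_counts]

mean_ms = sum(latencies) / len(latencies)
p50_ms = percentile(latencies, 50)
p99_ms = percentile(latencies, 99)
worst_ms = max(latencies)

print(f"mean={mean_ms:.3f} ms, p50={p50_ms} ms, p99={p99_ms} ms, worst={worst_ms} ms")
```

In this made-up trace the median stays at the light-traffic latency and the mean rises only slightly, while the 99th-percentile and worst-case latencies are set entirely by the burst, which is why a deadline-driven system has to be evaluated on its tail rather than its average.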