Estimating treatment effects, especially individualized treatment effects (ITE), using observational data is challenging due to the complex situations of confounding bias. Existing approaches for estimating treatment effects from longitudinal observational data are usually built upon a strong assumption of "unconfoundedness", which is hard to fulfill in real-world practice. In this paper, we propose the Variational Temporal Deconfounder (VTD), an approach that leverages deep variational embeddings in the longitudinal setting using proxies (i.e., surrogate variables that serve for unobservable variables). Specifically, VTD leverages observed proxies to learn a hidden embedding that reflects the true hidden confounders in the observational data. As such, our VTD method does not rely on the "unconfoundedness" assumption. We test our VTD method on both synthetic and real-world clinical data, and the results show that our approach is effective when hidden confounding is the leading bias compared to other existing models.