We present a generative method to estimate 3D human motion and body shape from monocular video. Under the assumption that starting from an initial pose optical flow constrains subsequent human motion, we exploit flow to find temporally coherent human poses of a motion sequence. We estimate human motion by minimizing the difference between computed flow fields and the output of an artificial flow renderer. A single initialization step is required to estimate motion over multiple frames. Several regularization functions enhance robustness over time. Our test scenarios demonstrate that optical flow effectively regularizes the under-constrained problem of human shape and motion estimation from monocular video.