Radio technology enabled contact-free human posture and vital sign estimation is promising for health monitoring. Radio systems at millimeter-wave (mmWave) frequencies advantageously bring large bandwidth, multi-antenna array and beam steering capability. \textit{However}, the human point cloud obtained by mmWave radar and utilized for posture estimation is likely to be sparse and incomplete. Additionally, human's random body movements deteriorate the estimation of breathing and heart rates, therefore the information of the chest location and a narrow radar beam toward the chest are demanded for more accurate vital sign estimation. In this paper, we propose a pipeline aiming to enhance the vital sign estimation performance of mmWave FMCW MIMO radar. The first step is to recognize human body part and posture, where we exploit a trained Convolutional Neural Networks (CNN) to efficiently process the imperfect human form point cloud. The CNN framework outputs the key point of different body parts, and was trained by using RGB image reference and Augmentative Ellipse Fitting Algorithm (AEFA). The next step is to utilize the chest information of the prior estimated human posture for vital sign estimation. While CNN is initially trained based on the frame-by-frame point clouds of human for posture estimation, the vital signs are extracted through beamforming toward the human chest. The numerical results show that this spatial filtering improves the estimation of the vital signs in regard to lowering the level of side harmonics and detecting the harmonics of vital signs efficiently, i.e., peak-to-average power ratio in the harmonics of vital signal is improved up to 0.02 and 0.07dB for the studied cases.