Predicting the level of facial expression intensities based on videos allow capturing a representation of affect, which has many potential applications such as pain localisation, depression detection, etc. However, state-of-the-art DL(DL) models to predict these levels are typically formulated regression problems, and do not leverage the data distribution, nor the ordinal relationship between levels. This translates to a limited robustness to noisy and uncertain labels. Moreover, annotating expression intensity levels for video frames is a costly undertaking, involving considerable labor by domain experts, and the labels are vulnerable to subjective bias due to ambiguity among adjacent intensity levels. This paper introduces a DL model for weakly-supervised domain adaptation with ordinal regression (WSDA-OR), where videos in target domain have coarse labels representing of ordinal intensity levels that are provided on a periodic basis. In particular, the proposed model learn discriminant and domain-invariant representations by integrating multiple instance learning with deep adversarial domain adaptation, where an Inflated 3D CNN (I3D) is trained using fully supervised source domain videos, and weakly supervised target domain videos. The trained model is finally used to estimate the ordinal intensity levels of individual frames in the target operational domain. The proposed approach has been validated for pain intensity estimation on using RECOLA dataset as labeled source domain, and UNBC-McMaster dataset as weakly-labeled target domain. Experimental results shows significant improvement over the state-of-the-art models and achieves higher level of localization accuracy.