Special cameras that provide useful features for face anti-spoofing are desirable but not always available. In this work we propose a method that exploits the difference in dynamic appearance between bona fide and spoof samples by creating artificial modalities from RGB videos. We introduce two types of artificial transforms, rank pooling and optical flow, combined in an end-to-end pipeline for spoof detection. We demonstrate that using intermediate representations that contain fewer identity-related and fine-grained features increases model robustness to unseen attacks as well as to unseen ethnicities. The proposed method achieves state-of-the-art results on the largest cross-ethnicity face anti-spoofing dataset, CASIA-SURF CeFA (RGB).
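
To make the idea of an artificial modality concrete, the following is a minimal sketch of one widely used closed-form approximation of rank pooling, which collapses a clip into a single "dynamic image" by taking a weighted sum of frames with weights 2t - T - 1. This is an illustrative assumption, not necessarily the variant used in the proposed pipeline; the function name `approximate_rank_pooling` and the (T, H, W, C) array layout are hypothetical choices made for this example.

```python
import numpy as np


def approximate_rank_pooling(frames: np.ndarray) -> np.ndarray:
    """Collapse a (T, H, W, C) clip into one (H, W, C) dynamic image.

    Hypothetical helper: uses the linear weights 2t - T - 1 (t = 1..T),
    a common approximation of rank pooling on raw frames. The weights
    sum to zero, so content that is constant over time cancels out and
    only temporal change is kept.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]
    t = np.arange(1, num_frames + 1)
    weights = 2.0 * t - num_frames - 1.0
    # Weighted sum over the time axis.
    return np.tensordot(weights, frames, axes=(0, 0))


# A static clip carries no motion, so its dynamic image is all zeros.
static_clip = np.ones((4, 2, 2, 3))
static_image = approximate_rank_pooling(static_clip)

# A clip that brightens over time yields a positive response.
ramp_clip = np.arange(1, 5, dtype=np.float64).reshape(4, 1, 1, 1) * np.ones((4, 2, 2, 3))
ramp_image = approximate_rank_pooling(ramp_clip)
```

Because the weights cancel static appearance, a representation of this kind emphasizes temporal change over per-frame detail, which is consistent with the motivation stated above for reducing identity-related and fine-grained features.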