Abstract:The recent wave of AI research has enabled a new brand of synthetic media, called deepfakes. Deepfakes have impressive photorealism, which has generated exciting new use cases but also raised serious threats to our increasingly digital world. To mitigate these threats, researchers have tried to come up with new methods for deepfake detection that are more effective than traditional forensics and heavily rely on deep AI technology. In this paper, following up on encouraging prior work for deepfake detection with attribution and ensemble techniques, we explore and compare multiple designs for ensemble detectors. The goal is to achieve robustness and good generalization ability by leveraging ensembles of models that specialize in different manipulation categories. Our results corroborate that ensembles can achieve higher accuracy than individual models when properly tuned, while the generalization ability relies on access to a large number of training data for a diverse set of known manipulations.