Image manipulation and forgery detection have been a topic of research for more than a decade now. New-age tools and large-scale social platforms have given space for manipulated media to thrive. These media can be potentially dangerous and thus innumerable methods have been designed and tested to prove their robustness in detecting forgery. However, the results reported by state-of-the-art systems indicate that supervised approaches achieve almost perfect performance but only with particular datasets. In this work, we analyze the issue of out-of-distribution generalisability of the current state-of-the-art image forgery detection techniques through several experiments. Our study focuses on models that utilise handcrafted features for image forgery detection. We show that the developed methods fail to perform well on cross-dataset evaluations and in-the-wild manipulated media. As a consequence, a question is raised about the current evaluation and overestimated performance of the systems under consideration. Note: This work was done during a summer research internship at ITMR Lab, IIIT-Allahabad under the supervision of Prof. Anupam Agarwal.