Abstract:Fake online pharmacies have become increasingly pervasive, constituting over 90% of online pharmacy websites. There is a need for fake website detection techniques capable of identifying fake online pharmacy websites with a high degree of accuracy. In this study, we compared several well-known link-based detection techniques on a large-scale test bed with the hyperlink graph encompassing over 80 million links between 15.5 million web pages, including 1.2 million known legitimate and fake pharmacy pages. We found that the QoC and QoL class propagation algorithms achieved an accuracy of over 90% on our dataset. The results revealed that algorithms that incorporate dual class propagation as well as inlink and outlink information, on page-level or site-level graphs, are better suited for detecting fake pharmacy websites. In addition, site-level analysis yielded significantly better results than page-level analysis for most algorithms evaluated.