Title: PhishZip: A New Compression-based Algorithm for Detecting Phishing Websites

Jul 22, 2020

Rizka Purwanto, Arindam Pal, Alan Blair, Sanjay Jha

Abstract: Phishing has grown significantly in the past few years and is predicted to increase further. The dynamics of phishing make it challenging to implement a robust phishing detection system and to select features that still represent phishing as attacks change. In this paper, we propose PhishZip, a novel phishing detection approach that uses a compression algorithm to perform website classification, and we demonstrate a systematic way to construct the word dictionaries for the compression models using word occurrence likelihood analysis. PhishZip outperforms the use of best-performing HTML-based features in past studies, with a true positive rate of 80.04%. We also propose the use of compression ratio as a novel machine learning feature, which significantly improves machine learning based phishing detection over previous studies. Using compression ratios as additional features, the true positive rate improves by 30.3 percentage points (from 51.47% to 81.77%), while the accuracy increases by 11.84 percentage points (from 71.20% to 83.04%).
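The idea of classifying by compression ratio can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes DEFLATE via Python's `zlib` with a preset dictionary (`zdict`), and the word lists, the function names `compression_ratio` and `classify`, and the "smaller ratio wins" decision rule are all hypothetical choices made for illustration. The paper builds its dictionaries through word occurrence likelihood analysis, which is not reproduced here.

```python
import zlib


def compression_ratio(text: str, words: list[str]) -> float:
    """Return compressed size / original size for `text`, using DEFLATE
    primed with a preset dictionary built from `words`.

    A lower ratio means the text shares more substrings with the dictionary.
    """
    data = text.encode("utf-8")
    if not data:
        raise ValueError("text must be non-empty")
    # Preset dictionary: the words joined into one byte string (an assumption;
    # any byte string the compressor can match against would do).
    zdict = " ".join(words).encode("utf-8")
    comp = zlib.compressobj(level=9, zdict=zdict)
    compressed = comp.compress(data) + comp.flush()
    return len(compressed) / len(data)


def classify(text: str, phishing_words: list[str], legit_words: list[str]) -> str:
    """Hypothetical decision rule: pick the dictionary that compresses better."""
    r_phish = compression_ratio(text, phishing_words)
    r_legit = compression_ratio(text, legit_words)
    return "phishing" if r_phish < r_legit else "legitimate"


# Illustrative word lists only; these are not taken from the paper.
PHISHING_WORDS = ["verify your account", "password", "login", "suspended", "confirm identity"]
LEGIT_WORDS = ["weather forecast", "recipe", "football", "gardening", "holiday"]

page_text = (
    "Your account is suspended. Please verify your account and "
    "confirm identity by entering your password at the login page."
)
print(classify(page_text, PHISHING_WORDS, LEGIT_WORDS))
```

The two ratios can also be passed to a conventional classifier as extra numeric features alongside existing ones, which is the second use of compression ratio the abstract describes.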

* To appear in the proceedings of IEEE Conference on Communications and Network Security (CNS 2020)
