Phishing Website Detection Using C4.5 Decision Tree

Xiang YANG, Li YAN, Bo YANG, Ying-fang LI


Among the various anti-phishing solutions, machine learning techniques are considered promising. This study evaluated the C4.5 decision tree algorithm for phishing website detection. In the experiment, C4.5 learned two models from two phishing website datasets, PWD and PWD2; the latter was obtained by dimensionality reduction of the former. Under 10-fold cross-validation, various metrics indicated that both models performed very well. A corrected paired t-test at a significance level of 0.05 (two-tailed) showed that the accuracy and recall of the PWD-based model were not statistically significantly different from those of the PWD2-based model. The PWD2-based model, which had lower complexity and similar performance, was judged the better of the two, and the top 5 key features for classification were obtained from it.
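
The comparison step can be illustrated with a minimal sketch. It assumes the corrected paired t-test uses the Nadeau and Bengio variance correction, which the abstract does not spell out. The function name, the per-fold accuracies, and the single run of 10 folds are hypothetical and are not the study's data; the test-to-train ratio of 1/9 follows from 10-fold cross-validation.

```python
import math
import statistics


def corrected_paired_t(scores_a, scores_b, test_train_ratio):
    """Corrected resampled paired t statistic (hypothetical helper).

    Assumes the Nadeau-Bengio correction: the usual 1/k variance factor
    is replaced by (1/k + n_test/n_train) to account for overlapping
    training sets across folds.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    k = len(diffs)
    mean_diff = statistics.mean(diffs)
    var_diff = statistics.variance(diffs)  # sample variance (k - 1 denominator)
    if var_diff == 0:
        # Identical differences in every fold: no spread to test against.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / math.sqrt((1.0 / k + test_train_ratio) * var_diff)


# Illustrative per-fold accuracies only (NOT results from the paper).
acc_full = [0.96, 0.95, 0.97, 0.96, 0.95, 0.96, 0.97, 0.95, 0.96, 0.96]
acc_reduced = [0.95, 0.96, 0.96, 0.96, 0.96, 0.95, 0.97, 0.96, 0.95, 0.96]

# 10-fold cross-validation: each test fold is 1/9 the size of its training set.
t_stat = corrected_paired_t(acc_full, acc_reduced, test_train_ratio=1 / 9)

# Two-tailed critical value of Student's t for alpha = 0.05 and 9 degrees of freedom.
T_CRIT = 2.262
significant = abs(t_stat) > T_CRIT
```

If `significant` is false, the two models' accuracies are not distinguishable at the 0.05 level under these assumptions, which is the kind of outcome the abstract reports for PWD and PWD2.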


Phishing website, C4.5, Classification


