The dataset contains 549,346 entries.
- There are two columns.
- The Label column is the prediction target and has two categories:
  A. Good: the URL does not contain malicious content, so the site is not a phishing site.
  B. Bad: the URL contains malicious content, so the site is a phishing site.
- There are no missing values in the dataset.
The dataset was obtained from Kaggle (https://www.kaggle.com/datasets/taruntiwarihp/phishing-site-urls).
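The properties described above (entry count, column count, label categories, missing values) can be checked with a few lines of Python. The following is only a minimal sketch using the standard library. The function name `summarize`, the column names `URL` and `Label`, and the sample rows are assumptions made for illustration and are not taken from the dataset itself.

```python
import csv
import io
from collections import Counter


def summarize(csv_file, label_col="Label"):
    """Return row count, column names, label counts, and missing-cell count.

    `label_col` is an assumed column name; adjust it to match the real header.
    """
    reader = csv.DictReader(csv_file)
    label_counts = Counter()
    n_rows = 0
    n_missing = 0
    for row in reader:
        n_rows += 1
        label_counts[row.get(label_col)] += 1
        # Treat absent cells and empty/whitespace-only strings as missing.
        n_missing += sum(
            1
            for value in row.values()
            if value is None or (isinstance(value, str) and not value.strip())
        )
    return {
        "rows": n_rows,
        "columns": reader.fieldnames,
        "label_counts": dict(label_counts),
        "missing": n_missing,
    }


# Tiny made-up sample standing in for the real file (hypothetical URLs).
sample = io.StringIO(
    "URL,Label\n"
    "example.com/login,Good\n"
    "examp1e-secure.test/verify,Bad\n"
    "example.org/docs,Good\n"
)
summary = summarize(sample)
print(summary)
```

To run the same check on the downloaded data, open the CSV file with `open(path, newline="")` and pass the file object to `summarize`. The file name, the header names, and the exact spelling of the label values depend on the download and should be confirmed against the file before relying on the result.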
I naturally gravitate toward projects of this nature because I am interested in cybersecurity.
When our data science course coordinator asked us to produce a project using machine learning methods, each student chose a topic of personal interest.
I looked for challenging projects related to cybersecurity so that I could learn new topics.
When I first came across phishing website detection, I decided to take on this project to learn more about phishing websites and social engineering.