Scopus Harvesting Series

Naive Bayes classification for email spam detection

Zain Syed, University of Wollongong in Dubai
Omar Taher, University of Wollongong in Dubai

Publication Name

Advanced Interdisciplinary Applications of Machine Learning Python Libraries for Data Science

Abstract

Email is one of the cheapest forms of communication and is used by nearly every internet user, from individuals to businesses. Because of its simplicity and wide availability, it is vulnerable to spam sent with malicious intent, which is known to have caused huge financial losses and threatened the privacy of millions of individuals. Not all spam emails are malicious, but they are a nuisance to users regardless. For these reasons, there is a pressing need for spam detection systems that can automatically identify spam emails. This chapter proposes a Naive Bayes approach to building such a system, using a combination of the Enron Email dataset and the 419 fraud dataset. The datasets are lemmatized to improve execution time and accuracy, and grid search is among the techniques adopted to maximize accuracy. Finally, the model is evaluated using various metrics and a comparative analysis is performed.
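
The kind of pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the chapter's implementation: the toy messages, the `tokenize` helper, the `MultinomialNB` class, and the candidate smoothing values are all assumptions made for the example, and plain lowercasing stands in for the lemmatization step. It trains a multinomial Naive Bayes classifier with Laplace smoothing and runs a small grid search over the smoothing parameter `alpha` against a held-out validation set.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical preprocessing: lowercase and split on whitespace.
    # A real pipeline would lemmatize tokens at this point.
    return text.lower().split()


class MultinomialNB:
    """Illustrative multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; must be > 0

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in tokenize(doc)}
        # Log prior of each class from its share of the training labels.
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        # Per-class token counts.
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            denom = self.totals[c] + self.alpha * vocab_size
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((self.counts[c][tok] + self.alpha) / denom)
            scores[c] = score
        return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


# Toy data invented for this sketch (not drawn from the Enron or 419 datasets).
train_docs = [
    "win money now",
    "claim your free prize money",
    "urgent transfer of funds required",
    "meeting schedule for monday",
    "please review the attached report",
    "lunch on friday",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
val_docs = ["free money transfer", "monday report meeting"]
val_labels = ["spam", "ham"]

# Grid search: try each candidate alpha and keep the best validation accuracy.
best_alpha, best_acc, best_model = None, -1.0, None
for alpha in (0.1, 0.5, 1.0):
    model = MultinomialNB(alpha=alpha).fit(train_docs, train_labels)
    acc = accuracy(model, val_docs, val_labels)
    if acc > best_acc:
        best_alpha, best_acc, best_model = alpha, acc, model

print(best_alpha, best_acc, best_model.predict("free prize money"))
```

On real data, the grid would typically be scored with cross-validation rather than a single validation split, and metrics such as precision and recall would complement accuracy, since misclassifying legitimate mail as spam and missing spam carry different costs.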

Open Access Status

This publication is not available as open access

First Page

177

Last Page

201

Link to publisher version (DOI)

http://dx.doi.org/10.4018/978-1-6684-8696-2.ch007
