Abstract
The Industry 4.0 revolution will affect everyone's lives, directly or
indirectly. Many complex new applications will be developed in the coming
years that are difficult to predict from today's vantage point.
With the help of machine learning approaches and intelligent IoT devices, people will
be relieved of the overhead of the redundant work they currently perform.
Industry 4.0 has become a significant catalyst for innovation and development in
various industrial areas, such as production processes and quality improvement, while
offering greater flexibility. This chapter applies several machine learning algorithms to
spam detection, classifying emails as legitimate or spam. Seven classification models:
Decision Trees, Random Forest, Artificial Neural Network, Gradient Boosting
Machines, AdaBoost, Naive Bayes, and Support Vector Machines are applied. Three
benchmark spam datasets are obtained from standard repositories to conduct the
experiments. The chapter also presents a quantitative performance analysis. The results
of the experiments show that the ensemble methods Gradient Boosting and
AdaBoost outperformed the other methods, with overall accuracies of 98.70% and
98.18%, respectively. The ensemble models are effective on a large dataset
with a more extensive set of features. The non-ensemble methods ANN and Naive Bayes
also performed well on large datasets and are a viable alternative, with overall
accuracies of 98.38% and 97.63%, respectively, on test data.
Keywords: Cross-validation, Industrial revolution, Machine learning methods, Parameter optimization, Performance measurement, Preprocessing techniques.
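
The overall accuracy figures quoted in the abstract are the fraction of test emails that a classifier labels correctly. As a minimal sketch of how such a performance measurement could be computed, the following uses only the Python standard library. The function names (`confusion_counts`, `spam_metrics`), the label strings, and the eight example labels are hypothetical and are not taken from the chapter. Precision, recall, and F1 are included as measures commonly reported alongside accuracy; this is an assumption, since the abstract names only accuracy.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def spam_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall, and F1 for a binary spam task."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and predictions for eight test emails.
y_true = ["spam", "legit", "spam", "legit", "legit", "spam", "legit", "legit"]
y_pred = ["spam", "legit", "legit", "legit", "spam", "spam", "legit", "legit"]

report = spam_metrics(y_true, y_pred)
print(report)
```

In this toy example, six of the eight emails are labelled correctly, so the accuracy is 0.75. One missed spam email and one legitimate email flagged as spam give a precision and recall of 2/3 each.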