Abstract
Breast cancer claims a large number of lives each year. It is the most frequently
diagnosed cancer in women and the leading cause of cancer death among women worldwide.
Machine learning (ML) is well suited to classifying data, particularly in the medical
domain, where it is widely used for decision-making, classification, and analysis. The
main objective of this study is to compare the performance of several ML algorithms on
the Wisconsin Breast Cancer Dataset (WBCD): the XGBoost classifier, k-nearest neighbors
(KNN), Random Forest, and the Support Vector Machine (SVM). Accuracy was used as the
performance measure. The experimental results show that the SVM is more accurate than
KNN as the amount of training data increases. The SVM's accuracy also exceeds that of
KNN as the number of principal components (PCs) grows.
Keywords: Breast Cancer, Decision Tree, Exploratory Data Analysis, Histograms, KNN, Random Forest, SVM, UCI Machine Learning Repository, WBCD, XGBoost.
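
Since accuracy is the only performance measure named above, a minimal sketch of how it might be computed is given here for reference. The function name `accuracy`, the 0/1 label encoding, and the toy labels are illustrative assumptions, not taken from the study or from the WBCD data.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    # Count positions where the predicted label equals the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels for illustration only (assumed encoding: 1 = malignant, 0 = benign).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
print(accuracy(y_true, y_pred))  # 6 of 8 labels match -> 0.75
```

Accuracy weights both classes equally, so on an imbalanced dataset it can hide poor performance on the minority class.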