Abstract
Lung cancer is a form of carcinoma that develops as a result of aberrant cell
growth or mutation in the lungs. It is most often attributed to daily exposure to
hazardous chemicals, but this is not the only cause; other contributing
factors include smoking, secondhand smoke exposure, and family medical history.
Cancer cells, unlike normal cells, proliferate uncontrollably and cluster together to form
masses or tumors. Symptoms of the disease typically do not appear until cancer cells have
spread to other parts of the body and are interfering with the healthy functioning of
other organs. To address this late detection, Machine Learning (ML) algorithms are
used to diagnose lung cancer. The image datasets for this study were obtained from
Kaggle. The images are preprocessed using various approaches before being used to
train the image model. Texture-based Feature Extraction (FE) algorithms such as the
Gray-Level Run-Length Matrix (GLRM) and the Gray-Level Co-occurrence Matrix
(GLCM) are then used to extract the essential characteristics from the image dataset.
To develop a model, the extracted features are fed into ML classifiers such as the
Support Vector Machine (SVM) and the k-nearest neighbors (k-NN) algorithm. To
evaluate FE and classification, several performance metrics are used, such as
accuracy, error rate, sensitivity, and specificity.
Keywords: Classification, CT scan, Lung Adenocarcinoma, Performance Metrics, Texture.
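
The pipeline outlined in the abstract (GLCM texture features, a k-NN classifier, and the listed performance metrics) can be illustrated with a minimal sketch. It assumes a grayscale image already quantized to a small number of integer gray levels and uses only the Python standard library. The function names (`glcm`, `glcm_features`, `knn_predict`, `binary_metrics`), the single horizontal pixel offset, and the choice of three texture descriptors are illustrative assumptions, not the implementation used in the study. The GLRM features and the SVM classifier are omitted for brevity.

```python
import math
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset.

    Assumes `image` is a list of rows of integers in [0, levels) and that
    at least one pixel pair exists for the chosen offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, angular second moment, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    asm = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return [contrast, asm, homogeneity]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Euclidean)."""
    ranked = sorted(zip(train_x, train_y), key=lambda s: math.dist(s[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, error rate, sensitivity, and specificity for two classes.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy 4x4 image with 4 gray levels (illustrative data, not from the study).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_features = glcm_features(glcm(toy_image, levels=4))
```

In practice, each CT image would yield one feature vector, typically aggregated over several offsets and angles. Those vectors would then serve as `train_x` for the classifier, with held-out predictions passed to `binary_metrics`.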