A Deep Learning Algorithm for Lung Cancer Detection Using EfficientNet-B3

Abstract: Lung carcinoma is one of the main causes of death worldwide and imposes a global burden of morbidity and mortality. Detecting lung tumors at an early stage can help reduce the risks associated with lung cancer. This paper proposes a deep learning algorithm using EfficientNet B3 for lung cancer detection. The purpose is to improve detection accuracy, highlighting the potential to revolutionize the field of medical imaging and improve patient care. The proposed approach is built on the EfficientNet B3 model and classifies CT scan images into four classes: Normal, Squamous.cell.carcinoma, Large.cell.carcinoma, and Adenocarcinoma. The results showed that the proposed model reached an accuracy of 96%, an improvement rate of 2.13% over the best-trained classifier. This model can be generalized to improve lung cancer detection. The findings highlight the value of deep neural networks, particularly EfficientNet B3, in supporting the diagnosis and detection of lung disease, particularly in its early stages.


INTRODUCTION
Lung cancer is a serious type of cancer that affects people all around the world. It occurs when abnormal cells grow out of control in the lung tissues, forming tumors. Detecting lung cancer early is extremely important because it can greatly improve a patient's chances of survival and overall outcome [1]. Nevertheless, lung cancer can be challenging to diagnose because symptoms often appear only after the disease has progressed to later stages, which is why early detection is crucial for better patient outcomes and survival rates [2].
Lung cancer diagnosis has been transformed by advances in medical technology and the development of modern detection techniques. Various techniques, including chest radiographs and computed tomography (CT) scans, have been widely utilized in the detection of lung cancer. These techniques rely on the manual interpretation of medical images, which is time consuming and prone to human error. Recently, deep learning (DL) methods have offered more efficient and accurate ways to detect lung cancer and have demonstrated great promise in the healthcare field [3]. Deep learning is a branch of machine learning (ML) that has achieved remarkable success in areas as diverse as computer vision and natural language processing [4], [5]. Its ability to automatically learn features from raw data has made it particularly well-suited for medical imaging tasks. DL algorithms have shown promising results in the analysis of radiographic images, including chest X-rays and CT scans [6].
Lung cancer detection models face several challenges due to the inherent complexity and variability of cancerous nodules in the lungs. These challenges include hidden or rare patterns and variability between observers, both of which make it difficult to differentiate accurately between benign and malignant nodules. Deep learning methods offer a potential solution to these challenges by leveraging large datasets to learn complex patterns and features that are imperceptible to the human eye.
The aim of this paper is to propose a deep learning algorithm using EfficientNet B3 for detecting lung cancer. Such a model has the potential to revolutionize the field of medical imaging and improve patient care. The proposed model highlights the potential impact of automated lung cancer detection, including early diagnosis, treatment planning, and monitoring of disease progression.
This study is structured as follows. Section 2 provides an overview of the related work. Section 3 presents our proposed method. The results and their discussion are outlined in Section 4. Lastly, Section 5 concludes the findings of our study and provides recommendations for future research.

RELATED WORKS
Considerable research has been conducted on detecting lung cancer in its early stages. The study in [7] presented a deep transfer learning approach based on convolutional neural networks (CNNs) for classifying malignancy in lung nodules, with the ultimate goal of facilitating early detection of lung cancer (LC). By combining SVM-RBF with CNN-ResNet50, the researchers achieved the highest efficacy in extracting typical imaging biomarkers for classifying lung nodule cancer on chest CT images, with an accuracy (ACC) of 88.41% and an area under the curve (AUC) of around 93%. This research focuses on the ability of CNNs and deep transfer learning to support the diagnosis and detection of lung cancer. Lung cancer remains a major cause of cancer deaths worldwide, and this research contributes to addressing this persistent global health concern.
Another study [8] examined the evolution of deep CNNs and highlighted the impact of both transfer learning techniques and CNN architecture. The article emphasizes the valuable applications of CNN techniques in various areas of medical image evaluation, including classification, detection, and lesion segmentation. Notably, within the area of computer-aided detection (CAD), researchers are gradually turning to CNN-based approaches for the analysis of medical images. The study suggests that knowledge acquired from natural images can be effectively transferred to the domain of medical images, although certain challenges may arise due to disparities between the two image domains. Despite these challenges, CNN-based methods have considerable potential for advancing medical image analysis.
Sajja et al. [9] presented a deep neural network based on GoogleNet, a pre-trained CNN, to separate benign from cancerous tissue in CT scan images for lung cancer detection. The authors evaluated the effectiveness of the proposed model by comparing it with other pre-trained CNNs, including ResNet50, AlexNet, and GoogleNet, on the Lung Image Database Consortium (LIDC) dataset. The findings demonstrated the superiority of the proposed network, which attained a precision of 93.33%, surpassing that of the other networks.
Humayun et al. [10] proposed a deep neural network for computer-aided diagnosis (CAD) of lung cancer. This study addressed the challenge of limited data availability in medical image analysis by incorporating domain adaptation (DA) techniques when building the classifier. The presented model is effective and non-invasive, and it has fewer parameters than existing state-of-the-art approaches. The research specifically examined the performance of the Xception, VGG 19, and VGG 16 models in classifying a lung tissue nodule dataset, and the results highlighted the ability of these models to achieve accurate classification. The findings emphasise the ability of transfer learning, preprocessing techniques, and deep neural networks to facilitate the diagnosis and detection of lung cancer.
Al-Huseiny and Sajit [11] proposed deep neural networks (DNNs) to detect lung tumors in their early phases by identifying images that contain cancerous nodules. The proposed algorithm was trained and validated on lung cancer data and achieved an accuracy of 94.38%. Additionally, the algorithm outperformed the benchmark technique. The research highlights the potential of these technologies to improve early detection, leading to more successful outcomes in the management of lung cancer cases.
Recently, Mamun et al. [12] proposed a model based on MobileNetV2 and CNN that holds promise for advancing precise healthcare by facilitating accurate and timely screening procedures. The model aims to harness artificial intelligence (AI) for the early detection of lung cancer through the analysis of CT scans, with the goal of improving lung health outcomes.
To estimate the performance of their model, the authors compared it with the Inception V3, ResNet-50, and Xception models using metrics such as recall, area under the curve (AUC), loss, and accuracy. The proposed CNN outperformed the other models, demonstrating its advantage over traditional methods, and achieved an accuracy of 92%.
Nibali et al. [13] presented a way to enhance the predictive capacity of computer-aided diagnosis (CAD) by detecting cancer in lung nodules using CT scan images. This research uses a deep CNN with a ResNet architecture to classify lung nodules as benign or malignant. The study uses the LIDC/IDRI dataset to address the lack of publicly available datasets. The system performs well on metrics such as precision, specificity, accuracy, and AUROC. The combination of transfer learning, curriculum learning, and deep residual learning improves the precision of nodule classification, with potential applications in other medical imaging domains. The system achieved an accuracy of 89.90%.
Shaffie et al. [14] proposed a novel framework for categorising lung nodules using computed tomography scans. The framework uses a Markov-Gibbs random field model to describe the appearance of the nodules and geometric characteristics to represent their shape. The appearance model accurately depicts the observed nodules and is combined with the extracted geometric characteristics. The framework was evaluated using publicly available data from the Lung Image Database Consortium, achieving a precision of 91.20%, which demonstrates its potential for lung cancer detection.
A recent study by Al-Shouka and Alheeti proposed a deep 2D CNN approach for lung cancer diagnosis. Lung cancer is a high-risk disease with a high mortality rate, making it the third most common cause of death for women and the leading cause of male mortality. Early tumour detection is crucial for effective therapy. Traditional medical imaging techniques, such as X-rays and CT scans, have limited promise for lung tumour identification. This study achieved 86% accuracy, 92% recall, and an 87% F1 score [15]. The same researchers extended their work by using MobileNetV2 with CNN transfer learning, together with ResNet, VGG16, and Xception models, for analyzing healthcare images. Transfer learning adapts previously built models to new tasks, providing efficient healthcare image analysis that can improve lung cancer diagnosis and treatment. This approach achieved a highest accuracy of 0.94, demonstrating the potential of AI systems in healthcare [16].

RESEARCH METHODOLOGY
The proposed classifier is built on a multistage EfficientNet B3 architecture. The dataset is first preprocessed using data augmentation. The lung cancer classification strategy is then applied with the proposed classifier, and the test data are evaluated using criteria such as accuracy, F1 score, recall, and precision. Figure 1 shows the proposed framework.

DATASET
This study used chest CT scan images downloaded from Kaggle. The dataset provides an extensive compilation of chest CT scans focusing primarily on lung diseases. It comprises 1,000 scans obtained from patients diagnosed with various pulmonary conditions, including large.cell.carcinoma, adenocarcinoma, and squamous.cell.carcinoma. Additionally, the dataset includes normal lung scans for comparative analysis. The dataset was split into three subsets: the training set (613 images), the validation set (315 images), and the test set (72 images). The split was applied evenly across the four classes. Figure 2 shows a sample of the dataset [17].
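As a quick check on the reported split, the short sketch below recomputes the total and the share of each subset from the three counts given above. The variable names are illustrative only and do not come from the original work.

```python
# Subset sizes as reported in the text.
split_sizes = {"train": 613, "validation": 315, "test": 72}
total = sum(split_sizes.values())

# Share of the full dataset held by each subset, in percent.
split_share = {name: round(100 * n / total, 1) for name, n in split_sizes.items()}

print(total)
print(split_share)
```

The three subsets add up to the 1,000 scans stated for the dataset, with roughly three fifths of the images used for training.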

DATASET AUGMENTATION
Data augmentation generates additional training data by transforming or modifying existing images. This concept is commonly employed in computer vision and machine learning applications to increase the dataset size and enhance the performance of models. In our study, we utilised various data augmentation techniques, including image rotation, fill_mode, height_shift, width_shift, horizontal flip, zoom, and shear, as illustrated in Table 1. This table summarises the augmentation settings integrated with the proposed deep learning model.
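The transformations listed above correspond closely to the arguments of the Keras `ImageDataGenerator` utility, so one possible configuration can be sketched as follows. The numeric values, the image size, the batch size, and the helper name `make_train_generator` are hypothetical placeholders, since the settings table is not reproduced here; only the list of transformations comes from the text. Note that `ImageDataGenerator` is treated as a legacy utility in newer Keras releases.

```python
# Transformations named in the text. The numeric values are hypothetical
# placeholders, not the settings used in the study.
AUGMENTATION_PARAMS = {
    "rotation_range": 10,        # degrees of random rotation
    "width_shift_range": 0.1,    # fraction of image width
    "height_shift_range": 0.1,   # fraction of image height
    "shear_range": 0.1,
    "zoom_range": 0.1,
    "horizontal_flip": True,
    "fill_mode": "nearest",      # how pixels exposed by a shift are filled
}


def make_train_generator(train_dir, target_size=(300, 300), batch_size=32):
    """Return an augmenting batch iterator over a class-per-folder directory.

    Requires TensorFlow; imported lazily so the configuration above can be
    inspected without it. target_size and batch_size are assumptions.
    """
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_PARAMS)
    return datagen.flow_from_directory(
        train_dir,
        target_size=target_size,
        batch_size=batch_size,
        class_mode="categorical",  # one-hot labels for the four classes
    )
```

Augmentation of this kind is normally applied only to the training subset, so that the validation and test images remain unmodified.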

TRANSFER LEARNING
Transfer learning, which is closely related to domain adaptation, is a conceptual framework that uses knowledge from one domain to tackle related tasks [18]. In this study, the model parameters were pre-trained on the ImageNet dataset, which is of significant importance in deep learning applications. Transfer learning reduces the reliance on labelled data, particularly in scenarios where data collection is limited. One notable strength of this approach is its ability to generalise well across different tasks. The study used transfer learning and fine-tuning techniques to implement EfficientNet B3, demonstrating its effectiveness in addressing related tasks.

PROPOSED EFFICIENTNET B3
After data augmentation is applied to the chest CT scan image dataset, the transfer learning model is applied. This research focused on the EfficientNet family of models, specifically EfficientNet B3. EfficientNet models are designed to strike a balance between accuracy and efficiency in convolutional neural networks (CNNs) through a strategy called compound scaling. This approach uniformly scales the width, depth, and resolution dimensions using predefined coefficients. EfficientNet B3, as part of this family, embodies this balance with a specific architecture tailored to offer notable generalizability [19].
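To make compound scaling concrete, the sketch below computes the depth, width, and resolution multipliers for a compound coefficient phi. The base coefficients (1.2, 1.1, 1.15) are the values reported in the original EfficientNet paper rather than in this article, and the function names are illustrative. The released B-variants use rounded, hand-tuned multipliers, so the loop over phi should not be read as an exact description of EfficientNet B3.

```python
# Base coefficients from the EfficientNet paper (not stated in this article).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """Approximate FLOPs growth: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{flops_multiplier(phi):.2f}")
```

Because the base coefficients were chosen so that depth times width squared times resolution squared is close to 2, each unit increase in phi roughly doubles the computational cost.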
In the construction of our model, we used the Keras library to instantiate EfficientNet B3 as the base model with pre-trained weights from ImageNet. To leverage the pre-trained features effectively, we froze the first half of the layers. Subsequently, we added custom layers on top of the base model. A Global Average Pooling layer was employed to reduce spatial dimensions, followed by two Dense layers (with 128 and 64 units) with ReLU activation functions. Dropout layers (with a rate of 0.5) were inserted to introduce regularization and prevent overfitting [20].
The final layer, a Dense layer with softmax activation, was configured for our multi-class classification task (four classes). This model demonstrated its effectiveness in capturing intricate patterns, especially in higher-quality images. The compound scaling method allows EfficientNet B3 to balance model parameters, accuracy, and computational efficiency. Table 2 presents the hyperparameters of the model.
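The architecture described above can be sketched in Keras roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input shape, the placement of a Dropout layer after each Dense layer, and the function name `build_model` are assumptions, while the layer sizes, dropout rate, number of classes, and the frozen first half of the base model come from the text.

```python
# Values stated in the text.
NUM_CLASSES = 4
DENSE_UNITS = (128, 64)
DROPOUT_RATE = 0.5
# Assumption: the input resolution is not given in the text.
INPUT_SHAPE = (300, 300, 3)


def build_model(weights="imagenet"):
    """Build an EfficientNet B3 classifier with a custom head.

    Requires TensorFlow, which is imported lazily inside the function.
    """
    from tensorflow import keras
    from tensorflow.keras import layers

    base = keras.applications.EfficientNetB3(
        include_top=False, weights=weights, input_shape=INPUT_SHAPE
    )

    # Freeze the first half of the base layers, as described in the text.
    cutoff = len(base.layers) // 2
    for layer in base.layers[:cutoff]:
        layer.trainable = False

    x = layers.GlobalAveragePooling2D()(base.output)
    for units in DENSE_UNITS:
        x = layers.Dense(units, activation="relu")(x)
        # Assumption: one Dropout layer follows each Dense layer.
        x = layers.Dropout(DROPOUT_RATE)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    return keras.Model(inputs=base.input, outputs=outputs)
```

Calling `build_model()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random initialisation, which can be useful for inspecting the layer structure offline.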

EVALUATION
To validate the performance of the proposed EfficientNet B3 model, we evaluate it using several standard performance metrics: accuracy, precision, recall, and F1-score.
Accuracy is one of the most widely applied metrics in this domain. It measures the proportion of all predictions that the model got right. It is calculated by dividing the sum of true negatives (TN) and true positives (TP) by the sum of TP, TN, false negatives (FN), and false positives (FP) [21].
Precision measures the proportion of positive predictions that are correct. It is determined by dividing the number of TP by the sum of TP and FP [22].
Recall, also known as sensitivity or the true positive rate, represents the proportion of actual positive cases that the model correctly identified. It is calculated by dividing the number of TP by the sum of TP and FN [23].
The final metric we use is the F1 score. This score is a composite metric computed as twice the product of precision and recall divided by their sum, that is, their harmonic mean. It provides a balanced assessment of system performance that considers both recall and precision [24].
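The four metrics defined above can be computed from confusion counts as in the following sketch. It treats one class at a time as the positive class (one-vs-rest), which is a common way to apply these definitions to a four-class problem; whether the study averaged per-class scores in this way is not stated, so that choice is an assumption. The function names and the toy labels are hypothetical.

```python
def per_class_counts(y_true, y_pred, label):
    """Count TP, FP, FN, TN when `label` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return accuracy, precision, recall, f1


# Toy labels for illustration only; these are not results from the study.
y_true = ["normal", "normal", "adeno", "adeno", "squamous", "large"]
y_pred = ["normal", "adeno", "adeno", "adeno", "squamous", "large"]

counts = per_class_counts(y_true, y_pred, "adeno")
print(counts)
print(classification_metrics(*counts))
```

Guarding the divisions avoids errors for a class that is never predicted or never present, at the cost of reporting 0.0 for an undefined ratio.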

RESULTS AND DISCUSSION
The CT images were classified into four classes: Adenocarcinoma, Large cell carcinoma, Squamous cell carcinoma, and Normal. The model was trained using the EfficientNet B3 architecture. The proposed model achieved the highest classification accuracy and outperformed the compared classifiers in all testing scenarios, reaching an accuracy of 96%, which is higher than that reported by the other studies considered. Table 3 presents the results.
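The abstract reports an improvement rate of 2.13% over the best-trained classifier. One reading that reproduces this figure, assuming the baseline is the 0.94 accuracy reported in [16] and that the rate is a relative improvement, is sketched below. This interpretation is an assumption; the formula used is not stated in the text.

```python
proposed_accuracy = 0.96  # accuracy reported for the proposed model
baseline_accuracy = 0.94  # assumed baseline: best accuracy reported in [16]

# Relative improvement over the baseline, in percent.
improvement_rate = (proposed_accuracy - baseline_accuracy) / baseline_accuracy * 100

print(round(improvement_rate, 2))
```

Under this reading the absolute gain is two percentage points, while the relative gain is slightly larger because it is measured against the baseline accuracy.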
Figure 3 shows the performance curves of the model: panel (a) is the ROC curve and panel (b) is the precision-recall curve.
It is important to compare the proposed results with state-of-the-art research using the same dataset. The study in [15] presents a probabilistic deep 2D CNN diagnosis method for lung cancer, with the aim of improving early tumour detection for effective therapy. Although traditional imaging techniques such as X-rays and CT scans have limited promise, the study achieved 86% accuracy. The study in [25] proposes three CNN models for detecting lung cancer using the VGG16, ResNet50V2, and DenseNet201 architectures. The ensemble model, which combines these models, achieves 91% validation accuracy, outperforming other existing models. The study in [12] suggests that convolutional neural networks (CNNs) and MobileNetV2 can improve medical image classification by providing timely and accurate screening procedures. The model was trained to provide a quick classification of lung cancer using CT scan images and achieved an accuracy of 92%. The study in [16] proposed a CNN domain adaptation technique with the Xception, VGG16, MobileNetV2, and ResNet models to analyze medical images, demonstrating the potential of machine learning models to improve lung cancer diagnosis and treatment, with the highest accuracy achieved at 0.94. Based on these results, the proposed EfficientNet B3 model remains competitive. Figure 4 shows a comparison of the results with related studies.

CONCLUSIONS
This study presents a deep learning algorithm using EfficientNet B3 for lung cancer detection, showing its potential to improve medical imaging and patient care. The algorithm distinguishes malignant from normal tissue in CT scan images, achieving a 96% accuracy rate, which exceeds that of the best-trained classifier used for comparison. The model also shows good generalisability, indicating its potential to enhance lung cancer detection in diverse cases. The high accuracy of the algorithm suggests its potential to improve patient outcomes and reduce the burden of lung cancer. Future research should focus on validating the algorithm on larger datasets and integrating other clinical data. Advancements in deep learning algorithms and neural network architectures could further enhance the accuracy and efficiency of lung cancer detection.

FIGURE 2. Sample of CT scan chest image dataset

FIGURE 3. (a) ROC curve; (b) Precision and recall curve

FIGURE 4. Results of a comparison with related studies