Lung cancer is the leading cause of cancer death in both men and women, accounting for roughly one in five cancer deaths worldwide. It is currently the most common and deadliest cancer globally, with over two million new cases diagnosed each year and more deaths than colon, breast, and prostate cancers combined.
Despite significant advances in diagnosis and treatment, lung cancer is still associated with poor clinical outcomes, and survival depends strongly on the stage of disease at diagnosis. Developing an effective screening method for early diagnosis is therefore paramount and has been a long-term goal in lung cancer care.
Several screening methods have been used in the last decade, including sputum cytology, chest radiographs (CXR), and low-dose computed tomography (LDCT). Although screening for lung cancer with LDCT has demonstrated a clear benefit in detecting disease early and reducing mortality, the method has limitations, most notably a high rate of false positives and the cost of the unnecessary diagnostic procedures needed to confirm or rule out those findings.
The emergence of artificial intelligence (AI) as a tool for assessing medical data opens up new opportunities for improving lung cancer diagnosis. Because AI can recognize complex patterns and quantify information obtained from tissue or fluid biomarkers, electronic medical records (EMR), magnetic resonance imaging (MRI), or computed tomography (CT) scans, it has the potential to help clinicians decipher the clinical significance of data, leading to improved diagnosis of lung diseases.
Using complex algorithms and software, AI can emulate human cognition in analyzing, interpreting, and comprehending complex data. It is already being applied successfully in various healthcare settings.
In the last decade, several AI models have been used for lung cancer screening, and some algorithms performed as well as, or even better than, experienced radiologists in distinguishing benign from malignant lung nodules. In addition, some of those models improved diagnostic accuracy and decreased the false-positive rate.
This post will explore various artificial intelligence tools and machine learning or deep learning models in lung cancer screening.
Cancer screening is the process of detecting undiagnosed cancer in asymptomatic people through tests, examinations, and other detection procedures. Its goal is early disease detection. Today, AI tools take advantage of the capacity of computers to analyze large volumes of data and images for features that may be specific to lung cancer, with the aim of differentiating more accurately between benign and cancerous nodules.
With automated segmentation, rapidly growing computing speed, and increasingly efficient algorithms, AI has the potential to improve the efficiency, reproducibility, and accuracy of tumor identification. Furthermore, AI algorithms may eventually make a separate segmentation analysis of suspicious images unnecessary, making it possible to evaluate whole-body imaging data.
Below is a list of AI models used by researchers to analyze low-dose computed tomography (LDCT) images for lung cancer diagnosis.
1. Convolutional neural network (CNN)
This model, introduced by Francesco Ciompi, was one of the first to illustrate the potential of AI algorithms to assist radiologists in assessing pulmonary nodules in lung cancer screening. It used a multi-stream convolutional network architecture to analyze lung nodules and classify them into the nodule types relevant for patient management under the Lung-RADS assessment categories and the PanCan malignancy probability model. Data from the MILD trial (943 patients; 1352 nodules) were used to train the model, which was then independently validated using data from the DLCST trial (468 patients; 639 nodules).
2. Dynamic Bayesian networks (DBN)
Panayiotis Petousis proposed this method, which used a set of dynamic Bayesian networks to assess whether combining longitudinal data from lung cancer screening programs improves diagnostic accuracy. The researchers built the models from LDCT screening outcomes together with demographic data, smoking status, cancer history, family history of lung cancer, exposure risk factors, and lung cancer comorbidities from the LDCT arm of the NLST dataset. They then tested the models on the entire LDCT arm of the NLST dataset. The networks outperformed traditional comparison models such as logistic regression and naive Bayes in terms of generalization, indicating that coupling LDCT imaging data with demographic and clinical characteristics could help improve the accuracy of lung cancer screening programs.
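To make the idea of combining results across screening rounds concrete, here is a minimal sketch of forward filtering in a two-state dynamic Bayesian network (the simplest case, equivalent to a hidden Markov model). It is not the network from the study: the states, every probability value, and the function name `forward_filter` are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state model: the hidden state is "no_cancer" or "cancer",
# and each screening round yields a "negative" or "positive" LDCT result.
# All probabilities are illustrative assumptions, not values from the study.
STATES = ("no_cancer", "cancer")

PRIOR = {"no_cancer": 0.99, "cancer": 0.01}

# P(state at round t | state at round t-1)
TRANSITION = {
    "no_cancer": {"no_cancer": 0.98, "cancer": 0.02},
    "cancer": {"no_cancer": 0.0, "cancer": 1.0},
}

# P(observed result | hidden state)
EMISSION = {
    "no_cancer": {"negative": 0.75, "positive": 0.25},
    "cancer": {"negative": 0.10, "positive": 0.90},
}


def forward_filter(observations):
    """Return P(state | results so far) after each screening round."""
    belief = dict(PRIOR)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict step: push the belief through the transition model.
            belief = {
                s: sum(belief[p] * TRANSITION[p][s] for p in STATES)
                for s in STATES
            }
        # Update step: weight each state by the likelihood of the result.
        unnorm = {s: belief[s] * EMISSION[s][obs] for s in STATES}
        total = sum(unnorm.values())
        belief = {s: unnorm[s] / total for s in STATES}
        history.append(belief)
    return history


rounds = forward_filter(["positive", "positive", "positive"])
for t, b in enumerate(rounds):
    print(f"round {t}: P(cancer) = {b['cancer']:.3f}")
```

Under these assumed numbers, a single positive scan leaves the estimated probability of cancer low, while repeated positive scans push it up round after round, which is the intuition behind using longitudinal data rather than a single result.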
3. Three-dimensional convolutional neural network
Chao Zhang used a three-dimensional convolutional neural network to classify pulmonary nodules derived from clinical CT images. The model was first trained using LDCT images from lung cancer screenings available in public databases, then validated using clinical LDCT images from four different hospitals. It was then evaluated on a 50-image set from patients who underwent surgical resection and had preoperative CT images prospectively collected. Finally, the algorithm's performance was compared to that of other algorithms.
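As a rough illustration of what makes a convolution three-dimensional, the sketch below slides a small kernel through a volume along depth, height, and width, so each output value summarizes a small cube of voxels rather than a flat patch. It uses plain Python lists, no padding, and a stride of 1. The function name `conv3d_valid` and the toy volume are assumptions for illustration; a real model would use a deep learning framework with many learned kernels.

```python
def conv3d_valid(volume, kernel):
    """Cross-correlate a 3D volume with a 3D kernel (no padding, stride 1).

    Both arguments are nested lists indexed as [depth][row][col].
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(depth - kd + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                # Weighted sum over the kd x kh x kw cube anchored at (z, y, x).
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy 3x3x3 volume with one bright voxel in the center (a stand-in for a nodule).
volume = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
volume[1][1][1] = 8.0

# 2x2x2 averaging kernel: each output is the mean of the eight voxels it covers.
kernel = [[[0.125] * 2 for _ in range(2)] for _ in range(2)]

response = conv3d_valid(volume, kernel)  # 2x2x2 output
```

Every 2x2x2 window in this toy volume contains the center voxel, so each of the eight output values equals 8.0 × 0.125 = 1.0.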
4. Machine learning and DBN
Panayiotis Petousis and his colleagues also used various machine learning-based methods to create a framework for learning a partially observable Markov decision process that improves test specificity while optimizing lung cancer detection. Using a dataset of 5,402 unique single-nodule trajectories of lung cancer screening patients from the NLST LDCT trial, the model was trained and tested with inverse reinforcement learning to discover a reward function based on expert decisions. The model was highly accurate, with a true-positive rate comparable to that of human experts and a lower false-positive rate.
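The two rates reported here come straight from a confusion matrix. The sketch below computes them from paired lists of ground-truth labels and model decisions; the function name `screening_rates` and the toy labels are hypothetical and not taken from the study.

```python
def screening_rates(y_true, y_pred):
    """Return (true_positive_rate, false_positive_rate) for binary labels.

    A label of 1 means cancer, 0 means no cancer.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # True-positive rate (sensitivity): share of cancers that were flagged.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    # False-positive rate (1 - specificity): share of healthy cases flagged.
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Toy example: 3 cancers (2 caught) and 5 non-cancers (1 falsely flagged).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
tpr, fpr = screening_rates(y_true, y_pred)  # tpr = 2/3, fpr = 1/5
```

Improving specificity means lowering the false-positive rate, so the result described above amounts to holding the first number steady while reducing the second.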
5. Deep learning
Another study, by Peng Huang, proposed a deep learning algorithm that evaluates all relevant nodule and non-nodule features on screening chest CT scans to predict the presence of lung cancer within three years. The model was trained on data from the NLST trial, which included participants who had received at least two CT screening scans separated by at least two years, and validated on data from the PanCan study. As part of this validation, two experienced chest radiologists at large academic centers each assessed every LDCT image. Using time-dependent area under the receiver operating characteristic curve (AUC) analysis, the accuracy of the algorithm's scores in predicting lung cancer incidence at 1, 2, and 3 years was compared with that of the Lung-RADS system and volume doubling time.
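Volume doubling time, one of the comparators mentioned above, is commonly estimated by assuming exponential growth between two scans: VDT = t × ln 2 / ln(V2 / V1), where V1 and V2 are the nodule volumes and t is the interval between scans. The sketch below implements that standard formula. The function name `volume_doubling_time` and the example volumes are hypothetical, and the study itself may have computed the measure differently.

```python
import math


def volume_doubling_time(v1_mm3, v2_mm3, interval_days):
    """Estimate nodule volume doubling time in days, assuming exponential growth.

    v1_mm3 and v2_mm3 are the volumes at the earlier and later scan.
    A negative result means the nodule shrank between scans.
    """
    if v1_mm3 <= 0 or v2_mm3 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1_mm3 == v2_mm3:
        return math.inf  # no growth: the volume never doubles
    return interval_days * math.log(2) / math.log(v2_mm3 / v1_mm3)


# A nodule growing from 100 to 400 mm^3 in 200 days doubled twice,
# so its doubling time is about 100 days.
vdt = volume_doubling_time(100.0, 400.0, 200.0)
```

Shorter doubling times indicate faster growth, which is why the measure is used as a simple baseline for malignancy risk.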
6. Deep learning (Three-dimensional deep convolutional neural networks)
Diego Ardila developed a three-dimensional deep convolutional neural network model that used patients' current and prior CT volumes from LDCT scans to predict lung cancer risk in high-risk individuals. When tested on 6716 NLST cases, the model achieved an area under the curve (AUC) of 94.4%. It performed similarly on an independent clinical validation set of 1139 cases. When previous CT images were available, the model performed on par with six expert radiologists. When previous CT images were not available, however, it outperformed all six radiologists, with absolute reductions of 11% in false positives and 5% in false negatives.
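The AUC quoted above can be read as the probability that a randomly chosen cancer case receives a higher model score than a randomly chosen non-cancer case. The sketch below computes it directly from that definition by comparing every positive-negative pair, counting ties as half. The function name `roc_auc` and the toy scores are hypothetical; this is not the evaluation code from the study, and a library routine would be the usual choice for large datasets.

```python
def roc_auc(labels, scores):
    """Compute ROC AUC by pairwise comparison (1 = cancer, 0 = no cancer)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: three cancer cases and three non-cancer cases.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.4]
auc = roc_auc(labels, scores)  # 7.5 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value of 94.4% means the model ranks a cancer case above a non-cancer case in the large majority of pairs.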