World Scientific

Comparison of two reader modes of computer-aided diagnosis in lung nodules on low-dose chest CT scan

    https://doi.org/10.1142/S1793545822500134

    Abstract

    Low-dose computerized tomography (LDCT) is of great significance for the monitoring and management of pulmonary nodules on chest computerized tomography (CT). Nevertheless, the malignant potential of these nodules is often difficult to determine, especially for smaller pulmonary nodules on LDCT images. Recent state-of-the-art computer-aided detection (CAD) systems have attempted to address this problem by identifying small nodules that are easily missed in clinical practice. CAD is used in two reading modes: concurrent-reader (CR) mode or second-reader (SR) mode. In this study, we prospectively evaluated the efficiency of a CAD system’s SR and CR modes in detecting pulmonary nodules on LDCT. We found that the SR mode improves pulmonary nodule detection regardless of the dose and experience level, especially for interns in the low-dose setting. The CR mode maintains the sensitivity of the SR mode while significantly decreasing reading times.

    1. Introduction

    Lung cancer is one of the most common fatal diseases and has killed millions of people worldwide. Pulmonary nodules often occur in early-stage lung cancer, when the disease can still be treated effectively. Additionally, pulmonary metastases, which often manifest as pulmonary nodules, occur in approximately one third of extra-thoracic malignancies [1]. However, the malignant potential of these nodules is often difficult to determine, especially for smaller pulmonary nodules, because it is impractical or difficult to establish a definitive diagnosis by obtaining pathological tissue [2]. It is therefore important to identify and monitor pulmonary nodules as early as possible.

    Because it is noninvasive and low in cost, chest CT has become a routine method for detecting lung cancer and a very effective method for detecting small pulmonary nodules [3]. For example, reports have shown a high detection rate of pulmonary nodules in patients with extra-pulmonary malignancies on thin-section CT [4]. To decrease cumulative radiation exposure from repeated routine CT exams, low-dose computerized tomography (LDCT) can be employed as an effective option for routine monitoring of metastatic pulmonary nodules [5,6]. However, the low dose and thin slice thickness increase image noise in LDCT images, which may make it more difficult for the radiologist to find lesions [7]. In addition, the workload of viewing a large number of images fatigues radiologists, contributing to false-negative results in diagnostic decisions [8,9,10]. Recent state-of-the-art CAD systems based on deep learning (DL) have attempted to address this problem by identifying small nodules that are easily missed in clinical practice [8,11]. Several studies have shown that CAD improves the detection of pulmonary nodules and inter-reader agreement, leading to more consistent follow-up recommendations across radiologists with different levels of experience [8,11,12,13,14,15]. Previous studies [1,16,17,18] have evaluated CAD for the detection of pulmonary nodules in patients with extra-thoracic malignancies on standard-dose CT (SDCT). However, studies evaluating DL-based CAD for the detection of pulmonary nodules in patients with extra-pulmonary malignancies on low-dose CT images remain scarce.

    CAD is often used in two different reading modes: CR mode, in which CAD is deployed simultaneously with the unassisted interpretation [19], or SR mode, in which CAD is deployed only after the radiologist has performed a full unassisted analysis of the data set [20]. The SR mode is more effective and accurate [11,13,14,15,21,22], but it requires a longer reading time because the images have to be read twice [23]. By comparison, the CR mode is potentially more attractive for its efficiency, although its effectiveness remains controversial [23,24]. Matsumoto et al. [24] reported that the sensitivity in the CR mode increased significantly compared with unaided reading. In contrast, Beyer et al. [23] found that the CR mode led to no increase in sensitivity.

    The purpose of this study was to prospectively assess the value of a CAD system in SR and CR modes for the detection of pulmonary nodules on LDCT images of patients with extra-thoracic malignancies as evaluated by radiologists with a wide spectrum of experience in the clinical setting. We hypothesized that, on both SDCT and LDCT images, CAD would improve the detection of pulmonary nodules and make detection rates more comparable across radiologists of all experience levels. We also hypothesized that the CR mode would decrease reading time.

    2. Materials and Methods

    The local institutional review board approved this prospective study. Written consent was obtained from all participants.

    2.1. Patients

    From July to December 2017, 129 patients with extra-thoracic malignancies who were consecutively scheduled for chest CT were recruited. Exclusion criteria were severe pulmonary fibrosis, severe emphysema, extensive pulmonary infection, tuberculosis or sarcoidosis, and massive pleural effusion. One hundred and seventeen patients were ultimately included. The present study was approved by the ethics committee of Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology (TJ-IRB20180615).

    2.2. CT acquisition

    All patients underwent two noncontrast chest CT scans during an end-inspiratory breath-hold: (1) an SDCT scan (120 kVp with 150 mAs, NI = 13); (2) an LDCT scan (120 kVp for BMI ≥ 22, n = 88; 100 kVp for BMI < 22, n = 41, with 20 mAs, NI = 30) on a 64-row multidetector system (Discovery CT750 HD; GE Healthcare). The dose was calculated based on a simulated human body model and the Monte Carlo method [25]. The mean size-specific dose estimate (SSDE) was calculated to be 10.54 ± 1.6 mGy and 8.43 ± 1.54 mGy for SDCT images and 2.24 ± 0.14 mGy and 1.55 ± 0.07 mGy for LDCT images [26]. The effective dose according to ICRP Publication 103 for LDCT versus SDCT was 1.23 ± 0.14 versus 6 ± 0.89 mSv and 0.76 ± 0.1 versus 4.71 ± 0.97 mSv. Data were reconstructed to 1.25 mm-thick sections with ASIR 60% and “Standard” kernels, which provide higher spatial resolution and are frequently used for the lung [14,27]. Importantly, they are recommended for nodule management in current guidelines [27,28,29].

    2.3. CAD software

    In this study, we used a commercially available AI system for nodule detection and characterization (inferRead CT Lung Research, Beijing, China). The principle of this AI system has been described previously, and its performance has been rigorously validated in several studies [30,31,32,33]. In addition, this AI system has been officially approved for clinical use by the Food and Drug Administration (FDA) in the USA (510(k) K192880), Conformité Européenne (CE) in Europe (ISO 13485:2016), and the National Medical Products Administration (NMPA) in China (Registration No. 20203210839). The system was trained on a large data set of over 450,000 chest CT images. Each lesion identified by the software was marked on the image showing the lesion’s largest diameter.

    2.4. Detection study

    All readers were trained on the CAD system. Radiologists were blinded to patient identity and evaluated images independently. Radiologists were allowed to adjust the window level, zoom in and out, invert gray-scale, and use maximum intensity projection (MIP) thick slab images. Images were randomized.

    Acceptable image quality and the reference standard for the pulmonary nodules [34] on both SDCT and LDCT images were determined by the consensus of three chest radiologists, each with >20 years of experience. They read the images twice in the CR mode, with three months between sessions. CAD-detected lesions were included in the reference standard if all radiologists agreed that they represented true pathology. None of these radiologists participated in the studies described below.

    Nine observers completed the first study: three interns (< 1 year of experience), observers A–C; three radiology residents (2–3 years of experience), observers D–F; and three experienced chest radiologists (5–10 years of experience), observers G–I. Each observer recorded the number of nodules per patient, the nodule size (≤ 4 mm, > 4 mm to < 6 mm, 6 mm to 8 mm, and > 8 mm), and the nodule characteristics (solid, subsolid, or calcified) on 117 LDCT image sets, first without CAD (unaided mode) and then with CAD in the SR mode. Eight weeks later, the same readers repeated the task with the SDCT images.
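
    The four size categories used in the reading task can be expressed as a small helper. This is only a sketch: the function name `size_category` and the label strings are hypothetical, and only the bin boundaries come from the text above.

```python
def size_category(diameter_mm: float) -> str:
    """Map a nodule diameter (mm) to one of the four size bins named in the text."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    if diameter_mm <= 4:
        return "<= 4 mm"
    if diameter_mm < 6:
        return "> 4 mm to < 6 mm"
    if diameter_mm <= 8:
        return "6 mm to 8 mm"
    return "> 8 mm"


# Boundary values fall into the bin that includes them (4 mm and 8 mm are inclusive).
for d in (3.0, 4.0, 5.9, 6.0, 8.0, 8.1):
    print(d, "->", size_category(d))
```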

    To prevent reading fatigue, 40 of the 117 LDCT sets were randomly chosen for the second study. Eight different observers (3 interns, 3 radiology residents, and 2 experienced radiologists) identified pulmonary nodules on 20 datasets without CAD and used the CR mode for the remaining 20 datasets. Eight weeks later, the same observers read in the CR mode the images that they had initially read unaided, and vice versa. Fourteen weeks after the first session, all 40 datasets were read in the SR mode. Observers assigned a confidence level to each nodule from 1 (probably not a nodule) to 4 (definitely a nodule) [35]. Reading times were recorded.
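
    The crossover schedule of the second study can be sketched as follows. The function name `crossover_schedule`, the seed, and the random 20/20 split are assumptions made for illustration; the text states only that 20 datasets were read unaided and 20 in the CR mode, that the assignment was swapped in the second session, and that all 40 were read in the SR mode in the third.

```python
import random


def crossover_schedule(case_ids, seed=0):
    """Split cases into two halves and swap the reading mode between sessions 1 and 2."""
    rng = random.Random(seed)  # fixed seed so the illustrative split is reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = sorted(shuffled[:half]), sorted(shuffled[half:])
    return {
        "session_1": {"unaided": first, "CR": second},
        # Eight weeks later: each half is read in the other mode.
        "session_2": {"unaided": second, "CR": first},
        # Fourteen weeks after session 1: all cases in the SR mode.
        "session_3": {"SR": sorted(shuffled)},
    }


schedule = crossover_schedule(range(1, 41))
print(len(schedule["session_1"]["unaided"]), len(schedule["session_1"]["CR"]))
```

    Swapping the halves means every dataset is read once unaided and once in the CR mode by each observer, so the two modes are compared on the same cases.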

    2.5. Data analysis

    All statistical analyses were performed using SPSS 21.0. True positive (TP) rates (sensitivity) and false positive (FP) rates were calculated for all conditions, including mode, dose, experience level, and nodule descriptors (size, characteristics). The FP rate was computed as the number of FP detections per CT. Differences in sensitivity and FP rates were compared between SDCT and LDCT images, between the unaided and SR modes on both SDCT and LDCT images, and between the unaided, SR, and CR reading modes on LDCT images. Reading times were also compared among the three reading modes on LDCT images. The sensitivity was further compared according to nodule diameter and characteristics. Normality was tested using the Kolmogorov–Smirnov test. Normally distributed variables were compared using analysis of variance (ANOVA); the remaining variables were compared using the Wilcoxon signed-rank test. The Mann–Whitney test was used to compare performance (i.e., sensitivity, FPs) between experience levels on both SDCT and LDCT images and in all reading modes.
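
    The two basic performance measures can be illustrated with a short sketch. The function name `detection_metrics`, the input layout, and the example counts are hypothetical; the definitions (sensitivity as the fraction of reference nodules detected, FP rate as FP detections per CT) follow the paragraph above. The study itself used SPSS, not this code.

```python
def detection_metrics(per_scan_counts, n_reference_nodules):
    """Compute sensitivity and false positives per CT for one reader.

    per_scan_counts: list of (true_positives, false_positives), one tuple per CT scan.
    n_reference_nodules: total number of nodules in the reference standard.
    """
    if not per_scan_counts or n_reference_nodules <= 0:
        raise ValueError("need at least one scan and one reference nodule")
    total_tp = sum(tp for tp, _ in per_scan_counts)
    total_fp = sum(fp for _, fp in per_scan_counts)
    sensitivity = total_tp / n_reference_nodules
    fp_per_ct = total_fp / len(per_scan_counts)
    return sensitivity, fp_per_ct


# Made-up counts for three scans containing six reference nodules in total.
counts = [(2, 1), (0, 3), (1, 0)]
sens, fp_rate = detection_metrics(counts, n_reference_nodules=6)
print(f"sensitivity = {sens:.2%}, FP rate = {fp_rate:.2f}/CT")
```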

    Free-response receiver operating characteristic (FROC) curve analyses (JAFROC 4.2.1; www.devchakraborty.com) were used to analyze true-positive versus false-positive rates for all reading modes [35,36,37,38] on LDCT images. Figures of merit (FOM) representing the nonparametric estimate (ANOVA) of the area under the curve and its 95% confidence interval were calculated using the jackknife method [38]. FOMs were also calculated by experience level.
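
    The jackknife idea behind these estimates can be sketched generically. This is not the JAFROC implementation, which applies an ANOVA to figure-of-merit pseudovalues across readers and cases; it only shows how leave-one-case-out pseudovalues and a standard error are formed. The function name `jackknife` and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def jackknife(statistic, cases):
    """Return (estimate, standard_error, pseudovalues) from leave-one-out resampling."""
    n = len(cases)
    if n < 2:
        raise ValueError("need at least two cases")
    full = statistic(cases)
    pseudovalues = []
    for i in range(n):
        left_out = cases[:i] + cases[i + 1:]
        # Pseudovalue for case i: n * theta_all - (n - 1) * theta_without_i
        pseudovalues.append(n * full - (n - 1) * statistic(left_out))
    estimate = mean(pseudovalues)
    standard_error = stdev(pseudovalues) / sqrt(n)
    return estimate, standard_error, pseudovalues


# Made-up per-case scores; the statistic here is simply their mean.
scores = [0.2, 0.4, 0.6, 0.8]
estimate, se, pseudo = jackknife(mean, scores)
print(estimate, se)
```

    For the sample mean the pseudovalues reduce to the original observations, which makes the example easy to check by hand; for a figure of merit they generally differ from the raw data.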

    Significance was set at P < 0.05. However, when multiple comparisons were conducted among the experience groups and the three reading modes using the Wilcoxon signed-rank test, the Bonferroni-corrected significance threshold was set to P < 0.05∕3 (approximately 0.017).
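
    As a small worked example of the corrected threshold, the sketch below divides the nominal level by the number of comparisons. The function names are hypothetical; the value 0.033 is the CR versus SR sensitivity P-value reported in the Results, which falls below 0.05 but not below 0.05∕3.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float = 0.05, n_comparisons: int = 3) -> bool:
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


print(round(bonferroni_threshold(0.05, 3), 4))  # 0.0167
print(is_significant(0.033))                    # False: 0.033 > 0.05/3
print(is_significant(0.0005))                   # True
```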

    3. Results

    3.1. Patient characteristics

    LDCT was completed at 120 kVp for the first 81 consecutive patients (44 women, 37 men; mean age 50.63 ± 9.54 years, range 29–78 years; BMI 23.82 ± 1.5 kg/m², range 22–28 kg/m²). After preliminary data analysis, the tube voltage for LDCT was lowered to 100 kVp to reduce the radiation dose for the final 36 consecutive patients (18 women, 18 men; mean age 48.53 ± 13.25 years, range 22–76 years; BMI 20.58 ± 0.87 kg/m², range 18.5–22 kg/m²). Image quality was determined to be acceptable in 100% of patients. All enrolled patients were followed up for one year, and 27 pulmonary nodules were found to have enlarged, of which 6 were smaller than 4 mm, 6 were between 4 and 6 mm, 8 were between 6 and 8 mm, and 7 were larger than 8 mm in diameter.

    3.2. Observer performance

    3.2.1. Study 1

    Unaided Reading technique (Without CAD) versus SR Mode

    Table E1 (online) presents the nodules as a reference standard both in the standard-dose and the low-dose settings for study 1 (117 patients). Readers, regardless of experience level, were more sensitive at detecting nodules in the SR mode than without CAD on both standard-dose and low-dose images (all P < 0.001). Observer A (intern) had a much lower detection rate than the other participants (Table E2). To explore the effect of observer A on the results, a sensitivity analysis in which observer A was excluded was performed (Table 1). The average sensitivities of the 8 observers continued to be significantly higher when using the SR mode compared to the unaided reading technique on images at both dose settings (all P < 0.001). The experienced radiologists and residents showed significantly better detection of pulmonary nodules in the unaided reading technique than interns regardless of the dose (all P < 0.001), while experienced radiologists, residents, and interns had comparable sensitivities in the SR mode regardless of the dose (all P > 0.05). In a subgroup analysis, the sensitivity in the SR mode was also statistically higher than in the unaided reading technique for nodules with a diameter ≤ 8mm and in all characteristics on both SDCT and LDCT images (all P < 0.05) (Table 2).

    Table 1. Results of nodule detection in the unaided reading and second-reader modes: sensitivity and false positive rate per CT (excluding observer A).

                              Sensitivity                                           False positive rate
                              UR                         SR*                        UR                         SR**
    Groups                    SD       LD       P        SD       LD       P        SD       LD       P        SD       LD       P
    Interns
      B                       22.77%   20.36%            45.09%   45.70%            3.63/CT  2.05/CT           7.45/CT  5.34/CT
      C                       30.28%   26.17%            56.42%   61.21%            1.51/CT  1.82/CT           2.74/CT  3.36/CT
      Average                 26.47%   23.21%   0.461    50.68%   53.33%   0.884    2.56/CT  1.89/CT  0.002    5.06/CT  4.23/CT  0.034
    Residents
      D                       36.36%   36.62%            53.48%   54.46%            1.21/CT  1.13/CT           1.88/CT  1.90/CT
      E                       46.82%   37.23%            59.39%   63.08%            1.49/CT  1.39/CT           1.89/CT  2.30/CT
      F                       45.45%   44.24%            61.97%   62.37%            3.82/CT  3.58/CT           5.91/CT  4.42/CT
      Average                 42.88%   39.36%   0.004    58.28%   59.97%   0.858    2.17/CT  2.03/CT  0.007    3.23/CT  2.87/CT  0.066
    Experienced radiologists
      G                       39.55%   30.15%            68.18%   64.31%            0.76/CT  0.88/CT           1.15/CT  1.57/CT
      H                       57.12%   54.31%            71.21%   70.31%            1.66/CT  2.02/CT           2.24/CT  2.62/CT
      I                       50.15%   47.08%            68.79%   61.69%            1.42/CT  1.17/CT           1.95/CT  1.54/CT
      Average                 48.94%   43.85%   0.008    69.39%   65.44%   0.007    1.28/CT  1.36/CT  0.411    1.78/CT  1.91/CT  0.318
    Average of all            43.96%   39.76%   < 0.000  62.52%   61.06%   0.111    1.81/CT  1.79/CT  < 0.000  2.76/CT  2.57/CT  < 0.000

    Notes: UR = unaided reading technique, SR = the second-reader mode, SD = standard-dose CT, LD = low-dose CT, and X/CT = the average number of false-positive pulmonary nodules detected per CT scan.

    Table 2. Characteristics of nodules in the unaided reading and second-reader modes of CAD on standard-dose and low-dose CT images.

    Attenuation
                                    Solid             Subsolid          Calcified
    Groups                    Mode  SD (%)   LD (%)   SD (%)   LD (%)   SD (%)   LD (%)
    Interns                   UR    22.59    19.63    21.33    39.68    6.67     12.5
                              SR    41.85    44.4     45.33    53.97    37.78    41.67
                              P     < 0.001  < 0.001  0.001    0.007    < 0.001  0.001
    Residents                 UR    41.98    40.63    47.56    40.52    45.93    21.71
                              SR    57.47    62.77    62.67    55.6     60.74    33.33
                              P     < 0.001  < 0.001  < 0.001  < 0.001  < 0.001  < 0.001
    Experienced radiologists  UR    48.95    43.98    49.78    42.47    47.41    44.44
                              SR    70.43    66.92    69.78    59.36    56.3     57.78
                              P     < 0.001  < 0.001  < 0.001  < 0.001  0.001    < 0.001
    All                       UR    42.2     39       44.76    41.25    40.95    30.13
                              SR    60.79    61.86    63.24    57       55.56    45.19
                              P     < 0.001  < 0.001  < 0.001  < 0.001  < 0.001  < 0.001

    Size range (mm)
                                    < 4               (4, 6)            [6, 8]            > 8
    Groups                    Mode  SD (%)   LD (%)   SD (%)   LD (%)   SD (%)   LD (%)   SD (%)   LD (%)
    Interns                   UR    19.3     17.86    25.81    34.62    35.29    47.62    93.94    90.91
                              SR    39.43    42.74    58.06    61.54    52.94    57.14    100      100
                              P     < 0.001  < 0.001  0.002    < 0.001  0.003    0.014    0.157    0.083
    Residents                 UR    40.6     36.68    47.3     52.08    70.59    57.45    96.67    96.67
                              SR    56.91    58.58    56.99    60.42    70.59    65.96    100      100
                              P     < 0.001  < 0.001  0.003    0.005    1        0.046    0.317    0.317
    Experienced radiologists  UR    46.31    40.83    64.52    62.37    68.63    66.67    100      100
                              SR    68.28    64.02    72.04    67.74    72.55    76.47    100      100
                              P     < 0.001  < 0.001  0.008    0.025    0.157    0.025    1        1
    All                       UR    40       35.74    51.61    54.42    64.71    59.66    96.67    95.96
                              SR    59.28    58.61    63.59    63.72    68.91    68.91    100      100
                              P     < 0.001  < 0.001  < 0.001  < 0.001  0.025    < 0.001  0.083    0.046

    Location
                                    Subpleural        Nonsubpleural
    Groups                    Mode  SD (%)   LD (%)   SD (%)   LD (%)
    Interns                   UR    16.4     18.7     24.39    22.44
                              SR    32.8     40.65    47.56    47.8
                              P     < 0.00   < 0.001  < 0.001  < 0.001
    Residents                 UR    39.87    39.92    44.72    39.05
                              SR    54.8     56.72    60.41    61.84
                              P     < 0.001  < 0.001  < 0.001  < 0.001
    Experienced radiologists  UR    50.53    48.97    47.97    40.79
                              SR    68.4     64.2     70       66.18
                              P     < 0.001  < 0.001  < 0.001  < 0.001
    All                       UR    41.09    40.73    43.21    37.41
                              SR    57.49    57.61    62.68    61.68
                              P     0.001    < 0.001  < 0.001  < 0.001

    Notes: UR = unaided reading technique, SR = the second-reader mode, SD = standard-dose, and LD=low-dose.

    Regardless of the experience level and dose, FP rates in the SR mode were significantly higher than in the unaided reading technique (all P < 0.0001). Additionally, significant differences in FP rates existed between experienced radiologists, residents, and interns in the SR and unaided reading technique on both SDCT and LDCT images (all P < 0.05).

    SDCT versus LDCT

    Readers, regardless of the experience level (excluding observer A), were more sensitive at detecting nodules on SDCT images than LDCT images in the unaided mode. However, the SR mode resulted in comparable sensitivities between SDCT and LDCT images, and interns even had a higher sensitivity on LDCT images (P > 0.05) (Table 1).

    FP rates were comparable regardless of the dose within each mode for experienced radiologists (P = 0.411 and 0.318), and residents and interns even had a lower FP rate on LDCT images compared to SDCT.

    3.2.2. Study 2

    Sensitivity and FP rates in three reading modes

    Table E3 (online) shows the nodules as a reference standard both in the standard-dose and the low-dose settings for study 2 (40 patients). The average sensitivity of reading with CAD in the CR mode (64.76%) was significantly higher than that of reading without CAD (44.71%, P < 0.001) and was comparable to that of reading in the SR mode (66.86%, P = 0.033 > 0.05∕3) on LDCT images. In a subgroup analysis, the sensitivity in the CR mode was also statistically higher than in the unaided reading technique and was comparable to reading in the SR mode for nodules with a diameter less than 6 mm and for solid and subsolid nodules (all P < 0.001). There were no statistical differences in sensitivity among the experience levels regardless of reading mode (all P > 0.05). FP rates in the CR mode (average, 4.09/CT) were significantly higher than in the unaided reading technique (average, 2.67/CT, P < 0.0001) and were comparable to FP rates in the SR mode (average, 4.11/CT, P = 0.26) (Tables 3 and 4, Fig. 1).

    Fig. 1. Examples of pulmonary nodules that were missed by observers without CAD. (a–c) A 51-year-old male patient with metastatic adenocarcinoma of the neck. The nodule (1–2 mm in size, red circle) was detected by only one observer without CAD; in the concurrent-reader mode, it was correctly identified by six observers. (d–e) A 48-year-old male patient with rectal cancer. (d) The nodule (2 mm in size) was detected by CAD. However, it was detected by only one observer without CAD in the low-dose setting; in the concurrent-reader and second-reader modes, it was correctly identified by six observers. (e) Seven months later, the pulmonary nodule (yellow circle) had clearly increased in size.

    Table 3. Results of nodule detection in the unaided reading, the concurrent-reader and second-reader modes of CAD: sensitivity, false positive rate per CT.

                              Sensitivity                                                 False positive rate
    Observers                 UR        CR        SR        P1        P2        P3        UR        CR        SR        P1        P2        P3
    Experienced chest radiologists
      1                       39.35%    62.13%    65.98%    < 0.0001  < 0.0001  0.081     1.38/ct   3.38/ct   3.425/ct  0.000020  0.000010  0.899782
      2                       58.88%    70.41%    76.92%    0.009     < 0.0001  0.057     4.23/ct   3.23/ct   4.275/ct  0.017276  0.835722  0.002641
      Average of 1–2          49.56%    66.27%    71.45%    < 0.0001  < 0.0001  0.030     2.8/ct    3.3/ct    3.85/ct   0.106124  0.00064   0.020213
    Residents
      3                       39.94%    66.57%    62.43%    < 0.0001  < 0.0001  0.142     2.28/ct   5.05/ct   4.3/ct    < 0.0001  < 0.0001  0.053
      4                       34.91%    57.69%    67.16%    < 0.0001  < 0.0001  0.05      1.05/ct   2.5/ct    2.8/ct    < 0.0001  < 0.0001  0.751
      5                       57.69%    59.17%    57.40%    0.541     0.855     0.629     2.6/ct    2.63/ct   2.63/ct   0.977     0.682     0.424
      Average of 3–5          44.18%    61.14%    62.33%    < 0.0001  < 0.0001  0.559     1.98/ct   3.39/ct   3.2/ct    < 0.0001  < 0.0001  0.161
    Interns
      6                       29.59%    65.09%    72.49%    < 0.0001  < 0.0001  0.948     1.4/ct    4.7/ct    4.43/ct   < 0.0001  < 0.0001  1.000
      7                       38.17%    64.79%    63.02%    0.013     0.002     0.307     2.83/ct   4.58/ct   4.63/ct   0.005     0.0005    0.860
      8                       54.73%    72.19%    69.53%    < 0.01    < 0.0001  0.531     5.6/ct    6.68/ct   6.63/ct   0.030     0.035     0.310
      Average of 6–8          40.83%    67.36%    68.34%    < 0.0001  < 0.0001  0.480     3.28/ct   5.31/ct   5.23/ct   < 0.0001  < 0.0001  0.263
    Average of 1–8            44.27%    64.76%    66.86%    < 0.0001  < 0.0001  0.033     2.67/ct   4.09/ct   4.11/ct   < 0.0001  < 0.0001  0.259521

    Notes: UR = unaided reading technique, CR = the concurrent-reader mode, SR = the second-reader mode, P1: UR versus CR, P2: UR versus SR, P3:CR versus SR.

    Table 4. Characteristics of nodules in the unaided reading, concurrent-reader, and second-reader modes of CAD on low-dose CT images.

                                    Attenuation                      Size range (mm)                             Location
    Groups                    Mode  Solid      Subsolid   Calcified  ≤ 4        (4, 6)     [6, 8]     > 8        Subpleural Nonsubpleural
    Interns                   UR    41.87%     27.93%     50.00%     37.40%     55.56%     78.79%     100%       41.85%     40.16%
                              CR    67.50%     63.96%     71.67%     65.24%     74.07%     96.97%     100%       61.65%     71.06%
                              SR    67.26%     63.96%     91.67%     66.01%     77.78%     100%       100%       65.66%     70.08%
                              P1    < 0.0001   < 0.0001   0.017      0          0.012      0.034      1          < 0.0001   < 0.0001
                              P2    < 0.0001   < 0.0001   < 0.0001   0          0.001      0.008      1          < 0.0001   < 0.0001
                              P3    0.952      1          0.001      0.605      0.527      0.317      1          0.048      0.628
    Residents                 UR    43.89%     37.84%     60.00%     41.25%     51.85%     81.81%     100%       40.55%     46.57%
                              CR    60.74%     54.05%     80.00%     58.64%     72.22%     90.90%     100%       51.74%     67.32%
                              SR    61.45%     53.15%     91.67%     59.08%     81.48%     100%       100%       56.14%     66.34%
                              P1    < 0.0001   0.001      0.003      < 0.0001   0.028      0.18       1          0.001      < 0.0001
                              P2    < 0.0001   0.008      < 0.0001   < 0.0001   < 0.0001   0.014      1          < 0.0001   < 0.0001
                              P3    0.712      0.813      0.008      0.936      0.096      0.083      1          0.095      0.984
    Experienced radiologists  UR    48.58%     48.65%     57.50%     45.87%     63.89%     86.36%     100%       52.61%     46.81%
                              CR    65.66%     66.22%     75.00%     65.02%     66.67%     81.81%     100%       57.09%     72.30%
                              SR    69.75%     77.03%     85.00%     69.14%     86.11%     95.45%     100%       69.55%     72.68%
                              P1    < 0.0001   0.049      0.09       < 0.0001   0.705      0.655      1          0.416      < 0.0001
                              P2    < 0.0001   0.001      0.005      < 0.0001   0.005      0.157      1          0.001      < 0.0001
                              P3    0.041      0.021      0.285      0.041      0.008      0.083      1          < 0.0001   0.383
    All                       UR    44.31%     36.82%     55.63%     40.97%     56.25%     81.81%     100%       44.35%     44.21%
                              CR    64.50%     60.81%     75.63%     62.71%     71.53%     90.91%     100%       57.05%     69.76%
                              SR    65.70%     63.18%     90%        64.19%     81.25%     98.86%     100%       63.06%     69.33%
                              P1    < 0.0001   < 0.0001   < 0.0001   < 0.0001   0.001      0.059      1          < 0.0001   < 0.0001
                              P2    < 0.0001   < 0.0001   < 0.0001   < 0.0001   < 0.0001   < 0.0001   1          < 0.0001   < 0.0001
                              P3    0.183      0.311      < 0.0001   0.15       0.006      0.008      1          < 0.0001   0.881

    Notes: UR = unaided reading technique, CR = the concurrent-reader mode, SR = the second-reader mode, P1: UR versus CR, P2: UR versus SR, P3: CR versus SR.

    The observer-averaged JAFROC FOM was 0.58 (95% confidence interval: 0.53, 0.64) for the CR mode and 0.48 (95% confidence interval: 0.41, 0.54) for the unaided reading technique, yielding a significant difference (P < 0.0001) (Table 4). The average FOM value in the SR mode was 0.61 (95% confidence interval: 0.55, 0.66), indicating comparability between CR and SR modes (P = 0.259). The alternative FROC curves for all modes are shown in Fig. 2. For experienced radiologists, the FOM was significantly better when using CAD in the CR mode than in the unaided mode (0.61 versus 0.49, respectively; P = 0.0003) and not significantly different from using the SR mode (0.63, P = 0.53); the same trend applies to both residents (0.58 versus 0.48 versus 0.58, respectively) and interns (0.59 versus 0.44 versus 0.60, respectively). Detailed results are provided in Table 5.

    Fig. 2. Receiver operating characteristic curves representing the detection performance of the 8 radiologists in unaided reading (red line), concurrent-reader CAD (green line), and second-reader CAD (purple line). The area under the receiver operating characteristic curve (AUC) was 0.48 in unaided reading, 0.58 in concurrent-reader CAD, and 0.61 in second-reader CAD.

    Table 5. Results of nodule detection: figure-of-merit value.

    Groups                    UR        CR        SR        P1        P2        P3
    Interns                   0.41      0.61      0.63      < 0.0001  < 0.0001  0.4987
                              0.42      0.58      0.57      < 0.0001  0.0001    0.752
                              0.49      0.58      0.60      0.0085    0.003     0.7191
      Average                 0.44      0.59      0.60      < 0.0001  < 0.0001  0.675
    Residents                 0.45      0.59      0.56      0.0002    0.0033    0.355
                              0.46      0.55      0.63      0.0517    0.0002    0.0564
                              0.56      0.58      0.55      0.5509    0.5865    0.2556
      Average                 0.49      0.58      0.58      0.0016    0.0009    0.8675
    Experienced radiologists  0.50      0.58      0.61      0.0176    0.0015    0.3868
                              0.48      0.61      0.67      0.001     < 0.0001  0.1507
      Average                 0.49      0.61      0.63      0.0003    < 0.0001  0.5291
    Average of all            0.48      0.58      0.61      < 0.0001  < 0.0001  0.259

    Notes: UR = unaided reading technique, CR = the concurrent-reader mode, SR = the second-reader mode, P1: UR versus CR, P2: UR versus SR, P3:CR versus SR.

    Reading times

    The reading times of the 8 observers in the unaided, CR, and SR reading modes are shown in Table E4. Regardless of the experience level, reading times were significantly shorter in the CR mode (average, 165 ± 133 s) than in the unaided reading (average, 235 ± 162 s, P < 0.0001) and SR modes (average, 294 ± 153 s; P < 0.0001).
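
    To put these averages in relative terms, the short sketch below computes the fractional reduction in mean reading time for the CR mode. The function name `relative_reduction` is hypothetical; the inputs are the mean reading times quoted above, and the calculation ignores the large standard deviations, so it is only a rough summary.

```python
def relative_reduction(baseline_s: float, new_s: float) -> float:
    """Fractional reduction of new_s relative to baseline_s."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_s - new_s) / baseline_s


cr, unaided, sr = 165, 235, 294  # mean reading times in seconds
print(f"CR vs unaided: {relative_reduction(unaided, cr):.1%}")  # 29.8%
print(f"CR vs SR:      {relative_reduction(sr, cr):.1%}")       # 43.9%
```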

    4. Discussion

    LDCT decreases cumulative radiation exposure while maintaining good diagnostic quality for pulmonary nodules [39]. However, the image noise in LDCT makes the detection of lesions more challenging [7], especially for smaller pulmonary nodules. The present study aimed to assess the value of a DL-based CAD system in the detection of pulmonary nodules in patients with extra-thoracic malignancies. The study was conducted in SR and CR modes and evaluated SDCT and LDCT images, providing a comprehensive evaluation in a realistic clinical environment. The results showed that the CAD system improved the detection sensitivity on SDCT and LDCT images across all readers and was especially beneficial for interns reviewing LDCT images. We also examined how diagnostic sensitivity relates to dose and experience level: the SR mode yielded comparable sensitivities regardless of the dose and experience level, and the CR mode significantly decreased reading times without reducing the sensitivity.

    Previous studies have shown that the SR mode increases the diagnostic sensitivity compared to unaided reading [14,21,22,40,41,42]. Our results confirmed and extended this finding by showing that, in comparison with the unaided reading mode, DL-based CAD improved the sensitivity particularly for nodules with a diameter less than 8 mm, a clinically important size range related to malignancy [43]. The larger the pulmonary nodule, especially for nodules ≥ 5 mm, the higher the probability of pulmonary metastasis; moreover, larger nodules have more distinct morphological features [44]. Smaller pulmonary nodules nevertheless deserve attention so that small pulmonary metastases can be detected earlier. While the use of the SR mode on SDCT images is not a new concept, the performance comparison between SDCT and LDCT images in the SR mode has not been fully evaluated. It is acknowledged that LDCT images contain increased noise and have compromised image quality, which potentially increases the risk of missing pulmonary nodules. Our results confirmed this and showed lower detection sensitivity on LDCT images than on SDCT images in the unaided mode. In contrast, the SR mode yielded comparable sensitivities between SDCT and LDCT images. These results suggest that the improved diagnostic performance of the SR mode on LDCT images may address the concern regarding the noise level in LDCT images.

    Additionally, the present study demonstrated that the dependence of detection sensitivity on experience level varied among reading modes. Although experienced radiologists and residents had higher detection sensitivity than interns in the unaided reading technique, the SR mode yielded comparable sensitivities across all experience levels at either dose setting. This finding aligns well with a previous study showing that CAD-assisted diagnosis improves the ability of less-experienced readers, particularly interns, to detect pulmonary nodules [38]. Consistent detection performance across all experience and dose levels is of great significance in real clinical practice.

    The prolonged reading time in the SR mode prompted us to evaluate the CR mode in the second study. We found that the CR mode indeed had shorter reading times than the SR and unaided modes while preserving a sensitivity and FOM value (based on JAFROC analysis) similar to those of the SR mode. Moreover, the sensitivity was found to be independent of experience level on LDCT images, which was also true of FP rates. According to Beyer et al. [23], the CR mode decreased reading times but led to no improvement in sensitivity compared to the unaided reading technique. In contrast, Matsumoto et al. [24] and Foti et al. [45] reported that the CR mode did not necessarily reduce reading time, and Foti et al. found that the sensitivity in the CR mode showed no significant improvement compared to the unaided reading technique. It should be noted that the authors of the studies above [23,24,45] extracted the reading time from the SR mode and used it as the reading time for the unaided mode, whereas our study performed each experiment separately, so the recorded times represent a more realistic reading condition. Shortened interpretation time (while maintaining an equally high sensitivity) may be important in real clinical practice, where radiologists need to read a large number of cases daily.

    This study has several limitations. First, the reference standard for pulmonary nodules was based on the consensus of radiologists rather than histological evidence, a limitation common to CAD studies. However, biopsying every nodule may not be clinically appropriate, especially for smaller pulmonary nodules, because of the difficulty in obtaining pathological specimens. In addition, it could expose patients to unnecessary risks. Further clinical and radiographic follow-up is needed to determine whether these nodules are metastatic. Second, 90% of the nodules studied had a diameter less than 4 mm, which led to lower TP rates and higher FP rates. A more uniform distribution of nodule sizes would be appropriate in further studies. Third, the number of readers was small. Studies with larger cohorts of readers are needed to substantiate our results. Lastly, CAD systems are not uniform; therefore, our results may not be applicable to other CAD systems.

    In conclusion, this study assessed the value of a DL-based CAD system in the detection of pulmonary nodules. We have demonstrated that the deep learning-based CAD system holds great potential in both SR and CR modes to help readers, especially less experienced radiologists, detect pulmonary nodules on LDCT images. The clinical scenarios evaluated in this research are close to actual clinical application. The results indicate that the deep learning-based CAD system can be an effective method for improving the efficiency and accuracy of radiologists in detecting suspected metastatic pulmonary nodules. In the future, we will study the predictive value of radiomics combined with DL-based AI for lung cancer treatment response, in order to identify patients likely to benefit and reduce unnecessary waste of medical resources. For example, investigating the performance of radiomics signatures extracted from pretherapy CT images may be important for preoperatively predicting the response to PD-1 immunotherapy.

    Conflicts of Interest

    The authors declare that there are no conflicts of interest relevant to this paper.

    Acknowledgments

    The authors express their gratitude to Li Tan and Chenzhe Li for consultation on statistical analysis and to Yula Hu, Yueying Pan, Jiahuang, Xinping Zhang, Lisi Liu, Haidan Zhu, Zhaoxia Yang, Jinhan Qiao, Wangzhen Zhang, and Chenyong Sun for reading the images. Qiongjie Hu and Shaofang Wang contributed equally to this work. This study was supported by a grant from the Key Project of Science and Technology Committee of Wuhan, China (Grant No. 2018060401011326).