Abstract
Objective
To compare the segmentation capability of the 2 currently available commercial volumetry software programs with specific segmentation algorithms for pulmonary ground-glass nodules (GGNs) and to assess their measurement accuracy.
Materials and Methods
In this study, 55 patients with 66 GGNs underwent unenhanced low-dose CT. GGN segmentation was performed by using 2 volumetry software programs (LungCARE, Siemens Healthcare; LungVCAR, GE Healthcare). Successful nodule segmentation was assessed visually and morphologic features of GGNs were evaluated to determine factors affecting segmentation by both types of software. In addition, the measurement accuracy of the software programs was investigated by using an anthropomorphic chest phantom containing simulated GGNs.
Results
The successful nodule segmentation rate was significantly higher in LungCARE (90.9%) than in LungVCAR (72.7%) (p = 0.012). Vascular attachment was a negatively influencing morphologic feature of nodule segmentation for both software programs. As for measurement accuracy, mean relative volume measurement errors in nodules ≥ 10 mm were 14.89% with LungCARE and 19.96% with LungVCAR. The mean relative attenuation measurement errors in nodules ≥ 10 mm were 3.03% with LungCARE and 5.12% with LungVCAR.
Pulmonary ground-glass nodules (GGNs), including both pure GGNs and part-solid (PS) GGNs, are well known to have a high probability of malignancy: Henschke et al. (1) reported malignancy rates of 63% for PS GGNs and 18% for pure GGNs, much higher than that for solid nodules. However, a substantial proportion of GGNs are benign (2), and thus differentiation between benign and malignant GGNs is essential. Unfortunately, preoperative differentiation between benign and malignant GGNs is not easy. Although Lee et al. (3) reported that a lesion size > 8 mm and a lobulated border for pure GGNs and a lobulated border for PS GGNs could be useful predictors of malignant GGNs, Kim et al. (4) reported that there were no morphologic differences between benign and malignant GGNs. Considering these conflicting results, it may be too early to determine the benignity of GGNs with morphologic CT features alone. Furthermore, a study on FDG-PET has reported limitations in differentiating malignant from benign GGNs, as malignant GGNs are often false negative on FDG-PET (5).
Thus, at present, the differentiation of benign from malignant GGNs is usually based on a lesion's change over time (6), for which computer-aided volumetry is thought to provide more accurate and reproducible assessment than visual assessment (7). However, there have been concerns about computer-aided volumetry of GGNs regarding the capability of volumetry software packages to segment target nodules appropriately and to provide accurate volume measurements. Although knowledge of the nodule segmentation performance of each volumetry tool and the choice of the most appropriate software for GGN measurement may be of great interest, only a limited number of studies have been performed regarding the volumetric measurements of GGNs, of which none were comparison studies (6-9). In addition, knowledge of the decisive morphologic features affecting successful nodule segmentation can be of great clinical importance in excluding GGNs that are not suitable for volumetric analysis. These are important issues because GGNs usually require serial follow-up studies; however, to our knowledge, no studies have dealt with these issues.
Thus, the purpose of our study was to compare the nodule segmentation capability of 2 commercially available volumetry software programs (LungCARE, Siemens Healthcare, Erlangen, Germany; LungVCAR, GE Healthcare, Waukesha, USA), which provide specific segmentation algorithms for GGNs, and to analyze the morphologic features of GGNs influencing successful segmentation. In addition, we also assessed volume and attenuation measurement accuracy of each software package.
This study was approved by the institutional review board of Seoul National University Hospital, which waived the requirement for patients' informed consent for the retrospective study.
From October 2010 to December 2011, one author with 13 years of experience in chest CT (C.M.P.) retrospectively searched the electronic medical records and the radiology information systems of our hospital for patients with pulmonary GGNs identified on low-dose, thin-section chest CT. Most of these patients underwent low-dose chest CT for the purpose of clinical follow-up of their pulmonary GGNs. The study population was determined based on the following criteria: 1) persistent pure or PS GGNs on 2 CT examinations with > 3 months interval to exclude transient inflammatory lesions; 2) available 1 mm slice thickness low-dose helical CT without intravenous contrast media administration; 3) nodules with a maximum diameter > 5 mm but < 2 cm; and 4) nodules without calcification.
On the basis of the selection criteria, 55 patients (15 men and 40 women; age range, 30-76 years; mean age, 55.76 years) with 66 GGNs were enrolled. Among these 66 GGNs, 35 were pure GGNs and 31 were PS GGNs. The mean size of the 66 GGNs was 10.39 ± 3.55 mm (range, 5.64-19.75 mm) and their mean attenuation was -516.95 ± 146.27 Hounsfield units (HU) (range, -758.99 to -171.05 HU). Twenty-eight nodules were surgically resected because of an increase in nodule size or the appearance of a solid portion during the follow-up studies. The final pathologic diagnoses were as follows: atypical adenomatous hyperplasia in 1; adenocarcinoma-in-situ (AIS) in 15; invasive adenocarcinoma in 12. Twenty-six of the 66 GGNs were located in the right upper lobe, 3 in the right middle lobe, 10 in the right lower lobe, 16 in the left upper lobe, and 11 in the left lower lobe.
All chest CTs were performed using 3 multi-detector CT scanners: Somatom Definition, Sensation-16 (Siemens Medical Solutions, Forchheim, Germany), and Brilliance-64 (Philips Medical Systems, Best, the Netherlands). Detailed scanning parameters were as follows: detector collimation, 0.6-0.75 mm; beam pitch, 0.516-1.2; reconstruction increment, 1.0 mm; slice thickness, 1.0 mm; rotation time, 0.5 second; tube voltage, 120 kVp; tube current, 30-60 effective mAs; and matrix, 512 × 512. Images were reconstructed using the medium sharp reconstruction algorithm. CT scans were obtained for all patients in the supine position at full inspiration.
To evaluate and compare the successful segmentation rate of the 2 commercially available volumetric software programs (LungCARE, Siemens Healthcare; LungVCAR, GE Healthcare), CT image data were transferred to workstations. First, GGNs were detected on transverse thin-slab maximum intensity projection images with a window width of 1500 HU and level of -700 HU (LungCARE) or on transverse thin-section (1 mm thickness) images (LungVCAR). Then, the nodules were manually marked with a mouse click in the nodule's center, and the software programs automatically segmented the nodule margin. A 'subsolid' segmentation algorithm was used in LungCARE and 'nonsolid' and 'PS' (when solid proportion > 50%) algorithms were used in LungVCAR. LungCARE provided volume-rendered images of nodules and surrounding structures in the volume of interest, while LungVCAR provided segmentation boundary-overlaid transverse thin-section images with volume-rendered images of the segmented nodules (Fig. 1). Two radiologists (H.K. and S.M.L. with 2 and 7 years' experience in CT, respectively) evaluated the nodule segmentation of each software program in consensus. Up to 3 consecutive segmentation attempts were allowed for each nodule. Successful nodule segmentation was assessed visually and classified into one of 4 categories: 1) 'excellent': the segmented part completely matched the nodule; 2) 'satisfactory': although not perfect, the segmented volume was still representative of the nodule, with the maximum mismatch between the overlay and nodule visually estimated not to exceed 30% in volume; 3) 'poor': part of the nodule was segmented, but the segmented volume was not representative of the nodule (estimated mismatch > 30%); 4) 'failure': no segmentation, or the result had no similarity with the lesion. These criteria were originally suggested by de Hoop et al. (10) and modified by the authors. We regarded the first 2 categories (excellent and satisfactory) as 'successful nodule segmentation'.
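The 4-category grading above can be written down as a small decision rule. The following Python sketch is illustrative only: in the study the grading was a visual consensus reading, not a computed value, and the function names and the `mismatch_fraction` input (a reader's visual estimate expressed as a fraction) are hypothetical.

```python
def segmentation_category(segmented: bool, resembles_nodule: bool,
                          mismatch_fraction: float) -> str:
    """Map a reader's visual estimate onto the 4 categories described above.

    All three inputs are hypothetical stand-ins for the radiologists'
    visual judgment; mismatch_fraction is the estimated volume mismatch
    between overlay and nodule (0.0 = complete match).
    """
    if not segmented or not resembles_nodule:
        return "failure"        # no segmentation, or no similarity to the lesion
    if mismatch_fraction == 0.0:
        return "excellent"      # segmented part completely matches the nodule
    if mismatch_fraction <= 0.30:
        return "satisfactory"   # mismatch estimated not to exceed 30%
    return "poor"               # estimated mismatch > 30%


def is_successful(category: str) -> bool:
    """'Excellent' and 'satisfactory' count as successful segmentation."""
    return category in {"excellent", "satisfactory"}


if __name__ == "__main__":
    for args in [(True, True, 0.0), (True, True, 0.2),
                 (True, True, 0.5), (False, False, 1.0)]:
        category = segmentation_category(*args)
        print(args, category, is_successful(category))
```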
To analyze the morphologic features of GGNs influencing successful nodule segmentation, 2 chest radiologists (J.M.G. and C.M.P. with 21 and 13 years of experience in chest CT, respectively) evaluated the CT findings of each nodule in consensus as follows: (a) lesion size (maximum diameter), (b) attenuation, (c) shape (spherical, nonspherical), (d) contour (smooth, lobulated, spiculated), (e) margin (well-defined, poorly-defined), (f) solid portion size (maximum diameter), (g) solid proportion of PS nodules, (h) GGN type (I, II, III), (i) pleural attachment of the nodule, and (j) vascular attachment of the nodule. The solid proportion of the nodule was calculated by dividing the maximum diameter of the solid portion by the maximum diameter of the primary nodule, and the maximum diameter of the solid portion was measured on the mediastinal window setting (window width 400 HU, level 20 HU) using the vanishing ratio method (11). GGNs were classified into 3 types based on the extent of internal solid parts: Type I, pure GGN; Type II, PS GGN with a solid portion size ≤ 5 mm; Type III, PS GGN with a solid portion size > 5 mm. Pleural or vascular attachment was considered present when the contact surface between the nodule and pleura or vessel was greater than 50% of the nodule diameter on software-offered volume-rendered images (12) (e.g., a 10-mm nodule was considered vessel-attached when it had more than 5 mm of contact with a vessel at its boundary or when a vessel passed through the nodule for more than 5 mm).
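The definitions of solid proportion, GGN type, and attachment given above reduce to simple arithmetic on diameters. A minimal Python sketch follows; the function and parameter names are hypothetical, and treating a solid portion size of 0 mm as "pure GGN" is an assumption made here for illustration.

```python
def solid_proportion(solid_diameter_mm: float, nodule_diameter_mm: float) -> float:
    """Maximum diameter of the solid portion divided by that of the whole nodule."""
    if nodule_diameter_mm <= 0:
        raise ValueError("nodule diameter must be positive")
    return solid_diameter_mm / nodule_diameter_mm


def ggn_type(solid_diameter_mm: float) -> str:
    """Type I: pure GGN; Type II: solid portion <= 5 mm; Type III: > 5 mm."""
    if solid_diameter_mm == 0:      # assumption: no solid portion means pure GGN
        return "I"
    if solid_diameter_mm <= 5.0:
        return "II"
    return "III"


def is_attached(contact_length_mm: float, nodule_diameter_mm: float) -> bool:
    """Attachment: contact with pleura or vessel exceeds 50% of the nodule diameter."""
    return contact_length_mm > 0.5 * nodule_diameter_mm


if __name__ == "__main__":
    # The text's own example: a 10-mm nodule needs more than 5 mm of contact.
    print(is_attached(6.0, 10.0))        # True
    print(is_attached(5.0, 10.0))        # False
    print(ggn_type(4.0), solid_proportion(4.0, 10.0))
```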
To evaluate and compare the measurement accuracy of the 2 commercial volumetry software programs, we performed a phantom study using an anthropomorphic chest phantom (multipurpose chest phantom N1 Lungman, Kyoto Kagaku, Kyoto, Japan) with simulated GGNs. The anthropomorphic chest phantom consisted of simulated pulmonary vessels, heart, chest wall, diaphragm, and liver. Simulated GGNs of various diameters and attenuations (diameter 5-, 8-, 10-, and 12-mm; attenuation -630 HU and -800 HU for each diameter) were manually affixed to the simulated pulmonary vessels.
Low-dose thin-section CT was performed for the phantom study with a Sensation-16 scanner (Siemens Medical Solutions, Forchheim, Germany). Scanning parameters were as follows: detector collimation, 0.75 mm; beam pitch, 1.0; reconstruction increment, 1.0 mm; slice thickness, 1.0 mm; rotation time, 0.5 second; tube voltage, 120 kVp; tube current, 60 effective mAs; and matrix, 512 × 512. Images were reconstructed using the medium sharp reconstruction algorithm. For each nodule, CT scans were performed 10 times and a total of 80 nodule datasets were obtained.
One radiologist (S.M.L. with 7 years of experience in CT) measured the volume and attenuation of each simulated GGN using the volumetry software programs (LungCARE, Siemens Healthcare; LungVCAR, GE Healthcare). The observer was allowed up to 3 consecutive segmentation attempts to obtain the most satisfactory nodule segmentation. Thereafter, the relative volume measurement error for each nodule was calculated to evaluate the accuracy of each volumetry software program, according to the following formula: ([measured nodule volume - assumed nodule volume] / assumed nodule volume) × 100. The assumed nodule volume was calculated according to the formula for the volume of a sphere, on the assumption that the simulated GGNs are spherical. The same formula was applied to calculate the attenuation measurement error. Measurement error was calculated using only the successfully segmented nodules.
To compare the segmentation capability of the 2 volumetry software programs, the chi-square test was performed. To determine the morphologic features of GGNs influencing successful nodule segmentation for each software program, we first used the Mann-Whitney U-test and Fisher's exact test for each variable, as appropriate. Subsequent multivariate logistic regression analysis was conducted with the enter mode, in which morphologic features with a p-value < 0.10 on univariate analysis were used as input variables.
In the phantom study, the paired t test was used to compare the relative volume and attenuation measurement error values between the 2 volumetry software programs. We then calculated the relative measurement error values according to nodule size.
All statistical analyses were performed using Statistical Package for the Social Sciences (SPSS) version 18.0 (SPSS Inc., Chicago, IL, USA). A p-value < 0.05 was considered to indicate statistical significance.
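The relative measurement error formula above, together with the spherical reference volume, can be sketched in a few lines of Python. The function names and the example measured values are hypothetical; the text does not state whether signed or absolute errors were averaged, so this sketch returns the signed value exactly as the formula is written.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Assumed nodule volume: volume of a sphere, V = pi * d^3 / 6."""
    return math.pi * diameter_mm ** 3 / 6.0


def relative_error_percent(measured: float, assumed: float) -> float:
    """([measured - assumed] / assumed) x 100, signed as written in the text."""
    if assumed == 0:
        raise ValueError("assumed value must be non-zero")
    return (measured - assumed) / assumed * 100.0


if __name__ == "__main__":
    reference_volume = sphere_volume_mm3(10.0)   # about 523.6 mm^3 for a 10-mm sphere
    measured_volume = 600.0                      # hypothetical software output, mm^3
    print(round(reference_volume, 1))
    print(relative_error_percent(measured_volume, reference_volume))

    # Same formula for attenuation; -650 HU is a hypothetical measurement
    # of a nominal -630 HU simulated nodule.
    print(relative_error_percent(-650.0, -630.0))
```

Because the reference attenuation is negative, a measurement that is more negative than the nominal value yields a positive signed error under this formula; taking the absolute value before averaging would be a separate design choice.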
Successful nodule segmentation was observed in 90.9% of nodules (60/66) for LungCARE and 72.7% (48/66) for LungVCAR. LungCARE showed a significantly higher successful segmentation rate than LungVCAR (p = 0.012). There were no cases of segmentation failure for LungCARE with only 6 poorly segmented nodules out of 66 GGNs. The detailed segmentation results of each software program are displayed in Figure 2.
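For readers who want to see how such a comparison is computed, the following Python sketch runs a Pearson chi-square test on the 2 × 2 table built from the counts above (60 successful and 6 unsuccessful for LungCARE; 48 and 18 for LungVCAR). It is a plain-Python illustration, not the SPSS procedure the authors ran; whether a continuity correction was applied is not stated, so the p-value obtained here need not equal the reported value. The function name and the `yates` flag are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]] (1 degree of freedom).

    Returns (statistic, p_value). With yates=True, Yates' continuity
    correction is applied to each cell.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    statistic = 0.0
    for obs, exp in zip(observed, expected):
        diff = abs(obs - exp)
        if yates:
            diff = max(diff - 0.5, 0.0)
        statistic += diff ** 2 / exp
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Rows: LungCARE, LungVCAR; columns: successful, not successful.
    print(chi_square_2x2(60, 6, 48, 18))
    print(chi_square_2x2(60, 6, 48, 18, yates=True))
```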
Table 1 summarizes the morphologic features of enrolled GGNs. Among these morphologic features, 2 features (contour and vascular attachment) were significantly different with a p-value < 0.1 between the successful segmentation group and the poor/failed segmentation group for LungCARE (Table 2). Subsequent multivariate analysis revealed that a lobulated contour and vascular attachment were significant negatively influencing factors for successful segmentation (lobulated contour, p = 0.037, odds ratio [OR] = 0.063; vascular attachment, p = 0.008, OR = 0.030) (Fig. 3). For LungVCAR, vascular attachment was the only significant negatively influencing factor (p < 0.001, OR = 0.071) (Fig. 3, Table 3).
In the GGN phantom study, the successful nodule segmentation rate was 93.75% (75/80) for LungCARE. For LungVCAR, only nodules of -630 HU were used for the accuracy measurement study due to the poor segmentation capability for -800 HU nodules. The successful segmentation rate of LungVCAR for -630 HU nodules was 82.5% (33/40). For these successfully segmented nodules, the relative volume and attenuation measurement errors were calculated (Table 4). The mean relative volume measurement errors for nodules ≤ 8 mm and ≥ 10 mm in diameter were 61.47% and 14.89% with LungCARE and 53.56% and 19.96% with LungVCAR, respectively, with no significant difference between the 2 software programs (p = 0.421). The mean relative attenuation measurement errors for nodules ≤ 8 mm and ≥ 10 mm in diameter were 2.09% and 3.03% with LungCARE and 13.19% and 5.12% with LungVCAR, respectively. The mean relative attenuation measurement error was significantly smaller with LungCARE (p < 0.001).
Our study is the first comparison study of commercially available volumetry software programs for GGNs. We found that the successful segmentation rate of GGNs, including PS GGNs, was 90.9% for LungCARE and 72.7% for LungVCAR and that LungCARE showed a significantly higher successful segmentation rate than LungVCAR. In solid nodules, the successful segmentation rate has been reported to range from 71% to 97% (8). For GGNs, Oda et al. (8) reported a 100% segmentation rate using prototype software with the semi-automated method and manual editing. For pure GGNs, Park et al. (9) reported a 97.8-98.3% segmentation rate using LungCARE with the semi-automated method. A previous comparison study of 6 different commercial volumetry software programs by de Hoop et al. (10) showed that the segmentation rate varied significantly from 71% to 86% when the semi-automated method was used. They also showed that there were significant differences in absolute nodule volumes among the different volumetry software programs (10). However, the study by de Hoop et al. (10) dealt with only solid nodules, and our study demonstrated that such substantial variation in segmentation capability between software programs is also present in the volumetry of GGNs. As segmentation capability is an essential prerequisite for nodule volumetry, it may be important to know which volumetry software program provides the best segmentation performance.
With respect to the morphologic features affecting GGN segmentation, a lobulated contour and vascular attachment of GGNs were revealed to be significant negatively influencing factors for successful nodule segmentation in LungCARE. In LungVCAR, vascular attachment of the nodule was the single negative feature. It is well known that nodule attachments make it difficult to accurately define boundaries in the case of juxtavascular and juxtapleural solid nodules (13), with segmentation failure rates ranging from 20% to 28%, according to a study by Kostis et al. (14). In addition, Das et al. (15) showed that the overall absolute percentage error of volume measurement was highest for juxtapleural nodules. Our result partly coincides with these previous studies (13-15), and it is noteworthy that vascular attachment is critical in the volumetric analysis of GGNs as well as of solid nodules. As for nodule contour, we found that a lobulated contour was a negatively influencing morphologic feature of GGN segmentation in LungCARE. Similarly, Petrou et al. (16) showed that nodule contour had a significant negative effect on volume measurement variability in solid nodules.
As for the influence of the solid portion on GGN segmentation, there were no significant differences in solid proportion or solid part size between the successful segmentation group and the poor/failed segmentation group for either software program. PS GGNs with a solid portion ≤ 5 mm frequently prove to be AISs or minimally invasive adenocarcinomas pathologically (17, 18), for which close follow-up can be an important management option instead of immediate surgical resection. In this context, our study results suggest that PS GGNs with a solid portion ≤ 5 mm could also be eligible for follow-up with volumetry software programs.
For measurement accuracy evaluation using simulated GGNs, we found that the measurement accuracy of volume and attenuation of GGNs was reasonably acceptable in successfully segmented GGNs ≥ 10 mm in both software programs. As for attenuation measurement, LungCARE showed significantly smaller measurement error than LungVCAR, but with respect to volume measurement, there was no significant difference in measurement error between the 2 software programs. However, simulated GGNs with -800 HU were not included in this comparison because they could not be segmented in LungVCAR, although they could be segmented in LungCARE; a simple comparison between the 2 software programs based on the successfully segmented nodules in this study may therefore not have been appropriate. In real clinical practice, GGNs with attenuation around -800 HU may not be successfully segmented in LungVCAR and therefore might show higher volume measurement error than with LungCARE. Thus, we assume that both volume and attenuation measurement errors would be smaller with LungCARE if GGNs with a wider range of size and attenuation were measured. We can also infer that the attenuation of GGNs would be an important factor in determining successful nodule segmentation with LungVCAR. Unfortunately, we were not able to identify the effect of GGN attenuation in our segmentation capability study as the enrolled GGNs ranged from -759 to -171 HU.
Previously, Oda et al. (8) reported that the mean relative volume measurement error for simulated GGNs ≥ 5 mm ranged from -4.1% to 7.1% using prototype software with the semi-automated method and manual editing. Relative attenuation measurement error was not described in that study and we did not include -450 HU nodules which were used as solid nodule surrogates in the study by Oda et al. (8). Our volume measurement error was higher than that reported by Oda et al. (8); however, we only used the semi-automated method without manual editing or manual drawing as we considered them to be impractical and time-consuming in real clinical practice.
Our study has several limitations. First, we used 3 different CT scanners for GGN segmentation analysis. Scanner-specific parameters can potentially affect volumetric measurements (13). However, since differences in the scanning parameters, including detector collimation, section thickness, and dose settings, among the 3 CT scanners were minimal in our study, we think that the measurement variability across different CT scanners with similar scanning parameters would be within an acceptable range, as reported by Das et al. (15). Further investigation would be necessary to determine the exact impact of vendor-specific acquisition on measurement variability. Second, GGN segmentation in our study was assessed visually and categorization was consensus-driven between 2 radiologists. As yet, no definite objective method for the evaluation of nodule segmentation is available. Some investigators have reported that the simultaneous truth and performance level estimation algorithm or probability maps using multi-reader manual segmentation datasets might provide volumetric analysis values closer to the ground truth (19, 20); however, visual analysis could be a simple, practical, and efficient alternative for categorizing the success of semi-automated segmentation (10). Third, the number of GGNs in the segmentation analysis study was relatively small and few GGNs with spiculation or nonspherical shape were included. Further studies with a larger number of nodules are warranted. Fourth, a significant number of GGNs were not confirmed pathologically. Thus, we did not perform subgroup analysis between benign and malignant GGNs. Fifth, the simulated GGNs used in the phantom study might be too simplistic to reflect the measurement accuracy of volumetry software programs in real clinical practice, in that simulated GGNs are completely spherical in shape.
In conclusion, LungCARE showed significantly higher segmentation success rates than LungVCAR and vascular attachment was a factor of negative influence on nodule segmentation in both software programs. Measurement accuracy of volume and attenuation of GGNs was acceptable in GGNs ≥ 10 mm in both software programs.
References
1. Henschke CI, Yankelevitz DF, Mirtcheva R, McGuinness G, McCauley D, Miettinen OS; ELCAP Group. CT screening for lung cancer: frequency and significance of part-solid and nonsolid nodules. AJR Am J Roentgenol. 2002; 178:1053–1057.
2. Park CM, Goo JM, Lee HJ, Lee CH, Chun EJ, Im JG. Nodular ground-glass opacity at thin-section CT: histologic correlation and evaluation of change at follow-up. Radiographics. 2007; 27:391–408.
3. Lee HJ, Goo JM, Lee CH, Park CM, Kim KG, Park EA, et al. Predictive CT findings of malignancy in ground-glass nodules on thin-section chest CT: the effects on radiologist performance. Eur Radiol. 2009; 19:552–560.
4. Kim HY, Shim YM, Lee KS, Han J, Yi CA, Kim YK. Persistent pulmonary nodular ground-glass opacity at thin-section CT: histopathologic comparisons. Radiology. 2007; 245:267–275.
5. Tsunezuka Y, Shimizu Y, Tanaka N, Takayanagi T, Kawano M. Positron emission tomography in relation to Noguchi's classification for diagnosis of peripheral non-small-cell lung cancer 2 cm or less in size. World J Surg. 2007; 31:314–317.
6. de Hoop B, Gietema H, van de Vorst S, Murphy K, van Klaveren RJ, Prokop M. Pulmonary ground-glass nodules: increase in mass as an early indicator of growth. Radiology. 2010; 255:199–206.
7. Goo JM. A computer-aided diagnosis for evaluating lung nodules on chest CT: the current status and perspective. Korean J Radiol. 2011; 12:145–155.
8. Oda S, Awai K, Murao K, Ozawa A, Yanaga Y, Kawanaka K, et al. Computer-aided volumetry of pulmonary nodules exhibiting ground-glass opacity at MDCT. AJR Am J Roentgenol. 2010; 194:398–406.
9. Park CM, Goo JM, Lee HJ, Kim KG, Kang MJ, Shin YH. Persistent pure ground-glass nodules in the lung: interscan variability of semiautomated volume and attenuation measurements. AJR Am J Roentgenol. 2010; 195:W408–W414.
10. de Hoop B, Gietema H, van Ginneken B, Zanen P, Groenewegen G, Prokop M. A comparison of six software packages for evaluation of solid lung nodules using semi-automated volumetry: what is the minimum increase in size to detect growth in repeated CT examinations. Eur Radiol. 2009; 19:800–808.
11. Kakinuma R, Kodama K, Yamada K, Yokoyama A, Adachi S, Mori K, et al. Performance evaluation of 4 measuring methods of ground-glass opacities for predicting the 5-year relapse-free survival of patients with peripheral nonsmall cell lung cancer: a multicenter study. J Comput Assist Tomogr. 2008; 32:792–798.
12. Wang Y, van Klaveren RJ, van der Zaag-Loonen HJ, de Bock GH, Gietema HA, Xu DM, et al. Effect of nodule characteristics on variability of semiautomated volume measurements in pulmonary nodules detected in a lung cancer screening program. Radiology. 2008; 248:625–631.
13. Gavrielides MA, Kinnard LM, Myers KJ, Petrick N. Noncalcified lung nodules: volumetric assessment with thoracic CT. Radiology. 2009; 251:26–37.
14. Kostis WJ, Reeves AP, Yankelevitz DF, Henschke CI. Three-dimensional segmentation and growth-rate estimation of small pulmonary nodules in helical CT images. IEEE Trans Med Imaging. 2003; 22:1259–1274.
15. Das M, Ley-Zaporozhan J, Gietema HA, Czech A, Mühlenbruch G, Mahnken AH, et al. Accuracy of automated volumetry of pulmonary nodules across different multislice CT scanners. Eur Radiol. 2007; 17:1979–1984.
16. Petrou M, Quint LE, Nan B, Baker LH. Pulmonary nodule volumetric measurement variability as a function of CT slice thickness and nodule morphology. AJR Am J Roentgenol. 2007; 188:306–312.
17. Godoy MC, Naidich DP. Subsolid pulmonary nodules and the spectrum of peripheral adenocarcinomas of the lung: recommended interim guidelines for assessment and management. Radiology. 2009; 253:606–622.
18. Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y, et al. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society international multidisciplinary classification of lung adenocarcinoma. J Thorac Oncol. 2011; 6:244–285.
19. Warfield SK, Zou KH, Wells WM. Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. IEEE Trans Med Imaging. 2004; 23:903–921.
20. Deeley MA, Chen A, Datteri R, Noble JH, Cmelak AJ, Donnelly EF, et al. Comparison of manual and automatic segmentation methods for brain structures in the presence of space-occupying lesions: a multi-expert study. Phys Med Biol. 2011; 56:4557–4577.