Abstract
Objective
To compare the detection performance of the automated whole breast ultrasound (AWUS) with that of the hand-held breast ultrasound (HHUS) and to evaluate the interobserver variability in the interpretation of the AWUS.
Materials and Methods
AWUS was performed in 38 breast cancer patients. A total of 66 lesions were included: 38 breast cancers, 12 additional malignancies and 16 benign lesions. Three breast radiologists independently reviewed the AWUS data and analyzed the breast lesions according to the BI-RADS classification.
Results
The detection rate of malignancies was 98.0% for HHUS and 90.0%, 88.0% and 96.0% for the three readers of the AWUS. The sensitivity and the specificity were 98.0% and 62.5% in HHUS, 90.0% and 87.5% for reader 1, 88.0% and 81.3% for reader 2, and 96.0% and 93.8% for reader 3, in AWUS. There was no significant difference in the radiologists' detection performance, sensitivity and specificity (p > 0.05) between the two modalities. The interobserver agreement was fair to good for the ultrasonographic features, categorization, size, and the location of breast masses.
Hand-held ultrasonography (HHUS) has been used as an important adjunct to mammography to evaluate breast lesions, and the diagnostic accuracy of HHUS has remarkably increased (1). Automated whole breast ultrasonography (AWUS) scanners were originally designed to examine the entire breast effectively and to overcome the operator dependency of HHUS (1-3). The first prone-type AWUS scanner was used in the late 1970s, and the supine-type AWUS scanner has been used since the mid-1980s. The image quality of the older automated scanners was inferior to that of HHUS (4). Nevertheless, the current high-resolution AWUS scanners with volumetric technologies can demonstrate the breast anatomy and document breast lesions more accurately (1). Several studies have shown good diagnostic performance (4-7) and good interobserver agreement (6, 7) for volumetric AWUS in comparison with HHUS. Good interobserver agreement in the use of the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) lexicon and an excellent expected malignancy probability for the ACR BI-RADS assessment categories have been demonstrated for HHUS; the same needs to be established for AWUS before it is applied clinically in a screening or diagnostic setting. However, only a few studies have evaluated the radiologists' detection performance using AWUS (5), and the interobserver variability for the description, size and location of the ultrasonographic features has yet to be evaluated. This study aimed to evaluate the radiologists' detection performance of AWUS in a group of breast cancer patients and to examine the degree of interobserver agreement among clinicians with various levels of breast ultrasonography (US) experience.
This study was conducted with the approval of the institutional review board, and written informed consent was obtained from all patients before the AWUS evaluation. From October 2009 to March 2010, two radiographers performed AWUS in 45 consecutive breast cancer patients who were scheduled for breast MRI. The radiographers performed these examinations after completing 15 training cases.
Among the 45 patients, we excluded 5 patients who were scheduled for chemotherapy and 2 other patients who refused a breast operation. A total of 75 breasts in 38 patients were included in this study; one breast was excluded because of a previous mastectomy. The lesion size ranged from 5 to 80 mm (mean: 30.97 mm). Fifty lesions (the 38 breast cancers and 12 additional malignancies) were pathologically confirmed to be malignant by surgical excision: 37 invasive ductal carcinomas (IDCs) with or without ductal carcinoma in situ (DCIS), 9 DCIS, 2 invasive micropapillary carcinomas, 1 mucinous carcinoma and 1 invasive lobular carcinoma (ILC). Sixteen lesions were considered benign. Ten lesions were pathologically confirmed to be benign by surgical excision (n = 4) or by US-guided percutaneous core-needle biopsy (n = 6): 6 fibrocystic changes, 1 fibroadenoma, 2 papillomas and 1 columnar cell change. Six lesions were shown to be typical cysts on HHUS.
A total of 38 mammograms, as well as HHUS and AWUS examinations, were performed. For the mammograms, the standard craniocaudal and mediolateral oblique views were obtained using a Mammomat 3000 unit (Siemens Medical Solutions, Solna, Sweden) and a Lorad M3 mammography unit (Hologic Inc., Boston, MA, USA). The HHUS images were acquired using a 7-15 MHz linear probe (HDI 3000, Advanced Technology Laboratories, Bothell, WA, USA; iU22 Ultrasound System, Philips Ultrasound, Bothell, WA, USA) and a 6-14 MHz linear probe (EUB-8500 scanner, Hitachi Medical, Tokyo, Japan). Before the breast MRI examination for the staging workup, the AWUS images were obtained with an ACUSON S2000 Automated Breast Volume Scanner (ABVS: Siemens Medical Solutions, Mountain View, CA, USA) by two radiographers. The ABVS acquired 15.4 × 16.8 × maximum 6 cm volume data sets of the breast in one sweep with a 5-14 MHz wide-aperture linear probe. The system captured the volume data at slice intervals of 0.5 mm. The right breast was initially scanned in the anterior-posterior view, which included the nipple and most parts of the breast, with the patient in the supine position. The right lateral view and the right medial view, which mainly included the outer breast and the inner breast, were then scanned with the patient in the slightly right down position and the slightly left down position, respectively. After the acquisition of the 3D volume data, the data were automatically sent from the ACUSON S2000 ABVS to the workstation and were reviewed in multiple orientations using a multiplanar reconstruction (MPR) display. The scan thickness was displayed at intervals of 1 mm without overlap.
All of the mammographic image files and AWUS data files were masked and randomised. A 2-hour tutorial explained the operation of the ACUSON S2000 ABVS workstation. Three cases of AWUS data that were not part of this study were demonstrated and reviewed at the tutorial.
Four radiologists had no previous experience in interpreting AWUS images and were trained to operate the ABVS review workstation before reviewing the cases.
The three radiologists, who specialized in breast imaging and had practiced in an academic breast imaging section for 7, 3, and 1 years, respectively, independently evaluated all of the 3D AWUS data after reviewing the mammographic images. They were blinded to the patient data and to the HHUS and MRI findings. Each observer evaluated all of the 3D data and attempted to detect all suspicious solid lesions and typical benign cystic lesions.
The observers were instructed to capture each lesion with a marker on the representative images and to save it as a picture file. Each observer described each lesion according to Table 1. There were two ultrasonographic lesion types: the mass type and the non-mass type. The non-mass type had neither a dominant mass (with or without daughter nodules) nor a few similar-sized discrete nodules; instead, it showed ductal extension centrally toward the nipple with branch-pattern involvement of several peripheral ducts, or echogenic dots of microcalcifications without definite mass formation.
The observers assigned the lesion type, size, location and a final BI-RADS category to each lesion and recorded the BI-RADS lexicon descriptors for mass-type lesions.
One radiologist, who specialised in breast imaging and has practiced in an academic breast imaging section for 5 years, reviewed all of the medical records, mammography, HHUS, AWUS and MRI findings. The radiologist reviewed the HHUS data, described each lesion according to BI-RADS and recorded the maximal diameter and the location on a prepared sheet.
The detection rate, sensitivity and specificity of the HHUS and each reader of the AWUS were calculated and compared using the chi-square test or Fisher's exact test. A p value of less than 0.05 was considered to indicate statistical significance. Detection was considered positive when an MRI-detected lesion was detected by the AWUS. MRI-detected lesions were lesions detected on HHUS or on second-look HHUS. A lesion not detected with US was considered to be BI-RADS category 1. The sensitivity and the specificity for malignancy were calculated from a binary outcome: categories 1-3 were grouped as negative and categories 4-5 were grouped as positive.
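The binary grouping described above can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical and are not taken from the study; the sketch only assumes that each lesion has a final BI-RADS category (1 for a lesion not detected with US) and a reference label of malignant or benign.

```python
from typing import Sequence, Tuple


def is_positive(birads_category: int) -> bool:
    """Categories 4-5 count as positive; categories 1-3 count as negative."""
    return birads_category >= 4


def sensitivity_specificity(
    categories: Sequence[int], malignant: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for dichotomized BI-RADS categories."""
    tp = fn = tn = fp = 0
    for category, is_malignant in zip(categories, malignant):
        positive = is_positive(category)
        if is_malignant and positive:
            tp += 1  # malignant lesion called category 4 or 5
        elif is_malignant:
            fn += 1  # malignant lesion called category 1-3 (or not detected)
        elif positive:
            fp += 1  # benign lesion called category 4 or 5
        else:
            tn += 1  # benign lesion called category 1-3
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: three malignant and three benign lesions.
toy_categories = [5, 4, 1, 3, 2, 4]
toy_malignant = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(toy_categories, toy_malignant)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

In this toy example the undetected malignant lesion (category 1) lowers the sensitivity, which mirrors the rule that a lesion not detected with US is treated as category 1.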
The agreement among the HHUS and the three AWUS readers was examined using the coefficient for inter-rater agreement (Cohen kappa). The kappa values were interpreted on a five-level scale: poor (0.20 or less), fair (0.21 to 0.40), moderate (0.41 to 0.60), good (0.61 to 0.80), and very good (0.81 to 1.00) (8). All of the statistical analyses were performed using the SAS software (version 9.1, SAS Institute Inc., Cary, NC, USA).
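As an illustration of the agreement statistic, the following Python sketch computes Cohen kappa for one pair of readers and maps the value onto the five-level scale given above. It is not the SAS procedure used in the study; the function names and the example descriptors are hypothetical, and how the pairwise values were combined into an overall agreement for three readers is not specified here.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen kappa for two raters who labelled the same lesions."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of lesions")
    n = len(rater_a)
    # Observed proportion of lesions on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value onto the five-level scale used in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical shape descriptors assigned by two readers to six masses.
reader_1 = ["oval", "oval", "irregular", "round", "irregular", "oval"]
reader_2 = ["oval", "round", "irregular", "round", "irregular", "irregular"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f} ({interpret_kappa(kappa)})")
```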
A total of 50 malignancies (38 cases of breast cancers and 12 cases of additional malignancies) and 16 benign lesions were included in the study. The additional malignancies were: 7 cases of multifocal cancers, 3 cases of multicentric cancers and 2 cases of contralateral cancers.
Table 2 shows the detection performance of the HHUS and the three AWUS readers. For the detection of breast cancer, HHUS revealed 38 of 38 lesions (100%), and the three AWUS readers revealed 35, 34 and 38 lesions (92.1%, 89.5% and 100%), respectively (Figs. 1, 2). For the detection of additional malignancies, HHUS revealed 11 of 12 cancers (91.7%), and each of the three readers revealed 10 lesions (83.3%) (Figs. 3, 4). Overall, six cancers were missed by at least one of the three AWUS readers: 4 breast cancers and 2 additional malignancies. Five of the missed cancers were mass lesions with a diameter of 0.5-1.9 cm, and one was a non-mass lesion with a diameter of 4.5 cm. For the detection of additional benign lesions, HHUS revealed 13 of 16 lesions (81.3%), and the three readers revealed 12, 12 and 8 lesions (75%, 75% and 50%), respectively (Fig. 5). One reader assigned category 4 (suspicious finding) to a breast lesion that was category 3 (probably benign finding) on HHUS.
The sensitivity and the specificity were 98.0% (49/50) and 62.5% (10/16) for HHUS, 90.0% (45/50) and 87.5% (14/16) for reader 1, 88.0% (44/50) and 81.3% (13/16) for reader 2, and 96.0% (48/50) and 93.8% (15/16) for reader 3. There was no significant difference in the radiologists' detection rate, sensitivity or specificity between HHUS and AWUS (p > 0.05).
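To make the comparison concrete, the sketch below applies a two-sided Fisher's exact test to two of the 2 × 2 tables implied by the counts above: the sensitivity of HHUS versus reader 2, and the specificity of HHUS versus reader 3. This is an illustrative re-computation under stated assumptions, not the authors' SAS analysis. The function name is hypothetical, the choice of these two pairs is ours, and the test treats the two modalities as independent samples even though the same lesions were read with both.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Rows: modality; columns: (correct, incorrect) calls, taken from the text.
p_sensitivity = fisher_exact_two_sided(49, 1, 44, 6)   # HHUS vs. reader 2, 50 malignancies
p_specificity = fisher_exact_two_sided(10, 6, 15, 1)   # HHUS vs. reader 3, 16 benign lesions
print(f"sensitivity: p = {p_sensitivity:.3f}; specificity: p = {p_specificity:.3f}")
```

Under these assumptions both p values come out above 0.05, which is consistent with the reported absence of a significant difference.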
For the 34 lesions that were detected by HHUS and the three AWUS readers, the three readers accurately classified the lesion type for 88.2% (30/34), 91.2% (31/34) and 94.1% (32/34) of the lesions, respectively, with the standard reference being the HHUS classification. There was no significant difference in lesion type classification among the three readers and the HHUS (p > 0.05).
Table 3 summarises the interobserver variability for the size, location and each descriptor. The 26 lesions that all three readers classified as the mass type were analysed for interobserver variability. We obtained a relatively high degree of interobserver agreement for the AWUS interpretation. Good agreement among the three readers was seen in describing the mass location (κ = 0.74) and the echogenicity (κ = 0.65). The overall agreement for the mass size, shape, posterior features, orientation and BI-RADS category was moderate (κ = 0.43, 0.45, 0.45, 0.50 and 0.57, respectively) (Figs. 1, 3). The overall agreement for the margin was fair (κ = 0.25) (Fig. 3).
Automatic whole breast ultrasound is used to scan the entire breast to overcome operator dependency. In the past, the acceptance rate of automated two-dimensional (2D) and three-dimensional (3D) US was low because of the poor resolution (4). Volumetric ultrasonography was recently proposed, and the progress in 2D high-frequency US transducer technology combined with compound imaging and speckle reduction is the basis for high-quality 3D volume US (4). The new design, registered as the ACUSON S2000 Automated Breast Volume Scanner (ABVS: Siemens Medical Solutions, Mountain View, CA, USA), consists of a rigid and substantially stationary frame and a compressive membrane (a polyester film sheet). By manually pushing the frame to make firm contact with the relatively soft breast, the transducer can optimally scan the breast with a sufficient amount of gel applied evenly on the breast surface, which leaves minimal to no contact artifacts (3).
Automatic whole breast ultrasound has several advantages over HHUS (1, 4, 9): 1) it is more reproducible, and it allows for thorough imaging of the entire breast, 2) it has higher definition, better contrast and sharpness and smaller images for review due to a high-resolution 2000-line reading monitor with 3D capability, 3) it allows displayed interpretation at computer-monitor-based reading stations with non-real-time review, which optimises the radiologist's reading environment and 4) it is well accepted by participants because of the reduced breast compression (compared to that of the mammography), the lack of exposure to ionizing radiation and the lack of contrast medium injection. Therefore, the high-resolution AWUS scanners are good for follow-up studies, and they can improve the confidence level of a negative reading (1).
In a previous study, there was no significant difference in image quality between HHUS and AWUS (4). The rate of breast cancer detection using both AWUS and mammography (MMG) doubled in radiographically dense breasts, providing significant cancer detection improvement compared with MMG alone (9). Additional detection and the smaller size of invasive cancers may justify the expense of this technology for women with dense breasts and/or those who are at a high risk for breast cancer (9).
In previous studies, the proportion of lesions detected by all readers with AWUS was 55.7% (39/70) (5) and 96.6% (29/30) (6). In our study, the detection rate of all readers was 78.8% (52/66), which was in the range of the previous studies (5, 6). For the detection of malignancies, the number of missed cancers was zero (6) and four (5) in the previous studies and six in this study. The five mass lesions with a diameter of 0.5-1.9 cm were probably missed due to isoechogenicity in two cases, peripheral location and incomplete coverage in one case, deep location and incomplete depth coverage in one case, and poor image quality of a thin breast with multifocal shadowing by Cooper's ligaments in one case. The one non-mass lesion with a diameter of 4.5 cm was probably missed because there was no definite mass formation.
In a previous study, all of the cancers missed by the readers were demonstrated on the retrospective analysis (5). The detectability of the lesions was associated with the mass size, the surrounding tissue changes and the shape of the mass on a multivariable logistic regression analysis (5). Berg et al. found that lesion detection was most consistent for lesions larger than 11 mm and that the rate of lesion detection decreased with decreasing lesion size (10). A future study concerning the detection rate of AWUS is needed with more cases and various reader groups who have different levels of experience with ABVS.
For the HHUS examinations, a sensitivity of 81-100% and a specificity of 33-88% have been reported to detect malignant and benign lesions (7, 11, 12). For the AWUS examinations, two studies reported that the sensitivity was 71.9% (5) and 96.5% (7), respectively, and the specificity was 74.5% (5) and 92.3% (7), respectively. Our study demonstrated a sensitivity of 88.0 to 96.0% and a specificity of 81.3 to 93.8%, which are in the range of the two previous studies (5, 7). Compared with the HHUS sensitivity of 81-100%, the AWUS sensitivity was slightly lower in this study, but there was no significant difference between the two modalities (p > 0.05).
Several studies have reported significant interobserver variability in the lesion description and final assessment using the BI-RADS lexicon for the HHUS (13-16). For the AWUS, only the observer agreement for the final BI-RADS assessment has been studied, and it was reported to be very good (6). Our study for AWUS showed a relatively high level of consistency for the US description and the BI-RADS category, but not for the margin terminology (Table 3). Compared with the HHUS studies (13-16), the degree of agreement for echogenicity in our study was higher, and those for the shape, orientation, margin, posterior features and the BI-RADS category were similar. A radiologist who performs HHUS can easily adjust the focal zone, the time- or depth-compensated gain and the transducer compression in real time in order to obtain good contrast images. The fixed ultrasonographic parameters and wide volume data of the AWUS might have an effect on the interpretation, but the results showed a similar agreement level between HHUS and AWUS.
Compared to the AWUS study by Wenkel et al. (6), the agreement level for the BI-RADS category was lower, but it was similar to the HHUS study using the BI-RADS category (14) (Table 3). Of the three HHUS studies (13, 15, 16) that used the BI-RADS subcategories such as 4a, 4b and 4c, two studies showed a low level of consistency due to the multitude of the terms (13, 16) (Table 3). We achieved moderate agreement for the BI-RADS category, which was similar to the HHUS study; this result is an important factor to validate the usefulness of the BI-RADS terminology in AWUS.
Nonetheless, our study had some limitations. First, this was a retrospective study with a relatively small number of patients. Second, the sample contained a selection bias with a higher percentage of cancer and a lower percentage of benign lesions, which does not reflect the general population. Third, the radiologists and the radiographers lacked experience with the AWUS, which could have affected the image quality and image interpretation. Fourth, there was a limited inclusion of histopathologies, making it difficult to generalise the results of this study to various breast pathologies.
In conclusion, we believe that AWUS is useful for detecting breast cancers and benign breast lesions in breast cancer patients. Compared with HHUS, AWUS showed no significant difference in the detection rate, sensitivity or specificity. There were relatively high degrees of interobserver agreement in the AWUS interpretation except for the margin. Further study with a large case number is needed, focusing on the various factors affecting lesion detection.
References
1. Chou YH, Tiu CM, Chen J, Chang RF. Automated full-field breast ultrasonography: the past and the present. J Med Ultrasound. 2007. 15:31–44.
2. Shipley JA, Duck FA, Goddard DA, Hillman MR, Halliwell M, Jones MG, et al. Automated quantitative volumetric breast ultrasound data-acquisition system. Ultrasound Med Biol. 2005. 31:905–917.
3. Tozaki M, Fukuma E. Accuracy of determining preoperative cancer extent measured by automated breast ultrasonography. Jpn J Radiol. 2010. 28:771–773.
4. Kotsianos-Hermle D, Wirth S, Fischer T, Hiltawsky KM, Reiser M. First clinical use of a standardized three-dimensional ultrasound for breast imaging. Eur J Radiol. 2009. 71:102–108.
5. Chang JM, Moon WK, Cho N, Park JS, Kim SJ. Radiologists' performance in the detection of benign and malignant masses with 3D automated breast ultrasound (ABUS). Eur J Radiol. 2011. 78:99–103.
6. Wenkel E, Heckmann M, Heinrich M, Schwab SA, Uder M, Schulz-Wendtland R, et al. Automated breast ultrasound: lesion detection and BI-RADS classification--a pilot study. Rofo. 2008. 180:804–808.
7. Kotsianos-Hermle D, Hiltawsky KM, Wirth S, Fischer T, Friese K, Reiser M. Analysis of 107 breast lesions with automated 3D ultrasound and comparison with mammography and manual ultrasound. Eur J Radiol. 2009. 71:109–115.
8. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977. 33:159–174.
9. Kelly KM, Dean J, Comulada WS, Lee SJ. Breast cancer detection using automated whole breast ultrasound and mammography in radiographically dense breasts. Eur Radiol. 2010. 20:734–742.
10. Berg WA, Blume JD, Cormack JB, Mendelson EB. Operator dependence of physician-performed whole-breast US: lesion detection and characterization. Radiology. 2006. 241:355–365.
11. Buchberger W, Niehoff A, Obrist P, DeKoekkoek-Doll P, Dünser M. Clinically and mammographically occult breast lesions: detection and classification with high-resolution sonography. Semin Ultrasound CT MR. 2000. 21:325–336.
12. Schnarkowski P, Schmidt D, Milz P, Kessler M, Reiser MF. [Comparison between current and high resolution ultrasound for diagnosis of breast lesions]. Ultraschall Med. 1996. 17:190–194.
13. Lazarus E, Mainiero MB, Schepps B, Koelliker SL, Livingston LS. BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value. Radiology. 2006. 239:385–391.
14. Park CS, Lee JH, Yim HW, Kang BJ, Kim HS, Jung JI, et al. Observer agreement using the ACR Breast Imaging Reporting and Data System (BI-RADS)-ultrasound, First Edition (2003). Korean J Radiol. 2007. 8:397–402.
15. Lee HJ, Kim EK, Kim MJ, Youk JH, Lee JY, Kang DR, et al. Observer variability of Breast Imaging Reporting and Data System (BI-RADS) for breast ultrasound. Eur J Radiol. 2008. 65:293–298.
16. Abdullah N, Mesurolle B, El-Khoury M, Kao E. Breast imaging reporting and data system lexicon for US: interobserver agreement for assessment of breast masses. Radiology. 2009. 252:665–672.