Abstract
Objective
To assess the added value of coronal reformation for radiologists and for referring physicians or surgeons in the CT diagnosis of acute appendicitis.
Materials and Methods
Contrast-enhanced CT was performed using 16-detector-row scanners in 110 patients, 46 of whom had appendicitis. Transverse (5-mm thickness, 4-mm increment), coronal (5-mm thickness, 4-mm increment), and combined transverse and coronal sections were interpreted by four radiologists, two surgeons and two emergency physicians. The area under the receiver operating characteristic curve (Az value), sensitivity, specificity (McNemar test), diagnostic confidence and appendiceal visualization (Wilcoxon signed rank test) were compared.
Results
For radiologists, the additional coronal sections tended to increase the Az value (0.972 vs. 0.986, p = 0.076) and pooled sensitivity (92% [95% CI: 88, 96] vs. 96% [93, 99]), and enhanced appendiceal visualization in true-positive cases (p = 0.031). For non-radiologists, no such enhancement was observed, and the confidence for excluding acute appendicitis declined (p = 0.013). Coronal sections alone were inferior to transverse sections for diagnostic confidence as well as appendiceal visualization for each reader group studied (p < 0.05).
The role of computed tomography (CT) in the diagnosis of acute appendicitis has progressed from a problem-solving tool to a standard front-line diagnostic procedure; more patients are now referred for a diagnostic CT (1, 2), even after hours. At the average community hospital, it is unlikely that every CT study will be evaluated and interpreted in a timely manner by an experienced gastrointestinal radiologist (3). In practice, therefore, clinical decisions are often influenced by preliminary CT interpretations made by less-experienced radiologists (4-7), the referring physician or the consulting surgeon. Although strong agreement has been reported between interpretations by radiology residents and staff radiologists (8, 9), considerable inter-observer variability still exists, and accuracy depends on the degree of clinician experience (6). In contrast to the excellent sensitivity and specificity achieved by gastrointestinal radiologists (often exceeding 95%) (5, 10-16), more disappointing CT results (sensitivity 76%, specificity 83%) have been reported in a prospective study (4). Therefore, it might be beneficial to provide images that are easier to interpret (6) for less-experienced readers who are frequently responsible for the decision to proceed with surgical exploration (3). Paulson et al. (17) recently reported that additional coronal reformation from isotropic voxel CT datasets enhances gastrointestinal radiologists' confidence for the diagnosis or exclusion of acute appendicitis. It has also been proposed that coronal images might be particularly helpful to less-experienced clinicians evaluating CT images by providing a more intuitive perspective on the orientation of structures, i.e., analogous to images obtained at exploratory laparotomy or abdominal radiography (17, 18).
The purpose of this study was to assess the added value of coronal reformation for duty radiologists and for referring physicians and surgeons in the CT diagnosis of acute appendicitis.
This study took place in a 950-bed urban university-based hospital. Our institutional review board approved this retrospective study and waived informed consent. We included consecutive patients who visited the emergency room between August and December of 2003, were older than 15 years, and had acute lower or right lower quadrant abdominal pain for which CT examination was requested because of suspected acute appendicitis. The study group consisted of 121 patients. All patients were evaluated initially by an emergency room physician, and then, if necessary, a surgeon was consulted. A clinical fellow was usually the first member of the surgical team to evaluate the patient. All patients were referred for CT examination at the discretion of emergency physicians or surgeons. In general, patients who were referred to CT had an atypical clinical presentation; patients with clinical findings highly suspicious for appendicitis underwent surgery without undergoing preoperative imaging.
An abdominal radiologist reviewed the original CT reports, surgical records and pathologic reports. Eleven of the 121 patients who did not undergo surgical exploration, and were lost to follow-up, were excluded from further analysis. Of the remaining 110 patients (61 females and 49 males) aged 17-87 years (mean, 40 years), 46 (42%) had acute appendicitis and 64 (58%) did not. A diagnosis of acute appendicitis was confirmed at surgery and by histopathology; an exclusion of acute appendicitis was confirmed at surgery and by histopathology (n = 7) or at clinical follow-up (n = 57). Clinical follow-up consisted of evaluation for symptom resolution during the hospital stay, and a telephone interview at least 12 weeks after the emergency department admission. CT findings for alternative diagnoses in the 64 patients without appendicitis were not analyzed due to the lack of a consistent reference standard, and the large number of alternative diagnoses.
Nonfocused CT examinations of the abdomen and pelvis were performed using 16-detector-row CT scanners (Brilliance; Philips Medical Systems, Cleveland, OH) without oral or rectal contrast material. Intravenous nonionic contrast material (2 mL/kg; Ultravist 370; Schering, Berlin, Germany) was administered at a rate of 3 mL/sec. Bolus-tracking software was used to trigger scanning 60 sec after aortic enhancement reached a 150-HU threshold. Helical scan data were acquired using 16×1.5-mm collimation, a rotation speed of 0.5 sec, a pitch of 1.17 to 1.25, and 120 kVp. Tube current was automatically modulated according to patient body size and the asymmetric nature of the object scanned, to produce the same image noise level as a reference image (Dose-Right, Philips Medical Systems, Eindhoven, The Netherlands). From raw data, two transverse section datasets were reconstructed (i.e., 5-mm thick at 4-mm increments, and 2-mm thick at 1-mm increments). Other reconstruction parameters, such as field-of-view and reconstruction filter type (filter C), were kept constant for these two image datasets. From the 2-mm thick transverse section dataset, we reformatted a 5-mm thick coronal section dataset, with 4-mm increments, using the average intensity projection technique (Extended Brilliance Workspace; Philips). We chose this through-plane resolution because it is generally regarded as sufficient for the diagnosis of acute appendicitis (7, 11, 19, 20).
A study coordinator selected eight readers: four radiologists (three to seven years of working experience) with varying degrees of academic focus on gastrointestinal radiology, two emergency physicians (two and five years of working experience), and two surgeons (clinical fellows). During the study period, all eight readers took turns on duty in their respective departments alongside other staff members who did not participate in this study. The non-radiologists (readers 5-8) had no specific CT interpretation training prior to this study. However, because coronal reformation is commonly used for CT studies at our institution, all readers had been previously exposed to these images.
The primary diagnostic criteria for acute appendicitis included visualization of the abnormally distended appendix (approximate threshold diameter ≥ 6 to 8 mm) with mural enhancement and periappendiceal fat stranding. Secondary diagnostic criteria were: calcified appendicolith, inflammatory mass and abscess (1). Each CT reader rehearsed the evaluation procedure for five sample cases, including three cases with acute appendicitis and two with normal appendix; none of the samples were included in the formal analysis.
Each CT reader independently reviewed 110 examinations in the following three viewing planes: 5-mm thick transverse alone, coronal alone and combined transverse and coronal. The 330 observations for each reader were randomly assigned to 15 reading sessions, avoiding repetition of any case within a given session. The order of the reading sessions was changed between readers. Reading sessions were separated by a minimum of 1 week.
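The randomization scheme described above can be illustrated with a minimal sketch. It is not the procedure actually used in this study, which is not specified beyond the constraints stated; the function name `assign_sessions`, the balanced design of 22 observations per session and the use of a per-reader seed are all assumptions made for illustration.

```python
import random

PLANES = ("transverse", "coronal", "combined")

def assign_sessions(n_cases=110, n_sessions=15, seed=0):
    """Return a list of sessions; each session is a list of (case_id, plane).

    Hypothetical balanced scheme: the sessions are grouped into three rounds,
    and every case is read exactly once per round, so no case can appear
    twice in the same session.
    """
    rng = random.Random(seed)
    n_rounds = len(PLANES)
    sessions_per_round = n_sessions // n_rounds  # 5 sessions in each round
    # Each case receives its three viewing planes in a random order.
    plane_order = {c: rng.sample(PLANES, k=n_rounds) for c in range(n_cases)}
    sessions = []
    for r in range(n_rounds):
        cases = list(range(n_cases))
        rng.shuffle(cases)
        # Deal the shuffled cases evenly into this round's sessions.
        for s in range(sessions_per_round):
            chunk = cases[s::sessions_per_round]
            sessions.append([(c, plane_order[c][r]) for c in chunk])
    rng.shuffle(sessions)  # randomize the order of the sessions themselves
    return sessions

sessions = assign_sessions(seed=1)
```

Calling the function with a different `seed` for each reader would give each reader a different session order, in the spirit of the design described above.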
Studies were viewed in the stack mode. For combined transverse and coronal sections, the CT readers could use cross-reference line functionality if needed. Radiologists used a diagnostic workstation (DS3000, Impax version 4.5; Agfa HealthCare, Mortsel, Belgium) and flat-panel monochrome 3-megapixel monitors (ME315; Totoku, Tokyo, Japan) with a diagonal display size of 20.8 inches (52.8 cm), whereas non-radiologists used a clinical review workstation (CS5000; Agfa HealthCare) and dual flat-panel color monitors (SyncMaster CX171T; Samsung, Seoul, Korea) of matrix size 1,280×1,024 and a diagonal display size of 17 inches (43.2 cm), to simulate the clinical setting at our hospital.
The CT readers were informed of patient inclusion criteria, but were unaware of the original CT report, the results obtained from any other diagnostic technique (e.g., laboratory results), and the final diagnosis. Review was conducted at the convenience of readers without any time constraint. A research assistant manually recorded the amount of time (including the time required to load images and mark scores) needed for each evaluation.
CT readers recorded the possibility of a diagnosis of appendicitis by using a rating scale from 0 to 100 to represent confidence level. A rating of 0 indicated absolute certainty that the appendix was normal, a rating of 50 indicated that the reader was uncertain whether the appendix was normal or inflamed, and a rating of 100 indicated absolute certainty of acute appendicitis. If the CT reader arrived at an alternative diagnosis, the evaluation procedure was continued on the assumption that the patient might additionally have acute appendicitis. CT readers also scored how clearly the appendix was visualized. A rating of 0 indicated that the appendix was not identified at all, and a rating of 100 indicated that the entire appendix was perfectly visualized.
Two biostatisticians participated in the study design and the statistical analysis. To assess the value of additional coronal sections, comparisons were made between transverse and combined transverse and coronal sections. To assess the value of coronal sections alone, comparisons were made between transverse and coronal sections.
To compare the diagnostic performance for different viewing planes, a multireader-multicase receiver operating characteristic (ROC) analysis was performed. Individual diagnostic sensitivity and specificity were determined using a decision threshold of a 50% possibility of appendicitis; derived values were then compared using the McNemar test. Sensitivity and specificity were pooled for radiologists and for non-radiologists using the inverse variance method (21).
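As an illustration of the thresholding and paired comparison described above, the following sketch computes sensitivity and specificity for one reader and compares two viewing planes with an exact McNemar test. It rests on assumptions not stated in the text: a score of 50 or more is treated as a positive call, and the exact (binomial) form of the McNemar test is used rather than the chi-square form. The function names are hypothetical.

```python
from math import comb

def sens_spec(scores, truth, threshold=50):
    """Sensitivity and specificity for one reader and one viewing plane.

    scores: 0-100 ratings of the possibility of appendicitis.
    truth:  True where appendicitis was confirmed by the reference standard.
    Assumption: a score >= threshold is read as a positive call.
    """
    calls = [s >= threshold for s in scores]
    tp = sum(c and t for c, t in zip(calls, truth))
    tn = sum((not c) and (not t) for c, t in zip(calls, truth))
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    return tp / n_pos, tn / n_neg

def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value for paired correct/incorrect calls.

    correct_a, correct_b: per-case booleans for the same cases read in two
    viewing planes (True where the reader's call was correct).
    """
    b = sum(a and not bb for a, bb in zip(correct_a, correct_b))
    c = sum(bb and not a for a, bb in zip(correct_a, correct_b))
    n = b + c  # only discordant pairs carry information
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

To compare sensitivities, the paired lists would be restricted to the appendicitis cases; to compare specificities, to the non-appendicitis cases.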
CT reader confidence scores for the diagnosis and exclusion of acute appendicitis were compared between viewing planes for appendicitis (true-positive) and non-appendicitis (true-negative) cases, respectively. Scores for appendiceal visualization were compared for true-positive, true-negative and all cases. To pool the results for radiologists, for non-radiologists, or for all eight CT readers, the scores were averaged across the readers in each group. Because these data distributions were asymmetric, the Wilcoxon matched-pairs signed-rank test was used. Inter-observer variability in terms of diagnostic confidence was assessed using the intra-class correlation coefficient. The reading time was compared between transverse and combined transverse and coronal sections for each reader using the paired t-test.
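The paired comparison of scores can be sketched as follows. This is a simplified illustration and not the SPSS routine used in this study: it drops zero differences, assigns average ranks to ties, and relies on a normal approximation with tie correction and without continuity correction. The function name is hypothetical.

```python
from math import erfc, sqrt

def wilcoxon_signed_rank(x, y):
    """Return (W+, two-sided p) for paired samples x and y.

    Sketch only: normal approximation with tie correction; zero
    differences are discarded before ranking.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / sqrt(var)
    return w_plus, erfc(abs(z) / sqrt(2))  # two-sided normal tail
```

For small samples an exact test would be preferable to the normal approximation used here.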
Statistical software used included SPSS (version 12.0; SPSS, Chicago, IL), LABMRMC (version 1.0.3B; Metz CE, Department of Radiology, University of Chicago, IL), and R (version 2.1.0; The R Foundation for Statistical Computing). Statistical significance for all tests was assumed at p < 0.05.
The addition of coronal sections to transverse sections tended to increase the area under the ROC curve (Az value) for radiologists (0.972 vs. 0.986, p = 0.076); however, this trend was not observed for non-radiologists (0.927 vs. 0.928, p = 0.949) or for all eight CT readers (p = 0.437). For non-radiologists, the Az value for coronal sections alone tended to be smaller than that for transverse sections (0.889 vs. 0.927, p = 0.247), but this tendency was not observed for radiologists (0.972 vs. 0.972, p = 0.989) (Table 1).
On the basis of a decision threshold of a 50% possibility of acute appendicitis, we tabulated how each CT reader's decision (i.e., diagnosis vs. exclusion of acute appendicitis) changed when coronal sections were added to transverse sections alone (Table 2). Averaged among the readers, each radiologist altered the decision to correctly diagnose or exclude acute appendicitis in 3.0 patients (2.7% of 110 patients) and to incorrectly diagnose or exclude it in 1.6 (1.5%) patients, whereas each non-radiologist altered the decision correctly in 9.3 (8.5%) and incorrectly in 7.5 (6.8%) patients (Figs. 1, 2).
For radiologists, the pooled sensitivities and specificities were 92% (95% CI: 88, 96) and 96% (95% CI: 94, 98), respectively, for transverse sections alone; 91% (95% CI: 87, 95) and 95% (95% CI: 92, 98) for coronal sections alone; and 96% (95% CI: 93, 99) and 96% (95% CI: 93, 98) for combined transverse and coronal sections. For non-radiologists, the pooled sensitivities and specificities were 95% (95% CI: 91, 98) and 70% (95% CI: 65, 76), respectively, for transverse sections alone; 92% (95% CI: 89, 96) and 68% (95% CI: 62, 74) for coronal sections alone; and 95% (95% CI: 92, 98) and 73% (95% CI: 67, 78) for combined transverse and coronal sections. For the individual sensitivity and specificity, no significant difference was observed between viewing planes (Table 3).
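For readers unfamiliar with inverse variance pooling, the following sketch shows one common fixed-effect form of the calculation. Whether the pooling in this study was carried out on the raw proportion scale, as assumed here, or on a transformed scale is not stated in the text; the function name and the per-reader counts in the usage line are hypothetical and are not study data.

```python
from math import sqrt

def pool_inverse_variance(counts, z=1.96):
    """Fixed-effect inverse-variance pooling of per-reader proportions.

    counts: list of (events, total) pairs, one per reader, e.g. the number
            of true-positive calls and the number of appendicitis cases.
    Returns (pooled proportion, lower 95% limit, upper 95% limit).
    Assumption: pooling on the raw proportion scale with binomial variances.
    """
    weights, estimates = [], []
    for events, total in counts:
        p = events / total
        var = p * (1 - p) / total
        if var == 0:
            # A proportion of exactly 0 or 1 has zero estimated variance;
            # a continuity correction would be needed before pooling.
            raise ValueError("zero variance; apply a continuity correction")
        weights.append(1 / var)
        estimates.append(p)
    total_weight = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, estimates)) / total_weight
    se = sqrt(1 / total_weight)
    return pooled, pooled - z * se, pooled + z * se

# Hypothetical counts for two readers, for illustration only.
pooled, lower, upper = pool_inverse_variance([(40, 50), (40, 50)])
```

Readers with more precise estimates (smaller variance) receive more weight, which is the defining feature of the method.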
For the diagnosis of acute appendicitis, the addition of coronal sections did not significantly increase the mean diagnostic confidence score for radiologists or non-radiologists (Fig. 3). For individual CT readers, additional coronal sections enhanced the diagnostic confidence of an emergency physician (reader 7, p = 0.009) but reduced that of a surgeon (reader 6, p = 0.039). For exclusion of acute appendicitis, additional coronal sections significantly increased mean scores, and therefore confidence declined, in non-radiologists (Figs. 2, 4) (21.5 [median] vs. 27.5, p = 0.013). However, this trend was not observed in the mean scores of radiologists. Both radiologists and non-radiologists were less confident in the diagnosis and exclusion of acute appendicitis using coronal sections alone than using transverse sections (p < 0.05).
The radiologists' mean appendiceal visualization score increased significantly with the addition of coronal sections in true-positive cases (p = 0.031) (Figs. 5, 6). However, this trend was not observed for the non-radiologists. In true-negative cases, no significant difference in the mean appendiceal visualization score was observed in any of the CT reader groups (Fig. 7). The mean score for coronal sections alone in all 110 cases was lower than that for transverse sections in both CT reader groups (p < 0.03).
Additional coronal sections tended to increase the intra-class correlation coefficient for the diagnostic confidence score for radiologists (0.818 vs. 0.835), non-radiologists (0.586 vs. 0.599), and all CT readers (0.670 vs. 0.698), while the coefficient for coronal sections alone in each CT reader group (0.785, 0.556, and 0.622, respectively) tended to be smaller than that for transverse sections.
The reading time for combined transverse and coronal sections was significantly longer than that required for transverse sections alone for two radiologists (readers 3 and 4) and a surgeon (reader 5), but was shorter for two radiologists (readers 1 and 2) (p < 0.05) (Table 4).
Before commencing this study, like Paulson et al. (17, 18), we postulated that additional coronal sections would be particularly helpful to non-radiologists; we believed that additional coronal sections would provide a more intuitive anatomic perspective (18), and help trace the unpredictable tortuous course of the vermiform appendix (22-24) for less-experienced CT readers.
However, our results suggest that CT interpretation by radiologists, rather than by non-radiologists, benefited more from additional coronal sections. For the radiologists involved in this study, additional coronal sections tended to increase diagnostic performance (Az value) and pooled sensitivity, and significantly enhanced appendiceal visualization in true-positive cases. These trends were not apparent among non-radiologists, although the additional coronal sections provided occasional advantages for diagnostic confidence (reader 7). No significant enhancement of pooled Az value, pooled sensitivity, pooled specificity, or appendiceal visualization was observed for non-radiologists. Moreover, the non-radiologists' confidence in excluding acute appendicitis was unexpectedly reduced by additional coronal sections.
In this study, the null hypothesis was that the diagnostic performance would be the same for transverse and combined transverse and coronal sections. The cost of a Type II error (failure to reject a false null hypothesis) is diagnostic inaccuracy (abandoning an effective adjunct for more accurate diagnosis), whereas a Type I error (incorrect rejection of a true null hypothesis) represents an additional data load. Because the latter is more tolerable, it might be more appropriate to use a higher significance level, instead of the traditional 0.05, to reduce Type II errors, even at the expense of additional Type I errors (25). If the significance level is raised from the traditional 0.05 to 0.1, it becomes more apparent that additional coronal sections increase diagnostic performance (Az value, p = 0.076) and pooled sensitivity only for radiologists.
Our results show that coronal sections alone were inferior to transverse sections for diagnostic confidence and appendiceal visualization for both radiologists and non-radiologists. Although not statistically significant, the Az value for non-radiologists with coronal sections tended to be smaller than for transverse sections, while this tendency was not observed in radiologists; this may explain the observed limited benefit of adding coronal sections for non-radiologists. It is not clear why the coronal sections alone were inferior to the transverse sections despite the improved z-axis resolution of 16-detector-row CT. Our results show that coronal sections could be a diagnostic adjunct to the transverse sections, rather than a replacement of transverse sections, for the practical diagnosis of acute appendicitis.
Based on our results, we postulate that additional coronal sections enable radiologists, rather than non-radiologists, to comprehend the three-dimensional configuration of the diseased or normal appendix more accurately, by allowing the integration of information in both viewing planes. Interestingly, the reading time for the two radiologists most experienced in abdominal and pelvic CT interpretation was reduced by the additional coronal sections. This may be explained by more prompt comprehension of the three-dimensional configuration using multiplanar images by expert radiologists. We believe that these results have implications concerning the role of radiologists in the interpretation of modern CT scans, which provide more "intuitive" multiplanar images, for the broad range of abnormalities involving not only the appendix but any small tubular structure in the abdomen.
The limitations of the present study are as follows. First, because this study included a limited number of heterogeneous CT readers arbitrarily selected from a single institution, our results may not be generally applicable to all CT readers. Nevertheless, we believe that our CT reader sample is likely to reflect the real clinical situation in an average teaching hospital. Second, we did not analyze the additional value of coronal sections for the visualization of the various secondary CT signs associated with acute appendicitis. However, of the various CT findings associated with acute appendicitis, the visualization of the inflamed appendix is the single most critical sign (11, 26-28). Third, we included only adult patients to have a homogeneous study sample. Our results might not be applicable to children, who tend to have less abdominal fat.
In conclusion, for radiologists, additional coronal reformation enhances diagnostic performance and appendiceal visualization in the CT diagnosis of acute appendicitis. The added value of coronal reformation is likely to be more apparent for radiologists than for referring physicians or surgeons.
Acknowledgements
This work was supported by a Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research Promotion Fund) (KRF-2005-003-E00186). We thank Sang Il Lee, MD, Yoo Shin Choi, MD, and Eui Joong Lee, MD, who participated as readers, and Jihyun Yang and Chang Min Dae, radiation technologists, for their assistance during image dataset preparation.
References
1. Raptopoulos V, Katsou G, Rosen MP, Siewert B, Goldberg SN, Kruskal JB. Acute appendicitis: effect of increased use of CT on selecting patients earlier. Radiology. 2003. 226:521–526.
2. Rhea JT, Halpern EF, Ptak T, Lawrason JN, Sacknoff R, Novelline RA. The status of appendiceal CT in an Urban Medical Center 5 years after its introduction: experience with 753 patients. AJR Am J Roentgenol. 2005. 184:1802–1808.
3. Kaiser S, Frenckner B, Jorulf HK. Suspected appendicitis in children: US and CT--a prospective randomized study. Radiology. 2002. 223:633–638.
4. Poortman P, Lohle PN, Schoemaker CM, Oostvogel HJ, Teepen HJ, Zwinderman KA, et al. Comparison of CT and sonography in the diagnosis of acute appendicitis: a blinded prospective study. AJR Am J Roentgenol. 2003. 181:1355–1359.
5. Wijetunga R, Tan BS, Rouse JC, Bigg-Wither GW, Doust BD. Diagnostic accuracy of focused appendiceal CT in clinically equivocal cases of acute appendicitis. Radiology. 2001. 221:747–753.
6. Wise SW, Labuski MR, Kasales CJ, Blebea JS, Meilstrup JW, Holley GP, et al. Comparative assessment of CT and sonographic techniques for appendiceal imaging. AJR Am J Roentgenol. 2001. 176:933–941.
7. Checkoff JL, Wechsler RJ, Nazarian LN. Chronic inflammatory appendiceal conditions that mimic acute appendicitis on helical CT. AJR Am J Roentgenol. 2002. 179:731–734.
8. Lowe LH, Draud KS, Hernanz-Schulman M, Newton MR, Heller RM, Stein SM, et al. Nonenhanced limited CT in children suspected of having appendicitis: prospective comparison of attending and resident interpretations. Radiology. 2001. 221:755–759.
9. Albano MC, Ross GW, Ditchek JJ, Duke GL, Teeger S, Sostman HD, et al. Resident interpretation of emergency CT scans in the evaluation of acute appendicitis. Acad Radiol. 2001. 8:915–918.
10. Rao PM, Rhea JT, Novelline RA, McCabe CJ, Lawrason JN, Berger DL, et al. Helical CT technique for the diagnosis of appendicitis: prospective evaluation of a focused appendix CT examination. Radiology. 1997. 202:139–144.
11. Jacobs JE, Birnbaum BA, Macari M, Megibow AJ, Israel G, Maki DD, et al. Acute appendicitis: comparison of helical CT diagnosis focused technique with oral contrast material versus nonfocused technique with oral and intravenous contrast material. Radiology. 2001. 220:683–690.
12. Weltman DI, Yu J, Krumenacker J Jr, Huang S, Moh P. Diagnosis of acute appendicitis: comparison of 5- and 10-mm CT sections in the same patient. Radiology. 2000. 216:172–177.
13. Lane MJ, Liu DM, Huynh MD, Jeffrey RB Jr, Mindelzun RE, Katz DS. Suspected acute appendicitis: nonenhanced helical CT in 300 consecutive patients. Radiology. 1999. 213:341–346.
14. Keyzer C, Tack D, de Maertelaer V, Bohy P, Gevenois PA, Van Gansbeke D. Acute appendicitis: comparison of low-dose and standard-dose unenhanced multi-detector row CT. Radiology. 2004. 232:164–172.
15. Lane MJ, Katz DS, Ross BA, Clautice-Engle TL, Mindelzun RE, Jeffrey RB Jr. Unenhanced helical CT for suspected acute appendicitis. AJR Am J Roentgenol. 1997. 168:405–409.
16. Mullins ME, Kircher MF, Ryan DP, Doody D, Mullins TC, Rhea JT, et al. Evaluation of suspected appendicitis in children using limited helical CT and colonic contrast material. AJR Am J Roentgenol. 2001. 176:37–41.
17. Paulson EK, Harris JP, Jaffe TA, Haugan PA, Nelson RC. Acute appendicitis: added diagnostic value of coronal reformations from isotropic voxels at multi-detector row CT. Radiology. 2005. 235:879–885.
18. Paulson EK, Jaffe TA, Thomas J, Harris JP, Nelson RC. MDCT of patients with acute abdominal pain: a new perspective using coronal reformations from submillimeter isotropic voxels. AJR Am J Roentgenol. 2004. 183:899–906.
19. Bendeck SE, Nino-Murcia M, Berry GJ, Jeffrey RB Jr. Imaging for suspected appendicitis: negative appendectomy and perforation rates. Radiology. 2002. 225:131–136.
20. Nikolaidis P, Hwang CM, Miller FH, Papanicolaou N. The nonvisualized appendix: incidence of acute appendicitis when secondary inflammatory changes are absent. AJR Am J Roentgenol. 2004. 183:889–892.
21. Deeks JJ. Systematic reviews in health care: Systematic reviews of evaluations of diagnostic and screening tests. BMJ. 2001. 323:157–162.
22. Callahan MJ, Rodriguez DP, Taylor GA. CT of appendicitis in children. Radiology. 2002. 224:325–332.
23. Guidry SP, Poole GV. The anatomy of appendicitis. Am J Surg. 1994. 60:68–71.
24. Wagner JM, McKinney WP, Carpenter JL. Does this patient have appendicitis? JAMA. 1996. 276:1589–1594.
25. Motulsky H. Intuitive biostatistics. 1995. New York: Oxford University Press;106–112.
26. Malone AJ Jr, Wolf CR, Malmed AS, Melliere BF. Diagnosis of acute appendicitis: value of unenhanced CT. AJR Am J Roentgenol. 1993. 160:763–766.
27. Rao PM, Rhea JT, Novelline RA, Mostafavi AA, Lawrason JN, McCabe CJ. Helical CT combined with contrast material administered only through the colon for imaging of suspected appendicitis. AJR Am J Roentgenol. 1997. 169:1275–1280.
28. Balthazar EJ, Birnbaum BA, Yee J, Megibow AJ, Roshkow J, Gray C. Acute appendicitis: CT and US correlation in 100 patients. Radiology. 1994. 190:31–35.