Abstract
Accurate diagnosis of gastric intestinal metaplasia (IM) is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing it. The aims of this study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Fifty selected cases, imaged with HD endoscopy, were sent to five experienced and five inexperienced endoscopists for diagnosis of gastric IM by visual inspection. Interobserver agreement was evaluated to assess the diagnostic reliability of HD endoscopy for IM, and diagnostic accuracy, sensitivity, and specificity were evaluated to assess its validity. Interobserver agreement was "poor" among both the experienced endoscopists (κ = 0.38) and the inexperienced endoscopists (κ = 0.33). The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Because diagnosis of IM by visual inspection is unreliable, biopsy should be considered for all areas suspicious for gastric IM. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy for gastric IM.
Intestinal metaplasia (IM) of the gastric mucosa is an important premalignant lesion or condition (1, 2); hence early detection of gastric cancer is expected to be achievable through accurate diagnosis of IM with regular follow-up (3). However, the concordance between morphological appearance and histopathologic findings is low, and the accuracy of endoscopic diagnosis is also known to be poor (4, 5). One reason is that ash-colored nodular change spreading in the antrum, a typical endoscopic feature of IM, is apt to be confused with antral nodular hyperplasia or raised erosion. In addition, the flat type of IM is difficult to detect by endoscopy alone (6). Therefore, accurate endoscopic diagnosis of IM is a major concern among those interested in the field of gastric cancer.
Recently, a number of studies aimed at improving the diagnostic rate of IM have been published. Methods ranging from conventional multiple biopsies and methylene blue chromoendoscopy to state-of-the-art narrow band imaging (NBI) and confocal laser endomicroscopy have been examined, and all yielded some promising results (7-10). Unfortunately, all of these methods increase cost or workload, which makes it nearly impossible to apply them widely, especially in the Asia-Pacific region where gastric cancer is prevalent. Thanks to advances in optical endoscopy techniques, even the mucosal structure can be visualized with high-definition (HD) endoscopy, which has a detection rate similar to that of NBI or chromoendoscopy in diagnosing Barrett's esophagus and colon adenomas (11-13). HD endoscopy is therefore expected to be better than conventional endoscopy at accurately diagnosing gastric IM, even without any additional modality. To our knowledge, there has been no study of the diagnostic yield of HD endoscopy in diagnosing gastric IM. Hence the aims of this study were to evaluate 1) the interobserver variation in diagnosing gastric IM by HD endoscopy and 2) the diagnostic accuracy of this modality for gastric IM among experienced and inexperienced endoscopists.
Fifty selected cases, imaged with HD endoscopy, were sent to five experienced and five inexperienced endoscopists for diagnosis of gastric IM by visual inspection. We evaluated the interobserver agreement of the enrolled endoscopists in order to assess the diagnostic reliability of HD endoscopy in diagnosing IM. We then compared the diagnostic accuracy of the experienced and inexperienced groups to find out whether endoscopic experience affects the diagnosis of IM. The study procedure is summarized in Fig. 1.
Patients who underwent esophagogastroduodenoscopy (EGD) at Hanyang University Guri Hospital between October 2008 and September 2009 were included in the study. From 1,284 patients endoscopically diagnosed with IM in the endoscopic image database and 648 patients histopathologically proven to have IM in the pathology database, a total of 1,596 patients (908 males, mean age 56 ± 14 yr) were finally enrolled after excluding 336 patients whose data overlapped in the two databases. Accordingly, the study population consisted of those who were endoscopically diagnosed with gastric IM irrespective of histopathologic diagnosis, and those who were histopathologically diagnosed with IM regardless of endoscopic diagnosis.
All EGDs had been performed by one of four endoscopists. An HD endoscope (GIF-H260; Olympus Optical Co., Ltd, Tokyo, Japan) and high-definition television (HDTV) system (Evis Lucera; Olympus Optical Co., Ltd, Tokyo, Japan) were used in all procedures.
We considered the following conditions for the endoscopic cases referred for interpretation. First, image quality should be of the highest standard. Second, the quality of the images should be similar across cases. Third, the images should show the typical characteristics of the lesion. Fourth, histopathologic examination should have been carried out to confirm the endoscopic diagnosis. To satisfy these conditions, we designated the following selection criteria:
A total of 101 patients were selected from the 1,596 patients after direct screening by a single gastroenterology faculty member and a single clinical fellow using the above criteria. Next, a final group of 50 cases was selected from these 101 patients by simple random sampling using SPSS software version 18.0 (SPSS, Chicago, IL, USA). The 50 finally selected cases comprised three different groups: a group diagnosed with IM both endoscopically and histopathologically, a group diagnosed with IM endoscopically but not by histopathologic examination, and a group not initially diagnosed with IM through endoscopy but confirmed as IM through histopathologic examination.
Production of endoscopic cases for interpretation: All endoscopic images had vertical and horizontal resolutions of 300 dots per inch, with a width of 900 and a height of 780 pixels. The 50 cases were made into a slide show using Microsoft PowerPoint 2007 (Microsoft, Redmond, WA, USA). Each case was composed of two slides: the first slide stated the anatomical position of the lesion from which the biopsy was taken, and the second slide showed the endoscopic images. The anatomical position shown on the first slide was written according to the minimal standard terminology 3.0 (14), and the stomach body was further subdivided into upper third, middle third, and lower third. The second slide consisted of four endoscopic images: one close-up image showing a clear mucosal structure of the biopsy site, and three other images from different angles that included the biopsy site (Fig. 2).
Assessors and interpretation of the selected cases: Five experienced endoscopists were enrolled, of whom two were assistant professors and three were associate professors. The mean (SD) age of the faculty was 40.26 (1.94) yr and the mean (SD) duration of their endoscopic careers was 105.6 (33.30) months. The inexperienced endoscopists consisted of five clinical fellows, with a mean (SD) age of 33.84 (1.76) yr and a mean (SD) duration of their endoscopic careers of 7.5 (1.36) months. The endoscopists were recruited from five different university-based hospitals.
The purpose and method of the study were explained in advance to all the endoscopists, and the 50 endoscopic cases were sent to each endoscopist via e-mail. Endoscopic interpretation was performed according to the anatomical position of the lesion presented in each case. There was no time limit for interpretation. If the endoscopist decided that a different finding was combined with IM, a diagnosis of IM was still to be made. The interpretation results were sent back via e-mail.
The sample size was calculated in order to guarantee the design accuracy. We assumed the sensitivity of HD endoscopy in diagnosing gastric IM to be around 85%. Furthermore, we required that the lower 95% confidence limit should not fall below 0.65, with a probability of 0.95. On this basis, the adequate sample size was calculated to be 50 (15).
Statistical analysis of interobserver variability was performed with SPSS software version 18.0 (SPSS, Chicago, IL, USA). Interobserver agreement was expressed as the percentage of full agreement among all observers, as well as by an overall κ statistic with 95% confidence interval (95% CI) (16, 17). A κ value greater than 0.8 denoted excellent agreement, 0.8 to 0.6 denoted good agreement, 0.6 to 0.4 denoted fair agreement, and less than 0.4 denoted poor agreement. A κ value of 0 indicated agreement equal to chance, and a value less than 0 suggested disagreement (18).
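The overall κ for several observers can be illustrated with a short sketch. It assumes that the overall statistic is Fleiss' kappa, which the text does not state explicitly; the function names, the handling of values that fall exactly on a cut-off, and the example ratings are hypothetical and are not study data.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a cases x categories table of rating counts.

    counts[i][j] is the number of observers who assigned case i to
    category j (here: column 0 = "IM", column 1 = "not IM").
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each case, then averaged over cases.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


def agreement_label(kappa: float) -> str:
    """Map kappa to the labels used in the text (boundary handling assumed)."""
    if kappa > 0.8:
        return "excellent"
    if kappa >= 0.6:
        return "good"
    if kappa >= 0.4:
        return "fair"
    return "poor"


# Hypothetical ratings: 4 cases, 5 observers each, columns = [IM, not IM].
example = [[5, 0], [4, 1], [2, 3], [0, 5]]
kappa = fleiss_kappa(example)
print(round(kappa, 3), agreement_label(kappa))
```

With five observers per group, each row of the table would sum to five. A statistics package may use a different multi-rater estimator, so values from this sketch need not match its output exactly.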
To calculate the accuracy in diagnosing IM, 2 × 2 tables were constructed to compare HD endoscopic diagnoses with histopathology as the reference standard. The histopathologic findings in all the endoscopic cases were reviewed by two experienced gastrointestinal pathologists. The outcome parameters were the sensitivity, specificity, overall accuracy, and predictive values of HD endoscopy. The diagnostic accuracy of the experienced and inexperienced endoscopists was compared in the following manner: each correct (true positive or true negative) score of an observer counted as +1 point; each incorrect score counted as 0 points. The sums of all scores were compared by the paired t-test (19).
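The scoring rule and the paired comparison can be sketched as follows. The text does not state the unit of pairing, so this sketch assumes pairing by case: for each case, the number of correct calls in the experienced group is paired with the number in the inexperienced group. The function names and the six-case example are hypothetical, and only the t statistic is computed; the P value would come from a t distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def score(calls: Sequence[int], truth: Sequence[int]) -> int:
    """+1 for each call that matches histopathology, 0 otherwise."""
    if len(calls) != len(truth):
        raise ValueError("calls and truth must have the same length")
    return sum(1 for c, t in zip(calls, truth) if c == t)


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic of the paired t-test for H0: mean(a - b) = 0."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Hypothetical per-case counts of correct calls (0 to 5 observers per group).
experienced = [5, 4, 3, 5, 2, 4]
inexperienced = [4, 3, 3, 4, 1, 2]
t_stat = paired_t_statistic(experienced, inexperienced)
print(round(t_stat, 3), "with", len(experienced) - 1, "degrees of freedom")
```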
According to the initial endoscopic findings, the 50 cases referred for interpretation to the 10 endoscopists comprised 33 IM, 7 atrophic gastritis, 5 erosive gastritis, and 5 cases with other diagnoses. According to the histopathologic diagnostic criteria, they comprised 31 IM, 11 chronic active gastritis, 5 chronic inflammation, and 3 cases with other diagnoses. When the IM was categorized according to histopathologic severity, there were 11 mild, 14 moderate, and 6 marked cases.
The interobserver agreement in the experienced endoscopists group for the 50 referred endoscopic cases was "poor" (κ = 0.38; 95% CI, 0.25-0.52) and it was also "poor" (κ = 0.33; 0.20-0.47) in the inexperienced group. In addition, interobserver agreement among all endoscopists was also "poor" (Table 1).
Interobserver agreement in both the experienced and inexperienced group was also "poor" for the histologically proven 31 IM cases, with κ-values of 0.28 (0.13-0.47) and 0.27 (0.12-0.47) for the experienced and inexperienced group, respectively. After categorizing the IM according to severity into mild, moderate, and marked based on the Updated Sydney System (20), the interobserver agreement was "poor" for the mild (n = 11) and moderate (n = 14) cases in both groups. However, in the marked IM cases (n = 6), interobserver agreement in the experienced group was "fair to good" (κ = 0.55; 0.18-0.90) but still "poor" in the inexperienced group. For the cases histologically proven to be non-IM (n = 19), the interobserver agreement for both groups was again "poor", with κ-values of 0.32 (0.13-0.57) and 0.39 (0.19-0.64) for the experienced and inexperienced group, respectively.
The sensitivity, specificity, and overall accuracy for IM of the experienced endoscopists were 74.2, 78.9, and 76.0%, respectively, while for the inexperienced endoscopists they were 67.7, 52.6, and 60.0%, respectively. In comparing the diagnostic accuracy of the experienced and inexperienced endoscopists, the sum of scores was 161 for the experienced endoscopists and 139 for the inexperienced endoscopists, indicating a higher diagnostic accuracy for the experienced endoscopists (P = 0.003). The positive and negative predictive values were 85.2% and 65.2%, respectively, for the experienced group, and 70.0% and 50.0%, respectively, for the inexperienced group (Table 2). Table 3 shows the diagnostic accuracy of each endoscopist.
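As a worked check, the experienced-group values can be reproduced from a single 2 × 2 table. The cell counts below are not reported in the text; they are inferred from the published percentages under the assumption of 31 histologically proven IM cases and 19 non-IM cases, and the function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard 2 x 2 table metrics, with histopathology as the reference."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred (not reported) counts for the experienced group:
# 23 of 31 IM cases called IM, 15 of 19 non-IM cases called non-IM.
experienced = diagnostic_metrics(tp=23, fp=4, fn=8, tn=15)
for name, value in experienced.items():
    print(f"{name}: {100 * value:.1f}%")
# sensitivity: 74.2%
# specificity: 78.9%
# accuracy: 76.0%
# ppv: 85.2%
# npv: 65.2%
```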
In the present study, we evaluated the interobserver agreement between endoscopists in order to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and further compared the diagnostic accuracy of experienced and inexperienced endoscopists to find out whether the degree of experience affects the diagnostic accuracy. The interobserver agreement within both groups was poor, and the diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists.
Endoscopes equipped with HDTV-compatible charge-coupled devices and HDTV video systems permit clear observation of even the fine structure of the mucosal surface, and they are replacing conventional endoscopes as the standard equipment. In studies that used HD endoscopy to compare the detection rate of colon polyps between white-light (WL) and NBI, and the diagnostic rate of Barrett's esophagus among WL, chromoendoscopy, and NBI, HD endoscopy using WL alone yielded results similar to those of the additional enhancement techniques (11-13). In addition, although the value of conventional endoscopy for diagnosing gastric IM is known to be low, there has until now been no study of the diagnostic reliability of HD endoscopy, which provides high-resolution images (21). In the present study, interobserver agreement in diagnosing gastric IM using WL HD endoscopic images was low in both the experienced (κ = 0.38; 0.25-0.52) and inexperienced (κ = 0.33; 0.20-0.47) groups. Therefore, although optical endoscopy has improved greatly since the invention of the endoscope, doubt remains as to whether endoscopy is trustworthy in diagnosing the precancerous lesion or condition known as gastric IM. Furthermore, the low interobserver agreement is meaningful because the experienced endoscopists enrolled in the present study all had at least seven years of experience in screening endoscopy within the Korean national cancer screening program for gastric cancer (22). The results of this study indicate that diagnosis of IM through visual inspection is highly unreliable and that all areas suspicious for gastric IM should be biopsied. This is particularly true in regions with a high prevalence of gastric cancer. Therefore, our findings should be borne in mind when deciding on gastric cancer-related policies in countries that use endoscopy as a screening tool for people at high risk of gastric cancer.
Variation in the quality of endoscopy likely has an important impact on patient outcomes, and endoscopic training and experience are therefore needed for high-quality endoscopy (23, 24). There have been several studies of the effects of endoscopic training and experience on endoscopic diagnostic accuracy, most of them confined to the diagnosis of colorectal polyps (19, 25, 26). To our knowledge, this is the first study to evaluate whether endoscopic experience influences the diagnostic yield in gastric IM. In this study, the overall diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). In addition, in the analysis of interobserver agreement, the only instance that yielded "fair to good" agreement (κ = 0.55; 0.18-0.90) was the experienced endoscopists' interpretation of the cases histopathologically diagnosed as marked IM. This outcome indicates that HD endoscopy by experienced endoscopists may be diagnostically acceptable in the limited category of cases classified as "marked" IM. Endoscopic experience is consequently needed in order to raise the diagnostic accuracy for gastric IM, and it does have some influence on diagnostic reliability.
This is the first study to assess the diagnostic reliability of HD endoscopy in diagnosing gastric IM and to emphasize the influence of endoscopic experience on diagnostic accuracy. However, our study has distinct limitations. First, video recordings of the endoscopic procedures were not included in the slide show. Video recording is much more realistic than still images and is a great asset because of its resemblance to real-time endoscopy; however, we used an endoscopic image database in order to access as many endoscopy cases as possible, and thus did not make use of any endoscopic video recordings that might have been available. To make up for this, for every case we provided still images of the highest quality, consisting of three images from different angles with the biopsy site included, and a single close-up image showing a vivid pit pattern. Nevertheless, the diagnostic reliability or validity of HD endoscopy may have been lower than expected because no video recordings were included in the slide show. Second, the present study was a retrospective study that enrolled cases diagnosed as IM through either endoscopic findings or histopathological confirmation, which may have introduced selection bias. Despite this, the interobserver agreement of both groups was poor.
In conclusion, diagnosis of IM through visual inspection under HD endoscopy is unreliable. Therefore, biopsy should be considered for all areas suspicious for gastric IM. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy for gastric IM.
ACKNOWLEDGEMENTS
We are grateful for the interpretations of endoscopic images carried out by Professor Y. C. Jeon, Hanyang University Guri Hospital, Professor Y. J. Jo and Professor B. K. Son, Eulji University, Professor H. L. Lee, Hanyang University Seoul Hospital, Professor S. Y. Yang, Seoul National University Hospital, Dr. H. Y. Park, Sungkyunkwan University, Dr. Y. H. Ahn, Wonkwang University, Dr. Y. H. Yu and Dr. J. Y. Jeong, Hanyang University Guri Hospital, and Dr. Y. W. Joo, Hanyang University Seoul Hospital.
References
1. Correa P. Human gastric carcinogenesis: a multistep and multifactorial process: first American Cancer Society Award Lecture on Cancer Epidemiology and Prevention. Cancer Res. 1992. 52:6735–6740.
2. De Vries AC, van Grieken NC, Looman CW, Casparie MK, de Vries E, Meijer GA, Kuipers EJ. Gastric cancer risk in patients with premalignant gastric lesions: a nationwide cohort study in the Netherlands. Gastroenterology. 2008. 134:945–952.
3. Correa P, Piazuelo MB, Wilson KT. Pathology of gastric intestinal metaplasia: clinical implications. Am J Gastroenterol. 2010. 105:493–498.
4. Kaur G, Raj SM. A study of the concordance between endoscopic gastritis and histological gastritis in an area with a low background prevalence of Helicobacter pylori infection. Singapore Med J. 2002. 43:90–92.
5. Laine L, Cohen H, Sloane R, Marin-Sorensen M, Weinstein WM. Interobserver agreement and predictive value of endoscopic findings for H. pylori and gastritis in normal volunteers. Gastrointest Endosc. 1995. 42:420–423.
6. Kaminishi M, Yamaguchi H, Nomura S, Oohara T, Sakai S, Fukutomi H, Nakahara A, Kashimura H, Oda M, Kitahora T, et al. Endoscopic classification of chronic gastritis based on a pilot study by the research society for gastritis. Dig Endosc. 2002. 14:138–151.
7. Areia M, Amaro P, Dinis-Ribeiro M, Cipriano MA, Marinho C, Costa-Pereira A, Lopes C, Moreira-Dias L, Romãozinho JM, Gouveia H, et al. External validation of a classification for methylene blue magnification chromoendoscopy in premalignant gastric lesions. Gastrointest Endosc. 2008. 67:1011–1018.
8. Bansal A, Ulusarac O, Mathur S, Sharma P. Correlation between narrow band imaging and nonneoplastic gastric pathology: a pilot feasibility trial. Gastrointest Endosc. 2008. 67:210–216.
9. Capelle LG, Haringsma J, de Vries AC, Steyerberg EW, Biermann K, van Dekken H, Kuipers EJ. Narrow band imaging for the detection of gastric intestinal metaplasia and dysplasia during surveillance endoscopy. Dig Dis Sci. 2010. 55:3442–3448.
10. Guo YT, Li YQ, Yu T, Zhang TG, Zhang JN, Liu H, Liu FG, Xie XJ, Zhu Q, Zhao YA. Diagnosis of gastric intestinal metaplasia with confocal laser endomicroscopy in vivo: a prospective study. Endoscopy. 2008. 40:547–553.
11. Adler A, Aschenbeck J, Yenerim T, Mayr M, Aminalai A, Drossel R, Schröder A, Scheel M, Wiedenmann B, Rösch T. Narrow-band versus white-light high definition television endoscopic imaging for screening colonoscopy: a prospective randomized trial. Gastroenterology. 2009. 136:410–416.
12. Rex DK, Helbig CC. High yields of small and flat adenomas with high-definition colonoscopes using either white light or narrow band imaging. Gastroenterology. 2007. 133:42–47.
13. Curvers W, Baak L, Kiesslich R, Van Oijen A, Rabenstein T, Ragunath K, Rey JF, Scholten P, Seitz U, Ten Kate F, et al. Chromoendoscopy and narrow-band imaging compared with high-resolution magnification endoscopy in Barrett's esophagus. Gastroenterology. 2008. 134:670–679.
14. Aabakken L, Rembacken B, LeMoine O, Kuznetsov K, Rey JF, Rösch T, Eisen G, Cotton P, Fujino M. Minimal standard terminology for gastrointestinal endoscopy - MST 3.0. Endoscopy. 2009. 41:727–728.
15. Flahault A, Cadilhac M, Thomas G. Sample size calculation should be performed for design accuracy in diagnostic test studies. J Clin Epidemiol. 2005. 58:859–862.
16. Chmura Kraemer H, Periyakoil VS, Noda A. Kappa coefficients in medical research. Stat Med. 2002. 21:2109–2129.
17. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychological Methods. 1996. 1:30–46.
18. Stokes ME, Davis CS, Koch GG. Categorical data analysis using the SAS system. 2000. 2nd ed. Cary: SAS Institute Inc.
19. Van den Broek FJ, van Soest EJ, Naber AH, van Oijen AH, Mallant-Hent RCh, Böhmer CJ, Scholten P, Stokkers PC, Marsman WA, Mathus-Vliegen EM, et al. Combining autofluorescence imaging and narrow-band imaging for the differentiation of adenomas from non-neoplastic colonic polyps among experienced and non-experienced endoscopists. Am J Gastroenterol. 2009. 104:1498–1507.
20. Dixon MF, Genta RM, Yardley JH, Correa P. Classification and grading of gastritis: the updated Sydney System: International Workshop on the Histopathology of Gastritis, Houston 1994. Am J Surg Pathol. 1996. 20:1161–1181.
21. Sauerbruch T, Schreiber MA, Schüssler P, Permanetter W. Endoscopy in the diagnosis of gastritis: diagnostic value of endoscopic criteria in relation to histological diagnosis. Endoscopy. 1984. 16:101–104.
22. Choi IJ. Gastric cancer screening and diagnosis. Korean J Gastroenterol. 2009. 54:67–76.
23. Rabeneck L, Paszat LF, Saskin R. Endoscopist specialty is associated with incident colorectal cancer after a negative colonoscopy. Clin Gastroenterol Hepatol. 2010. 8:275–279.
24. Baxter NN, Sutradhar R, Forbes SS, Paszat LF, Saskin R, Rabeneck L. Analysis of administrative data finds endoscopist quality measures associated with postcolonoscopy colorectal cancer. Gastroenterology. 2011. 140:65–72.
25. Higashi R, Uraoka T, Kato J, Kuwaki K, Ishikawa S, Saito Y, Matsuda T, Ikematsu H, Sano Y, Suzuki S, et al. Diagnostic accuracy of narrow-band imaging and pit pattern analysis significantly improved for less-experienced endoscopists after an expanded training program. Gastrointest Endosc. 2010. 72:127–135.
26. Chang CC, Hsieh CR, Lou HY, Fang CL, Tiong C, Wang JJ, Wei IV, Wu SC, Chen JN, Wang YH. Comparative study of conventional colonoscopy, magnifying chromoendoscopy, and magnifying narrow-band imaging systems in the differential diagnosis of small colonic polyps between trainee and experienced endoscopist. Int J Colorectal Dis. 2009. 24:1413–1419.