Abstract
Background
Carcinoembryonic antigen (CEA) is one of the tumor markers available for evaluating disease progression status after initial therapy and monitoring subsequent treatment modalities in colorectal, gastrointestinal, lung, and breast carcinoma. We evaluated the correlations and differences between widely used, automated CEA immunoassays at four different medical laboratories.
Methods
In total, 393 serum samples with CEA ranging from 3.0 to 1,000 ng/mL were analyzed on ADVIA Centaur XP (Siemens Diagnostics, Tarrytown, NY, USA), ARCHITECT i2000sr (Abbott Diagnostics, Abbott Park, IL, USA), Elecsys E170 (Roche Diagnostics, Indianapolis, IN, USA), and UniCel DxI800 (Beckman Coulter, Fullerton, CA, USA), and the results were compared. Deming regression, Passing-Bablok regression, and Bland-Altman analyses were performed to evaluate the correlation and the % differences among these assays.
Results
Deming regression analysis of data from Elecsys E170 and UniCel DxI800 showed good correlation (y=3.1615+0.8970x). According to the Bland-Altman plot, no statistically significant bias (−1.78 ng/mL [95% confidence interval: −4.02 to 0.46]) was observed between Elecsys E170 and UniCel DxI800. However, the relative differences in CEA concentrations between assays exceeded the acceptable limit of 30%. Regarding the agreement of positivity at a cut-off value of 5.0 ng/mL, ARCHITECT i2000sr and Elecsys E170 showed the highest agreement (95.2%), whereas ADVIA Centaur XP and ARCHITECT i2000sr showed the lowest agreement (70.7%).
Tumor markers are useful for evaluating disease progression status after initial therapy and monitoring subsequent treatment modalities [1, 2, 3, 4, 5]. Carcinoembryonic antigen (CEA) is one of the longest known tumor antigens [6], and is a marker for colorectal, gastrointestinal, lung, and breast carcinoma [7]. With increasing incidence and prevalence of cancers, the CEA immunoassay workload in medical laboratories has increased. Because CEA concentration is used to monitor treatment responses and recurrences of various cancers, more sensitive, specific, reproducible, and interchangeable assays are needed to manage cancer patients.
Immunoassays quantify biologically relevant molecules based on the specificity and selectivity of antibody reagents [8, 9]. Significant variability in results can arise from the statistical model used to fit the calibration curve on which quantification is based. Therefore, it is important to choose an appropriate curve-fitting model for calibration curves and to consider all calibration curve-related factors, including quality and stability of reference standards, quality and stability of reagents, and statistical validity of the calibration curve [10, 11]. Although various CEA assays with different principles, including chemiluminescence immunoassay (CLIA), enzyme immunoassay, radioimmunoassay, fluorescence immunoassay, and lateral flow immunoassay, have been introduced, automated CLIA analyzers with high sensitivity and high throughput are currently the most widely used [12, 13, 14]. Despite ongoing standardization efforts, CEA concentrations from different manufacturers can vary owing to the lack of accurate calibration as well as differences in assay principle, the epitope used, antibody specificities, and the reagents used. Previous studies using individual samples and standard materials have reported that harmonization of CEA assays is far from being realized [15, 16].
This study aimed to compare four automated CEA immunoassays and to assess the degree of harmonization among these four analyzers.
In total, 393 serum samples with high CEA concentrations were obtained from four laboratories. The samples were subjected to routine CEA quantification at all four laboratories using different CLIAs. Sera with CEA concentrations of 3.0–1,000 ng/mL were randomly collected between March 2014 and February 2015. The leftover samples after routine CEA tests were aliquoted into 6–10 new tubes and stored immediately at −70℃ until analysis. This is consistent with the manufacturers' recommendations for sample management, which state that samples should be frozen at or below −20℃ if they are not assayed within 2–7 days. Samples were transported in the frozen state. The declared sample stability and storage conditions are given in Table 1. This study was approved by the Institutional Review Board of The Catholic University of Korea (XC14SIMI0069K).
The automated immunoassays used at the medical laboratories were ADVIA Centaur XP (Siemens Diagnostics, Tarrytown, NY, USA) at Seoul St. Mary's Hospital (Seoul, Korea), ARCHITECT i2000sr (Abbott Diagnostics, Abbott Park, IL, USA) at St. Vincent's Hospital (Suwon, Korea), Elecsys E170 (Roche Diagnostics, Indianapolis, IN, USA) at Daejeon St. Mary's Hospital (Daejeon, Korea), and UniCel DxI800 (Beckman Coulter, Fullerton, CA, USA) at Inchon St. Mary's Hospital (Inchon, Korea). The measuring ranges of the four assays were as follows: 0.5–100 ng/mL for ADVIA Centaur XP, 0.5–1,500 ng/mL for ARCHITECT i2000sr, 0.2–1,000 ng/mL for Elecsys E170, and 0.1–1,000 ng/mL for UniCel DxI800. To evaluate the effect of dilution on the CEA results, comparisons between assays were performed separately for samples having concentrations <100 ng/mL and within the measurement range of all four immunoassays. Serum samples were thawed immediately before analysis, mixed thoroughly, and checked for clots. CEA concentrations were quantified concurrently in the same batch between 7 and 14 days by using each immunoassay based on the principle of electrochemiluminescence detection. Samples with measured concentrations exceeding the analytical measurement range were diluted on-board according to the manufacturers' recommendations. Performance characteristics of the four automated CEA immunoassay analyzers according to information provided by the manufacturers are summarized in Table 1.
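The measuring ranges listed above determine which results can be compared without dilution. The following Python sketch derives the range shared by all four analyzers and flags results that would need dilution on a given analyzer. The function names and the example value of 250 ng/mL are hypothetical and chosen only for illustration.

```python
# Measuring ranges in ng/mL, as stated in the text for each analyzer.
MEASURING_RANGES = {
    "ADVIA Centaur XP": (0.5, 100.0),
    "ARCHITECT i2000sr": (0.5, 1500.0),
    "Elecsys E170": (0.2, 1000.0),
    "UniCel DxI800": (0.1, 1000.0),
}


def common_range(ranges):
    """Return the (lower, upper) interval covered by every analyzer."""
    lower = max(low for low, _ in ranges.values())
    upper = min(high for _, high in ranges.values())
    if lower > upper:
        raise ValueError("the measuring ranges do not overlap")
    return lower, upper


def needs_dilution(value, analyzer, ranges=MEASURING_RANGES):
    """True if a result exceeds the upper measuring limit of the analyzer."""
    return value > ranges[analyzer][1]


print(common_range(MEASURING_RANGES))
# Hypothetical result of 250 ng/mL: above the ADVIA Centaur XP upper limit,
# but inside the Elecsys E170 measuring range.
print(needs_dilution(250.0, "ADVIA Centaur XP"))
print(needs_dilution(250.0, "Elecsys E170"))
```

The shared interval is bounded by the narrowest range, which here is that of ADVIA Centaur XP.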
Normality was assessed using Kolmogorov-Smirnov and Shapiro-Wilk tests. Deming regression with a constant CV of 5% and Passing-Bablok regression analyses were performed to identify proportional and systematic bias [17]. Because both tests indicated that the CEA data were not normally distributed, Bland-Altman plots were displayed as relative difference plots with clinically acceptable bias limits of 30% according to a previous recommendation [17]. CEA concentrations of 0.5–100 ng/mL are within the overlapping analytical measuring range, and the results were regarded as valid without further sample dilution. Therefore, the results from these samples were compared separately. MedCalc Statistical Software Version 17.6 (MedCalc software, Ostend, Belgium) was used for statistical analysis. Statistical significance was accepted at P <0.05.
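For readers unfamiliar with Deming regression, the following Python sketch shows the basic closed-form estimator. It is not the MedCalc procedure used in this study: it assumes a simple unweighted Deming fit with an error-variance ratio `delta` of 1 (equal measurement error in both methods) and does not reproduce the constant-CV option. The function name `deming_fit` and the data points are hypothetical. The points are synthetic values placed exactly on the line y=3.1615+0.8970x reported in the abstract, so the fit should recover that line.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Simple (unweighted) Deming regression.

    delta is the assumed ratio of the error variance of y to that of x;
    delta = 1 corresponds to orthogonal regression.
    Returns (intercept, slope).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("covariance is zero; slope is undefined")
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, error-free points on the reported line (illustration only).
x_values = [5.0, 10.0, 20.0, 50.0, 100.0]
y_values = [3.1615 + 0.8970 * xi for xi in x_values]
intercept, slope = deming_fit(x_values, y_values)
print(f"y = {intercept:.4f} + {slope:.4f}x")
```

Unlike ordinary least squares, this estimator allows for measurement error in both methods. In this context, a slope different from 1 indicates proportional bias and an intercept different from 0 indicates systematic (constant) bias.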
The CEA concentrations in the 393 serum samples obtained from the four automated immunoassays were as follows: ADVIA Centaur XP (median, 7.4; range, 1.4–636.6 ng/mL), ARCHITECT i2000sr (12.8; 4–1,134), Elecsys E170 (12.1; 3–913.8), and UniCel DxI800 (10.2; 2.1–885.8). CEA concentrations measured by ARCHITECT i2000sr were found to be the highest, followed by those measured by Elecsys E170, UniCel DxI800, and ADVIA Centaur XP. Results of between-assay comparisons are shown in Table 2. Deming regression coefficients for these CEA assays varied from 0.6335 to 1.2895 (Fig. 1). There was no linear relationship for ARCHITECT i2000sr vs Elecsys E170 and Elecsys E170 vs UniCel DxI800 (Cusum test for linearity, P <0.05). Therefore, Passing-Bablok analysis was not applicable to these comparisons. According to the Bland-Altman plot, no statistically significant bias (−1.78 ng/mL [95% confidence interval: −4.02 to 0.46]) was observed between Elecsys E170 and UniCel DxI800. The mean % difference in CEA concentrations by Bland-Altman analysis ranged from −54.5 to 21.3%. The mean % differences between ARCHITECT i2000sr and Elecsys E170 and between Elecsys E170 and UniCel DxI800 were as low as 10.5% and 10.9%, respectively (Table 2). On the other hand, all six pairwise comparisons demonstrated % differences exceeding the acceptable limit of 30%. When the median difference (%) between assays was employed, the 2.5th to 97.5th percentile of median difference (%) also exceeded the acceptable limit of 30% (Fig. 2).
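The relative differences reported above can be illustrated with a short Python sketch. It assumes the usual Bland-Altman definition of the % difference, namely the difference between two methods divided by their mean, and it assumes limits of agreement of mean ± 1.96 SD. The exact settings used in MedCalc for this study are not stated in the text. The function names and the paired values are hypothetical.

```python
from statistics import mean, stdev


def relative_differences(a, b):
    """Percent difference of each pair relative to the pair mean."""
    return [100.0 * (ai - bi) / ((ai + bi) / 2.0) for ai, bi in zip(a, b)]


def bland_altman_percent(a, b, acceptable_limit=30.0):
    """Mean % difference, limits of agreement, and a check against the limit."""
    diffs = relative_differences(a, b)
    m = mean(diffs)
    sd = stdev(diffs)
    lower = m - 1.96 * sd
    upper = m + 1.96 * sd
    return {
        "mean_percent_difference": m,
        "lower_limit": lower,
        "upper_limit": upper,
        # True only if both limits of agreement stay inside +/- the limit.
        "within_acceptable_limit": lower >= -acceptable_limit
        and upper <= acceptable_limit,
    }


# Hypothetical paired CEA results (ng/mL) from two assays.
assay_a = [4.2, 6.8, 12.5, 33.0, 95.0]
assay_b = [5.9, 8.1, 12.9, 41.5, 88.0]
print(bland_altman_percent(assay_a, assay_b))
```

Expressing differences as percentages keeps high-concentration samples from dominating the plot, which matters for a skewed analyte such as CEA.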
Twenty-four (6.1%) serum samples that had CEA concentrations over 100 ng/mL by ADVIA Centaur XP were diluted by a factor of 10. When we divided the samples into two subgroups with CEA concentrations <100 ng/mL and ≥100 ng/mL, mean differences (%) of ADVIA Centaur XP against mean CEA concentrations were similar in both subgroups (−31.1% and −26.5%, respectively). In Bland-Altman and Passing-Bablok regression analyses, the differences of CEA concentrations among the six pairwise comparisons exceeded the acceptable limit of 30%, but diluent matrix effects were not detected (Table 3).
We used a cut-off value of 5.0 ng/mL to categorize CEA data. Of all samples, 69.7% (274/393), 98.9% (389/393), 95.2% (374/393), and 87.3% (343/393) had CEA concentrations above 5.0 ng/mL when they were tested with the ADVIA Centaur XP, ARCHITECT i2000sr, Elecsys E170, and UniCel DxI800 analyzers, respectively. When we analyzed the agreement between assays based on categorical data, we observed the highest agreement between ARCHITECT i2000sr and Elecsys E170 (95.2%, 374/393) and the lowest agreement between ADVIA Centaur XP and ARCHITECT i2000sr (70.7%, 278/393) (Table 4).
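The categorical agreement described above can be sketched in Python as follows. The sketch assumes that agreement means the proportion of samples that both assays classify identically (both above or both not above the cut-off). The function name and the four paired values are hypothetical. The final two lines simply recompute the reported percentages from the counts given in the text.

```python
def positivity_agreement(a, b, cutoff=5.0):
    """Count and percentage of samples classified identically by two assays.

    A result is treated as positive when it is above the cut-off.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    concordant = sum((ai > cutoff) == (bi > cutoff) for ai, bi in zip(a, b))
    return concordant, 100.0 * concordant / len(a)


# Hypothetical paired results (ng/mL): two concordant and two discordant pairs.
assay_a = [3.0, 6.0, 7.0, 4.9]
assay_b = [4.0, 5.5, 4.0, 5.1]
print(positivity_agreement(assay_a, assay_b))

# Counts reported in the text, out of 393 samples.
print(round(100 * 374 / 393, 1))  # ARCHITECT i2000sr vs Elecsys E170 -> 95.2
print(round(100 * 278 / 393, 1))  # ADVIA Centaur XP vs ARCHITECT i2000sr -> 70.7
```

Values close to the cut-off, such as 4.9 and 5.1 ng/mL in the last pair, are classified differently even though the numerical difference is small. This is consistent with the low-concentration discrepancies noted in this study.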
This study aimed to compare four widely used automated CEA assays. Approximately 80% of healthy subjects have a CEA concentration lower than 3 ng/mL [14]. Therefore, serum samples with a CEA concentration >3.0 ng/mL as measured by any of the four immunoassays were randomly selected for analysis. The Bland-Altman plots showed that many samples fell well outside the linearity limits, and CEA concentrations varied between assays. In the absence of a reference method for CEA measurement, the clinically acceptable percentage difference between assays was defined as 30%. The maximum differences (%) for all six pairwise comparisons exceeded this limit. On comparing mean CEA concentrations, ARCHITECT i2000sr overestimated while ADVIA Centaur XP underestimated CEA concentrations, mainly for low-concentration samples (<10 ng/mL). Elecsys E170 and UniCel DxI800 showed good correlation by Deming regression analysis and in Bland-Altman plots. This might be because these two immunoassay analyzers use a single monoclonal antibody for a two-step sandwich immunoassay, while the other two assays use one or more different antibodies. Several comparative studies of CEA assays have been reported [12, 15, 18, 19, 20]. In one of these studies, CEA data obtained by UniCel DxI800 showed the highest degree of correlation with those measured by ADVIA Centaur XP [slope (95% CI), 0.910 (0.883 to 0.947); intercept (95% CI), −0.240 (−0.362 to −0.171)] [19]. On the other hand, CEA concentrations from UniCel DxI800 were found to be the highest in one study [15], which differed from the results of the present study. CEA concentrations from ARCHITECT i2000sr and Elecsys E170 are reportedly higher than those from Siemens ADVIA Centaur XP, in agreement with the results of this study.
Differences in results from comparative studies might be due to diluent matrix effects or interactions between components of blood collection tubes and blood samples [21]. Quantifying samples with CEA concentrations beyond the upper limit of the measuring range might be highly subjective because of dilution effects. In the present study, the dilution of samples did not seem to influence the data, and the matrix effect of the diluent for ADVIA Centaur XP was minimal. Therefore, variability in CEA measurements might be mainly due to harmonization problems.
To harmonize CEA concentrations, an international reference standard for CEA (code 73/601) was established by the World Health Organization in 1975 [22]. However, instrument-specific calibration and working standards in current immunoassays are less traceable to the international standard, resulting in inconsistent CEA results between assays [15, 23]. In the present study, the calibrators provided by the four manufacturers, except for Elecsys E170, were not standardized against the WHO 1st international reference preparation 73/601.
In general, a disadvantage of CEA measurement is the high rate of false positives. In the current study, the highest discrepancies between assays were noted for samples with low CEA concentrations. When we used 5.0 ng/mL as the cut-off value for serum CEA, the agreement of positivity ranged from 70.7% to 95.2%. The agreement between ADVIA Centaur XP and ARCHITECT i2000sr was the lowest. Reference intervals of CEA can vary by ethnicity, assay method, and many other factors [14, 24, 25, 26]. Therefore, different reference ranges need to be established for each immunoassay, and using the same immunoassay is recommended for follow-up of CEA variations. Precautions should be taken when changing the CEA assay because CEA concentrations from different automated immunoassays are not comparable. In addition, clinicians should be aware of changes in analyzers and techniques used for CEA measurement and consider between-method agreement and CV as evidenced by external quality assessment data.
Potential limitations of this study include the relatively small number of samples studied, the non-normal distribution of the samples, and the lack of information on the patients and on pre-analytical errors related to the sample tube, sample storage, or transportation. These limitations should be considered when interpreting the present results. Despite these limitations, our results demonstrated that CEA concentrations might vary among the four immunoassays currently in use, and standardization and further harmonization for CEA testing are needed.
In conclusion, agreements between automated CEA immunoassays are variable and individual CEA concentrations can differ significantly between assays. Therefore, reference ranges should be established for each immunoassay or the widely used cut-off value of 5.0 ng/mL should be employed, and the reference range should be validated in laboratories to decrease the false positive rate.
Acknowledgment
This work was supported by the Technology Innovation Program (No: 10049771, Development of Highly-Specialized Platform for IVD Medical Devices) funded by the Ministry of Trade, Industry & Energy, Korea.
References
1. Jeon BG, Shin R, Chung JK, Jung IM, Heo SC. Individualized cutoff value of the preoperative carcinoembryonic antigen level is necessary for optimal use as a prognostic marker. Ann Coloproctol. 2013; 29:106–114. PMID: 23862128.
2. Peng Y, Wang L, Gu J. Elevated preoperative carcinoembryonic antigen (CEA) and Ki67 is predictor of decreased survival in IIA stage colon cancer. World J Surg. 2013; 37:208–213. PMID: 23052808.
3. Lawicki S, Glazewska EK, Sobolewska M, Bedkowska GE, Szmitkowski M. Plasma levels and diagnostic utility of macrophage colony-stimulating factor, matrix metalloproteinase-9, and tissue inhibitor of metalloproteinases-1 as new biomarkers of breast cancer. Ann Lab Med. 2016; 36:223–229. PMID: 26915610.
4. Kim CG, Ahn JB, Jung M, Beom SH, Heo SJ, Kim JH, et al. Preoperative serum carcinoembryonic antigen level as a prognostic factor for recurrence and survival after curative resection followed by adjuvant chemotherapy in stage III colon cancer. Ann Surg Oncol. 2017; 24:227–235. PMID: 27699609.
5. Maeda R, Suda T, Hachimaru A, Tochii D, Tochii S, Takagi Y. Clinical significance of preoperative carcinoembryonic antigen level in patients with clinical stage IA non-small cell lung cancer. J Thorac Dis. 2017; 9:176–186. PMID: 28203421.
6. Gold P. Demonstration of tumor-specific antigens in human colonic carcinomata by immunological tolerance and absorption techniques. J Exp Med. 1965; 121:439–462. PMID: 14270243.
7. Saito G, Sadahiro S, Kamata H, Miyakita H, Okada K, Tanaka A, et al. Monitoring of serum carcinoembryonic antigen levels after curative resection of colon cancer: cutoff values determined according to preoperative levels enhance the diagnostic accuracy for recurrence. Oncology. 2017; 92:276–282. PMID: 28178692.
8. Choi SI, Jang MA, Jeon BR, Shin HB, Lee YK, Lee YW. Clinical usefulness of human epididymis protein 4 in lung cancer. Ann Lab Med. 2017; 37:526–530. PMID: 28840992.
9. Cho YY, Chun S, Lee SY, Chung JH, Park HD, Kim SW. Performance evaluation of the serum thyroglobulin assays with immunochemiluminometric assay and immunoradiometric assay for differentiated thyroid cancer. Ann Lab Med. 2016; 36:413–419. PMID: 27374705.
10. Karen LC, Viswanath D. Immunoassay methods. In: Sittampalam GS, Coussens NP, editors. Assay guidance manual [Internet]. Bethesda, MD: Eli Lilly & Company and the National Center for Advancing Translational Sciences; 2014. p. 223–266.
11. Miller WG, Tate JR, Barth JH, Jones GR. Harmonization: the sample, the measurement, and the report. Ann Lab Med. 2014; 34:187–197. PMID: 24790905.
12. Matsushita H, Xu J, Kuroki M, Kondo A, Inoue E, Teramura Y, et al. Establishment and evaluation of a new chemiluminescent enzyme immunoassay for carcinoembryonic antigen adapted to the fully automated ACCESS system. Eur J Clin Chem Clin Biochem. 1996; 34:829–835. PMID: 8933107.
13. Akbas N, Schryver PG, Algeciras-Schimnich A, Baumann NA, Block DR, Budd JR, et al. Evaluation of Beckman Coulter DxI 800 immunoassay system using clinically oriented performance goals. Clin Biochem. 2014; 47:158–163.
14. Zhang GM, Guo XX, Ma XB, Zhang GM. Reference intervals of alphafetoprotein and carcinoembryonic antigen in the apparently healthy population. Med Sci Monit. 2016; 22:4875–4880. PMID: 27941709.
15. Zhang K, Huo H, Lin G, Yue Y, Wang Q, Li J. A long way to go for the harmonization of four immunoassays for carcinoembryonic antigen. Clin Chim Acta. 2016; 454:15–19. PMID: 26721316.
16. Sturgeon C. Standardization of tumor markers-priorities identified through external quality assessment. Scand J Clin Lab Invest Suppl. 2016; 245:S94–S99. PMID: 27542005.
17. Twomey PJ. How to use difference plots in quantitative method comparison studies. Ann Clin Biochem. 2006; 43:124–129. PMID: 16536914.
18. Park Y, Park Y, Park J, Kim HS. Evaluation of the UniCel DxI 800 immunoassay analyzer in measuring five tumor markers. Yonsei Med J. 2012; 53:557–564. PMID: 22477000.
19. Falzarano R, Viggiani V, Michienzi S, Longo F, Tudini S, Frati L, et al. Evaluation of a CLEIA automated assay system for the detection of a panel of tumor markers. Tumour Biol. 2013; 34:3093–3100. PMID: 23775009.
20. Bilic-Zulle L. Comparison of methods: Passing and Bablok regression. Biochem Med (Zagreb). 2011; 21:49–52. PMID: 22141206.
21. Bowen RA, Remaley AT. Interferences from blood collection tube components on clinical chemistry assays. Biochem Med (Zagreb). 2014; 24:31–44. PMID: 24627713.
22. Laurence DJ, Turberville C, Anderson SG, Neville AM. First British standard for carcinoembryonic antigen (CEA). Br J Cancer. 1975; 32:295–299. PMID: 822862.
23. Sorensen CG, Karlsson WK, Pommergaard HC, Burcharth J, Rosenberg J. The diagnostic accuracy of carcinoembryonic antigen to detect colorectal cancer recurrence-A systematic review. Int J Surg. 2016; 25:134–144. PMID: 26700203.
24. Bjerner J, Hogetveit A, Wold Akselberg K, Vangsnes K, Paus E, Bjoro T, et al. Reference intervals for carcinoembryonic antigen (CEA), CA125, MUC1, Alfa-foeto-protein (AFP), neuron-specific enolase (NSE) and CA19.9 from the NORIP study. Scand J Clin Lab Invest. 2008; 68:703–713. PMID: 18609108.
25. Qin X, Lin L, Mo Z, Lv H, Gao Y, Tan A, et al. Reference intervals for serum alpha-fetoprotein and carcinoembryonic antigen in Chinese Han ethnic males from the Fangchenggang Area Male Health and Examination Survey. Int J Biol Markers. 2011; 26:65–71. PMID: 21337313.
26. Lao X, Yang D, Mo Z, Gao Y, Deng Y, Qin X, et al. Reference intervals for alpha-fetoprotein (AFP) and carcinoembryonic antigen (CEA) in Guangxi Zhuang ethnic males from the FAMHES Project. Clin Lab. 2016; 62:955–961. PMID: 27349024.