Abstract
Background
Immunochromatographic point-of-care (POC) devices are widely used by laboratories and lay users for urinary human chorionic gonadotropin (hCG) detection. Performance evaluations of pregnancy POC devices are rarely published. We performed an analytical and clinical validation of the newly introduced AllCheck hCG Card assay and compared it with the Alere hCG Cassette comparative assay.
Methods
The analytical performance of the assay was evaluated using an international standard material for hCG, as per the protocol recommended in the Clinical and Laboratory Standards Institute (CLSI) guideline. Clinical validation and comparison study with the comparative method were performed with remnant urine samples from pregnant and non-pregnant women.
Results
Probit analysis showed an analytical sensitivity of 15.82 mIU/mL. The precision of the assay was validated at a threshold of 30%. Cross-reactivity with luteinizing hormone, follicle-stimulating hormone, and thyroid-stimulating hormone was not observed. Comparison with the comparative assay showed a negative percent agreement of 100.0% (95% confidence interval [CI]: 92.9%-100.0%) and a positive percent agreement of 96.4% (95% CI: 89.9%-98.8%). Cohen’s kappa value was 0.952 (95% CI: 0.899-1.000).
Conclusions
Overall, we validated the performance of the urine hCG POC device and suggest that probit regression is suitable for qualitative tests other than molecular tests. The AllCheck hCG Card device satisfied the demanding standards suggested by the CLSI guideline and was suitable for clinical use.
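Abstract (translated from the Korean)
Background
Qualitative point-of-care tests for human chorionic gonadotropin (hCG) based on immunochromatography are widely used by clinical laboratories and lay users to determine pregnancy. However, studies evaluating the analytical performance of qualitative hCG tests are scarce. We therefore performed an analytical and clinical validation of the newly introduced AllCheck hCG Card assay and compared it with the Alere hCG Cassette assay.
Methods
Analytical performance was evaluated using an international standard material for hCG, following the protocol recommended in the Clinical and Laboratory Standards Institute (CLSI) guideline. Clinical validation and comparison with the existing method were performed using remnant urine samples from women with confirmed pregnancy and from non-pregnant patients.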
Human chorionic gonadotropin (hCG) is a 37.9 kDa glycoprotein hormone that supports the maintenance of the corpus luteum and fetal growth [1]. It comprises an alpha (α) subunit that is noncovalently linked to a beta (β) subunit [2]. The alpha subunit of hCG (hCGα) is identical to that of thyroid-stimulating hormone (TSH), luteinizing hormone (LH), and follicle-stimulating hormone (FSH), while the beta subunit (hCGβ) is unique. Most urinary hCG is in the intact form in early pregnancy [3], and the core fragment of hCGβ becomes predominant by 10 weeks of pregnancy [4, 5].
Serum and urine hCG detection has been widely used to diagnose pregnancy. In particular, point-of-care (POC) devices that apply immunochromatography to detect hCG in the urine are widely used in laboratories. However, the performance of these devices has not been fully evaluated. Understanding the limitations of urine hCG POC devices is important in the clinical setting because the exclusion of pregnancy is critical in patient management. Urine hCG tests can yield false-negative results, with early gestational age as the most common cause, followed by the hook effect caused by hCG variants [6]. Detection variability caused by hCG variants present in the urine has been reported among widely used POC devices [7]. Excess hCG was also reported to cause false-negative results owing to the high-dose hook effect in the test [8]. Method evaluation of qualitative hCG assays, including inspection of the false-negative and false-positive results they produce, is therefore warranted, and users should always be informed of the limitations of the method, including the limit of detection (analytical sensitivity) of the urine hCG POC device. Meanwhile, a recommendation for the validation of home pregnancy tests was published by a European group [9]. This recommendation requires manufacturers to define the analytical performance and the minimum number of urine specimens tested. In particular, it requires the analytical sensitivity to be defined as the lowest concentration that yields a positive result ≥99% of the time [9]. The Clinical and Laboratory Standards Institute (CLSI), in turn, provides a guideline for evaluating the performance of qualitative tests. It describes validation protocols for precision and other performance parameters (sensitivity, specificity, and agreement), together with the minimum number of specimens required [10].
In this study, we evaluated the analytical performance of the newly introduced AllCheck hCG Card (Calth, Inc., Seongnam, Korea) POC device and compared it with the Alere hCG Cassette (Alere San Diego, Inc., San Diego, CA, USA) comparative assay. Probit regression was used to identify the imprecision curve of the AllCheck hCG Card assay.
Refrigerated urine samples were kept at ambient temperature for 15-30 min before testing. Qualitative urine hCG analysis was performed using two POC devices, the Alere hCG Cassette and the AllCheck hCG Card assay. Both devices use a chromatographic immunoassay for qualitative detection of urinary hCG, and both manufacturers claim an analytical sensitivity of 25 mIU/mL. Approximately 100 µL of urine (3 drops from the pipette included in the kit) was transferred to the well of the AllCheck hCG Card and the Alere hCG Cassette following the manufacturers' protocols. Results were read after 5-10 minutes for the AllCheck hCG Card and after 3-4 minutes for the Alere hCG Cassette assay. Results were classified as invalid if no line was observed on the control lane, positive if lines were observed on both the control and the testing lane, and negative if a line appeared only on the control lane. Samples with invalid results were retested. Equivocal results were considered positive. This research was considered a quality assessment study, and informed consent was waived. The Institutional Review Board of National Health Insurance Service Ilsan Hospital approved the study (IRB No. NHIMC 2021-07-014).
The World Health Organization (WHO) 6th International Standard for hCG (NIBSC, 18/244) was first diluted in 1 mL of 1× bovine serum albumin in Tris-buffered saline (TBS) buffer for 10 minutes. The concentration of the stock solution was measured by an immunoassay, and the stock was then serially diluted with urine from non-pregnant female participants according to the measured concentration. To construct a probit regression model, the proportion of positive results was obtained at target hCG concentrations of 25, 20, 17.5, 15, 12.5, and 10 mIU/mL, around the claimed detection limit. Each concentration was tested with 10 replicates per day on 3 consecutive days. The analytical sensitivity of the assay was defined as the concentration yielding at least 99% positive results, expressed as C99, and was estimated by probit regression on the results at the tested concentrations. For a more practical validation of precision, C50 was approximated as the concentration at which positive and negative results were split approximately 50:50. Precision was then validated by performing 40 replicates over 4 consecutive days at three concentrations, C50−20%, C50, and C50+20%, according to the CLSI guideline EP12-A2 [10]. According to the guideline, if at least 90% of the results are negative at the lower concentration and at least 90% are positive at the upper concentration, the C5–C95 interval is bounded by the interval [C50−20%, C50+20%] with 86% confidence. If the results did not meet this acceptance criterion, the experiment was repeated with the interval widened from 20% to 30%. Furthermore, precision was also validated using the C50, C5, and C95 values obtained from the probit regression.
Cross-reactivity was validated by adding WHO International Standards of 1,000 mIU/mL of LH (NIBSC, 81/535), 1,000 µIU/mL of TSH (NIBSC, 81/565), and 1,000 mIU/mL of FSH (NIBSC, 83/575) to samples with hCG concentrations of 0 and 25 mIU/mL prepared in non-pregnant urine. Each sample was read in triplicate. The influence of substances expected to interfere with the hCG test was also assessed. These included substances that are commonly present in urine or that could affect the color reaction of the hCG test: acetaminophen (20 mg/dL), caffeine (20 mg/dL), acetylsalicylic acid (20 mg/dL), ascorbic acid (20 mg/dL), glucose (2 g/dL), ibuprofen (20 mg/dL), albumin (10 mg/dL), ampicillin (20 mg/dL), bilirubin (1 mg/dL), brompheniramine (20 mg/dL), hemoglobin (1 mg/dL), and ethanol (1%) (all from Sigma-Aldrich, St. Louis, MO). Each substance was added to samples with hCG concentrations of 0 and 25 mIU/mL prepared in urine from non-pregnant women, and each sample was tested in triplicate.
Clinical sensitivity and specificity were assessed by testing clinical samples. Remnant clinical urine samples from 100 patients were collected from July 2021 to April 2022 according to the following criteria: 1) positive hCG samples were collected from women with pregnancy confirmed by serum hCG test, ultrasound, or medical history; 2) negative hCG samples were collected from women without any history of medication or disease; 3) samples had been stored in the refrigerator for less than three days or under −20°C for less than 6 months. An additional 33 samples with missing medical information were collected for method comparison. The exclusion criteria were as follows: 1) remnant volume less than 400 µL; 2) visible interfering materials in the urine; 3) red blood cells present in the urine; 4) age under 18; 5) sample stored under inappropriate conditions; 6) insufficient medical information about the sample.
Probit regression was used to estimate the concentrations corresponding to specific probabilities of a positive result. It was performed using MedCalc Statistical Software version 19.2.6 (MedCalc Software bv, Ostend, Belgium; https://www.medcalc.org; 2020). Agreement between the assays was assessed using Cohen's kappa.
A probit model was constructed by testing 30 replicates over 3 consecutive days at each concentration around the claimed limit (25 mIU/mL). This model was used to approximate C99, C95, C50, and C5. The proportions of positive results at each concentration are shown in Table 1. The probit regression produced the following fitted model: probit(probability) = 0.665 × concentration − 8.196. The analytical sensitivity was defined as the C99 concentration, at which the test would produce 99% positive results. The fitted model gave a C99 of 15.82 mIU/mL (95% confidence interval [CI], 14.86–17.57) as the analytical sensitivity (Fig. 1).
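As a rough check on these estimates, the fitted line can be inverted for any target probability. The sketch below is not the MedCalc procedure used in the study; it assumes that the probit is the plain standard normal quantile (no offset of 5 added) and uses the rounded coefficients quoted above. The function name `concentration_at` is illustrative only.

```python
from statistics import NormalDist

SLOPE = 0.665       # per mIU/mL, rounded coefficient from the fitted model
INTERCEPT = -8.196  # rounded intercept from the fitted model

def concentration_at(p: float) -> float:
    """Concentration (mIU/mL) at which the fitted line predicts a positive rate p."""
    z = NormalDist().inv_cdf(p)  # probit of p, assumed to be the standard normal quantile
    return (z - INTERCEPT) / SLOPE

for label, p in [("C5", 0.05), ("C50", 0.50), ("C95", 0.95), ("C99", 0.99)]:
    print(f"{label}: {concentration_at(p):.2f} mIU/mL")
```

Under these assumptions the inverted line gives values that agree with the reported C5, C50, C95, and C99 to two decimal places. The confidence interval for C99 cannot be recovered this way, because it depends on the underlying replicate data.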
Precision was evaluated by testing whether the interval [C50−20%, C50+20%] contained the interval [C5, C95]. From the probit regression, we obtained 9.85, 12.32, and 14.80 mIU/mL as the values of C5, C50, and C95, respectively. The [C50−20%, C50+20%] interval, [9.86, 14.78] mIU/mL, did not contain the C5−C95 interval, [9.85, 14.80] mIU/mL. We therefore widened the threshold to 30% of C50. The resulting interval, [8.62, 16.02] mIU/mL, bounded the C5−C95 interval, confirming the precision of the AllCheck hCG Card assay at a threshold of 30%.
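The containment test described here is simple arithmetic and can be sketched as follows. The values of C5, C50, and C95 are the probit estimates reported in the text; the helper names `relative_interval` and `contains` are hypothetical.

```python
def relative_interval(c50: float, fraction: float) -> tuple[float, float]:
    """Return [C50 - fraction*C50, C50 + fraction*C50]."""
    return (c50 * (1 - fraction), c50 * (1 + fraction))

def contains(outer: tuple[float, float], inner: tuple[float, float]) -> bool:
    """True if the inner interval lies entirely within the outer interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

C5, C50, C95 = 9.85, 12.32, 14.80  # mIU/mL, probit estimates reported in the text

for fraction in (0.20, 0.30):
    low, high = relative_interval(C50, fraction)
    verdict = "contains" if contains((low, high), (C5, C95)) else "does not contain"
    print(f"C50 ± {fraction:.0%}: [{low:.2f}, {high:.2f}] {verdict} [C5, C95]")
```

With a 20% threshold the C5−C95 interval overshoots on both sides by only about 0.01 to 0.02 mIU/mL, so the outcome at 20% is sensitive to rounding of the probit estimates.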
As described in the CLSI guideline EP12-A2 [10], for a more practical validation of precision without probit regression, we roughly estimated C50 as 12.5 mIU/mL based on the positive percentage of 53.3% at that concentration (Table 1). We performed 40 replicates over 4 consecutive days at the concentrations C50−20% (10.0 mIU/mL), C50, and C50+20% (15.0 mIU/mL). The tests showed positive percentages of 15%, 50%, and 90%, respectively (Table 1). The proportion at C50 was between 35% and 65%, suggesting that the C50 estimate was appropriate. The positive proportion at C50+20% was at least 90%, which met the criterion; however, the positive proportion at C50−20% (15%) exceeded 10% and did not satisfy the CLSI EP12-A2 guideline [10]. Therefore, the experiment was repeated with a threshold of 30% from C50. The tests showed positive percentages of 7.5%, 52.5%, and 97.5% at C50−30% (8.8 mIU/mL), C50, and C50+30% (16.3 mIU/mL), respectively. This result satisfied the recommendation of the guideline.
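The acceptance rule applied in this paragraph can be written as a small check. This is a sketch of the criteria as stated in the text (positive rate between 35% and 65% at C50, at least 90% negative at the lower concentration, at least 90% positive at the upper concentration), not a reproduction of the guideline itself. The positive counts are back-calculated from the reported percentages of 40 replicates and are therefore an assumption; the function name is hypothetical.

```python
def meets_precision_criteria(pos_low: int, pos_mid: int, pos_high: int, n: int = 40) -> bool:
    """Check the three criteria using integer arithmetic to avoid rounding issues."""
    c50_ok = 35 * n <= 100 * pos_mid <= 65 * n   # 35% to 65% positive at C50
    low_ok = 10 * (n - pos_low) >= 9 * n         # at least 90% negative at the lower level
    high_ok = 10 * pos_high >= 9 * n             # at least 90% positive at the upper level
    return c50_ok and low_ok and high_ok

# Positive counts out of 40, derived from the reported percentages (assumed).
threshold_20 = (6, 20, 36)   # 15%, 50%, 90% at 10.0, 12.5, 15.0 mIU/mL
threshold_30 = (3, 21, 39)   # 7.5%, 52.5%, 97.5% at 8.8, 12.5, 16.3 mIU/mL

print("±20% passes:", meets_precision_criteria(*threshold_20))
print("±30% passes:", meets_precision_criteria(*threshold_30))
```

The ±20% experiment fails only on the lower level, where 34 of 40 results (85%) were negative, which matches the reasoning given for widening the threshold.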
Cross-reactivity was evaluated by the addition of 1,000 mIU/mL LH, 1,000 µIU/mL TSH, and 1,000 mIU/mL FSH to 0 and 25 mIU/mL hCG in urine. Three repeats for each hormone showed consistent negative results for the 0 mIU/mL hCG urine sample and produced consistent positive results for the 25 mIU/mL hCG urine sample. Likewise, the effect of the interference from acetaminophen, caffeine, acetylsalicylic acid, ascorbic acid, glucose, ibuprofen, albumin, ampicillin, bilirubin, brompheniramine, hemoglobin, and ethanol with 0 and 25 mIU/mL hCG in urine was evaluated. Three repeats for each compound produced consistent negative and positive results in 0 and 25 mIU/mL hCG in urine, respectively.
Fifty positive and 50 negative urine samples, randomly selected and with pregnancy status confirmed by serum hCG, ultrasound, or medical history, were tested with the AllCheck hCG Card assay. Positive samples were obtained from women at diverse stages of pregnancy (4 to 38 weeks). The assay showed 98% (49/50, 95% CI, 89.4%-100.0%) clinical sensitivity and 100% (50/50, 95% CI, 92.9%-100.0%) clinical specificity. In addition, we assessed the clinical agreement between the AllCheck hCG Card and the Alere hCG Cassette assay by testing 133 samples, including the previous test set. The two devices showed 100.0% (95% CI, 92.9%-100.0%) negative percent agreement and 96.4% (95% CI, 89.9%-98.8%) positive percent agreement (Table 2). Three samples were discrepant. In one, the candidate assay showed a false-negative result, whereas the comparative assay showed an equivocal result (±). The serum hCG level of this patient was 33.36 mIU/mL and the diagnosis was fever. Serum hCG was not measured for the other two discrepant samples: one patient was diagnosed with acute gastroenteritis, and the other urine sample was collected 9 days after vaginal delivery. Cohen’s kappa value was 0.952 (95% CI, 0.899-1.000), indicating almost perfect agreement between the assays.
Pregnancy tests are routinely performed in emergency medical centers before any procedure that could harm the fetus. These tests include serum or plasma hCG tests and qualitative urine tests. Despite the high accuracy and well-documented clinical applications of quantitative serum/plasma hCG tests, medical staff are hesitant to use them because they are usually performed in a central laboratory. The qualitative urine hCG test, which uses immunochromatography, is a simple and fast way to screen for pregnancy. Although urine hCG POC devices are widely used by lay users and experts, validation studies of the devices are rarely published [11]. Furthermore, a urine hCG concentration of 25 mIU/mL is often stated as the manufacturer-claimed sensitivity without thorough validation. The manufacturer of the AllCheck hCG Card likewise claimed a sensitivity of 25 mIU/mL; however, we showed that the assay had better sensitivity than that claimed by the manufacturer.
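The agreement statistics can be recomputed from a 2×2 table. Table 2 is not reproduced here, so the cell counts below are an assumption inferred from the reported figures (133 samples, three discrepant results in which only the comparative assay was positive, 96.4% positive and 100.0% negative percent agreement). The Wilson score interval is likewise an assumption about how the confidence intervals were obtained; the function names are illustrative.

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def cohen_kappa(both_pos: int, cand_only: int, comp_only: int, both_neg: int) -> float:
    """Cohen's kappa for two raters with binary outcomes."""
    n = both_pos + cand_only + comp_only + both_neg
    observed = (both_pos + both_neg) / n
    cand_pos, comp_pos = both_pos + cand_only, both_pos + comp_only
    expected = (cand_pos * comp_pos + (n - cand_pos) * (n - comp_pos)) / (n * n)
    return (observed - expected) / (1 - expected)

# Inferred counts (assumed): candidate vs. comparative assay.
BOTH_POS, CAND_ONLY, COMP_ONLY, BOTH_NEG = 80, 0, 3, 50

ppa = BOTH_POS / (BOTH_POS + COMP_ONLY)   # agreement on comparator-positive samples
npa = BOTH_NEG / (BOTH_NEG + CAND_ONLY)   # agreement on comparator-negative samples
ppa_low, ppa_high = wilson_interval(BOTH_POS, BOTH_POS + COMP_ONLY)
kappa = cohen_kappa(BOTH_POS, CAND_ONLY, COMP_ONLY, BOTH_NEG)

print(f"PPA = {ppa:.1%} (95% CI {ppa_low:.1%} to {ppa_high:.1%})")
print(f"NPA = {npa:.1%}")
print(f"kappa = {kappa:.3f}")
```

With these inferred counts the sketch gives the same positive percent agreement, interval, and kappa as reported, which suggests the assumed table is at least consistent with the published summary.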
For the analytical performance evaluation of qualitative tests, the validation of precision is described in the CLSI EP12-A2 guideline [10]. It recommends that users obtain the proportion of positive results from 40 repeated tests at each of three concentrations: C50, C50−20%, and C50+20%. The method assumes that the narrower the C5−C95 interval, the better the precision of the assay, because a narrow interval means a smaller gray zone in which the assay produces inconsistent results. When the negative and positive percentages are at least 90% at C50−20% and C50+20%, respectively, the assay is presumed to be precise: with 40 tests, the C5−C95 interval is bounded by the interval [C50−20%, C50+20%] with 86% confidence. The confidence may increase with a higher number of repeats and consistent results. Moreover, the assay’s precision can also be considered validated when the interval [C5, C95] lies within the interval [C50−20%, C50+20%], because C5 and C95 are the concentrations at which the assay gives consistently negative and positive results, respectively. As there is no defined method to obtain the exact C5, C50, and C95, we applied probit regression, which is documented in CLSI EP17-A2, to estimate these points [12]. Probit regression is commonly used in qualitative molecular tests to derive the limit of detection; however, researchers suggest that it could be used in other qualitative tests. Based on a comparison with the results from the EP12-A2 protocol, we propose that the probit method can be applied to the evaluation of chemistry tests. The probit of a probability is the value of the inverse standard normal cumulative distribution function at that probability. Probit regression linearizes the S-shaped imprecision curve so that it can be fitted with a linear model. The guideline suggests at least three data points between C10 and C90, although collecting sufficient data points in this range is difficult [12].
In our study, we validated precision in two ways: a more practical approach using a naïve approximation of C50, and an approach using probit regression. In the experiment with samples around the cutoff, we defined C50 as 12.5 mIU/mL based on its observed positive percentage of 53.3%. This practical method gave a positive proportion of 15% at C50−20% and did not satisfy the precision criterion. Thus, the threshold had to be widened to 30% to validate the precision. Likewise, using probit regression, we estimated C50, C5, and C95 as 12.32, 9.85, and 14.80 mIU/mL, respectively. The interval [C5, C95] was bounded by the interval [C50−30%, C50+30%] but not by [C50−20%, C50+20%]. We concluded that the assay provides precise results at a threshold of 30%.
We confirmed good agreement between the AllCheck hCG Card candidate assay and the Alere hCG Cassette comparative assay. Clinical validation also showed satisfactory sensitivity and specificity. TSH, LH, and FSH, whose alpha subunits are identical to hCGα, showed no cross-reactivity in hCG detection by the assay [1]. Furthermore, substances that could interfere with assays using monoclonal antibodies were tested and did not affect the results.
Our study has a few limitations. First, testing with hCG variants was not performed. The assay uses a combination of monoclonal antibodies targeting intact hCG and the α and β subunits of hCG; thus, there is a risk that variants such as the core fragment of hCGβ could be missed. Second, more data points between C10 and C90 would have produced a better-fitted probit regression model. Laboratories that evaluate tests using probit regression should therefore space the test concentrations more narrowly. Third, the assay’s tolerance to the hook effect was not examined in this study.
In conclusion, we validated the performance of the urine hCG POC device and suggest that probit regression is suitable for qualitative tests other than molecular tests. The AllCheck hCG Card device satisfied the demanding standards suggested by the CLSI guideline and was suitable for clinical use.
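The second limitation can be illustrated with the fitted line itself. The sketch below assumes the rounded model probit(probability) = 0.665 × concentration − 8.196 reported in the Results, with the probit taken as the standard normal quantile, and counts how many of the six target concentrations fall between the model-based C10 and C90. It uses predicted rather than observed proportions, so it is only indicative.

```python
from statistics import NormalDist

SLOPE, INTERCEPT = 0.665, -8.196           # rounded fitted probit line from the text
TESTED = [25, 20, 17.5, 15, 12.5, 10]      # target concentrations, mIU/mL

def predicted_positive_rate(conc: float) -> float:
    """Positive rate predicted by the fitted line at a given concentration."""
    return NormalDist().cdf(SLOPE * conc + INTERCEPT)

def concentration_at(p: float) -> float:
    """Concentration at which the fitted line predicts a positive rate p."""
    return (NormalDist().inv_cdf(p) - INTERCEPT) / SLOPE

c10, c90 = concentration_at(0.10), concentration_at(0.90)
inside = [c for c in TESTED if c10 <= c <= c90]

print(f"Model-based C10 = {c10:.2f}, C90 = {c90:.2f} mIU/mL")
for c in TESTED:
    print(f"{c:>5} mIU/mL: predicted positive rate {predicted_positive_rate(c):.3f}")
print("Tested concentrations between C10 and C90:", inside)
```

Under these assumptions only the 12.5 mIU/mL level lies between C10 and C90, which is consistent with the suggestion that more closely spaced concentrations would improve the fit.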
REFERENCES
1. Stenman UH, Tiitinen A, Alfthan H, Valmu L. 2006; The classification, functions and clinical use of different isoforms of HCG. Hum Reprod Update. 12:769–84. DOI: 10.1093/humupd/dml029. PMID: 16877746.
2. Birken S, Armstrong EG, Kolks MA, Cole LA, Agosto GM, Krichevsky A, et al. 1988; Structure of the human chorionic gonadotropin beta-subunit fragment from pregnancy urine. Endocrinology. 123:572–83. DOI: 10.1210/endo-123-1-572. PMID: 2454811.
3. Kovalevskaya G, Kakuma T, Schlatterer J, O'Connor JF. 2007; Hyperglycosylated HCG expression in pregnancy: cellular origin and clinical applications. Mol Cell Endocrinol. 260-262:237–43. DOI: 10.1016/j.mce.2006.02.021. PMID: 17092638.
4. Norman RJ, Menabawey M, Lowings C, Buck RH, Chard T. 1987; Relationship between blood and urine concentrations of intact human chorionic gonadotropin and its free subunits in early pregnancy. Obstet Gynecol. 69:590–3.
5. Kato Y, Braunstein GD. 1988; Beta-core fragment is a major form of immunoreactive urinary chorionic gonadotropin in human pregnancy. J Clin Endocrinol Metab. 66:1197–201. DOI: 10.1210/jcem-66-6-1197. PMID: 2453528.
6. Herskovits AZ, Chen Y, Latifi N, Ta RM, Kriegel G. 2020; False-negative urine human chorionic gonadotropin testing in the clinical laboratory. Lab Med. 51:86–93. DOI: 10.1093/labmed/lmz039. PMID: 31245816.
7. Cervinski MA, Lockwood CM, Ferguson AM, Odem RR, Stenman UH, Alfthan H, et al. 2009; Qualitative point-of-care and over-the-counter urine hCG devices differentially detect the hCG variants of early pregnancy. Clin Chim Acta. 406:81–5. DOI: 10.1016/j.cca.2009.05.018. PMID: 19477170.
8. Griffey RT, Trent CJ, Bavolek RA, Keeperman JB, Sampson C, Poirier RF. 2013; "Hook-like effect" causes false-negative point-of-care urine pregnancy testing in emergency patients. J Emerg Med. 44:155–60. DOI: 10.1016/j.jemermed.2011.05.032. PMID: 21835572.
9. Sturgeon C, Butler SA, Gould F, Johnson S, Rowlands S, Stenman UH, et al. 2021; Recommendations for validation testing of home pregnancy tests (HPTs) in Europe. Clin Chem Lab Med. 59:823–35. DOI: 10.1515/cclm-2020-1523. PMID: 33544509.
10. Clinical and Laboratory Standards Institute. 2008. User protocol for evaluation of qualitative test performance; approved guideline-second edition. CLSI document EP12-A2. Wayne, PA: Clinical and Laboratory Standards Institute.
11. Kamer SM, Foley KF, Schmidt RL, Greene DN. 2015; Analytical sensitivity of four commonly used hCG point of care devices. Clin Biochem. 48:448–52. DOI: 10.1016/j.clinbiochem.2014.12.015. PMID: 25549977.
12. Clinical and Laboratory Standards Institute. 2012. Evaluation of detection capability for clinical laboratory measurement procedures; approved guideline-second edition. CLSI document EP17-A2. Wayne, PA: Clinical and Laboratory Standards Institute.