Abstract
Quantitative molecular genetic tests are increasingly used for the detection and quantification of target molecules or genetic alterations. When introducing a new assay into clinical laboratories, it is necessary to verify the manufacturers’ claimed performance characteristics within individual laboratories. Appropriate assay verification procedures are essential to ensure the quality of test results in clinical laboratories. This study aimed to provide recommendations for the verification of quantitative molecular genetic testing focused on the hemato-oncology field in clinical genetic laboratories. Based on a literature review, we provide recommendations for the performance verification of quantitative molecular hemato-oncology tests. The performance characteristic elements that comprise the verification procedures are presented and exemplified. These recommendations can assist individual clinical laboratories in verifying quantitative molecular diagnostic assays.
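Abstract (translated from the Korean)
Quantitative molecular genetic tests are increasingly used to detect and quantify target molecules or genetic variants. When a new test is introduced in a clinical laboratory, a procedure is needed to verify the performance claimed by the manufacturer. Appropriate verification procedures are important for ensuring the quality of test results produced in clinical laboratories. This study aimed to provide recommendations for the verification of quantitative molecular genetic tests in the hemato-oncology field performed in clinical genetic laboratories. Based on a literature review, we prepared recommendations for the performance verification of quantitative molecular hemato-oncology tests. Each performance characteristic that makes up the verification procedure is presented with examples. These recommendations may help individual clinical laboratories verify quantitative molecular genetic tests.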
Clinical tests using molecular diagnostic techniques are increasingly used for patient management with regard to diagnosis and follow-up. In the hemato-oncology field, the quantification of measurable residual disease (MRD) has emerged as an important predictor of patient prognosis. For the assessment of molecular MRD, techniques with a limit of detection of 10⁻³ or lower should be used; real-time polymerase chain reaction (PCR) and droplet digital PCR (ddPCR) are typical tests [1]. Real-time PCR is currently the most commonly used technique for the quantification of target molecules because of its high sensitivity and specificity, low cost, and rapid time-to-result. Recently, ddPCR, which can perform absolute quantification without using a standard curve, has been introduced and is increasingly used for clinical testing and research [2].
It is important to understand the differences between validation and verification in the laboratory. Validation is the establishment of the performance characteristics of assays, and is usually completed by the manufacturer. If no suitable performance specifications are available, validation is required (e.g., for laboratory-developed tests) [3]. Verification is the process of confirming, in each laboratory, the specifications provided by the product manufacturer [4]. Verification applies to unmodified non-waived tests that have been approved for in vitro diagnostic use [3].
The Clinical and Laboratory Standards Institute (CLSI) is referenced when evaluating most laboratory tests in clinical laboratories; CLSI guidelines are relatively suitable and standardized for high-volume automated assays. Molecular tests are labor-intensive and expensive, even though the volume of individual tests is relatively small compared to most assays in other fields, such as clinical chemistry. In a molecular diagnostic assay validation white paper updated by the Association for Molecular Pathology (AMP) Clinical Practice Committee in 2013, it was stated that the proposed recommendations are not standardized and that alternative methods are possible because of the characteristics of the molecular test itself and the different circumstances of each individual laboratory [5]. Thus, in this recommendation, several guidelines and literature related to the verification of quantitative molecular assays have been reviewed and presented so that various alternatives can be referred to when determining the verification protocol.
This review aimed to provide recommendations to guide verification procedures for quantitative molecular genetic testing that can be used in individual laboratories within the hemato-oncology field. We focused on real-time PCR and ddPCR, which are the in vitro diagnostic products most frequently subjected to verification in clinical laboratories. One can refer to this text for verification procedures in areas other than hemato-oncology (e.g., infectious disease) or for techniques other than real-time PCR and ddPCR (e.g., next-generation sequencing), but this was not our primary focus.
We reviewed the literature and available guidelines regarding the verification procedures of quantitative molecular genetic testing using real-time PCR and ddPCR. CLSI guidelines relating to the verification of quantitative molecular assays are referenced as follows: MM01-A3, MM20-A, EP17-A2, EP15-A3, EP06-Ed2, and EP28-A3c [4, 6-10]. Other references were searched in PubMed and Google Scholar using the following keywords: verification, validation, quantitative PCR, digital PCR, real-time PCR, molecular, and mutation. We selected the final set of documents by reviewing the abstracts and retaining the relevant articles. The chosen literature and guidelines are cited in the reference section of this paper. We then selected the performance characteristics that individual laboratories should consider when adopting a quantitative molecular genetic test (Table 1).
For verification of quantitative molecular genetic testing, the required performance characteristics vary according to the guidelines (Table 2). The Clinical Laboratory Improvement Amendments (CLIA) regulations, the College of American Pathologists (CAP) molecular pathology checklist, and the AMP molecular diagnostic assay validation protocols recommend that laboratories verify that tests perform as expected by obtaining data on accuracy, precision, reportable ranges, and reference intervals [5, 11, 12]. An established reference interval can be transferred from the manufacturer or a publication without verification, according to the judgment of the medical director of the laboratory [10]. Additionally, the CAP and the EuroGentest Validation Group (EGVG) recommend verifying the limits of detection of quantitative assays [12, 13].
The accuracy of a quantitative test refers to the closeness of the test result to the accepted value, either a conventional true value or a reference value [14]. Thus, a comparison with the value of a reference (“gold standard”) method or a recovery study with a certified reference material of known value can be performed to verify accuracy. However, a reference method is not available in most cases, and the number of possible genotypes can preclude the use of reference materials that cover every genotype [5]. Alternatively, accuracy can be evaluated through a method comparison study between a new method and a method already established in the laboratory, using samples that span the entire reportable range and the different possible genotypes [5].
For a method comparison study, the appropriate number of specimens depends on many factors, including the complexity of the assay, frequency of targets in the population, and established accuracy of the reference methods. The CLSI document EP15-A3 recommends testing both methods in parallel, with a minimum of 20 positive specimens in duplicate over several runs and days [9]. CLSI MM01-A3 recommends that at least 30 samples be analyzed when comparing two analysis methods using a paired test, because the sample mean and its standard deviation (SD) approach the population mean and its SD as the sample number approaches 30 [6]. An example of sample selection for the comparison of results between a new method and a method already established in the laboratory is presented in Table 3.
After setting the allowable criteria, a predetermined number of samples are checked with the new and existing methods to be compared, and the results are summarized and compared. If the difference is not clinically significant, the new assay is considered to be within the medical tolerance interval and can replace the old assay.
A method comparison study can be performed in various ways. First, the average difference can be calculated and determined to be acceptable by comparing it with allowable limits. The laboratory should establish and document allowable limits for acceptance (e.g., ±20%). The allowable limits can be derived from the package insert, previous literature, or empirical evidence from similar testing methods in the laboratory. Second, a t-test can be used to determine whether there is a significant difference between the means of the two methods. Third, linear regression analysis can be used to calculate linear regression statistics. After plotting the data (reference samples/method: x-axis; measured values/new method: y-axis), linear regression statistics can be analyzed using statistical programs (ideal values: slope=1, intercept=0, and r≥0.99).
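The first two approaches can be illustrated with a minimal sketch. It assumes paired, positive results from the two methods; the function name `compare_methods`, the dictionary keys, and the default limit of 20% are illustrative choices rather than part of any guideline. The sketch returns only the paired t statistic and its degrees of freedom, which would then be compared with a critical value from a t table or a statistical program.

```python
import math
import statistics


def compare_methods(established, new, allowable_pct=20.0):
    """Summarize paired results from an established and a new method.

    Hypothetical helper: both lists hold positive results for the same
    samples, in the same order.
    """
    if len(established) != len(new) or len(established) < 2:
        raise ValueError("need at least two paired results")

    # Percent difference of each new result relative to the established one.
    pct_diffs = [100.0 * (y - x) / x for x, y in zip(established, new)]
    mean_pct_diff = statistics.mean(pct_diffs)

    # Paired t statistic on the raw differences (n - 1 degrees of freedom).
    diffs = [y - x for x, y in zip(established, new)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        t_stat = 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    else:
        t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))

    return {
        "mean_pct_diff": mean_pct_diff,
        "within_allowable": abs(mean_pct_diff) <= allowable_pct,
        "t_stat": t_stat,
        "degrees_of_freedom": len(diffs) - 1,
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    print(compare_methods([1.0, 2.0, 4.0, 8.0], [1.1, 2.2, 4.4, 8.8]))
```

For PCR-based results that span several orders of magnitude, the same calculation could be applied to log10-transformed values, with the allowable limit redefined on that scale.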
Precision involves repeatability (within runs) and reproducibility (between runs). Repeatability is the degree of agreement between repeated results of the same sample under the same operating conditions; reproducibility refers to the degree of agreement when the operating conditions vary. For real-time PCR, concentration, stochastic fluctuations, and temperature differences that affect the completion of annealing and denaturation are known to influence precision. The precision of ddPCR depends on the average number of molecules per partition (determined by the original sample concentration and preparation method) and the number of partitions. When the average number of molecules per partition is very low, or when the number of positive partitions approaches saturation, precision becomes poor.
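The dependence of ddPCR precision on partition occupancy can be sketched numerically. The example below assumes the standard Poisson model for partitioning, in which the mean number of copies per partition is estimated as λ = −ln(1 − k/n) for k positive partitions out of n, and it approximates the standard error of λ by the delta method. The function name `ddpcr_precision` and the partition counts are illustrative assumptions, not values from the cited guidelines.

```python
import math


def ddpcr_precision(positive, total):
    """Return (copies per partition, relative uncertainty in percent).

    Assumes Poisson partitioning; a hypothetical helper for illustration.
    """
    if not 0 < positive < total:
        raise ValueError("need at least one positive and one negative partition")
    p = positive / total
    # Poisson correction: mean number of target copies per partition.
    lam = -math.log(1.0 - p)
    # Delta-method standard error of lam from the binomial variance of p.
    se = math.sqrt(p / (total * (1.0 - p)))
    return lam, 100.0 * se / lam


if __name__ == "__main__":
    # Very low occupancy, intermediate occupancy, and near saturation.
    for k in (5, 2000, 10000, 19995):
        lam, cv = ddpcr_precision(k, 20000)
        print(f"{k:>6} positive of 20000: lambda={lam:.4f}, CV={cv:.1f}%")
```

Under these assumptions, the relative uncertainty is largest when only a handful of partitions are positive, smallest at intermediate occupancy, and rises again as nearly all partitions become positive.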
CLSI EP15-A3 indicates that at least two samples with different concentrations are needed to represent medical decision points or reference limits; ideally, patient samples, reference materials, proficiency testing samples, or control materials are used as test samples [9].
The AMP molecular diagnostic assay validation white paper recommends that at least three sample concentrations covering clinically important decision levels (e.g., for BCR-ABL1 quantitative assays, molecular response 3.0–5.0) be included [5]. A low concentration can be set to two to four times the limit of detection (LOD), and a high concentration can be set close to the upper limit of quantitation.
For user verification of precision performance, CLSI EP15-A3 recommends testing each sample five times per run for five to seven runs over at least 5 days (a total of at least 25 replicates per concentration) [9]. An alternative approach, presented in the AMP molecular diagnostic assay validation white paper, uses five replicates at three concentrations (low, medium, and high) run for 3 days (a total of 15 replicates per concentration) [5]. In clinical settings, samples are commonly tested in triplicate in each run, with runs repeated over 3 days [15]. Therefore, multiple repeatability verification studies for diverse testing variables are necessary.
For within-run and between-run precision, the mean value, standard deviation, percent coefficient of variation, and percent agreement can be calculated between tests performed under two different conditions. The results can then be compared with the manufacturer’s claims or clinically acceptable variation. If an unacceptable result is identified, the possible causes should be investigated and the necessary corrective action should be taken.
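One way to obtain these statistics is a one-way analysis of variance with the run as the grouping factor. The sketch below assumes a balanced design (the same number of replicates in every run) and is not the full EP15-A3 procedure; the function name `precision_summary` and the returned keys are illustrative.

```python
import math
import statistics


def precision_summary(runs):
    """Estimate repeatability and within-laboratory precision.

    `runs` is a list of runs, each a list holding the same number of
    replicate results for one sample (balanced design assumed).
    """
    k = len(runs)
    n = len(runs[0]) if runs else 0
    if k < 2 or n < 2 or any(len(run) != n for run in runs):
        raise ValueError("need >= 2 runs with the same number (>= 2) of replicates")

    all_values = [x for run in runs for x in run]
    grand_mean = statistics.mean(all_values)
    run_means = [statistics.mean(run) for run in runs]

    # One-way ANOVA mean squares (within runs and between runs).
    ss_within = sum((x - m) ** 2 for run, m in zip(runs, run_means) for x in run)
    ms_within = ss_within / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in run_means) / (k - 1)

    # Variance components; a negative between-run estimate is set to zero.
    var_between = max(0.0, (ms_between - ms_within) / n)
    sd_repeat = math.sqrt(ms_within)
    sd_within_lab = math.sqrt(ms_within + var_between)

    return {
        "mean": grand_mean,
        "sd_repeatability": sd_repeat,
        "cv_repeatability_pct": 100.0 * sd_repeat / grand_mean,
        "sd_within_lab": sd_within_lab,
        "cv_within_lab_pct": 100.0 * sd_within_lab / grand_mean,
    }
```

The resulting SD or percent coefficient of variation at each concentration would then be compared with the manufacturer's claim or with the clinically acceptable variation defined by the laboratory.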
The reportable range of the quantitative assay refers to the lowest and highest results reliably obtained by the test method. Laboratories can only report test results that fall within the verified range. The range provided by the manufacturer should be verified using a complete test system, from sample preparation to the results. Care must be taken when working with higher concentrations of target analytes to prevent specimen-to-specimen cross-contamination. Linearity refers to the range in which test values are proportional to the analyte concentration in the sample.
To verify the linearity of a diagnostic quantitative test, various sample types can be used, such as proficiency testing materials, certified reference materials (CRM) with a proper matrix, quality control materials, and patient samples. However, it is preferable to verify linearity using patient samples because matrix differences can affect the results [8]. In the CLSI guideline EP06-Ed2, two replicates of five levels of samples are required to verify the linearity of the test [8]. To verify the analytical measurement range (AMR), the test specimens must, at a minimum, have analyte values near the low, midpoint, and high values of the AMR [12].
The CLSI guidelines recommend mixing two samples with high and low concentrations in different proportions to create several samples at equally spaced target concentrations [8]. The concentrations of the linearity samples are bracketed by the upper and lower limits of quantitation proposed by the manufacturer [8]. Importantly, attention should be paid to pipetting errors when preparing samples at specific concentration intervals. A proper matrix, the same as that of the routine test samples, should be used.
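The mixing of a low pool and a high pool can be worked out in advance. The sketch below assumes five equally spaced levels and a fixed final volume per level; the function name `mixing_scheme`, the default volume of 100 µL, and the example concentrations are hypothetical.

```python
def mixing_scheme(low_conc, high_conc, levels=5, total_volume_ul=100.0):
    """Return (low pool volume, high pool volume, expected concentration)
    for each level, mixing the two pools in equally spaced proportions."""
    if levels < 2:
        raise ValueError("need at least two levels")
    scheme = []
    for i in range(levels):
        fraction_high = i / (levels - 1)
        high_vol = total_volume_ul * fraction_high
        low_vol = total_volume_ul - high_vol
        # Expected concentration is the volume-weighted mean of the pools.
        expected = (low_vol * low_conc + high_vol * high_conc) / total_volume_ul
        scheme.append((low_vol, high_vol, expected))
    return scheme


if __name__ == "__main__":
    # Made-up pool concentrations in arbitrary units.
    for low_vol, high_vol, expected in mixing_scheme(10.0, 90.0):
        print(f"low {low_vol:5.1f} uL + high {high_vol:5.1f} uL -> {expected:.1f}")
```

The expected values computed this way serve as the x-axis values when the measured results are plotted and analyzed.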
For quantitative molecular tests, DNA or RNA/complementary DNA (cDNA) extracted from patients can be used. For instance, if the linearity of a quantitative real-time PCR test is verified for quantifying BCR-ABL1 fusion, RNA or cDNA pools can be used from patients with chronic myeloid leukemia and diluted with a wild-type RNA/cDNA sample or a recommended diluent. An international-scale calibrator or DNA/RNA from other reference cell lines can also be used.
The first step in the analysis of the data is to prepare an xy plot with the measurements and results on the y-axis and the expected or known values on the x-axis. Individual data points or mean values are plotted for each set of replicates. Plotting individual results allows for the visual detection of outliers that do not fit the pattern represented by the rest of the data. A single outlier in a dataset can be removed and does not require replacement. Two or more unexplained outliers cast doubt on the precision of the testing system. A line can be drawn either manually or with the aid of a computer program. A visual examination of the plot shows whether there is obvious nonlinearity or whether the range of testing should be narrowed or expanded. It also provides insight into the most appropriate procedures for subsequent statistical analyses. If the data appear to have a curved relationship, as is often the case when testing analytes that cover a wide concentration range, a log transformation of the data points may straighten the line. Log transformation involves taking the log (generally base 10) of each observed value. All PCR-based data should be plotted after log10 transformation.
To determine the linear range, CLSI EP06-Ed2 recommends using weighted first-order regression analysis to evaluate nonlinearity [8]. Alternatively, linear regression is commonly used if the relationship between the expected and observed values appears straight, without curvature. Linear regression determines the slope and intercept to create an equation for the best-fitting line (ideal values: slope=1, intercept=0, and r≥0.99).
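The simple (unweighted) regression alternative can be sketched as follows, with both the expected and the observed values log10-transformed before fitting. This is ordinary least squares and not the weighted procedure of EP06-Ed2; the function name `log_linearity` is an illustrative assumption.

```python
import math


def log_linearity(expected, observed):
    """Fit observed vs. expected values after log10 transformation.

    Returns slope, intercept, correlation coefficient r, and the residual
    (in log10 units) at each point. Ordinary least squares, unweighted.
    """
    if len(expected) != len(observed) or len(expected) < 3:
        raise ValueError("need at least three paired values")
    log_x = [math.log10(v) for v in expected]
    log_y = [math.log10(v) for v in observed]

    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    syy = sum((y - mean_y) ** 2 for y in log_y)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    residuals = [y - (slope * x + intercept) for x, y in zip(log_x, log_y)]
    return slope, intercept, r, residuals
```

A slope close to 1 and an r of at least 0.99 would be consistent with the ideal values stated above, while a nonzero intercept on the log scale indicates a constant proportional bias; a systematic pattern in the residuals would suggest curvature that the summary statistics alone can hide.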
Chung et al. [15] analyzed the linearity range using a polynomial evaluation according to the CLSI guideline EP06-A. As an alternative to the CLSI approach, some studies on molecular assays used the coefficient of determination (r²) of the curve [16] or the total error [17] to determine the linear range.
To detect rare mutations using quantitative PCR (real-time PCR and ddPCR), it is important to verify the detection limits. The measurement principle of ddPCR allows for the detection of down to a single copy of a target sequence and the application of a Poisson distribution at very low copy number concentrations. The limit of blank (LOB) and LOD are values that describe the sensitivity of an analytical procedure [4, 18]. The LOB is the highest measurement result that is likely to be observed for a blank sample, and the LOD is the lowest amount or concentration of analyte (e.g., DNA) in a sample that can be detected with a given probability [19]. Commonly, verification is performed to evaluate whether the method detects the presence of the analyte in at least 95% of cases (samples) at the LOD [4]. As in other methods, measurements should be acquired from multiple independent blank samples and samples containing a low amount of analyte (low-level samples) [4]. In the ddPCR method, the LOB can be defined as the frequency of positive droplets measured in wild-type samples or in no template controls (NTCs).
The analyte, which refers to a CRM sample containing a mutation to be detected, is serially diluted with a diluent. For CRM, a commercial cell line or reference DNA material can be purchased. If it is difficult to obtain a CRM, a patient sample containing a mutation can be used.
A key contributor to the LOD in ddPCR is the amount of sample screened. A minimum amount of starting DNA is required to achieve a low LOD. Table 4 depicts the amount of starting material required to achieve a certain LOD based on the Rule of Three [20]. This rule states that to reach 95% confidence that the frequency is one in 1,000, three in 3,000 events must be detected. For example, when 10 ng of DNA (approximately 3,000 copies) is assayed in a well, three positive events out of 3,000 (0.1% sensitivity) is the theoretical LOD. Although more measurements provide better estimates, the number of measurements can be limited by sample availability and budget concerns in molecular genetics. The literature often recommends 20 measurements at, above, and below the probable LOD as determined by preliminary dilution studies. A minimal design would be one instrument system, three days, two samples, and two replicates per day, following the recommendations of the CLSI EP17-A2 guidelines [4]. To better distinguish the difference between the claimed and measured detection limits, the number of replicates can be increased above 20. Chung et al. [15] demonstrated these verification steps with 24 replicate measurements in their evaluation of BCR-ABL %IS ddPCR.
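The arithmetic behind the Rule of Three can be reproduced with a short sketch. The mass per copy is taken from the example in the text (10 ng of DNA for approximately 3,000 copies). The function names are hypothetical, and the binomial calculation in `copies_exact` is shown as an assumed interpretation of the rule: the number of copies needed so that at least one mutant copy is present with 95% probability.

```python
import math

# From the example in the text: 10 ng of DNA is approximately 3,000 copies.
NG_PER_COPY = 10.0 / 3000.0


def copies_rule_of_three(frequency):
    """Copies to screen for a target mutant frequency (Rule of Three)."""
    return 3.0 / frequency


def copies_exact(frequency, confidence=0.95):
    """Smallest n for which at least one mutant copy is expected among
    n copies with the given probability (binomial model, an assumption)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


def dna_ng(copies):
    """Approximate DNA input (ng) corresponding to a number of copies."""
    return copies * NG_PER_COPY


if __name__ == "__main__":
    for freq in (0.01, 0.001, 0.0001):
        n = copies_rule_of_three(freq)
        print(f"frequency {freq}: about {n:.0f} copies, {dna_ng(n):.1f} ng DNA")
```

The rule of thumb and the binomial calculation agree closely at low frequencies, which is why 3/frequency is a convenient planning figure for the amount of starting material.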
A strategy for the empirical determination of ddPCR analytical sensitivity is presented in Table 5. A plate can be configured using NTC wells, wild-type only (mutation-negative) control wells, and a serial dilution of the positive control mutant template in a constant background of wild-type DNA [19]. LOD verification samples representing the range above and below the expected LOD, based on prior knowledge of the LOD of the assay (e.g., from validation data or a value suggested by the manufacturer), are tested. If the observed number of positive results is greater than or equal to the minimum number required for the number of measurements performed, the claimed LOD is considered verified. If the verification is rejected, the measurement results should be reviewed for possible errors and, depending on the situation, verification or validation of a new detection limit may be needed.
The percentage of positive results that are greater than or equal to the LOD claim can be calculated. The observed percentage can then be compared to the minimum percentage (see Supplemental Data Table S1). If the observed percentage is greater than or equal to the lower bound value, the claimed LOD is considered verified.
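The comparison described above reduces to a simple calculation. In the sketch below, the minimum percentage is passed in as a parameter because the values of Supplemental Data Table S1 are not reproduced here; the value of 85 used in the example is a placeholder, and the function name `verify_lod_claim` is hypothetical.

```python
def verify_lod_claim(detected, min_percent):
    """Compare the observed percentage of positive results with a minimum.

    `detected` holds one boolean per replicate measured at the claimed LOD;
    `min_percent` is the lower bound for that number of replicates, to be
    taken from the applicable table (not hard-coded here).
    """
    if not detected:
        raise ValueError("no measurements supplied")
    observed_percent = 100.0 * sum(1 for d in detected if d) / len(detected)
    return observed_percent, observed_percent >= min_percent


if __name__ == "__main__":
    # 19 of 20 replicates positive; 85 is a placeholder lower bound.
    results = [True] * 19 + [False]
    print(verify_lod_claim(results, min_percent=85.0))
```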
The reference interval is defined as the range of values typically found in individuals who do not have the disease or condition that is being assayed [10]. Defining the reference interval for a test gives clinicians practical information about what is “normal” and “abnormal” that can be used to guide patient management. To be clinically useful, the reference interval must be appropriate to the population being served.
The reference interval can be simply “transferred” without verification if it has already been determined based on an adequate reference interval study [10]. The reference interval to be considered for transfer could be the current laboratory range, the manufacturer’s range, a published reference range, or a locally established reference range. The laboratory director should review information from the original study, such as the similarity of its geographic setting, population demographics, and test methodology to those of the laboratory.
The number of samples to be tested is at the discretion of the laboratory director and depends in part on the conditions being tested and the availability of appropriate control specimens. It is recommended that 20 specimens from individuals who represent the laboratory’s reference population be analyzed. If clinically indicated for the condition being tested, the normal control range should include both males and females with representative ethnic backgrounds, age distributions, and other medical conditions.
If at least 90% of the samples are within the stated reference interval, the reference interval is considered verified. If fewer than 90% of the samples are within the interval, the reference range and the qualifications of the healthy volunteers need to be reevaluated, and 20 additional samples should be collected and evaluated. If at least 90% of the additional samples are within the reference range, the reference interval is considered verified. If fewer than 90% of the additional samples are within the reference range, a new reference range needs to be established.
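The acceptance rule can be expressed as a short check. The sketch treats a result of exactly 90% within the interval as a pass, which is a boundary choice made for this example; the function name `verify_reference_interval`, the inclusive interval limits, and the example values are likewise illustrative.

```python
def verify_reference_interval(values, lower, upper, min_percent=90):
    """Return (number inside, total, verified?) for a set of results.

    A result counts as inside when lower <= value <= upper. Exactly
    `min_percent` inside is treated as a pass (assumption of this sketch).
    """
    if not values:
        raise ValueError("no results supplied")
    inside = sum(1 for v in values if lower <= v <= upper)
    # Integer comparison avoids floating-point issues at the boundary.
    verified = inside * 100 >= min_percent * len(values)
    return inside, len(values), verified


if __name__ == "__main__":
    # Twenty made-up results, two of which fall outside 0.0-1.0.
    results = [0.5] * 18 + [1.4, 2.0]
    print(verify_reference_interval(results, 0.0, 1.0))
```

If the first set fails, the same function can be applied to the 20 additional samples before a decision is made to establish a new reference range.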
In conclusion, we have provided recommendations for the verification of quantitative hemato-oncology testing using real-time PCR and ddPCR. These recommendations can be referenced when setting verification protocols in individual laboratories, without needing to search for existing guidelines and literature. However, because the characteristics of each test and the environment of each laboratory are different, it was not possible to list specifics of the verification method, such as the number of samples, limit of acceptability, and methods of statistical analysis. Each laboratory can adjust and improve verification protocols by considering individual circumstances. For good laboratory practice in clinical molecular laboratories, practical guidelines for the verification of other quantitative molecular tests or techniques should be developed.
Acknowledgements
This study was supported by grants from the Korean Society for Genetic Diagnostics.
REFERENCES
1. Heuser M, Freeman SD, Ossenkoppele GJ, Buccisano F, Hourigan CS, Ngai LL, et al. 2021; 2021 Update on MRD in acute myeloid leukemia: a consensus document from the European LeukemiaNet MRD Working Party. Blood. 138:2753–67. DOI: 10.1182/blood.2021013626. PMID: 34724563. PMCID: PMC8718623.
2. Mazaika E, Homsy J. 2014; Digital droplet PCR: CNV analysis and other applications. Curr Protoc Hum Genet. 82:7.24.1–13. DOI: 10.1002/0471142905.hg0724s82. PMID: 25042719. PMCID: PMC4355013.
3. Halling KC, Schrijver I, Persons DL. 2012; Test verification and validation for molecular diagnostic assays. Arch Pathol Lab Med. 136:11–3. DOI: 10.5858/arpa.2011-0212-ED. PMID: 22208481.
4. Clinical and Laboratory Standards Institute. 2012. Evaluation of detection capability for clinical laboratory measurement procedures; Approved guideline-Second edition. CLSI document EP17-A2. Wayne, PA: Clinical and Laboratory Standards Institute.
5. Association for Molecular Pathology. Molecular Diagnostic Assay Validation: Update to the 2009 AMP Molecular Diagnostic Assay Validation White Paper. https://www.amp.org/AMP/assets/File/resources/201503032014AssayValidationWhitePaper.pdf. Updated on Sep 2014.
6. Clinical and Laboratory Standards Institute. 2012. Molecular methods for clinical genetics and oncology testing; Approved guideline-Third edition. CLSI document MM01-A3. Wayne, PA: Clinical and Laboratory Standards Institute.
7. Clinical and Laboratory Standards Institute. 2012. Quality management for molecular genetic testing; Approved guideline. CLSI document MM20-A. Wayne, PA: Clinical and Laboratory Standards Institute.
8. Clinical and Laboratory Standards Institute. 2020. Evaluation of linearity of quantitative measurement procedures, 2nd Edition. CLSI guideline EP06-Ed2. Wayne, PA: Clinical and Laboratory Standards Institute.
9. Clinical and Laboratory Standards Institute. 2014. User verification of precision and estimation of bias; Approved guideline-Third edition. CLSI document EP15-A3. Wayne, PA: Clinical and Laboratory Standards Institute.
10. Clinical and Laboratory Standards Institute. 2010. Defining, establishing, and verifying reference intervals in the clinical laboratory; Approved guideline-Third edition. CLSI guideline EP28-A3c. Wayne, PA: Clinical and Laboratory Standards Institute.
11. Standard: Establishment and verification of performance specifications. 42 CFR § 493.1253, Oct. 1, 2015. https://www.govinfo.gov/content/pkg/CFR-2015-title42-vol5/pdf/CFR-2015-title42-vol5-sec493-1253.pdf.
12. College of American Pathologists. Molecular pathology checklist. Northfield, IL: College of American Pathologists; 2021.
13. Mattocks CJ, Morris MA, Matthijs G, Swinnen E, Corveleyn A, Dequeker E, et al. 2010; A standardized framework for the validation and verification of clinical molecular genetic tests. Eur J Hum Genet. 18:1276–88. DOI: 10.1038/ejhg.2010.101. PMID: 20664632. PMCID: PMC3002854.
14. Jennings L, Van Deerlin VM, Gulley ML. 2009; Recommended principles and practices for validating clinical molecular pathology tests. Arch Pathol Lab Med. 133:743–55. DOI: 10.5858/133.5.743. PMID: 19415949.
15. Chung HJ, Hur M, Yoon S, Hwang K, Lim HS, Kim H, et al. 2020; Performance evaluation of the QXDx BCR-ABL %IS droplet digital PCR assay. Ann Lab Med. 40:72–5. DOI: 10.3343/alm.2020.40.1.72. PMID: 31432643. PMCID: PMC6713652.
16. Broeders S, Huber I, Grohmann L, Berben G, Taverniers I, Mazzara M, et al. 2014; Guidelines for validation of qualitative real-time PCR methods. Trends Food Sci Technol. 37:115–26. DOI: 10.1016/j.tifs.2014.03.008.
17. Jennings LJ, Smith FA, Halling KC, Persons DL, Kamel-Reid S. 2012; Design and analytic validation of BCR-ABL1 quantitative reverse transcription polymerase chain reaction assay for monitoring minimal residual disease. Arch Pathol Lab Med. 136:33–40. DOI: 10.5858/arpa.2011-0136-OA. PMID: 22208485.
18. Armbruster DA, Pry T. 2008; Limit of blank, limit of detection and limit of quantitation. Clin Biochem Rev. 29(Suppl 1):S49–52.
19. Shrivastava A, Gupta VB. 2011; Methods for the determination of limit of detection and limit of quantitation of the analytical methods. Chron Young Sci. 2:21–5. DOI: 10.4103/2229-5186.79345.
20. BIO-RAD. Rare Mutation Detection Best Practices Guidelines. https://www.bio-rad.com/webroot/web/pdf/lsr/literature/Bulletin_6628.pdf. Last accessed on Aug 2022.
21. Yuan D, Cui M, Yu S, Wang H, Jing R. 2019; Droplet digital PCR for quantification of PML-RARα in acute promyelocytic leukemia: a comprehensive comparison with real-time PCR. Anal Bioanal Chem. 411:895–903. DOI: 10.1007/s00216-018-1508-6. PMID: 30617397.
22. Wolstencroft EC, Hanlon K, Harries LW, Standen GR, Sternberg A, Ellard S. 2007; Development of a quantitative real-time polymerase chain reaction assay for the detection of the JAK2 V617F mutation. J Mol Diagn. 9:42–6. DOI: 10.2353/jmoldx.2007.060083. PMID: 17251334. PMCID: PMC1867420.
23. Schnittger S, Bacher U, Haferlach T, Wendland N, Ulke M, Dicker F, et al. 2012; Development and validation of a real-time quantification assay to detect and monitor BRAFV600E mutations in hairy cell leukemia. Blood. 119:3151–4. DOI: 10.1182/blood-2011-10-383323. PMID: 22331186.
Table 1
Table 2
Guideline | Accuracy | Precision | Reportable range (AMR) | Linearity | Analytic specificity (interferences) | Limit of detection | Reference interval |
---|---|---|---|---|---|---|---|
CLIA [11] | Verify | Verify | Verify | | | | Verify |
CAP [12] | Verify | Verify | Verify; literature or manufacturer documentation OK | Verify | Verify; literature or manufacturer documentation OK | Verify | Verify; literature or manufacturer documentation OK |
AMP [5] | Verify | Verify | Verify | | | | Verify |
EGVG [13] | Verify | Verify | | | | Verify | |
Table 3
Reference | Test | Sample selection | Number of samples | Comparison method |
---|---|---|---|---|
Chung et al. [15] | Quantification of BCR-ABL1 fusion transcript using droplet digital PCR | Clinical samples ranging from 0.002% IS [MR4.7] to 20% IS [MR0.7] | 20 | Real-time PCR |
Yuan et al. [21] | Quantification of PML-RARA fusion transcript using droplet digital PCR | Clinical samples from healthy individuals, newly diagnosed patients, and treated patients | 28 | Real-time PCR |
Wolstencroft et al. [22] | JAK2 V617F mutation using real-time PCR | Clinical samples from patients referred for investigation of myeloproliferative disorders | 200 | ARMS and allele-specific PCR |
Schnittger et al. [23] | BRAF V600E mutation using real-time PCR | Clinical samples from patients at diagnosis of hairy cell leukemia | 117 | Multiparameter flow cytometry |