Abstract
Background
While several factors contribute to breast cancer pathogenesis, hereditary breast cancer results from a genetic predisposition. Genes associated with hereditary breast cancer may be divided into high- and low-penetrance genes depending on their risk rates. BRCA1 and BRCA2 are typical high-penetrance genes that increase the risk of developing breast and ovarian cancers upon undergoing mutations. This study aimed to evaluate the clinical performance of BRCAaccuTest™ (NGeneBio, Korea).
Methods
BRCAaccuTest™ is a reagent kit for preparing libraries from blood-derived genomic DNA for analysis of the BRCA1/2 genes by next-generation sequencing (NGS). Libraries with adapters and barcodes compatible with the Illumina platform were produced. The clinical performance of NGS-based BRCAaccuTest™ in identifying BRCA1/2 mutations was compared with that of the traditional Sanger sequencing method. Both NGS and Sanger sequencing were performed in a single laboratory using archival DNA from blood samples of 212 patients with breast cancer.
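Abstract (Korean version)
Background
This study was conducted to evaluate the clinical utility of “BRCAaccuTest” (NGeneBio, Seoul, Republic of Korea) for detecting BRCA1/2 mutations in patients with hereditary breast and ovarian cancer syndrome and in non-hereditary high-risk breast cancer patients.
Methods
BRCAaccuTest was performed by next-generation sequencing on DNA samples extracted from the whole blood of 212 patients with breast cancer. The targets of analysis were single nucleotide variations and short insertions/deletions, and concordance was assessed by comparing the BRCAaccuTest results with those of conventional Sanger sequencing.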
Breast and ovarian cancers exert substantial social and financial burdens. Breast cancer was one of the most commonly diagnosed cancers in 2018 in the United States [1]. Every year, approximately 20,000 new cases of breast and ovarian cancers are detected in South Korea [2, 3]. The genetically well-characterized risk factors of breast and/or ovarian cancers include germline mutations of BRCA1 and BRCA2 genes. These genes are important tumor suppressors actively involved in the development of hereditary breast and ovarian cancer (HBOC) syndrome [4]. Up to 15% of germline mutations in the BRCA1 or BRCA2 gene are associated with the genetic risk factors of breast and ovarian cancers in the general population [5]. Deleterious germline mutations in either of the two genes may contribute up to 80% (by 70 years of age) lifetime risk of developing breast and ovarian cancer and are also related to early-onset disease [6-8]. BRCA1 and BRCA2 variants may be divided into three groups, namely, single-nucleotide variants or polymorphisms (SNVs or SNPs), small insertion/deletion (indels), and large genomic rearrangements [9].
The Sanger sequencing method is mainly used to detect mutations in BRCA1 and BRCA2. However, it is time-consuming, labor-intensive, and expensive owing to the large size of the BRCA1 and BRCA2 genes and the scattered distribution of mutations along them [10]. Next-generation sequencing (NGS) has been increasingly applied in cancer research and molecular diagnosis to simultaneously screen and intensively analyze different target genes [9]. For BRCA1 and BRCA2, this technique enables high-throughput, rapid, cost-effective, and comprehensive mutation screening.
We aimed to evaluate the clinical performance of BRCAaccuTest™ (NGeneBio, Seoul, Korea). We evaluated the analytical performance of the BRCAaccuTest™ kit as an NGS-based in vitro diagnostic (IVD) reagent. We also evaluated the ability of an NGS analytic software called NGeneAnalySys™ (NGeneBio) to automatically detect mutations in BRCA1 and BRCA2 genes and perform pathogenic classifications.
To investigate the analytical performance of BRCAaccuTest™, control DNA samples were purchased from the Coriell Institute (https://www.coriell.org/; see Supplemental Data Table S1). To test sensitivity (limit of detection), input DNA was used in amounts of 50, 10, 5, 1, and 0.5 ng. Three reference DNAs, NA13713, NA14636, and NA14624, were used as positive controls carrying deleterious variants of either BRCA1 or BRCA2. Each amount of each reference DNA was tested twice in separate experiments to assess the range of input gDNA. For the specificity test, healthy blood samples were supplemented with interfering substances such as bilirubin, hemoglobin, and cholesterol. Genomic DNA (gDNA) was extracted using a QIAGEN Blood DNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. In addition, the wash buffer from the DNA extraction kit was added as an interfering substance in the test sample (see Supplemental Data Table S2). To evaluate reproducibility, the experiment was carried out at two different locations, on different dates, with different kit batches (lot numbers), and/or by different performers. Input DNA was used in amounts of 10–50 ng, and two to three technical replicates were prepared for each test.
The clinical samples used in this study were obtained from the Seoul National University Hospital (SNUH). Briefly, gDNA had been extracted from patients who underwent BRCA testing from 2008 to 2015. In that earlier testing, BRCA1 and BRCA2 were sequenced at SNUH by the conventional Sanger method, and the results were accumulated and well documented. The leftover gDNA samples were therefore identical to those used for Sanger sequencing and were used to investigate the diagnostic consistency between the Sanger sequencing method and BRCAaccuTest™.
The residual gDNAs were maintained and controlled as per the regulations of the Ministry of Food and Drug Safety (MFDS) in the Republic of Korea and the Institutional Review Board (IRB) at SNUH. The samples satisfied the following criteria: 1) quantity was more than 100 ng for NGS or 5 μg for the Sanger method, 2) quality (A260/280) was 1.8–2.0, and 3) the storage period was less than 10 years. Samples were excluded from the study if they did not meet these criteria.
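The three inclusion criteria above can be expressed as a simple filter. The following is a minimal sketch, not the study's actual screening procedure: the `Sample` record, its field names, and the example values are hypothetical, and the treatment of the range boundaries (strict versus inclusive) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    sample_id: str        # hypothetical identifier
    quantity_ng: float    # total gDNA available, in ng
    a260_280: float       # purity ratio
    storage_years: float  # time in storage


# Minimum quantity per assay, taken from the criteria (5 ug = 5000 ng).
MIN_QUANTITY_NG = {"ngs": 100.0, "sanger": 5000.0}


def is_eligible(sample: Sample, assay: str = "ngs") -> bool:
    """Return True if the sample meets all three inclusion criteria."""
    enough_dna = sample.quantity_ng > MIN_QUANTITY_NG[assay]
    pure = 1.8 <= sample.a260_280 <= 2.0  # inclusive bounds are assumed
    recent = sample.storage_years < 10
    return enough_dna and pure and recent


# Hypothetical samples, one per failure mode.
samples = [
    Sample("S1", 250.0, 1.85, 4.0),   # meets the NGS criteria
    Sample("S2", 80.0, 1.90, 2.0),    # too little DNA
    Sample("S3", 300.0, 1.70, 3.0),   # purity out of range
    Sample("S4", 300.0, 1.90, 11.0),  # stored too long
]
eligible_ids = [s.sample_id for s in samples if is_eligible(s)]
print(eligible_ids)
```

Keeping the per-assay minimum in a lookup table makes the much larger Sanger requirement explicit: sample S1 passes for NGS but would not pass with `assay="sanger"`.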
This was a retrospective study performed by a single laboratory at SNUH under the regulation of the MFDS in the Republic of Korea and the IRB. It was a comparative study to assess the clinical utility and diagnostic consistency of BRCAaccuTest™ relative to the traditional Sanger sequencing method. As shown in Fig. 1, the target number of test samples was calculated based on the following criteria: diagnostic positive/negative agreement, 0.99; power, 80%; and statistical significance, 5% [11], with an additional 10% to compensate for drop-outs such as samples that failed the inclusion criteria. The target number was therefore initially set to 190 plus 22 extras for testing both BRCA1 and BRCA2. Thus, 106 test samples each were prepared for BRCA1 and BRCA2 (212 samples in total). Approximately 10% of the 212 test subjects (22 samples) were resequenced by the Sanger method to verify sample stability, using an Applied Biosystems 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA) and a BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) according to the manufacturer’s instructions. Ultimately, 207 samples were available, which exceeded the calculated sample size (N=190).
BRCAaccuTest™ was used to generate the NGS library of the BRCA1 (NM_007294.3) and BRCA2 (NM_000059.3) genes according to the manufacturer’s instructions. Briefly, the target regions of the BRCA1 and BRCA2 genes were amplified from 10–50 ng of sample DNA by polymerase chain reaction (PCR). The amplicons were ligated to adapters, and the library was amplified in the final step. The quantity and quality of the amplified library were evaluated using a Qubit® 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and a Tapestation 2200 (Agilent Technologies, Santa Clara, CA, USA), respectively. After quality control (QC), the final library was normalized to a concentration of 4 nM and sequenced on an Illumina MiSeqDx® with the MiSeq Reagent Nano Kit V2 (300 cycles) (Illumina, San Diego, CA, USA) as per the manufacturer’s instructions to generate paired-end reads.
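The normalization to 4 nM can be illustrated with two standard calculations: converting a mass concentration and mean fragment size into molarity, and then diluting with C1·V1 = C2·V2. This is a sketch under stated assumptions rather than the kit's protocol. The 660 g/mol per base pair approximation for double-stranded DNA is a general convention, and the function names, the 14.3 ng/µL input, and the 20 µL final volume are hypothetical. The 394 bp size and 55.2 nM concentration are the mean values reported for the final libraries in this study.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA mass concentration to molarity in nM.

    Assumes an average mass of 660 g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float):
    """Return (library_ul, diluent_ul) that satisfy C1 * V1 = C2 * V2."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nm * final_ul / stock_nm
    return library_ul, final_ul - library_ul


# Hypothetical Qubit reading combined with the reported mean library size.
molarity = library_molarity_nm(conc_ng_per_ul=14.3, mean_size_bp=394.0)

# Dilute a library at the reported mean of 55.2 nM down to 4 nM in 20 uL.
library_ul, diluent_ul = dilution_volumes(55.2, 4.0, 20.0)
print(f"{molarity:.1f} nM; mix {library_ul:.2f} uL library "
      f"with {diluent_ul:.2f} uL diluent")
```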
The data analysis workflow was as follows: data QC, adapter trimming, alignment, variant calling, and annotation. Sequence reads were trimmed using Sickle v1.33 [12], a sliding-window, adaptive, quality-based trimming tool, and then aligned to the human hg19 reference genome using the BWA-MEM algorithm v0.7.10 [13]. Genetic variants, including SNVs and short insertions/deletions (indels), were identified using GATK v2.3 [14] and FreeBayes v9.9.2 [15], and the identified variants were annotated using snpEff v4.2 [16]. This bioinformatic pipeline was fully automated in the NGeneAnalySys™ software. Sequence reads in FASTQ format could be uploaded to NGeneAnalySys™ to analyze, annotate, classify, and visualize NGS results and to generate clinical reports. Variants were classified according to the ACMG Standards and Guidelines for the Interpretation of Sequence Variants [17]. In NGeneAnalySys™, the QC criteria for uniformity were set as follows: minimum coverage >20× and average coverage >200×.
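The two coverage thresholds can be checked per sample as sketched below. How NGeneAnalySys™ computes coverage internally is not described in the text, so this is only an illustration of the stated criteria: the function name, the returned fields, and the depth values are hypothetical, and the input is assumed to be a list of per-position read depths across the target regions.

```python
from statistics import mean

MIN_COVERAGE = 20    # every target position must exceed this depth
MEAN_COVERAGE = 200  # the average depth must exceed this value


def coverage_qc(depths):
    """Evaluate one sample's per-position depths against both thresholds."""
    lowest, average = min(depths), mean(depths)
    pass_min = lowest > MIN_COVERAGE
    pass_mean = average > MEAN_COVERAGE
    return {
        "min_depth": lowest,
        "mean_depth": average,
        "pass_min": pass_min,
        "pass_mean": pass_mean,
        "pass": pass_min and pass_mean,
    }


# Hypothetical depth profiles: one uniform, one with a single 13x position.
uniform = coverage_qc([310, 295, 280, 330])
low_spot = coverage_qc([310, 295, 13, 330])
print(uniform["pass"], low_spot["pass_min"], low_spot["pass_mean"])
```

Reporting the two checks separately shows why a sample can satisfy the average-coverage criterion and still be flagged for one poorly covered region.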
Peripheral blood samples were collected from family members of HBOC patients, non-familial breast cancer patients, or hereditary breast cancer patients. Genomic DNA was extracted using the QIAGEN QIAamp DNA Blood Mini Kit (Qiagen) according to the manufacturer’s instructions. The residual genomic DNA was stored at -70°C after Sanger sequencing of BRCA1 and BRCA2. The BRCA1 and BRCA2 target regions were amplified using target-specific primers. The purified PCR products were sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) according to the manufacturer’s instructions and analyzed using an Applied Biosystems 3730xl DNA Analyzer (Applied Biosystems).
The BRCA1 and BRCA2 alterations detected using BRCAaccuTest™ assay were compared with the results of the Sanger assay performed at SNUH. Positive percent agreement (PPA) and negative percent agreement (NPA) were defined as follows:
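In their conventional form, with Sanger sequencing as the comparator, the two measures can be written as below. The symbols are introduced here for illustration and are not taken from the original text: $a$ is the number of positions called variant by both methods, $b$ the number called variant by BRCAaccuTest™ only, $c$ the number called variant by Sanger sequencing only, and $d$ the number called wild-type by both.

```latex
\mathrm{PPA} = \frac{a}{a + c} \times 100\%,
\qquad
\mathrm{NPA} = \frac{d}{b + d} \times 100\%
```

Under these definitions, PPA is the share of Sanger-positive positions that the NGS assay also calls variant, and NPA is the share of Sanger-negative positions that it also calls wild-type.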
To investigate the precision of BRCAaccuTest™, its reproducibility was tested by varying the experiment performers, laboratories, dates, and lot numbers. The experiments for all combinations were carried out with three positive controls (NA13713, NA14636, and NA14624) and one negative control (NA12878). All samples were run in duplicate.
The analytical performance of BRCAaccuTest™ was tested to confirm its sensitivity, specificity, and precision. The test results demonstrated the successful production of all the sequenced libraries with an average 300× coverage depth in the target regions. Each representative BRCA mutation in the control DNA was also perfectly detected with 0.5 ng input DNA (Table 1). The NGS results illustrated no effects of the interfering substances on library preparation using BRCAaccuTest™ (Table 1). The precision of BRCAaccuTest™ showed 100% reproducibility in all combinations, namely, within-run, between-run, between-person, and between-lot (Table 1).
Approximately 10% of 212 test subjects (22 samples) were sequenced by the Sanger method to verify the stability of residual gDNA samples. All Sanger resequencing results were consistent with the previous ones. As the stability of the remaining samples was verified, subsequent NGS analysis was performed on 212 samples. In total, 207 samples (B001 to B207) were selected to create NGS libraries. These were divided into nine groups (runs) of 23 samples that were run on MiSeqDx® because the appropriate capacity of BRCAaccuTest™ is up to 24 samples, including 1 control (NA12878). The input DNA range of the 207 samples was 11.6–48.8 ng, and the average input DNA amount was 30.1±7.5 ng. The average time for library preparation for each group was 4.5–5.0 hours.
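The run layout described above (23 samples plus one control per run) can be sketched as a simple batching step. The text does not say how samples were assigned to runs, so the consecutive assignment used here is an assumption, and `plan_runs` is a hypothetical helper.

```python
CONTROL = "NA12878"   # control named in the text
SAMPLES_PER_RUN = 23  # 24 libraries per run minus one control


def plan_runs(sample_ids, per_run=SAMPLES_PER_RUN, control=CONTROL):
    """Split samples into consecutive runs and append the control to each."""
    runs = []
    for start in range(0, len(sample_ids), per_run):
        runs.append(sample_ids[start:start + per_run] + [control])
    return runs


sample_ids = [f"B{i:03d}" for i in range(1, 208)]  # B001 .. B207
runs = plan_runs(sample_ids)
print(len(runs), {len(run) for run in runs})
```

With 207 samples, the split yields nine full runs of 24 libraries each, which matches the nine groups reported.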
In total, 206 libraries of 207 samples were qualified and quantified at a success rate of 99.5%. Only one library, B021, was excluded owing to an insufficient amount of sample. As shown in Fig. 2A, all 206 libraries ranged from 15.2 to 100.3 nM, with an average of 55.2±15.0 nM (mean±SD). The BRCAaccuTest™ yielded 67.3 and 69.4 nM libraries with the input DNA of B15 (48.8 ng) and B101 (11.6 ng), respectively, indicating its highly consistent performance that was independent of input DNA amount. The average size of the final libraries, including the adapters, was 394.0±9.3 bp (Fig. 2B).
We successfully performed nine runs of NGS. Overall QC rejection rate was 4%. Only one sample, B022, showed a 13× minimum coverage at a QC failure rate of 0.04%. Therefore, there was no issue in analyzing the variants, as the minimum coverage region was irrelevant (data not shown). In addition, 95.6% of samples met the QC criteria of the average coverage.
For variant analysis, ubiquitous homozygous SNPs in Asian populations, including c.4563A>G (rs206075), c.6513G>C (rs206076), and c.7397T>C (rs169547), found in BRCA2 were excluded (https://www.ncbi.nlm.nih.gov/clinvar/). A total of 1,704 variants were detected in 206 samples (including 8 samples as within-run and between-run intra-controls); 930 and 774 variants were detected in BRCA1 and BRCA2 , respectively. As 8 duplicated samples were excluded from the analysis, 1,640 variants were found in 198 samples. By excluding overlapped variants, 199 different types of variants (82 in BRCA1 and 117 in BRCA2 ) were estimated: 143 different SNVs (71.9%), 10 insertions (5.0%), 43 deletions (21.6%), and 3 indels (1.5%) were found in BRCA1/2 (See Supplemental Data Table S3). Among the variants detected in the coding regions, 75 (39.7%) were nonsense (stop-gain) or frameshift mutations causing protein function loss, 6 (3.2%) were detected in the canonical splice sites (splice donor and acceptor), and 91 (48.1%) were missense mutations affecting amino acid sequences (See Supplemental Data Table S3).
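The counts in this paragraph can be cross-checked with a short tally. The sketch below uses only the numbers given above, with one exception that is an inference rather than a stated figure: the coding-region percentages are consistent with a denominator of 189 variants, which the text does not give explicitly.

```python
def percentages(counts, total=None):
    """Return each count as a percentage of the total, to one decimal."""
    total = sum(counts.values()) if total is None else total
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


by_gene = {"BRCA1": 82, "BRCA2": 117}
by_type = {"SNV": 143, "insertion": 10, "deletion": 43, "indel": 3}
total_calls = {"BRCA1": 930, "BRCA2": 774}

# Both breakdowns describe the same 199 distinct variants.
distinct_total = sum(by_gene.values())
type_pct = percentages(by_type)

# Assumption: 189 coding-region variants, inferred from the percentages.
CODING_TOTAL_INFERRED = 189
coding = {"nonsense_or_frameshift": 75, "splice_site": 6, "missense": 91}
coding_pct = percentages(coding, total=CODING_TOTAL_INFERRED)

print(distinct_total, sum(total_calls.values()), type_pct, coding_pct)
```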
The details of the most frequent variants predicted to be pathogenic or likely pathogenic using NGeneAnalySys™ are listed in Table 2. The pathogenic variant c.5496_5506delGGTGACCCGAGinsA (p.Val1833Serfs) in BRCA1 was the most frequent, found in eight patients, followed by c.390C>A (p.Tyr130Ter), c.5445G>A (p.Trp1815Ter), c.5470_5477delATTGGGCA (p.Ile1824Aspfs), and c.922_924delAGCinsT (p.Ser308fs). Five pathogenic variants in BRCA1 (c.5467+1G>A, c.5266C>T, c.4981G>T, c.3627dupA, and c.928C>T) were each found at least three times. For the BRCA2 gene, the variants c.1399A>T (p.Lys467Ter), c.5576_5579delTTAA (p.Ile1859Terfs), and c.8951C>G (p.Ser2984Ter) were each carried by two patients.
The BRCAaccuTest™ results at all 199 positions were compared with those of the Sanger sequencing assay. Within the 198 individual samples, 1,640 mutations and 37,762 wild-type calls were detected. The agreement analysis results are shown in Table 3. Variant-level concordance (PPA and NPA) was 100% for all results with 95% confidence intervals of 99.8–100.0% for mutations (PPA) and 99.9–100.0% for wild-type location (NPA).
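The agreement figures can be reproduced approximately as follows. The text does not state which confidence interval method was used. This sketch assumes the exact (Clopper-Pearson) two-sided interval, for which the lower bound has the closed form (α/2)^(1/n) when all n calls agree. The function names are hypothetical.

```python
def percent_agreement(agree: int, total: int) -> float:
    """Agreement between the two methods as a percentage."""
    return 100.0 * agree / total


def exact_lower_bound(n: int, alpha: float = 0.05) -> float:
    """Lower two-sided Clopper-Pearson bound when all n calls agree.

    Valid only for the special case of perfect agreement (x == n).
    """
    return (alpha / 2) ** (1.0 / n)


samples, positions = 198, 199
variant_calls, wild_type_calls = 1640, 37762

# Each sample is compared at every position, so the two counts must sum
# to samples * positions.
total_comparisons = samples * positions

ppa = percent_agreement(variant_calls, variant_calls)
npa = percent_agreement(wild_type_calls, wild_type_calls)
ppa_low = 100 * exact_lower_bound(variant_calls)
npa_low = 100 * exact_lower_bound(wild_type_calls)

print(total_comparisons == variant_calls + wild_type_calls)
print(f"PPA {ppa:.1f}% (lower bound {ppa_low:.2f}%)")
print(f"NPA {npa:.1f}% (lower bound {npa_low:.3f}%)")
```

Under this assumption, the PPA lower bound rounds to the reported 99.8%, and the NPA lower bound lies above the reported 99.9%. A different interval method or rounding rule would shift these values slightly.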
Many diagnostic laboratories use NGS technology to enhance throughput and reduce turn-around time and cost. However, NGS may introduce complexity resulting from the selection of components of the BRCA test workflow, including the NGS platform, enrichment methods, and bioinformatic analysis processes.
This report describes an NGS-based IVD reagent and analysis software for BRCA1/2 gene testing that are more efficient than the existing Sanger sequencing method. Further, this study tested the analytical performance of an NGS approach for BRCA1/2 mutation analysis in HBOC patients. The comparison of NGS with the gold standard, Sanger sequencing, revealed 100% sensitivity and 100% specificity in all coding exons of BRCA1/2 and 10 bp into the introns from intron/exon junctions of 212 HBOC samples.
NGS exhibits great potential by allowing rapid mutational analysis of multiple genes in HBOC [18]. However, a reliable tool is needed to extract information from the large volume of data generated by NGS. In previous reports, false-positive SNVs were located directly adjacent to the 3′-end of one of the first PCR primers and were detected in only one strand [19, 20]. False-negative results, in which a pathogenic BRCA mutation was missed owing to differences in the NGS platform and library preparation, have also been reported [21]. Hence, caution is required during sequencing and analysis to avoid false-positive and false-negative results.
The bioinformatic pipelines provide an automated workflow for the processing of NGS data for BRCA1/2 genes using variant filters adapted to the amplicon-based target NGS data [18]. However, in the present study, the entire NGS technique from wet laboratory steps to bioinformatic analysis was managed using a single kit; this process is critical to maintaining accuracy. BRCAaccuTest™, including NGeneAnalySys™, is an all-in-one method comprising the complete NGS procedure from library preparation to automated post-processing bioinformatic pipeline and interpretation.
This study has a limitation stemming from the exclusion of large genomic rearrangements from the NGS assay for BRCA1/2. PCR enrichment is unsuitable for the reliable detection of copy number variants (CNVs) [9]. A recent study showed that the detection of BRCA1/2 rearrangements is still challenging, especially in the case of exon 2 in BRCA1, where false-positive CNV calls may be observed [22]. The NGS assay for BRCA1/2 rearrangements requires substantial improvement in algorithms before it can be used in clinical practice.
In conclusion, BRCAaccuTest™ can be exploited for clinical purposes, as it provided positive, negative, and overall diagnostic consistency of 100% with Sanger sequencing at a significance level of 5%. In addition, NGeneAnalySys™ can be used clinically to analyze data generated from BRCAaccuTest™ for screening pathogenic variants of the BRCA1/2 genes.
REFERENCES
1. Heymach J, Krilov L, Alberg A, Baxter N, Chang SM, Corcoran RB, et al. 2018; Clinical Cancer Advances 2018: Annual report on progress against cancer from the American Society of Clinical Oncology. J Clin Oncol. 36:1020–44. DOI: 10.1200/JCO.2017.77.0446. PMID: 29380678.
2. Jung KW, Won YJ, Kong HJ, Lee ES. 2018; Prediction of cancer incidence and mortality in Korea, 2018. Cancer Res Treat. 50:317–23. DOI: 10.4143/crt.2018.142. PMID: 29566480. PMCID: PMC5912149.
3. Jung KW, Won YJ, Kong HJ, Lee ES. Community of Population-Based Regional Cancer R. 2018; Cancer statistics in Korea: Incidence, mortality, survival, and prevalence in 2015. Cancer Res Treat. 50:303–16. DOI: 10.4143/crt.2018.143. PMID: 29566481. PMCID: PMC5912151.
4. Park HS, Park SJ, Kim JY, Kim S, Ryu J, Sohn J, et al. 2017; Next-generation sequencing of BRCA1/2 in breast cancer patients: potential effects on clinical decision-making using rapid, high-accuracy genetic results. Ann Surg Treat Res. 92:331–9. DOI: 10.4174/astr.2017.92.5.331. PMID: 28480178. PMCID: PMC5416916.
5. Janavičius R. 2010; Founder BRCA1/2 mutations in the Europe: implications for hereditary breast-ovarian cancer prevention and control. EPMA J. 1:397–412. DOI: 10.1007/s13167-010-0037-y. PMID: 23199084. PMCID: PMC3405339.
6. Antoniou A, Pharoah PD, Narod S, Risch HA, Eyfjord JE, Hopper JL, et al. 2003; Average risks of breast and ovarian cancer associated with BRCA1 or BRCA2 mutations detected in case series unselected for family history: a combined analysis of 22 studies. Am J Hum Genet. 72:1117–30. DOI: 10.1086/375033. PMID: 12677558. PMCID: PMC1180265.
7. Mavaddat N, Peock S, Frost D, Ellis S, Platte R, Fineberg E, et al. 2013; Cancer risks for BRCA1 and BRCA2 mutation carriers: results from prospective analysis of EMBRACE. J Natl Cancer Inst. 105:812–22. DOI: 10.1093/jnci/djt095. PMID: 23628597.
8. D'Argenio V, Esposito MV, Telese A, Precone V, Starnone F, Nunziato M, et al. 2015; The molecular analysis of BRCA1 and BRCA2: Next-generation sequencing supersedes conventional approaches. Clin Chim Acta. 446:221–5. DOI: 10.1016/j.cca.2015.03.045. PMID: 25896959.
9. Wallace AJ. 2016; New challenges for BRCA testing: a view from the diagnostic laboratory. Eur J Hum Genet. 24 Suppl 1:S10–8. DOI: 10.1038/ejhg.2016.94. PMID: 27514839. PMCID: PMC5141576.
10. Ruiz A, Llort G, Yagüe C, Baena N, Viñas M, Torra M, et al. 2014; Genetic testing in hereditary breast and ovarian cancer using massive parallel sequencing. Biomed Res Int. 2014:542541. DOI: 10.1155/2014/542541. PMID: 25136594. PMCID: PMC4098986.
11. Hajian-Tilaki K. 2014; Sample size estimation in diagnostic test studies of biomedical informatics. J Biomed Inform. 48:193–204. DOI: 10.1016/j.jbi.2014.02.013. PMID: 24582925.
12. Joshi NA, Fass JN. Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33) [Software]. https://github.com/najoshi/sickle. Updated on Jul 2014.
13. Li H, Durbin R. 2009; Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25:1754–60. DOI: 10.1093/bioinformatics/btp324. PMID: 19451168. PMCID: PMC2705234.
14. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. 2011; A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–8. DOI: 10.1038/ng.806. PMID: 21478889. PMCID: PMC3083463.
15. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. https://arxiv.org/abs/1207.3907. Updated on Jul 2012.
16. Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, et al. 2012; A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 6:80–92. DOI: 10.4161/fly.19695. PMID: 22728672. PMCID: PMC3679285.
17. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. 2015; Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 17:405–24. DOI: 10.1038/gim.2015.30. PMID: 25741868. PMCID: PMC4544753.
18. Kechin A, Khrapov E, Boyarskikh U, Kel A, Filipenko M. 2018; BRCA-analyzer: Automatic workflow for processing NGS reads of BRCA1 and BRCA2 genes. Comput Biol Chem. 77:297–306. DOI: 10.1016/j.compbiolchem.2018.10.012. PMID: 30408727.
19. Ermolenko NA, Boyarskikh UA, Kechin AA, Mazitova AM, Khrapov EA, Petrova VD, et al. 2015; Massive parallel sequencing for diagnostic genetic testing of BRCA genes--a single center experience. Asian Pac J Cancer Prev. 16:7935–41. DOI: 10.7314/APJCP.2015.16.17.7935. PMID: 26625824.
20. Suryavanshi M, Kumar D, Panigrahi MK, Chowdhary M, Mehta A. 2017; Detection of false positive mutations in BRCA gene by next generation sequencing. Fam Cancer. 16:311–7. DOI: 10.1007/s10689-016-9955-8. PMID: 27848044.
21. Strom CM, Rivera S, Elzinga C, Angeloni T, Rosenthal SH, Goos-Root D, et al. 2015; Development and validation of a next-generation sequencing assay for BRCA1 and BRCA2 variants for the clinical laboratory. PLoS One. 10:e0136419. DOI: 10.1371/journal.pone.0136419. PMID: 26295337. PMCID: PMC4546651.
22. Capone GL, Putignano AL, Trujillo Saavedra S, Paganini I, Sestini R, Gensini F, et al. 2018; Evaluation of a next-generation sequencing assay for BRCA1 and BRCA2 mutation detection. J Mol Diagn. 20:87–94. DOI: 10.1016/j.jmoldx.2017.09.005. PMID: 29061375.