Genetics of Alzheimer's Disease

Jong Hun Kim

doi:10.12779/dnd.2018.17.4.131

Abstract

Alzheimer's disease (AD)-related genes have been elucidated by advanced genetic techniques. The familial autosomal dominant AD genes found by linkage analyses are APP, PSEN1, PSEN2, ABCA7, and SORL1. Genome-wide association studies have found risk genes such as ABCA7, BIN1, CASS4, CD33, CD2AP, CELF1, CLU, CR1, DSG2, EPHA1, FERMT2, HLA-DRB5-HLA-DRB1, INPP5D, MEF2C, MS4A6A/MS4A4E, NME8, PICALM, PTK2B, SLC24A4, SORL1, and ZCWPW1. ABCA7, SORL1, TREM2, and APOE have been shown to have high odds ratios (>2) for AD risk in next-generation sequencing studies. Promising genetic techniques such as CRISPR-CAS9 and single-cell RNA sequencing have opened a new era in genetics. CRISPR-CAS9 can directly link genetic knowledge to future treatment. Single-cell RNA sequencing is providing useful information on cell biology and the pathogenesis of diverse diseases.

INTRODUCTION

New genetic techniques that decode the human genome have revealed risk genes for various diseases. These discoveries have led to many advances in the diagnosis, prediction, and understanding of diseases. The genetic pathogenesis of Alzheimer's disease (AD) has also been revealed over the last decade. This review introduces basic genetic concepts, the AD genes revealed by each genetic technique, and promising genetic research techniques other than sequencing.

ALLELE, MUTATION, SINGLE NUCLEOTIDE POLYMORPHISM (SNP), AND VARIANT

An “allele” is one of the alternative forms of a particular gene or genomic site. Alleles can be classified as major or minor: a minor allele is the form that is less frequent in the population, while the major allele is more frequent. Minor allele frequency (MAF) is the frequency of the minor allele in the population. MAF can be calculated by the following equation.

$$\mathrm{MAF} = \frac{\text{heterozygotes} + 2 \times \text{homozygotes}}{2 \times \text{individuals}}$$

The MAF should be understood as the frequency of the allele among chromosomes, not among individuals. For example, an MAF of 0.3 does not mean that 30% of the population have the minor allele. Each individual has 2 copies of a chromosome, 1 inherited from the mother and 1 from the father, so individuals homozygous for the minor allele are counted twice when counting alleles. An “SNP” means that 1 DNA nucleotide is replaced with another DNA nucleotide. The word polymorphism can also imply that the MAF is more than 1%. A “mutation” usually refers to a variant with an MAF of less than 1%, and the term does not imply any particular length of changed DNA. “Variant” covers all the concepts of allele, SNP, and mutation described above. Allele, mutation, SNP, and variant are not strictly distinguished and are often used interchangeably. However, if 2 or more nucleotides are changed, the term SNP cannot be used.
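As a worked illustration of why MAF counts chromosomes rather than individuals, the following minimal Python sketch applies the MAF equation above to a hypothetical sample. The function names and the genotype counts (100 individuals, 42 heterozygotes, 9 minor-allele homozygotes) are assumptions chosen for illustration and do not come from any study.

```python
def minor_allele_frequency(n_heterozygotes, n_minor_homozygotes, n_individuals):
    """MAF = (heterozygotes + 2 x homozygotes) / (2 x individuals)."""
    minor_alleles = n_heterozygotes + 2 * n_minor_homozygotes
    total_alleles = 2 * n_individuals  # two chromosome copies per individual
    return minor_alleles / total_alleles


def carrier_fraction(n_heterozygotes, n_minor_homozygotes, n_individuals):
    """Fraction of individuals carrying at least one minor allele."""
    return (n_heterozygotes + n_minor_homozygotes) / n_individuals


# Hypothetical sample (illustrative numbers only).
maf = minor_allele_frequency(42, 9, 100)
carriers = carrier_fraction(42, 9, 100)
print(f"MAF = {maf:.2f}, carriers = {carriers:.2f}")
```

In this hypothetical sample the MAF is 0.30, yet 51% of individuals carry at least 1 minor allele, which is why the two quantities should not be confused.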

COMMON VS. RARE ALLELE

In genetic studies, the MAF in the population is important because both the clinical significance of a variant and the appropriate research method differ according to MAF. Common alleles are alleles with an MAF greater than 1% and can be studied using genome-wide association studies (GWAS). Rare alleles are those with an MAF of less than 1% and are studied using next-generation sequencing. For a given sample size, the statistical power can be calculated from the MAF and the odds ratio (OR). The higher the MAF and the OR, the higher the statistical power.
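The dependence of power on MAF and OR can be sketched numerically. The example below is a rough normal approximation for a two-sided test comparing allele frequencies between cases and controls. It is not the power calculation used in any cited study. The function names, the sample sizes (5,000 cases and 5,000 controls), the significance level, and the MAF and OR values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_frequency(maf_controls, odds_ratio):
    """Minor allele frequency expected in cases for a given allelic OR."""
    odds_in_cases = odds_ratio * maf_controls / (1 - maf_controls)
    return odds_in_cases / (1 + odds_in_cases)


def allelic_test_power(maf_controls, odds_ratio, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    p0 = maf_controls
    p1 = case_allele_frequency(p0, odds_ratio)
    # Each individual contributes 2 alleles to the allele count.
    standard_error = sqrt(
        p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)
    )
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf(abs(p1 - p0) / standard_error - z_critical)


# Hypothetical scenarios: (MAF in controls, OR).
for maf, odds_ratio in [(0.01, 1.3), (0.20, 1.3), (0.20, 1.5)]:
    power = allelic_test_power(maf, odds_ratio, n_cases=5000, n_controls=5000)
    print(f"MAF={maf:.2f}, OR={odds_ratio:.1f}: power ~ {power:.3f}")
```

Under these assumptions, the rare variant has almost no power at the same OR and sample size, while raising either the MAF or the OR increases power.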

GWAS

GWAS is based on a technology that can detect hundreds of thousands of variants at once using DNA chips. DNA chips can only detect variants at predetermined sites. Therefore, GWAS is conducted for common alleles. If a DNA chip were designed for rare alleles, little information would be obtained, because most of the results would be monomorphic. During meiosis, chromosomes undergo homologous recombination, which gives more diversity to the offspring. Nucleotides located close to each other tend to be inherited together after homologous recombination. These highly correlated DNA segments are called linkage disequilibrium blocks. Therefore, variants that show a significant association with a disease in GWAS results may themselves be causal, but they may also be merely markers that are associated with the causal variants. After performing GWAS, it is necessary to sequence the region around the significant markers to find the most significant variants. Before and after the GWAS, it is necessary to perform several quality control tasks to minimize false associations. Finally, significant results are selected after correction for multiple testing. Variants with p value <5×10^–8 (0.05/10⁶) are recognized as significant in GWAS.
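The genome-wide significance threshold quoted above is a Bonferroni-style correction: the nominal significance level of 0.05 divided by roughly 10⁶ tests. A minimal Python sketch of this arithmetic follows. The variant identifiers and p values are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# 0.05 divided by about one million tests.
GWAS_THRESHOLD = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical p values keyed by made-up variant identifiers.
p_values = {"variant_a": 3.2e-9, "variant_b": 4.0e-6, "variant_c": 0.03}

significant = sorted(
    name for name, p_value in p_values.items() if p_value < GWAS_THRESHOLD
)
print(f"threshold = {GWAS_THRESHOLD:.1e}, significant = {significant}")
```

Only the first hypothetical variant passes the threshold. The second would look striking in a single test but is not genome-wide significant.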

The first GWAS was published in 2005 and found that CFH plays an important role in age-related macular degeneration.1 GWASs of AD have also been performed and have identified several risk genes. The largest study to date was performed by Lambert et al.2 in 2013 and involved 74,046 people from the International Genomics of Alzheimer's Project (IGAP). No larger study has been performed since then. The authors identified 20 genes (ABCA7, BIN1, CASS4, CD33, CD2AP, CELF1, CLU, CR1, DSG2, EPHA1, FERMT2, HLA-DRB5-HLA-DRB1, INPP5D, MEF2C, MS4A6A/MS4A4E, NME8, PICALM, PTK2B, SLC24A4, SORL1, and ZCWPW1) associated with AD.

NEXT-GENERATION SEQUENCING (NGS)

NGS is a technique that is dramatically more efficient than earlier sequencing techniques and can decipher an individual's genome quickly. Because NGS reads the DNA sequence directly, it can be used to study both common and rare alleles. However, it is still more expensive than GWAS and is mainly used to study rare alleles. In 2013, TREM2 was identified as an AD risk gene using NGS.3 In addition, the A673T mutation of the APP gene, a gene known to cause autosomal dominant familial AD, was identified as a mutation that protects against AD. Persons with this mutation had low production of amyloid β_1-42, and APOE ε4 homozygotes were able to survive without dementia if they carried the A673T mutation in APP. This result offered hope for anti-amyloid treatment.4

Studies targeting rare alleles have low statistical power and try to overcome this problem using gene-based analysis. The frequency can be increased by adding together the rare alleles that occur within the boundaries of a gene. Another advantage of gene-based analysis is that the multiple-testing threshold is less stringent: p value = 0.05/20,000 (number of genes in the human genome) = 2.5×10^–6. However, the disadvantage is that the variants used in the analysis are a mixture of risk, protective, and neutral variants, which may offset one another.
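The idea of adding up rare alleles within a gene can be sketched as a simple collapsing step, in which each individual is scored as a carrier if any rare variant in the gene is present. The Python example below is an illustrative sketch only, not the method of any cited study. The genotype data are invented, the function names are hypothetical, and no significance test is computed.

```python
def is_carrier(genotypes):
    """True if the individual carries a minor allele at any rare variant in the gene."""
    return any(allele_count > 0 for allele_count in genotypes)


def collapse_gene(cases, controls):
    """Build a 2x2 carrier table: (case carriers, case non-carriers,
    control carriers, control non-carriers)."""
    case_carriers = sum(is_carrier(g) for g in cases)
    control_carriers = sum(is_carrier(g) for g in controls)
    return (
        case_carriers,
        len(cases) - case_carriers,
        control_carriers,
        len(controls) - control_carriers,
    )


def odds_ratio(table):
    """Odds ratio of the 2x2 table; undefined if a denominator cell is zero."""
    a, b, c, d = table
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (a * d) / (b * c)


# Invented data: minor allele counts at 3 rare variants of one gene.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0], [1, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]

table = collapse_gene(cases, controls)
gene_based_threshold = 0.05 / 20_000  # about 20,000 genes in the human genome
print(table, odds_ratio(table), f"{gene_based_threshold:.1e}")
```

No single variant is carried by more than 2 of the invented cases, but after collapsing, 4 of 6 cases versus 1 of 6 controls are carriers. The sketch also shows the weakness noted above: a protective or neutral variant in the gene would be counted in exactly the same way as a risk variant.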

Kári Stefánsson's team at deCODE genetics in Iceland, which discovered TREM2 and the protective mutation in APP, uncovered a new powerful AD risk gene with an interesting approach. In 2010, Dickson et al.5 proposed the synthetic association hypothesis in order to explain why the genes found by GWAS account for so little heritability. According to this hypothesis, various rare alleles may be synthetically associated with the common variants that are significant in GWAS, and these rare alleles are the real cause of the diseases. Stefánsson's team tested the synthetic association hypothesis and relaxed the multiple-testing threshold by targeting only the AD genes found in the large-scale GWAS of IGAP.2 In addition, they performed gene-based analysis using only loss-of-function variants in order to overcome the disadvantage of gene-based analysis, and found that ABCA7 is an AD risk gene with a high OR.6

FAMILIAL & EARLY-ONSET AD (EOAD) GENES

APP, PSEN1, and PSEN2 are well-known autosomal dominant familial AD genes. Mutations in these genes are found in about 70% of families with more than 1 EOAD patient, but account for only 20% of sporadic EOAD in patients younger than 50 years of age.7 ABCA7 and SORL1 can cause autosomal dominant AD depending on the type of mutation.6 7 8 9 TREM2 is also known to be associated with EOAD.10

POLYGENIC RISK SCORES (PRSs)

The heritability of AD is about 70%, but only 30% of the AD heritability can be explained by the known genes.11 This inability of known genes to account for heritability is common in complex genetic diseases/traits. Purcell et al. published influential results in 2009 from a schizophrenia GWAS.12 They showed that a large number of variants with p value <0.5 were, in aggregate, highly associated with schizophrenia and bipolar disorder. Variants with small effect sizes showed additive effects on disease and explained the expected heritability of the complex genetic disease/trait.13

A PRS study of AD was first published in 2015,14 in which the IGAP study, a large-scale GWAS for AD,2 was divided into a discovery set and a replication set. The PRS was constructed by summing the β values of the variants with p value <0.5 that were selected in the discovery set, weighted by the number of risk alleles each individual carried. The PRS showed a strong association with AD and had predictive value in the replication set.
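The construction of a PRS can be sketched as a weighted sum over variants that pass the p value threshold in the discovery set. The Python example below is a minimal illustration under that assumption. The variant identifiers, β values, p values, and allele counts are invented, and real PRS pipelines include further steps (such as pruning of correlated variants) that are not shown here.

```python
def polygenic_risk_score(allele_counts, discovery_stats, p_threshold=0.5):
    """Sum of beta x allele count over variants with p value below the threshold.

    allele_counts: variant id -> number of effect alleles carried (0, 1, or 2)
    discovery_stats: variant id -> (beta, p value) from the discovery set
    """
    score = 0.0
    for variant, (beta, p_value) in discovery_stats.items():
        if p_value < p_threshold:
            score += beta * allele_counts.get(variant, 0)
    return score


# Invented discovery-set statistics: variant id -> (beta, p value).
discovery_stats = {
    "v1": (0.20, 1e-9),
    "v2": (-0.10, 0.04),
    "v3": (0.05, 0.30),
    "v4": (0.30, 0.80),  # excluded because p value >= 0.5
}

# Invented genotype of one individual from the replication set.
individual = {"v1": 2, "v2": 1, "v3": 0, "v4": 2}

score = polygenic_risk_score(individual, discovery_stats)
print(f"PRS = {score:.2f}")
```

For this invented individual the score is 0.20 × 2 − 0.10 × 1 + 0.05 × 0 = 0.30. The fourth variant contributes nothing because its p value exceeds the threshold.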

POPULAR GENETICS TECHNIQUES

CRISPR-CAS9

The fascinating thing about genetics is that its findings are crucial to uncovering the pathogenesis of disease. In AD, the discovery of PSEN1, PSEN2, APP, APOE, and TREM2 played a crucial role in establishing the pathogenesis of AD. Beyond uncovering pathogenesis, CRISPR-CAS9 has opened the possibility of applying genetic research results directly to therapy. CRISPR-CAS9 is a third-generation gene scissors, which has the advantage of being highly accurate and efficient compared to the previous generations of gene scissors. Also, it can be prepared within a day in a laboratory.15

Currently, the most widely used CRISPR-CAS9 is derived from Streptococcus pyogenes. CRISPR-CAS9 contains a guide RNA, which directs it to a specific site in the genome. The targeting sequence of the Streptococcus pyogenes guide RNA is 20 bases long, and researchers can modify this sequence to change the site of action of CRISPR-CAS9. The primary role of CRISPR-CAS9 is to cut DNA. After the DNA is cut, the host cell repairs the break, and this repair can introduce mutations in 2 ways. First, non-homologous end joining (NHEJ) causes a deletion of 0–10 base pairs when it rejoins the broken DNA ends. If the repair occurs without a deletion, the target site is restored and CRISPR-CAS9 cuts it again because of its high efficiency, so only deletions of 1–10 base pairs are observed. Because 3 DNA bases encode 1 amino acid, deletions whose length is not a multiple of 3 shift the reading frame, and gene knock-out due to frameshift occurs in about 60% of cases when an exon is targeted. Second, homology-directed repair (HDR) can introduce mutations. If an external DNA oligonucleotide is supplied that matches the sequences flanking the cut site and carries a different nucleotide segment in the middle, the new segment is incorporated when the cut is repaired. In cells, NHEJ is more frequent than HDR, which makes intended mutagenesis difficult. Recent research is addressing this problem.16 In addition, off-target effects need to be resolved before CRISPR-CAS9 can be applied in clinical settings.
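How a 20-base guide sequence determines the site of action can be illustrated with a simple search for candidate target sites in a DNA sequence. The sketch below relies on one assumption that is not stated in the text above: Streptococcus pyogenes Cas9 is generally described as requiring an NGG motif (the protospacer adjacent motif, PAM) immediately 3′ of the 20-base target. The function name and the example sequence are hypothetical, only the forward strand is scanned, and off-target effects are not assessed.

```python
def find_candidate_targets(sequence, guide_length=20):
    """Return (position, 20-base target, PAM) for each site followed by NGG.

    Assumption (not from the text): an NGG PAM must directly follow the target.
    Only the given (forward) strand is scanned.
    """
    seq = sequence.upper()
    pam_length = 3
    targets = []
    for start in range(len(seq) - guide_length - pam_length + 1):
        target = seq[start:start + guide_length]
        pam = seq[start + guide_length:start + guide_length + pam_length]
        if pam[1:] == "GG":  # N can be any base, followed by GG
            targets.append((start, target, pam))
    return targets


# Made-up sequence for illustration: 20 bases, then the PAM "TGG", then filler.
example = "ACGTTGCATCAGCTAGCTAA" + "TGG" + "CCC"
for position, target, pam in find_candidate_targets(example):
    print(position, target, pam)
```

Changing the 20-base sequence of the guide RNA to match a different candidate target is what redirects CRISPR-CAS9 to a new site.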

Single-cell RNA sequencing

The genomes of all cells in a person are the same; however, the expressed proteins differ according to tissue and cell type. Therefore, in order to observe actual events in a cell, DNA studies are of limited help and transcribed RNA is more informative. RNA can be reverse-transcribed into DNA and sequenced. With NGS, RNA can be quantified and the kinds of proteins produced in a cell can be inferred. In recent years, a technique called multiple displacement amplification (MDA) has been developed that amplifies very small amounts of nucleic acid using a bacteriophage DNA polymerase.17 MDA made single-cell RNA sequencing possible. The advantage of single-cell RNA sequencing is that differences between individual cells can be resolved rather than averaged out, so the results obtained are more refined. Single-cell RNA sequencing is widely used in a variety of research fields. Scientists are trying to build a human cell atlas by analyzing the RNA profiles of all cells in all tissues using single-cell RNA sequencing.18

CONCLUSIONS

Much research has been carried out on AD, and most of what is known to date is based on genetic findings. This shows the power of genetics. Based on genetic knowledge of AD, scientists are identifying a growing number of therapeutic targets.16 19 20 21 22