Beatrice Spedicati, Massimiliano Cocca, Roberto Palmisano, Flavio Faletra, Caterina Barbieri, Margherita Francescatto, Massimo Mezzavilla, Anna Morgan, Giulia Pelliccione, Paolo Gasparini, Giorgia Girotto.
Abstract
Whole genome sequencing (WGS) allows the identification of human knockouts (HKOs), individuals in whom loss of function (LoF) variants disrupt both alleles of a given gene. HKOs are a valuable model for understanding the consequences of gene function loss. Naturally occurring biallelic LoF variants tend to be significantly enriched in "genetic isolates," making these populations specifically suited for HKO studies. In this work, a meticulous WGS data analysis combined with an in-depth phenotypic assessment of 947 individuals from three Italian genetic isolates led to the identification of ten biallelic LoF variants in ten OMIM genes associated with known autosomal recessive diseases. Notably, only a minority of the identified HKOs (C7, F12, and GPR68 genes) displayed the expected phenotype. For most of the genes (ACADSB, FANCL, GRK1, LGI4, MPO, PGAM2, and RP1L1), instead, the carriers showed none or few of the signs and symptoms typically associated with the related diseases. Of particular interest is a case presenting with a FANCL biallelic LoF variant and a positive diepoxybutane test but lacking a full Fanconi anemia phenotypic spectrum. Identifying KO subjects displaying expected phenotypes suggests that the lack of correct genetic diagnoses may lead to inappropriate and delayed treatment. In contrast, the presence of HKOs with phenotypes deviating from the expected patterns underlines how LoF variants may be responsible for broader phenotypic spectra. Overall, these results highlight the importance of in-depth phenotypical characterization to understand the role of LoF variants and the advantage of studying these variants in genetic isolates.
Year: 2021 PMID: 33727708 PMCID: PMC8384846 DOI: 10.1038/s41431-021-00850-9
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Fig. 1 Study workflow.
The scheme represents the overall workflow of the study. WGS and clinical data from Italian isolated populations were combined to select putative HKOs. When the genotype–phenotype correlation is positive, preventive strategies could be applied; when it is negative, functional studies are needed to clarify the role of the identified variant.
Selected loss of function variants.
| Gene | Chromosome | HGVS genomic nomenclature | HGVS coding DNA nomenclature | rsID | gnomAD frequency | Total/partial: n° KO coding transcripts/n° coding transcripts | Identified subjects | Age (years) |
|---|---|---|---|---|---|---|---|---|
| C7 | 5 | NC_000005.9:g.40980013T>C | NM_000587.2:c.2350+2T>C | rs201240159 | 0.00028 | Total: 1/1 | Individual_1 | 55 |
| F12 | 5 | NC_000005.9:g.176829461C>T | NM_000505.3:c.1681-1G>A | rs199988476 | 0.00039 | Total: 1/1 | Individual_2 | 79 |
| GPR68 | 14 | NC_000014.8:g.91700389C>A | NM_003485.3:c.1006G>T | rs61745752 | 0.00092 | Total: 4/4 | Individual_3 | 68 |
| ACADSB | 10 | NC_000010.10:g.124797364G>A | NM_001609.3:c.303+1G>A | rs147936696 | 0.00027 | Partial: 2/3 | Individual_4, Individual_5 | 82*, 86 |
| FANCL | 2 | NC_000002.11:g.58468447A>G | NM_018062.3:c.2T>C | rs761291501 | 0.00005 | Partial: 6/7 | Individual_6 | 74 |
| GRK1 | 13 | NC_000013.10:g.114322401G>A | NM_002929.2:c.699+1G>A | rs1191610272 | 0 | Total: 1/1 | Individual_7 | 69* |
| LGI4 | 19 | NC_000019.9:g.35622287del | ENST00000591633.1:c.636del | rs770752678 | 0.00003 | Partial: 1/4 | Individual_8 | 75* |
| MPO | 17 | NC_000017.10:g.56350831_56350844del | NM_000250.1:c.1552_1565del | rs536522394 | 0.00078 | Total: 3/3 | Individual_9 | 77 |
| PGAM2 | 7 | NC_000007.13:g.44104494del | NM_000290.3:c.532del | rs747947171 | 0.00004 | Total: 1/1 | Individual_10 | 84 |
| RP1L1 | 8 | NC_000008.10:g.10480385_10480386insA | NM_178857.5:c.326_327insT | rs771427543 | 0.00143 | Total: 1/1 | Individual_11 | 70 |
Gene: genes carrying the selected variants. NM_ refers to the canonical transcript of each gene, when the variant is also reported on the canonical transcript. HGVS genomic nomenclature: variant description according to the Human Genome Variation Society recommendations for linear genomic reference sequences; genomic data are aligned to the GRCh37/hg19 reference sequence. HGVS coding DNA nomenclature: variant description according to the Human Genome Variation Society recommendations for coding DNA reference sequences. rsID: Reference SNP cluster ID; rsIDs are updated to the latest dbSNP build (154). gnomAD frequency: total allele frequency of the variant as reported in gnomAD. Total/partial: each LoF variant has been classified as “Total” if it affects all coding transcripts of a gene or as “Partial” if it affects only some coding transcripts; n° KO coding transcripts/n° coding transcripts: number of coding transcripts for which the variant is a LoF over the total number of coding transcripts of each gene. Identified subjects: HKO identification number. Age: age of identified subjects at follow-up (2019); individuals marked with an asterisk are deceased and age at first examination is reported.
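The Total/partial rule described in the legend can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the authors' code; the function name `classify_lof` and its parameters are hypothetical.

```python
def classify_lof(n_ko_transcripts: int, n_coding_transcripts: int) -> str:
    """Label a LoF variant as in the legend: 'Total' if it is a LoF on every
    coding transcript of the gene, 'Partial' if only on some of them.

    Hypothetical helper; the argument names are assumptions for illustration.
    """
    if n_coding_transcripts <= 0:
        raise ValueError("a gene needs at least one coding transcript")
    if not 0 < n_ko_transcripts <= n_coding_transcripts:
        raise ValueError("KO transcript count must be between 1 and the total")
    label = "Total" if n_ko_transcripts == n_coding_transcripts else "Partial"
    # Same 'label: KO/total' layout as the table column.
    return f"{label}: {n_ko_transcripts}/{n_coding_transcripts}"
```

With the transcript counts from the table, `classify_lof(4, 4)` gives the GPR68 entry and `classify_lof(1, 4)` gives the LGI4 entry.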
Fig. 2 Homozygous loss of function variants selection process.
The scheme highlights the variant selection and filtering workflow. WGS data were obtained for 947 samples and processed with GATK best practice pipelines to generate variant calls for both SNPs and INDELs. After functional annotation with the VEP tool, 506 LoF variants with at least one homozygous carrier and CADD score ≥20 were selected. Restricting those variants to OMIM disease-associated genes reduced the target list to 62 variants. A minor allele frequency upper limit of 1%, according to data from the gnomAD database (gnomad_AF field), was then applied to retain only low-frequency variants. A final filtering step selected variants in genes causative of autosomal recessive (AR) disorders. The resulting variants were confirmed by Sanger sequencing.
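The filtering cascade in this caption can be sketched as successive list filters. This is a minimal illustration under stated assumptions, not the study's pipeline: the `Variant` record, its field names, and the toy records are all hypothetical; only the thresholds (CADD ≥20, gnomAD allele frequency up to 1%) and the order of the steps come from the caption. The 1% limit is treated as inclusive here, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    # Hypothetical annotated-variant record; field names are illustrative.
    variant_id: str
    is_lof: bool          # predicted loss of function after annotation
    n_hom_carriers: int   # homozygous carriers in the cohort
    cadd_phred: float     # CADD score
    gene_in_omim: bool    # gene is OMIM disease-associated
    gnomad_af: float      # gnomAD total allele frequency
    inheritance: str      # e.g. "AR" or "AD"


def select_candidates(variants: List[Variant],
                      cadd_min: float = 20.0,
                      max_af: float = 0.01) -> List[Variant]:
    """Apply the four filters in the order given in the caption."""
    # 1. LoF variants with at least one homozygous carrier and CADD >= 20
    kept = [v for v in variants
            if v.is_lof and v.n_hom_carriers >= 1 and v.cadd_phred >= cadd_min]
    # 2. Keep variants in OMIM disease-associated genes
    kept = [v for v in kept if v.gene_in_omim]
    # 3. Keep low-frequency variants (AF upper limit of 1%, assumed inclusive)
    kept = [v for v in kept if v.gnomad_af <= max_af]
    # 4. Keep variants in genes causing autosomal recessive disorders
    return [v for v in kept if v.inheritance == "AR"]


# Toy records (invented for illustration, not study data).
toy = [
    Variant("v1", True, 1, 25.0, True, 0.0003, "AR"),   # passes every step
    Variant("v2", True, 0, 30.0, True, 0.0003, "AR"),   # no homozygous carrier
    Variant("v3", True, 2, 15.0, True, 0.0003, "AR"),   # CADD below 20
    Variant("v4", True, 1, 25.0, False, 0.0003, "AR"),  # gene not in OMIM
    Variant("v5", True, 1, 25.0, True, 0.05, "AR"),     # too common in gnomAD
    Variant("v6", True, 1, 25.0, True, 0.0003, "AD"),   # not autosomal recessive
]
kept_ids = [v.variant_id for v in select_candidates(toy)]
```

Variants surviving these filters would still need orthogonal confirmation, which the study performed by Sanger sequencing.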
OMIM autosomal recessive diseases description and HKOs phenotypical features.
| Gene | OMIM disease (MIM number) | Expected phenotypical features | Detected phenotypical features |
|---|---|---|---|
| C7 | C7 deficiency (#610102) | Increased susceptibility to systemic infections | Meningococcal meningitis, pericarditis, pneumonia, soft tissue infection |
| F12 | Factor XII deficiency (#234000) | Prolonged APTT | Prolonged APTT |
| GPR68 | Amelogenesis imperfecta, hypomaturation type, IIA6 (#617217) | Enamel abnormalities, multiple caries | Multiple caries and recurrent tooth decay |
| ACADSB | 2-methylbutyrylglycinuria (#610006) | Developmental delay and neurological signs | – |
| FANCL | Fanconi anemia, complementation group L (#614083) | Bone marrow failure, skeletal abnormalities, increased cancer risk | Head and neck carcinoma, short stature |
| GRK1 | Oguchi disease type 2 (#613411) | Night blindness | – |
| LGI4 | Arthrogryposis multiplex congenita, neurogenic, with myelin defect (#617468) | Neurogenic defect with poor or absent myelin formation around peripheral nerves; prenatal onset; usually lethal in utero or in early childhood | – |
| MPO | Myeloperoxidase deficiency (#254600) | Candidiasis | – |
| PGAM2 | Glycogen storage disease X (#261670) | Muscle cramps, exercise intolerance, elevated serum creatine phosphokinase, myoglobinuria | – |
| RP1L1 | Retinitis pigmentosa 88 (#618826) | Decreased visual acuity | – |
OMIM disease: autosomal recessive diseases associated with variants in the selected genes; MIM reference numbers are given in brackets. Expected phenotypical features: main clinical features associated with each specific syndrome. Detected phenotypical features: clinical presentations of the identified HKOs.
Fig. 3 Diepoxybutane (DEB) chromosome fragility test.
The figure shows the outcome of the DEB test performed on the carrier of the NM_018062.3:c.2T>C variant in the FANCL gene. Red arrows indicate the typical triradial and quadriradial chromosome breakage patterns.
Fig. 4 LGI4 expression, protein-coding transcripts.
The y axis shows the logarithmic value of LGI4 gene expression for protein-coding transcripts, measured in transcripts per million (TPM); the x axis shows the four LGI4 protein-coding transcripts. The red box marks the transcript of interest (ENST00000591633.1).