Literature DB >> 35627305

A Study of the Genomic Variations Associated with Autistic Spectrum Disorders in a Russian Cohort of Patients Using Whole-Exome Sequencing.

Ekaterina A Gibitova1, Pavel V Dobrynin1,2,3, Ekaterina A Pomerantseva4, Elizaveta V Musatova4, Anna Kostareva5,6, Igor Evsyukov1, Sergey Y Rychkov3, Olga V Zhukova3, Oxana Y Naumova3,7,8, Elena L Grigorenko2,7,8,9.   

Abstract

This study provides new data on the whole-exome sequencing of a cohort of children with autistic spectrum disorders (ASD) from an underexplored Russian population. Using both a cross-sectional approach involving a control cohort of the same ancestry and an annotation-based approach involving relevant public databases, we explored exonic single nucleotide variants and copy-number variation potentially involved in the manifestation of ASD. The study results reveal new potential ASD candidate-variants found in the studied Russian cohort and show a high prevalence of common ASD-associated genomic variants, especially those in the genes known to be associated with the manifestation of intellectual disabilities. Our screening of an ASD cohort from a previously understudied population allowed us to flag at least a few novel genes (IGLJ2, FAM21A, OR11H12, HIP1, PRAMEF10, and ZNF717) regarding their potential involvement in ASD.

Entities:  

Keywords:  Autistic Spectrum Disorder; Russian cohort; copy number variation; whole-exome sequencing

Mesh:

Year:  2022        PMID: 35627305      PMCID: PMC9141003          DOI: 10.3390/genes13050920

Source DB:  PubMed          Journal:  Genes (Basel)        ISSN: 2073-4425            Impact factor:   4.141


1. Introduction

Autism spectrum disorders (ASDs) represent a broad spectrum of neurodevelopmental conditions characterized by severe impairment in social interactions and communication skills and the manifestation of restricted, stereotypical behaviors. In 50–70 percent of cases, an ASD co-occurs with intellectual disability [1,2,3]. In addition, multiple other comorbid medical, psychiatric, neurological, and psychological conditions are commonly observed in ASD [3,4,5]. Emerging in early childhood, ASD are lifespan disorders that can be highly disabling [2]. ASD are high-incidence disorders. With slight regional variations, a worldwide average of ASD prevalence was estimated at ~7.6 per 1000 or one in 132 persons in 2010 [6]. As new practices for identifying ASD continue to be developed, these estimates tend to increase. For example, in the USA, the 2016 prevalence estimate of 18.5 per 1000 children was 2.8 times higher than that reported in 2000–2002 [7]. The etiology of ASD is not fully understood. Similar to many other developmental disorders, genetic risk, environmental exposure, and their interplay appears to contribute to the causal landscape of ASD [2,8,9,10]. The genetic contribution to ASD is strongly supported by twin and family studies [9,11,12,13]. A recent large-scale multinational population-based investigation has demonstrated a remarkably high heritability of ASD of 80% [10]. The empirical literature provides a growing body of evidence that the genetic etiology of ASD might be linked to a complex architecture of rare, often pathogenic, de novo mutations of large effects and/or the cumulative impact of multiple common variants with small effects [14,15,16]. Both the rare occurrence of the former and the small effect sizes of the latter underlie the main challenges in detecting ASD-associated genetic factors; addressing these challenges would require a considerable increase in sample size, an extension of the research geography, and the involvement of multinational patient cohorts. Over the past decade, research on ASD—ASD prevalence, the spectrum variability, and the contribution of environment and genetics to ASD etiology—has actively extended to previously less outreached minority groups: various racial, ethnical, and cultural communities. This research has revealed reduced differences in ASD prevalence for different ethnic cohorts destroying the stereotype of the significant prevalence of ASD in white populations and attributing this difference in prevalence mainly to limited access to healthcare services and screenings for minority communities; see, e.g., [17,18]. While the ASD prevalence can be relatively similar across various ethnic and cultural groups, the frequencies of different syndromes and disorders falling under the ASD umbrella, the disorders’ severity, and comorbid health conditions seem to vary across populations. While the ASD prevalence can be relatively similar across various ethnic and cultural groups, the frequencies of different syndromes and disorders falling under the ASD umbrella, the disorders’ severity, and the patterns of comorbid health conditions seem to vary across populations [19,20,21]. Collectively, the studies reporting on such phenotypic variability in ASD among various ethnic groups have attributed this to differences in both the cultural environment and the genetic background. The latter may shape the population-specific pattern of ASD via (1) elevated frequencies of some common ASD-related genetic variants, e.g., as has been shown for several SNPs in the SCN2A, FOXP1, and SYNGAP1 genes in indigenous American populations in the Amazon [22], rare population-specific causal variants, and (3) additive effects of common disease-associated variants and rare, likely pathogenic variants in the genetic background [23]. Thus, exploring ASD-associated genetic variants in various ethnic and geographic populations may increase diversity in reference genetic databases connecting specific variants to ASD and comorbid conditions that, in turn, may significantly improve genetic testing and its interpretation. To fill a ‘geographical’ gap, here we report data from a genomic study of a Russian cohort of children with ASD using exome sequencing.

2. Materials and Methods

2.1. Participants

The study participants were 193 children with ASD (mean age = 7 ± 4 years; 18 girls and 175 boys), ascertained through the ‘Genetico’ Center for Genetics and Reproductive Medicine (Moscow, Russia). The children were enrolled in a project focused on the genetic screening of ASD in a west-Russian population, based on ASD diagnoses established by child psychiatrists. The diagnostics of ASD in Russia are carried out in accordance with the Clinical Recommendations for Autism Spectrum Disorders (ID:594) issued by the Association of Psychiatrists and Psychologists for Evidence-Based Practice and approved by the Ministry of Health of the RF (https://cr.minzdrav.gov.ru/recomend/594_1, accessed on 15 May 2022). The diagnosis of ASD was established following the Autism Diagnostic Observation Schedule Second Edition (ADOS-II) based on the clinical picture and developmental history characteristics of ASD—a combination of symptoms of qualitative disorders of social interaction, communication, and limited, stereotyped, repetitive behavior. According to the Clinical Guidelines, the following tools to screen for the ASD symptoms were used: Checklist for Autism Spectrum Disorders (CASD), Social Communication Questionnaire (SCQ), and Autism Diagnostic Interview-Revised (ADI-R). The children’s demographics and comorbid diagnoses, along with the participants’ family histories, are shown in Supplementary Table S1. The data on the clinical features in ASD individuals are summarized in Table 1. These data were collected from the children’s medical records and interviews with parents. In addition, 51 individuals without ASD from the west-Russian general population enrolled via the Almazov National Medical Research Centre (St. Petersburg, Russia) were involved in the study as a comparison group; hereafter, they are referred to as nonASD. The nonASD comparison group included children and adults (mean age = 23 ± 16 years; 29 females and 23 males) who did not have a lifetime or current medical history of ASD, and any neurodevelopmental and psychiatric disorders, as per the interview with the participants and their medical records.
Table 1

Summarized data on clinical features of ASD individuals derived from the participants’ medical records.

PhenotypeOccurrenceFrequency (%)
Syndromes/conditions:
Fragile X syndrome52.59
Epilepsy21.04
Angelman syndrome10.52
Asperger’s syndrome10.52
Ehlers–Danlos syndrome10.52
Phelan–McDermid syndrome10.52
Autoaggression10.52
Macrocephaly10.52
Large head (probably macrocephaly)10.52
Microcephaly10.52
Brachycephaly10.52
Dyspepsia10.52
Macrosomia10.52
Hygroma10.52
Neutropenia10.52
Other affected anatomical systems and structures:
Skin(hypopigmentation; «coffee» stains; intra-areolar polythelia; inverted nipples; hypertrichosis; unusual hair growth; skin prone to scarring; transverse palmar fold; hemangioma on the arm, vascular mesh on the chest)115.70
Palpebral fissures(epicanthus; lower epicanthus; slightly elongated palpebral fissures; antimongoloid slanting palpebral fissures; very long eyelashes)94.66
Ears(macrotia; protruding auricles; dysplastic and low-set auricles; double helix; notches on both earlobes, asymmetric auricles; deformation of the right auricle upper edge; preauricular fossa of the left ear)84.15
Central Nervous System(focal cortical dysplasia; corpus callosum dysplasia; cerebral palsy, strabismus, ventricular dilatation, hippocampal hypoplasia; formations in the brain; stereotypical shaking of hands; ataxia, unusual hand movements; premature puberty)73.63
Nose(short nose, slightly twisted nostrils, depressed nose bridge; upturned nose; wide nose; nasal bridge folds; low columella; wide nose bridge; sunken nose bridge)73.63
Forehead(protruding frontal bones; high forehead)63.11
Orbits(deep-set eyes; hypotelorism)63.11
Connective tissue(joint hypermobility, skin hyperelasticity; connective tissue dysplasia; hereditary connective tissue disorder; severe myopia, marfanoid habitus)63.11
Fingers/toes(clinodactyly; an additional right thumb phalanx; wide terminal phalanges of fingers and toes)52.59
Face(“elfin-like” facial features; facial dysmorphisms; broad face; dysplastic face)42.07
Jaws(high palate, malocclusion, uneven teeth; macrognathia; absence of two lower incisors)42.07
Muscles(hypotonia, lack of tripod grasp; clumsy walking and movements; walking on tiptoes)42.07
Midface(midfacial hypoplasia)21.04
Torso(funnel chest, scoliosis)21.04

Note. Individual data are represented in Supplementary Table S1. All syndromes have been recorded as suspected; only for two of five individuals having records on Fragile X syndrome, the syndrome has been confirmed by genetic testing.

2.2. Exome Library Preparation and WES Data Processing

For the sequencing library preparation, three different exome capture assays were utilized. For the ASD cohort, the TruSeq DNA Exome and the SureSelect Human All Exon V7 were used for 73 and 120 individual samples, respectively. Whole-exome sequencing (WES) was performed using the Illumina HiSeq 4000 platform. The library preparation and sequencing were conducted at the ‘Genetico’ Center. WES data for the controls—51 nonASD individuals—were obtained using the SureSelect V6 r2 (hg19 build) capture assay for library preparation followed by sequencing on the Illumina platform. The sequencing was performed at the Almazov National Medical Research Centre. Sequencing data quality control was performed using the FastQC v. 0.11.9 [24]. Sequencing reads were aligned using the BWA v. 0.7.17-r1188 [25] to a reference genome depending on the exome enrichment panel utilized: the hg19 (UCSC) was used for the samples prepared with the TruSeq DNA Exome and the SureSelect V6 panels, and the hg38 (UCSC) for the samples prepared with the SureSelect V7 exome capture assay. Subsequently, the aligned sequences were subjected to sorting, deduplication, and base quality score recalibration as per the GATK Best Practices [26].

2.3. Single Nucleotide Variant Calling

Detection of single nucleotide polymorphisms (SNPs) was performed as follows. First, base recalibrated BAM files were submitted to the Germline short variant (SNPs and indels) discovery workflow implemented in the GATK v. 4.1.6. Second, each sample was submitted to the HaplotypeCaller separately with an appropriate BED file of intervals. Since three different capture assays were used for the sequencing library preparation, the corresponding three datasets were subjected to joint-genotyping separately. Third, the consolidated GVCF files were submitted to an exact test of the heterozygosity excess, applying the hard filtering parameter of ExcessHet > 54.69. The subsequent filtering procedures were performed using the VariantRecalibrator algorithm implemented in the GATK with the following truth sensitivity level parameters: 99.95 and 99.90 for the SureSelect and Truseq ASD-data subsets, respectively, and 99.50 for the nonASD subset. The established thresholds were chosen empirically to increase the possibility of detecting rare variants. Specifically, this allowed the detection of the maximum number of true-positive calls with minimal risk for false-positive calls, as can be seen in the tranches plots presented in Supplementary Figures S1 and S2). The hg38 SNP call-sets were converted to the hg19 build using the Assembly Converter. Three SNP call-sets were merged using the bcftools v. 1.10.2 [27]; the ‘include-non-variant-sites flag’ function implemented in the GenotypeGVCF pipeline was applied to account for the missing and reference genotypes. The indels were excluded, and the final merged SNP call-set was used in the downstream analysis. The SNP call-set was annotated to several databases: refGene, cytoBand, ClinVar, ExAC, avSNP147, dbNFSP v. 3.0a, and gnomAD Exome, using the ANNOVAR tools [28].

2.4. CNV Detection in WES Data

The detection of copy number variations (CNVs) in the WES data was performed by the ‘exomecn.mops’ function implemented in the cn.mops R package v. 1.32.0 [29]. The following parameters were adjusted: priorImpact = 100, upperThreshold = 0.59, lowerThreshold = −0.99, and useMedian = TRUE. Both segmentation algorithms, fastseg [29] and DNAcopy [30], were applied. Sixteen ASD samples were excluded: one sample prepared with the SureSelect V7 panel, due to a low genome coverage; and 15 samples prepared with the TruSeqExome panel, due to a batch-effect related to the use of IDT adapters. The CNV calling was performed in three separate runs for the subsets of samples processed with different exome capture assays for the library preparation—the TruSeq Exome (N = 58 ASD), the SureSelect V7 (N = 119 ASD), and the SureSelect V6 (N = 51 nonASD).

2.5. Analysis of ASD-Associated Variants

The association analysis was performed using the PLINK v1.9 toolset [31]. Prior to the association analysis, the SNP call-set underwent linkage disequilibrium (LD) pruning and clumping procedures. The LD pruning was conducted according to the following parameters of the ‘indep’ command: window size = 50, window shift = 5, and VIF threshold = 2. The SNP call-set was adjusted based on the following QC parameters: –maf 0.01, –geno 0.05, –hwe 0.001. After relatedness testing, nine individuals from closely related pairs (PI HAT ≥ 0.125) in the ASD cohort were removed. A case-control association analysis was performed using Fisher’s exact test to generate significance values adjusted for multiple testing using Bonferroni correction. Established candidate SNP variants had to meet inclusion criteria based on the predicted pathogenicity score thresholds: a SIFT score < 0.05 [32] and PolyPhen2 HDIV score ≥ 0.453 [33]. The polymorphisms with unknown pathogenicity were also considered as potential candidate variants. In addition, the SNP and CNV call-sets were intersected with an autism gene database, AutDB [34]. For the CNV call-sets, the intersection with the list of CNVs reported in the AutDB (validated) was performed using the following command: ‘bedtools intersect -a test.bed -b autdb.bed -f 0.70 -r -wa -wb > output.’ For a successful query, a test region had to overlap at least 70% of an AutDB record.

3. Results

3.1. Genome-Wide SNP Association Analysis

A summary of the variant calling statistics across the comparison groups and the WES-data subsets is shown in Supplementary Table S2. Altogether we detected 237,019 SNPs across all individuals from both comparison cohorts, ASD and nonASD. The mean number of the detected SNPs per individual varied from 23 to 30K depending on the exome enrichment assay applied, which corresponds to the value of ~25K SNPs per individual expected for the exome data [35]. The transition/transversion (Ti/Tv) ratio varied from 2.73 to 2.93, which corresponds to the value of 3.0 observed in the exonic regions [36]. After variant filtering and LD pruning, 22,249 remaining SNPs were included in the case-control association analysis. The association analysis identified ten variants related to eight genes that surpassed the genome-wide significance threshold (Table 2). These ten SNPs were novel variants not previously reported in association with ASD. Moreover, the eight genes harboring these variants also were not found among the over a thousand human genes implicated in ASD as per records in relevant databases—the SFARI [37] and AutDB [34]. According to the SNP functional annotation, the following variants might be highlighted: two likely pathogenic synonymous substitutions in the IGLJ2 (rs8033) and HIP1 (rs1167801) genes and five missense variants with moderate or high deleterious effects located in the PRAMEF10, ZNF717, FAM21A, and OR11H12 genes. The annotation of these genes against the databases on human diseases MalaCards [38] and OMIM [39], and human phenotype ontologies HPO [40], did not reveal associations with ASD and ASD-related phenotypes (Table 2).
Table 2

The results of the case-control, ASD vs. nonASD, association analysis of SNPs. Ten genome-wide significant ASD-associated SNPs are shown along with their pathogenicity scores and genomic annotations.

dbSNP IDPosition (hg19)SubstitutionVariant FunctionPathogenicity, C-Score AF ††PadjGene Name **Gene PrimaryFunctionAssociatedPhenotype
rs3121398chr1:12954987T > Amissense20.300.17579.338 × 10−3 PRAMEF10 Retinoic acid receptor binding protein; RAR-mediated signaling
rs3009023chr3:75786628G > Cmissense8.320.23781.260 × 10−5 ZNF717 DNA-binding transcription factor; Transcriptional regulation
rs2918517chr3:75786942C > Amissense11.550.21082.788 × 10−4
rs2669761chr10:51889683C > Amissense13.310.18829.828 × 10−3 FAM21A WASH complex subunit 2A; ExocytosisLeri–Weill dyschondrosteosis.
rs200662012chr14:19378348C > Tmissense20.400.19525.088 × 10−4 OR11H12 Olfactory receptor 11H12Hereditary breast-ovarian cancer syndrome
rs200891589chr14:19377614G > Tmissense10.300.16401.625 × 10−2
rs1167801chr7:75176300T > Csynonymous10.320.17651.015 × 10−2 HIP1 Huntingtin interacting protein 1; Clathrin-mediated endocytosis and traffickingHuntington disease; Chronic myelomonocytic leukemia; Williams–Beuren syndrome
rs1279304945chr9:39358227G > Asynonymous4.350.18681.583 × 10−3 SPATA31A1 Spermatogenesis-associated protein 31A1Familial glucocorticoid deficiency; Foramen magnum meningioma
rs1435247730chr19:40389752G > Asynonymous0.140.17402.930 × 10−2 FCGBP IgG Fc binding protein; Maintenance of the mucosal structureLynch syndrome; Von Willebrand disease; Congenital hypogammaglobulinemia
rs8033chr22:23243367T > Csynonymous10.070.24606.870 × 10−6 IGLJ2 Immunoglobulin lambda joining protein

Note.  † The CADD (Combined Annotation-Dependent Depletion) score [41] indicates a predicted deleterious effect of the variant on protein function: a C-Score > 20 defines a pathogenic variant, and a C-score between 10 and 20—a likely pathogenic variant [42,43]. †† Allele frequencies in the ASD cohort are shown. ⁑ The data on the associations with phenotypes are provided based on the human diseases, MalaCards [38] and OMIM [39], and human phenotype ontologies, HPO [40], databases. ** The genes detected in this study have not been previously reported in association with ASD, as per records in the most representative relevant databases, SFARI and AutDB.

3.2. CNV Burden in the ASD Cohort Compared to nonASD

Altogether 4991 CNVs across all individuals from both comparison cohorts were detected: 4084 and 907 CNV events were identified in the ASD and nonASD cohorts, respectively (Supplementary Table S3). As seen in Figure 1, the distribution of the CNV sizes differed between the comparison cohorts; the results of the Kolmogorov–Smirnov test indicated a statistical significance of this difference (D = 0.0934, p = 4.749 × 10−6). In particular, we observed a wider range in CNV length with a lower prevalence of smaller CNVs and a higher prevalence of larger CNVs in the ASD group compared to nonASD. In addition, we found a remarkable difference in the occurrence of different CNV-types between the comparison groups. In both comparison groups, deletions predominated over duplications: the deletions/duplications ratio was 2.22 and 1.48 for the ASD and nonASD cohorts, respectively. However, the predominance of deletions was more profound in the ASD group, where a significantly greater proportion of deletions was found (OR = 1.5; z = 5.365; p = 1.222 × 10−7).
Figure 1

Histograms showing the distributions of CNVs of different sizes (X-axis: log-width) in the comparison groups, ASD and nonASD. A statistically significant difference in the distributions was found (the Kolmogorov–Smirnov D = 0.0934, p = 4.749 × 10−6): the ASD cohort was characterized by a wider range in CNV length with a lower prevalence of smaller CNVs and a higher prevalence of larger CNVs compared to the nonASD cohort.

3.3. Genome-Wide Screening of Common ASD-Associated Variants, SNPs and CNVs

The total unfiltered call-set of 237K SNPs identified in both comparison cohorts, ASD and nonASD, was intersected with the list of 891 common ASD-associated variants derived from the AutDB repository [34]. We found 138 of such SNPs in the studied groups (Supplementary Table S4). The distribution of these SNPs across the groups is shown in Figure 2. Although we did not find a significant overrepresentation of the common candidate variants in the ASD cohort compared to the nonASD controls, a greater number of such SNPs were identified in the discovery group—137 SNPs in ASD vs. 102 SNPs in nonASD. Additionally, Fisher’s exact test did not reveal a significant difference in the frequency of the ASD-associated SNPs between the comparison groups.
Figure 2

Venn diagrams represent the distributions of common ASD-associated SNPs (a) and CNVs (b) across the two comparison groups, ASD and nonASD. Both diagrams reflect a greater number of the common ASD-associated genomic variants in the ASD cohort compared to the nonASD controls.

The CNV-call sets (ASD and nonASD) were also intersected with relevant data on the common ASD-related variations from the AutDB repository as per the procedures described in the Methods section. The list of overlapping CNVs was filtered according to the following criteria: localization within the same chromosome band, the same type of variation (either deletions or duplications), and the same gene content. The CNVs located on sex chromosomes and the CNVs containing HLA (human leukocyte antigen) genes, having the most extensive variability, were removed from the analysis. Detailed results of the intersection analysis are reported in Supplementary Table S5, and a summary is shown in Table 3. Altogether 29 CNVs previously associated with ASD were detected in the studied cohorts—23 and 13 CNVs in the ASD and nonASD groups, respectively, including 8 CNVs that overlapped between the comparison groups (Table 3; Figure 2). Similar to the SNP-based analysis results, despite the lack of significance in the CNV frequencies between the comparison groups, we observed a greater number of the common ASD-associated CNV-events in the ASD cohort than in the nonASD controls.
Table 3

The distribution of 29 common ASD-associated CNVs in the studied ASD and nonASD cohorts. Despite a lack of significant differences in the variants’ frequencies between the comparison groups, a greater number of the ASD-associated CNVs were detected in the ASD cohort compared to the nonASD controls, 23 vs. 13 CNVs.

Chromosome BandCNVGenes ReferenceFRQASD ††(N = 168)FRQnonASD(N = 51)
1p21.1NC_000001.11:g.103564908_103612675dupAMY2A, AMY2B[44]0.00600
1q11–q11.2NC_000001.11:g.120324463_ 149528945del SRGAP2C [45]0.00600
1q31.3NC_000001.11:g.196773605_196830172delCFHR1, CFHR3[46]0.07140
1q44NC_000001.11:g.248547045_248631695delOR2T10, OR2T11, OR2T29, OR2T34, OR2T35, OR2T5[44,47,48]0.00600.0196
2p22.1NC_000002.12:g.38729555_38746213dupGALM, SRSF7[47]0.00600
2q31.2NC_000002.12:g.178432096_178451050dup PRKRA [47]0.07740
2q35NC_000002.12:g.218818920_218956937dupCDK5R2, FEV, WNT10A, WNT6[49]0.00600.0392
2q37.1NC_000002.12:g.232371368_232459781dupALPG, ALPI, ALPP[49]00.0196
2q37.3NC_000002.12:g.240678256_240774012dupAQP12A, AQP12B, KIF1A[49]0.01790
3q12.2NC_000003.12:g.100646568_100713869dup ADGRG7 [47]0.02980.0392
4q13.2–q13.3NC_000004.12:g.69137075_69381445delUGT2B11, UGT2B28[49]00.0196
6p22.2NC_000006.12:g.26132436_26251373del17 genes of the HIST1H gene family[49]00.0196
9q34.3NC_000009.12:g.136887096_137799700dup45 genes including GRIN1, PNPLA7, ABCA2, NSMF, and others[47,50]00.0196
11q11NC_000011.10:g.55573260_55685410delOR4C11, OR4C15, OR4C16, OR4P4, OR4S2[48]0.05360.0588
13q12.11NC_000013.11:g.21155096_21172702dup SKA3 [47]0.08330.1176
13q34NC_000013.11:g.113809317_113841915dupGAS6, TMEM255B[51]00.0196
14q11.2NC_000014.9:g.22773609_22780051del SLC7A7 [47]0.00600
14q11.2NC_000014.9:g.19729152_19954640dupOR4K1, OR4K2, OR4K3, OR4K5, OR4M1, OR4N2, OR4Q3[52]0.05950.1176
14q24.3NC_000014.9:g.73528468_73582354delACOT1, ACOT2, HEATR4[46]0.00600.0980
14q32.33NC_000014.9:g.106112755_106318409del LINC00226 [53]0.00600
14q32.33NC_000014.9:g.105142694_105157763dup JAG2 [47]00.0196
17p13.1NC_000017.11:g.10443374_10453538del MYH4 [47,54]0.01190
17p13.3NC_000017.11:g.2452259_2691244dupMETTL16, PAFAH1B1[48,53]0.01190
17q21.2NC_000017.11:g.40399039_40417791dup TOP2A [47]0.00600
17q21.31NC_000017.11:g.45616241_46136454delARHGAP27, ARL17A, ARL17B, CRHR1, KANSL1, CRHR1, MAPT, PLEKHM1, SPPL2C, STH[53,55,56,57,58]0.02980
19p13.11NC_000019.10:g.17332929_17341703dupANO8, GTPBP3[47]0.00600
19q13.31–q13.2NC_000019.10:g.42738643_43237158delPSG1, PSG11, PSG2, PSG4, PSG5, PSG6, PSG7, PSG8, PSG9[53]0.01190
20p12.1NC_000020.11:g.13599877_13834151dupESF1, NDUFAF5, TASP1[53]0.00600
22q13.1NC_000022.11:g.38963107_38989480delAPOBEC3A, APOBEC3B[53]0.00600

Note. The complete list of detected CNVs is represented in Supplementary Table S3. † The genes reported in the SFARI database as those related to ASD are marked in bold. †† The CNV frequencies in the ASD (FRQASD) and nonASD (FRQnonASD) groups are shown.

In addition, the CNV call-sets were compared with the CNV morbidity map [59,60], or the list of structural genomic variants that have been linked to severe pediatric diseases, including developmental delays, intellectual disability, and ASD. The development delay track derived using the UCSC Genome Browser tools consists of over 29 thousand individual entries for case subjects. An intersection of these records with the ASD and non-ASD CNV call-sets resulted in 52 overlapping entries (Supplementary Table S6). It was remarkable that only four of 51 nonASD individuals (7.8%) had those CNVs; in contrast, among ASD participants, 21 of 148 individuals (14.2%) harbored CNVs linked to a developmental disorder, including the deletion CNVR6294.56 on chromosome 14, known as a very common variant [61].

3.4. Gene-Based and Gene Ontology-Based Analyses

Concerning the gene content of the loci harboring CNVs, we identified 1562 and 990 genes having CNVs in the ASD and nonASD cohorts, respectively (Supplementary Table S7). We performed several overrepresentation analyses (ORA) to identify particular gene ontology (GO) and human phenotype ontology (HPO) terms enriched in these gene sets. It is necessary to note that both the ASD and nonASD sets of genes harboring CNVs were extremely enriched in those encoding olfactory receptors (OR), which were assigned to the GO: olfactory receptor activity. Specifically, 173 of 1562 genes in the ASD gene-set (Fold Enrichment = 5.45; FDR = 5.61 × 10−59) and 167 of 990 genes in the nonASD gene-set (Fold Enrichment = 8.35; FDR = 1.88 × 10−84) were OR genes. Consequently, this superfamily of highly polymorphic OR genes was removed from the gene-sets prior to the ORAs. The ORA results are reported in the Supplementary Table S8. A graphical summary showing a top list of the most overrepresented (at an Enrichment FDR < 10−5) GO terms is shown in Figure 3. As can be seen in Figure 3, at the established significance threshold, the only functional category overrepresented among the genes having CNVs in the nonASD group is the JAK-STAT pathway. This signaling pathway mediates cellular transcriptional responses to cytokines and, as a consequence, is related, first of all, to the immune response. Notably, STAT protein-related pathways were also found among the GO terms overrepresented in the set of genes having CNVs in the ASD cohort (Figure 3). In turn, in comparison to the nonASD gene-set, the ASD gene-set was remarkably enriched in genes related to meiotic processes, in particular chromosome segregation, and in genes involved in the primary cilium assembly and organization (Figure 3).
Figure 3

The plot shows functional categories (GO: biological process terms) significantly (at an Enrichment FDR < 10−5) enriched in the sets of genes harboring CNVs in the ASD cohort (green) and nonASD cohort (blue). A network indicates GO terms sharing 30% or more genes; thicker edges represent more overlapped genes. Bigger nodes correspond to larger gene sets, and darker nodes correspond to more significant enrichment FDR-values. The enrichment tests and the network constructions were performed using ShinyGO tools [62].

The lists of genes with CNVs were also submitted to the HPO-based ORA. No significant enrichment was found for the nonASD gene-set. In contrast, the ASD gene-set was significantly enriched in a number of phenotypes related, first of all, to intellectual disability and developmental delays (Table 4). Tracking HPO-related genes in the data on the CNV distribution across individuals (Supplementary Table S9), we observed that several highlighted phenotypes had been reported in the participants’ medical records. In particular, a representative number of individuals (N = 79) harbored CNVs in the genes associated with intellectual disability. One of the ASD individuals harboring CNVs in genes related to hypertelorism manifested this phenotype according to his medical history. Two of four participants who had records on the epicanthic fold carried CNVs in the genes associated with the epicanthus. Two ASD participants were recorded as having hypotonia and one with confirmed microcephaly; however, they did not have CNVs in the genes linked to these phenotypes as per the HPO.
Table 4

The human phenotype ontology (HPO) terms that were significantly overrepresented among those related to genes harboring CNVs in the ASD cohort.

Human Phenotype Ontology (HPO)Gene-Set, nTotal Genes, nEnrichment FDR
HP:0000007 Autosomal recessive inheritance27221878.81 × 10−22
HP:0001249 Intellectual disability16511107.17 × 10−20
HP:0001263 Global developmental delay14610841.35 × 10−13
HP:0000252 Microcephaly1046723.22 × 10−13
HP:0004322 Short stature1208333.43 × 10−13
HP:0001250 Seizures13610471.37 × 10−11
HP:0001347 Hyperreflexia744425.94 × 10−11
HP:0000639 Nystagmus956501.02 × 10−10
HP:0001511 Intrauterine growth retardation593212.38 × 10−10
HP:0001252 Muscular hypotonia805173.17 × 10−10
HP:0000957 Cafe-au-lait spot20492.14 × 10−9
HP:0000340 Sloping forehead301103.13 × 10−9
HP:0004209 Clinodactyly of the 5th finger462323.39 × 10−9
HP:0100615 Ovarian neoplasm17363.39 × 10−9
HP:0000028 Cryptorchidism765083.59 × 10−9
HP:0002007 Frontal bossing492593.59 × 10−9
HP:0000470 Short neck472423.59 × 10−9
HP:0000347 Micrognathia714709.72 × 10−9
HP:0000486 Strabismus785461.71 × 10−8
HP:0000286 Epicanthus543182.15 × 10−8
HP:0002650 Scoliosis836012.15 × 10−8
HP:0006101 Finger syndactyly351582.15 × 10−8
HP:0000316 Hypertelorism694715.42 × 10−8
HP:0002119 Ventriculomegaly472711.29 × 10−7
HP:0000268 Dolichocephaly281171.83 × 10−7
HP:0001631 Atrial septal defect402173.20 × 10−7
HP:0003202 Skeletal muscle atrophy442597.18 × 10−7
HP:0000494 Downslanted palpebral fissures462787.53 × 10−7
HP:0000426 Prominent nasal bridge301418.25 × 10−7
HP:0001257 Spasticity513278.81 × 10−7
Important to note that enrichment of functional groups of genes does not equate to the presence of phenotypes; thus, not necessary for every patient with perturbations in particular genes to develop the corresponding phenotype. However, the top list of HPO terms in Table 4 indicates a burden of CNVs in the genes directly related to the manifestation of severe developmental issues mostly known in disorders with autosomal recessive inheritance. Remarkable, in contrast to microcephaly, we did not observe an enrichment in the HPO Macrocephaly—a clinical feature also overrepresented in individuals with ASD, which reported rates are 10–20%; see, e.g., [63,64,65]. We tend to attribute this to a power insufficiency of the enrichment analysis for this particular HPO term, for which a limited amount of associated genes (about ten) are known compared to the HPO Microcephaly, which has been reliably linked to hundreds of various genes.

4. Discussion

In summarizing the results of this study exploring exonic variations in a Russian cohort of children with ASD, several findings and observations need to be pointed out and discussed. First, based on the data aggregated in relevant repositories, such as AutDB [34], SFARI [37], and the CNV morbidity map of developmental delay [59,60], we tracked the genomic variants known to be implicated in ASD and related developmental disorders in the studied ASD cohort compared to the nonASD individuals from the general population of the same origin. Although below the threshold of statistical significance, an elevated prevalence of common ASD-associated genomic alterations, both SNPs and CNVs, was observed in the ASD compared to the nonASD cohorts. Second, in comparing the CNV metrics (length, prevalence, and distribution) between the comparison groups, we observed that the ASD cohort is characterized by a higher CNV burden. Specifically, we noted a higher prevalence of larger CNV events and a remarkable predominance of deletions over duplications (about 1.5 times) in the ASD compared to the nonASD group. These observations are consistent with the consensus in the literature that CNVs are one of the most prominent sources of genetic risks for ASD. Specifically, it has been reported that CNVs, as a genomic event, are highly prevalent (i.e., observed in up to 20%) in individuals with an ASD [66]. Additionally, it has been shown that de novo CNV events occur almost five times more frequently in individuals with ASD than in unaffected siblings and controls (5–10% vs. 1–2%, respectively), and large CNVs were consistently observed in the cases with developmental delays or intellectual disability [49,51]. A recent study exploring the effect sizes of the CNV types on the development of multiple cognitive domains and overall ASD risk suggested a differential effect of deletions and duplications on different phenotypic features of ASD. Specifically, whereas both CNV types may equally affect motor skills, IQ-related cognitive deficits in ASD have been predominantly attributed to haploinsufficiency due to deletions [66]. Remarkably, our phenotype-focused enrichment tests revealed a significant overrepresentation of the comorbid phenotypes related to intellectual disabilities and developmental delays among those associated with genes having CNVs, predominantly deletions, in the studied ASD cohort. It is important to note that, in addition to these two major phenotypes, several other comorbid conditions and health problems were highlighted by the CNV gene-set enrichment analysis; for example, conditions of hypertelorism and epicanthic fold were tracked in participants’ medical records. Altogether, these observations provide additional evidence supporting the essential role of structural genomic variations in the etiology of diverse ASD phenotypes often accompanied by multiple comorbid developmental conditions. Third, we explored potential functional outcomes of the CNV burden across the comparison groups based on tests of gene-set enrichment in the specific biological pathways that these genes control. In contrast to the nonASD controls, the list of genes harboring CNVs in the genomes of individuals with ASD showed a significant overrepresentation of gene ontologies related to meiosis and chromosome segregation, and to the biological pathways involved in primary cilium assembly and organization. Both of these biological pathways might be potentially linked to the ASD phenotype. Specifically, the former may indicate an aggravated risk of chromosomal rearrangements in ASD genomes; such rearrangements, both de novo and inherited, are known to be involved in the etiology of ASD [67,68,69]. In fact, defects and deficits in primary cilia, known to impact brain development and maturation [70,71,72,73], have been demonstrated, directly and indirectly, to contribute to ASD [74] and ASD-related phenotypes, such as Asperger syndrome [75] and Fragile X syndrome [76]. Fourth, our case-control association analysis allowed us to identify seven SNPs with a predicted deleterious effect, which showed a significantly higher prevalence in the ASD cohort compared to controls. Six genes harboring these variants: IGLJ2, FAM21A, OR11H12, HIP1, PRAMEF10, and ZNF717, have not been previously linked to ASD or ASD-related phenotypes. However, several studies have shown that ZNF717, HIP1, and several genes from the PRAME (Preferentially Expressed Antigen In Melanoma) family might be involved in cognitive development, learning disorders, and developmental disorders with autistic features [77,78]. Specifically, mutations in PRAMEF5 and PRAMEF7 were described in patients with delayed speech and language development, hearing deficits, and reading disability [77]. In the same study [77], a mutation in the ZNF717 gene has been identified among 16 other rare homozygous variants in at least two families, those of patients with Joubert syndrome—a disorder characterized by autistic behavior and intellectual disability. Two reciprocal microduplications inclusive of HIP1 have been described in three children from two unrelated families who had neurobehavioral problems: one child had an expressive language disorder, and two children had attention deficit hyperactivity disorder and manifested aggressive behavior [78]. The detection of variants within the OR11H12 gene encoding an olfactory receptor is also not surprising, as rare and common variants in at least several OR genes have been reported in association with autism, such as OR2M4 [79], OR2T10 [80,81], and OR52M1 [82]. In conclusion, despite the main limitation—a relatively small sample size—the study’s findings and observations are consistent with the growing body of evidence supporting the genetic bases of such heterogeneous disorders as ASD. We also report on several suggestive candidate genes that might be associated with ASD. Undoubtedly, follow-up research involving additional extended cohorts from the population is required to confirm the involvement of these genes in ASD; we consider these findings preliminary. To our knowledge, our study is one of the first attempts to investigate genome-wide polymorphic variants, SNPs and CNVs, in a previously understudied cohort of ASD from the Russian Federation, using a whole-exome sequencing technique.
  76 in total

1.  Axon guidance pathways served as common targets for human speech/language evolution and related disorders.

Authors:  Huimeng Lei; Zhangming Yan; Xiaohong Sun; Yue Zhang; Jianhong Wang; Caihong Ma; Qunyuan Xu; Rui Wang; Erich D Jarvis; Zhirong Sun
Journal:  Brain Lang       Date:  2017-07-08       Impact factor: 2.381

2.  Recurrent distal 7q11.23 deletion including HIP1 and YWHAG identified in patients with intellectual disabilities, epilepsy, and neurobehavioral problems.

Authors:  Melissa B Ramocki; Magdalena Bartnik; Przemyslaw Szafranski; Katarzyna E Kołodziejska; Zhilian Xia; Jaclyn Bravo; G Steve Miller; Diana L Rodriguez; Charles A Williams; Patricia I Bader; Elżbieta Szczepanik; Tomasz Mazurczak; Dorota Antczak-Marach; James G Coldwell; Cigdem I Akman; Karen McAlmon; Melinda P Cohen; James McGrath; Elizabeth Roeder; Jennifer Mueller; Sung-Hae L Kang; Carlos A Bacino; Ankita Patel; Ewa Bocian; Chad A Shaw; Sau Wai Cheung; Tadeusz Mazurczak; Paweł Stankiewicz
Journal:  Am J Hum Genet       Date:  2010-11-25       Impact factor: 11.025

3.  Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations.

Authors:  Brian J O'Roak; Laura Vives; Santhosh Girirajan; Emre Karakoc; Niklas Krumm; Bradley P Coe; Roie Levy; Arthur Ko; Choli Lee; Joshua D Smith; Emily H Turner; Ian B Stanaway; Benjamin Vernot; Maika Malig; Carl Baker; Beau Reilly; Joshua M Akey; Elhanan Borenstein; Mark J Rieder; Deborah A Nickerson; Raphael Bernier; Jay Shendure; Evan E Eichler
Journal:  Nature       Date:  2012-04-04       Impact factor: 49.962

4.  A copy number variation morbidity map of developmental delay.

Authors:  Gregory M Cooper; Bradley P Coe; Santhosh Girirajan; Jill A Rosenfeld; Tiffany H Vu; Carl Baker; Charles Williams; Heather Stalker; Rizwan Hamid; Vickie Hannig; Hoda Abdel-Hamid; Patricia Bader; Elizabeth McCracken; Dmitriy Niyazov; Kathleen Leppig; Heidi Thiese; Marybeth Hummel; Nora Alexander; Jerome Gorski; Jennifer Kussmann; Vandana Shashi; Krys Johnson; Catherine Rehder; Blake C Ballif; Lisa G Shaffer; Evan E Eichler
Journal:  Nat Genet       Date:  2011-08-14       Impact factor: 38.330

5.  Integrated analysis of whole-exome sequencing and transcriptome profiling in males with autism spectrum disorders.

Authors:  Marta Codina-Solà; Benjamín Rodríguez-Santiago; Aïda Homs; Javier Santoyo; Maria Rigau; Gemma Aznar-Laín; Miguel Del Campo; Blanca Gener; Elisabeth Gabau; María Pilar Botella; Armand Gutiérrez-Arumí; Guillermo Antiñolo; Luis Alberto Pérez-Jurado; Ivon Cuscó
Journal:  Mol Autism       Date:  2015-04-15       Impact factor: 7.509

6.  Complex nature of apparently balanced chromosomal rearrangements in patients with autism spectrum disorder.

Authors:  Anne-Claude Tabet; Alain Verloes; Marion Pilorge; Elsa Delaby; Richard Delorme; Gudrun Nygren; Françoise Devillard; Marion Gérard; Sandrine Passemard; Delphine Héron; Jean-Pierre Siffroi; Aurelia Jacquette; Andrée Delahaye; Laurence Perrin; Céline Dupont; Azzedine Aboura; Pierre Bitoun; Mary Coleman; Marion Leboyer; Christopher Gillberg; Brigitte Benzacken; Catalina Betancur
Journal:  Mol Autism       Date:  2015-03-25       Impact factor: 7.509

Review 7.  Roles of Primary Cilia in the Developing Brain.

Authors:  Sang Min Park; Hee Jin Jang; Jeong Ho Lee
Journal:  Front Cell Neurosci       Date:  2019-05-14       Impact factor: 5.505

8.  Exome Evaluation of Autism-Associated Genes in Amazon American Populations.

Authors:  Giovana E da Costa; Giordane L Fernandes; Juliana C G Rodrigues; Diana F da V B Leal; Lucas F Pastana; Esdras E B Pereira; Paulo P Assumpção; Rommel M R Burbano; Sidney E B Dos Santos; João F Guerreiro; Marianne R Fernandes; Ney P C Dos Santos
Journal:  Genes (Basel)       Date:  2022-02-18       Impact factor: 4.096

9.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

10.  Primary Ciliary Deficits in the Dentate Gyrus of Fragile X Syndrome.

Authors:  Bumwhee Lee; Shree Panda; Hye Young Lee
Journal:  Stem Cell Reports       Date:  2020-07-30       Impact factor: 7.765

View more
  1 in total

1.  Genomic Characterization by Whole-Exome Sequencing of Hypermobility Spectrum Disorder.

Authors:  Gerardo J Alanis-Funes; Saúl Lira-Albarrán; Jesús Hernández-Pérez; Mario A Garza-Elizondo; Rocío Ortíz-López; César V Elizondo; Augusto Rojas-Martinez; Rocío A Chávez-Santoscoy; Claudia Rangel-Escareño
Journal:  Genes (Basel)       Date:  2022-07-18       Impact factor: 4.141

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.