Pía Villanueva1, Ron Nudel2, Alexander Hoischen3, María Angélica Fernández4, Nuala H Simpson2, Christian Gilissen3, Rose H Reader2, Lillian Jara5, María Magdalena Echeverry6, Clyde Francks7, Gillian Baird8, Gina Conti-Ramsden9, Anne O'Hare10, Patrick F Bolton11, Elizabeth R Hennessy12, Hernán Palomino13, Luis Carvajal-Carmona14, Joris A Veltman3, Jean-Baptiste Cazier15, Zulema De Barbieri4, Simon E Fisher7, Dianne F Newbury16.
Abstract
Children affected by Specific Language Impairment (SLI) fail to acquire age-appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but understanding the underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed, isolated founder population from Robinson Crusoe Island (Chile), which has a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10⁻⁴, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model.
Year: 2015 PMID: 25781923 PMCID: PMC4363375 DOI: 10.1371/journal.pgen.1004925
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Pedigree showing direct lines of descent between founder brothers and children in Robinson Crusoe validation cohort.
Founder brothers are individuals on the second line of the pedigree. Individuals with language impairment are colored in black. Individuals with typical language are denoted in white. Individuals with unknown phenotype are shaded grey. Genotypes at rs144169475 are represented by small circles; blue circles represent homozygote reference allele, red circles represent variant carriers, grey circles represent unknown genotype. Note that each individual may be represented through multiple lines of descent and so might appear more than once on this diagram. Children are labelled according to affection status—SLI1 to SLI15 and TLD1 to TLD17. Cases whose exomes were sequenced are indicated by asterisks. Three children (1 affected, 2 unaffected, none of whom carried the rs144169475 variant) are not represented on this figure since they were related to alternative founder families. SLI15 is known to be related to one of the founder brothers but the exact line of descent is unknown.
Association of novel nonsynonymous or canonical splice-site variants in 111 individuals from the Robinson Crusoe validation cohort.
| Chr | Variant position (hg19) | Ref/variant | Average read depth across variant | Gene | Transcript ID | Gene element affected by variant | Amino acid change | SLI/TLD [1] | Variant freq [2] | SLI variant freq [3] | TLD variant freq [4] | MQLS p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 113,245,326 | A/G | 60 | | NM_001042678 | IVS3 | SA site | 49/62 | 0.059 | 0.071 | 0.048 | 0.625 |
| 1 | 248,308,783 | T/A | 415 | | NM_001004690 | Exon 1 | C112S | 49/62 | 0.000 | 0.000 | 0.000 | - |
| 4 | 47,907,320 | A/T | 57 | NFXL1 | NM_152995 | Exon 4 | N150K | 49/62 | 0.113 | 0.194 | 0.048 | 0.0002 |
| 10 | 31,134,425 | C/T | 119 | | NM_001143766 | Exon 8 | R641H | 49/62 | 0.158 | 0.173 | 0.145 | 0.466 |
| 11 | 33,054,503 | T/G | 36 | | NM_139160 | Exon 8 | N444K | 40/60 | 0.131 | 0.149 | 0.117 | 0.399 |
| 16 | 27,363,901 | G/A | 30 | | NM_000418 | Exon 7 | R185H | 49/61 | 0.095 | 0.143 | 0.057 | 0.053 |
| 21 | 47,359,924 | C/T | 52 | | NM_001130141 | IVS-12 | SA site | 48/59 | 0.266 | 0.292 | 0.246 | 0.228 |
| 22 | 41,257,834 | T/TA | 37 | | NM_145174 | Exon 1 | V55VX | 49/62 | 0.261 | 0.245 | 0.274 | 0.554 |
| X | 48,682,972 | A/G | 30 | | NM_006044 | Exon 29 | N1200D | 49/62 | 0.419 | 0.378 | 0.452 | 0.456 |
1 – The number of individuals with SLI genotyped / the number of individuals with typical language ability genotyped.
2 – Frequency of discovered variant in all genotyped Islanders
3 – Frequency of discovered variant in genotyped Islanders with SLI
4 – Frequency of discovered variant in genotyped Islanders with typical language ability
Note that all Islanders (both cases and controls) were related
* – This variant was not validated by Sanger sequencing and represents a false-positive finding from the exome sequencing.
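The MQLS p-values in the table above can be read against a multiple-testing threshold. The following is a minimal sketch that assumes a plain Bonferroni correction over the 8 tested variants; the abstract states that 8 variants were tested, but the exact correction procedure is an assumption here, as are the dictionary keys, which simply combine chromosome and amino acid change (or splice element) from the table.

```python
# MQLS p-values transcribed from the table above. The variant that failed
# Sanger validation has no reported p-value and is excluded.
mqls_p = {
    "chr1 IVS3 SA site": 0.625,
    "chr4 N150K": 0.0002,
    "chr10 R641H": 0.466,
    "chr11 N444K": 0.399,
    "chr16 R185H": 0.053,
    "chr21 IVS-12 SA site": 0.228,
    "chr22 V55VX": 0.554,
    "chrX N1200D": 0.456,
}

ALPHA = 0.05  # assumed family-wise error rate
threshold = ALPHA / len(mqls_p)  # Bonferroni: 0.05 / 8 = 0.00625

# Keep only the variants whose p-value falls below the corrected threshold.
significant = [name for name, p in mqls_p.items() if p < threshold]
print(f"threshold = {threshold:.5f}")
print("passing:", significant)
```

Under this assumption only the N150K variant on chromosome 4 falls below the corrected threshold, in line with the single association highlighted in the abstract.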
Fig 2. Putative contributory coding variants identified in NFXL1 by this study.
Position of putative NFXL1 coding variants with respect to exons and protein coding sequence. Genomic coding exons (exons 2–23) are shown by pink bands at the top. Protein motifs are represented by colored bands in the lower boxes. The red box represents a Znf RING motif, the yellow boxes represent Znf NFX1 motifs, the blue box represents a coiled-coil domain and the green box a transmembrane domain. Putative contributory coding variants are shown by arrows. Blue arrows denote synonymous changes, red arrows nonsynonymous changes. Sanger sequencing plots are given for all variants identified. Conservation of amino acid sequences across 11 species is shown for all variants identified. The ref row shows the human reference allele and the variant row shows the observed variant in our samples. All sequences that differ from the reference sequence are shown in red.
Allele and genotype frequencies of rs144169475 in the Robinson Crusoe validation cohort.
| | Robinson Crusoe population | Founder-related Islanders [1] | Non-founder-related Islanders [2] | SLI [3] | TLD [4] | Male Islanders [5] | Female Islanders [6] | European controls [7] | Colombian controls [8] | Chilean controls [9] | PUR [10] | CLM [11] | MXL [12] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| freq allele T (variant) | 0.113 | 0.125 | 0.000 | 0.194 | 0.048 | 0.116 | 0.132 | 0.000 | 0.042 | 0.074 | 0.009 | 0.033 | 0.076 |
| freq genotype TT | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| freq genotype AT | 0.225 | 0.250 | 0.000 | 0.388 | 0.097 | 0.233 | 0.263 | 0.000 | 0.084 | 0.149 | 0.018 | 0.067 | 0.152 |
| freq genotype AA | 0.775 | 0.750 | 1.000 | 0.612 | 0.903 | 0.767 | 0.737 | 1.000 | 0.916 | 0.851 | 0.982 | 0.933 | 0.848 |
| No. individuals | 111 | 100 | 11 | 49 | 62 | 43 | 57 | 127 | 320 | 121 | 55 | 60 | 66 |
1 – Islanders who are directly related to one of the eight founder families (NB this sample includes affected and unaffected individuals)
2 – Individuals who live on the Island but have no known genetic connection to the eight founder families (NB this sample includes 4 affected and 7 unaffected individuals)
3 – Islanders who have been diagnosed with SLI as described in methods (NB, this sample included 45 related, founder-related individuals and 4 non-founder-related parents).
4 – Islanders who have been classified as having typical language ability as described in methods (NB, this sample included 55 founder-related Islanders and 7 non-founder-related parents).
5 – Male individuals who are directly related to one of the eight founder families (NB this sample includes affected and unaffected individuals)
6 – Female individuals who are directly related to one of the eight founder families (NB this sample includes affected and unaffected individuals)
7 – 127 in-house European controls (ECACC, HRC-1 DNA Panel)
8 – 320 South American (Colombian) controls (provided by Luis Carvajal-Carmona and Maria Magdalena Echeverry)
9 – 121 Chilean controls (provided by Lillian Jara and Pia Villanueva)
10 – 1000 genomes Puerto Ricans from Puerto Rico (Integrated phase I, accessed March 2014)
11 – 1000 genomes Colombians from Medellin, Colombia (Integrated phase I, accessed March 2014)
12 – 1000 genomes Mexican Ancestry from Los Angeles USA (Integrated phase I, accessed March 2014)
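The allele and genotype frequencies in the table above are tied together by simple allele counting: each heterozygote contributes one variant allele, each variant homozygote contributes two, and every individual carries two alleles. The sketch below illustrates this for three columns. The heterozygote counts are inferred by rounding the AT genotype frequency times the group size, so they are an assumption rather than reported values, and `allele_freq` is an illustrative helper name.

```python
def allele_freq(het, hom_var, n_individuals):
    """Variant allele frequency from genotype counts in diploid individuals."""
    return (het + 2 * hom_var) / (2 * n_individuals)

# Heterozygote counts inferred by rounding (AT genotype frequency x group size);
# these counts are an assumption, not values reported in the table.
groups = {
    "all Islanders": {"n": 111, "het": round(0.225 * 111)},
    "SLI": {"n": 49, "het": round(0.388 * 49)},
    "TLD": {"n": 62, "het": round(0.097 * 62)},
}

for name, g in groups.items():
    # No TT homozygotes were observed in any group, so hom_var is 0.
    freq = allele_freq(g["het"], 0, g["n"])
    print(f"{name}: {g['het']} carriers, T allele frequency = {freq:.3f}")
```

With the inferred counts of 25, 19 and 6 carriers, the computed frequencies round to 0.113, 0.194 and 0.048, matching the first row of the table.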
NFXL1 coding variants observed in 117 UK (SLIC) probands affected by SLI.
| Position (hg19) | Ref | Var | Estimated VAF in SLI probands [1] | Median read depth [2] | dbSNP ID | 1000G population VAF (ALL:AFR:AMR:ASN:EUR) [3] | EVS VAF (EA:AA) [4] | European VAF [5] | Confirmed VAF in SLI probands [6] | Exon | Amino acid change [7] | Fisher's exact test between European controls & SLIC [8] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chr4:g.47887536 | T | C | 0.7763 | 4531 | rs2053404 | 0.73:0.65:0.75:0.72:0.77 | 0.75:0.68 | 0.75 | NT | 14 | A601A | NT |
| Chr4:g.47887652 | T | C | 0.0035 | 5123 | rs151113647 | 0.00:0.00:0.00:0.00:0.00 | 0.00:0.0002 | 0.00 | 0.0043 | 14 | T563A | 0.0244 |
| Chr4:g.47887991 | G | A | 0.7835 | 6433.5 | rs6818556 | 0.73:0.65:0.75:0.72:0.77 | 0.75:0.68 | 0.75 | NT | 13 | T523T | NT |
| Chr4:g.47898575 | G | A | 0.0071 | 4817.5 | rs35139099 | 0.0005:0.00:0.00:0.00:0.0013 | 0.005:0.0005 | 0.004 | 0.0085 | 10 | R432C | ns |
| Chr4:g.47901088 | C | T | 0.0642 | 2986 | rs34323060 | 0.02:0.002:0.03:0.00:0.04 | 0.047:0.0098 | 0.05 | NT | 7 | K292K | NT |
| Chr4:g.47901476 | G | A | 0.3195 | 1212.5 | rs12651301 | 0.42:0.33:0.35:0.63:0.35 | 0.31:0.33 | 0.31 | NT | 6 | P246L | NT |
| Chr4:g.47916008 | G | A | 0.0071 | 2576.5 | NA | 0.00:0.00:0.00:0.00:0.00 | 0.00:0.00 | 0.00 | 0.0043 | 2 | A71A | 0.0244 |
1 – Variant allele freq (VAF) in 117 UK SLIC probands is estimated by Syzygy using the proportion of reads that have the variant
2 – Median read depth for given base across all pools
3 – Variant allele frequency (VAF) in 1000 genomes super-populations (Integrated phase I data, accessed March 2014). ALL – all 1000 genomes populations combined (No. alleles ∼ 2184), AFR – African populations (YRI, LWK, GWD, MSL, ESN, ASW & ACB, No. chromosomes = 492), AMR – Admixed Americans (MXL, PUR, CLM, PEL, No. chromosomes = 362), ASN – East Asian (CHB, JPT, CHS, CDX & KHV, No. chromosomes = 572), EUR – European (TSI, FIN, GBR, IBS, no. chromosomes = 758).
4 – Exome Sequencing Project (ESP) variant allele frequency (VAF). EA – European Americans (no. chromosomes = 8600), AA – African Americans (no. chromosomes = 4268).
5 – Combined variant allele frequency across European controls from 1000 genomes and EVS (no. chromosomes = 9358)
6 – Allele frequency in SLI probands after confirmatory Sanger sequencing (no. chromosomes = 234)
7 – Amino acid change conferred by given sequence variant in protein NP_694540.3. If the change occurs within a conserved motif, this is noted.
8 – Fisher’s exact test for differences in allele frequencies between EVS European Americans and SLIC probands. ns = non-significant at the P < 0.05 threshold.
NT = not tested
ns = not significant
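The Fisher's exact comparison in the last column of the table above can be illustrated with a small stand-alone calculation. This is a sketch under stated assumptions, not the authors' pipeline: the allele counts are inferred from the reported frequencies (a confirmed VAF of 0.0043 in 234 SLIC chromosomes is taken as 1 variant allele, and a European VAF of 0.00 is taken as 0 variant alleles in the 9358 combined European control chromosomes of footnote 5), and `fisher_exact_two_sided` is an illustrative helper written with the standard library only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def pmf(x):
        # Hypergeometric probability of x variant alleles in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= observed * (1 + 1e-9))

# Assumed counts: 1 variant allele among 234 SLIC chromosomes (0.0043 x 234),
# 0 variant alleles among 9358 European control chromosomes.
p = fisher_exact_two_sided(1, 233, 0, 9358)
print(f"p = {p:.4f}")
```

Under these assumed counts the result is 234/9592, which rounds to the 0.0244 reported for the two singleton variants (T563A and A71A).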
Fig 3. Coding variants observed in SLIC probands and their families.
Pedigrees are shown for nuclear families of SLIC individuals carrying three coding variants in NFXL1. Individuals carrying the variants are identified with a black circle. Sequencing traces of each variant are shown. SLIC probands are colored in red and other family members with SLI (defined as expressive and/or receptive language skills > 1.5 SD below that expected for their age) are colored in orange. In pedigree 3 (rs151113647), the youngest sibling (colored in yellow) did not meet the criteria for SLI but had expressive and receptive language scores ∼1 SD below that expected for his age. Individuals with no shading have typical language ability. DNA was not available for individuals colored in grey.