Roman A Laskowski, Nidhi Tyagi, Diana Johnson, Shelagh Joss, Esther Kinning, Catherine McWilliam, Miranda Splitt, Janet M Thornton, Helen V Firth, Caroline F Wright.
Abstract
We present a generic, multidisciplinary approach for improving our understanding of novel missense variants in recently discovered disease genes exhibiting genetic heterogeneity, by combining clinical and population genetics with protein structural analysis. Using six new de novo missense diagnoses in TBL1XR1 from the Deciphering Developmental Disorders study, together with population variation data, we show that the β-propeller structure of the ubiquitous WD40 domain provides a convincing way to discriminate between pathogenic and benign variation. Children with likely pathogenic mutations in this gene have severely delayed language development, often accompanied by intellectual disability, autism, dysmorphology and gastrointestinal problems. Amino acids affected by likely pathogenic missense mutations are either crucial for the stability of the fold, forming part of a highly conserved symmetrically repeating hydrogen-bonded tetrad, or located at the top face of the β-propeller, where 'hotspot' residues affect the binding of β-catenin to the TBLR1 protein. In contrast, those altered by population variation are significantly less likely to be spatially clustered towards the top face or to be at buried or highly conserved residues. This result is useful not only for interpreting benign and pathogenic missense variants in this gene, but also in other WD40 domains, many of which are associated with disease.
Year: 2016 PMID: 26740553 PMCID: PMC4754046 DOI: 10.1093/hmg/ddv625
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Summary of the clinical features in children with diagnostic variants in TBL1XR1
| Reference | Patient ID | Age (years) | Sex | Mutation | HGVS | Clinical features | First words |
|---|---|---|---|---|---|---|---|
| DDD^a | DECIPHER259340 | 11 | M | | ENST00000430069.1:c.1322A>G | Global developmental delay | 3 years |
| DDD^a | DECIPHER261213 | 14 | F | | ENST00000430069.1:c.1108G>T | Global developmental delay | Non-verbal |
| DDD^a | DECIPHER271955 | 5 | M | | ENST00000430069.1:c.983A>G | Global developmental delay | Non-verbal |
| DDD^a | DECIPHER273334 | 6 | F | | ENST00000430069.1:c.1331C>G | Global developmental delay, autism | 2 years |
| DDD^a | DECIPHER280701 | 7 | M | | ENST00000430069.1:c.639T>A | Global developmental delay, autism | 1 year |
| DDD^a | DECIPHER260965 | 5 | M | | ENST00000430069.1:c.800dupG | Global developmental delay, autism | 2–2.5 years |
| Saitsu | ClinVar 191371 | 5 | F | | ENST00000430069.1:c.209G>A | Developmental delay, autistic features | Non-verbal |
| O'Roak | NA | Not known | F | | ENST00000430069.1:c.845T>C | Mild/moderate IQ, autism | Unknown |
| O'Roak | NA | Not known | M | | ENST00000430069.1:c.1190delT | Autism | Unknown |
| Pons | NA | 8 | F | Maternally inherited gene deletion | 707 kb deletion | Intellectual disability, dysmorphism (also observed in mother) | Delayed |
| Tabet | NA | 6 | F | | 1.6 Mb deletion | Intellectual disability, dysmorphism | 2.5 years |
See Supplementary Material, Table S1 for a more detailed clinical description. Variants are annotated using standard HGVS nomenclature (for simplicity, parentheses indicating missense prediction are omitted throughout the text).
^a Variants deposited in DECIPHER database (https://decipher.sanger.ac.uk).
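As a consistency check between the clinical table above and the amino acid changes reported for the WD40 domain, note that in HGVS c. numbering position 1 is the A of the ATG start codon, so coding position n falls in codon ⌈n/3⌉. The Python sketch below applies this to the six single-nucleotide substitutions that map into the domain. Pairing each substitution with a specific amino acid change is inferred here from codon number alone and is an assumption of this sketch; the function names are illustrative and not part of the original analysis.

```python
import re

# Single-nucleotide substitutions from the clinical table, paired with the
# diagnostic amino acid changes reported for the WD40 domain.
# ASSUMPTION: the pairing is inferred from codon numbering only.
SUBSTITUTIONS = {
    "c.1322A>G": "His441Arg",
    "c.1108G>T": "Asp370Tyr",
    "c.983A>G": "Asp328Gly",
    "c.1331C>G": "Pro444Arg",
    "c.639T>A": "His213Gln",
    "c.845T>C": "Leu282Pro",
}


def codon_number(hgvs_c: str) -> int:
    """Return the 1-based codon that contains a simple c. substitution."""
    match = re.fullmatch(r"c\.(\d+)[ACGT]>[ACGT]", hgvs_c.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c!r}")
    # Coding position 1 is the A of ATG, so positions 1-3 form codon 1.
    return (int(match.group(1)) - 1) // 3 + 1


def residue_number(protein_change: str) -> int:
    """Return the residue number from a three-letter change such as His441Arg."""
    match = re.fullmatch(r"[A-Z][a-z]{2}(\d+)[A-Z][a-z]{2}", protein_change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {protein_change!r}")
    return int(match.group(1))


if __name__ == "__main__":
    for c_change, p_change in SUBSTITUTIONS.items():
        agrees = codon_number(c_change) == residue_number(p_change)
        print(f"{c_change:<10} codon {codon_number(c_change):>3}  {p_change}  agrees={agrees}")
```

The frameshift variants (c.800dupG, c.1190delT) and the gene deletions are deliberately rejected by `codon_number`, since they do not correspond to a single amino acid substitution.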
All missense variants identified in TBL1XR1 overlapping the WD40 domain of TBLR1 (June 2015; see also Fig. 4)
| Variation | Source (allele count) | Location (GRCh37) | Ref/alt | Predicted amino acid change |
|---|---|---|---|---|
| Population | ExAC (1) | chr3:176768368 | C/T | Gly153Glu |
| Population | dbSNP | chr3:176768338 | A/G | Val163Ala |
| Population | ExAC (1) | chr3:176768288 | C/T | Val180Ile |
| Population | ExAC (1) | chr3:176767892 | T/A | Ser199Cys |
| Population | ExAC (1) | chr3:176767879 | G/C | Thr203Ser |
| Diagnostic | DDD | chr3:176767848 | A/T | His213Gln |
| Population | ExAC (1) | chr3:176765173 | C/T | Ser260Asn |
| Population | dbSNP | chr3:176765158 | T/C | His265Arg |
| Diagnostic | O'Roak | chr3:176765107 | A/G | Leu282Pro |
| Population | ExAC (1) | chr3:176756189 | T/C | Asn320Ser |
| Population | dbSNP | chr3:176756189 | T/G | Asn320Thr |
| Population | EVA | chr3:176756187 | T/C | Thr321Ala |
| Diagnostic | DDD | chr3:176756165 | T/C | Asp328Gly |
| Population | ExAC (1) | chr3:176756102 | G/T | Thr349Lys |
| Population | ExAC (2) | chr3:176755930 | T/C | Thr360Ala |
| Population | dbSNP | chr3:176755930 | T/A | Thr360Ser |
| Population | ExAC (1) | chr3:176755923 | T/G | Asn362Thr |
| Diagnostic | DDD | chr3:176755900 | C/A | Asp370Tyr |
| Population | ExAC (1) | chr3:176752065 | T/C | Asn391Asp |
| Population | ExAC (2) | chr3:176752022 | C/T | Gly405Glu |
| Population | ExAC (5) | chr3:176752016 | T/C | Asn407Ser |
| Population | dbSNP | chr3:176752017 | T/C | Asn407Asp |
| Population | ExAC (1) | chr3:176752014 | T/C | Asn408Asp |
| Population | ExAC (1) | chr3:176750916 | A/C | Phe420Cys |
| Population | ExAC (1) | chr3:176750908 | T/C | Thr423Ala |
| Population | ExAC (1) | chr3:176750905 | C/G | Val424Leu |
| Population | ExAC (1) | chr3:176750884 | G/C | Arg431Gly |
| Population | dbSNP | chr3:176750883 | C/T | Arg431Gln |
| Population | dbSNP | chr3:176750860 | T/C | Thr439Ala |
| Population | dbSNP | chr3:176750855 | T/G | Lys440Asn |
| Diagnostic | DDD | chr3:176750853 | T/C | His441Arg |
| Diagnostic | DDD | chr3:176750844 | G/C | Pro444Arg |
| Population | ExAC (1) | chr3:176750817 | T/C | Asp453Gly |
| Population | ExAC (2) | chr3:176750811 | C/T | Arg455Lys |
| Population | ExAC (27) | chr3:176744255 | G/A | Ala475Val |
| Population | ExAC (1) | chr3:176744247 | G/C | His478Asp |
| Population | ExAC (1) | chr3:176744189 | T/C | Lys497Arg |
| Population | dbSNP | chr3:176743294 | G/A | Arg513Trp |
| Population | ExAC (1) | chr3:176743291 | T/G | Lys514Gln |
Figure 4. Z-axis location of all variants in the WD40 domain of TBLR1. (A) Graphical representation taken from the top to bottom face (PDB entry 4lg9). Likely pathogenic missense mutations are indicated in red (with new diagnoses from the DDD study completely filled), whereas population missense variants are indicated in green and other residues are indicated in blue. The backbone position of all residues is shown, based on the Z-axis location of the backbone carbonyl carbon in the crystal structure. Larger diamonds represent variants that are present multiple times across the databases, and crosses indicate the approximate interpolated location of residues that are absent from the PDB file. (B) Three-dimensional representation viewed from the side using PDB entry 4lg9, with all missense variants highlighted using stick representation (space-filled for new DDD diagnoses). Likely pathogenic missense mutations are indicated in red, whereas population missense variants are indicated in green and the rest of the domain is represented using blue ribbons. (C) Boxplot of Z-axis location in PDB entry 4lg9 of diagnostic mutations (red), the conserved tetrads (beige), hotspot residues on the top face (grey), population variation (green) and all amino acid residues in the domain (blue) in the TBLR1 protein. P-values are not significant between the diagnostic/tetrad/top face residues or between population/all residues, but are significant between these groups (diagnostic versus population residues, P = 9 × 10⁻⁵).
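The Z-axis measure described in the Figure 4 caption can be sketched in Python as follows: read the Z coordinate of each residue's backbone carbonyl carbon (atom name `C`) from the fixed-column ATOM records of a PDB-format file, then summarise a group of residues by its median. This is a minimal illustration rather than the authors' pipeline. It assumes the structure has already been oriented so that the propeller axis lies along Z, that chain `A` is the chain of interest, and that a median is an adequate summary; the statistical test behind the reported P-value is not stated here, so none is implemented. The helper `_atom_line` and the demo coordinates are invented solely to exercise the parser.

```python
from statistics import median

# Residue numbers of the diagnostic missense variants, taken from the table
# of WD40-domain variants above.
DIAGNOSTIC_RESIDUES = [213, 282, 328, 370, 441, 444]


def carbonyl_z_by_residue(pdb_lines, chain="A"):
    """Map residue number -> Z coordinate of the backbone carbonyl carbon."""
    z_by_residue = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: atom name 13-16, altLoc 17, chain 22,
        # residue number 23-26, Z coordinate 47-54 (all 1-based).
        if line[12:16].strip() != "C" or line[21] != chain:
            continue
        if line[16] not in (" ", "A"):  # keep only the first alternate location
            continue
        z_by_residue[int(line[22:26])] = float(line[46:54])
    return z_by_residue


def median_z(z_by_residue, residues):
    """Median Z of the listed residues; residues missing from the file are skipped."""
    present = [z_by_residue[r] for r in residues if r in z_by_residue]
    return median(present) if present else None


def _atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format one ATOM record in fixed PDB columns (demo helper only)."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Invented placeholder records, NOT coordinates from 4lg9.
demo = [
    _atom_line(1, " CA", "GLY", "A", 1, 0.0, 0.0, 9.5),   # ignored: not atom C
    _atom_line(2, " C", "GLY", "A", 1, 0.0, 0.0, 10.0),
    _atom_line(3, " C", "ALA", "A", 2, 0.0, 0.0, 12.0),
    _atom_line(4, " C", "SER", "A", 3, 0.0, 0.0, -4.0),
    _atom_line(5, " C", "GLY", "B", 1, 0.0, 0.0, 99.0),   # ignored: other chain
]

if __name__ == "__main__":
    z = carbonyl_z_by_residue(demo)
    print(z)
    print(median_z(z, [1, 2]), median_z(z, [3]))
```

With a locally downloaded and suitably oriented copy of the structure, `median_z(carbonyl_z_by_residue(open(path)), DIAGNOSTIC_RESIDUES)` would give the diagnostic-group summary, where `path` is a hypothetical file location. Residues absent from the file are simply skipped, whereas the figure interpolates their approximate position.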
Figure 1. Structure of TBLR1. (A) Domain structure with location of diagnostic missense mutations. The five new DDD mutations are indicated in black and the two previously published mutations in grey. (B) Three-dimensional β-propeller structure of the WD40 domain from PDB entry 4lg9, top and side views. The eight propeller blades are rainbow coloured, starting with red for the N-terminus through to violet for the C-terminus.
Figure 2. Conserved sequence elements of the WD40 motif. (A) PROSITE sequence logo for the WD40 motif, derived from a multiple sequence alignment of 6896 sequence fragments. The one-letter amino acid codes are coloured by type (blue basic, red acidic, green and purple polar and the rest black). The height of each letter corresponds to its frequency of occurrence in the alignment. (B) Structure-based alignment of the eight WD40 motifs in the crystal structure of TBLR1. The motifs were manually extracted from the 4lg9 PDB file and then aligned using the PDBeFold Server (36). The numbers on the left show the range of residue numbers in the sequence on that line. The one-letter amino acid codes are coloured as per the PROSITE sequence logo (A); lower-case letters correspond to residues not aligned in the 3D superposition. The numbers along the bottom roughly correspond to the sequence positions in the WD40 motif in (A). The amino acids with an orange background are those belonging to the Asp-His-Ser/Thr-Trp tetrad. The red borders identify the five amino acids involved in the DDD missense mutations: His213Gln, Asp328Gly, Asp370Tyr, His441Arg and Pro444Arg. The amino acids with light grey backgrounds are the hotspot residues on the domain's top face, as identified by WDSPdb (31); these are the residues likely to interact with β-catenin when it binds.
Figure 3. Representation of the hydrogen-bonding network of the DHSW tetrad. Taken from the fifth WD40 motif in the 3D structure of TBLR1 (PDB entry 4lg9). (A) Schematic representation showing the four sidechains involved: Asp370, His348, Ser366 and Trp376. Hydrogen bonds are shown by the green dotted lines. (B) Three-dimensional representation showing the location and sidechains of the four tetrad residues; the rest of the domain is represented only by backbone atoms N, Cα and C. Potential hydrogen bonds are shown by the dashed lines. Note the importance of the highly conserved Asp370, which can hydrogen-bond not only to the histidine but also to the backbone of neighbouring strands, helping to hold the propeller-blade structure together.