Sima Drini, Alexis Criscuolo, Pierre Lechat, Hideo Imamura, Tomáš Skalický, Najma Rachidi, Julius Lukeš, Jean-Claude Dujardin, Gerald F Späth.
Abstract
All eukaryotic genomes encode multiple members of the heat shock protein 70 (HSP70) family, which evolved distinctive structural and functional features in response to specific environmental constraints. Phylogenetic analysis of this protein family thus can inform on genetic and molecular mechanisms that drive species-specific environmental adaptation. Here we use the eukaryotic pathogen Leishmania spp. as a model system to investigate the evolution of the HSP70 protein family in an early-branching eukaryote that is prone to gene amplification and adapts to cytotoxic host environments by stress-induced and chaperone-dependent stage differentiation. Combining phylogenetic and comparative analyses of trypanosomatid genomes, the draft genome of Paratrypanosoma, and recently published genome sequences of 204 L. donovani field isolates, we gained unique insight into the evolutionary dynamics of the Leishmania HSP70 protein family. We provide evidence for (i) significant evolutionary expansion of this protein family in Leishmania through gene amplification and functional specialization of highly conserved canonical HSP70 members, (ii) evolution of trypanosomatid-specific, non-canonical family members that likely gained ATPase-independent functions, and (iii) loss of one atypical HSP70 member in the Trypanosoma genus. Finally, we reveal considerable copy number variation of canonical cytoplasmic HSP70 in highly related L. donovani field isolates, thus identifying this locus as a potential hot spot of environment-genotype interaction. Our data draw a complex picture of the genetic history of HSP70 in trypanosomatids that is driven by the remarkable plasticity of the Leishmania genome to undergo massive intra-chromosomal gene amplification to compensate for the absence of regulated transcriptional control in these parasites.
Keywords: HSP70; Leishmania; copy number variation; evolution; gene loss; heat shock protein; phylogeny; synteny
Year: 2016 PMID: 27371955 PMCID: PMC4943205 DOI: 10.1093/gbe/evw140
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: Comparative analyses of the L. major HSP70 family. (A) HSP70 gene coding density. The number of HSP70 members relative to the genome size of selected eukaryotes is shown. The phylogenetic relationship (left) is freely derived from consensus knowledge on eukaryote systematics (Adl et al. 2012; He et al. 2014; Williams 2014). (B) Unrooted ML phylogenetic tree of the TriTryp HSP70 family. E. coli DnaK (acc. no. WP_000516138.1), human HSPA9 (AAH24034.1), HSPA1A (AAH18740.1), HSPA1B (EAX03529.1), HSPA5 (AAI12964.1), and HSPA8 (AAH07276.2) were included to identify the clusters of potential mitochondrial, cytoplasmic, or ER HSP70 family members. The different colors correspond to different phylogenetic clusters that are labeled according to table 1. The scale bar represents 0.4 amino acid substitutions per site.
Figure: Identification of L. major HSP70 family members. Domain structure (left) and unrooted ML phylogenetic tree (right) of L. major HSP70 family members (proposed and current gene names are given in the middle). Domain structures were built using the NCBI Batch CD-search (Marchler-Bauer et al. 2015) and curated manually with respect to the domain boundaries given in table 1. Black, protein backbone; red, nucleotide binding domain (cd10233, cl17037, cd11733, cd10228, cd10230, and cd10170); gray, HSP70 domain (PTZ00009, pfam00012, PTZ00186, and PTZ00400); purple, TPR domain (pfam00515, cl22897, and pfam13424); cyan, protein domain of unknown function (DUF3919); green, Hydantoinase/oxoprolinase domain (pfam01968). Left and right scale bars represent 100 amino acids and 0.5 amino acid substitutions per site, respectively. Confidence support at the branches of the tree is based on 100 bootstrap replicates.
The L. major HSP70 protein family
| Gene ID | MW (kDa) | AA | GeneDB annotation | Name^a | Domains |
|---|---|---|---|---|---|
| LmjF.28.2770 | 71.7 | 658 | HSP70, putative | LmjF_HSPA1A | |
| LmjF.28.2780 | 71.7 | 658 | HSP70, putative | LmjF_HSPA1B | |
| LmjF.26.1240 | 70.6 | 641 | HSP70.4 | LmjF_HSPA2 | |
| LmjF.28.1200 | 71.9 | 658 | BiP | LmjF_HSPA5 | |
| LmjF.30.2460 | 68.9 | 635 | HSP70-related protein 1 | LmjF_HSPA9A | |
| LmjF.30.2470 | 71.9 | 662 | HSP70-related protein 1 | LmjF_HSPA9B | |
| LmjF.30.2480 | 71.9 | 662 | HSP70-related protein 1 | LmjF_HSPA9C | |
| LmjF.30.2490 | 71.6 | 660 | HSP70-related protein 1 | LmjF_HSPA9D | |
| LmjF.30.2550 | 70.6 | 652 | HSP70-related protein 1 | LmjF_HSPA9E | |
| LmjF.28.2820 | 72.3 | 662 | HSP70, putative | LmjF_HSPA3 | |
| LmjF.18.1370 | 91.8 | 823 | HSP110, putative | LmjF_HSPH1 | cd10228, 2-390 (0); cl17037, 2-390 (0); pfam02782, 286-388 (4.4e-06); |
| LmjF.26.0900 | 104.1 | 944 | HSP70-like protein | LmjF_HRP1 | |
| LmjF.35.4710 | 78.8 | 723 | Hypothetical protein | LmjF_HRP2 | |
| LmjF.01.0640 | 117.5 | 1112 | HSP70-like protein | LmjF_HRP3 | |
| LmjF.29.1240 | 78.9 | 722 | Hypothetical protein | LmjF_HRP4 | |
| LmjF.32.0190 | 81.5 | 768 | Hypothetical protein | LmjF_HRP5 | |
^a Proposed name following the nomenclature assigned by the HUGO Gene Nomenclature Committee (http://www.genenames.org/genefamilies/HSP, last accessed June 18, 2016) and used in the NCBI Entrez Gene database for human HSPs.

Domain abbreviations: cd10233, nucleotide-binding domain of HSPA1-A, -B, -L, HSPA-2, -6, -7, -8, and similar proteins (HSPA1-2_6-8-like_NBD); cl17037, nucleotide-binding domain of the sugar kinase/HSP70/actin superfamily (NBD_sugar-kinase_HSP70_actin); PTZ00009, heat shock 70 kDa protein domain; pfam00012, HSP70 domain; pfam01968, Hydantoinase/oxoprolinase domain; cl00668, Hydantoinase_A superfamily; pfam13057, protein domain of unknown function (DUF3919); cl16063, protein domain of unknown function (DUF3919); cd11733, nucleotide-binding domain of human HSPA9, Escherichia coli DnaK, and similar proteins (HSPA9-like_NBD); PTZ00186, heat shock 70 kDa precursor protein domain; cd10228, nucleotide-binding domain of 105/110 kDa heat shock proteins including HSPA4 and similar proteins (HSPA4_like_NBD); pfam02782, FGGY family of carbohydrate kinases, C-terminal domain (FGGY_C); cl17173, AdoMet_MTases superfamily; cd10230, nucleotide-binding domain of human HYOU1 and similar proteins (HYOU1-like_NBD); cd10170, nucleotide-binding domain of the HSP70 family (HSP70_NBD); PTZ00400, DnaK-type molecular chaperone; pfam00515, tetratricopeptide repeat domain TPR_1; cl22897, TPR_1 superfamily; pfam13424, tetratricopeptide repeat domain TPR_12. In bold are the domains shown in fig. 1.
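The one populated Domains cell in the table (for LmjF_HSPH1) lists each hit as an accession, a residue range, and a value in parentheses, separated by semicolons. As a minimal sketch of how such a cell could be turned into structured records, the Python below parses that layout. The names `DomainHit` and `parse_domains` are hypothetical, and reading the parenthesized value as a score such as an E-value is an assumption; the table header does not state what it is.

```python
import re
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One domain annotation (hypothetical record type for illustration)."""
    accession: str
    start: int
    end: int
    score: float  # assumption: the parenthesized value is a score such as an E-value


# Matches entries shaped like "pfam02782, 286-388 (4.4e-06)".
_ENTRY = re.compile(r"^\s*(\S+),\s*(\d+)-(\d+)\s*\(([^)]+)\)\s*$")


def parse_domains(cell: str) -> List[DomainHit]:
    """Split a semicolon-separated Domains cell into DomainHit records."""
    hits = []
    for part in cell.split(";"):
        if not part.strip():
            continue  # skip the empty piece after a trailing semicolon
        match = _ENTRY.match(part)
        if match is None:
            raise ValueError(f"unrecognized domain entry: {part!r}")
        accession, start, end, score = match.groups()
        hits.append(DomainHit(accession, int(start), int(end), float(score)))
    return hits


# Cell text copied from the LmjF.18.1370 (LmjF_HSPH1) row of the table.
hsph1_hits = parse_domains(
    "cd10228, 2-390 (0); cl17037, 2-390 (0); pfam02782, 286-388 (4.4e-06);"
)
```

Parsed this way, the overlapping cd10228 and cl17037 hits (both spanning residues 2-390) can be compared directly with the shorter pfam02782 hit at residues 286-388.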
Figure: Multiple sequence alignment of cytoplasmic (A) and mitochondrial (B) canonical L. major HSP70 members. Sequences of the L. major HSP70 family members were aligned with E. coli DnaK (acc. no. EDX37800) and human HSC70 (acc. no. P11142). Blue boxes, sequence elements implicated in nucleotide binding (phosphate-1, phosphate-2, and adenosine); green box, linker; red boxes, sub-domains of the beta-sheet; blue, residues involved in allosteric switching and inter-domain function (P143 and R151 in E. coli DnaK; Vogel et al. 2006); red, residues interacting with the DnaJ domain of HSP40 (Y145, N147, D148, N170, and T173 in E. coli DnaK; Gassler et al. 1998; Suh et al. 1998); purple, acidic Q stretch characteristic of mitochondrial HSP70; black arrow heads, residues in contact with the substrate (Mayer et al. 2000); red underlined, terminal EEV motif; blue underlined, terminal ER retention signal (Pelham 1989); green lines, lid sub-domains of DnaK. Adapted from Shonhai et al. 2007.
Figure: Species-specific copy number variation of the Leishmania HSPA1 and HSPA9 sub-families. (A) Synteny analysis. Synteny view of the HSPA1 (upper panel, colored in red) and HSPA9 loci (lower panel, colored in red) across the different Leishmania species indicated. Homologous genes are connected by lines. (B) Read depth analysis. The cumulative number of read counts that map to the annotated HSPA1 and HSPA9 genes in the various reference strains is plotted. Lmj, L. major Friedlin; Linf, L. infantum JPCM5; LdBPK, L. donovani LdBPK282A1; LdBPQ, L. donovani LdPBQ7IC8. (C) ML phylogenetic trees of HSPA1 and HSPA9 sequences. Thick branches correspond to bootstrap-based confidence support > 70%. The scale bar represents 0.05 nucleotide substitutions per site. Of note, the same ML trees were inferred when using codon evolutionary models (Gil et al. 2013).
Figure: Absence of HRP4 in the Trypanosoma spp. genome is due to chromosomal deletion. (A) Re-annotation of the L. donovani HRP4 ORF. Schematic representation of the L. donovani HRP4 CDS region, showing the 5′ UTR (black), the putative additional coding sequence of 171 nucleotides of L. donovani HRP4 (white), and the currently annotated ORF (gray). The numbers indicate the percent nucleotide identity to the L. tarentolae HRP4 CDS region (see supplementary fig. S11, Supplementary Material online). ATG, currently mis-annotated start codon; ATG*, correct start codon identified in this study. (B) Validation of the HRP4 ORF by proteomics analysis. The spectrum of the peptide VAIMDPAK, which is diagnostic for the additional 57 amino acids of the longer ORF, is shown as identified by LC-MS-MS analysis of L. donovani LdBob (Hem et al. 2010). (C) HRP4 domain structure. The domain structure for HRP4 was built using the NCBI Batch CD-search (Marchler-Bauer et al. 2015) and curated manually. The numbers indicate % identity to the L. major HRP4 protein across the domains indicated. Black, protein backbone; red, nucleotide binding domain (cd10170); purple, TPR domain (pfam13424); white arrows, TPR repeats. (D) ML phylogenetic tree of HRP4 and HSPA1 sequences. Thick branches correspond to bootstrap-based confidence support > 70%. The scale bar represents 0.5 amino acid substitutions per site. Of note, the long branch joining the two clades was shortened for readability. (E) Synteny analysis of HRP4. Synteny view of the HRP4 locus (red) and surrounding genes across the indicated Leishmania and Trypanosoma species. Homologous genes are connected by lines.
Figure: Heat map of gene copy number variation of HSP70 family members in LdBPK field isolates. The normalized read depth for the genes coding for the indicated HSP70 members is plotted against 204 L. donovani field isolates (Imamura et al. 2016), L. infantum JPCM5, and L. donovani LV9. The color scale at the upper right shows normalized depth, calibrated so that a value of one corresponds to one gene copy for a single-copy gene on a diploid chromosome. The dendrogram shown at the top of the heat map was established based on the Euclidean distance of the depth of HSP70 genes.
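The two quantities named in this caption, normalized read depth and Euclidean distance between isolates, can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names, the choice of a median over single-copy background genes as the baseline, and all depth values are hypothetical, chosen only so that a single-copy gene scores 1.

```python
import math
from statistics import median
from typing import Dict, List, Sequence


def normalized_depth(gene_depth: float, background_depths: Sequence[float]) -> float:
    """Scale a gene's read depth so that a single-copy gene scores about 1.

    Assumption: the baseline is the median depth of genes presumed to be
    single copy on the same diploid chromosome.
    """
    baseline = median(background_depths)
    if baseline <= 0:
        raise ValueError("background depth must be positive")
    return gene_depth / baseline


def copy_profile(depths: Dict[str, float], background: Sequence[float],
                 genes: Sequence[str]) -> List[float]:
    """Normalized depth for each gene, in a fixed gene order."""
    return [normalized_depth(depths[g], background) for g in genes]


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two equal-length depth profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


GENES = ["HSPA1", "HSPA9"]

# Hypothetical read depths for two made-up isolates (not data from the study).
isolate_a = copy_profile({"HSPA1": 120, "HSPA9": 200}, [40, 40, 38, 42], GENES)
isolate_b = copy_profile({"HSPA1": 300, "HSPA9": 250}, [50, 50, 48, 52], GENES)
distance_ab = euclidean(isolate_a, isolate_b)
```

A matrix of such pairwise distances over all isolates is the kind of input a hierarchical clustering routine would take to build a dendrogram like the one described in the caption.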