Maria Kamilari, Aslak Jørgensen, Morten Schiøtt, Nadja Møbjerg.
Abstract
BACKGROUND: Tardigrades are renowned for their ability to enter cryptobiosis (latent life) and endure extreme stress, including desiccation and freezing. There is increasing focus on revealing the molecular mechanisms underlying this tolerance. Here, we provide the first transcriptomes from the heterotardigrade Echiniscoides cf. sigismundi and the eutardigrade Richtersius cf. coronifer, and compare these with data from other tardigrades and six eukaryote models. Investigating 107 genes/gene families, our study provides a thorough analysis of tardigrade gene content with a focus on stress tolerance.
Keywords: Cold shock domain; Functional gene categories; Model organisms; Stress genes; Tardigrada; Transcriptomics
Year: 2019 PMID: 31340759 PMCID: PMC6652013 DOI: 10.1186/s12864-019-5912-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Table 1. Statistical summary of RNA-Seq on Echiniscoides cf. sigismundi and Richtersius cf. coronifer (data from BGI)
| | E. cf. sigismundi | R. cf. coronifer |
|---|---|---|
| Total bases | 5,531,945,800 | 5,761,908,600 |
| Number of raw reads | 58,601,018 | 64,000,000 |
| Number of clean reads | 55,319,458 | 57,619,086 |
| GC% | 34.73% | 47.40% |
| Q20 | 97.54% | 96.63% |
| Number of contigs | 55,499 | 84,106 |
| Mean length of contigs (bp) | 450 | 454 |
| N50 of contigs | 1176 | 1259 |
| Number of Unigenes | 31,601 | 55,053 |
| Mean length of Unigenes | 830 | 1052 |
| N50 of Unigenes | 1524 | 2197 |
| Number of Unigenes > 1 kb | 9974 | 20,319 |
| Number of Unigenes > 3 kb | 1068 | 4589 |
| FPKM mean (min–max) | 27.3587 (0.0–44,747.82) | 13.4950 (0.0–14,668.10) |
| FPKM median | 4.6156 | 2.1674 |
| FPKM sum | 864,563.2158 | 742,939.1452 |
| TPM mean (min–max) | 31.6446 (0–51,757.72) | 18.1643 (0–19,743.35) |
| TPM median | 5.3387 | 2.9173 |
| Number of Unigenes after filtering<sup>a</sup> | 15,784 | 21,384 |
| Mean length of Unigenes after filtering | 1175 | 1160 |
<sup>a</sup> For the detailed analysis of stress related genes (see Table 3), we applied the following filtering parameters: FPKM > 1; transcript length > 300 bp; longest contig for each locus (see text for details)
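The FPKM and TPM rows of the table above appear to be related by the usual rescaling, TPM = FPKM / (sum of FPKM) × 10^6, and the footnote describes a three-part filter. The Python sketch below illustrates both under stated assumptions: the `Unigene` record, its field names, and the toy records are hypothetical and do not come from the BGI pipeline, and applying the FPKM and length thresholds before choosing the longest contig per locus is only one possible reading of the footnote.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


def fpkm_to_tpm(fpkm: float, fpkm_sum: float) -> float:
    """Rescale one FPKM value so that all values in the sample sum to 1e6."""
    return fpkm / fpkm_sum * 1e6


@dataclass
class Unigene:  # hypothetical record, not a BGI data structure
    locus: str
    contig_id: str
    length: int  # transcript length in bp
    fpkm: float


def filter_unigenes(unigenes: Iterable[Unigene],
                    min_fpkm: float = 1.0,
                    min_length: int = 300) -> List[Unigene]:
    """Keep, per locus, the longest contig with FPKM > 1 and length > 300 bp.

    Assumption: the thresholds are applied before the longest contig is chosen.
    """
    best: Dict[str, Unigene] = {}
    for u in unigenes:
        if u.fpkm > min_fpkm and u.length > min_length:
            if u.locus not in best or u.length > best[u.locus].length:
                best[u.locus] = u
    return list(best.values())


# First data column of the table: FPKM mean 27.3587, FPKM sum 864,563.2158.
tpm_mean = fpkm_to_tpm(27.3587, 864563.2158)  # close to the tabulated 31.6446

# Toy records (invented for illustration only).
toy = [
    Unigene("locusA", "A.1", 500, 5.0),
    Unigene("locusA", "A.2", 900, 2.0),   # longest passing contig of locusA
    Unigene("locusB", "B.1", 250, 10.0),  # dropped: not longer than 300 bp
    Unigene("locusC", "C.1", 1200, 0.5),  # dropped: FPKM not above 1
]
kept = filter_unigenes(toy)
```

The rescaled mean, median and maximum agree with the tabulated TPM values to within rounding of the inputs, which is consistent with the TPM rows having been derived from the FPKM rows in this way.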
Table 2. Annotation results for the Echiniscoides cf. sigismundi and Richtersius cf. coronifer transcriptomes (data from BGI)
| Annotation | E. cf. sigismundi | R. cf. coronifer |
|---|---|---|
| Number of Unigenes | 31,601 | 55,053 |
| Unigenes with hits in NR database | 13,388 | 20,001 |
| Unigenes with hits in NT database | 4334 | 3865 |
| Unigenes with hits in Swiss-Prot database | 12,295 | 18,132 |
| Unigenes with KEGG pathways | 10,519 | 15,898 |
| Unigenes with hits in COG database | 6042 | 9195 |
| Unigenes with GO terms | 6047 | 10,853 |
| Total annotated Unigenes | 14,159 | 20,326 |
| Protein coding region prediction | | |
| Unigenes mapped to protein databases<sup>a</sup> | 13,578 | 20,043 |
| Unigenes with predicted CDS (ESTScan) | 7359 | 3322 |
<sup>a</sup> NR, Swiss-Prot, KEGG and COG databases
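As a quick reading aid, the annotated share of each transcriptome can be computed from the "Number of Unigenes" and "Total annotated Unigenes" rows above. This is a minimal sketch; the assignment of the two value columns to species follows the order in the caption and is an assumption, and the helper name `percent` is arbitrary.

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Counts copied from the annotation table; column order assumed from the caption.
counts = {
    "E. cf. sigismundi": {"unigenes": 31601, "annotated": 14159},
    "R. cf. coronifer": {"unigenes": 55053, "annotated": 20326},
}

coverage = {
    species: percent(c["annotated"], c["unigenes"])
    for species, c in counts.items()
}

for species, pct in coverage.items():
    print(f"{species}: {pct:.1f}% of unigenes annotated")
```

Under that column assignment, roughly 45% and 37% of the unigenes, respectively, received at least one annotation.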
Table 3. Stress related genes within the ten investigated eukaryotes. Numbers reflect the number of putative genes retrieved within each gene category (see Additional file 1 for details)
| Gene class/category | Es | Rc | Rv | He | Dm | Ce | Hs | Xt | Dr | Sc |
|---|---|---|---|---|---|---|---|---|---|---|
| Phylum | Tardigrada | Tardigrada | Tardigrada | Tardigrada | Arthropoda | Nematoda | Chordata | Chordata | Chordata | Ascomycota |
| Tardigrade unique proteins<sup>a</sup> | 0 | 23 | 32 | 25 | 0 | 0 | 0 | 0 | 0 | 0 |
| Late Embryogenesis Abundant (LEA)<sup>b</sup> | 5 | 6 | 9 | 7 | 0 | 2 | 0 | 0 | 0 | 0 |
| Heat shock proteins<sup>c</sup> | 56 | 65 | 59 | 117 | 79 | 65 | 90 | 77 | 96 | 45 |
| RNA/DNA Chaperones (Cold Shock Domain containing)<sup>d</sup> | 3 | 1 | 1 | 1 | 1 | 6 | 5 | 5 | 3 | 0 |
| DNA repair | (58) | (91) | (74) | (80) | (61) | (57) | (77) | (73) | (74) | (57) |
| | 0 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 0 |
| Base excision repair<sup>f</sup> | 18 | 31 | 24 | 27 | 15 | 13 | 22 | 21 | 19 | 14 |
| Mismatch repair<sup>g</sup> | 15 | 20 | 13 | 16 | 10 | 12 | 14 | 14 | 14 | 14 |
| Nucleotide excision repair<sup>h</sup> | 18 | 23 | 18 | 21 | 15 | 16 | 18 | 17 | 19 | 15 |
| Non-homologous end-joining<sup>i</sup> | 0 | 5 | 6 | 6 | 8 | 3 | 5 | 5 | 5 | 5 |
| Homologous recombination<sup>j</sup> | 7 | 11 | 12 | 9 | 12 | 12 | 15 | 13 | 14 | 9 |
| Antioxidative stress | (70) | (79) | (79) | (89) | (82) | (89) | (66) | (53) | (68) | (29) |
| | 8 | 14 | 17 | 15 | 6 | 6 | 4 | 2 | 5 | 3 |
| | 0 | 4 | 4 | 4 | 2 | 3 | 1 | 2 | 1 | 2 |
| | 5 | 7 | 9 | 12 | 9 | 3 | 6 | 6 | 6 | 3 |
| | 12 | 13 | 10 | 12 | 14 | 11 | 14 | 12 | 14 | 8 |
| | 5 | 4 | 3 | 3 | 4 | 6 | 4 | 4 | 4 | 5 |
| | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| | 2 | 1 | 1 | 1 | 2 | 8 | 8 | 6 | 9 | 3 |
| | 1 | 2 | 2 | 3 | 2 | 1 | 1 | 1 | 1 | 1 |
| | 35 | 30 | 31 | 34 | 38 | 49 | 22 | 13 | 20 | 2 |
| | 0 | 2 | 0 | 2 | 3 | 0 | 4 | 5 | 6 | 0 |
| | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 1 | 1 |
| Peroxisomal biogenesis factors<sup>k</sup> | 4 | 3 | 5 | 2 | 17 | 13 | 21 | 21 | 23 | 12 |
| Trehalose metabolism | (3) | (9) | (10) | (9) | (5) | (8) | (2) | (2) | (2) | (7) |
| | 0 | 1 | 1 | 0 | 2 | 3 | 0 | 0 | 0 | 4 |
| | 3 | 8 | 9 | 9 | 3 | 5 | 2 | 2 | 2 | 3 |
Es: Echiniscoides cf. sigismundi, Rc: Richtersius cf. coronifer, Rv: Ramazzottius cf. varieornatus, He: Hypsibius exemplaris, Dm: Drosophila melanogaster, Ce: Caenorhabditis elegans, Hs: Homo sapiens, Xt: Xenopus tropicalis, Dr: Danio rerio, Sc: Saccharomyces cerevisiae
<sup>a</sup> CAHS; SAHS; MAHS; RvLEAM; Dsup
<sup>b</sup> LEA; DUR-1
<sup>c</sup> HSP90; HSP70; HSP60; HSP40; HSP20; HSP10
<sup>d</sup> CSP; lin28; Y-box
<sup>e</sup> Tp53; p63/p73
<sup>f</sup> UNG; XRCC1; XRCC3; XRCC2; PNKP; Tdp1; APTX; POLB; POLD; POLE; FEN1; PCNA; PARP1–4
<sup>g</sup> MSH2; MSH6; MSH3; MSH4; MSH5; MLH1; PMS2; MLH3; Exo1; RFC
<sup>h</sup> XPC; CETN2; Rad23; DDB; GTF2H1/TFIIH1; GTF2H2/TFIIH2; GTF2H3/TFIIH3; GTF2H4/TFIIH4; CDK7; ERCC3; ERCC2; ERCC1; XPA; ERCC5
<sup>i</sup> XRCC6; XRCC5; CLP/XRCC4; LIG4; NHEJ1
<sup>j</sup> MRE11; Rad50; NBS1; Rad51; CtIP; BRCA1; BRCA2; Slx1; SLX4; Mus81; EME1
<sup>k</sup> PEX1; PEX2; PEX3; PEX5; PEX6; PEX7; PEX10; PEX11; PEX12; PEX13; PEX14; PEX16; PEX19; PEX26; PXMP2; PMP34; PXMP4; TYSND1
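The parenthesised figures in the stress-gene table read as category totals, and this can be checked by summing the sub-rows beneath each one. The sketch below does so for the DNA repair block, with counts copied from the table. One sub-row has no label in the table as extracted; the key used for it here is an assumption based on the otherwise unattached footnote listing Tp53 and p63/p73.

```python
SPECIES = ["Es", "Rc", "Rv", "He", "Dm", "Ce", "Hs", "Xt", "Dr", "Sc"]

# Sub-rows of the DNA repair block, in table order.
dna_repair_rows = {
    # Label missing in the extracted table; name assumed from the Tp53; p63/p73 footnote.
    "p53 family (assumed label)": [0, 1, 1, 1, 1, 1, 3, 3, 3, 0],
    "Base excision repair": [18, 31, 24, 27, 15, 13, 22, 21, 19, 14],
    "Mismatch repair": [15, 20, 13, 16, 10, 12, 14, 14, 14, 14],
    "Nucleotide excision repair": [18, 23, 18, 21, 15, 16, 18, 17, 19, 15],
    "Non-homologous end-joining": [0, 5, 6, 6, 8, 3, 5, 5, 5, 5],
    "Homologous recombination": [7, 11, 12, 9, 12, 12, 15, 13, 14, 9],
}

# Parenthesised totals from the "DNA repair" row of the table.
reported_totals = [58, 91, 74, 80, 61, 57, 77, 73, 74, 57]

# Sum each species column across the sub-rows.
column_sums = [sum(col) for col in zip(*dna_repair_rows.values())]

mismatches = {
    sp: (calc, rep)
    for sp, calc, rep in zip(SPECIES, column_sums, reported_totals)
    if calc != rep
}
print("all totals consistent" if not mismatches else mismatches)
```

For the DNA repair block, the column sums reproduce the reported totals for all ten species when the unlabelled row is included.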
Fig. 1. Global comparisons of transcripts and predicted protein sequences. Transcript and protein sequences of E. cf. sigismundi and R. cf. coronifer were compared to sequences from other tardigrades and model organisms. Global comparisons were conducted using BLASTX for the transcript alignments (left) and BLASTP for the predicted protein sequences (right). Heatmaps represent the percentage of transcripts (left) or predicted proteins (right) with detectable sequence similarity at an e-value threshold of 10e−5
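Each heatmap cell in such a comparison is the percentage of query sequences with at least one hit at or below the e-value cutoff. A minimal sketch of that calculation follows, assuming BLAST tabular output (`-outfmt 6`), where the first column is the query ID and the eleventh is the e-value. The function name, the toy hit lines, and the cutoff passed in the example are illustrative assumptions, not the authors' script or data.

```python
from typing import Iterable


def percent_with_hits(blast_tab: Iterable[str],
                      n_queries: int,
                      max_evalue: float) -> float:
    """Percentage of queries with at least one hit at e-value <= max_evalue.

    blast_tab: lines of BLAST tabular output (12 standard columns).
    n_queries: total number of query sequences, including those with no hit.
    """
    queries_with_hit = set()
    for line in blast_tab:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            queries_with_hit.add(query_id)
    return 100.0 * len(queries_with_hit) / n_queries


# Toy hits (invented): q1 has two strong hits, q2 only a weak one.
toy_hits = [
    "q1\ts1\t90.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t200",
    "q1\ts2\t85.0\t100\t9\t0\t1\t100\t1\t100\t1e-20\t150",
    "q2\ts3\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30",
]
share = percent_with_hits(toy_hits, n_queries=4, max_evalue=1e-5)
```

Counting distinct query IDs, rather than hit lines, keeps a query with many hits from being counted more than once.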
Fig. 2. Comparison of shared and species-specific orthologous protein groups as revealed by OrthoMCL analyses. (a) Shared and species-specific orthologous protein groups within tardigrades; (b) Shared and species-specific protein groups between tardigrades and other ecdysozoans as revealed by a comparison between E. cf. sigismundi, R. cf. coronifer, D. melanogaster and C. elegans; (c) Comparison of shared and species-specific orthologous protein groups between E. cf. sigismundi and four model eukaryote organisms; (d) Comparison of shared and species-specific orthologous protein groups between R. cf. coronifer and four model eukaryote organisms
Fig. 3. Comparative investigation of gene expression in tardigrades. Comparative data on the 10 most highly expressed protein-coding genes within tardigrade transcriptomes and cumulative expression of the stress related gene categories under study (for the complete list of genes, refer to Additional file 1). The predicted protein-coding genes were obtained from the transcriptomes of three tardigrade species (Echiniscoides cf. sigismundi, Richtersius cf. coronifer and Ramazzottius cf. varieornatus). Values are depicted as TPM. Gradient purple columns: dark = 1st transcript, light = 10th transcript. Rv1,2,3,5,6,7,9 = hypothetical proteins; Rv4 = SAHS1 (Secretory Abundant Heat Soluble 1); Rv10 = SAHS2 (Secretory Abundant Heat Soluble 2); Rv8 = cuticular protein. Rc1,6 = hypothetical proteins; Rc2,5 = uncharacterized proteins; Rc3 = PE-1 (Peritrophin-1); Rc4 = RhoGEF domain-containing protein; Rc7 = NOTCH1-like (Neurogenic locus notch-like protein 1); Rc8 = SAHS1 (Secretory Abundant Heat Soluble 1); Rc9 = COI (Cytochrome Oxidase subunit I); Rc10 = periostin. Es1,6,7,8 = hypothetical proteins; Es2,3,9 = uncharacterized proteins; Es4 = CSRP3 (cysteine and glycine rich protein 3 precursor); Es5 = proactivator polypeptide-like; Es10 = HSP20 (Heat Shock Protein 20)
Fig. 4. Alignment of amino acid sequences containing a Cold Shock Domain (CSD). Data were obtained from representative bacteria and animals, including tardigrades. Tardigrade sequences are indicated in bold in the left margin of the figure. RNP1 and RNP2 (shaded in grey) represent consensus RNA binding domains. DNA binding sites are highlighted in yellow. Note the RGG (green) and RG (orange) repeats. In the graphical representation below the CSD, the overall height of each stack indicates the sequence conservation at that position, while the height of the symbols within the stack indicates the relative frequency of each amino acid at that position
Fig. 5. Phylogenetic analysis of Cold Shock Domain proteins. Protein sequences of Cold Shock Domain containing proteins from various species of bacteria and animals were aligned using MUSCLE. The Maximum Likelihood phylogenetic tree was constructed using RAxML. Bootstrap values (1000 trials) are shown on branches. Clades with bootstrap values < 50 have been collapsed into polytomies using TreeGraph2 [92]. All animal taxa cluster together and are separated into a clade containing the YB proteins and a clade containing the Lin28 proteins with Zn finger motifs. The Es_CSP sequences form a separate clade as sister group to the YB cluster of the remaining animal taxa analyzed
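Collapsing weakly supported clades into polytomies, as described in the caption, means removing each internal node whose bootstrap value is below the cutoff and attaching its children directly to its parent. The sketch below illustrates the idea on a small hand-built tree. It is not TreeGraph2's implementation, and the `Node` class, function names, and toy tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:  # hypothetical minimal tree node
    name: str = ""
    support: Optional[float] = None  # bootstrap value; None for leaves and root
    children: List["Node"] = field(default_factory=list)


def collapse_low_support(node: Node, threshold: float = 50) -> Node:
    """Replace internal child nodes with support < threshold by their children."""
    kept: List[Node] = []
    for child in node.children:
        collapse_low_support(child, threshold)  # post-order: fix subtrees first
        is_weak = (bool(child.children)
                   and child.support is not None
                   and child.support < threshold)
        if is_weak:
            kept.extend(child.children)  # promote grandchildren: polytomy
        else:
            kept.append(child)
    node.children = kept
    return node


def to_newick(node: Node) -> str:
    """Serialise the tree (without branch lengths) for inspection."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"


# Toy tree: ((A,B)40,(C,D)90); the 40% clade falls below the cutoff.
tree = Node(children=[
    Node(support=40, children=[Node("A"), Node("B")]),
    Node(support=90, children=[Node("C"), Node("D")]),
])
collapsed = to_newick(collapse_low_support(tree, threshold=50))
```

In the toy tree, the clade with 40% support is dissolved so that A and B attach directly to the root, while the clade with 90% support is retained.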