Takahiro Sanada, Kyoko Tsukiyama-Kohara, Tadasu Shin-I, Naoki Yamamoto, Mohammad Enamul Hoque Kayesh, Daisuke Yamane, Jun-Ichiro Takano, Yumiko Shiogama, Yasuhiro Yasutomi, Kazuho Ikeo, Takashi Gojobori, Masashi Mizokami, Michinori Kohara.
Abstract
The northern tree shrew (Tupaia belangeri) has high potential as an animal model of human diseases and biology, given its genetic similarity to primates. Although genetic information on the tree shrew has already been published, the coding sequences (CDSs) of some tree shrew genes remained incomplete, and the reliability of these CDSs remained difficult to determine. To improve the determination of tree shrew CDSs, we sequenced the whole genome, mRNA, and total RNA and integrated the resulting data. Additionally, we established criteria for the selection of reliable CDSs and annotated these sequences by comparison to the human transcriptome, resulting in the identification of complete CDSs for 12,612 tree shrew genes and yielding a more accurate tree shrew genome database (TupaiaBase: http://tupaiabase.org). Transcriptome profiles in hepatitis B virus-infected tree shrew livers were analyzed for validation. Gene ontology analysis showed enriched transcriptional regulation at 1 day post-infection, namely in the "type I interferon signaling pathway". Moreover, a negative regulator of type I interferon, SOCS3, was induced. This work, which provides a tree shrew CDS database based on genomic DNA and RNA sequencing, is expected to serve as a powerful tool for further development of the tree shrew model.
Year: 2019 PMID: 31451757 PMCID: PMC6710255 DOI: 10.1038/s41598-019-48867-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Statistics of whole-genome sequencing data.
| Paired-end libraries | Insert size | Read length (bp) | Total data (Gb) | Sequence depth (fold)* | Physical depth (fold)* |
|---|---|---|---|---|---|
| Illumina reads | 170 bp | 100 | 60.96 | 19.82 | 16.84 |
| | 500 bp | 100 | 72.67 | 23.63 | 59.06 |
| | 800 bp | 100 | 51.25 | 16.66 | 66.65 |
| | 2 kb | 57 | 56.64 | 18.41 | 322.49 |
| | 5 kb | 49 | 35.94 | 11.68 | 596.03 |
| | 10 kb | 49 | 41.24 | 13.41 | 1368 |
| | 20 kb | 49 | 16.38 | 5.33 | 1087 |
| Total | | | 335.08 | 108.93 | 3516.07 |
*The genome size is assumed to be 3.08 Gb.
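The two depth columns can be approximated from the other columns of the table. The sketch below assumes the conventional definitions (sequence depth = total bases / genome size; physical depth = number of read pairs × insert size / genome size); the source states only the assumed genome size, so these formulas and the function names are illustrative assumptions rather than the authors' documented procedure.

```python
GENOME_SIZE_BP = 3.08e9  # assumed genome size, taken from the table footnote


def sequence_depth(total_bases: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Fold coverage by sequenced bases (assumed definition)."""
    return total_bases / genome_size


def physical_depth(
    total_bases: float,
    read_length: int,
    insert_size: int,
    genome_size: float = GENOME_SIZE_BP,
) -> float:
    """Fold coverage by the fragments spanned by read pairs (assumed definition)."""
    read_pairs = total_bases / read_length / 2  # two reads per fragment
    return read_pairs * insert_size / genome_size


# 170 bp library: 60.96 Gb of 100 bp reads
seq_170 = sequence_depth(60.96e9)
phys_170 = physical_depth(60.96e9, read_length=100, insert_size=170)

# 2 kb library: 56.64 Gb of 57 bp reads
phys_2k = physical_depth(56.64e9, read_length=57, insert_size=2000)

print(f"170 bp: sequence {seq_170:.2f}x, physical {phys_170:.2f}x")
print(f"2 kb:   physical {phys_2k:.2f}x")
```

Under these assumptions the computed values land within a fraction of a percent of the tabulated ones; the small residual is consistent with rounding of the genome size or of the total data.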
Statistics of the assembled sequence length of whole-genome sequencing.
| | Contig size (bp) | Contig number | Scaffold size (bp) | Scaffold number |
|---|---|---|---|---|
| N50 | 33,380 | 24,001 | 1,149,110 | 679 |
| Longest | 267,380 | — | 13,631,494 | — |
| Total Size | 2,709,670,168 | — | 2,746,321,810 | — |
| Total number (>=100 bp) | — | 585,492 | — | 447,618 |
| Total number (>=2 kb) | — | 121,063 | — | 7,608 |
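For readers unfamiliar with the N50 row, the following minimal sketch shows how the statistic is conventionally computed from a list of sequence lengths. The function name and the toy lengths are hypothetical, and reading the "Number" entry of the N50 row as the count of sequences needed to reach half of the assembly (often called L50) is an assumption, not something the source states.

```python
def n50(lengths):
    """Return (N50 length, number of sequences needed to reach it)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # N50 is the length of the sequence at which the running sum,
        # taken from longest to shortest, first covers half of the total.
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example (not data from the paper): total length is 300, half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```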
General statistics of predicted protein-coding genes of whole-genome sequencing.
| Gene set | Number | Average gene length (bp) | Average CDS length (bp) | Average exons per gene | Average exon length (bp) | Average intron length (bp) |
|---|---|---|---|---|---|---|
| GLEAN | 19,320 | 24,193 | 1,419 | 7.68 | 184.85 | 3,411 |
mRNA-seq overview.
| Sample | Total reads | % of >= Q30 bases | Trimmed reads | Total mapped reads | % of total mapped reads | Unmapped reads | % of unmapped reads |
|---|---|---|---|---|---|---|---|
| Uninfected tree shrew liver (#1) | 51.46 | 91.12 | 49.70 | 40.02 | 80.53 | 9.68 | 19.47 |
| Uninfected tree shrew liver (#2) | 52.91 | 90.82 | 50.98 | 45.73 | 82.96 | 9.39 | 17.04 |
| Uninfected tree shrew liver (#3) | 57.20 | 90.99 | 55.12 | 40.86 | 80.15 | 10.12 | 19.85 |
| Uninfected tree shrew liver (#4) | 53.87 | 90.86 | 51.85 | 42.14 | 81.26 | 9.72 | 18.74 |
| HBV-C 1 dpi tree shrew liver (#1) | 94.47 | 90.76 | 91.16 | 76.55 | 83.97 | 14.61 | 16.03 |
| HBV-C 1 dpi tree shrew liver (#2) | 111.28 | 90.31 | 107.00 | 87.40 | 81.68 | 19.60 | 18.32 |
| HBV-C 3 dpi tree shrew liver (#1) | 93.80 | 90.60 | 90.49 | 72.73 | 80.37 | 17.76 | 19.63 |
| HBV-C 3 dpi tree shrew liver (#2) | 129.30 | 90.44 | 124.43 | 99.89 | 80.27 | 24.54 | 19.73 |
| HBV-A 21 dpi tree shrew liver (#1) | 200.98 | 90.15 | 191.47 | 149.06 | 77.85 | 42.41 | 22.15 |
Read values represent millions of reads.
Total RNA-seq overview.
| Sample | Total reads | % of >= Q30 bases | Trimmed reads | Total mapped reads | % of total mapped reads | Unmapped reads | % of unmapped reads |
|---|---|---|---|---|---|---|---|
| Uninfected tree shrew liver (#1) | 117.38 | 94.87 | 115.39 | 100.37 | 86.99 | 15.02 | 13.01 |
| Uninfected tree shrew liver (#2) | 125.70 | 95.04 | 123.59 | 108.03 | 87.40 | 15.57 | 12.60 |
| Uninfected tree shrew liver (#3) | 92.43 | 94.74 | 90.28 | 82.01 | 90.84 | 8.27 | 9.16 |
| Uninfected tree shrew liver (#4) | 99.86 | 95.39 | 98.30 | 87.11 | 88.62 | 11.19 | 11.38 |
| HBV-C 1 dpi tree shrew liver (#1) | 106.45 | 95.33 | 104.75 | 92.07 | 87.90 | 12.68 | 12.10 |
| HBV-C 1 dpi tree shrew liver (#2) | 97.06 | 95.31 | 95.45 | 84.01 | 88.01 | 11.44 | 11.99 |
| HBV-C 3 dpi tree shrew liver (#1) | 99.51 | 95.51 | 97.87 | 85.30 | 87.15 | 12.57 | 12.85 |
| HBV-C 3 dpi tree shrew liver (#2) | 100.26 | 95.61 | 98.72 | 85.53 | 86.64 | 13.18 | 13.36 |
| HBV-A 21 dpi tree shrew liver (#1) | 94.93 | 95.57 | 93.48 | 80.42 | 86.02 | 13.07 | 13.98 |
| Uninfected tree shrew spleen (mix; #1, #2, and #4) | 91.34 | 95.43 | 89.69 | 79.00 | 88.09 | 10.69 | 11.91 |
| HBV-C 1 and 3 dpi tree shrew spleen (mix; 1 dpi [#1, #2], 3 dpi [#2]) | 88.74 | 95.55 | 87.09 | 78.24 | 89.83 | 8.85 | 10.17 |
| HBV-A 8 mpi tree shrew spleen (mix; #1, #2, and #3) | 93.64 | 95.56 | 91.72 | 82.19 | 89.62 | 9.52 | 10.38 |
Read values represent millions of reads.
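The mapped and unmapped columns in the RNA-seq overview tables are related by simple arithmetic. The sketch below assumes the percentages are taken relative to trimmed reads, which is inferred from the tabulated values rather than stated in the source; the function name is hypothetical.

```python
def mapping_summary(trimmed_m: float, mapped_m: float):
    """Return (unmapped reads in millions, % mapped, % unmapped).

    Assumes trimmed reads are the denominator for both percentages.
    """
    unmapped_m = trimmed_m - mapped_m
    pct_mapped = 100 * mapped_m / trimmed_m
    return round(unmapped_m, 2), round(pct_mapped, 2), round(100 - pct_mapped, 2)


# Total RNA-seq, uninfected tree shrew liver (#3): 90.28 M trimmed, 82.01 M mapped
summary = mapping_summary(90.28, 82.01)
print(summary)
```

For this row the result reproduces the tabulated 8.27 million unmapped reads, 90.84% mapped, and 9.16% unmapped.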
Figure 1. Schematic diagram of CDS identification of tree shrew genes.
Evaluation of assemblies by BUSCO.
| | TupaiaBase (Sanada et al.) | TreeshrewDB (Fan et al.) |
|---|---|---|
| Number of predicted genes | 53,935 | 119,898 |
| Number of predicted transcripts | 117,687 | 192,459 |
| BUSCO analysis | | |
| Complete BUSCOs (%) | 5,518 (89.1%) | 5,319 (85.9%) |
| Fragmented BUSCOs (%) | 398 (6.4%) | 562 (9.1%) |
| Missing BUSCOs (%) | 276 (4.5%) | 311 (5.0%) |
| Total BUSCOs (%) | 6,192 (100%) | 6,192 (100%) |
Figure 2. Analysis of the accuracy of the gene sequences predicted on the basis of the combined genome and RNA sequencing. (a) Nucleotide sequence identity for each gene, comparing the cloned sequence with the sequence predicted from whole-genome sequencing, or with the sequence predicted from the combined whole-genome and RNA sequencing data. (b) Percentage of genes whose predicted sequence (from whole-genome sequencing or from the combined sequence data) completely matched the cloned sequence. (c,d) Comparison of predicted and actual gene sequences for CD8A (c) and IL7 (d). Upper sequence: predicted sequence based on genome sequencing. Middle sequence: predicted sequence based on genome and RNA sequencing. Lower sequence: cloned sequence.
Figure 3. Expression level of liver-specific genes in tree shrew liver and in humanized mouse liver. Correlation between the gene expression level in humanized liver and that of the homologous gene in tree shrew liver, assessed for selected transcripts (a) or for transcripts that failed to meet our criteria (b). Broken lines indicate regression curves.
Figure 4. HBV infection in tree shrew. (a) Experimental schedule of HBV infection in tree shrew. (b–d) Viral DNA titers in sera (b) and liver (c), and serum ALT level (d), at 1 dpi or 3 dpi in HBV-infected tree shrew, or in uninfected tree shrew. Heavy bars indicate means of each group. (e) Histological analysis (hematoxylin-eosin staining; representative images) of liver from uninfected and HBV-infected (at 1 and 3 dpi) tree shrews. Bar, 100 μm.
Figure 5. GO term analysis of differentially expressed genes in HBV-infected tree shrew at 1 dpi (a) and 3 dpi (b).
Figure 6. Expression analysis of genes related to type I interferon signaling. (a) Expression levels (in uninfected tree shrew and in HBV-infected tree shrew at 1 and 3 dpi) of genes whose GO terms included “type I interferon signaling pathway” and that exhibited the strongest differential expression at 1 dpi. (b) Expression levels (in uninfected tree shrew and in HBV-infected tree shrew at 1 and 3 dpi) of genes known to be central to the type I interferon signaling pathway. Horizontal bars indicate mean values in each group. Asterisks indicate significant differences (p < 0.05).