Margherita Francescatto1, Marina Lizio2,3, Ingrid Philippens4, Luba M Pardo, Ronald Bontrop4, Mizuho Sakai2,3, Shoko Watanabe2,3, Masayoshi Itoh2,3,5, Akira Hasegawa2,3, Timo Lassmann3,6, Jessica Severin2,3, Jayson Harshbarger2,3, Imad Abugessaisa2, Takeya Kasukawa2, Piero Carninci2,3, Yoshihide Hayashizaki3,5, Alistair R R Forrest3,7, Hideya Kawaji2,3,5,8, Patrizia Rizzu9, Peter Heutink9.
Abstract
Rhesus macaque was the second non-human primate whose genome was fully sequenced and is one of the most widely used model organisms to study human biology and disease, thanks to the close evolutionary relationship between the two species. Compared with human, however, where several previously unknown RNAs have been uncovered, the macaque transcriptome is less studied. Publicly available RNA expression resources for macaque are limited, even for brain, which is highly relevant to the study of human cognitive abilities. In an effort to complement those resources, FANTOM5 profiled 15 distinct anatomical regions of the aged macaque central nervous system using Cap Analysis of Gene Expression (CAGE), a high-resolution, annotation-independent technology that allows monitoring of transcription initiation events with high accuracy. We identified 25,869 CAGE peaks, representing bona fide promoters. For each peak we provide detailed annotation, expanding the landscape of 'known' macaque genes, and we show concrete examples of how to use the resulting data. We believe this data represents a useful resource for understanding the central nervous system in macaque.
Year: 2017 PMID: 29087374 PMCID: PMC5663209 DOI: 10.1038/sdata.2017.163
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Table 1. List of the macaque and matching human samples (15 for each species) included in this study, with corresponding library ID, tissue name and basic QC values (RIN, total number of mapped reads and mapping rate). We additionally report in this table the abbreviations used in the figures for each CNS anatomical region.

| Library ID | Species | Tissue | Abbreviation | RIN | Mapped reads | Mapping rate (%) |
|---|---|---|---|---|---|---|
| CNhs14182 | Rhesus macaque | Amygdala | Amy | 9 | 4,565,587 | 50.9 |
| CNhs14174 | Rhesus macaque | Caudate | Cau | 8.4 | 3,348,210 | 44.68 |
| CNhs14176 | Rhesus macaque | Cerebellum | Cer | 9.2 | 5,682,996 | 65.99 |
| CNhs14171 | Rhesus macaque | Globus Pallidus | GlP | 8.1 | 4,486,692 | 50.57 |
| CNhs14180 | Rhesus macaque | Hippocampus | Hip | 8.7 | 3,017,562 | 41.6 |
| CNhs14172 | Rhesus macaque | Locus Coeruleus | LocC | 8.4 | 4,140,956 | 49.5 |
| CNhs14177 | Rhesus macaque | Medial Frontal Gyrus | MFG | 8.1 | 5,576,066 | 65.76 |
| CNhs14179 | Rhesus macaque | Medial Temporal Gyrus | MTG | 8.2 | 3,968,759 | 47.6 |
| CNhs14175 | Rhesus macaque | Medulla Oblongata | MedO | 8.2 | 5,407,111 | 61 |
| CNhs14169 | Rhesus macaque | Occipital Gyrus | Occ | 8.6 | 5,269,914 | 61.3 |
| CNhs14181 | Rhesus macaque | Parietal Gyrus | Par | 8.6 | 3,883,216 | 48.8 |
| CNhs14178 | Rhesus macaque | Putamen | Put | 8.1 | 4,493,463 | 52.28 |
| CNhs14168 | Rhesus macaque | Spinal Cord | SpC | 7.7 | 4,282,060 | 51.33 |
| CNhs14173 | Rhesus macaque | Substantia Nigra | SubN | 7.7 | 3,167,510 | 40.84 |
| CNhs14170 | Rhesus macaque | Thalamus | Thal | 8.4 | 6,211,376 | 71.69 |
| CNhs12311 | Homo sapiens | Amygdala | Amy | 6.9 | 14,101,506 | 41.5 |
| CNhs12321 | Homo sapiens | Caudate | Cau | 7.2 | 15,501,176 | 44.06 |
| CNhs12323 | Homo sapiens | Cerebellum | Cer | 6.9 | 17,383,592 | 44.66 |
| CNhs12319 | Homo sapiens | Globus Pallidus | GlP | 6.8 | 14,100,137 | 40.64 |
| CNhs12312 | Homo sapiens | Hippocampus | Hip | 6.7 | 15,910,618 | 41.21 |
| CNhs12322 | Homo sapiens | Locus Coeruleus | LocC | 7.2 | 14,226,392 | 40.88 |
| CNhs12310 | Homo sapiens | Medial Frontal Gyrus | MFG | 7.2 | 16,201,206 | 43.34 |
| CNhs12316 | Homo sapiens | Medial Temporal Gyrus | MTG | 7 | 16,329,604 | 40.89 |
| CNhs12315 | Homo sapiens | Medulla Oblongata | MedO | 7.5 | 15,199,808 | 43.08 |
| CNhs12320 | Homo sapiens | Occipital Gyrus | Occ | 8.4 | 15,966,482 | 39.88 |
| CNhs12317 | Homo sapiens | Parietal Gyrus | Par | 7.1 | 10,980,255 | 33.08 |
| CNhs13912 | Homo sapiens | Putamen | Put | 8.3 | 2,262,558 | 27.58 |
| CNhs12227 | Homo sapiens | Spinal Cord | SpC | 7.3 | 5,957,420 | 39.95 |
| CNhs12318 | Homo sapiens | Substantia Nigra | SubN | 8.1 | 12,393,047 | 41.45 |
| CNhs12314 | Homo sapiens | Thalamus | Thal | 7.3 | 18,287,224 | 43.51 |
Figure 1. Experimental design of this study.
The figure shows a schematic workflow from sampling of CNS anatomical regions to subsequent sample and data processing. The full sample names corresponding to the abbreviations used in the figure are reported in Table 1.
Number of CAGE peaks within 500 bp of at least one annotated gene model.

| Annotation | CAGE peaks | % of peaks |
|---|---|---|
| Augustus | 11,435 | 44.2 |
| EST | 6,514 | 25.2 |
| Genscan | 5,953 | 23 |
| refGene | 6,248 | 24.1 |
| Ensembl Gene | 14,376 | 55.6 |
| hg38 refGene liftOver | 15,562 | 60.2 |
| hg38 GENCODE liftOver | 18,635 | 72 |
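The percentage column in the table above appears to be each count expressed relative to the 25,869 CAGE peaks reported in the abstract. The following Python sketch recomputes it under that assumption; the variable and function names are illustrative and not part of the dataset. The recomputed values agree with the tabulated ones to within 0.1 percentage point.

```python
# Total number of CAGE peaks reported in the abstract.
# Assumption: the table percentages use this value as the denominator.
TOTAL_PEAKS = 25_869

# Counts copied from the table above (annotation source -> peaks within 500 bp).
peaks_near_gene_model = {
    "Augustus": 11_435,
    "EST": 6_514,
    "Genscan": 5_953,
    "refGene": 6_248,
    "Ensembl Gene": 14_376,
    "hg38 refGene liftOver": 15_562,
    "hg38 GENCODE liftOver": 18_635,
}


def percent_of_total(count, total=TOTAL_PEAKS):
    """Return `count` as a percentage of `total`."""
    return 100.0 * count / total


percentages = {
    name: percent_of_total(count)
    for name, count in peaks_near_gene_model.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

Because a peak can lie near gene models from several annotation sources, the rows are not mutually exclusive and the percentages are not expected to sum to 100.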
Figure 2. Promoter features and quality assessment of CAGE data.
(a) Graphical representation of the Spearman correlation values between all 15 macaque expression profiles presented in this study. The top dendrogram shows that the samples cluster in four major groups, as highlighted by the colored bar above the heatmap. (b) Scatterplot and Spearman correlation value for a pair of regions showing a high degree of similarity (caudate and putamen). Axes represent log-transformed TPM expression. (c) Multi-dimensional scaling representation of the samples included in the study, color-coded to show the clustering of samples (colors corresponding to those in a). (d) Frequency profile showing TATA-box enrichment upstream of the promoter. Inset shows a zoom-in view of the region (−50, +20), with a clear enrichment around 35 bases upstream of the TSS.
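The quantities in panels (a) and (b) can be illustrated with a short Python sketch: tag counts are scaled to tags per million (TPM), and two profiles are compared by Spearman correlation, computed here as the Pearson correlation of average ranks. This is a minimal illustration and not the authors' pipeline; the function names, the toy counts and the pseudocount used before the log transform are all assumptions.

```python
import math


def to_tpm(counts):
    """Scale raw tag counts so that the library sums to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library has no mapped tags")
    return [c * 1_000_000 / total for c in counts]


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy tag counts for five peaks in two regions (made-up numbers).
caudate_counts = [120, 30, 0, 450, 15]
putamen_counts = [100, 45, 2, 400, 60]

caudate_tpm = to_tpm(caudate_counts)
putamen_tpm = to_tpm(putamen_counts)
rho = spearman(caudate_tpm, putamen_tpm)

# Log-transformed values, as on the scatterplot axes (pseudocount is an assumption).
caudate_log = [math.log10(v + 1) for v in caudate_tpm]
putamen_log = [math.log10(v + 1) for v in putamen_tpm]
print(f"Spearman rho = {rho:.2f}")
```

Because Spearman correlation depends only on ranks, it is unchanged by the TPM scaling and by the log transform; the transform matters only for how the scatterplot is drawn.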
Spearman correlation across human expression profiles derived from a matching set of CNS regions.
| 1 | 0.87 | 0.72 | 0.81 | 0.92 | 0.83 | 0.9 | 0.9 | 0.79 | 0.88 | 0.88 | 0.65 | 0.74 | 0.73 | 0.83 |
| | 1 | 0.69 | 0.79 | 0.85 | 0.8 | 0.84 | 0.83 | 0.76 | 0.81 | 0.82 | 0.69 | 0.7 | 0.68 | 0.82 |
| | | 1 | 0.67 | 0.72 | 0.7 | 0.73 | 0.73 | 0.73 | 0.73 | 0.72 | 0.52 | 0.64 | 0.62 | 0.68 |
| | | | 1 | 0.84 | 0.89 | 0.8 | 0.81 | 0.86 | 0.83 | 0.77 | 0.65 | 0.85 | 0.9 | 0.93 |
| | | | | 1 | 0.86 | 0.89 | 0.9 | 0.83 | 0.89 | 0.88 | 0.64 | 0.77 | 0.76 | 0.86 |
| | | | | | 1 | 0.82 | 0.82 | 0.89 | 0.83 | 0.8 | 0.63 | 0.84 | 0.86 | 0.89 |
| | | | | | | 1 | 0.94 | 0.78 | 0.92 | 0.93 | 0.64 | 0.72 | 0.71 | 0.82 |
| | | | | | | | 1 | 0.78 | 0.93 | 0.93 | 0.64 | 0.72 | 0.71 | 0.83 |
| | | | | | | | | 1 | 0.79 | 0.76 | 0.59 | 0.85 | 0.85 | 0.87 |
| | | | | | | | | | 1 | 0.91 | 0.64 | 0.75 | 0.75 | 0.84 |
| | | | | | | | | | | 1 | 0.63 | 0.7 | 0.68 | 0.8 |
| | | | | | | | | | | | 1 | 0.58 | 0.58 | 0.66 |
| | | | | | | | | | | | | 1 | 0.86 | 0.84 |
| | | | | | | | | | | | | | 1 | 0.89 |
| | | | | | | | | | | | | | | 1 |
Proportions (as %) of CAGE peaks associated with TATA-box and CpG islands in macaque and human. For TATA, a region 70 bp around the TSS was used.
| 2.1 | 55.4 | 2.7 | 39.8 |
| 1.5 | 47.3 | 2.7 | 48.6 |
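The caption specifies a 70 bp region around the TSS for the TATA-box search, which matches the width of the (−50, +20) window shown in the Figure 2d inset. The Python sketch below scans such a window for a simplified TATAWAW consensus. The consensus pattern, the window bounds as defaults, the function name and the toy sequence are all assumptions for illustration; the motif model actually used in the study is not stated here.

```python
import re

# Simplified TATA-box consensus TATAWAW (W = A or T). This is an assumption:
# the study may have used a different motif model. The lookahead form lets
# overlapping matches be reported.
TATA_LOOKAHEAD = re.compile(r"(?=TATA[AT]A[AT])")


def tata_hits(sequence, tss_index, upstream=50, downstream=20):
    """Return motif start positions relative to the TSS (negative = upstream).

    Only the window [tss_index - upstream, tss_index + downstream) is scanned,
    clipped to the bounds of `sequence`.
    """
    start = max(0, tss_index - upstream)
    end = min(len(sequence), tss_index + downstream)
    window = sequence[start:end].upper()
    return [m.start() + start - tss_index for m in TATA_LOOKAHEAD.finditer(window)]


# Toy promoter: a TATA box starting 35 bases upstream of the TSS (made-up sequence).
toy_sequence = "G" * 20 + "TATAAAA" + "G" * 28 + "A" + "C" * 30
toy_tss = 55  # index of the single "A" that follows the 28-G spacer

hits = tata_hits(toy_sequence, toy_tss)
has_tata = bool(hits)
print(hits, has_tata)
```

Applying `tata_hits` to every peak and counting the peaks with at least one hit would give a proportion comparable in spirit to the TATA-associated values in the table, though not necessarily the same numbers.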
Figure 3. Expression patterns and annotation of regionally enriched peaks.
The heatmap shows the TPM-normalized expression profiles of the top 10 enriched CAGE peaks in each region. The black vertical lines highlight the separation of clustered regions as induced by the expression profiles (top dendrogram). Labels on the right side report the gene symbols (lifted over to rheMac8) corresponding to each peak, when such an association is available. Black labels indicate symbols available in the native macaque annotations (either RefSeq or Ensembl); gray labels indicate symbols from human annotations (either RefSeq or GENCODE) lifted over to macaque.
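One way to select the top enriched peaks per region from a TPM table is sketched below in Python. The enrichment score used here, a log2 ratio of a peak's expression in the target region to its median expression in the remaining regions, is a hypothetical choice made for illustration; the caption does not state how enrichment was defined in the study. The peak identifiers, pseudocount and toy values are likewise assumptions.

```python
import math
from statistics import median


def enrichment_score(peak_tpm, region, pseudocount=1.0):
    """log2 ratio of TPM in `region` to the median TPM of all other regions.

    `peak_tpm` maps region abbreviation -> TPM for a single peak.
    The score definition is an illustrative assumption.
    """
    target = peak_tpm[region]
    others = [tpm for name, tpm in peak_tpm.items() if name != region]
    return math.log2((target + pseudocount) / (median(others) + pseudocount))


def top_enriched(expression, region, n=10):
    """Return the IDs of the `n` peaks with the highest score in `region`."""
    scored = [
        (enrichment_score(profile, region), peak_id)
        for peak_id, profile in expression.items()
    ]
    scored.sort(reverse=True)
    return [peak_id for _, peak_id in scored[:n]]


# Toy TPM table: peak ID -> {region abbreviation -> TPM} (made-up values).
expression = {
    "peak_A": {"Cer": 120.0, "Hip": 3.0, "Thal": 5.0},
    "peak_B": {"Cer": 10.0, "Hip": 12.0, "Thal": 9.0},
    "peak_C": {"Cer": 0.0, "Hip": 40.0, "Thal": 2.0},
}

top_cerebellum = top_enriched(expression, "Cer", n=1)
top_hippocampus = top_enriched(expression, "Hip", n=1)
print(top_cerebellum, top_hippocampus)
```

Using the median of the other regions as the reference keeps a single highly expressing region from masking enrichment elsewhere; other reference choices, such as the mean or the maximum, would give a stricter or looser ranking.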
Figure 4. Visual exploration of the dataset presented and comparison with external data.
(a) Screenshot from ZENBU, one of the resources available to explore the expression data presented in this study, showing one example of missing gene annotation. (b) Example of a comparison between CAGE and NIH Blueprint NHP Atlas data. The colors on the top bar label distinct macroscopic anatomical regions (orange=basal ganglia, dodger blue=occipital cortex, dark blue=medial frontal cortex, yellow green=hippocampal cortex, cyan=amygdaloid complex). (c) Example of a comparison between CAGE and NHPRTR datasets. The top bar distinguishes CNS (blue) and non-CNS (red) samples.