Nicolas Tchitchek, Béatrice Jacquelin, Patrick Wincker, Carole Dossat, Corinne Da Silva, Jean Weissenbach, Antoine Blancher, Michaela Müller-Trutwin, Arndt Benecke.
Abstract
BACKGROUND: African Green Monkeys (AGM) are among the most frequently used nonhuman primate models in clinical and biomedical research, yet only a few genomic resources exist for this species. Such information would be essential for the development of dedicated new generation technologies in fundamental and pre-clinical research using this model, and would deliver new insights into primate evolution.
Year: 2012 PMID: 22726727 PMCID: PMC3539953 DOI: 10.1186/1471-2164-13-279
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Composition and alignment distribution of the EST library and the assembled distinct transcripts. (A) Distribution of the length of the 37,787 original ESTs. The median EST length is 618 nucleotides, the mean EST length is 563 nucleotides, and the standard deviation of the distribution is 167 nucleotides. (B) Distribution of the number of ESTs per contig in the EST assembly. The median number of ESTs per contig is 3, the mean is 7, and the standard deviation of the distribution is 27. (C) Distribution of the length of the 14,410 distinct transcripts. The median sequence length is 847 nucleotides, the mean sequence length is 943 nucleotides, and the standard deviation of the distribution is 388 nucleotides. The contribution of the assembled ESTs is shown in red and the contribution of singleton ESTs in blue. (D) Distribution of the number of cDNA references matched by the 37,787 original ESTs (shown in yellow) and by the 14,410 distinct transcripts (shown in green).
Composition details of the cDNA references
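| Species | Release version | Transcripts | Genes |
|---|---|---|---|
| Callithrix jacchus | Cjacchus3.2.1.63 | 55,137 | 32,339 |
| Gorilla gorilla | gorGor3.63 | 35,727 | 29,216 |
| Homo sapiens | GRCh37.63 | 174,598 | 53,894 |
| Macaca mulatta | MMUL_1.63 | 44,725 | 30,247 |
| Microcebus murinus | micMur1.63 | 25,035 | 25,036 |
| Nomascus leucogenys | Nleu1.0.63 | 31,550 | 26,526 |
| Otolemur garnettii | BUSHBABY1.63 | 22,804 | 22,800 |
| Pan troglodytes | CHIMP2.1.63 | 41,488 | 27,116 |
| Pongo abelii | PPYG2.63 | 31,566 | 28,088 |
| Tarsius syrichta | tarSyr1.63 | 20,261 | 20,215 |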
For each species, the release version of the cDNA reference used and the number of transcripts and genes composing that cDNA reference are indicated. All cDNA references were retrieved from the Ensembl [38] database.
Alignment results of ESTs on the different cDNA references and genomes
| Reference | ESTs a.e. | ESTs m.t. | ESTs m.g. | Transcripts a.e. | Transcripts m.t. | Transcripts m.g. |
|---|---|---|---|---|---|---|
| cDNA ref. | 24,461 (64.73%) | 5,951 | 5,051 | 5,954 (41.31%) | 4,928 | 4,504 |
| cDNA ref. | 23,633 (62.54%) | 5,162 | 4,948 | 6,008 (41.69%) | 4,622 | 4,527 |
| cDNA ref. | 30,117 (79.70%) | 9,208 | 6,529 | 8,708 (60.43%) | 7,316 | 6,128 |
| cDNA ref. | 24,213 (64.07%) | 5,439 | 4,763 | 5,657 (39.25%) | 4,585 | 4,273 |
| cDNA ref. | 8,618 (22.80%) | 1,770 | 1,770 | 1,240 (08.60%) | 1,138 | 1,138 |
| cDNA ref. | 22,600 (59.80%) | 4,949 | 4,749 | 5,672 (39.36%) | 4,389 | 4,296 |
| cDNA ref. | 7,564 (20.01%) | 1,431 | 1,431 | 930 (06.45%) | 861 | 861 |
| cDNA ref. | 25,196 (66.67%) | 5,699 | 5,156 | 6,332 (43.94%) | 5,012 | 4,756 |
| cDNA ref. | 18,904 (50.02%) | 4,149 | 3,989 | 4,274 (29.65%) | 3,415 | 3,340 |
| cDNA ref. | 5,327 (14.09%) | 1,348 | 1,346 | 908 (06.30%) | 854 | 854 |
| d. scaf. | 37,409 (98.99%) | – | – | 14,139 (98.11%) | – | – |
| d. assem. | 35,686 (94.44%) | – | – | 13,392 (92.93%) | – | – |
For both the 37,787 original ESTs and the 14,410 distinct transcripts, the number of aligned ESTs (a.e.) on the cDNA references (cDNA ref.), the draft scaffold genome (d. scaf.), and the draft assembly genome (d. assem.) is indicated. The numbers of mapped transcripts (m.t.) and mapped genes (m.g.) are also indicated for the cDNA references.
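The percentages in the table follow from dividing each aligned count by the corresponding library size. The short Python sketch below recomputes two of them from the counts given above; the function name `percent_aligned` is hypothetical and only serves the illustration.

```python
# Library sizes stated in the table legend.
TOTAL_ESTS = 37_787
TOTAL_TRANSCRIPTS = 14_410


def percent_aligned(aligned: int, total: int) -> float:
    """Return the share of aligned sequences as a percentage (not rounded)."""
    return 100.0 * aligned / total


# Third cDNA reference row of the table: 30,117 ESTs and 8,708 transcripts aligned.
est_share = percent_aligned(30_117, TOTAL_ESTS)
transcript_share = percent_aligned(8_708, TOTAL_TRANSCRIPTS)

print(f"ESTs aligned: {est_share:.2f}%")
print(f"Distinct transcripts aligned: {transcript_share:.2f}%")
```

The recomputed values agree with the tabulated 79.70% and 60.43% to within 0.01 percentage points.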
Figure 2. Inter- and intra-species alignment comparisons. (A) 4-set Venn diagram showing the intersections among the 4 sets of original ESTs aligned on the H. sapiens, M. mulatta, P. troglodytes, and P. abelii species, and 2-set Venn diagram showing the intersections between the 2 sets of original ESTs aligned on the C. sabaeus and M. fascicularis species. (B) Same as A for the distinct transcripts.
Figure 3. Alignment details for the CXCL10 gene. Alignment details for the C-X-C motif chemokine 10 gene of the M. mulatta species (Ensembl ID: ENSMMUT00000029391). Assembled ESTs have been aligned at different positions of the gene: (1) Contig2229.
List of the 50 most expressed ortholog transcripts in the present EST library
| Ensembl transcript ID | Gene symbol | Gene description | Mapped ESTs |
|---|---|---|---|
| ENSMMUT00000006876 | HBB_MACMU | Hemoglobin subunit beta | 941 |
| ENSMMUT00000045385 | LOC712934 | | 699 |
| ENSMMUT00000012750 | CD74 | | 526 |
| ENSMMUT00000015401 | Q3YAP9_MACMU | eukaryotic translation elongation factor 1 alpha 1 | 519 |
| ENSMMUT00000000859 | HBA_MACMU | Hemoglobin subunit alpha | 296 |
| ENSMMUT00000017004 | LOC712553 | | 257 |
| ENSMMUT00000038286 | | MTRNR2-like (LOC100499503) | 232 |
| ENSMMUT00000005322 | B2MG_MACMU | Beta-2-microglobulin | 212 |
| ENSMMUT00000005104 | LOC708526 | | 208 |
| ENSMMUT00000043999 | RPL3 | ribosomal protein L3 | 208 |
| ENSMMUT00000038271 | COX2_MACMU | Cytochrome c oxidase subunit 2 | 194 |
| ENSMMUT00000027050 | DRA_MACMU | Mamu class II histocompatibility antigen, DR alpha chain | 191 |
| ENSMMUT00000038268 | Q6IYH3_MACMU | ATP synthase F0 subunit 6 | 185 |
| ENSMMUT00000029930 | Q3YAP9_MACMU | eukaryotic translation elongation factor 1 alpha 1 | 173 |
| ENSMMUT00000045510 | Q9GMG8_MACMU | acidic ribosomal phosphoprotein PO | 173 |
| ENSMMUT00000023666 | LOC710590 | | 155 |
| ENSMMUT00000039116 | LOC714576 | | 144 |
| ENSMMUT00000027943 | B0Z9V5_MACMU | major histocompatibility complex, class I, E | 143 |
| ENSMMUT00000038267 | Q6IYH2_MACMU | cytochrome c oxidase subunit III | 135 |
| ENSMMUT00000010560 | B5MBT6_MACMU | ribosomal protein L13a | 133 |
| ENSMMUT00000032800 | UBB | polyubiquitin-B | 133 |
| ENSMMUT00000010558 | Q3YAQ2_MACMU | ribosomal protein S11 | 131 |
| ENSMMUT00000011109 | | ribosomal protein S2 (RPS2) | 131 |
| ENSMMUT00000015005 | LOC711043 | | 129 |
| ENSMMUT00000020179 | GZMB | | 126 |
| ENSMMUT00000033466 | Q6IEB8_MACMU | interferon alpha-inducible protein 27 | 123 |
| ENSMMUT00000038664 | LOC719242 | | 123 |
| ENSMMUT00000008204 | Q6IUG4_MACMU | glyceraldehyde-3-phosphate dehydrogenase | 122 |
| ENSMMUT00000029999 | RPS20 | | 116 |
| ENSMMUT00000032342 | TPT1 | | 116 |
| ENSMMUT00000012806 | Q9GMG8_MACMU | acidic ribosomal phosphoprotein PO | 115 |
| ENSMMUT00000005819 | SRGN | | 107 |
| ENSMMUT00000040341 | Q9MXS5_MACMU | MHC class I antigen | 106 |
| ENSMMUT00000014609 | LOC711421 | | 105 |
| ENSMMUT00000004034 | LOC710901 | | 104 |
| ENSMMUT00000009232 | EEF1G | eukaryotic translation elongation factor 1 gamma | 103 |
| ENSMMUT00000027208 | A2TJ58_MACMU | major histocompatibility complex, class II, DP alpha | 94 |
| ENSMMUT00000013155 | Q6RHR8_MACMU | actin, cytoplasmic 1 | 93 |
| ENSMMUT00000041082 | E0WHM2_MACMU | MHC class I antigen | 92 |
| ENSMMUT00000043841 | RPS3 | | 89 |
| ENSMMUT00000022628 | A8QWZ5_MACMU | MHC class I antigen | 86 |
| ENSMMUT00000000617 | RPL12 | 60S ribosomal protein L12 | 85 |
| ENSMMUT00000018897 | RPS6 | | 84 |
| ENSMMUT00000025324 | ARHGDIB | | 79 |
| ENSMMUT00000011502 | A3F8W8_MACMU | MHC class II antigen | 77 |
| ENSMMUT00000040916 | A3F8W8_MACMU | MHC class II antigen | 76 |
| ENSMMUT00000005540 | LOC718964 | | 75 |
| ENSMMUT00000018430 | | | 75 |
| ENSMMUT00000041189 | A9XN15_MACMU | major histocompatibility complex, class I, A | 73 |
| ENSMMUT00000015586 | Q6UIS1_MACMU | Actin beta subunit | 72 |
For each of the 44,725 transcripts of the M. mulatta cDNA reference, we calculated the number of original ESTs mapped, and obtained a list of the 50 most expressed M. mulatta ortholog transcripts in our EST library. For each of these transcripts, the Ensembl transcript ID, the gene symbol, the gene description, and the number of mapped ESTs are given.
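The counting step described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function `top_transcripts`, the EST names, and the toy mapping are all hypothetical, and the sketch assumes each EST maps to a single transcript.

```python
from collections import Counter


def top_transcripts(est_to_transcript: dict[str, str], n: int = 50) -> list[tuple[str, int]]:
    """Count ESTs per reference transcript and return the n most frequent."""
    counts = Counter(est_to_transcript.values())
    return counts.most_common(n)


# Toy mapping (hypothetical EST names; transcript IDs taken from the table).
toy_mapping = {
    "EST_a": "ENSMMUT00000006876",
    "EST_b": "ENSMMUT00000006876",
    "EST_c": "ENSMMUT00000012750",
}

top = top_transcripts(toy_mapping, n=2)
for transcript_id, est_count in top:
    print(transcript_id, est_count)
```

With the real alignment output, the same call with `n=50` would give a ranking of the kind shown in the table.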
Figure 4. Alignment details for the IRF7 gene. Alignment details for the Interferon regulatory factor 7 gene of the M. mulatta species (Ensembl ID: ENSMMUT00000009923). Assembled ESTs have been aligned at different positions of the gene: (1) Contig3553, (2) Contig866, (3) Contig1898. Same legend and nomenclature as in Figure 3.
Top 50 canonical pathways found as significantly over-represented in the present EST library
| Canonical pathway | −log(q-value) | Ratio (mapped genes / pathway genes) |
|---|---|---|
| Protein Ubiquitination Pathway | 16.80 | 148/274 (54%) |
| Glucocorticoid Receptor Signaling | 16.80 | 148/295 (50%) |
| Oxidative Phosphorylation | 15.00 | 92/159 (58%) |
| Mitochondrial Dysfunction | 14.00 | 86/175 (49%) |
| CD28 Signaling in T Helper Cells | 13.70 | 77/132 (58%) |
| Regulation of eIF4 and p70S6K Signaling | 11.90 | 69/132 (52%) |
| Role of NFAT in Regulation of the Immune Response | 10.70 | 97/200 (49%) |
| EIF2 Signaling | 10.60 | 57/101 (56%) |
| PI3K/AKT Signaling | 10.50 | 73/140 (52%) |
| iCOS-iCOSL Signaling in T Helper Cells | 10.50 | 67/122 (55%) |
| B Cell Receptor Signaling | 10.10 | 83/156 (53%) |
| Regulation of IL-2 Expression in Lymphocytes | 9.70 | 53/89 (60%) |
| Integrin Signaling | 9.48 | 104/209 (50%) |
| PKC | 8.93 | 68/142 (48%) |
| Hypoxia Signaling in the Cardiovascular System | 8.93 | 46/68 (68%) |
| CTLA4 Signaling in Cytotoxic T Lymphocytes | 8.52 | 57/98 (58%) |
| mTOR Signaling | 8.51 | 79/162 (49%) |
| T Cell Receptor Signaling | 8.43 | 59/109 (54%) |
| Type I Diabetes Mellitus Signaling | 8.32 | 64/121 (53%) |
| Production of Nitric Oxide and ROS in Macrophages | 8.32 | 83/187 (44%) |
| Ubiquinone Biosynthesis | 8.05 | 45/112 (40%) |
| Molecular Mechanisms of Cancer | 7.64 | 152/377 (40%) |
| Estrogen Receptor Signaling | 7.54 | 70/136 (51%) |
| Antigen Presentation Pathway | 7.20 | 30/43 (70%) |
| Apoptosis Signaling | 7.19 | 53/96 (55%) |
| G2/M DNA Damage Checkpoint Regulation | 7.19 | 32/49 (65%) |
| Prostate Cancer Signaling | 6.96 | 49/97 (51%) |
| Phospholipase C Signaling | 6.79 | 109/260 (42%) |
| Huntington’s Disease Signaling | 6.78 | 104/238 (44%) |
| Chronic Myeloid Leukemia Signaling | 6.65 | 54/105 (51%) |
| Pancreatic Adenocarcinoma Signaling | 6.61 | 59/119 (50%) |
| IL-8 Signaling | 6.60 | 86/193 (45%) |
| PI3K Signaling in B Lymphocytes | 6.50 | 69/143 (48%) |
| Breast Cancer Regulation by Stathmin1 | 6.49 | 93/208 (45%) |
| IL-2 Signaling | 6.31 | 35/58 (60%) |
| NF- | 6.26 | 44/82 (54%) |
| IL-15 Signaling | 6.23 | 39/68 (57%) |
| T Helper Cell Differentiation | 6.15 | 42/72 (58%) |
| TREM1 Signaling | 6.05 | 35/66 (53%) |
| Fc | 5.98 | 52/102 (51%) |
| Pyrimidine Metabolism | 5.85 | 70/213 (33%) |
| GM-CSF Signaling | 5.69 | 38/67 (57%) |
| Induction of Apoptosis by HIV1 | 5.64 | 37/66 (56%) |
| Dendritic Cell Maturation | 5.64 | 78/188 (41%) |
| NRF2-mediated Oxidative Stress Response | 5.63 | 86/193 (45%) |
| Purine Metabolism | 5.56 | 117/391 (30%) |
| fMLP Signaling in Neutrophils | 5.50 | 57/128 (45%) |
| JAK/Stat Signaling | 5.42 | 37/64 (58%) |
| HMGB1 Signaling | 5.40 | 51/100 (51%) |
| IL-4 Signaling | 5.40 | 40/73 (55%) |
List of the top 50 canonical pathways found as statistically significantly over-represented in the functional pathway analysis of the EST library. For each canonical pathway, the associated multiple-testing-corrected p-value (shown as −log(q-value)) is indicated, as well as the ratio between the number of genes of the pathway mapped by the EST library and the total number of genes defining the pathway.
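To read the two numeric columns, the sketch below converts a −log(q-value) back into a q-value and recomputes the gene ratio for three rows of the table. The logarithm base is not stated in the legend; base 10 is assumed here, and the function names are hypothetical.

```python
def q_value_from_neg_log(neg_log_q: float, base: float = 10.0) -> float:
    """Invert -log(q); base 10 is an assumption, the legend does not state it."""
    return base ** (-neg_log_q)


def gene_ratio(mapped: int, total: int) -> float:
    """Fraction of a pathway's genes that are mapped by the EST library."""
    return mapped / total


# (pathway, -log(q-value), mapped genes, total genes), copied from the table.
rows = [
    ("Protein Ubiquitination Pathway", 16.80, 148, 274),
    ("Antigen Presentation Pathway", 7.20, 30, 43),
    ("IL-4 Signaling", 5.40, 40, 73),
]

for name, neg_log_q, mapped, total in rows:
    q = q_value_from_neg_log(neg_log_q)
    print(f"{name}: q = {q:.2e}, ratio = {gene_ratio(mapped, total):.0%}")
```

A high −log(q-value) does not imply a high ratio: large pathways can be strongly significant even when only about half of their genes are covered.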
Figure 5. Representation of the “Interferon Signaling” and “Toll-like Receptor Signaling” pathways. (A) Representation of the “Interferon Signaling” pathway. (B) Representation of the “Toll-like Receptor Signaling” pathway. Genes present in the EST library are shown in gray.
Pairwise genomic distance matrix of the 11 primate species
| 474 | 445 | 430 | 474 | 804 | 445 | 906 | 442 | 442 | 862 | |
| | 272 | 260 | 140 | 751 | 284 | 856 | 272 | 289 | 832 | |
| | | 103 | 263 | 741 | 191 | 858 | 117 | 200 | 830 | |
| | | | 248 | 712 | 173 | 835 | 78 | 176 | 808 | |
| | | | | 754 | 266 | 873 | 259 | 290 | 842 | |
| | | | | | 753 | 770 | 734 | 743 | 897 | |
| | | | | | | 876 | 185 | 224 | 842 | |
| | | | | | | | 850 | 880 | 1006 | |
| | | | | | | | | 191 | 824 | |
| | | | | | | | | | 859 | |
Pairwise genomic distance matrix computed using the ESTs of the original library and the cDNA references of the 10 primate species for which cDNA references were available. For each pair of species, the average multiple alignment score calculated over the 1,628 commonly aligned sequences is given. Scores have been rescaled by multiplication by 10⁴.
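The upper-triangular layout of the table can be expanded into a full symmetric matrix, as in the sketch below. The species labels are not recoverable from the table as extracted, so species are referred to by position (0 to 10); this indexing and the variable names are assumptions made for illustration. The sketch also assumes a scale factor of 10^4 when undoing the rescaling.

```python
SCALE = 10_000  # assumed rescaling factor (10^4)

# Upper triangle of the table, row by row (row i holds distances to species i+1..10).
UPPER = [
    [474, 445, 430, 474, 804, 445, 906, 442, 442, 862],
    [272, 260, 140, 751, 284, 856, 272, 289, 832],
    [103, 263, 741, 191, 858, 117, 200, 830],
    [248, 712, 173, 835, 78, 176, 808],
    [754, 266, 873, 259, 290, 842],
    [753, 770, 734, 743, 897],
    [876, 185, 224, 842],
    [850, 880, 1006],
    [191, 824],
    [859],
]

n = len(UPPER) + 1  # 11 species
dist = [[0.0] * n for _ in range(n)]
for i, row in enumerate(UPPER):
    for offset, score in enumerate(row):
        j = i + 1 + offset
        # Undo the rescaling and mirror the value across the diagonal.
        dist[i][j] = dist[j][i] = score / SCALE

# Closest pair of species by position in the table.
closest = min((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
print(closest)
```

A symmetric matrix of this form is the usual input for distance-based tree construction, although the tree-building method used for the phylogenies is not specified in this section.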
Figure 6. Evolutionary relationships among primate species. (A) Phylogenetic tree of the 11 primate species for which the cDNA references were available, calculated based on the 1,628 common original ESTs. (B) Phylogenetic tree of the Old World monkeys and human, calculated based on the 8,788 common assembled ESTs. (C) Phylogenetic tree of the Old World monkeys and human restricted to the 5’UTR of the transcripts, calculated based on the 1,016 common assembled ESTs. (D) Phylogenetic tree of the Old World monkeys and human restricted to the coding sequence of the transcripts, calculated based on the 8,024 common assembled ESTs. (E) Phylogenetic tree of the Old World monkeys and human restricted to the 3’UTR of the transcripts, calculated based on the 2,209 common assembled ESTs.