Melissa M Liu, Michael Farkas, Perrine Spinnhirny, Paul Pevet, Eric Pierce, David Hicks, Donald J Zack.
Abstract
Cone photoreceptors are required for color vision and high acuity vision, and they die in a variety of retinal degenerations, leading to irreversible vision loss and reduced quality of life. To date, there are no approved therapies that promote the health and survival of cones. The development of novel treatments targeting cones has been challenging and impeded, in part, by the limitations inherent in using common rodent model organisms, which are nocturnal and rod-dominant, to study cone biology. The African Nile grass rat (Arvicanthis ansorgei), a diurnal animal whose photoreceptor population is more than 30% cones, offers significant potential as a model organism for the study of cone development, biology, and degeneration. However, a significant limitation in using the A. ansorgei retina for molecular studies is that A. ansorgei does not have a sequenced genome or transcriptome. Here we present the first de novo assembled and functionally annotated transcriptome for A. ansorgei. We performed RNA sequencing for A. ansorgei whole retina to a depth of 321 million pairs of reads and assembled 400,584 Trinity transcripts. Transcriptome-wide analyses and annotations suggest that our data set confers nearly full length coverage for the majority of retinal transcripts. Our high quality annotated transcriptome is publicly available, and we hope it will facilitate wider usage of A. ansorgei as a model organism for molecular studies of cone biology and retinal degeneration.
Year: 2017 PMID: 28759564 PMCID: PMC5536302 DOI: 10.1371/journal.pone.0179061
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
A. ansorgei transcriptome assembly statistics.
| Statistic | Value |
|---|---|
| Pairs of raw reads | 321,226,931 |
| Pairs of cleaned reads | 312,472,306 |
| Q20 | 98.3% |
| Q30 | 92.1% |
| Total Trinity genes | 356,299 |
| Total Trinity transcripts | 400,584 |
| Total assembled bases | 324,826,766 |
| Percent GC | 47.2% |
| N50 length (bases) | 1,457 |
| Average length (bases) | 811 |
| Median length (bases) | 401 |
Q20, Q30: percent of bases in cleaned reads with a quality score of 20 or greater and 30 or greater, respectively; N50: length of the longest Trinity transcript such that at least 50% of assembled bases are in Trinity transcripts of length N50 or greater.
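The N50 definition above can be made concrete with a short sketch. This is only an illustration of the definition, not the tool used to produce the table; the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths (in bases)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest transcript down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first length where the cumulative sum reaches half of all bases.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: 2,000 bases in total.
toy_lengths = [200, 300, 400, 500, 600]
print(n50(toy_lengths))
```

In the toy assembly, the transcripts of length 500 or greater hold 1,100 of the 2,000 bases, while those of length 600 or greater hold only 600, so the N50 is 500.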
RNA-Seq read alignment statistics.
| | S1 | S2 |
|---|---|---|
| Total reads | 151,032,789 | 161,439,517 |
| Total aligned reads | 115,727,548 (76.6%) | 115,450,815 (71.5%) |
| Aligned reads in proper pairs | 87,084,546 (75.3%) | 81,044,720 (70.2%) |
| Aligned reads in improper pairs | 16,756,080 (14.5%) | 25,731,616 (22.3%) |
| Aligned right read only | 6,032,890 (5.2%) | 4,391,407 (3.8%) |
| Aligned left read only | 5,854,032 (5.1%) | 4,283,072 (3.7%) |
Proper pair: left and right reads map to a single Trinity transcript in the correct orientation.
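The arithmetic behind the alignment table can be checked with a few lines. The counts below are copied from the S1 column; the variable names are illustrative. The arithmetic suggests that the "Total aligned reads" percentage is relative to total reads, while the four category percentages are relative to total aligned reads.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Counts copied from the S1 column of the table above.
total_reads = 151_032_789
total_aligned = 115_727_548
categories = {
    "proper pairs": 87_084_546,
    "improper pairs": 16_756_080,
    "right read only": 6_032_890,
    "left read only": 5_854_032,
}

# The four categories should partition the aligned reads.
categories_sum = sum(categories.values())

alignment_rate = percent(total_aligned, total_reads)
breakdown = {name: percent(n, total_aligned) for name, n in categories.items()}

print(f"aligned: {alignment_rate:.2f}% of total reads")
for name, pct in breakdown.items():
    print(f"{name}: {pct:.2f}% of aligned reads")
```

The recomputed values should agree with the table to within rounding in the last reported digit.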
Fig 1. Trinity transcript length and level of expression.
A) Length and B) level of expression averaged across S1 and S2, in units of log10(TPM) (TPM = transcripts per million), for all Trinity transcripts (n = 400,584) and for the subset of Trinity transcripts with Swiss-Prot BlastX homology (n = 63,242).
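For readers unfamiliar with the unit, the sketch below shows the standard TPM calculation and a log10 of the two-sample mean. It is a minimal illustration, not the quantification pipeline used for the figure; the function names and toy counts are hypothetical, and averaging TPM before taking the logarithm is an assumption.

```python
import math


def tpm(counts, effective_lengths):
    """Convert per-transcript read counts to transcripts per million."""
    # Length-normalized rate for each transcript.
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    # Scale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


def log10_mean_tpm(tpm_s1, tpm_s2):
    """log10 of the mean TPM of two samples (assumed order: average, then log)."""
    mean = (tpm_s1 + tpm_s2) / 2
    return math.log10(mean) if mean > 0 else float("-inf")


# Hypothetical counts and effective lengths for three transcripts.
counts_s1 = [100, 300, 600]
counts_s2 = [120, 280, 600]
lengths = [1000, 1500, 3000]

tpm_s1 = tpm(counts_s1, lengths)
tpm_s2 = tpm(counts_s2, lengths)
for a, b in zip(tpm_s1, tpm_s2):
    print(round(log10_mean_tpm(a, b), 3))
```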
Fig 2. Phylogenetic analysis for retinal genes.
Multiple sequence alignments of the CDS of A) rhodopsin, B) short-wave-sensitive opsin 1, C) melanopsin, and D) cone-rod homeobox for A. niloticus and other model organisms were performed using MUSCLE. Neighbor-Joining phylogenetic trees were constructed using the Maximum Composite Likelihood model.
Fig 3. Coverage of M. musculus and R. norvegicus mRNA transcripts and proteins.
Coverage of M. musculus A) mRNA and B) protein, and of R. norvegicus C) mRNA and D) protein, provided by transcripts from the de novo assembled A. ansorgei transcriptome. A Trinity transcript in the bin of percent coverage n covers at least (n - 10)% of the length of the target mRNA or protein.
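The binning rule in the caption can be sketched as follows. This is an illustration only: the function name `coverage_bin` and the example lengths are hypothetical, and the treatment of bin boundaries (a coverage of exactly 90% falls in bin 90, anything above it in bin 100) is an assumption, since the caption does not specify it.

```python
def coverage_bin(aligned_length, target_length):
    """Return the 10%-wide coverage bin (10, 20, ..., 100) for one alignment."""
    if target_length <= 0 or aligned_length < 0:
        raise ValueError("lengths must be positive")
    # Cap at the target length so coverage never exceeds 100%.
    aligned_length = min(aligned_length, target_length)
    # Integer ceiling of (10 * aligned / target) avoids floating-point edge cases.
    tenths = -(-(10 * aligned_length) // target_length)
    # Alignments covering 10% or less all fall in the lowest bin.
    return max(1, tenths) * 10


# Hypothetical alignments: (aligned length, full target length).
for aligned, target in [(348, 348), (91, 100), (90, 100), (25, 100)]:
    print(aligned, target, coverage_bin(aligned, target))
```

Under this convention, a transcript in bin n covers more than (n - 10)% and at most n% of its target.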
Trinity transcripts with homology for selected retinal markers.
| Trinity transcript | BlastX hit | Percent identity | Hit length | Percent hit aligned | Hit description | Average TPM |
|---|---|---|---|---|---|---|
| TR103769\|c1_g1_i4 | NP_446140.1 | 95.18 | 83 | 100 | retinal cone rhodopsin-sensitive cGMP 3',5'-cyclic phosphodiesterase subunit gamma | 75.0 |
| TR104093\|c0_g2_i3 | NP_446278.2 | 98.35 | 424 | 97.7 | pyruvate dehydrogenase kinase, isozyme 1 precursor | 4.8 |
| TR107698\|c10_g1_i2 | NP_001100357.1 | 97.3 | 185 | 91.58 | guanylyl cyclase-activating protein 1 | 287.0 |
| TR107698\|c8_g2_i1 | NP_001101668.1 | 96.52 | 201 | 100 | guanylyl cyclase-activating protein 2 | 256.1 |
| TR113665\|c4_g1_i1 | NP_446130.1 | 96.62 | 207 | 60.53 | retinal homeobox protein Rx | 8.5 |
| TR115337\|c4_g5_i3 | NP_001102250.2 | 99.14 | 350 | 100 | guanine nucleotide-binding protein G(t) subunit alpha-1 | 1483.6 |
| TR116031\|c7_g1_i1 | NP_446153.1 | 97.49 | 1635 | 82.53 | voltage-dependent L-type calcium channel subunit alpha-1F | 10.3 |
| TR116046\|c3_g2_i3 | NP_037133.1 | 99.76 | 422 | 100 | paired box protein Pax-6 | 6.4 |
| TR130418\|c8_g2_i3 | NP_446283.1 | 95.67 | 831 | 75 | retinal guanylyl cyclase 2 precursor | 12.5 |
| TR131335\|c11_g1_i1 | NP_599182.1 | 100 | 86 | 20.87 | POU domain, class 4, transcription factor 2 | 1.3 |
| TR134023\|c5_g1_i1 | NP_001099506.1 | 96.64 | 238 | 100 | neural retina-specific leucine zipper protein | 10.5 |
| TR135421\|c0_g2_i1 | NP_112277.1 | 96.82 | 346 | 100 | short-wave-sensitive opsin 1 | 43.7 |
| TR137704\|c6_g9_i2 | NP_001101191.1 | 93.24 | 518 | 22.62 | retinal-specific ATP-binding cassette transporter | 63.9 |
| TR137727\|c7_g1_i1 | NP_037004.1 | 94.72 | 246 | 100 | phosducin | 382.4 |
| TR137897\|c11_g6_i1 | NP_446000.1 | 96.47 | 255 | 71.03 | medium-wave-sensitive opsin 1 | 596.2 |
| TR142441\|c12_g1_i1 | NP_001099183.1 | 99.77 | 442 | 69.5 | protein kinase C alpha type | 15.4 |
| TR205210\|c0_g1_i1 | NP_446240.2 | 99.65 | 288 | 100 | syntaxin-1A | 2.6 |
| TR55594\|c6_g2_i6 | NP_112358.1 | 95.57 | 564 | 100 | rhodopsin kinase precursor | 207.2 |
| TR56231\|c0_g1_i2 | NP_037069.1 | 100 | 202 | 98.54 | beta-crystallin B2 | 1383.0 |
| TR58523\|c4_g1_i2 | NP_543177.1 | 95.79 | 190 | 94.06 | recoverin | 112.1 |
| TR59222\|c9_g2_i2 | NP_036796.1 | 99.57 | 235 | 76.55 | synaptophysin | 372.0 |
| TR70411\|c9_g2_i6 | NP_001101112.1 | 60.42 | 141 | 30.19 | tubby-related protein 1 | 4.8 |
| TR70482\|c3_g1_i5 | NP_254276.1 | 94.25 | 348 | 100 | rhodopsin | 4697.7 |
| TR73195\|c4_g1_i4 | NP_001162599.1 | 98.89 | 361 | 100 | visual system homeobox 2 | 7.9 |
| TR81238\|c2_g1_i2 | NP_445876.1 | 99.17 | 1079 | 100 | electrogenic sodium bicarbonate cotransporter 1 | 5.8 |
| TR85026\|c0_g1_i2 | NP_001099744.1 | 96.21 | 317 | 100 | retinaldehyde-binding protein 1 | 187.7 |
| TR87913\|c0_g1_i1 | NP_620215.1 | 89.66 | 474 | 100 | melanopsin | 3.5 |
| TR88772\|c0_g1_i1 | NP_001102651.1 | 100 | 319 | 100 | transcription factor SOX-2 | 7.2 |
| TR92173\|c5_g2_i2 | NP_058987.2 | 98.9 | 273 | 56.88 | gamma-aminobutyric acid receptor subunit rho-1 precursor | 13.2 |
| TR92173\|c7_g1_i6 | NP_058988.1 | 96.81 | 408 | 87.74 | gamma-aminobutyric acid receptor subunit rho-2 precursor | 4.2 |
| TR93490\|c3_g2_i13 | NP_068627.1 | 99 | 299 | 100 | cone-rod homeobox protein | 13.7 |
| TR99284\|c0_g4_i1 | NP_114190.1 | 99.62 | 261 | 100 | calbindin | 3.5 |
Trinity transcripts were queried using BlastX against R. norvegicus RefSeq proteins. Results are reported for selected retinal cell-specific markers.
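The transcript identifiers in the table follow the Trinity naming pattern, in which the trailing `_i<number>` marks the isoform and the remainder names the Trinity gene. The sketch below splits an identifier on that basis; the helper name `split_trinity_id` and the regular expression are illustrative assumptions based on the identifiers shown here, not part of the published pipeline.

```python
import re
from collections import defaultdict

# Pattern inferred from the identifiers in the table, e.g. "TR70482|c3_g1_i5".
TRINITY_ID = re.compile(r"^(TR\d+\|c\d+_g\d+)_(i\d+)$")


def split_trinity_id(transcript_id):
    """Split a Trinity transcript ID into (gene ID, isoform label)."""
    match = TRINITY_ID.match(transcript_id)
    if match is None:
        raise ValueError(f"unrecognized Trinity ID: {transcript_id!r}")
    return match.group(1), match.group(2)


# Identifiers copied from the table above.
ids = ["TR107698|c10_g1_i2", "TR107698|c8_g2_i1", "TR70482|c3_g1_i5"]

# Group isoforms under their Trinity gene.
isoforms_by_gene = defaultdict(list)
for transcript_id in ids:
    gene, isoform = split_trinity_id(transcript_id)
    isoforms_by_gene[gene].append(isoform)

for gene, isoforms in isoforms_by_gene.items():
    print(gene, isoforms)
```

Note that the two guanylyl cyclase-activating protein transcripts share the `TR107698` prefix but belong to different Trinity genes (`c10_g1` and `c8_g2`).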
Fig 4. Enrichment analysis of Gene Ontology annotations.
A) Enrichment of GOSlim annotations in the molecular function, cellular component, and biological process subgroups. Nodes are enriched GO annotations, and their sizes are proportional to the number of genes with which they are associated. Color scale indicates the Benjamini-Hochberg False Discovery Rate (FDR) corrected p-value from a hypergeometric test for enrichment. B) Number of genes corresponding to GOSlim annotations with greater than 2% coverage for A. ansorgei as compared to both M. musculus and R. norvegicus references.
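The two statistical steps named in the caption, a hypergeometric test followed by Benjamini-Hochberg correction, can be sketched in plain Python. This is a minimal illustration under assumed inputs, not the software used for the figure; the function names and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    denom = comb(N, n)
    upper = min(K, n)
    # Sum the probability mass for k, k+1, ..., up to the largest possible overlap.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / denom


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 3 GO terms tested in a gene set of 50 drawn from 1,000 genes.
# Each tuple is (genes in set with the term, genes in background with the term).
terms = [(12, 40), (3, 60), (8, 100)]
pvals = [hypergeom_upper_tail(k, K, 50, 1000) for k, K in terms]
for p, q in zip(pvals, benjamini_hochberg(pvals)):
    print(f"p = {p:.3g}, FDR-adjusted = {q:.3g}")
```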