Literature DB >> 25371763

De novo Assembly and Analysis of the Northern Leopard Frog Rana pipiens Transcriptome.

Matthew K Christenson1, Andrew J Trease1, Lakshmi-Prasad Potluri1, Andrew J Jezewski1, Vincent M Davis2, Lindsey A Knight1, Alan S Kolok3, Paul H Davis4.   

Abstract

The northern leopard frog Rana (Lithobates) pipiens is an important animal model, being used extensively in cancer, neurology, physiology, and biomechanical studies. R. pipiens is a native North American frog whose range extends from northern Canada to southwest United States, but over the past few decades its populations have declined significantly and is now considered uncommon in large portions of the United States and Canada. To aid in the study and conservation of R. pipiens, this paper describes the first R. pipiens transcriptome. The R. pipiens transcriptome was annotated using Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Eukaryotic Orthologous Groups (KOG). Differential expression analysis revealed universal and tissue specific genes, and endocrine-related genes were identified. Transcriptome assemblies and other sequence data are available for download.

Entities:  

Keywords:  Northern Leopard Frog; Rana pipiens; Transcriptome.

Year:  2014        PMID: 25371763      PMCID: PMC4218947          DOI: 10.7150/jgen.9760

Source DB:  PubMed          Journal:  J Genomics


Results and Discussion

General Characteristics of the Rana pipiens Transcriptome

To characterize the transcriptome of the northern leopard frog Rana (Lithobates) pipiens (Figure 1), a widely distributed North American species (Figure 2), cDNA samples were separately prepared from gonad, liver, kidney, brain, and tadpole homogenates and sequenced using paired-end 100 bp reads 1. The resulting 1.166 billion reads, comprising 116.6 Gb of sequence, were used to construct a transcriptome with Velvet and Oases (Table 1 and Table 2) 2,3. Assembled transcriptomes and annotated characteristics are available at http://www.davislab.net/rana/, and raw reads have been deposited at NCBI under BioProject accession PRJNA240240.
Figure 1

The northern leopard frog From March to June mature R. pipiens gather at communal breeding ponds. Each female lays between 2,000 to 6,500 small, black and white eggs that hatch after two to three weeks 1. Tadpoles are greenish or brown, with yellow or black speckles and their bellies are white and somewhat transparent, reaching 84 mm in length 1. Metamorphosis typically occurs after 60 to 80 days, depending on conditions, and froglets are 20 to 30 mm long at metamorphosis 1. Sexual maturity is reached in one to three years and adult R. pipiens are slender, long-legged green or brown with a white or cream underside, prominent, light-colored dorsolateral ridges, and large, dark spots located on its back, sides, and legs and grow to an average length of 68 mm and mass of 38.0 g 1.

Figure 2

Geographic Distribution of R. pipiens is a native North American frog whose range extends from northern Canada to the southwest United States 1. Its life cycle includes an aquatic larval stage and semi-terrestrial juvenile and adult stages; thus, this frog is found residing within grassland, brushland, and forest environments, prefering static or slow-moving water 1.

Table 1

Information on sequencing reads for Rana pipiens.

TissueNumber of ReadsTotal Length
Male Gonad32,771,88165,543,762 bp
Male Liver35,791,82971,583,658 bp
Female Gonad77,777,594155,555,188 bp
Female Liver35,527,80471,055,608 bp
Female Brain111,972,214223,944,428 bp
Female Kidney108,243,825216,487,650 bp
Tadpole181,032,472362,064,944 bp
Total1,166,235,238116,623,523,800 bp

The seven RNA samples, from both adult and juvenile R. pipiens tissues, were converted into individual libraries and were subsequently run on three sequencing lanes (Lane 1: Male Gonad, Male Liver, Female Gonad, Female Liver; Lane 2: Female Brain and Female Kidney; and Lane 3: Two Tadpole Stages) capturing 100 bp paired end reads.

Table 2

Statistics for Rana pipiens transcriptome assembly.

Number of TranscriptsMean LengthMedian LengthN50GC Content
33,0862,639 bp2,004 bp3,783 bp44.02

The R. pipiens transcriptome was analyzed and the total number of transcripts, mean and median transcript length, N50, and GC content was determined.

Organism Homology and Species Confirmation of Rana pipiens

To identify transcript homologs, blastx was used to compare the R. pipiens transcriptome against the NCBI non-redundant (nr) protein database. Of the assembled transcripts, 91.7% (30,364) yielded significant BLAST hits. Among these, the western clawed frog Xenopus tropicalis and African clawed frog Xenopus laevis have the greatest number of top BLAST hits (71.6%). These result are consistent with accepted taxonomy but are strongly influenced by the completeness of protein annotation in the nr database; e.g. X. tropicalis is the closest, fully sequenced relative of R. pipiens. To further identify the completeness of the representative transcriptome, this set was compared via BLASTX against all Anura (frog) sequences in the nr database, resulting in more than 73% of the transcripts exhibiting a length coverage greater than 80% (Figure 3). To validate our transcriptome assembly as being from R. pipiens, we compared several genes (18S and 28S rRNA, rhodopsin, and histone H3) to published sequences. The concatenated results indicate a significant identity (99.42%) to previously published R. pipiens gene sequences, suggesting a preliminary level of genetic variation within the species (Table 3). Assembled sequence data also supported the previously published phylogenic differences between the Rana species established using these genes4.
Figure 3

Coverage of Anura (frog) proteins by Sequence coverage length of R. pipiens transcripts when compared to available Anura (frog) proteins. The generated representative R. pipiens transcriptome was BLASTed against all available Anura proteins (count=159,284). The length of sequence homology as reported by BLASTx was compared to the length of the Anura protein homolog. The majority of the R. pipiens transcripts (45%) had sequence homology with one or more Anura transcripts which covered at least 90% of the length of the Anura protein.

Table 3

Percent identity to published Rana species sequences.

SpeciesPercent identity
Rana pipiens99.42
Rana chiricahuensis95.69
Rana capito94.17
Rana yavapaeniensis93.57
Rana sylvatica90.57
Rana temoparia89.5

Transcripts generated in this study were aligned to previously published Rana species sequences and a percent identity matrix was computed (4). This study utilized Rana_pipiens_Transcript_030065, 050213, 496664, and 023205 for 18s rRNA, histone H3, rhodopsin, and 28s rRNA sequences, respectively.

Functional Annotation and Characterization of Rana pipiens Transcripts

In an effort to both review the putative functions of the R. pipiens transcripts, and to validate the completeness of its transcriptome, multiple functional analyses were performed against R. pipiens and the X. tropicalis and X. laevis transcriptomes. Gene Ontology (GO) analysis was performed with 63.6% (21,056) of the transcripts being assigned GO terms (Figure 4). Importantly, X. tropicalis and X. laevis displayed a similar ontology pattern with the average percent difference in GO categories between R. pipiens and X. tropicalis of 25.03%, as compared to X. tropicalis and X. laevis at 24.42% (calculated by summing the organism differences between each GO category, and dividing by the number of GO categories). Further examination using Eukaryotic Orthologous Groups (KOG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) reveled that 59.8% (19,785) and 33.4% (11,064) of the transcripts were assigned KOG terms and K-numbers, respectively (Figure 5 and Figure 6). Overall, the nearly identical patterns observed in the GO, KOG, and KEGG analyses by R. pipiens, X. tropicalis, and X. laevis suggests the completeness of the R. pipiens transcriptome.
Figure 4

Gene Ontology (GO) Analysis of the GO functional analysis was performed to evaluate transcript function and transcriptome completeness. GO terms were assigned to the R. pipiens transcripts and the mRNA RefSeq nucleotide entries of Xenopus tropicalis and Xenopus laevis retrieved from NCBI. The distributions of three transcriptomes closely resemble one another, suggesting the completeness of the R. pipiens transcriptome.

Figure 5

Eukaryotic Orthologous Groups (KOG) Characterization of Putative transcript functions were assessed and transcriptome completeness was evaluated using KOG analysis. The R. pipiens transcriptome and mRNA nucleotide entries from NCBI of Xenopus tropicalis and Xenopus laevis were assigned KOG terms. The three transcriptomes have similar distributions, supporting the completeness of the R. pipiens transcriptome.

Figure 6

Kyoto Encyclopedia of Genes and Genomes (KEGG) Classification of To review the putative functions of the transcripts and to assess the completeness of the transcriptome, KEGG analysis was performed. The R. pipiens transcripts and the mRNA nucleotide entries of Xenopus tropicalis and Xenopus laevis, retrieved from NCBI, were characterized by assigning K-numbers. The distributions of these three transcriptomes closely mimic one another, suggesting the completeness of the R. pipiens transcriptome.

Differential Expression of Transcripts in the Rana pipiens Tissues

Differential expression analysis was carried out to determine the relative abundance of the R. pipiens transcripts within each tissue. In total, 23,058, 18,711, 16,359, 18,325, 24,960, 27,247, and 28,603 transcripts were detected and 72, 12, 76, 33, 182, 612, and 1,272 transcripts were unique to only that tissue in the male gonad, male liver, female gonad, female liver, female kidney, female brain, and tadpole, respectively (Figure 7). A file containing the relative expression of each transcript in individual tissues is available at http://www.davislab.net/rana/.
Figure 7

Differential Expression of Transcripts Present in Separate Differential expression analysis was performed to determine the abundance of the transcripts within each R. pipiens tissue. In total, 23,058, 18,711, 16,359, 18,325, 24,960, 27,247, and 28,603 transcripts were detected and 72, 12, 76, 33, 182, 612, and 1,272 transcripts were unique to only that tissue (blue) in the male gonad, male liver, female gonad, female liver, female kidney, female brain, and tadpole, respectively. Moreover, 12,904 transcripts were present in all seven tissues (green) and a varied number of transcripts were present in more than one, but less than six, other tissues (red).

Identification of Rana pipiens Endocrine-Related Genes

Increasing evidence demonstrates that agricultural contaminants, such as veterinary pharmaceuticals, fertilizers, and pesticides, can alter endocrine activities in wildlife and other vertebrates, including humans 5,6. Therefore, because of its range and habitat, R. pipiens can serve as an ideal sentinel organism for monitoring the potential effects of these chemicals on other affected organisms. To this end, putative R. pipiens homologs of major endocrine-related genes were identified and exhibited a high degree of similarity to those genes of X. laevis (Table 4).
Table 4

Rana pipiens endocrine-related genes.

Gene NameGene SymbolGene IDQuery IDLength (bp)Subject IDE-ValueBit Score
Androgen Receptorar399456Rana_pipiens_Transcript_2505147358NP_001084353.101209
Cytochrome P450, Family 17, Subfamily A, Polypeptide 1cyp17a1100036774Rana_pipiens_Transcript_2806882963AAG42003.10777
Cytochrome P450, Family 19, Subfamily A, Polypeptide 1cyp19a1373656Rana_pipiens_Transcript_2638653388BAA90529.10837
Hydroxysteroid (17-beta) Dehydrogenase 12hsd17b12-b379747Rana_pipiens_Transcript_4785851367NP_001080055.19.00E-87264
Estrogen Receptor 1 (alpha)esr1-a398734Rana_pipiens_Transcript_0573184278AAQ84782.10924
Estrogen Receptor 2 (beta)esr2100174814Rana_pipiens_Transcript_3347802192NP_001124426.10917
Glucocorticoid Receptornr3c1-a378598Rana_pipiens_Transcript_4490226500CAA54804.10589
Hydroxy-Delta-5-Steroid Dehydrogenase, 3 Beta- and Steroid Delta-Isomerase 1hsd3b1734818Rana_pipiens_Transcript_2842462143NP_001089754.10536
Hypoxia Inducible Factor 1 Alphahif-1a445838Rana_pipiens_Transcript_1901404052ABF71072.101264
Steroid 11-Beta-Hydroxylase Protein--Rana_pipiens_Transcript_1418961464AAQ04666.13.00E-152431
Steroidogenic Acute Regulatory Proteinstar100381120Rana_pipiens_Transcript_3882503776NP_001167502.12.00E-168497
Vitellogeninvtga2100037071Rana_pipiens_Transcript_4780905777NP_001152753.101126

To aid in the use of R. pipiens as a sentinel organism, putative R. pipiens endocrine-related genes were identified and the top BLAST hits are displayed.

Methods

Organism Growth Conditions, RNA Sample Preparation and Generation of Sequence Data

Adult frogs and tadpoles were obtained from Science Kit & Boreal Laboratories, Rochester, NY (Item Number 67496-32 and 67040-12, wild-caught). Adult R. pipiens were sacrificed and their organs, including gonad, liver, kidney, and brain, were isolated according to approved IUCAC protocols. Upon their arrival, the tadpoles were housed overnight (Group 1) and for one week (Group 2) within a 40 gallon tank. Following each period, the tadpoles were starved and placed into a tetracycline (100 mg/L) solution for 24 hours then sacrificed in accordance with approved IUCAC protocols. To preserve the integrity of the RNA, the isolated tissues were stored in RNAlater (Qiagen) for 24 hours at 4°C then placed at -80°C. Total RNA was extracted from each tissue individually, using the RNeasy Plus Mini Kit (Qiagen). mRNA was purified from the total RNA preparations, using poly-T oligo-attached magnetic beads and converted into cDNA with random primers using the TruSeq® RNA Sample Preparation Kit v2 (Illumina). Libraries were sequenced using paired-end 100 bp reads with an Illumina HiSeq 2000 sequence analyzer at the University of Nebraska Medical Center.

Transcriptome Assembly

To facilitate an accurate transcriptome assembly, reads were processed using PRINSEQ (version 0.17.3 lite) then Khmer (version 181e441) 7,8. The processed reads were then assembled using Velvet (version 1.2.07) and Oases (version 1.2.07) in multi-k fashion 2,3. Putative coding transcripts were identified in the redundant transcriptome by removing any transcript less than 200 bp with PRINSEQ, translating with TransDecoder (release Jan 16, 2014), reducing redundancy of the predicted proteins with CD-HIT and simplicity with PRINSEQ (version 0.17.3 lite), and then using blat (version 35x1) to compare against UniProt (release 2014_02) and the X. tropicalis NCBI proteome (retrieved April 1, 2014). Potential non-coding RNAs were identified using blastn (E-value ≤ 1e-5) and the Rfam (version 11.0) and NONCODE (version 3.0) databases 9,10. Transcripts that closely matched a UniProt or X. tropicalis protein or non-coding RNA were counted and in the case of multiple query matches to a single subject, the transcript with the highest bit-score was selected. The command-line codes used for transcriptome assembly are available at http://www.davislab.net/rana/.

Similarity of Rana pipiens Transcripts to Other Species

Transcripts were analyzed for sequence similarity by scanning the NCBI nr database (retrieved March 30, 2014) using blastx (E-value ≤ 1e-5). The top-hit species for each BLAST query was counted using Blast2GO (version 2.7.1) 11. Representative transcripts were compared to Anura proteins (retrieved August 13, 2014) to determine their coverage using blastx (E-value ≤ 1e-5) and MuSeqBox 12.

Percent Identity Matrix Calculation

Sequences obtained from the current assembly were aligned with sequences previously published to evaluate phylogenic relationships within amphibians 4. Alignments were trimmed with Gblocks, concatenated into a single ordered sequence, and ClustalW alignment used to produce a percent identity matix 13,14.

Gene Ontology (GO), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG)

Multiple functional analyses were performed to assess the putative functions of the R. pipiens transcripts and to validate the transcriptome completeness, with X. tropicalis and X. laevis mRNA nucleotide entries from NCBI (retrieved April 1, 2014). First, blastx (E-value ≤ 1e-5) was used to scan the transcriptome against the NCBI nr database (retrieved March 30, 2014) and GO terms were assigned using B2G4Pipe (version 2.5.0) and Blast2GO (version 2.7.1) with the b2g_may13 GO database 11. Next, transcripts were translated by OrfPredictor (version 2.3) and these results were aligned using rps-blast (E-value of ≤ 1e-5) to the NCBI KOG database (version 3.0) 15,16. Lastly, transcripts were compared to the Eukaryotic and Amphibian GENES datasets using the KEGG Automatic Annotation Server (version 1.68x) 17.

Expression Profile of the Rana pipiens Tissues

To calculate differential expression of transcripts between tissues, reads from each tissue were aligned to the transcriptome with Bowtie (version 2.2.2) and TopHat (version 2.0.11), and then Cuffdiff (version 2.2.0) was used to estimate transcript abundance 18,19,20.

Identification of Endocrine-Related Genes

Endocrine-related genes present within the R. pipiens transcriptome were identified by comparing the transcriptome to a custom BLAST database containing endocrine-related genes from X. laevis (retrieved from NCBI on October 8, 2013) using blastx (E-value ≤ 1e-5).
  17 in total

1.  Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors:  J Castresana
Journal:  Mol Biol Evol       Date:  2000-04       Impact factor: 16.240

2.  Multi-query sequence BLAST output examination with MuSeqBox.

Authors:  L Xing; V Brendel
Journal:  Bioinformatics       Date:  2001-08       Impact factor: 6.937

3.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

Review 4.  Demasculinization and feminization of male gonads by atrazine: consistent effects across vertebrate classes.

Authors:  Tyrone B Hayes; Lloyd L Anderson; Val R Beasley; Shane R de Solla; Taisen Iguchi; Holly Ingraham; Patrick Kestemont; Jasna Kniewald; Zlatko Kniewald; Valerie S Langlois; Enrique H Luque; Krista A McCoy; Mónica Muñoz-de-Toro; Tomohiro Oka; Cleida A Oliveira; Frances Orton; Sylvia Ruby; Miyuki Suzawa; Luz E Tavera-Mendoza; Vance L Trudeau; Anna Bolivar Victor-Costa; Emily Willingham
Journal:  J Steroid Biochem Mol Biol       Date:  2011-03-23       Impact factor: 4.292

5.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

6.  Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels.

Authors:  Marcel H Schulz; Daniel R Zerbino; Martin Vingron; Ewan Birney
Journal:  Bioinformatics       Date:  2012-02-24       Impact factor: 6.937

7.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.

Authors:  Cole Trapnell; Brian A Williams; Geo Pertea; Ali Mortazavi; Gordon Kwan; Marijke J van Baren; Steven L Salzberg; Barbara J Wold; Lior Pachter
Journal:  Nat Biotechnol       Date:  2010-05-02       Impact factor: 54.908

8.  Rfam 11.0: 10 years of RNA families.

Authors:  Sarah W Burge; Jennifer Daub; Ruth Eberhardt; John Tate; Lars Barquist; Eric P Nawrocki; Sean R Eddy; Paul P Gardner; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2012-11-03       Impact factor: 16.971

9.  The COG database: an updated version includes eukaryotes.

Authors:  Roman L Tatusov; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Boris Kiryutin; Eugene V Koonin; Dmitri M Krylov; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Sergei Smirnov; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  BMC Bioinformatics       Date:  2003-09-11       Impact factor: 3.169

10.  TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions.

Authors:  Daehwan Kim; Geo Pertea; Cole Trapnell; Harold Pimentel; Ryan Kelley; Steven L Salzberg
Journal:  Genome Biol       Date:  2013-04-25       Impact factor: 13.583

View more
  5 in total

1.  Kif2a Scales Meiotic Spindle Size in Hymenochirus boettgeri.

Authors:  Kelly E Miller; Adam M Session; Rebecca Heald
Journal:  Curr Biol       Date:  2019-10-17       Impact factor: 10.834

2.  Comparative transcriptome analyses reveal the genetic basis underlying the immune function of three amphibians' skin.

Authors:  Wenqiao Fan; Yusong Jiang; Meixia Zhang; Donglin Yang; Zhongzhu Chen; Hanchang Sun; Xuelian Lan; Fan Yan; Jingming Xu; Wanan Yuan
Journal:  PLoS One       Date:  2017-12-21       Impact factor: 3.240

3.  Genomic Takeover by Transposable Elements in the Strawberry Poison Frog.

Authors:  Rebekah L Rogers; Long Zhou; Chong Chu; Roberto Márquez; Ammon Corl; Tyler Linderoth; Layla Freeborn; Matthew D MacManes; Zijun Xiong; Jiao Zheng; Chunxue Guo; Xu Xun; Marcus R Kronforst; Kyle Summers; Yufeng Wu; Huanming Yang; Corinne L Richards-Zawacki; Guojie Zhang; Rasmus Nielsen
Journal:  Mol Biol Evol       Date:  2018-12-01       Impact factor: 16.240

4.  Effects of aqueous extract of Arctium lappa L. roots on serum lipid metabolism.

Authors:  Bo Hou; Wencheng Wang; Hui Gao; Shanglang Cai; Chunbo Wang
Journal:  J Int Med Res       Date:  2017-07-31       Impact factor: 1.671

5.  Mining Amphibian and Insect Transcriptomes for Antimicrobial Peptide Sequences with rAMPage.

Authors:  Diana Lin; Darcy Sutherland; Sambina Islam Aninta; Nathan Louie; Ka Ming Nip; Chenkai Li; Anat Yanai; Lauren Coombe; René L Warren; Caren C Helbing; Linda M N Hoang; Inanc Birol
Journal:  Antibiotics (Basel)       Date:  2022-07-15
  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.