Magdy S Alabady, Willie L Rogers, Russell L Malmberg.
Abstract
We describe restriction site-associated RNA sequencing (RARseq), an RNAseq-based genotyping-by-sequencing (GBS) method. It involves constructing RNAseq libraries from double-stranded cDNA digested with selected restriction enzymes. To test the method, we constructed six single-digested and six dual-digested RARseq libraries from six F2 pitcher plant individuals and sequenced them on half of a MiSeq run. On average, the de novo approach of population genome analysis detected 544 and 570 RNA SNPs, whereas the reference transcriptome-based approach revealed an average of 1907 and 1876 RNA SNPs per individual, from single- and dual-digested RARseq data, respectively. The average numbers of RNA SNPs and alleles per locus are 1.89 and 2.17, respectively. Our results suggest that the RARseq protocol provides good depth of coverage per locus for detecting RNA SNPs and polymorphic loci for population genomics and mapping analyses. In non-model systems where complete genome sequences are not always available, RARseq data can be analyzed with reference to the transcriptome. In addition to enriching for functional markers, this method may prove particularly useful in organisms whose genomes are not favorable for DNA GBS.
Year: 2015 PMID: 26241739 PMCID: PMC4524703 DOI: 10.1371/journal.pone.0134855
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
In silico restriction digestion of the pitcher plant transcriptome to select restriction enzyme combinations for generating cDNA fragments suitable for RARseq libraries.
The target fragment length ranged from 150 bp to 700 bp in all digestions.
| Enzyme combination | Fuzznuc search pattern (Mismatch = 0) | Reported sequences | Reported hit counts |
|---|---|---|---|
| MseI—BstYI | TTAAN(145,700)[AG]GATC[CT] | 5193 | 14896 |
| MspI—PstI | CCGGN(145,700)CTGCAG | 1033 | 1968 |
| MaeII—ApoI | ACGTN(145,700)[AG]AATT[CT] | 4614 | 8176 |
| EcoRI—MseI | GAATTCN(145,700)TTAA | 2247 | 3041 |
| TasI—SbfI | AATTN(145,700)CCTGCAGG | 61 | 189 |
| Sau3A—BstYI | GATCN(145,700)[AG]GATC[CT] | 4530 | 13693 |
| MseI—AflIII | TTAAN(145,700)AC[AG][CT]GT | 2737 | 7298 |
| BstYI—AflIII | [AG]GATC[CT]N(145,700)AC[AG][CT]GT | 1179 | 1681 |
| MseI—StyI | TTAAN(145,700)CC[AT][AT]GG | 4643 | 13280 |
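
The search patterns in the table can be approximated outside EMBOSS with ordinary regular expressions. The sketch below is an illustration only, not the authors' pipeline: the function names `fuzznuc_to_regex` and `summarize_hits` are hypothetical, the toy transcripts are invented, and counting one hit per start position is an assumption that may differ from how fuzznuc reports overlapping matches.

```python
import re
from typing import Dict, Tuple


def fuzznuc_to_regex(pattern: str) -> str:
    """Translate a simple fuzznuc-style pattern into a Python regex.

    Only the constructs used in the table are handled: literal bases,
    bracketed alternatives such as [AG], the wildcard N, and a repeat
    range such as N(145,700).
    """
    # Repeat range: (145,700) becomes {145,700}
    regex = re.sub(r"\((\d+),(\d+)\)", r"{\1,\2}", pattern)
    # Wildcard: N becomes any unambiguous base
    return regex.replace("N", "[ACGT]")


def summarize_hits(transcripts: Dict[str, str], pattern: str) -> Tuple[int, int]:
    """Return (sequences with at least one hit, total hits).

    A lookahead is used so that every start position is tested, which
    means at most one hit is counted per start position (assumption).
    """
    regex = re.compile("(?=(" + fuzznuc_to_regex(pattern) + "))")
    sequences_with_hits = 0
    total_hits = 0
    for sequence in transcripts.values():
        hits = sum(1 for _ in regex.finditer(sequence.upper()))
        if hits:
            sequences_with_hits += 1
        total_hits += hits
    return sequences_with_hits, total_hits


# Toy data with a shortened gap (5 to 10 bases) so the example stays readable.
toy_transcripts = {
    "t1": "GGTTAAACGTACGTCCATGGAA",  # MseI site, 8-base gap, StyI site
    "t2": "ACGTACGTACGT",            # no sites
}
print(fuzznuc_to_regex("TTAAN(145,700)CC[AT][AT]GG"))
print(summarize_hits(toy_transcripts, "TTAAN(5,10)CC[AT][AT]GG"))
```

The two numbers returned by `summarize_hits` correspond loosely to the "Reported sequences" and "Reported hit counts" columns. Note that a gap of `N(145,700)` matches any bases, so a pattern search of this kind does not by itself exclude fragments that contain additional internal sites.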
Summary of results from the STACKS population genome analysis using the de novo approach.
Numbers in parentheses are the results after applying the STACKS correction module for population-based correction.
| F2 individual | MseI: number of RARtags | MseI: unique stacks | MseI: polymorphic loci | MseI: SNPs | MseI-StyI: number of RARtags | MseI-StyI: unique stacks | MseI-StyI: polymorphic loci | MseI-StyI: SNPs |
|---|---|---|---|---|---|---|---|---|
| F2-3 | 108241 | 1794 (1701) | 211 (196) | 378 (327) | 303222 | 1929 (1817) | 293 (256) | 561 (467) |
| F2-4 | 186507 | 1049 (978) | 162 (142) | 299 (258) | 353812 | 1727 (1639) | 269 (240) | 494 (418) |
| F2-6 | 257352 | 2537 (2417) | 345 (319) | 629 (562) | 247579 | 2668 (2533) | 468 (424) | 848 (739) |
| F2-7 | 208467 | 2424 (2315) | 389 (369) | 678 (615) | 326714 | 2049 (1930) | 329 (293) | 623 (531) |
| F2-8 | 300259 | 3143 (2994) | 429 (397) | 790 (702) | 303183 | 1831 (1717) | 292 (265) | 526 (452) |
| F2-9 | 288290 | 2058 (1928) | 266 (238) | 490 (423) | 305761 | 1682 (1551) | 273 (233) | 491 (387) |
| Average | 224852.7 | 2167.5 (2055.5) | 300.3 (276.8) | 544 (481.2) | 306711.8 | 1981 (1868.5) | 320.7 (285.2) | 590 (499) |
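
As a small worked check, the per-individual SNP counts in the de novo table can be averaged directly, and the share of SNPs kept after the population-based correction can be derived from the same rows. This is only an arithmetic sketch over the values printed above; the variable and function names are hypothetical.

```python
from statistics import mean

# SNP counts per F2 individual (F2-3, F2-4, F2-6, F2-7, F2-8, F2-9),
# copied from the de novo table: raw values and values in parentheses.
snps_raw = {
    "MseI": [378, 299, 629, 678, 790, 490],
    "MseI-StyI": [561, 494, 848, 623, 526, 491],
}
snps_corrected = {
    "MseI": [327, 258, 562, 615, 702, 423],
    "MseI-StyI": [467, 418, 739, 531, 452, 387],
}


def retained_fraction(raw, corrected):
    """Fraction of SNPs remaining after correction, pooled over individuals."""
    return sum(corrected) / sum(raw)


for library in snps_raw:
    raw = snps_raw[library]
    corrected = snps_corrected[library]
    print(
        library,
        "mean raw:", round(mean(raw), 1),
        "mean corrected:", round(mean(corrected), 1),
        "retained:", round(retained_fraction(raw, corrected), 3),
    )
```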
Fig 1. RARseq protocol.
The flowchart illustrates the workflow and analysis guidelines of the RARseq method.
Summary of results from the STACKS population genome analysis using the reference approach.
Numbers in parentheses are the results after applying the STACKS correction module for population-based correction.
| F2 individual | MseI: % RARtags mapped to reference | MseI: unique stacks | MseI: polymorphic loci | MseI: SNPs | MseI-StyI: % RARtags mapped to reference | MseI-StyI: unique stacks | MseI-StyI: polymorphic loci | MseI-StyI: SNPs |
|---|---|---|---|---|---|---|---|---|
| F2-3 | 84.97 | 2874 (2874) | 384 (385) | 1188 (1190) | 83.29 | 3190 (3190) | 513 (517) | 1785 (1795) |
| F2-4 | 81.65 | 2134 (2133) | 384 (384) | 1096 (1094) | 80.95 | 2636 (2635) | 480 (488) | 1723 (1723) |
| F2-6 | 82.65 | 5111 (5110) | 836 (837) | 2665 (2643) | 86.7 | 2985 (2984) | 477 (481) | 1642 (1614) |
| F2-7 | 85.34 | 4068 (4068) | 753 (755) | 2433 (2440) | 84.28 | 3587 (3587) | 723 (732) | 2492 (2507) |
| F2-8 | 83.82 | 5149 (5147) | 853 (860) | 2872 (2858) | 85.76 | 3076 (3076) | 572 (581) | 2094 (2112) |
| F2-9 | 68.5 | 2626 (2626) | 355 (356) | 1192 (1193) | 61.33 | 2358 (2358) | 407 (410) | 1430 (1434) |
| Average | 80.9 | 3660 (3659.7) | 594.2 (596.2) | 1907.7 (1903) | 80.4 | 2972 (2971.7) | 528.7 (534.8) | 1876 (1864) |
Fig 2. Heterozygosity and homozygosity in RARseq data.
Observed and expected heterozygosity and homozygosity calculated from MseI and MseI-StyI RARseq data using both the de novo and reference approaches.
Summary of SNPs, Alleles, Haplotypes, and gene diversity inferred from RARseq data.
| Approach and library | Number of catalog loci | Average number of SNPs | Average number of alleles | Number of haplotypes | Mean haplotype count | Mean gene diversity | Mean haplotype diversity |
|---|---|---|---|---|---|---|---|
| De novo MseI | 549 | 1.767 | 2.111 | 28 | 2.214 | 0.437 | 0.686 |
| De novo MseI-StyI | 488 | 1.627 | 2.113 | 48 | 2.271 | 0.502 | 0.785 |
| Ref-based MseI | 377 | 1.934 | 2.228 | 351 | 1.912 | 0.585 | 0.949 |
| Ref-based MseI-StyI | 345 | 2.235 | 2.226 | 321 | 1.953 | 0.547 | 0.892 |
Fig 3. The correlation between haplotype and gene diversity inferred from RARseq data.
Both haplotype and gene diversities were calculated from the de novo analysis (top plots) and the reference-based analysis (bottom plots) of MseI and MseI-StyI RARtags.
Fig 4. Analysis of the cDNA, digested cDNA, and RARseq libraries using Fragment Analyzer.
(A) Double-stranded cDNA, (B) MseI-digested cDNA, (C) StyI-digested cDNA, (D) MseI-StyI double-digested cDNA, (E) MseI RARseq library, (F) MseI-StyI RARseq library, (G) Pool of MseI RARseq libraries, (H) Pool of MseI RARseq libraries after 1X SPRI cleaning.