Carl J Rothfels, Anders Larsson, Fay-Wei Li, Erin M Sigel, Layne Huiet, Dylan O Burge, Markus Ruhsam, Sean W Graham, Dennis W Stevenson, Gane Ka-Shu Wong, Petra Korall, Kathleen M Pryer.
Abstract
BACKGROUND: Molecular phylogenetic investigations have revolutionized our understanding of the evolutionary history of ferns, the second-most species-rich major group of vascular plants and the sister clade to seed plants. The general lack of genomic resources for this important group, however, has made these studies strongly dependent on plastid data; nuclear or mitochondrial data have rarely been used. In this study, we use transcriptome data to design primers for nuclear markers for use in studies of fern evolutionary biology, and demonstrate the utility of these markers across the largest order of ferns, the Polypodiales.
Year: 2013 PMID: 24116189 PMCID: PMC3792871 DOI: 10.1371/journal.pone.0076957
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of the genes for which we designed primers.
| # | Abbreviation | Protein Name | Length in Arabidopsis (CDS; in bp) | Length in Ferns (CDS; in bp) | TAIR Gn# | # of Introns | Chromo. # |
|---|---|---|---|---|---|---|---|
| 1 | ApPEFP_C | appr-1-p processing enzyme family protein | 1689 | ~1650-1743 | AT1G69340 | 13 | 1 |
| 2 | CRY2 | cryptochrome 2 | 2046 | ~2000 | AT4G08920 | 3 | 4 |
| 3 | CRY4 | cryptochrome 4 | 1839 | ~2100 | AT1G04400 | 3 | 1 |
| 4 | DET1 | Nuclear-localized regulator of plant development | 1632 | ~1600-2700 | AT4G10180 | 9 | 4 |
| 5 | gapCpSh | Plastid-localized GAPDH, short copy | 1266; 1260 | 1315 | AT1G79530; AT1G16300 | 13 | 1 |
| 6 | IBR3 | IBA-Response 3 (acyl-CoA dehydrogenase) | 2475 | ~2445-2490 | AT3G06810 | 16 | 3 |
| 7 | pgiC | glucose-6-phosphate isomerase / sugar isomerase family protein | 1683 | --a | AT5G42740 | 21 | 5 |
| 8 | SQD1 | Sulfoquinovosyldiacylglycerol 1 | 1431 | ~1515-1521 | AT4G33030 | 1 | 4 |
| 9 | TPLATE | a cytokinesis protein targeted to the cell plate | 3531 | --a | AT3G01780 | 6 | 3 |
| 10 | transducin | transducin family protein / WD-40 repeat family protein | 2868 | --a | AT3G21540 | 11 | 3 |
For each gene, we list its length in ferns and in Arabidopsis, and provide the TAIR accession number for the Arabidopsis sequence, along with its number of introns and chromosomal position. Comparisons with Arabidopsis thaliana are based on the most closely related homolog(s). The TreeBASE accession number for our “all-in” fern alignments is S14616.
a These loci were trimmed to a focal region prior to completion, so the full length of the coding DNA sequence (CDS) is unknown.
Priming details for 20 novel nuclear markers.
| Locus | Region | Primers (forward, reverse) | Primer sequences (forward, reverse) | PCR program |
|---|---|---|---|---|
| ApPEFP_C | 1 | 4218Cf4, 4218Cr12 | GGACCTGGSCTYGCTGARGAGTG, | 6512035 |
| ApPEFP_C | 1a | 4218Cf4, 4218Cr3 | GGACCTGGSCTYGCTGARGAGTG, | 5506035 |
| ApPEFP_C | 1b | 4218Cf6, 4218Cr6 | | 5506035 |
| ApPEFP_C | 2 | 4218f25, 4218r7 | | 5509035 |
| ApPEFP_C | 3 | 4218f26, 4218r13 | | 6512035 |
| CRY2 | 1 | CRY2F3289_Pt, CRY2R3838_Pt | | 5209035 |
| CRY4 | 1 | CRY2F3289_Pt, CRY2R3838_Pt | | 5209035 |
| DET1 | 1 | det1-335all, det1-906all | | 5506035 |
| gapCpSh | 1 | gapCpShF1, gapCpShR2 | TGCACMACHAACTGCCTTGCRCCBCTTGC, | 6512035 |
| IBR3 | 1 | 4321F2, 4321R2 | | 6312035 |
| IBR3 | 2 | 4321F5, 4321R6 | ATGACYGAACCAGATGTKGCDTCVTCRGATGC, TGRTGGAGYCTKCCTGGGCCTA | 6512035 |
| pgiC | 1 | pgic_1156F, pgiC_1900R | | 5812035 |
| SQD1 | 1 | EMSQD1E1F6, EMSQD1E1R2 | | 5512035 |
| SQD1 | 1a | EMSQD1E1F6, EMSQD1E1R4 | | 5512035 |
| SQD1 | 2 | EMSQD1E2F4, EMSQD1E2R8 | | 5512035 |
| TPLATE | 1 | 6560_1630F, 6560_2329R | TGCYTAGTSGARAGYTGYTTTCA, | 5812035 |
| TPLATE | 2 | 6560_3136F, 6560_3686R | | 5812035 |
| transducin | 1 | 6928_850F, 6928_1357R | TTRCGBGGRCAYARAGATCA, GGAWCSTTARTSGGYTGCCAA | 5812035 |
| transducin | 2 | 6928_1955F, 6928_2816R | | 5812035 |
| transducin | 3 | 6928_3406F, 6928_3802R | TCBATTCGRMGATGGGAGCG, CAAACYCARGARWCYSTGAC | 5812035 |
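The first two digits of the PCR program give the annealing temperature (in °C), the next three digits give the elongation time (in seconds), and the remaining digits give the number of cycles. For example, 6512035 denotes annealing at 65 °C, a 120-second elongation, and 35 cycles.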
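The program codes and degenerate primer sequences in this table can be unpacked mechanically. The following is a minimal sketch, not the authors' tooling: the names `PcrProgram`, `parse_pcr_program`, and `degeneracy` are hypothetical, and the degeneracy count assumes the primers use the standard IUPAC nucleotide ambiguity codes.

```python
import re
from typing import NamedTuple


class PcrProgram(NamedTuple):
    annealing_temp_c: int  # first two digits of the code
    elongation_s: int      # next three digits, in seconds
    cycles: int            # remaining digits


def parse_pcr_program(code: str) -> PcrProgram:
    """Split a program code such as '6512035' into its three fields."""
    match = re.fullmatch(r"(\d{2})(\d{3})(\d+)", code.strip())
    if match is None:
        raise ValueError(f"not a PCR program code: {code!r}")
    return PcrProgram(int(match[1]), int(match[2]), int(match[3]))


# Number of bases each IUPAC symbol stands for (assumed standard codes).
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Count the distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_CHOICES[base]
    return total


if __name__ == "__main__":
    print(parse_pcr_program("6512035"))
    print(degeneracy("TTRCGBGGRCAYARAGATCA"))
```

Under these assumptions, a primer with few ambiguity codes expands to a handful of sequences, whereas one with several three-fold codes expands to many more, which is one way to compare how degenerate the listed primers are.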
Sequence characteristics for the single-copy regions developed in this study.
| Locus | Region | Cys | Poly | | | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ApPEFP_C | 1 | >38; >25a | 50; 25b | ? | ? | >751 | >690 | >829 | 721 | 931 | 761 | 846; >871c | 843; 837c | >849 | >866 | 932; 934c | 687 | 939 |
| ApPEFP_C | 1a | >13; >12a | 10; 9b | 145 | ? | 258d | 321d | 281 | 300d | 293 | 267 | ~257 | ~255 | 201d | 245 | 161 | >225 | 255 |
| ApPEFP_C | 1b | 6; 6a | 11; 3b | ? | ? | ? | 224 | 209d | 223d | 269 | 223 | 224; 223d | 216; 223d | 224 | 214d | 223d | 223 | 222 |
| ApPEFP_C | 2 | 5 | ? | 382 | 334 | 360 | 360 | 340 | 357 | 406 | 359 | ? | ? | 360 | 359 | 358 | 359 | 357 |
| ApPEFP_C | 3 | 14 | 11; 25 | ? | 377 | ? | >506 | >776 | 770 | 802 | 454 | 385 | 378; 384c | 490 | 482 | 461c | 489 | 457 |
| CRY2 | 1 | 9 | 14 | 516 | 516 | 516 | 516 | 516 | 516 | ? | 516 | 516 | 516 | 516 | 516 | 516 | 516 | 516 |
| CRY4 | 1 | 2 | ? | ? | 516 | 516 | 516 | 516 | ? | 516 | 516 | ? | 516 | 516 | 516 | 516 | 516 | ? |
| DET1 | 1 | ~5 | 6 | ? | ? | ? | ~630 | ? | ? | ? | 667 | 668 | 668 | ~670 | 665 | 664 | 669 | 669 |
| gapCpSh | 1 | ? | 14 | ? | 455 | 459 | ? | ? | 482 | 476 | 522c | 531 | 525 | ? | ? | 466 | 517 | 592 |
| IBR3 | 1 | >16 | 29; 31 | ? | ? | 870 | 817 | >700 | 827 | 836 | 819; 828c | 843c | 844 | 815 | >819 | 840 | >910 | 821 |
| IBR3 | 2 | ~6 | ~21 | ~600 | >766 | 611 | ~574 | 581 | 568 | ~1196 | ? | ~590 | 595 | 586 | ~580 | ~582 | 579 | 588 |
| pgiC | 1 | >32 | >19 | 674 | ? | ? | ? | ? | ? | ? | 625 | >615 | 678 | >474 | >664 | >581 | 619 | 620 |
| SQD1 | 1 | 10 | >8e | ? | 700 | 668 | 700 | 700 | 700 | 700 | 700 | ? | 700 | 700 | 700 | 700 | 700 | 685 |
| SQD1 | 1a | 8 | 8 | 530 | 530f | 530f | 530f | 530f | 530f | 530f | 530f | 529 | 530f | 530f | 530f | 530f | 530f | 530f |
| SQD1 | 2 | 1 | 3 | 264 | 263 | 264 | 264 | 256 | 264 | 263 | 264 | 264 | 264 | 264 | 264 | 233 | 264 | ? |
| TPLATE | 1 | 12 | 14 | 719 | >657 | ? | 646 | >561 | >529 | 627 | 711 | 696 | 698 | >638 | 710 | >662 | >692 | 687 |
| TPLATE | 2 | 5 | 10 | >327 | ? | ? | ? | 551 | ? | 529 | 512 | 497 | 493 | 302 | 427 | 424 | 722 | 541 |
| transducin | 1 | 11 | ? | ? | ? | ? | 402 | 397 | 426 | 416 | ? | >381 | ? | 436 | 437 | 435 | 420 | 421 |
| transducin | 2 | 11 | >7 | >521 | ? | ? | ? | >426 | ? | 539 | 534 | 529 | >445 | 518 | 525 | 517 | 514 | 504 |
| transducin | 3 | 6 | 7 | ? | ? | ? | 227 | 308 | 231 | ? | 268 | 251 | 251 | 242 | 244 | 244 | 243 | 242 |
Cys: Cystopteris, Poly: Polypodium, Cya: Cyatheales, Lin.: Lindsaea, Sac.: Saccoloma, Adi: Adiantum, Che: Cheilanthes, Cry: Cryptogramma, Den: Dennstaedtia, P.am.: Polypodium amorphum, P.gly.: Polypodium glycyrrhiza, Ath.: Athyrium, C.bu.: Cystopteris bulbifera, C.pr.: Cystopteris protrusa, The: Thelypteris, Woo: Woodsia.
a The two values come from comparing the single incomplete Cystopteris bulbifera sequence against two sequences cloned from C. protrusa.
b This locus has a duplication in Polypodium; these values are the number of bp changes between each of the ortholog pairs.
c Required cloning.
d These lengths are derived from the corresponding portion of the ApPEFP_C Region 1 alignment (we did not attempt to amplify Region 1a or 1b for all taxa).
e For P. amorphum we were not able to amplify Region 1 for this locus, only Region 1a.
f These lengths are derived from the corresponding portion of the SQD1 Region 1a alignment (Region 1a for all taxa).
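Cells in this table pack several pieces of information into one string: an optional `>` or `~` prefix, a number, an optional footnote letter, `;` between multiple values, and `?` for a missing entry. The sketch below shows one way such cells could be split into structured values. The names `Length` and `parse_cell` are hypothetical, and reading `>` as a lower bound, `~` as approximate, and `?` as not obtained is an assumption, since the legend does not define these symbols.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Length:
    value: int      # the number as printed in the cell
    qualifier: str  # "", ">" (assumed lower bound) or "~" (assumed approximate)
    footnote: str   # "" or the footnote letter attached to this value


# One value: optional qualifier, digits, optional footnote letter a-f.
VALUE = re.compile(r"(?P<qualifier>[>~]?)(?P<value>\d+)(?P<footnote>[a-f]?)")


def parse_cell(cell: str) -> List[Length]:
    """Parse one table cell; '?' (assumed: no data) yields an empty list."""
    cell = cell.strip()
    if cell == "?":
        return []
    lengths = []
    for part in cell.split(";"):
        match = VALUE.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"unrecognized cell content: {part!r}")
        lengths.append(
            Length(int(match["value"]), match["qualifier"], match["footnote"])
        )
    return lengths


if __name__ == "__main__":
    for cell in ["?", ">751", "846; >871c", "~257", "530f"]:
        print(cell, "->", parse_cell(cell))
```

A footnote letter is kept on the value it is printed next to; whether it applies to every value in a multi-value cell is left undecided here.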
Figure 1. Schematic diagrams of the ten nuclear genes for which we developed fern-specific primers.
(A) ApPEFP_C; (B) CRY2; (C) CRY4; (D) DET1; (E) gapCpSh; (F) IBR3; (G) pgiC; (H) SQD1; (I) TPLATE; (J) transducin. Each subset of the figure represents one protein-coding locus, using the most closely related Arabidopsis thaliana homolog as the template. The coding sequence is measured (in base pairs) along the bottom of the thickened horizontal line, with each locus wrapping onto a new line every 2000 base pairs, when necessary. Intron location, number, and length (in base pairs in Arabidopsis) are given above the line. Also shown below the line are the priming locations for each of the markers we developed. For gapCpSh, intron locations are based on Arabidopsis gapCp1: the first two exons of Arabidopsis gapCp2 are each one codon shorter than in gapCp1.
Figure 2. Maximum likelihood phylograms for each region, including only those taxa that were successfully sequenced from our 15-taxon genomic DNA test set.
Bold branches indicate strong support (≥70% bootstrap support). Scale bars are in units of substitutions per site. In the taxon names, “C.” and “P.” refer to Cystopteris and Polypodium, respectively. These phylograms are unrooted, but oriented as if rooted by the Cyatheales (or our best guess, when the Cyatheales accession did not sequence successfully), when space permits.
Model comparison, by the Bayesian Information Criterion (BIC).
Columns A – L give the support values by branch (ML bootstrap percent).
| Model | lnL | BIC | Subsets | Param. | A | B | C | D | E | F | G | H | I | J | K | L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1: Pos. & Locus | -40574.4 | 83316.0 | 30 | 238 | 56 | | 53 | | | | | | | | | 58 |
| 2a: Locus (each) | -42767.5 | 86819.0 | 19 | 141 | 49 | | | | | | | | | | | 61 |
| 2b: Locus (scheme) | -42748.0 | 86470.3 | 11 | 107 | 52 | | | | | | | | | | | 67 |
| 3: Pos. | -40875.4 | 82370.0 | 4 | 68 | 55 | | 41 | | | | | | | | | 42 |
| 4: Unpartitioned | -43190.2 | 86717.3 | 1 | 37 | 66 | | 61 | | | | | | | | | 41 |
Values in bold face indicate strong support (≥70%). Branch designations (A – L) refer to Figure 3. Model 1 is the best PartitionFinder scheme given each codon position, for each locus, as the data blocks. In model 2a each locus gets its own partition, across codon positions. Model 2b is the best PartitionFinder scheme given the loci as the data blocks. Model 3 is partitioned by codon position, across loci. Model 4 is not partitioned. For substitution model parameterization, see Appendix S2. Subsets = the final number of subsets (“partitions”) for that model. Param. = number of free parameters.
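The BIC column can be related to the lnL and Param. columns through the standard definition BIC = -2 lnL + k ln(n), where k is the number of free parameters and n the sample size. The sketch below is an illustration under that assumption. The source does not state n, so the code backs ln(n) out of each row rather than assuming an alignment length, and the function names are hypothetical.

```python
import math

# (lnL, reported BIC, free parameters), copied from the table above.
MODELS = {
    "1: Pos. & Locus": (-40574.4, 83316.0, 238),
    "2a: Locus (each)": (-42767.5, 86819.0, 141),
    "2b: Locus (scheme)": (-42748.0, 86470.3, 107),
    "3: Pos.": (-40875.4, 82370.0, 68),
    "4: Unpartitioned": (-43190.2, 86717.3, 37),
}


def bic(lnl: float, k: int, n: float) -> float:
    """Standard Bayesian Information Criterion (lower is better)."""
    return -2.0 * lnl + k * math.log(n)


def implied_log_n(lnl: float, reported_bic: float, k: int) -> float:
    """Solve BIC = -2 lnL + k ln(n) for ln(n)."""
    return (reported_bic + 2.0 * lnl) / k


# ln(n) implied by each row; similar values suggest one shared sample size.
log_ns = {name: implied_log_n(*row) for name, row in MODELS.items()}

# The best-fitting model is the one with the lowest reported BIC.
best = min(MODELS, key=lambda name: MODELS[name][1])

if __name__ == "__main__":
    for name, value in log_ns.items():
        print(f"{name}: implied ln(n) = {value:.4f}")
    print("lowest BIC:", best)
```

If the standard definition applies, all five rows should imply nearly the same ln(n), and the lowest-BIC row should agree with the model named as best-fitting in the Figure 3 caption.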
Figure 3. Combined data maximum likelihood phylogram of our 15-taxon genomic DNA test set.
Analyses were performed under our best-fitting model (model 3, see Table 3). Bold branches indicate strong support (≥70% bootstrap support); internal branches are labeled A – L for ease of discussion.
Figure 4. Flowchart of our transcriptome-mining pipeline.
Figure 5. Example of our sequence-merging protocol.
(A) In this schematic of a transcriptome alignment, aligned sequence fragments are indicated by the horizontal bars. Included are four fragments (colored) from our focal accession, which group together in the maximum parsimony tree. However, the two fragments from the 5’ end of the protein (in red) have some base pair conflicts with each other, as do the fragments from the 3’ end (in blue). Since the two sets of fragments do not overlap, and they group in the same area of the MP tree, it is not possible to determine which 5’ fragment belongs with which 3’ one. In this case we merged the sequences arbitrarily (B). The resulting alignment retains the full nucleotide data for primer-design purposes, but the relationships at the tips of the tree may be erroneous due to the two potentially chimaeric sequences.
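The arbitrary merge described in this caption can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the function names and the toy sequences are hypothetical, fragments are assumed to be rows of one alignment (equal length, `-` for gaps), and the 5' and 3' fragments are assumed not to overlap.

```python
from typing import List

GAP = "-"


def merge_fragments(five_prime: str, three_prime: str) -> str:
    """Combine two aligned fragments into one row, column by column."""
    if len(five_prime) != len(three_prime):
        raise ValueError("fragments must come from the same alignment")
    merged = []
    for a, b in zip(five_prime, three_prime):
        # The protocol applies to non-overlapping fragments; a conflicting
        # overlap means the pairing cannot be made this simply.
        if a != GAP and b != GAP and a != b:
            raise ValueError("fragments overlap and disagree")
        merged.append(a if a != GAP else b)
    return "".join(merged)


def merge_arbitrarily(five_primes: List[str], three_primes: List[str]) -> List[str]:
    """Pair 5' and 3' fragments in list order, i.e. arbitrarily."""
    return [merge_fragments(a, b) for a, b in zip(five_primes, three_primes)]


if __name__ == "__main__":
    # Toy data: two conflicting 5' fragments and two conflicting 3' fragments.
    five = ["ATGC------", "ATGT------"]
    three = ["------GGTA", "------GGCA"]
    print(merge_arbitrarily(five, three))
```

Because the pairing is arbitrary, either merged row may be chimaeric, which is why tip relationships in trees built from such rows need to be read with caution.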