Manolo F Perez, Bryan C Carstens, Gustavo L Rodrigues, Evandro M Moraes.
Abstract
Supporting data for the article "Anonymous nuclear markers reveal taxonomic incongruence and long-term disjunction in a cactus species complex with continental-island distribution in South America" (Perez et al., 2016) [1]. Here, we present pyrosequencing results, primer sequences, a cpDNA phylogeny, and a species tree phylogeny.
Keywords: Molecular markers; Next generation sequencing; Non-model species; Phylogeography; Species tree
Year: 2015 PMID: 26900589 PMCID: PMC4716445 DOI: 10.1016/j.dib.2015.12.002
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Results from pyrosequencing runs and filtering steps.
| Filtering step | Count |
|---|---|
| Total number of reads | 2,282,266 |
| Barcoded reads >100 bp and QC | 1,511,080 |
| Mean number of aligned loci (95% similarity) | 13,218 |
| Mean number of pre-loci (≥ 5 similar sequences in one individual) | 892 |
| Paralog filter (≤20 SNPs) | 530 |
| Loci in ≥ 10 individuals | 223 |
| Loci in all species | 167 |
| Loci in all populations | 48 |
| Manual paralog inspection | 36 |
| Without matches in GenBank | 26 |
| Amplified in all individuals tested (including outgroup) | 25 |
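The later filtering steps above can be read as a funnel. The following is a minimal sketch, not part of the original analysis, that expresses each locus count as a fraction of the 223 loci found in at least 10 individuals. Only the counts come from the table; the variable names are illustrative, and the earlier rows are left out because they mix read counts and per-individual means.

```python
# Locus counts copied from the filtering table, starting at the first
# step that is unambiguously a count of loci across the data set.
steps = [
    ("Loci in >= 10 individuals", 223),
    ("Loci in all species", 167),
    ("Loci in all populations", 48),
    ("Manual paralog inspection", 36),
    ("Without matches in GenBank", 26),
    ("Amplified in all individuals tested", 25),
]

baseline = steps[0][1]  # 223 loci used as the reference point

# Fraction of the baseline loci that survive each successive filter.
retained = [(name, count, count / baseline) for name, count in steps]

for name, count, fraction in retained:
    print(f"{name:40s} {count:4d} {fraction:6.1%}")
```

Under these assumptions, the 25 loci that amplified in all tested individuals are roughly 11% of the 223-locus starting set.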
Blastn matches for the 223 filtered pyrosequencing loci occurring in more than 10 individuals.
| Blastn match category | Number of loci |
|---|---|
| ANL | 121 |
| cpDNA | 8 |
| mtDNA | 27 |
| Retrotransposon | 5 |
| RNA | 62 |
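As a quick consistency check on the Blastn table, the category counts can be summed and converted to proportions. This is an illustrative sketch only; the dictionary keys follow the table labels and the variable names are assumptions.

```python
# Blastn match counts copied from the table above.
blast_counts = {
    "ANL": 121,
    "cpDNA": 8,
    "mtDNA": 27,
    "Retrotransposon": 5,
    "RNA": 62,
}

total_loci = sum(blast_counts.values())

# Share of the filtered loci that falls into each category.
shares = {category: count / total_loci for category, count in blast_counts.items()}

print(f"Total loci: {total_loci}")
for category, share in shares.items():
    print(f"{category:16s} {share:6.1%}")
```

The counts sum to the 223 filtered loci named in the caption, with ANL making up slightly more than half of them.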
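Primers and statistics for each locus.
| Locus | Forward primer | Reverse primer | Tm (°C) | N | bp | S | θw | π | Tajima's D | Fu and Li's D | Fu and Li's F | Substitution model | Path sampling estimate | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|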
| PaANL008 | TCCTCTCTTTTCTAGGGACGAC | CCCCATTCTTTCTTCATTCTATC | 52 | 58 | 497 | 5 | 0.002 | 0.001 | −0.89 | −0.89 | −1.04 | F81 | −776.2717 | |
| PaANL010 | GAGAACGTCAATCCGACAGG | GAACATAGGCTGGCCTCTTC | 53 | 70 | 473 | 4 | 0.002 | 0.001 | −1.03 | 0.97 | 0.40 | JC | −743.16 | |
| PaANL015 | GACCCTAACGAGGGTGAGAC | AAATCATTTCATGAGGCATCG | 51 | 56 | 461 | 27 | 0.020 | 0.010 | −1.57 | 1.51 | 0.48 | F81+G | −1011.27 | |
| PaANL017 | TGTCCACCCCATAGAAGAGG | TTTAGATGAGTCCCAAAAGATACAC | 55 | 80 | 309 | 31 | 0.020 | 0.013 | −1.11 | 1.93 | 0.92 | K80 | −655.69 | |
| PaANL028 | CGTAGCAAACAGACATCCACTT | AAGAAATGCAACAAAAGAGTACCA | 54 | 48 | 459 | 13 | 0.009 | 0.003 | −2.01 | 0.50 | −0.40 | F81 | −746.59 | |
| PaANL035 | TCCTCTTTCCTACCATTCTTTCT | GTTTGAGGAAGGCAGAGGAG | 54 | 44 | 340 | 9 | 0.006 | 0.002 | −1.94 | −0.56 | −1.18 | HKY | −536.95 | |
| PaANL046 | ACTTTCCTGTRTCATATGTAA | CGAACTGGCCTCGGATTC | 50 | 48 | 404 | 25 | 0.014 | 0.006 | −1.87 | −1.61 | −2.02 | F81 | −845.07 | |
| PaANL050 | CGGGTCTAACTTGCCTTCAA | ACCCAACCGGTCAGATTGT | 58 | 52 | 450 | 29 | 0.017 | 0.016 | −0.10 | 1.27 | 0.93 | HKY+I | −942.70 | |
| PaANL080 | AAGAAGAACGGGCGAGTTG | AGGAGGTGGCAATGCAGTAG | 58 | 80 | 477 | 25 | 0.012 | 0.011 | −0.43 | 1.83 | 1.18 | HKY+G | −1013.74 | |
| PaANL082 | CCAAGCAATATCGCATAAACAA | GGCACTAACTGATTCAATAACTGGT | 55 | 64 | 383 | 6 | 0.003 | 0.001 | −1.72 | 1.16 | 0.27 | GTR+I | −674.07 | |
| PaANL087 | TCTTTATGGCGTTATTCACTCG | CGAAGGCCTAACTTGACAGG | 58 | 46 | 395 | 3 | 0.002 | 0.001 | −1.32 | 0.90 | 0.27 | K80 | −647.56 | |
| PaANL096 | AGAAATGTGGGTCAGGAGGA | GAAATGCACATGCCTAGTGA | 56 | 44 | 436 | 17 | 0.011 | 0.003 | −2.18 | −2.42 | −2.77 | F81 | −789.03 | |
| PaANL123 | TTGCATGTTTATACAATTTTTCTTG | TGATAGATGCCAATCAGTCCAC | 55 | 40 | 387 | 18 | 0.011 | 0.006 | −1.36 | 1.25 | 0.45 | HKY | −690.90 | |
| PaANL126 | TCCTAAACAAGGGCTACGAAG | TGTACCAATGGGCAGCAC | 60 | 52 | 451 | 15 | 0.008 | 0.005 | −1.21 | −0.75 | −1.07 | GTR+I | −901.97 | |
| PaANL134 | CGTGGTTTGACAAAACTTACCC | TCAGTGTTTCTAAGATGCTGCAC | 58 | 44 | 473 | 17 | 0.009 | 0.005 | −1.35 | −1.21 | −1.49 | HKY | −837.50 | |
| PaANL140 | TAGCCTCCTGAGCCCAAGC | GTTCATCAATGGGGAAGGTG | 60 | 36 | 478 | 5 | 0.003 | 0.002 | −1.45 | 0.39 | −0.20 | HKY | −759.99 | |
| PaANL142 | CAAGCCTCTCCCTATAAC | TATAGAGTCTAGGCAAGGC | 59 | 36 | 483 | 26 | 0.015 | 0.013 | −0.62 | 0.41 | 0.08 | K80 | −945.42 | |
| PaANL147 | CTGTTGGCTCTGCATAGCTG | TGCTACACTGGCTTCATTGC | 58 | 36 | 440 | 14 | 0.010 | 0.005 | −1.60 | −0.23 | −0.79 | F81+G | −940.03 | |
| PaANL155 | CTTTTCAGTCCAAAGCAAATTC | AAGGTCAGTAAGTCAAGCTCCTC | 56 | 60 | 458 | 5 | 0.003 | 0.001 | −1.61 | 1.08 | 0.27 | F81 | −683.15 | |
| PaANL160 | CGTGCTTTTACCTCCGTAAAG | CTAAGGGCTAATGGTGCTAGG | 56 | 44 | 489 | 26 | 0.014 | 0.010 | −0.93 | 1.86 | 0.96 | HKY | −839.39 | |
| PaANL165 | AGCCCTATATGTGGAAGG | GGAGTGCTTTCAAGCCTTTG | 58 | 38 | 478 | 37 | 0.024 | 0.013 | −1.59 | 0.62 | −0.17 | GTR | −954.68 | |
| PaANL182 | TTCAGGCTTAGGTTGGTGTTC | AGGGTCGTCACGATCATCC | 60 | 40 | 476 | 33 | 0.019 | 0.010 | −1.68 | −2.97 | −2.30 | HKY | −945.80 | |
| PaANL187 | CCGATTGAGGCTAGAAGCTG | TGTCTCTTGGCTTTACTTTAGGG | 58 | 40 | 485 | 28 | 0.015 | 0.007 | −1.92 | 1.24 | 0.20 | GTR | −772.03 | |
| PaANL196 | GCTTGGAGGTTTCCAATGAG | GAATGCTAAGGCCAAAAAGC | 56 | 38 | 435 | 43 | 0.028 | 0.022 | −0.91 | 1.38 | 0.70 | HKY+I | −818.35 | |
| PaANL205 | AAATCGGAGTCACAACAGAGA | TACCGAGATCTTGCGATGC | 54 | 52 | 382 | 23 | 0.013 | 0.008 | −1.46 | 1.43 | 0.49 | F81 | −819.18 |
Tm – melting temperature (°C) for each primer pair, N – number of sequences obtained, bp – length in base pairs, S – number of segregating sites, θw – Watterson's theta, π – nucleotide diversity, followed by Tajima's D, Fu and Li's D, and Fu and Li's F. Numbers in bold represent the model with the higher marginal posterior probability in the path sampling test.
Significance is shown at 0.05.
Significance is shown at 0.02.
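To make the θw column concrete, the sketch below applies the standard per-site form of Watterson's estimator, θw = S / (a_n · L), where a_n is the sum of 1/i for i from 1 to n − 1. It assumes that N in the table is the number of sequences n and that bp is the number of sites L; the function names are illustrative. The source reports that the statistics were gathered with DnaSP, whose handling of alignment gaps and missing data may differ from this simplified calculation, so the result need not reproduce every row of the table.

```python
def harmonic_number(k: int) -> float:
    """Return 1 + 1/2 + ... + 1/k, the a_n term when k = n - 1."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta_per_site(seg_sites: int, n_seqs: int, length_bp: int) -> float:
    """Per-site Watterson's theta: S / (a_n * L)."""
    if n_seqs < 2:
        raise ValueError("at least two sequences are required")
    if length_bp <= 0:
        raise ValueError("alignment length must be positive")
    a_n = harmonic_number(n_seqs - 1)
    return seg_sites / (a_n * length_bp)


# Worked example with the PaANL008 row: S = 5, N = 58, bp = 497.
theta_paanl008 = watterson_theta_per_site(seg_sites=5, n_seqs=58, length_bp=497)
print(f"PaANL008 theta_w per site: {theta_paanl008:.4f}")
```

For PaANL008 this gives a value that rounds to the 0.002 reported in the table.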
Comparison of the divergence times (Mya) estimated for the plastid dataset and the combined multilocus dataset.
| Parameter | cpDNA | Combined |
|---|---|---|
| Mean | 1.7027 | 1.6862 |
| SD | 0.5938 | 0.2515 |
| Variance | 0.3526 | 0.0633 |
| 95% HPD | 0.6915–2.884 | 0.9131–1.766 |

| Specification | Details |
|---|---|
| Subject area | Biology, Genetics and Genomics |
| More specific subject area | Phylogenetics and Phylogenomics |
| Type of data | Pyrosequencing filtering steps, primer sequences and characteristics, species tree analysis input and output, species tree and cpDNA phylogenetic tree |
| How data was acquired | Pyrosequencing filtering in pyRAD, primer sequences designed with Primer3, primer characteristics gathered with DnaSP, species tree and cpDNA phylogenetic tree generated with BEAST2 |
| Data format | Filtered and analyzed |
| Experimental factors | n/a |
| Experimental features | Pyrosequencing of reduced genomic libraries, development of primers and Sanger sequencing for primer validation and missing data reduction |
| Data source location | n/a |
| Data accessibility | With this article; GenBank accession numbers KU161695–KU162858 |