Literature DB >> 30131898

Fluidigm2PURC: Automated processing and haplotype inference for double-barcoded PCR amplicons.

Paul D Blischak¹, Maribeth Latvis², Diego F Morales-Briones³, Jens C Johnson⁴, Verónica S Di Stilio⁴, Andrea D Wolfe¹, David C Tank^5,6,7.

Abstract

PREMISE OF THE STUDY: Targeted enrichment strategies for phylogenomic inference are a time- and cost-efficient way to collect DNA sequence data for large numbers of individuals at multiple, independent loci. Automated and reproducible processing of these data is a crucial step for researchers conducting phylogenetic studies. METHODS AND
RESULTS: We present Fluidigm2PURC, an open source Python utility for processing paired-end Illumina data from double-barcoded PCR amplicons. In combination with the program PURC (Pipeline for Untangling Reticulate Complexes), our scripts process raw FASTQ files for analysis with PURC and use its output to infer haplotypes for diploids, polyploids, and samples with unknown ploidy. We demonstrate the use of the pipeline with an example data set from the genus Thalictrum (Ranunculaceae).
CONCLUSIONS: Fluidigm2PURC is freely available for Unix-like operating systems on GitHub (https://github.com/pblischak/fluidigm2purc) and for all operating systems through Docker (https://hub.docker.com/r/pblischak/fluidigm2purc).

Entities: Chemical

Keywords: bioinformatics; haplotype inference; high‐throughput sequencing; microfluidic PCR; phylogenomics; polyploidy

Year: 2018 PMID： 30131898 PMCID： PMC6025812 DOI： 10.1002/aps3.1156

Source DB: PubMed Journal: Appl Plant Sci ISSN： 2168-0450 Impact factor: 1.936

The collection of large‐scale, multilocus data sets for phylogenomic inference has become an increasingly common method for understanding evolutionary relationships within a group of taxa. Coupled with recent implementations of coalescent‐based species tree estimation programs that take into account the independent histories of different genes (e.g., SVDquartets [Chifman and Kubatko, 2014]; ASTRAL‐II [Mirarab and Warnow, 2015]), targeted enrichment strategies are powerful methods for collecting more informative data sets for conducting phylogenomic investigations. Of the many types of targeted enrichment that exist, several recent studies have begun to use a method that combines both library preparation and target amplification into a single step. This process, known as double‐barcoded amplicon sequencing (Uribe‐Convers et al., 2016), allows for the time‐ and cost‐effective collection of multilocus sequence data for large numbers of individuals. Double‐barcoded amplicon sequencing combines the amplification of a targeted region in the genome with the addition of sample‐specific barcodes and Illumina sequencing adapters to the resulting PCR product for paired‐end sequencing on an Illumina MiSeq platform (Uribe‐Convers et al., 2016). This is done by adding conserved sequence tags to traditional PCR primers, which act as templates for adding barcodes and adapters when preparing the sequencing library. Parallel amplification is most often achieved using microfluidic PCR with the Fluidigm Access Array (Fluidigm, San Francisco, California, USA; e.g., Gostel et al., 2015; Uribe‐Convers et al., 2016; Kates et al., 2017), allowing for multiple samples and loci to be amplified simultaneously (minimum of 48 samples × 48 loci). The newer Fluidigm Juno system can also handle up to 192 samples in a single run, and multiplexing of primer pairs can allow for even higher throughput, provided that the primers do not interact during amplification. Double‐barcoded amplicons can also be generated by other means using approaches such as traditional or highly multiplexed PCR (e.g., Bybee et al., 2011; Dupuis et al., 2017). Previous methods to analyze these data have typically relied on generating consensus sequences using software packages such as Geneious (Kearse et al., 2012; e.g. Gostel et al., 2015), HiMAP (Dupuis et al., 2017), or an R script, reduce_amplicons.R, that is part of the dbcAmplicons package (but see comparison with “occurrence‐based” methods in dbcAmplicons in the “Example Analysis” section; Uribe‐Convers et al., 2016; Kates et al., 2017). However, using consensus sequences can often ignore important within‐individual level variation, such as differing alleles or levels of ploidy. To alleviate this issue and to facilitate the analysis of these data for haplotype inference, we developed Fluidigm2PURC. Fluidigm2PURC consists of two main Python scripts that process input data files using several external programs (Table 1) that automate quality filtering, read merging, and file formatting for downstream steps (Fig. 1). Although it can be used to process any double‐barcoded amplicons, the software derives its name from the method of PCR amplification that we used to generate our data (Fluidigm Access Array), as well as its primary dependency, PURC, a Python program that combines sequence clustering and PCR chimera detection (Rothfels et al., 2017). The final step in the Fluidigm2PURC pipeline processes clusters from PURC and outputs a FASTA file containing phased haplotypes for all targeted sequences. This last step has methods for haplotype inference that work on diploids, polyploids, individuals with unknown ploidy, or any mixture of the three. To demonstrate the utility of Fluidigm2PURC, we analyzed nuclear amplicon data from the genus Thalictrum L. (Ranunculaceae) and compared the results with those obtained from dbcAmplicons using the reduce_amplicons.R script (Uribe‐Convers et al., 2016).

Table 1

Dependencies for the Fluidigm2PURC pipeline

Dependency (Version)	Citation	Link
PURC (v1.02)	Rothfels et al., 2017	https://bitbucket.org/crothfels/purc
Sickle (v1.33)	Joshi and Fass, 2011	https://github.com/najoshi/sickle
FLASH2 (v2.2.00)	Magoč and Salzberg, 2011	https://github.com/dstreett/FLASH2
MAFFT (v7.237)	Katoh, 2013	http://mafft.cbrc.jp/alignment/software/
Phyutility (v2.7.1)	Smith and Dunn, 2008	https://github.com/blackrim/phyutility

Figure 1

Flowchart outlining the steps for haplotype inference using Fluidigm2PURC.

Dependencies for the Fluidigm2PURC pipeline Flowchart outlining the steps for haplotype inference using Fluidigm2PURC.

METHODS AND RESULTS

Input data

The input data for Fluidigm2PURC are paired‐end FASTQ files that have been demultiplexed using the program dbcAmplicons (Uribe‐Convers et al., 2016). dbcAmplicons demultiplexes reads using the original sample barcodes and amplicon primer sequences to annotate the reads with the sample and locus name that each read comes from, followed by trimming these identifying parts of the sequence. The resulting pair of FASTQ files is then input into the first script in the pipeline, fluidigm2purc.

Step 1: fluidigm2purc

The fluidigm2purc script takes the paired‐end FASTQ files, filters them using Sickle (Joshi and Fass, 2011; minimum length = 100 bp, PHRED threshold = 20), merges the filtered reads using FLASH2 (Magoč and Salzberg, 2011), and then converts the resulting FASTQ files into FASTA files (one for each locus) with sequence header information that is compatible with PURC. The sequence headers for PURC follow the format ‘>IndividualName|LocusName|UniqueID#’. When paired reads with low‐quality bases are trimmed by Sickle, or if the targeted region is too long so that the two reads do not overlap, we merge them artificially with multiple N's inserted in between. This allows us to keep paired reads together in downstream steps. The fluidigm2purc script writes two additional files: (1) the taxon table, a two‐column table listing each sequenced taxon and its ploidy level, and (2) the locus‐err table, a two‐column table listing each sequenced locus and the average level of sequencing error for all reads coming from that locus. The taxon table lists the ploidy as “None” for all individuals by default, but known ploidy levels can be included by the user (e.g., diploid has the value “2,” tetraploid has the value “4,” etc.). For the locus‐err table, the per‐locus levels of sequencing error are calculated individually from the input FASTQ files using the average PHRED score per read averaged across all reads coming from that locus.

Step 2: PURC

The output FASTA files from fluidigm2purc can be run through PURC using the purc_recluster.py script (Rothfels et al., 2017). This script is used to iteratively run chimera detection and sequence clustering (performed with USEARCH; Edgar, 2010; Edgar et al., 2011) on each locus individually to produce a reduced set of putative haplotypes that includes size information about the number of original reads forming each cluster. Details on running PURC can be found on its Bitbucket page (https://bitbucket.org/crothfels/purc).

Step 3: crunch_clusters

The clusters output by PURC are then run through our second script, crunch_clusters, which uses the taxon table and locus‐err table output by fluidigm2purc (Step 1) to infer haplotypes in a maximum likelihood framework. This script also has options for realigning clusters using MAFFT (Katoh, 2013), as well as cleaning the clusters using Phyutility (Smith and Dunn, 2008). Before haplotypes can be inferred at a locus, we first do a pairwise comparison of all clusters for each taxon individually and merge any clusters that are identical (ignoring gaps). This step is necessary because of the initial trimming/filtering step in the fluidigm2purc script. Artificially joining unmerged reads often causes two sequencing clusters representing the same haplotype to form: (1) one cluster for reads that were merged, and (2) one cluster for the reads that were artificially merged and contain a large number of gapped sites in the middle. In this case, these two clusters should not be treated as separate haplotypes, so we combine the clusters by keeping the larger haplotype (i.e., the one with fewer gaps) and adding the sizes of the two clusters together. The alternative would be to process the original data by ignoring all reads that did not merge. However, throwing away unmerged reads could potentially discard sequence variation that should be represented in the data set, especially if most reads are unmerged, which may be the case for large amplicons. The downside of merging sequences that are identical except for gaps is that it potentially discards informative indel variation, although it is unlikely that a locus within an individual would have only one of its haplotypes containing gaps and not the others. Overall, we felt that this approach provided the best method for including more of the original data when inferring haplotypes.

Inferring haplotypes with ploidy information

For known ploidy levels, we use a multinomial likelihood to determine the number of copies of each potential haplotype using the ordered cluster sizes returned by PURC (largest to smallest). Given an individual of ploidy level K, we enumerate the number of possible haplotype configurations using integer partitions (an unordered set of integers that sums to K; Stojmenovic and Zoghbi, 1998). Because the cluster sizes are sorted, we never need to consider more than the first K largest clusters. For example, a tetraploid can have a maximum of four haplotypes, and the integer partitions to consider are (4,0,0,0), (3,1,0,0), (2,2,0,0), (2,1,1,0), and (1,1,1,1). This corresponds to (four copies of haplotype 1), (three copies of haplotype 1, one copy of haplotype 2), (two copies of haplotype 1, two copies of haplotype 2), (two copies of haplotype 1, one copy of haplotype 2, one copy of haplotype 3), and (one copy of haplotype 1, one copy of haplotype 2, one copy of haplotype 3, one copy of haplotype 4). The mathematical details for the likelihood function with an example calculation are presented in Appendix S1 (section S1.1). Once the most likely configuration has been identified, the crunch_clusters script will return each haplotype in proportion to its representation in the maximum likelihood estimate. We have also provided options to return only unique haplotypes and to treat loci as haploid, the latter of which can be used to process organellar data. The haploid option can also be used as an alternative to finding consensus sequences for nuclear loci by returning only the cluster with the most reads.

Inferring haplotypes without ploidy information

For unknown ploidy levels, we no longer have information about the maximum number of haplotypes that an individual can have. However, we can use the cluster sizes to infer which clusters from PURC are actual haplotypes versus those that are likely to be sequencing errors. We do this by calculating the likelihood that each successive haplotype in the sorted list is a “real” haplotype versus a sequencing error. As an example, consider a tetraploid with six clusters identified by PURC. We first calculate the likelihood that all clusters are errors. Then we calculate the likelihood that cluster 1 is a real haplotype, and 2 through 6 are errors. Next, we calculate the likelihood that clusters 1 and 2 are real haplotypes, and that 3 through 6 are errors. This continues until we calculate the likelihood of all six clusters being real haplotypes. We then apply a cutoff that uses the relative increase in the likelihood when an additional haplotype is added. If treating an additional cluster as a haplotype increases the likelihood by less than the cutoff, then only the previous haplotypes are kept and the others are considered errors. We use a default cutoff of 10% increase in the likelihood. An example with the likelihood function that we use for this approach is provided in Appendix S1 (section S1.2).

Example analysis

To demonstrate the use of the Fluidigm2PURC pipeline, we analyzed amplicon sequence data generated from orthologs of the nuclear gene PISTILLATA (PI) in the genus Thalictrum (Ranunculaceae), which is single‐copy in diploids and two‐copy in tetraploids (Di Stilio et al., 2005). PI is responsible for establishing stamen and petal identity during flower development in Arabidopsis thaliana (L.) Heynh. (Goto and Meyerowitz, 1994) and has been used to detect reticulation in polyploid Lepidium L. (Brassicaceae) (Lee et al., 2002; Soza et al., 2014). Given the length of the PI locus, primers were designed to sequence exons 3 to 6 in two overlapping ~600‐bp segments: exons 3 to 5 (PIS_4) and exons 4 to 6 (PIS_3). Our analyses focused on six species with known ploidy levels ranging from diploid (2n = 2x =14) to 22‐ploid (2n = 22x = 154). These species are presented in Table 2, with accession numbers following Soza et al. (2013). Paired‐end reads were demultiplexed and annotated using dbcAmplicons (Uribe‐Convers et al., 2016) followed by read trimming, merging, and sequence renaming using the fluidigm2purc script with default options. All reads coming from PIS_3 and PIS_4 were then run separately through PURC using the purc_recluster.py script (Rothfels et al., 2017). After clustering and chimera detection, we determined haplotypes for each amplicon using three different approaches: (1) consensus sequences using the ‘‐‐haploid’ option, (2) unique haplotypes assuming unknown ploidy (10% likelihood cutoff), and (3) unique haplotypes using known ploidy. For each of these methods, we realigned and cleaned the sequences using MAFFT (Katoh, 2013) and Phyutility (added the options ‘‐‐realign ‐‐clean 0.33’; Smith and Dunn, 2008).

Table 2

Thalictrum species included in the comparison of Fluidigm2PURC and dbcAmplicons.a

Species	Ploidy level	Collection information (Herbarium)c
T. thalictroides (L.) A. J. Eames & B. Boivin	2n = 2x = 14	V. Di Stilio 123 (WTU)
T. squarrosum Stephan ex Willd.	2n = 6x = 42	V. Di Stilio & X. Duan Thalictrum sp#8, 20120617 (PE)
T. macrostylum Shuttlew. ex Small & A. Heller	2n = 8x = 56	R. Penny (unvouchered)
T. pubescens Purshb	2n = 12x = 84 or 2n = 22x = 154	D. Baum & D. Howarth 375 (A)
T. revolutum DC.	2n = 20x = 140	V. Soza 1917 (WTU)
T. dasycarpum Fisch., C. A. Mey. & Avé‐Lall.	2n = 22x = 154	V. Di Stilio 110 (WTU)

All additional information is available from Soza et al. (2013), Tables S1, S3, and S4.

For all analyses, T. pubsescens was analyzed at the 22x level.

Collection information is listed as the collector(s), collection number, and the herbarium. Herbarium acronyms are according to Index Herbariorum (Thiers, 2018).

Thalictrum species included in the comparison of Fluidigm2PURC and dbcAmplicons.a All additional information is available from Soza et al. (2013), Tables S1, S3, and S4. For all analyses, T. pubsescens was analyzed at the 22x level. Collection information is listed as the collector(s), collection number, and the herbarium. Herbarium acronyms are according to Index Herbariorum (Thiers, 2018). As a comparison, we also analyzed these data using the reduce_amplicons.R script from the dbcAmplicons package (v0.8.5; Uribe‐Convers et al., 2016). This script merges paired‐end reads using FLASH2 (Magoč and Salzberg, 2011) and allows for a global read trimming size to be used for read 1, read 2, or both. Unmerged reads are treated independently, resulting in separate haplotypes for read 1 and read 2. The final result is a FASTA file with the unaligned haplotypes that can be further processed for downstream applications. We generated consensus haplotypes as well as haplotypes based on read occurrence (controlled by the minimum read frequency and minimum read count) using the default settings, and trimmed 20 bp from read 1 and 40 bp from read 2 (base quality drops off more quickly in read 2, so we removed more bases; Uribe‐Convers et al., 2016). We then aligned the resulting sequences using MAFFT (Katoh, 2013). These results were compared to the haplotypes from Fluidigm2PURC based on (1) the number of recovered haplotypes, (2) the length of the resulting alignment, and (3) the amount of gaps in the alignment.

Results

Haplotypes inferred by both methods were visualized and compared using alignment statistics computed in Geneious version 8.1.8 (Kearse et al., 2012) and MEGA version 7.0.18 (Kumar et al., 2016). Consensus sequences from the Fluidigm2PURC and dbcAmplicons pipelines were similar overall, with the reduce_amplicons.R script producing longer haplotypes, but containing more gaps (Table 3). We then compared the occurrence‐based method from the reduce_amplicons.R script with the crunch_clusters results when ploidy levels are treated as unknown. In this case, Fluidigm2PURC recovered more haplotypes with fewer gaps and more parsimony informative sites. We believe the reason that the reduce_amplicons.R script recovered so few haplotypes is due to its use of minimum read count and frequency criteria that rely on reads being identical to form haplotypes, rather than clustering based on similarity. Inferring haplotypes with Fluidigm2PURC using known ploidy levels resulted in the largest number of recovered haplotypes. The reason that using known versus unknown ploidy levels produced more haplotypes (PIS_3: 57 vs. 18, PIS_4: 43 vs. 14) was because the cluster sizes that went into the likelihood calculation were disparate for some species (a few large clusters and many smaller ones), making the smaller clusters difficult to model when the ploidy level was unknown due to lack of prior knowledge about how many haplotypes should be expected. On a per species basis, using known ploidy levels always led to more inferred haplotypes, but only for polyploids (Table 4). For example, the PIS_4 region for Thalictrum pubescens Pursh recovered 15 haplotypes when assuming known ploidy (analyzed as 22x), but only one haplotype when assuming unknown ploidy. The reason for this is that the cluster data for this species had one putative haplotype with many reads (147), but all other putative haplotypes had far fewer reads (the next largest cluster had 25 reads, and nine clusters had fewer than 10 reads). In general, drawing the line between real haplotypes and errors for clusters with lower read counts is difficult when the ploidy level is unknown. By applying a threshold, the method we implement is a conservative way to estimate haplotypes that only includes clusters with the highest read counts.

Table 3

Overall alignment statistics for the comparison between Fluidigm2PURC and the reduce_amplicons.R script

Statistics	PIS_3		PIS_4
Statistics	Fluidigm2PURC	reduce_amplicons	Fluidigm2PURC	reduce_amplicons
Consensus
Alignment length (bp)	395	415	403	418
Percent gaps	27.9	30.0	1.9	4.4
Parsimony informative sites	19	19	11	10
Unknown ploidy/occurrence
No. of haplotypes	18	5	14	6
Alignment length (bp)	395	403	403	424
Percent gaps	16.2	48.9	2.7	7.1
Parsimony informative sites	48	11	43	10
Known ploidy
No. of haplotypes	57	—	43	—
Alignment length (bp)	395	—	403	—
Percent gaps	10.3	—	2.6	—
Parsimony informative sites	81	—	62	—

Table 4

Per species data for the number of haplotypes inferred by Fluidigm2PURC using known vs. unknown ploidy

Species	Ploidy	PIS_3 a		PIS_4 a
Species	Ploidy	Known	Unknown	Known	Unknown
T. thalictroides	2n = 2x = 14	1 (2.3)	1 (2.3)	1 (0.8)	1 (0.8)
T. squarrosum	2n = 6x = 42	6 (3.3)	4 (3.1)	4 (3.9)	3 (3.2)
T. macrostylum	2n = 8x = 56	8 (13.9)	5 (20.9)	4 (3.3)	4 (3.3)
T. pubescens b	2n = 12x = 84 or 2n = 22x = 154	12x: 11 (3.7)22x: 19 (3.3)	1 (1.5)	12x: 9 (2.3)22x: 15 (2.3)	1 (1.2)
T. revolutum	2n = 20x = 140	10 (26.9)	4 (40.9)	5 (2.5)	3 (2.9)
T. dasycarpum	2n = 22x = 154	13 (9.2)	3 (2.8)	14 (2.4)	2 (2.1)

Data are presented as: number of inferred haplotypes (average percent gaps per haplotype).

For Thalictrum pubescens, haplotypes are presented at both the 12x and 22x level.

Overall alignment statistics for the comparison between Fluidigm2PURC and the reduce_amplicons.R script Per species data for the number of haplotypes inferred by Fluidigm2PURC using known vs. unknown ploidy Data are presented as: number of inferred haplotypes (average percent gaps per haplotype). For Thalictrum pubescens, haplotypes are presented at both the 12x and 22x level.

CONCLUSIONS

The ability to infer haplotypes regardless of an individual's ploidy level is a crucial step toward understanding the complex relationships within many plant groups whose evolutionary histories often contain multiple instances of hybridization and whole genome duplication (Soltis and Soltis, 2009; Van de Peer et al., 2009). As models that accommodate these processes continue to be developed (e.g., Jones et al., 2013; Solís‐Lemus and Ané, 2016; Oberprieler et al., 2017; Thomas et al., 2017; Wen and Nakhleh, 2018), we anticipate that the functionality of our pipeline will be especially useful for conducting phylogenomic studies with nuclear sequence data. Furthermore, the increase in genomic resources for taxa across the plant tree of life will continue to facilitate the process of phylogenetic marker development, allowing more researchers to take advantage of targeted enrichment strategies such as double‐barcoded amplicon sequencing. Compared with existing approaches for analyzing these data, the methods we present here offer an improved workflow for sequence processing, clustering, and haplotype inference, and are particularly well suited for analyses in taxa for which there is incomplete knowledge about ploidy levels.

DATA ACCESSIBILITY

Fluidigm2PURC is an open source software that is freely available on GitHub (https://github.com/pblischak/fluidigm2purc) for Unix‐like operating systems (Mac, Linux) under the GNU General Public License v3. We have also built a Docker image with all dependencies (Table 1) pre‐installed for use on any operating system with a compatible distribution of the Docker software (https://hub.docker.com/r/pblischak/fluidigm2purc) (https://www.docker.com; Merkel, 2014). Fluidigm2PURC is written in Python and has been successfully tested using Python versions 2.7 and 3.6. Documentation for the software can be found on ReadTheDocs (http://fluidigm2purc.readthedocs.io). The code for each step of our example analysis is available in Appendix S1 (section S2). Raw sequence data from the PIS_3 and PIS_4 loci for the six sampled Thalictrum species, as well as all output FASTA files from the Fluidigm2PURC and dbcAmplicons pipelines, are available from the Dryad Digital Repository (https://doi.org/10.5061/dryad.89k5k30; Blischak et al., 2018). Click here for additional data file.

25 in total

1. Allopolyploidization and evolution of species with reduced floral structures in Lepidium L. (Brassicaceae).

Authors: Ji-Young Lee; Klaus Mummenhoff; John L Bowman
Journal: Proc Natl Acad Sci U S A Date: 2002-12-12 Impact factor: 11.205

2. Phyutility: a phyloinformatics tool for trees, alignments and molecular data.

Authors: Stephen A Smith; Casey W Dunn
Journal: Bioinformatics Date: 2008-01-28 Impact factor: 6.937

3. Timing and consequences of recurrent polyploidy in meadow-rues (thalictrum, ranunculaceae).

Authors: Valerie L Soza; Kendall L Haworth; Verónica S Di Stilio
Journal: Mol Biol Evol Date: 2013-05-31 Impact factor: 16.240

4. Next-generation polyploid phylogenetics: rapid resolution of hybrid polyploid complexes using PacBio single-molecule sequencing.

Authors: Carl J Rothfels; Kathleen M Pryer; Fay-Wei Li
Journal: New Phytol Date: 2016-07-27 Impact factor: 10.151

5. Evolutionary and domestication history of Cucurbita (pumpkin and squash) species inferred from 44 nuclear loci.

Authors: Heather R Kates; Pamela S Soltis; Douglas E Soltis
Journal: Mol Phylogenet Evol Date: 2017-03-10 Impact factor: 4.286

6. Floral MADS box genes and homeotic gender dimorphism in Thalictrum dioicum (Ranunculaceae) - a new model for the study of dioecy.

Authors: Verónica S Di Stilio; Elena M Kramer; David A Baum
Journal: Plant J Date: 2005-03 Impact factor: 6.417

7. MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors: Kazutaka Katoh; Daron M Standley
Journal: Mol Biol Evol Date: 2013-01-16 Impact factor: 16.240

8. Function and regulation of the Arabidopsis floral homeotic gene PISTILLATA.

Authors: K Goto; E M Meyerowitz
Journal: Genes Dev Date: 1994-07-01 Impact factor: 11.361

9. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors: Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal: Bioinformatics Date: 2012-04-27 Impact factor: 6.937

10. A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae).

Authors: Simon Uribe-Convers; Matthew L Settles; David C Tank
Journal: PLoS One Date: 2016-02-01 Impact factor: 3.240

2 in total

1. Examining the utility of DNA barcodes for the identification of tallgrass prairie flora.

Authors: Sarah A Herzog; Maribeth Latvis
Journal: Appl Plant Sci Date: 2021-01-24 Impact factor: 1.936

2. Adaptive responses drive the success of polyploid yellowcresses (Rorippa, Brassicaceae) in the Hengduan Mountains, a temperate biodiversity hotspot.

Authors: Ting-Shen Han; Zheng-Yan Hu; Zhi-Qiang Du; Quan-Jing Zheng; Jia Liu; Thomas Mitchell-Olds; Yao-Wu Xing
Journal: Plant Divers Date: 2022-03-16

2 in total