Transcriptomic resources and marker validation for diploid and polyploid Veronica (Plantaginaceae) from New Zealand and Europe.

Eike Mayland-Quellhorst, Heidi M Meudt, Dirk C Albach.

Abstract

PREMISE OF THE STUDY: Polyploidy may generate novel variation, leading to adaptation and species diversification. An excellent natural system to study polyploid evolution in a comparative framework is Veronica (Plantaginaceae), which comprises several parallel, recently evolved polyploid series.
METHODS: Over 105 million Illumina paired-end sequence reads were generated from cDNA libraries of leaf tissue from eight individuals representing three European and four New Zealand species. Forty-eight simple sequence repeat (SSR) and 48 low-copy nuclear (LCN) markers were developed and validated with Fluidigm microfluidic PCR and Illumina MiSeq amplicon sequencing on 48 different individuals each.
RESULTS: Individual Trinity assemblies were similar regarding annotated transcripts (13,009–14,271), mean contig length (635–742 bp), N50 value (916–1133 bp), E90N50 value (1099–1308 bp), contigs with positive BLAST hits (42–63%), and gene ontology terms. Analyses of 29,738 single-nucleotide polymorphisms (8746 phylogenetically informative) mined from these transcriptomes plus two outgroups (Picrorhiza kurrooa and Plantago ovata) showed moderate to high bootstrap support for all branches and reticulation among sampled European Veronica.
DISCUSSION: The transcriptome sequences themselves, as well as the validated SSR (40/48) and LCN (11/48) markers derived from them, show inter- and intraspecific genetic variation. These resources will be invaluable for future population genetic, phylogenetic, and functional genetic investigations in polyploid Veronica.

Keywords:  Veronica; low-copy nuclear (LCN) markers; polyploidy; simple sequence repeat (SSR) markers; single-nucleotide polymorphisms (SNPs); transcriptome

Year:  2016        PMID: 27785388      PMCID: PMC5077287          DOI: 10.3732/apps.1600091

Source DB:  PubMed          Journal:  Appl Plant Sci        ISSN: 2168-0450            Impact factor:   1.936


Polyploidy (whole genome duplication) is a very important process that has shaped flowering plant evolutionary history (Soltis et al., 2009). Much progress in the study of polyploid evolution has been made in the past two decades regarding both ancient paleopolyploidization (Doyle et al., 2008; Soltis et al., 2009) as well as very recent neopolyploidization (Buggs et al., 2009; Abbott et al., 2013). An important research gap (Soltis et al., 2009) is understanding polyploids of intermediate age that have diploid ancestors in the same genus, so-called mesopolyploids, which are characterized by diploid-like reproduction but whose parental subgenomes are still discernible (Mandáková et al., 2010). Several mesoallopolyploid crop systems (e.g., cotton, soybean, tobacco, wheat) are becoming well understood and have excellent genetic resources; however, understanding natural systems is also important. Specifically, studying natural mesopolyploid species radiations may be key to understanding the importance of polyploidy in angiosperm diversification (Soltis et al., 2009). Recent plant species radiations are a significant contributor to generating plant biodiversity, and evidence suggests that polyploidy has played an important role in these radiations (Mayrose et al., 2011). Many fundamental and biologically interesting questions regarding polyploidy and diversification in plants are yet to be investigated in such systems (Doyle et al., 2008; Soltis et al., 2009; Mayrose et al., 2011). The large, nearly cosmopolitan genus Veronica L. (Plantaginaceae) comprises approximately 450 species of annual and perennial herbs, shrubs, and small trees with centers of diversity in both Eurasia and New Zealand. The genus is an excellent example of a natural mesopolyploid (∼20 million years old) system comprising multiple lineages, including several recent species radiations, in which polyploidy and hybridization have accompanied diversification (Albach et al., 2008; Meudt et al., 2015). 
Northern Hemisphere Veronica species are diploids or polyploids with chromosome numbers ranging from 2n = 14–80 and base numbers of x = 6–9 and 17 (Albach et al., 2008). By contrast, Southern Hemisphere species—which evolved as a single lineage from Northern Hemisphere ancestors ∼10 million years ago (Wagstaff et al., 2002; Albach and Meudt, 2010; Meudt et al., 2015)—all have high chromosome numbers (2n = 40–124) with base chromosome numbers of x = 20 or 21 (Albach et al., 2008). Several studies focusing on Veronica in both hemispheres have used standard DNA sequencing and amplified fragment length polymorphism (AFLP) fingerprinting techniques to elucidate patterns of relationship from phylogeography (Meudt and Bayly, 2008) to phylogeny of the genus as a whole (Wagstaff et al., 2002; Albach and Meudt, 2010; Meudt et al., 2015) or of particular polyploid complexes (e.g., Albach, 2007), and used these to infer the evolution of chromosome number, genome size, breeding systems, and habit (Albach and Greilhuber, 2004; Albach et al., 2008; Meudt et al., 2015). Nevertheless, a lack of variable genetic markers using standard DNA sequencing and genotyping techniques, and a lack of appropriate phylogenetic analysis methodologies that can incorporate reticulate evolution and allopolyploids, have hampered further progress in studies of Veronica and polyploid evolution at the population, species, and generic levels. It has been known for some time that low-copy nuclear (LCN) markers can be extremely useful for phylogenetic reconstruction at the genus (interspecific) level, including for elucidating the evolutionary history of polyploids, for which standard uniparental DNA sequencing markers from chloroplast DNA or the internal transcribed spacer (ITS) region are not informative (e.g., Sang, 2002). 
Apart from LCN markers, microsatellites or simple sequence repeat (SSR) markers are useful for closely related species when traditionally genotyped and analyzed for studies at the infrageneric level, but SSRs and their flanking regions may also be useful as phylogenetic markers when high-throughput sequenced (Chatrou et al., 2009; Germain-Aubrey et al., 2016). This, however, requires new bioinformatic tools such as the workflows MarkerMiner (Chamala et al., 2015) and QDD (Meglécz et al., 2014) for the development of LCN and SSR markers, respectively, using genomic and transcriptomic resources. High-throughput de novo transcriptome sequencing, or RNA-Seq, has proven to be an excellent source of genetic data for gene characterization and marker development in studies of natural systems with little or no additional genetic resources available (Strickler et al., 2012; Alvarez et al., 2015), as is the case for Veronica. The benefits of RNA-Seq are simultaneous characterization of genes and gene expression, reduced representation for large, complex genomes, and the generation of large amounts of sequence data without a reference genome. RNA-Seq also presents challenges, particularly assembly without a reference genome and assembly of polyploid genomes. Polyploid transcriptome assembly is an active area of research. A major issue is the differentiation of homoeologs from orthologs. Some studies have tested different pipelines, such as combining multiple k-mer assemblies in polyploid wheat (Krasileva et al., 2013), or combining assemblies from different assemblers and then using a second step to cluster redundant contigs in polyploid tobacco (Nakasugi et al., 2014). However, there are few examples to date of comparisons in natural, noncrop systems with few prior genomic resources. To date, there are no clear answers regarding which assembler, combination of assemblers, or assembly pipeline is best for polyploids and their diploid progenitors or close relatives. 
The aim of our study was, therefore, twofold. First, we aimed to generate transcriptomic data for Veronica; second, we aimed to use these transcriptomic resources to develop and validate phylogenetically informative sequencing markers. Specifically, in this paper, we generate the first transcriptome resources for the genus Veronica, using short-read Illumina HiSeq 2000 (Illumina, San Diego, California, USA) sequencing of eight individuals representing seven species and five different ploidy levels. We then assemble, identify, and broadly characterize and compare a large number of expressed sequences. Single-nucleotide polymorphisms (SNPs) are mined from transcriptomes of these eight individuals plus those of two additional Plantaginaceae outgroups (Plantago ovata Forssk. and Picrorhiza kurrooa Royle ex Benth., available from public databases) and compared using phylogenetic and network analyses. Secondly, we used the transcriptomic data to discover, design, and develop two types of genetic markers (i.e., LCN and SSR markers). To test the success of our approach, we then used microfluidic PCR and Illumina MiSeq to validate 48 loci in 48 individuals for both LCN and SSR markers. We provide examples of sequence alignments and downstream phylogenetic analyses for representative loci showing their potential phylogenetic utility in Veronica when resequenced with high-throughput sequencing. The resource and marker development of the current study provide new, variable markers for future evolutionary studies of the genus. Furthermore, a parallel study currently underway will further examine assembly methods and analyze the transcriptomes themselves to quantify and compare underlying interspecific gene divergence and investigate the timing and mode of polyploidy in the sampled Veronica polyploids and their close relatives (Meudt et al., unpublished data). 
The current study is thus a critical first step toward ultimately understanding the role of polyploidy in generating novel genetic and morphological variation that leads to adaptation and species diversification (Doyle et al., 2008; Soltis and Soltis, 2009).

MATERIALS AND METHODS

RNA extraction, cDNA library prep, and Illumina sequencing

We sampled leaf tissue from seven greenhouse-grown individuals and from one field-collected individual representing seven species of two polyploid complexes in Veronica from New Zealand and Eurasia with three ploidy levels each (Appendix 1). The field-collected material was stored at −80°C on RNAlater (Life Technologies, Carlsbad, California, USA). Because we wanted to take a broad approach to analyze polyploidy in Veronica and develop markers for the entire genus, we sampled multiple species in two divergent lineages, rather than multiple individuals per species. Cultivated plants were grown in the same greenhouse in Oldenburg, Germany. Leaf material was harvested and placed directly into tubes with liquid nitrogen, stored at −80°C until extraction, and ground to a powder with a prechilled mortar and pestle while adding liquid nitrogen. RNA was extracted using the RNeasy kit (QIAGEN GmbH, Hilden, Germany) following manufacturer’s instructions using 500 μL RLC buffer with 4% PVP and 1% β-mercaptoethanol. A DNase I digest and RNase inhibitor reaction was performed using 0.5 μL (20 units) RNase inhibitor, 6.0 μL 10× DNase I buffer, and 1.0 μL DNase I to the resulting 60 μL RNA extract and incubated at 37°C for 15 min. Then, 2.6 μL EDTA (0.2 M, pH = 8; final conc. 8 mM) was added, incubated for 10 min at 75°C, and the RNA was reprecipitated by adding 1:10 3 M sodium acetate, 2.5 volume 100% ethanol, incubating on ice for 20 min, centrifuging at full speed for 5 min, washing with 100 μL 75% ethanol, centrifuging at full speed, air-drying the resultant pellet for 10–15 min, redissolving in 25 μL RNase-free water, and storing at −80°C. Small aliquots of raw RNA extract and the reprecipitated RNA extract were run on the Tecan Infinite Pro F200 (Tecan, Crailsheim, Germany) and Agilent 2100 Bioanalyzer (Agilent Technologies, Waldbronn, Germany) to measure RNA quality and quantity. 
RNA samples from eight individuals with an RNA Integrity Number (RIN) of 6.8 or greater, a 260:280 ratio of 1.9–2.1, and a concentration of at least 50 ng/μL (Appendix 1) were sent to BGI (BGI-Hong Kong Co. Ltd, Hong Kong, China) for Illumina TruSeq cDNA library preparation on normalized RNA and high-throughput Illumina HiSeq 2000 100-bp paired-end de novo transcriptome sequencing. The transcriptomic data generated here are publicly available in the National Center for Biotechnology Information (NCBI) Sequence Read Archive under submission SRP074674, and the Trinity assemblies are available in the NCBI Transcriptome Shotgun Assembly Sequence Archive (Table 1; http://www.ncbi.nlm.nih.gov/sra/SRP074674).
Table 1. Information about Illumina sequencing reads and Trinity assemblies for the eight individuals of Veronica sampled.

| Species (Ploidy) | Geography | SRA accession/TSA accession [a] | No. of raw (clean) reads | No. of contigs (Trinity) | N50 value (mean/median contig length) | No. (%) annotated contigs with positive BLAT hits | No. (%) GO terms assigned [b] | No. of LCN markers [c] |
|---|---|---|---|---|---|---|---|---|
| Veronica catarractae (6x) | New Zealand | SAMN04961631/GEVT00000000 | 23,711,074 | 66,671 | 1078 (732/493) | 37,287 (56) | 13,940 (21) | 580 |
| V. hectorii subsp. coarctata (6x) | New Zealand | SAMN04961628/GEVQ00000000 | 24,385,098 | 64,950 | 1097 (742/511) | 36,839 (57) | 14,068 (22) | 573 |
| V. planopetiolata (12x) | New Zealand | SAMN04961630/GEVS00000000 | 25,055,264 | 73,820 | 1020 (698/476) | 39,915 (54) | 14,197 (19) | 625 |
| V. ochracea (18x) | New Zealand | SAMN04961629/GEVR00000000 | 24,050,110 | 61,752 | 1065 (722/494) | 37,071 (60) | 14,211 (23) | 606 |
| V. panormitana (2x) | Europe | SAMN04961624/GEVN00000000 | 24,429,090 | 41,451 | 1118 (741/482) | 25,047 (60) | 13,714 (33) | 571 |
| V. trichadena (2x) | Europe | SAMN04961625/GEVU00000000 | 24,269,936 | 58,998 | 916 (635/403) | 24,583 (42) | 13,009 (22) | 460 |
| V. cymbalaria (4x) | Europe | SAMN04961626/GEVO00000000 | 23,406,760 | 46,573 | 1133 (767/539) | 29,458 (63) | 13,564 (29) | 506 |
| V. cymbalaria (6x) | Europe | SAMN04961627/GEVP00000000 | 24,845,504 | 73,889 | 992 (671/431) | 36,589 (50) | 14,271 (19) | 634 |

Note: GO = gene ontology; LCN = low-copy nuclear; SRA = Sequence Read Archive; TSA = Transcriptome Shotgun Assembly.

[a] Sequence Read Archive (SRA) accession numbers for SRA submission SRP074674 (http://www.ncbi.nlm.nih.gov/sra/SRP074674).

[b] BLAT search with MapMan categories.

[c] Number of LCN markers detected in MarkerMiner (Chamala et al., 2015), contigs longer than 600 bp.
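
The N50 and mean/median contig lengths reported in Table 1 can be recomputed from a list of contig lengths. The following is a minimal sketch, assuming the contig lengths have already been extracted from an assembly FASTA file; the function names are illustrative and are not part of Trinity. The E90N50 statistic additionally requires expression estimates and is not attempted here.

```python
from statistics import mean, median


def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def summarize(contig_lengths):
    """Return (N50, mean length, median length) for one assembly."""
    return n50(contig_lengths), mean(contig_lengths), median(contig_lengths)


# Toy example (hypothetical lengths, not data from this study).
toy_lengths = [100, 200, 300, 400, 500]
print(summarize(toy_lengths))
```

In the toy example the two longest contigs (500 + 400 bp) already cover half of the 1500 assembled bases, so the N50 is 400 bp.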


Quality control, preprocessing of reads, assembly, and Blast2GO analyses

The following analyses were carried out on each of the eight individuals separately. Demultiplexed Illumina sequencing results were retrieved in FASTQ format via FTP from BGI. Between 12.8 and 13.5 million paired-end reads were generated per individual in both the forward and reverse directions (Table 1), from which single reads, adapters, and reads with a quality score (QC) cutoff of less than 20 had already been removed. After testing the effect of different QC cutoffs on the resulting sequence reads and assemblies of V. trichadena Jord. & Fourr., we used QC = 40 in the bash script TrimClip.sh (De Wit et al., 2012) to remove reads QC < 40. Reads were screened for contaminant sequences from H. sapiens, E. coli, mtDNA, and cpDNA using mirabait (MIRALIB version 4.0; Chevreux et al., 1999) with default settings, the respective databases downloaded from NCBI, and then removed. We used QualityStats.sh (De Wit et al., 2012) and the Galaxy web interface (Afgan et al., 2016) to summarize quality score and nucleotide distribution data for the forward and reverse reads, CollapseDuplicateCount.sh (De Wit et al., 2012) to calculate the fraction of duplicate reads and singletons, PECombiner.sh (De Wit et al., 2012) to remove orphan reads and put remaining reads in the same order in forward and reverse files, and the Velvet helper script shuffleSequences_fastQ.pl to put those two files together in one interleaved file (necessary for Velvet/Oases assembly). The resulting clean sequence reads were assembled de novo using several different assemblers including Trinity, trans-ABySS, SOAPdenovo-Trans, and Velvet/Oases. Relative to the other assemblers, Trinity produced more hits with >80% similarity to contigs >600 bp against Arabidopsis thaliana (L.) Heynh. (data not shown; comparisons done using MarkerMiner 1.0 [Chamala et al., 2015]). 
Therefore, we chose the de novo assemblies produced using Trinity version r20140717 (Grabherr et al., 2011, compiled for 64-bit Ubuntu) using default settings on the resulting clean sequence reads. For the purposes of marker development, a highly accurate discrimination of homoeologs in polyploids is not necessary at the transcriptome assembly stage, as the discrimination is done in the second resequencing step. Additional comparisons of different assemblers and assembly pipelines, particularly regarding polyploid transcriptomes, were outside the scope of the current study and will be addressed in a subsequent study (Meudt et al., unpublished data). Trinity assemblies of all four New Zealand, all four European, and all eight Veronica individuals were also made. Table 1 shows information about the sequence reads and statistics from the eight different individual Trinity assemblies. Functional annotation of contigs from the different assemblies was conducted using BLAT (Kent, 2002) with default settings against the TAIR database (version 10 represented gene model from 2011-01-03; Lamesch et al., 2012) and MapMan hierarchical categories (Ath_AGI_LOCUS_TAIR10_Aug2012; http://mapman.gabipd.org/web/guest/mapmanstore). Mean contig length ranged from 635–742 bp, N50 value from 916–1133 bp, E90N50 value from 1099–1307 bp (which is computed with the contig_ExN50_statistic.pl script of the Trinity package and represents the N50 of 90% of the expressed transcripts), and number (and percentage) of contigs with positive BLAST hits from 24,583–39,915 (42–63%). To demonstrate the quality and utility of the transcriptomic resources developed here, we compared the transcriptome sequences of our eight sampled individuals relative to each other and to two outgroups, Picrorhiza kurrooa (http://scbb.ihbt.res.in/Picro_information/; SRR392742; Gahlan et al., 2012) and Plantago ovata (SRR629688; Kotwal et al., 2016). 
To do this, we mined the data from these 10 individuals for SNPs using Site Identification from Short Read Sequences (SISRS) version 1.0 (Schwartz et al., 2015; https://github.com/rachelss/SISRS/releases). SISRS identifies SNPs for phylogenetic studies directly from raw high-throughput sequences without a reference genome and without a priori knowledge of potentially informative loci. Briefly, SISRS first assembles raw sequence reads into a “composite genome” using Velvet, maps the raw reads and individual contigs against this composite genome with Bowtie 2, and then calls SNPs with a Python script (Schwartz et al., 2015). SNP discovery was performed using SISRS on four different data sets: (1) all eight Veronica individuals combined plus P. kurrooa and P. ovata as outgroups, (2) all eight Veronica individuals only, (3) the four New Zealand individuals only, and (4) the four European individuals only. The SNP data were converted to NEXUS format and analyzed using NeighborNet networks (SplitsTree version 4.14.2; Huson and Bryant, 2006). In addition, GARLI version 2.01.1067 (Zwickl, 2006) was used for phylogenetic tree reconstruction under maximum likelihood. We first performed a GARLI run with 10 replicates to estimate the model parameters for the model of evolution estimated with jModelTest version 2.1.5 (012010F; Darriba et al., 2012) [setting ratematrix = a b c a b a statefrequencies = estimate]; six of the 10 replicates had the same best lnL score. These estimated model parameters were then fixed for a bootstrap analysis, which was performed with 1000 replicates [parametervaluestring = M1 r 1.00000 7.30163 1.61422 1.00000 7.30163 1.00000 e 0.27231 0.22561 0.22541 0.27667]. The resulting tree was compared to previously published phylogenetic estimates.
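
As an illustration of the orphan-removal and interleaving step in the read preprocessing described in this section, the sketch below pairs forward and reverse reads by identifier, drops reads whose mate is missing, and writes the pairs in alternating order. It is not the code of PECombiner.sh or shuffleSequences_fastQ.pl; the `/1` and `/2` header suffixes and the in-memory tuple representation are assumptions made for the example.

```python
def base_id(header):
    """Strip a trailing '/1' or '/2' mate suffix (assumed naming scheme)."""
    return header[:-2] if header.endswith(("/1", "/2")) else header


def interleave_pairs(forward, reverse):
    """Return reads as [fwd1, rev1, fwd2, rev2, ...], keeping only pairs.

    Each read is a (header, sequence, quality) tuple. Forward reads with
    no reverse mate (orphans) are dropped, and reverse reads are reordered
    to follow the order of the forward file.
    """
    reverse_by_id = {base_id(read[0]): read for read in reverse}
    interleaved = []
    for fwd in forward:
        mate = reverse_by_id.get(base_id(fwd[0]))
        if mate is not None:
            interleaved.extend([fwd, mate])
    return interleaved


# Toy reads (hypothetical): r2 has no reverse mate and r4 has no forward mate.
forward_reads = [("r1/1", "ACGT", "IIII"), ("r2/1", "GGCC", "IIII"), ("r3/1", "TTAA", "IIII")]
reverse_reads = [("r3/2", "TTAA", "IIII"), ("r1/2", "ACGT", "IIII"), ("r4/2", "CCGG", "IIII")]
for header, _, _ in interleave_pairs(forward_reads, reverse_reads):
    print(header)
```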

Marker development

Two different types of markers were developed from the Veronica transcriptome resources generated here, LCN and SSR markers.

Low-copy nuclear markers

MarkerMiner was used with default settings to identify LCN markers from a curated set of conserved ortholog set (COS) loci (De Smet et al., 2013). MarkerMiner was developed and tested using transcriptome assemblies from 77 Lamiales species (including six from Plantaginaceae; Chamala et al., 2015), and uses a reciprocal BLAST of all transcriptomes with one another and to the reference A. thaliana genome. Arabidopsis thaliana (Brassicales) is the phylogenetically closest reference available in MarkerMiner to Veronica (Lamiales). Of the 1228 loci returned, 73 were classified as being “strictly” and 1155 as “mostly” single copy. MAFFT alignments of the 330 loci found in six or more individuals, of which 15 were “strictly” and 314 “mostly” single copy, were used to develop primers in Geneious (version 8.7) with Primer3 (Untergasser et al., 2012), aiming for a melting temperature of 60°C. Loci were checked manually for large introns in Geneious by comparing the MarkerMiner alignment to A. thaliana. We chose 13 “strictly” single-copy loci with a successful primer search and 35 additional “mostly” single-copy loci with successful primer searches such that all five A. thaliana chromosomes were equally represented in this marker set. These 48 loci were validated using Fluidigm microfluidic PCR and Illumina MiSeq amplicon sequencing of 48 individuals representing 46 Veronica species (19 from the Southern Hemisphere) and all subgeneric lineages (Appendix 2). The combination of this technique with Illumina MiSeq amplicon sequencing of 300-bp paired-end reads has proven useful and highly efficient in recent studies for development of novel and effective nuclear sequencing markers and improving understanding of phylogenetic relationships in nonmodel genera (Gostel et al., 2015; Uribe-Convers et al., 2016). 
This method enables the amplification of 48 samples and 48 primer pairs in 4-μL reaction volumes, such that the total volume of these reactions equals, e.g., 10 standard 25-μL reaction volumes. Each reaction contained 2 ng DNA, 200 nM of each primer, 0.1 μL 5 U/μL VELOCITY DNA polymerase (Bioline, Luckenwalde, Germany), 1× buffer, 0.1 μL 10 mM dNTPs, 0.25 μL 1 M DMSO, and 0.5 μL 5 M betaine. The samples were initially denatured for 2 min at 98°C; followed by 45 cycles of denaturation for 15 s at 98°C, annealing for 30 s at 55°C, elongation for 30 s at 72°C; and finalized with 5-min elongation at 72°C. Preliminary testing showed that more cycles were necessary due to some low-quality DNA samples. Barcoding and Illumina sequencing were done by LGC Genomics (Berlin, Germany) with Illumina MiSeq v3 chemistry. For each LCN locus, resulting sequences were trimmed with BBMap tools (https://sourceforge.net/projects/bbmap/), de novo assembled with CAP3 (99% identity; Huang and Madan, 1999), aligned to the respective locus sequence of the transcript with MAFFT (setting E-INS-i; Katoh and Standley, 2013), and examined in Geneious for sequence length, similarity to the original transcript and the A. thaliana gene, and number of individuals successfully sequenced. In addition, the alignment was exported to GARLI, in which numbers of sequences, SNPs, and parsimony informative characters (PICs) were calculated. For one randomly chosen example LCN marker, a phylogeny was reconstructed using the same settings as described above for SNPs.
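
To make the SNP and PIC counts concrete, the sketch below classifies alignment columns using the standard definitions: a column is variable if it contains at least two different nucleotides, and parsimony informative if at least two nucleotides each occur in at least two sequences. This is an illustrative reimplementation under those assumptions, not the routine used by GARLI or Geneious; gaps and ambiguity codes are simply ignored.

```python
from collections import Counter


def count_variable_sites(alignment):
    """Return (variable_sites, parsimony_informative_sites).

    `alignment` is a list of aligned sequences of equal length.
    Only A, C, G, and T are counted; gaps and ambiguity codes are skipped.
    """
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in (c.upper() for c in column) if b in "ACGT")
        if len(counts) < 2:
            continue  # invariant column
        variable += 1
        # Informative: at least two states, each seen in two or more sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Toy alignment (hypothetical sequences, not data from this study).
toy_alignment = ["ACGTA", "ACGTA", "ATGCA", "ATGCG"]
print(count_variable_sites(toy_alignment))
```

In the toy alignment, columns 2 and 4 are parsimony informative, whereas column 5 is variable but uninformative because the G occurs in only one sequence.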

Simple sequence repeats

Numerous SSRs were identified from Trinity assembly of the New Zealand individuals only using QDD version 3.1 (Meglécz et al., 2014; Table 1). Settings for the search were a length of 250–350 bp of the locus and primer melting temperatures of 59–61°C. After filtering for quality (taking QDD categories A and B), repeats (removing dinucleotides for example), and length of predicted PCR product, 48 loci were chosen from the 1124 potential SSRs with primer sites found by QDD. These were validated using Fluidigm microfluidic PCR and Illumina MiSeq amplicon sequencing (see above) of 48 individuals representing 20 Australasian species and one interspecific hybrid (Appendix 3). For each SSR marker, which included the SSR repeat area and flanking regions, resulting sequences were analyzed in the same way as the LCN data (see above) and examined in Geneious and GARLI regarding SSR motif, sequence length, number of individuals successfully sequenced, number of alleles sequenced, and pairwise genetic distance. In addition, for one randomly chosen example SSR locus, the alignment was exported to GARLI and a phylogeny was reconstructed using the same settings as described above for SNPs.
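
The kind of motif search and filtering described above can be sketched with a regular expression that finds perfect tandem repeats. This is not the QDD algorithm; the motif length range (2–6 bp), the minimum of four repeats, and the function names are assumptions chosen for illustration. Setting `min_motif=3` mimics the removal of dinucleotide repeats.

```python
import re


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g., ATAT, AAA)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (motif, repeat_count, start) for perfect tandem repeats."""
    pattern = re.compile(
        rf"([ACGT]{{{min_motif},{max_motif}}}?)\1{{{min_repeats - 1},}}"
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if is_compound(motif):
            continue  # skip homopolymers and shorter motifs in disguise
        hits.append((motif, len(match.group(0)) // len(motif), match.start()))
    return hits


# Toy sequence (hypothetical): an (AT)5 and an (AAG)4 repeat with short flanks.
toy_sequence = "GGC" + "AT" * 5 + "GGC" + "AAG" * 4 + "TTC"
print(find_ssrs(toy_sequence))
print(find_ssrs(toy_sequence, min_motif=3))
```

The lazy quantifier makes the search prefer the shortest motif that satisfies the repeat threshold, so (AT)8 is reported as a dinucleotide rather than as (ATAT)4.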

RESULTS

Transcriptomes

Functional annotation of individual assemblies was similar for each of the eight individuals, with gene ontology (GO) terms assigned to 13,009–14,271 contigs (19–33%; Table 1). There was large overlap of annotated contigs of the different assemblies whether looking at assemblies of individuals of New Zealand species only (26,524 or 89.4% shared annotated contigs; Fig. 1A), European species only (25,456 or 87.8%; Fig. 1B), or all New Zealand vs. all European species (29,839 or 94.3%; Fig. 1C). On the other hand, individual species had 114–453 (0.4–1.6%) unique annotated contigs relative to other species from the same geographical area, and the numbers for New Zealand and European species were comparatively very similar (compare Fig. 1A and 1B). Within the New Zealand species, V. hectorii Hook. f. and V. ochracea (Ashwin) Garn.-Jones shared the most unique annotated contigs (234 or 0.8%) relative to the other five species pairs, whereas V. catarractae G. Forst. and V. ochracea shared the fewest (110 or 0.4%; Fig. 1A). Within the European species, the species pair with the most unique shared annotated contigs was V. panormitana Tineo ex Guss. (2x) and V. cymbalaria Bodard (6x) (238 or 1.1%), whereas the two diploids V. panormitana and V. trichadena shared the fewest (53 or 0.2%) (Fig. 1B).
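
Shared and unique annotation counts of the kind shown in the Venn diagrams can be derived with set operations. The sketch below assumes each assembly is represented by the set of reference gene identifiers hit by its annotated contigs, and that percentages are expressed relative to the union of all sets; both are assumptions for illustration, and the identifiers are placeholders.

```python
def overlap_summary(annotations):
    """Summarize shared and unique annotations across assemblies.

    `annotations` maps an assembly name to a set of annotation identifiers.
    Returns the number and percentage (of the union) of identifiers shared
    by all assemblies, and the number unique to each assembly.
    """
    if not annotations:
        raise ValueError("no assemblies supplied")
    sets = list(annotations.values())
    union = set().union(*sets)
    shared = set.intersection(*sets)
    unique = {}
    for name, ids in annotations.items():
        others = set().union(*(s for n, s in annotations.items() if n != name))
        unique[name] = len(ids - others)
    shared_pct = 100 * len(shared) / len(union) if union else 0.0
    return {"shared": len(shared), "shared_pct": shared_pct, "unique": unique}


# Toy input (placeholder identifiers, not data from this study).
toy_annotations = {
    "species_A": {"g1", "g2", "g3", "g4"},
    "species_B": {"g1", "g2", "g3", "g5"},
    "species_C": {"g1", "g2", "g6"},
}
print(overlap_summary(toy_annotations))
```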
Fig. 1.

Venn diagrams showing the number of annotated contigs from the Veronica Trinity assemblies. (A) Four New Zealand individuals. (B) Four European individuals. (C) All New Zealand vs. all European individuals.

GO term results were also very similar; of 35 GO categories, the numbers of unique transcripts largely overlapped for all species pairs, as shown for the most divergent species pair of the eight transcriptomes sequenced (i.e., New Zealand V. hectorii vs. European V. panormitana; Fig. 2). The GO categories with the largest numbers of unique transcripts (ca. 500–3000) for these Veronica leaf transcriptomes were (from highest to lowest) “not assigned,” “protein,” “RNA,” “signaling,” “transport,” “misc,” “cell,” and “DNA” (Fig. 2A).
Fig. 2.

(A) Number of unique transcripts (x axis) for each of 35 hierarchical gene ontology (GO) categories (y axis) for the Trinity assemblies of leaf transcriptome data from one individual each of Veronica panormitana (European diploid, *) and V. hectorii (New Zealand hexaploid, ○). (B) Comparison of number of genes for V. panormitana vs. V. hectorii. Results from the other six individuals were very similar (data not shown).

SNP discovery using SISRS resulted in the following numbers of SNPs and potential PICs: 10-individual data set including outgroups (29,738 SNPs, 8746 PICs), eight Veronica individuals only (45,751 SNPs, 40,217 PICs), four New Zealand individuals only (41,167 SNPs, 2302 PICs), and four European individuals only (65,278 SNPs, 1735 PICs). When the 10-individual data set was analyzed using SplitsTree (Fig. 3A–C), the NeighborNet network clearly showed a main split between all Veronica transcriptomes vs. the two outgroups (Fig. 3A). Although some reticulation was present among the New Zealand species (Fig. 3C), reticulation is more pronounced among the European individuals (Fig. 3B), which comprise two allopolyploids and their putative diploid parental species. The phylogenetic analysis of the same data set contained moderate to high support for all branches in the phylogeny (Fig. 3D). Among the New Zealand individuals, V. hectorii (6x) and V. ochracea (18x) are very closely related to each other; V. ochracea may be an allopolyploid of V. hectorii and another unsampled species (Wagstaff and Wardle, 1999). Within the European lineage, V. cymbalaria (4x) is positioned between both diploid parental species as expected (Albach, 2007; Fig. 3B, 3D), and we suspect V. cymbalaria (6x) to be a backcross allopolyploid of V. cymbalaria (4x) × V. panormitana (2x) based on its greater similarity to that species than to V. trichadena (Fig. 1; 328 vs. 132 unique annotated contigs).
Fig. 3.

Network and phylogenetic analyses of SNPs mined from leaf transcriptome data using SISRS for eight individuals of Veronica and two outgroups. (A) SplitsTree NeighborNet network. (B) Detail of network showing relationships of the four New Zealand Veronica individuals. (C) Detail of network showing relationships of the four European Veronica individuals. (D) GARLI phylogenetic tree with bootstrap values from 1000 replicates.

Between 3 and 44 (average: 23.4, median: 22) of the 48 individuals were successfully sequenced for each of the 48 LCN loci, with 22 of 48 (46%) loci successfully amplifying in at least 24 (>50%) individuals (Appendix 4). For each individual, 4–40 loci were successfully amplified (average: 23.4, median: 21.5), and again fewer than half of the individuals (22/48, 46%) had successful amplification of at least 25 (>50%) loci (data not shown). Only one-quarter (11/48) of the loci aligned well with the corresponding transcript; these loci had mean lengths of 327–480 bp, contained large numbers of SNPs and PICs, and BLASTed to known A. thaliana genes (Appendix 4). Figure 4 shows an alignment and GARLI tree of sequences from 22 of the 42 individuals successfully sequenced for one randomly chosen example locus, LCN-04 (two outgroups plus 10 European and 10 New Zealand Veronica individuals/species). Nearly twice as many different sequences were generated for the 10 New Zealand individuals shown here (27 sequences; 6x or 18x, “V. townsonii E6 18” to “V. melanocaulon D5 11” in the tree) relative to the 10 European individuals (14 sequences; 2x or 4x, “V. missurica E2 13” to “V. chamaedrys C3 10”). Of the 10 New Zealand individuals, five have only one sequence and are all in the same clade (V. albiflora (Pennell) Albach, V. cupressoides Hook. f., V. densifolia F. Muell., V. lavaudiana Raoul, and V. senex (Garn.-Jones) Garn.-Jones), whereas the other five have 2–8 sequences that fall into the first clade or one of two other clades (Fig. 4). 
As another example highlighting the low-copy nature of the loci that were sequenced, locus LCN-38 has two orthologous copies, which is expected due to the categorization of the A. thaliana gene AT3G59380 as “mostly” single copy (data not shown; comparisons done using MarkerMiner 1.0 [Chamala et al., 2015]). Additional phylogenetic analyses of the other LCN loci are outside the scope of this study and will be performed elsewhere (Meudt et al., unpublished data).
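The per-locus success summary reported above can be recomputed from the "No. of individuals successfully sequenced" column of Appendix 4. The following is a minimal sketch in Python; the function and variable names are illustrative assumptions and are not part of the original analysis.

```python
# Individuals (of 48) successfully sequenced per locus, LCN-01 to LCN-48,
# transcribed from Appendix 4 (list position = locus number).
SEQUENCED = [
    17, 22, 18, 42, 12, 33, 0, 27, 16, 35, 3, 20,
    18, 13, 28, 30, 11, 24, 31, 44, 9, 21, 24, 13,
    43, 32, 18, 16, 11, 22, 19, 39, 28, 12, 9, 31,
    7, 42, 36, 43, 13, 9, 30, 14, 36, 44, 22, 34,
]

counts = {f"LCN-{i:02d}": n for i, n in enumerate(SEQUENCED, start=1)}


def loci_meeting_threshold(per_locus, min_individuals):
    """Return the loci sequenced in at least `min_individuals` individuals."""
    return sorted(locus for locus, n in per_locus.items() if n >= min_individuals)


# Threshold of 24 individuals (half of the 48 sampled), as used in the text.
passing = loci_meeting_threshold(counts, 24)
share = round(100 * len(passing) / len(counts))
print(f"{len(passing)} of {len(counts)} loci ({share}%)")  # 22 of 48 loci (46%)
```

Changing the second argument of `loci_meeting_threshold` shows how sensitive the count of usable loci is to the chosen cutoff.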
Fig. 4.

MAFFT alignment and GARLI phylogenetic tree (visualized in Geneious) for 22 of the 42 individuals (two outgroups plus 10 European and 10 New Zealand Veronica) for which sufficient sequence reads of the correct locus were successfully generated from sequences of LCN locus LCN-04 mined using MarkerMiner from Trinity assemblies of leaf transcriptome data. The consensus and identity sequences are shown at the top. Base pairs that are identical to the consensus are shown in gray, whereas SNPs are shown as colors (red = A, blue = C, green = T, yellow = G, black = N). For each sequence in the alignment, species names are followed by sequencing plate location (e.g., D1) and number of sequence reads supporting that allele (range: 10–424). Green branches in the GARLI tree to the left of the individual names have >80% bootstrap support (see Fig. 3 for GARLI settings). Voucher information is shown in Appendix 2.

Overall, 3–47 (mean: 37.7, median: 44.5) of 48 individuals were successfully sequenced for each of the 48 SSR loci, including 40 of 48 loci that were successfully sequenced for at least 26 (>50%) individuals (Appendix 3). For each individual, 0–43 loci were successfully sequenced (average: 37.7, median: 40), with all but two individuals having at least 29 (>60%) loci successfully sequenced (individuals V. catarractae B1 and V. colostylis H3 failed for all 48 and 40 loci, respectively). In general, sequences ranged from 98–851 bp in length (average: 324) and contained one or more length- and/or sequence-variable SSR motifs as well as flanking SNPs and indels within and among individuals (e.g., Fig. 5). The number of sequenced alleles (each supported by at least 10 raw sequencing reads) per individual ranged from 1–39 (mean: 4.32, median: 3.0, n = 47), with the lower polyploids having fewer alleles than the higher polyploids (6x, mean: 3.96, n = 37; 12x, mean: 5.25, n = 5; 18x, mean: 6.09, n = 5).
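The read-support criterion for alleles (at least 10 raw reads) can be illustrated with a short Python sketch. It assumes that identical reads are simply collapsed and counted; the allele-calling procedure actually used is not described in this section, and the reads below are hypothetical, not real data.

```python
from collections import Counter

MIN_READS = 10  # read-support threshold stated in the text


def call_alleles(read_sequences, min_reads=MIN_READS):
    """Collapse identical reads and keep sequences seen at least `min_reads` times."""
    support = Counter(read_sequences)
    return {seq: n for seq, n in support.items() if n >= min_reads}


# Hypothetical reads for one individual at one SSR locus.
reads = ["ACACACAGT"] * 25 + ["ACACACACAGT"] * 12 + ["ACACTCAGT"] * 3
alleles = call_alleles(reads)
print(len(alleles))  # 2 alleles pass; the 3-read variant is discarded
```

A threshold of this kind trades sensitivity for robustness: rare true alleles may be lost, but low-coverage sequencing errors and many PCR artifacts are excluded.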
Fig. 5.

MAFFT alignment and GARLI phylogenetic tree (visualized in Geneious) of 54 sequences for a subset of eight New Zealand Veronica individuals of V. chionohebe, V. trifida, and their interspecific hybrid from two South Island locations from sequences of SSR locus SSR-08 mined using QDD from Trinity assemblies of leaf transcriptome data. Consensus and identity sequences are shown at the top. Base pairs that are identical to the consensus are shown in gray, whereas SNPs are shown as colors (red = A, blue = C, green = T, yellow = G, black = N). Each of the eight individuals has a unique color: three individuals of V. chionohebe (orange, red, and brown), two of V. trifida (blue, pink), and two of their hybrid (light and dark green). For each sequence in the alignment, species names are followed by location (Garvie Mountains or Pisa Range), sequencing plate location (A5, B5, C4, D4, E4, F4, G4, or H4), and number of sequence reads supporting that allele (range: 12–187). Green branches in the GARLI tree to the left of the individual names have >80% bootstrap support (see Fig. 3 for GARLI settings). Voucher information is shown in Appendix 3.

As the focus of SSRs is often population genetics, we analyzed two subsets of the larger SSR data set in more detail, i.e., eight individuals of V. chionohebe Garn.-Jones (4), V. trifida Petrie (2), and their interspecific hybrid (2) (all 2n = 42) (Appendix 5), and six individuals of V. thomsonii Cheeseman (2n = 42), respectively (Appendix 6). For all loci in the two subsets, sequences averaged 317–327 bp in length, with 1–26 alleles (mean: 4.0–4.3), 54–80 SNPs, and 41.7–52.5 PICs (see “Totals” rows in Appendix 5 and 6). Figure 5 shows an alignment of 54 different SSR sequences from one locus (SSR-08) of the eight-individual V. chionohebe/V. trifida subset. In locus SSR-08, the sequences ranged from 311–387 bp (average: 357 bp). The sampled individuals had on average 6.8 alleles, and individuals of V. chionohebe had half as many unique alleles (3–6 each) as individuals of V. trifida and the interspecific hybrid (8–10). The sequences of locus SSR-08 were highly variable (note the many colored bars in the alignment in Fig. 5), with 126 SNPs and 109 PICs, and 0–0.14 pairwise genetic distances (mean and median: 0.08) (Appendix 5). In the phylogenetic tree, there is support for some taxonomic clustering of sequences of V. chionohebe and V. trifida, respectively, with hybrid sequences in highly supported clades with V. chionohebe or V. trifida in three vs. four cases, respectively (see tree in Fig. 5). Additional analyses of the other SSR loci are outside the scope of this study and will be performed elsewhere (Meudt et al., unpublished data).
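To make the variability statistics reported above concrete, the following Python sketch counts variable sites (SNPs) and parsimony-informative characters (PICs) in an alignment and computes an uncorrected pairwise distance. It uses the standard definition of a PIC (at least two character states, each present in at least two sequences) and assumes a simple p-distance; the software and distance model actually used for the appendices are not stated here, and the alignment is a toy example.

```python
from collections import Counter

BASES = set("ACGT")


def column_states(column):
    """Count nucleotide states in one alignment column, ignoring gaps and N."""
    return Counter(base for base in column if base in BASES)


def variable_sites(alignment):
    """Number of columns with more than one nucleotide state."""
    return sum(len(column_states(col)) > 1 for col in zip(*alignment))


def parsimony_informative_sites(alignment):
    """Number of columns where at least two states each occur in >= 2 sequences."""
    total = 0
    for col in zip(*alignment):
        shared = [n for n in column_states(col).values() if n >= 2]
        total += len(shared) >= 2
    return total


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among positions unambiguous in both sequences."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a in BASES and b in BASES]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical sequences, not data from this study).
toy = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACCTACGT", "ACGTTCGT"]
print(variable_sites(toy), parsimony_informative_sites(toy))  # 3 2
print(p_distance(toy[0], toy[1]))  # 0.125
```

In the toy alignment, the fifth column varies in only one sequence, so it counts as a SNP but not as a PIC.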

DISCUSSION

The development of transcriptomic and genomic resources and variable genetic markers in so-called natural “mesopolyploid” species radiations is key to addressing fundamental questions about polyploidy and diversification. For polyploids, functional genomic resources in particular are important to facilitate the study of gene evolution. Veronica is an example of a natural mesopolyploid species radiation that to date has lacked such genomic and genetic resources, and this has hindered progress in studying polyploid evolution at the population, species, and generic levels. The transcriptomic and genetic resources developed here will make further detailed studies regarding the role of polyploidy in adaptation and species diversification in Veronica possible.

In the current study, we sequenced and assembled leaf transcriptomes from eight individuals representing seven species of Veronica from polyploid species radiations in Europe and New Zealand. There was high overlap of annotated contigs (Fig. 1) and GO terms (Fig. 2) among the eight individuals, as well as good phylogenetic resolution in the network and phylogenetic analyses of SNPs generated using SISRS (Fig. 3). An outstanding challenge with de novo transcriptome assemblies of polyploids is differentiating homoeologs from orthologs; however, this was not an issue for developing markers in polyploid Veronica from our transcriptome assemblies, as phylogenetic relationships (Fig. 3) are consistent with hypothesized relationships and previous phylogenetic results. Such results demonstrate the utility of these transcriptomic resources for phylogenetic studies, functional analyses across the genus using reverse transcription PCR, or for further comparative transcriptomic analyses of the sampled natural allopolyploids and their diploid parental species in the two main centers of diversity for Veronica (i.e., Europe and New Zealand). The large number of transcripts unique to hexaploid V. cymbalaria (453) relative to other individuals representing species from which it likely derived (V. trichadena: 114 and V. panormitana: 195) is surprising and opens the door to studies of differential expression and functional differentiation of genes in polyploids. Common garden experiments are also planned, which will allow comparison of other individuals with the eight sequenced here.

Furthermore, the SSR and LCN genetic markers developed here from the transcriptomes, and validated using microfluidic PCR and high-throughput sequencing, are highly variable and will be extremely useful in future phylogenetic studies of Veronica as a whole, as well as studies at the interface of inter- and intraspecific levels of New Zealand Veronica (e.g., phylogenetic, phylogeographic, and population genetic studies). From 330 mostly or strictly “low copy” loci common to 6–8 of the sequenced transcriptomes, we developed and sequenced 48 LCN markers in 48 individuals representing all subgeneric lineages in Veronica. Of the 22 LCN markers that were successfully sequenced for ≥50% of individuals, 11 aligned well with the corresponding transcripts, were on average 394 bp long, contained large numbers of SNPs and PICs, and BLASTed to known A. thaliana genes. These 11 LCN markers are excellent candidates for reconstructing a better-resolved phylogeny of Veronica. In addition, of the 1124 SSRs identified in the four New Zealand Veronica individuals, we validated 48 in 48 Southern Hemisphere Veronica individuals, 40 of which were successfully sequenced for >50% of individuals. Sequenced SSRs and their flanking regions were on average 324 bp long, contained numerous SNPs and PICs, and had mean pairwise genetic distances of 0.01–0.18. The variation seen, particularly in the flanking regions of the sequenced SSRs, is equal to or much greater than that from previous studies using standard DNA sequencing and genotyping markers (e.g., Wagstaff et al., 2002; Meudt and Bayly, 2008).
These 40 SSRs have great potential as highly variable sequencing markers (as opposed to being genotyped) at the interface of intra- and interspecific levels regarding questions of population genetics, species limits, and relationships of closely related species in New Zealand Veronica. Additionally, challenges presented by genotyping SSRs in polyploids, such as determining allele dosage and unambiguously identifying alleles (Pfeiffer et al., 2011), are overcome by sequencing the SSRs and their flanking regions, which we would recommend for future studies. For both the LCN and the SSR markers, future sequencing projects could be conducted either using traditional methods (PCR, cloning, sequencing) or using high-throughput sequencing. Furthermore, as biparental nuclear markers, the LCN markers and SSRs will be highly effective in elucidating complex relationships in polyploid Veronica.

In addition to the potential advances for Veronica, our methodological approach may also be useful for other natural polyploid groups that lack genomic or genetic resources. Natural species that are not associated with economically important crop or other “model” species often lack genomic resources and are very limited regarding the availability of variable genetic markers. Furthermore, developing and establishing such markers using traditional methods (e.g., López-González et al., 2015) can be tedious and time-consuming, with more effort required for fewer microsatellites developed. (As an aside, we found eight of the 12 reported SSR loci from López-González et al. [2015] in our transcript sequences, none of which met the quality criteria of our QDD pipeline for our New Zealand–focused sampled species.) There are nearly 4000 plant transcriptomes in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra/) and 1000 Plants (1KP) project (www.onekp.com; Matasci et al., 2014) online resources (Hodel et al., 2016). The 1KP transcriptomes have recently been used to develop SSRs for over 1000 plant species (Hodel et al., 2016), whereas Chapman (2015) published a method for the development and validation of 10 COS LCN loci for legume crop species from transcriptomes. By combining the standard wet laboratory approach of Chapman (2015) with a scalable, high-throughput microfluidic PCR strategy (Gostel et al., 2015; Uribe-Convers et al., 2016), we here show that screening of 48 SSR or LCN loci is possible in one microfluidic PCR. In fact, this approach could be scaled up from 48 loci to 480 loci, although the latter might have drawbacks, because 10 loci would then be amplified together in each multiplexed reaction, and the resulting de novo marker sequences can contain PCR chimeras. Nevertheless, combining Fluidigm microfluidic PCR and MiSeq amplicon sequencing of LCN and SSR markers, which were designed in MarkerMiner and QDD from transcriptomic data, is a relatively straightforward high-throughput marker validation method as well as an analysis pipeline that can be used on other natural (and polyploid) systems.
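The scaling arithmetic mentioned above (480 loci at 10 loci per multiplexed reaction) can be sketched as a simple pooling step. This Python example assumes 48 primer positions per run, inferred from the 48 loci screened here, and assigns loci round-robin; the function name and locus labels are hypothetical, and a real multiplex design would also need to check primer compatibility within each pool.

```python
def assign_multiplex_pools(loci, n_pools=48):
    """Distribute loci round-robin over `n_pools` multiplexed primer pools."""
    pools = [[] for _ in range(n_pools)]
    for index, locus in enumerate(loci):
        pools[index % n_pools].append(locus)
    return pools


# Hypothetical set of 480 loci to be screened in a single run.
loci = [f"locus-{i:03d}" for i in range(1, 481)]
pools = assign_multiplex_pools(loci)
print(len(pools), len(pools[0]))  # 48 10
```

Each pool here contains 10 primer pairs amplified together, which is the situation in which chimeric products between co-amplified loci become a concern.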
Appendix 1.

Information about the eight individuals of Veronica sampled for RNA-Seq.

Species [a] | GPS coordinates | Chromosome number (Ploidy) [b] | 1C-value (pg) [c] | Collection locality and collection no. (Voucher) [d] | RNA 260/280 ratio [e] | RNA conc. (ng/µL) [f] | RNA RIN [f]
Veronica catarractae G. Forst. | NA (cultivated plant) | 2n = 42 (6x) | 1.06 | Cult. Botanischer Garten Oldenburg (Germany), ex New Zealand, Meudt s.n. (OLD00026) | 2.11 | 1017.00 | 7.90
V. hectorii Hook. f. subsp. coarctata (Cheeseman) Garn.-Jones | NA (cultivated plant) | 2n = 40 (6x) | 1.07 | Cult. Botanischer Garten Bonn 1342 (Germany), ex New Zealand, Meudt s.n. (OLD00029) | 1.94 | 121.00 | 7.10
V. planopetiolata G. Simpson & J. S. Thomson | 44.52247°S, 168.6736916667°E | 2n = 84 (12x) | 2.45 | New Zealand: South Island, Otago, Meudt HMM339a (WELT SP091593) | 1.93 | 147.00 | 7.00
V. ochracea (Ashwin) Garn.-Jones | NA (cultivated plant) | 2n = 124 (18x) | 2.97 | Cult. Botanischer Garten Bonn 9509 (Germany), ex New Zealand, Meudt s.n. (OLD00071) | 2.12 | 1327.00 | 6.80
V. panormitana Tineo ex Guss. | 36.6672°N, 31.8989°E | 2n = 18 (2x) | 0.36 | Turkey: north of Paravallar, Albach 1114 & S272 (OLD00214) | 2.00 | 53.00 | 8.00
V. trichadena Jord. & Fourr. | 39.678536°N, 2.80062°E | 2n = 18 (2x) | 0.39 | Spain: Mallorca, Meudt HMM346L (OLD00086) | 1.98 | 302.00 | 7.50
V. cymbalaria Bodard | 36.5325°N, 31.99°E | 2n = 36 (4x) | 0.76 | Turkey: Alanya Castle, Albach 1235 (OLD01171) | 2.04 | 245.00 | 6.90
V. cymbalaria | 37.22778°N, 31.12972°E | 2n = 54 (6x) | 1.38 | Turkey: Anatalya, Selgedos, Albach 1087 & S300 (OLD00481) | 2.11 | 1265.00 | 7.60

Note: NA = not applicable.

[a] RNA was extracted from leaf material from greenhouse-grown material of all individuals except V. planopetiolata, which was from field-collected leaf material stored in RNAlater (Life Technologies, Carlsbad, California, USA).

[b] Chromosome numbers are from the literature (Albach et al., 2008).

[c] 1C-values (Meudt et al., 2015) were assessed for the same individual from which RNA was extracted for this study except for V. panormitana, whose 1C-value is based on the average of five other individuals from three different Turkish populations (range 0.35–0.37 pg; Meudt et al., 2015).

[d] Voucher specimens are lodged at herbaria at the Museum of New Zealand Te Papa Tongarewa (WELT) or Carl-von-Ossietzky Universität Oldenburg (OLD).

[e] RNA 260:280 ratio was calculated using the Tecan Infinite Pro F200 (Tecan, Crailsheim, Germany).

[f] RNA concentration and RNA Integrity Number (RIN) were calculated using the Agilent 2100 Bioanalyzer (Agilent Technologies, Waldbronn, Germany); please note that the cDNA construction was made with normalized RNA.

Appendix 2.

Information about the 48 individuals of Veronica sampled for the LCN marker validation.

Species | Subgenus | Ploidy [a] | Chromosome no. [a] | Country | Voucher (Herbarium and/or Herbarium accession no.) [b] | Location on sequencing plate | No. of LCN markers successfully sequenced (of 48 total)
Lagotis integrifolia (Willd.) Schischk. ex Vikulova | (outgroup) | 4 | 44 | Kazakhstan | Tribsch & Essl 10986 (WU) | D1 | 40
Paederota lutea L. f. | (outgroup) | 4 | 36 | Austria | Albach 209 (WU) | B1 | 32
Veronicastrum stenostachyum (Hemsl.) T. Yamaz. | (outgroup) | 4 | 34 | China | Albach 123 (K) | C1 | 29
Wulfenia carinthiaca Jacq. | (outgroup) | 2 | 18 | cult. | Albach 74 (BONN) | A1 | 36
Veronica anagallis-aquatica L. | Beccabunga | ? | ? | Czech Republic | 597087 (BRUENN) | F1 | 28
V. catenata Pennell | Beccabunga | 2 | 18 | Czech Republic | 597095 (BRUENN) | G1 | 36
V. gentianoides Vahl | Beccabunga | ? | ? | Georgia | Schneeweiss Geo02/43 (WU) | H1 | 27
V. arvensis L. | Chamaedrys | 2 | 16 | Germany | Albach 147 (WU) | B3 | 19
V. chamaedrys L. | Chamaedrys | 4 | 32 | Norway | Albach 121 (K) | C3 | 20
V. crista-galli Steven | Cochlidiosperma | 2 | 18 | Georgia | Dolmkanov 17.4.1983 (TBS) | G2 | 20
#V. cymbalaria Bodard | Cochlidiosperma | 4 | 36 | Turkey | Albach 1235 (OLD01171) | F4 | 12
#V. cymbalaria | Cochlidiosperma | 6 | 54 | Turkey | Albach 1087 (OLD00481) | G4 | 25
V. javanica Blume | Cochlidiosperma | 2 | 16 |  | Murata et al. 10050 (BM) | F2 | 4
#V. panormitana Tineo ex Guss. | Cochlidiosperma | 2 | 18 | Turkey | Albach 1114 (OLD00214) | D4 | 21
#V. trichadena Jord. & Fourr. | Cochlidiosperma | 2 | 18 | Spain | Meudt HMM346L (OLD00086) | E4 | 18
V. triloba (Opiz) Opiz | Cochlidiosperma | 2 | 18 | Turkey | Albach 242 (WU) | H2 | 21
V. brownii Roem. & Schult. | Labiatoides | 12 | 72 | Australia | NSW 285360 | B4 | 33
V. triphyllos L. | Pellidosperma | 2 | 14 | Russia | S434, BG Osnabrück, 961; RU, Altei, 1900 m | A3 | 12
V. cuneifolia D. Don | Pentasepalae | 2 | 16 | Turkey | Albach 1159 (OLD) | G3 | 21
V. fuhsii Freyn & Sint. | Pentasepalae |  |  | Turkey | Albach 897 (VANF, WU) | F3 | 32
V. prostrata L. | Pentasepalae | 2 | 16 | Austria | Albach 860 (MZJG) | E3 | 21
V. filiformis Sm. | Pocilla | 2 | 14 | Germany | Albach 144 (WU) | D3 | 18
V. longifolia L. | Pseudolysimachium | 4 | 34 | Turkey | Behcet 7435 (OLD) | C2 | 18
V. longifolia | Pseudolysimachium |  |  | UK | Sheahan 48 (K) | D2 | 18
V. schmidtiana Regel | Pseudolysimachium | 4 | 34 | Japan | Umezawa 20130 (WU) | A2 | 19
V. spicata L. | Pseudolysimachium | 8 | 68 | Austria | Bardy 60 (WU) | B2 | 11
V. fruticans Jacq. | Stenocarpon | 2 | 16 | UK | Viv Halcro VH030 (K) | A4 | 31
V. missurica Raf. | Synthyris | 4 | 24 | USA | Albach 124 (K) | E2 | 15
V. chamaepithyoides Lam. | Triangulicapsula | 4 | 24 | Spain | UA 174 (SALA) | H3 | 39
V. scutellata L. | Veronica | 4 | 36 | Austria | Dobes 7026 (WU) | E1 | 30
V. albiflora (Pennell) Albach | Pseudoveronica | 6 | 42 | New Guinea | Johns 8965 (K) | C4 | 36
V. baylyi Garn.-Jones | Pseudoveronica | 18 | 116 | New Zealand | Garnock-Jones PGJ 2868 (OLD) | C6 | 22
#V. catarractae G. Forst. | Pseudoveronica | 6 | 42 | New Zealand | Meudt HMM s.n. (OLD00026) | B5 | 13
V. colostylis Garn.-Jones | Pseudoveronica | 6 | 42 | New Zealand | Meudt HMM341C (OLD) | F5 | 20
V. cupressoides Hook. f. | Pseudoveronica | 6 | 42 | New Zealand | Garnock-Jones PGJ 2887 (OLD) | A6 | 26
V. densifolia F. Muell. | Pseudoveronica | 6 | 42 | New Zealand | Meudt HMM337A (WELT SP091591) | H5 | 36
#V. hectorii Hook. f. subsp. coarctata (Cheeseman) Garn.-Jones | Pseudoveronica | 6 | 40 | New Zealand | Meudt HMM s.n., cult. Bonn 13428 ex New Zealand (OLD00029) | H4 | 16
V. hulkeana F. Muell. ex Hook. f. | Pseudoveronica | 6 | 42 | New Zealand | Garnock-Jones PGJ 2874 (OLD) | H6 | 11
V. lavaudiana Raoul | Pseudoveronica | 6 | 42 | New Zealand | Garnock-Jones PGJ 2881 (OLD) | G5 | 29
V. macrantha Hook. f. | Pseudoveronica | 6 | 42 | New Zealand | Clarke s.n., cult. K 1969-35034 ex New Zealand (OLD) | B6 | 16
V. melanocaulon Garn.-Jones | Pseudoveronica | 6 | 42 | New Zealand | Garnock-Jones PGJ 2883 (OLD) | D5 | 29
#V. ochracea (Ashwin) Garn.-Jones | Pseudoveronica | 18 | 124 | New Zealand | Meudt HMM s.n., Bonn 9509 (OLD00071) | A5 | 26
V. pinguifolia Hook. f. | Pseudoveronica | 12 | 80 | New Zealand | Meudt HMM s.n. cult. Bonn 265 ex New Zealand (OLD) | D6 | 17
#V. planopetiolata G. Simpson & J. S. Thomson | Pseudoveronica | 12 | 84 | New Zealand | Meudt HMM339a (WELT SP091593) | C5 | 22
V. senex (Garn.-Jones) Garn.-Jones | Pseudoveronica | 6 | 42 | New Zealand | Garnock-Jones PGJ 2879 (OLD) | E5 | 28
V. speciosa R. Cunn. ex A. Cunn. | Pseudoveronica | 6 | 40 | New Zealand | Garnock-Jones PGJ 2878 (OLD) | F6 | 33
V. tairawhiti (B. D. Clarkson & Garn.-Jones) Garn.-Jones | Pseudoveronica | 12 | 80 | New Zealand | Garnock-Jones PGJ 2888 (OLD) | G6 | 9
V. townsonii Cheeseman | Pseudoveronica | 6 | 40 | New Zealand | Garnock-Jones PGJ 2901 (WELT SP103482) | E6 | 26

Note: LCN = low-copy nuclear.

[a] Ploidy and chromosome numbers are from the literature (Albach et al., 2008).

[b] Herbaria acronyms follow Thiers (2016).

# RNA-Seq sample.

Appendix 3.

Validation of 48 SSR markers on 48 individuals of 20 species of Southern Hemisphere Veronica subg. Pseudoveronica.

Species name | Section and informal group | Ploidy [a] | Chromosome no. [a] | Country | Voucher and collection locality (Herbarium and/or Herbarium accession no.) [b] | Location on sequencing plate | No. SSR loci successfully sequenced (of 48 total)
Veronica calycina R. Br. | sect. Labiatoides | 6 | 36 | Australia | RGC 19644, near Lithgow, NSW (NSW, OLD) | C3 | 40
V. derwentiana Andrews subsp. subglauca (B. G. Briggs & Ehrend.) B. G. Briggs | sect. Labiatoides | 6 | 40 | Australia | RGC 19649, near Lithgow, NSW (NSW, OLD) | D3 | 37
V. chionohebe Garn.-Jones | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1823, Pisa Range (WELT SP084028/A) | E4 | 42
V. chionohebe | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1824, Pisa Range (WELT SP084029) | F4 | 38
V. chionohebe | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1844, Garvie Mountains (WELT SP084043) | C4 | 40
V. chionohebe | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1845, Garvie Mountains (WELT SP084044) | D4 | 40
V. chionohebe × V. trifida Petrie | sect. Hebe, snow hebe × speedwell hebe hybrid | 6 | 42 | New Zealand | MJB 1848, Garvie Mountains (WELT SP084059) | G4 | 38
V. chionohebe × V. trifida | sect. Hebe, snow hebe × speedwell hebe hybrid | 6 | 42 | New Zealand | MJB 1849, Garvie Mountains (WELT SP084060/A) | H4 | 39
V. ciliolata (Hook. f.) Garn.-Jones subsp. ciliolata | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1696, Mt. Brewster (WELT SP083925) | D6 | 40
V. ciliolata subsp. ciliolata | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1813, Mt. Cook (WELT SP084020) | C6 | 43
V. ciliolata subsp. fiordensis (Ashwin) Meudt | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1673, Mt. Burns (WELT SP083910) | A6 | 40
V. ciliolata subsp. fiordensis | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1837, Livingstone Range (WELT SP084037) | B6 | 42
V. densifolia F. Muell. | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1805, Hunter Hills (WELT SP084053) | H6 | 41
V. densifolia | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1858, Garvie Mountains (WELT SP084058) | G6 | 38
V. pulvinaris (Hook. f.) Cheeseman | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1728, Temple Basin (WELT SP083950) | E6 | 42
V. pulvinaris | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1761, Mt. Arthur (WELT SP083968) | F6 | 37
V. thomsonii Cheeseman | sect. Hebe, snow hebe | 6 | 42 | New Zealand | HMM 259, Mt. St. Bathans (WELT SP085925) | F5 | 41
V. thomsonii | sect. Hebe, snow hebe | 6 | 42 | New Zealand | HMM 261, Mt. St. Bathans (WELT SP085937) | H5 | 40
V. thomsonii | sect. Hebe, snow hebe | 6 | 42 | New Zealand | HMM 265, Mt. St. Bathans (WELT SP085931) | G5 | 40
V. thomsonii | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1851, Garvie Mountains (WELT SP084047/A) | C5 | 43
V. thomsonii | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1852, Garvie Mountains (WELT SP084048) | D5 | 39
V. thomsonii | sect. Hebe, snow hebe | 6 | 42 | New Zealand | MJB 1853, Garvie Mountains (WELT SP084049) | E5 | 42
V. trifida Petrie | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | MJB 1841, Garvie Mountains (WELT SP084041) | A5 | 37
V. trifida | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | MJB 1842, Garvie Mountains (WELT SP084041) | B5 | 41
V. brachysiphon (Summerh.) Bean | sect. Hebe, hebe | 18 | 120 | New Zealand | PGJ 2902, cult. Otari (WELT SP103452) | G2 | 40
V. brachysiphon (as Hebe vernicosa in Kew Gardens) | sect. Hebe, hebe | 18 | 120 | New Zealand | HMM s.n., cult. Kew Gardens 1997-5679 (OLD) | H2 | 35
V. catarractae G. Forst. | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | PGJ 2875, cult. Wellington (OLD) | B1 | 0
#V. catarractae (purchased as Parahebe ‘Snow’) | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | HMM s.n., cult. Botanischer Garten Oldenburg (OLD00026) | A1 | 41
V. colostylis Garn.-Jones | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | HMM338a, Arrowtown (WELT SP091592) | H3 | 8
V. colostylis | sect. Hebe, speedwell hebe | 6 | 42 | New Zealand | HMM341c, Moke Creek (WELT SP091595) | G3 | 30
V. hectorii Hook. f. | sect. Hebe, hebe | 6 | 40 | New Zealand | PGJ 2910, cult. Otari (WELT SP103460) | D1 | 38
#V. hectorii subsp. coarctata (Cheeseman) Garn.-Jones | sect. Hebe, hebe | 6 | 40 | New Zealand | HMM s.n., Bonn 13428 (OLD00029) | C1 | 41
V. hulkeana F. Muell. ex Hook. f. subsp. evestita (Garn.-Jones) Garn.-Jones ‘Lena’ | sect. Hebe, sun hebe | 6 | 42 | New Zealand | PGJ 2874, cult. Wellington (OLD) | A4 | 32
V. lavaudiana Raoul | sect. Hebe, sun hebe | 6 | 42 | New Zealand | PGJ 2881, cult. Wellington (OLD) | B4 | 41
V. macrantha Hook. f. | sect. Hebe, unresolved, early branching | 6 | 42 | New Zealand | HMM s.n., cult. Kew Gardens 1969-35034 (OLD) | D2 | 36
V. macrantha | sect. Hebe, unresolved, early branching | 6 | 42 | New Zealand | PGJ 2924, cult. Otari (WELT SP103475) | C2 | 41
#V. ochracea (Ashwin) Garn.-Jones | sect. Hebe, hebe | 18 | 124 | New Zealand | HMM s.n., Bonn 9509 (OLD00071) | E1 | 42
V. ochracea | sect. Hebe, hebe | 18 | 124 | New Zealand | PGJ 2911, cult. Otari (WELT SP103461) | F1 | 36
V. ochracea ‘James Stirling’ | sect. Hebe, hebe | 18 | 124 | New Zealand | HMM s.n., cult. Kew Gardens 1992-1403 (OLD) | G1 | 39
V. odora Hook. f. (as Hebe vernicosa in Botanischer Garten Bonn) | sect. Hebe, hebe | 12 | 84 | New Zealand | HMM s.n., cult. Bonn 17475 (OLD) | A3 | 29
V. odora ‘New Zealand Gold’ | sect. Hebe, hebe | 12 | 84 | New Zealand | HMM s.n., cult. Kew Gardens 1989-2000 (OLD) | B3 | 40
#V. planopetiolata G. Simpson & J. S. Thomson | sect. Hebe, speedwell hebe | 12 | 84 | New Zealand | HMM339a, Shotover Saddle (WELT SP091593) | H1 | 42
V. planopetiolata | sect. Hebe, speedwell hebe | 12 | 84 | New Zealand | HMM339b, Shotover Saddle (WELT SP091593) | A2 | 36
V. planopetiolata | sect. Hebe, speedwell hebe | 12 | 84 | New Zealand | HMM339c, Shotover Saddle (WELT SP091593) | B2 | 43
V. salicornioides Hook. f. | sect. Hebe, hebe | 6 | 42 | New Zealand | HMM s.n., cult. Kew Gardens 1989-2004 (OLD) | F2 | 38
V. salicornioides | sect. Hebe, hebe | 6 | 42 | New Zealand | PGJ 2923, cult. Otari (WELT SP103474) | E2 | 42
V. vernicosa Hook. f. | sect. Hebe, hebe | 6 | 42 | New Zealand | PGJ 2925, cult. Otari (WELT SP103476) | E3 | 41
V. vernicosa | sect. Hebe, hebe | 6 | 42 | New Zealand | PGJ 2926, cult. Otari (WELT SP103477) | F3 | 39

[a] Ploidy and chromosome numbers are from the literature (Albach et al., 2008).

[b] Herbaria acronyms follow Thiers (2016). Voucher specimens are lodged at herbaria at the Museum of New Zealand Te Papa Tongarewa (WELT), Carl-von-Ossietzky Universität Oldenburg (OLD), or National Herbarium of New South Wales (NSW). Collection initials: MJB = Michael J. Bayly, HMM = Heidi M. Meudt, PGJ = Phil Garnock-Jones, RGC = R. G. Coveny.

# RNA-Seq sample.

Appendix 4.

Validation of 48 LCN markers on 48 individuals of 46 species of Veronica, representing all subgeneric lineages in the genus.

Locus | Primer sequences (5′–3′) | Sequence same as original transcript? | No. of individuals successfully sequenced | No. of different sequences in GARLI alignment | Length (bp, range) | Length (bp, mean) | No. SNPs | No. PICs | A. thaliana gene
LCN-03 | F: AGCAGTGCCTCTAGTCTGTTT; R: CCGCTAATGGCACCTGAATTG | complete | 18 | 37 | 158–866 | 480 | 607 | 345 | AT3G07080: EamA-like transporter family
LCN-04 | F: AGGTTTATACATTTGCGGCG; R: TTCCCGCACCCTCCAAAC | complete | 42 | 43 | 288–331 | 327 | 117 | 77 | AT3G07720: Galactose oxidase/kelch repeat superfamily protein
LCN-08 | F: CCCTCCAGAGAAGAGCTTAACG; R: GCCCTTTGCCTCCTCCATATAG | complete | 27 | 44 | 311–791 | 374 | 427 | 131 | AT4G17100: UNKNOWN
LCN-10 | F: GCAAAGACCAGTTCAAACTTTGAG; R: AGAGGCTTGCTGACCTTCAAC | complete | 35 | 68 | 234–820 | 440 | 523 | 303 | AT4G33460: ABC transporter family protein
LCN-13 | F: TCTAACTGGTTGTCATCCGCT; R: CCAAGGATCCAAGAGCCCATT | partial | 18 | 19 | 310–912 | 422 | 354 | 112 | AT5G65760: Serine carboxypeptidase S28 family protein
LCN-20 | F: GGCATACGTGAAGACCTGGG; R: AGCAACAATGGCACCACTTG | partial | 44 | 85 | 306–681 | 384 | 280 | 176 | AT1G57770: FAD/NAD(P)-binding oxidoreductase family protein
LCN-25 | F: AGGAGTGATTCGAGCAGTGC; R: ACTTGTTCCCCCAATCCACC | partial | 43 | 89 | 310–726 | 396 | 439 | 287 | AT2G05830: NagB/RpiA/CoA transferase-like superfamily protein
LCN-38 | F: AAGACCCTTGGAGGATGGGA; R: TAGTGCTCTTTCGCCACTCC | complete | 42 | 104 | 310–762 | 361 | 436 | 299 | AT3G59380: farnesyltransferase A
LCN-43 | F: TATGACTGCTGCTGGTCTTGG; R: AGACCACGTTCTAATTCGCCA | complete | 30 | 50 | 139–750 | 396 | 329 | 185 | AT4G35850: Pentatricopeptide repeat (PPR) superfamily protein
LCN-46 | F: TGCAACTCCTTTTTGGGGGT; R: ACTTCATCATGGGGGCAGTG | complete | 44 | 85 | 307–707 | 373 | 314 | 224 | AT5G13800: pheophytinase
LCN-48 | F: AAGGTAACGCCGCCAAGTAT; R: TGCGCAGTTTATGGGTACGA | complete | 34 | 42 | 147–734 | 379 | 445 | 280 | AT5G14520: pescadillo-related
LCN-01 | F: AGCATCGCTTGGACAGGTTTA; R: ATTCCCCCATCATGCCGAAAT | no | 17 |  |  |  |  |  | AT1G71810: protein kinase superfamily protein
LCN-02 | F: TGGGAGCAGCGCCTTAATTC; R: CCACAACATCCCTCTTCAGCT | no | 22 |  |  |  |  |  | AT2G25950: protein of unknown function (DUF1000)
LCN-05 | F: TTGCCGCCTCCTGATCATATC; R: AGAACTGCAACATCTCTGGCA | partial | 12 |  |  |  |  |  | AT3G20790: NAD(P)-binding Rossmann-fold superfamily protein
LCN-06 | F: GTGAGCAGGTTTTTCGAGTGG; R: AAGCTTCTGCACTCCCTTTGA | no | 33 |  |  |  |  |  | AT4G09730: RH39
LCN-07 | F: GGAGATCAATCGCTTTTGGAGTC; R: TGGCATATTGTTCAACTCCATCG | no | 0 |  |  |  |  |  | AT4G09750: NAD(P)-binding Rossmann-fold superfamily protein
LCN-09 | F: AAAGCTGGTGAACTTGCAGTG; R: GGCAGCCCATAAGCAATGTTC | no | 16 |  |  |  |  |  | AT4G25450: nonintrinsic ABC protein 8
LCN-11 | F: GTGCATTTGCCATGGAATCCC; R: TACGTCCACGACCGTTATTCC | no | 3 |  |  |  |  |  | AT4G37040: methionine aminopeptidase 1D
LCN-12 | F: GGAATGGTGGTAGGATTGGGG; R: CCTCCAAACCTCAGCATCTCC | no | 20 |  |  |  |  |  | AT5G44520: NagB/RpiA/CoA transferase-like superfamily protein
LCN-14 | F: CGGATCGTTACATTGCTAGCTG; R: GCACCTGACAAGCAAACTGTAG | no | 13 |  |  |  |  |  | AT1G04420: NAD(P)-linked oxidoreductase superfamily protein
LCN-15 | F: CGGTGGGTGGAAGCATTTTG; R: TCCAACAGAAGTGGACCAGC | partial | 28 |  |  |  |  |  | AT1G16180: Serine-domain containing serine and sphingolipid biosynthesis protein
LCN-16 | F: ACTCCTTTCCCGCATTCCTG; R: CCTCACCATCTCGAAGCTGG | no | 30 |  |  |  |  |  | AT1G19600: pfkB-like carbohydrate kinase family protein
LCN-17 | F: AGACTCTACCCACAGCCTCC; R: TGGGGATGATAGGGGGCC | no | 11 |  |  |  |  |  | AT1G31800: cytochrome P450, family 97, subfamily A, polypeptide 3
LCN-18 | F: AGTTTGGTGGTGGGCATAGG; R: GAAGATCAGGCTCGGGGAAG | partial | 24 |  |  |  |  |  | AT1G48520: GLU-ADT subunit B
LCN-19 | F: CTGTTGCGCTTGGGTCATG; R: TTGAGCTCCACCAAGACCAC | no | 31 |  |  |  |  |  | AT1G53280: Class I glutamine amidotransferase-like superfamily protein
LCN-21 | F: TGGTGTCATTGGAGCTGGTC; R: TGCCATTCCTTCTCGAGTCC | no | 9 |  |  |  |  |  | AT1G68010: hydroxypyruvate reductase
LCN-22 | F: TGGGTGAAGGGTCTTTTGGTG; R: CCAACTCTCAAATCAGTAGCTGC | no | 21 |  |  |  |  |  | AT1G68830: STT7 homolog STN7
LCN-23 | F: AAGCATGTGGGAGAAGAGGC; R: CAAGCACCAATCGCTCTGAC | no | 24 |  |  |  |  |  | AT1G71240: Plant protein of unknown function (DUF639)
LCN-24 | F: GGAACTCCTATGCCTCAGGTTG; R: TCTTCATTAGTTGTCCCCACACC | no | 13 |  |  |  |  |  | AT1G75210: HAD-superfamily hydrolase, subfamily IG, 5′-nucleotidase
LCN-26 | F: GATAACTGGAGCGACGGGATT; R: GCTAGAGCACCACCCTCTTTT | no | 32 |  |  |  |  |  | AT2G21280: NAD(P)-binding Rossmann-fold superfamily protein
LCN-27 | F: TGGGATGCAGTATCATTGGCA; R: CAGCTGTAGGTTGTGACTGGT | partial | 18 |  |  |  |  |  | AT2G23390: UNKNOWN
LCN-28 | F: TGCCTCCACCAGTCAAGATG; R: CCATCCTCCCCAAGCATCAA | no | 16 |  |  |  |  |  | AT2G27680: NAD(P)-linked oxidoreductase superfamily protein
LCN-29 | F: GCTAGAGCCCCAAAGAGCAA; R: TCCTCCACATATGCAACCGG | partial | 11 |  |  |  |  |  | AT2G30390: ferrochelatase 2
LCN-30 | F: ATGGAAAGGAGTGGGAGCTG; R: TTGGCTGGACTGACCCATTC | no | 22 |  |  |  |  |  | AT2G44760: Domain of unknown function (DUF3598)
LCN-31 | F: TCAACTTTGCAGCATTGGAGC; R: CAACAGCGGCAATGTCAAAGA | partial | 19 |  |  |  |  |  | AT3G06510: Glycosyl hydrolase superfamily protein
LCN-32 | F: AAAATGGGTGCTGCTGTTGG; R: ACAAGGCCATACCCATGCAT | no | 39 |  |  |  |  |  | AT3G17810: pyrimidine 1
LCN-33 | F: TGCACGATCACCTCCTTGTC; R: AGAATGGTTCCGGAGCTGTG | no | 28 |  |  |  |  |  | AT3G17940: Galactose mutarotase-like superfamily protein
LCN-34 | F: CACAGAAAGGCAGAATCAGGC; R: TGATCCAATCAGAGGTGCGT | partial | 12 |  |  |  |  |  | AT3G23620: Ribosomal RNA processing Brix domain protein
LCN-35 | F: AAATCGCTCACCGGTGTTTG; R: TTGCAGTTGGGAAGTTCCAAAA | partial | 9 |  |  |  |  |  | AT3G52190: phosphate transporter traffic facilitator1
LCN-36 | F: GATCCGGGTCAAATCCACCA; R: AACGGCAATGACAATGGCAC | partial | 31 |  |  |  |  |  | AT3G56460: GroES-like zinc-binding alcohol dehydrogenase family protein
LCN-37 | F: CAAGGAGCTTGGTAGGAGGC; R: GAGACAGAAGAAGCGGGACC | no | 7 |  |  |  |  |  | AT3G56940: dicarboxylated iron protein, putative (Crd1)
LCN-39 | F: CCGGTGATCTTGTTCGCATG; R: AATTGGAGCGCTCGACTCTT | no | 36 |  |  |  |  |  | AT3G62910: Peptide chain release factor 1
LCN-40 | F: TGGGAAACTCGGAATGGGTG; R: CGGAATGCTGCTTGATGTGT | no | 43 |  |  |  |  |  | AT4G02790: GTP-binding family protein
LCN-41 | F: AGGTGGGCTGAATGGAATGG; R: CCTCCAATTGTCCCCACTGG | no | 13 |  |  |  |  |  | AT4G09020: isoamylase 3
LCN-42 | F: AAGTGGTTGCCGTGCCAT; R: GCCTCTGGTCGTATGTAGCC | partial | 9 |  |  |  |  |  | AT4G21470: riboflavin kinase/FMN hydrolase
LCN-44 | F: ACAAAGGATGAGATCGAACGGT; R: TGCCCAAGAAAGTGCTGAAAC | partial | 14 |  |  |  |  |  | AT5G06260: TLD-domain containing nucleolar protein
LCN-45 | F: GGCAGACTTGGTCATGGACA; R: CCCCAGCCCCATGTGTAAAT | no | 36 |  |  |  |  |  | AT5G08710: Regulator of chromosome condensation (RCC1) family protein
LCN-47 | F: TTCTGCAGCAGCTCAAAGGA; R: AAATCTCTGGCGCTCTCGTC | no | 22 |  |  |  |  |  | AT5G14250: Proteasome component (PCI) domain protein

Note: LCN = low-copy nuclear; PIC = parsimony informative character; SNP = single-nucleotide polymorphism.
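
The distinction between SNPs and PICs used in these tables can be illustrated with a minimal Python sketch. It assumes the standard definition of a parsimony informative character: an alignment column in which at least two character states each occur in at least two sequences. The function names and the toy alignment are hypothetical and are not taken from the authors' pipeline; ignoring gaps and ambiguity codes is likewise an assumption.

```python
from collections import Counter

# Characters treated as missing data (an assumption for this sketch).
MISSING = set("-N?")


def is_variable(column):
    """A column counts as a SNP if it shows at least two distinct states."""
    states = {c for c in column.upper() if c not in MISSING}
    return len(states) >= 2


def is_parsimony_informative(column):
    """A column is a PIC if at least two states each occur in >= 2 sequences."""
    counts = Counter(c for c in column.upper() if c not in MISSING)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_sites(alignment):
    """Return (number of variable columns, number of PICs) for aligned sequences."""
    columns = ["".join(col) for col in zip(*alignment)]
    snps = sum(is_variable(c) for c in columns)
    pics = sum(is_parsimony_informative(c) for c in columns)
    return snps, pics


# Hypothetical toy alignment of four equal-length sequences.
toy_alignment = [
    "ACGTAC",
    "ACGTAT",
    "ATGTCT",
    "ATGT-C",
]

print(count_sites(toy_alignment))
```

In the toy alignment, the fifth column is variable but not parsimony informative, because one of its two states is carried by a single sequence; this is why the PIC count for a locus can never exceed its SNP count.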

Appendix 5.

Validation of 48 SSRs on a subset of the 48 New Zealand and Australian individuals of Veronica sequenced. Shown are eight individuals of the Veronica chionohebe/V. trifida subset (A5, B5, C4, D4, E4, F4, G4, and H4; see Appendix 3).

SSR locus | Primer sequences (5′–3′) | SSR motif, main (additional) | Sequence length (bp): range, mean | No. of individuals successfully sequenced | No. of alleles: A5, B5, C4, D4, E4, F4, G4, H4, min, max, mean | Pairwise genetic distance: min, max, mean, median | No. of SNPs, PICs, introns | Notes
SSR-01F: TGGAACAGCCATTGCATCAAAACA (ATG)310–692353832233334242.900.030.010.0123142two large introns, motif in central exon, sequences partially not covering complete locus
R: TCGTCGACTTACCAGTTCCAG
SSR-02F: GATTGTTTCAGCCAAGAGATTCTCAGAT208–476328702221113131.5?incomplete; several genes amplified by primers; same locus as SSR-42
R: CTTGTTCCGACGCAGACCAT
SSR-03F: TTGAGACGCAAGATTTCTGCAAACTat least 1several genes amplified by primers
R: CCCTCACGCGCTCTATCATT
SSR-04F: TTGTTCAACCAGTCGGACGTGAT127–239216883454455384.800.060.020.0123150
R: CCGCTTCGAGGACTTGCTAG
SSR-05F: GTCGAAATCGGATTTACTAGCTAAGTCATA293–326298844111456163.300.080.040.0228270
R: AGTCGGGAAAGAGATTGGGC
SSR-06F: AATAAACTGACGACAGCGCGTGA3?
R: ACTGTGAGTCTGCCTTACGC
SSR-07F: AGCAGTGAGAGCCAACATCCTAC358–42438687101099911107119.400.330.180.141901741three orthologues?
R: CGAAACGCCCTCTTACACGA
SSR-08F: CCATCAAACCCTTCCAAGCTGGAT311–38735789104366883106.800.140.080.08126109
R: TGGCCTCTTACTTCCTACGTG
SSR-09F: TGGTCACTCTTTCGTGTTGGA310–401325213120001003at least three introns, sequence not covering complete locus
R: CCATAAATTTGTGCTGCCTCCA
SSR-10F: CGTAAATTGGATCAGGTCGCCAGT266–28027283411434414300.040.020.0220141
R: CGTAGCTAGTTTGTCATTGGATGG
SSR-11F: AAACGACGTCGGACTGAGACACGA (TTG, ATT, AG)264–293283828945544295.100.080.040.0442310two orthologues?
R: GGGATAACATTGCTCACTCACC
SSR-12F: TTGCAGTCGGCTTTAAAGATCCAATC00.070.020.020two orthologues?
R: ATACCAGCCATATCAGAGCGC
SSR-13F: TCCTTCCTACTTGCCAAACTCT
R: TCACGCACAGAGGACTGAAC
SSR-14F: TGTTGACTCAATCCGTCTCCGTTAA290–306293821114422142.100.020.010.01960one unambiguous locus
R: TCTGCTTTGCTACCTGTCTTCT
SSR-15F: GGCAGAAGAAACGGTTGCAGGAT310–355220020001002sequences not covering complete locus and do not overlap
R: GACCTTTATGCCGTCTGCCT
SSR-16F: GAGACAACTGCTGCACTTGCATC301–620377734234103142.500.510.060.0291431sequences nearly not overlapping
R: TTAGTCCACCAGTGTCCACG
SSR-17F: AACTTGCTCGTCTCCACCAGGAT203–257238811725656174.100.090.040.0339270two orthologues?
R: CCGATGGATTCAGAAACCAACAA
SSR-18F: TCTGTGCTACAACTAGTACAAGGAGnot reference transcript sequenced
R: GGATGGATCCCTTTCTTGAAATAAGG
SSR-19F: TGGCAACATGCAACTGTGTTTATC (ATA, TAC)268–303282866214143163.400.090.040.0335330two orthologues?
R: ACGAGAATACCATACTTCATGTTCG
SSR-20F: CATTCGTATTACTGTAAATGGTTTGCCGTTA (ACA, GTGA)186–25322882283488528500.10.040.0533310two orthologues?
R: GCAAACAGCACAAATATTTCACCA
SSR-21F: ATGGATGAAGGGCCAGTTAAGGGAT238–256246853451656164.400.120.050.0238320two or three orthologues?
R: CCGCCAACTCCTCATCTAATTCA
SSR-22F: AGGGTCGTTATGGAAACCGGGAT286–348332811225144152.500.360.110.01111991two orthologues?
R: GACATCACCAGTCATCCGCA
SSR-23F: CACAACCAAAGTAGCAGCACTthree orthologues sequenced? sequence different to transcript
R: TGTGAGTTCGCGTAAAGGGA
SSR-24F: GATGCCATTGTTGGATGAATTTCGsequences different to transcript
R: AGCTGCAACTCCTCCTTCAA
SSR-25F: GGTGGTAAAGGCACCGTTAGA0not amplified/sequenced
R: CGACGAGCTCAGGTACGTC
SSR-26F: GTGCGCGAACAAGTTTGGTTATC288–315304833323434243.100.190.120.1488800three orthologues?
R: TCACTAATCCACCTGATCCGTC
SSR-27F: CGGAGAGGTGCAATATACAAATGTACT259–262260821212433142.300.020.010.01840one unambiguous locus
R: GGACAACGCATTAGGAAGTGG
SSR-28F: GCGAAATGCAACATTCCACTGACT266–284276834534755374.500.050.020.0225170two orthologues?
R: GGAGACACGGAACCTGAACA
SSR-29F: GACACCAAACTTGTCTTCAACGTACT300–35534286132262710213600.140.030.0165481two orthologues?
R: AAAGAGGTTGTGAATTCACTAGAAGTT
SSR-30F: TTCTTGCTCTTGTGTTGGTTCCTGA278–290283821221823182.600.040.010.0116100three orthologues?
R: TTCACTTCAAACCTTTGTCACTACC
SSR-31F: CGATGACGATGAGGACGACG2Sequences different to transcript
R: CATTTGATGCACCTCCATGCT
SSR-32F: GTGCCTAGATATCACCAAGATAGAAGAGAT158–24823581244773417400.070.030.0317210two orthologues?
R: GACCAGAAGATCAGACTCAGCA
SSR-33F: GCTGCACCTGGGATTCAAAG5three orthologues sequenced? sequence different to transcript
R: ACTGTGAGTCTGCCTTACGC
SSR-34F: ATTGCTCAACATGTTTGCCTCT3Sequences different to transcript
R: TGTCACAGTTTGGCGATATTGG
SSR-35F: TCGTCATCGCTGAAACCATCAATC (CAA)317–501481856456318184.800.340.060.05141591three orthologues?
R: ACACTTGATCTGCTTGTTGCC
SSR-36F: AAACCCAATTCAAAGCAATGACACTCA238–253245832212233132.300.070.02017130two orthologues?
R: ACCCTCATTTCTCCAAACCAACT
SSR-37F: AGTTGACGCCTTGTTTGGTTCGAT112–280272813169131118192092014.900.090.030.0345300two or three orthologues?
R: CACGCAAACACCACATTCCC
SSR-38F: CCCTAAAGTTCAAGCATCTATACCAG310–56952187664666747600.270.090.11741542three orthologues?
R: TGCTGCAGCTTCAAATGTTTCA
SSR-39F: ACTTGCTGCAACTTGCTAAACATCA450–480457843464756374.900.050.020.0355352two orthologues?
R: TGGATGACAATGAAAGAGAAAGAAGAC
SSR-40F: GCGTGGCTTGATGAACTTGG1Sequences different to transcript
R: ATGCTAGTTGAAGCCGTGCA
SSR-41F: GTAAGACAAGTAGATTTGGTTCACTCT375–380379511111000110.600.050.030.042361
R: GCGGTGTCTCCTTTGTTATGTT
SSR-42F: ACGTAACTCAAATAACGATGCAAGTGAT7three orthologues sequenced? sequence different to transcript
R: AGCTCATTTCCCAGTCATTTAGC
SSR-43F: ACCATCAAACCCTTCCAAGCTATG192–416382811112322131.61two orthologues? second sequence different to transcript
R: TTTGGGATTGGCGCCTCTAC
SSR-44F: GTTATAAGCATCACCAGCGTGGATC (TCG, CACC)283–310297822424533253.100.060.030.0331210two or three orthologues?
R: AGGTAGGAGCATGCTCGTTG
SSR-45F: GTTGGTGTTGAAGATGGACATGA147–6323168two orthologues? second sequence different to transcript
R: ACAATTGTTCCATCAGGTTGTGAA
SSR-46F: TCGCTGTAATGCCAAGAGCC3Sequences different to transcript
R: GCGTTGGTCCAAGAAAGCAA
SSR-47F: CAGGACCAGATGGCTGACAATGAGAT (GGAATT, TGT)264–2882728122541410131146.400.040.010.0219150one unambiguous locus?
R: ACCACTTGTCATTAAACAAACCCT
SSR-48F: CTCTTCACTTCATGAAATGTATCGAGA0failed
R: CAATCTCTTGCCGCTTTATATCAGA
Totals112–692316.86.73.74.13.53.14.14.84.45.1120400.510.040.0454.741.70–3

Note: PICs = parsimony informative characters; SNPs = single-nucleotide polymorphisms; SSRs = simple sequence repeats.

Appendix 6.

Validation of 48 SSRs on a subset of the 48 New Zealand and Australian individuals of Veronica sequenced. Shown are six individuals of the V. thomsonii subset (C5, D5, E5, F5, G5, and H5; see Appendix 3).

SSR locus | Primer sequences (5′–3′) | SSR motif, main (additional) | Sequence length (bp): range, mean | No. of individuals successfully sequenced | No. of alleles: C5, D5, E5, F5, G5, H5, min, max, mean | Pairwise genetic distance: min, max, mean, median | No. of SNPs, PICs, introns | Notes
SSR-01F: TGGAACAGCCATTGCATCAAAACA (ATG)310–6943426444332243.300.060.010.0129122two large introns, motif in central exon, sequences partially not covering complete locus
R: TCGTCGACTTACCAGTTCCAG
SSR-02F: GATTGTTTCAGCCAAGAGATTCTCAGAT5?incomplete; several genes amplified by primers; same locus as SSR-42
R: CTTGTTCCGACGCAGACCAT
SSR-03F: TTGAGACGCAAGATTTCTGCAAACT5at least 1several genes amplified by primers
R: CCCTCACGCGCTCTATCATT
SSR-04F: TTGTTCAACCAGTCGGACGTGAT127–2422146239662294.700.050.010.011990
R: CCGCTTCGAGGACTTGCTAG
SSR-05F: GTCGAAATCGGATTTACTAGCTAAGTCATA293–3142986663142163.700.080.040.0539280
R: AGTCGGGAAAGAGATTGGGC
SSR-06F: AATAAACTGACGACAGCGCGTGA50?not transcript sequence amplified
R: ACTGTGAGTCTGCCTTACGC
SSR-07F: AGCAGTGAGAGCCAACATCCTAC388–433386691210910109121000.350.180.141791631three orthologues?
R: CGAAACGCCCTCTTACACGA
SSR-08F: CCATCAAACCCTTCCAAGCTGGAT342–418357686844648600.170.080.08146870
R: TGGCCTCTTACTTCCTACGTG
SSR-09F: TGGTCACTCTTTCGTGTTGGA310–79839351112101212?sequences not covering locus
R: CCATAAATTTGTGCTGCCTCCA
SSR-10F: CGTAAATTGGATCAGGTCGCCAGT145–2752586242141142.300.360.090.03105241?two misamplifications
R: CGTAGCTAGTTTGTCATTGGATGG
SSR-11F: AAACGACGTCGGACTGAGACACGA (TTG, ATT, AG)265–2942816771251155127.800.080.030.0336330two orthologues?
R: GGGATAACATTGCTCACTCACC
SSR-12F: TTGCAGTCGGCTTTAAAGATCCAATC199–2632296589656596.500.070.020.0239210two orthologues?
R: ATACCAGCCATATCAGAGCGC
SSR-13F: TCCTTCCTACTTGCCAAACTCT5?sequences not covering locus
R: TCACGCACAGAGGACTGAAC
SSR-14F: TGTTGACTCAATCCGTCTCCGTTAA286–311293622121414200.280.080.0184740two orthologues? maybe two misamplifications
R: TCTGCTTTGCTACCTGTCTTCT
SSR-15F: GGCAGAAGAAACGGTTGCAGGAT312–10516812000101110.32one misamplification
R: GACCTTTATGCCGTCTGCCT
SSR-16F: GAGACAACTGCTGCACTTGCATC310–6274666134223142.500.080.10.027591sequences not covering locus
R: TTAGTCCACCAGTGTCCACG
SSR-17F: AACTTGCTCGTCTCCACCAGGAT222–25223969664710410700.510.090.06173520two orthologues? three misamplifications
R: CCGATGGATTCAGAAACCAACAA
SSR-18F: TCTGTGCTACAACTAGTACAAGGAG4?sequences not covering locus
R: GGATGGATCCCTTTCTTGAAATAAGG
SSR-19F: TGGCAACATGCAACTGTGTTTATC (ATA, TAC)260–2952746435443353.800.080.020.0131240two orthologues?
R: ACGAGAATACCATACTTCATGTTCG
SSR-20F: CATTCGTATTACTGTAAATGGTTTGCCGTTA (ACA, GTGA)190–253232654105464105.700.090.040.0426240two orthologues?
R: GCAAACAGCACAAATATTTCACCA
SSR-21F: ATGGATGAAGGGCCAGTTAAGGGAT238–2562456545645464.800.120.050.0239320two orthologues?
R: CCGCCAACTCCTCATCTAATTCA
SSR-22F: AGGGTCGTTATGGAAACCGGGAT285–3453336366132163.500.410.150.011331261two orthologues?
R: GACATCACCAGTCATCCGCA
SSR-23F: CACAACCAAAGTAGCAGCACT6?sequences not covering locus, two loci?
R: TGTGAGTTCGCGTAAAGGGA
SSR-24F: GATGCCATTGTTGGATGAATTTCG5?sequences not covering locus
R: AGCTGCAACTCCTCCTTCAA
SSR-25F: GGTGGTAAAGGCACCGTTAGA0no sequences
R: CGACGAGCTCAGGTACGTC
SSR-26F: GTGCGCGAACAAGTTTGGTTATC302–3063026344222242.800.20.120.1795810three orthologues?
R: TCACTAATCCACCTGATCCGTC
SSR-27F: CGGAGAGGTGCAATATACAAATGTACT259–2632596223111131.700.040.010.011440one unambiguous locus
R: GGACAACGCATTAGGAAGTGG
SSR-28F: GCGAAATGCAACATTCCACTGACT257–290273643643436400.050.020.0325180two orthologues?
R: GGAGACACGGAACCTGAACA
SSR-29F: GACACCAAACTTGTCTTCAACGTACT300–3513296745742274.800.130.060.0661581three orthologues?
R: AAAGAGGTTGTGAATTCACTAGAAGTT
SSR-30F: TTCTTGCTCTTGTGTTGGTTCCTGA275–2852796122352152.500.040.010.011350one unambiguous locus
R: TTCACTTCAAACCTTTGTCACTACC
SSR-31F: CGATGACGATGAGGACGACG0no sequences
R: CATTTGATGCACCTCCATGCT
SSR-32F: GTGCCTAGATATCACCAAGATAGAAGAGAT236–2512416369563395.300.070.030.0326230two or three orthologues?
R: GACCAGAAGATCAGACTCAGCA
SSR-33F: GCTGCACCTGGGATTCAAAG50different orthologous locus sequenced?
R: ACTGTGAGTCTGCCTTACGC
SSR-34F: ATTGCTCAACATGTTTGCCTCT4sequences different to transcript
R: TGTCACAGTTTGGCGATATTGG
SSR-35F: TCGTCATCGCTGAAACCATCAATC (CAA)463–501484574766017500.470.160.061921411two orthologues?
R: ACACTTGATCTGCTTGTTGCC
SSR-36F: AAACCCAATTCAAAGCAATGACACTCA240–253245622442424300.020.010.01860one unambiguous locus
R: ACCCTCATTTCTCCAAACCAACT
SSR-37F: AGTTGACGCCTTGTTTGGTTCGAT184–2802756102616141022261300.080.030.0332290two or three orthologues?
R: CACGCAAACACCACATTCCC
SSR-38F: CCCTAAAGTTCAAGCATCTATACCAG310–568536610868686107.700.230.080.06171982two or three orthologues?
R: TGCTGCAGCTTCAAATGTTTCA
SSR-39F: ACTTGCTGCAACTTGCTAAACATCA406–4804566643545364.500.060.030.0352362two orthologues?
R: TGGATGACAATGAAAGAGAAAGAAGAC
SSR-40F: GCGTGGCTTGATGAACTTGG3sequences different to transcript, but consistent
R: ATGCTAGTTGAAGCCGTGCA
SSR-41F: GTAAGACAAGTAGATTTGGTTCACTCT376–380379520111112100.060.020.042441
R: GCGGTGTCTCCTTTGTTATGTT
SSR-42F: ACGTAACTCAAATAACGATGCAAGTGAT125–3623136323212132.200.390.180.15180680two orthologues?
R: AGCTCATTTCCCAGTCATTTAGC
SSR-43F: ACCATCAAACCCTTCCAAGCTATG157–4163664202001120.80misamplifications
R: TTTGGGATTGGCGCCTCTAC
SSR-44F: GTTATAAGCATCACCAGCGTGGATC (TCG, CACC)283–3162966344352253.500.050.030.0321160two orthologues?
R: AGGTAGGAGCATGCTCGTTG
SSR-45F: GTTGGTGTTGAAGATGGACATGA148–359303661106111104.200.60.430.47405311?three orthologues? second sequence different to transcript
R: ACAATTGTTCCATCAGGTTGTGAA
SSR-46F: TCGCTGTAATGCCAAGAGCC3sequences different to transcript
R: GCGTTGGTCCAAGAAAGCAA
SSR-47F: CAGGACCAGATGGCTGACAATGAGAT (GGAATT, TGT)264–294271637462227400.030.010.0117100two orthologues?
R: ACCACTTGTCATTAAACAAACCCT
SSR-48F: CTCTTCACTTCATGAAATGTATCGAGA1?
R: CAATCTCTTGCCGCTTTATATCAGA
Totals125–1051327.354.34.65.44.13.93.21264.300.60.10.180.352.50–2

Note: PICs = parsimony informative characters; SNPs = single-nucleotide polymorphisms; SSRs = simple sequence repeats.

