Whole genome shotgun sequences for microsatellite discovery and application in cultivated and wild Macadamia (Proteaceae).

Catherine J Nock1, Martin S Elphinstone1, Gary Ablett1, Asuka Kawamata1, Wayne Hancock1, Craig M Hardner2, Graham J King1.   

Abstract

PREMISE OF THE STUDY: Next-generation sequencing (NGS) data are widely used for single-nucleotide polymorphism discovery and genetic marker development in species with limited available genome information. We developed microsatellite primers for the Proteaceae nut crop species Macadamia integrifolia and assessed cross-species transferability in all congeners to investigate genetic identification of cultivars and gene flow.

METHODS AND RESULTS: Primers were designed from both raw and assembled Illumina NGS paired-end reads. The final 12 microsatellite markers selected were polymorphic among wild individuals of all four Macadamia species (M. integrifolia, M. tetraphylla, M. ternifolia, and M. jansenii) and in commercial macadamia cultivars including hybrids.

CONCLUSIONS: We demonstrate the utility of raw and assembled Illumina NGS reads from total genomic DNA for the rapid development of microsatellites in Macadamia. These primers will facilitate future studies of population structure, hybridization, parentage, and cultivar identification in cultivated and wild Macadamia populations.

Keywords:  Macadamia; Proteaceae; crop; cultivar; horticulture; nut

Year:  2014        PMID: 25202615      PMCID: PMC4103134          DOI: 10.3732/apps.1300089

Source DB:  PubMed          Journal:  Appl Plant Sci        ISSN: 2168-0450            Impact factor:   1.936


Macadamia is a recently domesticated nut crop derived from the Australian subtropical rainforest species Macadamia integrifolia Maiden & Betche and M. tetraphylla L. A. S. Johnson and their hybrids. Within the genus, all species, including M. ternifolia F. Muell. and M. jansenii C. L. Gross & P. H. Weston, are under threat of genetic erosion (Mast et al., 2008; Costello et al., 2009). Commercial cultivars were developed primarily in Hawaii and are only a few generations removed from Australian wild progenitors (Hardner et al., 2009). Macadamia are preferentially out-crossing and take four to five years to reach maturity. For breeding programs to progress effectively, there is a need to discriminate among clonally propagated industry standard cultivars and novel selections well before maturity. Although the 17 available M. integrifolia microsatellite markers with perfect repeats were tested in our laboratory (Schmidt et al., 2006), only four amplified successfully. These results are consistent with previous research on M. integrifolia (Neal, 2008), and no published study has used more than four polymorphic markers (Shapcott and Powell, 2011; Spain and Lowe, 2011). Additional microsatellite markers are needed to support conservation studies and breeding programs. Next-generation sequencing (NGS) platforms are now routinely used for isolation of microsatellite, or simple sequence repeat (SSR), loci from plants (Egan et al., 2012). Long-read platforms are commonly used because reads of 300 to 500 bp in length may contain both the SSR motif and flanking sequence for primer design (Zalapa et al., 2012). Together, paired-end reads from short-read platforms also contain the SSR motif and flanking sequence for primer design at a lower cost per base (Silva et al., 2013). The aim of this study was to develop polymorphic microsatellite markers for Macadamia using paired-end Illumina reads with and without prior de novo assembly.

METHODS AND RESULTS

Fresh leaf material was collected from macadamia nut cultivars at Clunes Varietal Trial M2, Clunes, New South Wales, Australia (Stephenson and Gallagher, 2000). Additional cultivars and clones of wild-collected individuals of all four Macadamia species were sourced from the Australian Macadamia Germplasm Collection at Alstonville Tropical Fruit Research Station, NSW Department of Primary Industries. Herbarium material is deposited at the Southern Cross University Medicinal Plant Herbarium (PHARM), Lismore, New South Wales, Australia (Appendix 1). After collection, fresh leaf material was either stored at −80°C (for Illumina sequencing) or dried in a sealed container with silica gel at 10× the fresh weight of the leaf. Total DNA was extracted using a QIAGEN DNeasy Plant Kit (QIAGEN, Valencia, California, USA) according to the manufacturer’s protocols. Approximately 4.5 μg of DNA extracted from one individual of M. integrifolia was submitted to the Australian Genome Research Facility, Melbourne, for sequencing. A DNA library was prepared with an Illumina TruSeq Sample Preparation Kit (version 2) following the manufacturer’s instructions (Illumina, San Diego, California, USA). Genomic DNA was sheared using a Covaris S2 sonication device (Covaris, Woburn, Massachusetts, USA). DNA fragments were end-repaired, A-tailed, and ligated to adapters. Size and concentration of DNA fragments were assessed using a DNA 1000 chip on a Bioanalyzer 2100 instrument (Agilent Technologies, Santa Clara, California, USA). Average insert size of the library was 424 bp. Approximately 4 pmol of the library was paired-end sequenced (100 × 2 cycles) on an Illumina HiSeq 2000 instrument.
Appendix 1.

Voucher information for Macadamia species used in this study.

| Species | Voucher specimen accession no.(a) | Collection locality | Geographic coordinates |
|---|---|---|---|
| M. jansenii | PHARM-13-0809 | Bulburin National Park, Queensland, Australia | 24°37.584′S, 151°33.291′E |
| M. tetraphylla | PHARM-13-0810 | Mullumbimby, northern New South Wales, Australia | 28°32.835′S, 153°25.455′E |
| M. ternifolia | PHARM-13-0811 | Draper, Queensland, Australia | 27°21.268′S, 152°54.965′E |
| M. integrifolia | PHARM-13-0812 | Villeneuve, Queensland, Australia | 26°58.384′S, 152°38.899′E |
| M. integrifolia, cultivar 741 | PHARM-13-0813 | Clunes Varietal Trial M2, New South Wales, Australia | 28°43.844′S, 153°23.699′E |

(a) Vouchers deposited at Southern Cross University, Medicinal Plant Herbarium (PHARM), Lismore, New South Wales, Australia.

Paired-end reads were imported into CLC Genomics Workbench (version 4.9; CLC Bio, Aarhus, Denmark) and trimmed to remove low-quality base calls. Raw sequence reads were searched for di- and trinucleotide SSR motifs with a minimum of eight repeats. SSR regions were identified at the 3′-end of a read. Primers were then designed in the flanking regions (i.e., the 5′-end of the read containing the SSR) and in the matching paired-end read. De novo contigs: trimmed reads were assembled de novo with the following parameters: similarity index = 0.8; length fraction = 0.5; insertion/deletion cost = 3; mismatch cost = 2. Contigs were screened for SSR regions using the search function described above. To develop and optimize a suite of SSR markers for cultivar identification and gene flow studies, primers were designed for 48 loci, 24 for each method, using a batch function in Primer3 version 2 (Rozen and Skaletsky, 2000) specifying a primer melting temperature (Tm) range of 58–70°C, a maximum Tm difference of 5°C, and a primer GC content of 40–60%. To minimize the cost of primer synthesis during the testing phase, one primer from each pair was 5′ modified with an engineered sequence (5′-CCCCCGGGGGC-3′) to enable the attachment of a third, fluorescently labeled primer in a two-step PCR protocol (Pacey-Miller and Henry, 2003). Primer pairs were tested for amplification success and polymorphism among 12 DNA samples including eight M. integrifolia cultivars and one individual from each Macadamia species. Of the 48 primer pairs tested, six did not amplify and seven produced multiple bands. None of the remaining 35 loci were monomorphic: two or more alleles were detected at each among the 12 test individuals. Primer sequences for these loci are available on request from the author. Twelve microsatellite loci were selected for further development on the basis of single-band amplification, level of polymorphism, and size compatibility for pooled multilocus capillary electrophoresis.
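The SSR screen described above can be illustrated with a short sketch. It is not the CLC Genomics Workbench search used in the study. It is a regular-expression stand-in, assuming plain A/C/G/T read strings, that finds di- and trinucleotide motifs repeated at least eight times and returns the 5′ flank available for primer design. The function names and the `min_flank` threshold are hypothetical.

```python
import re


def find_ssrs(seq, min_repeats=8):
    """Return (motif, repeat_count, start, end) for di-/trinucleotide SSRs.

    Illustrative regex search only; not the CLC search function.
    """
    # A 2-3 base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAA..., which are not SSRs here.
            continue
        repeats = (match.end() - match.start()) // len(motif)
        hits.append((motif, repeats, match.start(), match.end()))
    return hits


def five_prime_flank(seq, hit, min_flank=20):
    """Return the sequence upstream of an SSR hit, or None if it is too short.

    min_flank is a hypothetical threshold, not a value from the study.
    """
    _motif, _repeats, start, _end = hit
    flank = seq[:start]
    return flank if len(flank) >= min_flank else None


if __name__ == "__main__":
    read = "GATTACA" + "AT" * 11 + "CCGT"  # made-up read, not study data
    for hit in find_ssrs(read):
        print(hit, five_prime_flank(read, hit, min_flank=5))
```

In practice the mate read would supply the second primer site, as the text describes; this sketch only covers the read that carries the repeat.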
The 5′ end of one primer of each pair was fluorescently labeled (Table 1), and a single-step PCR protocol was used. Reactions were carried out in 20-μL volumes containing approximately 20 ng DNA template, 0.5 U Platinum Taq (Life Technologies, Carlsbad, California, USA), 2 μL Platinum Taq PCR buffer, 0.1 mM dNTPs, 2 mM MgCl2, 0.2 μM of each primer, and sterile water to 20 μL. Thermal cycling was conducted in a GeneAmp PCR System 9700 (Life Technologies) with the following conditions: initial denaturation at 94°C for 2 min; followed by 35 cycles of 94°C for 10 s, annealing temperature (Ta) (Table 1) for 10 s, and extension at 70°C for 1 min; followed by a final extension at 70°C for 5 min. Genotypes were generated using an ABI PRISM 3730 Genetic Analyzer (Applied Biosystems, Foster City, California, USA). Allele size was scored in reference to ABI PRISM GS (LIZ) internal size standards using the program Geneious version 6.1.6 (Biomatters Ltd., Auckland, New Zealand). We assessed variability and genotype consistency of the 12 loci in 22 macadamia cultivars (two to four replicate trees of each) including pure M. integrifolia and hybrids. The loci were also tested for cross-amplification in wild-collected individuals of M. integrifolia (n = 6), M. tetraphylla (n = 7), M. ternifolia (n = 2), and M. jansenii (n = 2).
Table 1.

Characterization of 12 polymorphic microsatellite loci developed in Macadamia integrifolia.

| Locus | Primer sequences (5′–3′) | Repeat motif | Fluorescent label | Allele size range (bp)(a) | Ta (°C) | GenBank accession no. |
|---|---|---|---|---|---|---|
| Mac001 | F: GTGACTGGTGGACACCAAAACCCA; R: GCACTAGGTGTCACCCCCACTTCT | (AT)11 | VIC | 412–420 | 60 | KF130888 |
| Mac002 | F: CCCAACTGGGTTTGCAAGGACCAA; R: AGTAGCCGCGAGCTGATCGAAGAT | (CT)8 | NED | 283–297 | 60 | KF130889 |
| Mac003 | F: TGGACCATTGAGGAGTTGGACTGT; R: TCCACCGTTTCACTTTCGTCAGCC | (AT)9 | FAM | 258–276 | 60 | KF130890 |
| Mac004 | F: CAAGAGTGTCCAGCGAGGGAATGC; R: GGGAGACATCATACTTTTGACACATGCC | (AT)11 | NED | 224–240 | 60 | KF130891 |
| Mac005 | F: CATAGCATGAGTTTCAAGGGATAA; R: ATTACAAACCCACTCTTCGATTT | (AAG)10 | FAM | 331–343 | 60 | KF130892 |
| Mac006 | F: TTTCATCATTGATCATCATAGGTACA; R: GAGCTAATACTTAACCAGGTGAACA | (AG)11 | PET | 322–360 | 55 | KF130893 |
| Mac007 | F: AGGCCTTGGGATGTTCCAGTGTGA; R: GCAATCAACACAAGCACCTGTGGC | (CT)11 | NED | 368–390 | 60 | KF130894 |
| Mac008 | F: AACGGTTATGTCAAGTGCAACAGGA; R: TGACTTTAGCCCTCACTTCAAAGCCA | (AT)10 | FAM | 388–398 | 60 | KF130895 |
| Mac009 | F: CAACTCTCTCTCCCTCAGATTCTC; R: TAAATCTATGCCACATCACTAGGC | (AAG)13 | VIC | 241–244 | 60 | KF130896 |
| Mac010 | F: GCAACTGGATCAGCACATAAGAAT; R: TCCGATCATAGTCTTAGCATTTCA | (AG)11 | PET | 259–297 | 55 | KF130897 |
| Mac011 | F: AGAGGGCGAGATCCCTGACTCTGA; R: TGAAATTTGGCGTGGGGAAAGCGT | (CT)9 | FAM | 175–199 | 60 | KF130898 |
| Mac012 | F: TATCAGGACCATCAACAATGATTT; R: GCCTGTTGTAGGTAAAGTGGAGAT | (AC)10 | VIC | 309–321 | 60 | KF130899 |

Note: Ta = annealing temperature used for all Macadamia species and cultivars.

(a) Values based on 22 samples representing Macadamia cultivars located at Clunes Varietal Trial M2, New South Wales, Australia.

After trimming, there were 245,099,904 reads, with an average length of 91.57 bp. We identified 2.29 million reads containing di- and trinucleotide SSR motifs with a minimum of eight repeats. Amplification success at 60°C annealing temperature was identical (87.5%) for primer pairs from unassembled reads and de novo assembled contigs. Genetic diversity parameters and principal coordinate analysis (PCoA) were calculated using GenAlEx version 6.5 (Peakall and Smouse, 2006, 2012) (Table 2).
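As a rough check on the figures above, the following sketch works through the arithmetic. It assumes that the 87.5% success rate applies to each set of 24 primer pairs, which agrees with the six non-amplifying pairs out of 48 reported earlier. The variable names are illustrative only.

```python
# Figures quoted in the text.
trimmed_reads = 245_099_904
mean_read_length_bp = 91.57
ssr_reads = 2.29e6

# Approximate sequence yield after trimming, in gigabases.
total_gb = trimmed_reads * mean_read_length_bp / 1e9

# Share of trimmed reads carrying a di- or trinucleotide SSR (>= 8 repeats).
ssr_fraction = ssr_reads / trimmed_reads

# Assumption: 87.5% applies to each set of 24 primer pairs.
pairs_per_method = 24
amplified_per_method = 0.875 * pairs_per_method
failed_total = 2 * (pairs_per_method - amplified_per_method)

print(f"yield: {total_gb:.1f} Gb")
print(f"SSR-containing reads: {ssr_fraction:.2%}")
print(f"amplified per method: {amplified_per_method:.0f} of {pairs_per_method}")
print(f"failed in total: {failed_total:.0f} of {2 * pairs_per_method}")
```

On these numbers, a little under 1% of trimmed reads contained a qualifying repeat, and 21 of 24 primer pairs amplified for each method.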
Table 2.

Genetic properties of 12 microsatellite loci in Macadamia integrifolia and hybrid industry cultivars, and M. tetraphylla.

| Locus | Cultivars, Clunes Varietal Trial M2 (n = 22): A | Ho | He | M. tetraphylla, northern NSW (n = 7): A | Ho | He |
|---|---|---|---|---|---|---|
| Mac001 | 5 | 0.545 | 0.676 | 5 | 0.714 | 0.704 |
| Mac002 | 5 | 0.591 | 0.596 | 3 | 0.167 | 0.653 |
| Mac003 | 7 | 0.667 | 0.683 | 5 | 0.857 | 0.714 |
| Mac004 | 7 | 0.364 | 0.762 | 6 | 0.429 | 0.796 |
| Mac005 | 4 | 0.591 | 0.654 | 2 | 0.429 | 0.337 |
| Mac006 | 9 | 0.864 | 0.776 | 9 | 0.857 | 0.847 |
| Mac007 | 6 | 0.682 | 0.653 | 4 | 0.857 | 0.684 |
| Mac008 | 5 | 0.364 | 0.351 | 3 | 0.429 | 0.357 |
| Mac009 | 2 | 0.091 | 0.165 | 2 | 0.143 | 0.133 |
| Mac010 | 8 | 0.909 | 0.702 | 5 | 0.714 | 0.724 |
| Mac011 | 7 | 0.864 | 0.800 | 9 | 0.714 | 0.837 |
| Mac012 | 6 | 0.318 | 0.674 | 6 | 0.571 | 0.796 |
| Mean | 5.917 | 0.571 | 0.626 | 4.917 | 0.573 | 0.632 |

Note: A = number of alleles; He = expected heterozygosity; Ho = observed heterozygosity; n = number of individuals sampled.
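The per-locus statistics in Table 2 can be illustrated with a minimal sketch. It assumes diploid genotypes and the uncorrected form of expected heterozygosity, He = 1 − Σp², where p is an allele frequency. The study used GenAlEx; this is not its code, and the function name and example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (A, Ho, He) for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    He uses the uncorrected form 1 - sum(p_i ** 2); this is an assumption.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity from allele frequencies.
    he = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return len(allele_counts), ho, he


if __name__ == "__main__":
    # Hypothetical data: seven individuals, one heterozygote, two alleles.
    example = [(241, 241)] * 6 + [(241, 244)]
    a, ho, he = locus_stats(example)
    print(a, round(ho, 3), round(he, 3))
```

Seven individuals with a single heterozygote give A = 2, Ho ≈ 0.143, and He ≈ 0.133, the same values as the Mac009 row for M. tetraphylla. The individual genotypes are invented for illustration and are not the study data.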

All 12 loci amplified and were polymorphic among 22 cultivars. Mean observed (Ho) and expected (He) heterozygosity were 0.571 and 0.626, respectively. A total of 71 alleles were detected, with an average of 5.9 per locus (Table 2). Unique genotypes were obtained for each cultivar with the exception of Hawaiian Agricultural Experiment Station (HAES) 741 and 660 that shared 24 of 24 alleles. Selection records for these two cultivars are the same, suggesting that they may have been sourced from the same tree at different times. Genotypes from replicate trees of cultivars were consistent, with the exception of one of three HAES 791 trees that is presumed to be a misidentification as its genotype was identical to HAES 344. In M. tetraphylla, 59 alleles were found, with an average of 4.9 per locus. Mean Ho and He were 0.573 and 0.632, respectively (Table 2). All loci amplified reliably in sampled wild M. integrifolia and M. tetraphylla individuals, and were polymorphic with the exception of Mac009 in M. integrifolia. Locus Mac005 in M. jansenii and Mac001 in M. ternifolia did not amplify. The remaining 11 loci amplified in M. jansenii and M. ternifolia, and eight were polymorphic in two individuals of each of these species. Species-specific clusters were generated by two-dimensional PCoA based on genetic distance. Most cultivars clustered with wild M. integrifolia individuals, although hybrid cultivars such as A4 and A16 were intermediate between M. integrifolia and M. tetraphylla (Fig. 1).
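Comparing two cultivars by shared alleles, as in the "24 of 24" result for HAES 741 and 660, can be sketched as follows. The sketch assumes that each multilocus genotype is stored as a mapping from locus name to a pair of allele sizes. The function name and the example genotypes are hypothetical and are not study data.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Count alleles shared between two diploid multilocus genotypes.

    Each genotype maps locus name -> (allele_1, allele_2). Alleles are matched
    locus by locus, so 12 loci give a maximum of 24 shared alleles.
    """
    shared = 0
    for locus, alleles_a in genotype_a.items():
        alleles_b = genotype_b[locus]
        # Multiset intersection: a homozygote shares only one allele copy
        # with a heterozygote that carries the same allele.
        overlap = Counter(alleles_a) & Counter(alleles_b)
        shared += sum(overlap.values())
    return shared


if __name__ == "__main__":
    # Hypothetical two-locus genotypes for illustration only.
    tree_1 = {"Mac001": (412, 420), "Mac009": (241, 241)}
    tree_2 = {"Mac001": (412, 420), "Mac009": (241, 241)}
    tree_3 = {"Mac001": (412, 414), "Mac009": (241, 244)}
    print(shared_alleles(tree_1, tree_2), "of", 2 * len(tree_1))
    print(shared_alleles(tree_1, tree_3), "of", 2 * len(tree_1))
```

A pair of trees that shares every allele at every locus is indistinguishable with the marker set. The same comparison flags replicate trees whose genotype does not match their label.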
Fig. 1.

Principal coordinate cluster plot based on genetic distance among multilocus genotypes for Macadamia integrifolia (white), M. tetraphylla (red), M. ternifolia (blue), M. jansenii (yellow), and macadamia cultivars (black). First and second coordinates explain 35.19% and 11.36% of the variation, respectively.

CONCLUSIONS

The microsatellite markers developed here enable discrimination among macadamia industry cultivars and will be used to select parental genotypes in breeding programs. Cross-amplification and polymorphism of the markers in all Macadamia species will facilitate studies of population structure, gene flow, and hybridization. In this work, we demonstrate the effectiveness of Illumina NGS paired-end sequence reads for rapid and cost-effective microsatellite development with and without prior assembly of reads.
REFERENCES

1.  Single-nucleotide polymorphism detection in plants using a single-stranded pyrosequencing protocol with a universal biotinylated primer.

Authors:  Toni Pacey-Miller; Robert Henry
Journal:  Anal Biochem       Date:  2003-06-15       Impact factor: 3.365

2.  Using next-generation sequencing approaches to isolate simple sequence repeat (SSR) loci in the plant sciences.

Authors:  Juan E Zalapa; Hugo Cuevas; Huayu Zhu; Shawn Steffan; Douglas Senalik; Eric Zeldin; Brent McCown; Rebecca Harbut; Philipp Simon
Journal:  Am J Bot       Date:  2011-12-20       Impact factor: 3.844

3.  Applications of next-generation sequencing in plant biology.

Authors:  Ashley N Egan; Jessica Schlueter; David M Spooner
Journal:  Am J Bot       Date:  2012-02-06       Impact factor: 3.844

4.  A smaller Macadamia from a more vagile tribe: inference of phylogenetic relationships, divergence times, and diaspore evolution in Macadamia and relatives (tribe Macadamieae; Proteaceae).

Authors:  Austin R Mast; Crystal L Willis; Eric H Jones; Katherine M Downs; Peter H Weston
Journal:  Am J Bot       Date:  2008-07       Impact factor: 3.844

5.  GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors:  Rod Peakall; Peter E Smouse
Journal:  Bioinformatics       Date:  2012-07-20       Impact factor: 6.937

6.  Development and validation of microsatellite markers for Brachiaria ruziziensis obtained by partial genome assembly of Illumina single-end reads.

Authors:  Pedro I T Silva; Alexandre M Martins; Ediene G Gouvea; Marco Pessoa-Filho; Márcio E Ferreira
Journal:  BMC Genomics       Date:  2013-01-16       Impact factor: 3.969
