
Massive Analysis of cDNA Ends (MACE) for transcript-based marker design in pea (Pisum sativum L.).

Aleksandr Zhernakov¹, Björn Rotter², Peter Winter², Alexey Borisov¹, Igor Tikhonovich³, Vladimir Zhukov¹.

Abstract

Aimed at gene-based markers design, we generated and analyzed transcriptome sequencing datasets for six pea (Pisum sativum L.) genetic lines that have not previously been massively genotyped. Five cDNA libraries obtained from nodules or nodulated roots of genetic lines Finale, Frisson, Sparkle, Sprint-2 and NGB1238 were sequenced using a versatile 3'-RNA-seq protocol called MACE (Massive Analysis of cDNA Ends). MACE delivers a single next-generation sequence from the 3'-end of each individual cDNA molecule that precisely quantifies the respective transcripts. Since the contig generated from the 3'-end of the cDNA by assembling all sequences encompasses the highly polymorphic 3'-untranslated region (3'-UTR), MACE efficiently detects single nucleotide variants (SNVs). Mapping MACE reads to the reference nodule transcriptome assembly of the pea line SGE (Transcriptome Shotgun Assembly GDTM00000000.1) resulted in characterization of over 34,000 polymorphic sites in more than 9700 contigs. Several of these SNVs were located within recognition sequences of restriction endonucleases which allowed the design of co-dominant CAPS markers for the particular transcript. Cleaned reads of sequenced libraries are available from European Nucleotide Archive (http://www.ebi.ac.uk/) under accessions PRJEB18101, PRJEB18102, PRJEB18103, PRJEB18104, PRJEB17691.


Keywords: CAPS markers; Gene-based markers; MACE (Massive Analysis of cDNA Ends); Pisum sativum L.; SNVs; Transcriptome sequencing

Year: 2016 PMID: 28050346 PMCID: PMC5192247 DOI: 10.1016/j.gdata.2016.12.004

Source DB: PubMed Journal: Genom Data ISSN: 2213-5960

Direct link to deposited data

http://www.ebi.ac.uk/ena/data/view/PRJEB18101 http://www.ebi.ac.uk/ena/data/view/PRJEB18102 http://www.ebi.ac.uk/ena/data/view/PRJEB18103 http://www.ebi.ac.uk/ena/data/view/PRJEB18104 http://www.ebi.ac.uk/ena/data/view/PRJEB17691

Introduction

Garden pea (Pisum sativum L.) is one of the most agriculturally important legumes in the world and a versatile model plant for studying the genetic bases of beneficial plant-microbe interactions [1]. Hence, genetic and genomic resources for pea, such as single nucleotide variant (SNV) datasets, are needed for both basic and applied science. These SNVs may serve as a basis for developing markers for genotyping and/or genetic mapping. Given the lack of a pea genome sequence, transcriptome analysis by next-generation sequencing (NGS) is an appropriate approach to SNV discovery. Here we focused on genetic lines that have been used in several mutagenesis programs aimed at identifying pea symbiotic genes involved in the interaction of the plant with nodule bacteria and arbuscular-mycorrhizal fungi [2], [3], [4], [5], [6]. We expect that the development of transcript-based molecular markers will facilitate genetic mapping of symbiotic genes with unknown genomic location.

Experimental design, materials and methods

Biological materials

Transcriptomic analysis was performed on five pea (Pisum sativum L.) genetic lines: Finale = JI2678 [2], Frisson = JI2491 [3], NGB1238 = JI0073 (also known as WBH1238, WL1238), Sparkle = JI0427 [4], Sprint-2 = JI2612 [6] (JI: identifiers of the JIC Pisum Collection, https://www.seedstor.ac.uk/search-infocollection.php?idCollection=6). Seeds were surface-sterilized with concentrated sulfuric acid (98%) for 15 min on a shaker, washed 10 times with autoclaved distilled water, and germinated for 3 days in Petri dishes containing sterile vermiculite. The germinated seeds were then planted into 2 L pots containing quartz sand (5 seedlings per pot), watered with nitrogen-free mineral nutrient solution [7], and inoculated with an aqueous suspension of Rhizobium leguminosarum bv. viciae RCAM1026 [8] (1 × 10⁶ CFU per pot). Samples (nodules or nodulated roots of all plants from one pot) were harvested at a time point specific to each line: 14 days post inoculation (dpi) for Sparkle, 21 dpi for Sprint-2, and 28 dpi for Finale, Frisson and NGB1238. Harvested material (mature nodules of lines Finale, Frisson and Sprint-2; nodulated roots of lines NGB1238 and Sparkle) was placed in liquid nitrogen, ground into powder, and stored at −80 °C until needed.

Library preparation and sequencing

RNA isolation, NGS-library preparation and sequencing were performed at GenXPro GmbH, Frankfurt am Main, Germany. RNA was isolated using the Nucleospin miRNA Kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany) according to the protocol for isolation of total RNA from plant tissue. MACE libraries were constructed using the MACE kit [9] according to the manual provided with the kit and sequenced on an Illumina HiSeq 2000 with 100 cycles.

Bioinformatics

For SNV discovery we used as a reference the pea nodule transcriptome assembly [10] constructed for the genetic line SGE = JI3023, which is deposited at the NCBI Transcriptome Shotgun Assembly (TSA) database under accession GDTM00000000.1. Trimmed and cleaned reads of each library were mapped to the assembly with Bowtie2 v. 2.2.5 [11]. During mapping, an SM tag designating the pea genetic line was added to each read. The resulting SAM files were converted to BAM format and merged into a single BAM file. SNV calling, followed by preliminary filtering out of SNVs with mapping quality lower than 20, was performed with the BCFtools utilities [12]. Sites where the coverage with high-quality bases (DP) was less than 10 were not considered and were marked as ‘unknown’ for the particular genetic line. Sites where the ratio of the number of high-quality non-reference bases (DV) to the total number of high-quality bases (DP) exceeded 0.9 were considered SNVs (Suppl. Table 1). For the detected SNVs, we used a custom script to search for recognition sequences of restriction enzymes that would cut either the canonical or the variant site and would thus generate a co-dominant Cleaved Amplified Polymorphic Sequence (CAPS) marker. Recognition sequences of restriction enzymes longer than 3 bp were retrieved from the New England Biolabs (NEB, UK) catalogue.
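The per-line depth and allele-ratio rules described above can be illustrated with a minimal Python sketch. This is not the authors' script: the function name classify_site, the label "other" for sites that pass the depth filter but not the ratio threshold, and the example counts are assumptions made for illustration. Only the two thresholds (DP of at least 10, DV/DP above 0.9) come from the text.

```python
MIN_DEPTH = 10  # minimum high-quality coverage (DP) given in the text


def classify_site(dp: int, dv: int) -> str:
    """Classify one site in one genetic line from its DP and DV counts.

    Returns "unknown" when coverage is too low, "snv" when the
    non-reference fraction DV/DP exceeds 0.9, and "other" otherwise
    ("other" is a hypothetical label; the text does not name this case).
    """
    if dp < MIN_DEPTH:
        return "unknown"
    # DV/DP > 0.9, written with integers to avoid floating-point edge cases
    if 10 * dv > 9 * dp:
        return "snv"
    return "other"


# Hypothetical (DP, DV) counts for one site in three lines
example = {"Finale": (25, 24), "Frisson": (8, 8), "Sparkle": (30, 2)}
calls = {line: classify_site(dp, dv) for line, (dp, dv) in example.items()}
print(calls)
```

With these made-up counts, Finale is called as an SNV (24/25 = 0.96), Frisson is marked unknown (DP below 10), and Sparkle falls into neither category.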

Validation of CAPS markers

For ten contigs containing a total of 13 SNVs, we developed CAPS markers that distinguish either lines Finale and NGB1238 (six detected SNVs) or lines Sparkle and NGB1238 (seven detected SNVs). PCR primers were designed on the basis of sequences from publicly available pea transcriptome assemblies and ESTs with the help of the online tool Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast) [13], taking into account the exon-intron structure of the presumed orthologous genes of Medicago truncatula Gaertn., predicted by aligning the pea contigs with the M. truncatula genome (ver. Mt4.0, www.phytozome.org). PCR resulted in specific amplification in nine cases out of ten, and digestion with the appropriate restriction endonuclease produced the predicted digestion pattern for 11 SNV sites (Suppl. Table 2).
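How a predicted digestion pattern follows from an SNV inside a recognition sequence can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the script used in the study: the amplicon sequence, the SNV position and the helper names (find_sites, fragments, caps_pattern) are hypothetical. EcoRI (GAATTC, cutting after the first G) serves only as a familiar example enzyme. Because its site is palindromic, only one strand is scanned; a non-palindromic site would also require a search of the reverse complement.

```python
from typing import List, Tuple


def find_sites(seq: str, site: str) -> List[int]:
    """Return 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragments(seq: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths after complete digestion of a linear amplicon.

    cut_offset is the number of bases of the recognition site that lie
    to the left of the cut on the scanned strand.
    """
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def caps_pattern(ref: str, snv_pos: int, alt_base: str,
                 site: str, cut_offset: int) -> Tuple[List[int], List[int], bool]:
    """Compare digestion patterns of the reference and variant alleles."""
    alt = ref[:snv_pos] + alt_base + ref[snv_pos + 1:]
    ref_frags = fragments(ref, site, cut_offset)
    alt_frags = fragments(alt, site, cut_offset)
    # The marker is informative when the two alleles give different patterns
    return ref_frags, alt_frags, ref_frags != alt_frags


# Hypothetical 26 bp amplicon with one EcoRI site; the SNV (A to G) lies inside it
amplicon = "ACGTACGTAC" + "GAATTC" + "TTGACCTGAA"
ref_frags, alt_frags, informative = caps_pattern(
    amplicon, snv_pos=12, alt_base="G", site="GAATTC", cut_offset=1
)
print(ref_frags, alt_frags, informative)
```

In this toy case the reference allele is cut into fragments of 11 and 15 bp, whereas the variant allele loses the site and stays uncut at 26 bp, so the two homozygous lines and a heterozygote would each give a distinct banding pattern.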

Conclusion

As a result, 34,711 polymorphic sites were characterized in 9724 contigs of the pea nodule transcriptome assembly. For 28,494 of these SNVs it is potentially possible to design CAPS markers. Primers were designed for a total of 10 loci, and 9 of these were amplified cleanly. Eight of them gave different digestion patterns in distinct lines with the appropriate restriction enzymes and can thus serve as markers. The generated dataset provides the information needed for gene-based marker design in pea, which is useful in particular for genetic mapping of genes related to symbiotic interactions with nodule bacteria and arbuscular-mycorrhizal fungi, since over 90% of the described pea symbiotic mutants were obtained on the backgrounds Finale, Frisson, SGE, Sparkle and Sprint-2 [14]. The following are the supplementary data related to this article.

Supplementary Table 1

SNVs detected in contigs of the pea nodule transcriptome assembly (genetic line SGE, Transcriptome Shotgun Assembly GDTM00000000.1) in comparison with lines Finale, Frisson, Sparkle, Sprint-2 and NGB1238.

Supplementary Table 2

CAPS markers developed on the basis of detected SNVs.

Specifications
Organism/cell line/tissue	Pisum sativum L., nodules or nodulated roots of pea genetic lines Finale, Frisson, Sparkle, Sprint-2 and NGB1238
Sex	–
Sequencer or array type	Illumina HiSeq 2000
Data format	Raw and analyzed
Experimental factors	–
Experimental features	The Massive Analysis of cDNA Ends (MACE) protocol was used for preparation of sequencing libraries.
Consent	Allowed for reuse.
Sample source location	Lines from Collection of All-Russia Research Institute for Agricultural Microbiology, Saint-Petersburg, Russia

7 in total

1. Sequential functioning of Sym-13 and Sym-31, two genes affecting symbiosome development in root nodules of pea (Pisum sativum L.).

Authors: A Y Borisov; S M Rozov; V E Tsyganov; E V Morzhina; V K Lebsky; I A Tikhonovich
Journal: Mol Gen Genet Date: 1997-05-20

2. Fast gapped-read alignment with Bowtie 2.

Authors: Ben Langmead; Steven L Salzberg
Journal: Nat Methods Date: 2012-03-04 Impact factor: 28.547

3. Nodulation and nitrogen fixation mutants of pea, Pisum sativum.

Authors: K C Engvild
Journal: Theor Appl Genet Date: 1987-10 Impact factor: 5.699

4. Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.

Authors: Jian Ye; George Coulouris; Irena Zaretskaya; Ioana Cutcutache; Steve Rozen; Thomas L Madden
Journal: BMC Bioinformatics Date: 2012-06-18 Impact factor: 3.169

5. The variant call format and VCFtools.

Authors: Petr Danecek; Adam Auton; Goncalo Abecasis; Cornelis A Albers; Eric Banks; Mark A DePristo; Robert E Handsaker; Gerton Lunter; Gabor T Marth; Stephen T Sherry; Gilean McVean; Richard Durbin
Journal: Bioinformatics Date: 2011-06-07 Impact factor: 6.937

6. De Novo Assembly of the Pea (Pisum sativum L.) Nodule Transcriptome.

Authors: Vladimir A Zhukov; Alexander I Zhernakov; Olga A Kulaeva; Nikita I Ershov; Alexey Y Borisov; Igor A Tikhonovich
Journal: Int J Genomics Date: 2015-11-24 Impact factor: 2.326

7. Massive analysis of cDNA Ends (MACE) and miRNA expression profiling identifies proatherogenic pathways in chronic kidney disease.

Authors: Adam M Zawada; Kyrill S Rogacev; Sören Müller; Björn Rotter; Peter Winter; Danilo Fliser; Gunnar H Heine
Journal: Epigenetics Date: 2013-11-01 Impact factor: 4.528

