Jan Ištvánek, Jana Dluhošová, Petr Dluhoš, Lenka Pátková, Jan Nedělník, Jana Řepková.
Abstract
Red clover (Trifolium pratense) is an important forage plant worldwide. This study aimed to broaden current knowledge of red clover's coding regions and to enhance its practical utilization through a specific reanalysis of a previously published assembly. A total of 42,996 genes were characterized using Illumina paired-end sequencing after manual revision of Blast2GO annotation. Genes were classified into metabolic and biosynthetic pathways in response to biological processes, with 7,517 genes being assigned to specific pathways. Moreover, 17,727 enzymatic nodes in all pathways were described. We identified 6,749 potential microsatellite loci in red clover coding sequences, and we characterized 4,005 potential simple sequence repeat (SSR) markers as generating polymerase chain reaction products preferentially within 100-350 bp. A marker density of one SSR marker per 12.39 kbp was achieved. Aligning reads against predicted coding sequences resulted in the identification of 343,027 single nucleotide polymorphism (SNP) markers, providing a marker density of one SNP marker per 144.6 bp. Altogether, 95 SSRs in coding sequences were analyzed for 50 red clover varieties, and a collection of 22 highly polymorphic SSRs with pooled polymorphism information content >0.9 was generated, thus obtaining primer pairs for application to diversity studies in T. pratense. A set of 8,623 genome-wide distributed SNPs was developed and used for polymorphism evaluation in individual plants. The polymorphism information content ranged from 0 to 0.375. Temperature switch PCR was successfully used in single-marker SNP genotyping for targeted coding sequences and for confirming heterozygosity or homozygosity at five validated loci. The predicted large sets of SSRs and SNPs throughout the genome are key to rapidly implementing genome-based breeding approaches, to identifying genes underlying key traits, and to genome-wide association studies.
Detailed knowledge of genetic relationships among breeding material can also be useful for breeders in planning crosses or for plant variety protection. Single-marker assays are useful for diagnostic applications.
Keywords: SNP; SSR; biosynthetic pathways; genetic diversity; sequencing; specific genes
Year: 2017 PMID: 28382043 PMCID: PMC5360756 DOI: 10.3389/fpls.2017.00367
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
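The two marker densities quoted in the abstract can be cross-checked against each other. The sketch below multiplies each marker count by its reported spacing to get the sequence length that each density implies. It assumes, without confirmation from the record, that both densities were computed over the same reference length; the function name `implied_length_bp` is introduced here for illustration only.

```python
# Marker counts and densities as quoted in the abstract.
ssr_markers = 4_005
ssr_bp_per_marker = 12_390      # one SSR marker per 12.39 kbp
snp_markers = 343_027
snp_bp_per_marker = 144.6       # one SNP marker per 144.6 bp


def implied_length_bp(n_markers, bp_per_marker):
    """Sequence length implied by a marker count and a bp-per-marker spacing."""
    return n_markers * bp_per_marker


ssr_length = implied_length_bp(ssr_markers, ssr_bp_per_marker)
snp_length = implied_length_bp(snp_markers, snp_bp_per_marker)

# Relative disagreement between the two implied lengths.
relative_gap = abs(ssr_length - snp_length) / ssr_length

print(f"SSR-implied length: {ssr_length / 1e6:.2f} Mbp")
print(f"SNP-implied length: {snp_length / 1e6:.2f} Mbp")
print(f"Relative gap: {relative_gap:.4%}")
```

Both figures point to roughly the same total length, which is what one would expect if the two densities share a denominator.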
List of red clover varieties and their characterization.
| National accession number | Variety | Ploidy | Country of origin |
| 13T0200096 | Agil | 2x | CZ |
| 13T0200080 | Bonus | 2x | CZ |
| 13T0200097 | Brisk | 2x | CZ |
| 13T0200034 | Chlumecky | 2x | CZ |
| 13T0230104 | Concorde | 2x | US |
| 13T0200018 | Essex broad red | 2x | GB |
| 13T0200081 | Garant | 2x | CZ |
| 13T0200357 | Gibridnij pozdnespelyj | 2x | SU |
| 13T0200492 | Grasslands hamua | 2x | NZ |
| 13T0230094 | Makimidori | 2x | JP |
| 13T0200625 | Nemaro | 2x | DE |
| 13T0200568 | Parka | 2x | PL |
| 13T0200056 | Pavo | 2x | CH |
| 13T0200496 | Radan | 2x | CZ |
| 13T0200127 | Respect | 2x | CZ |
| 13T0200029 | Slavin | 2x | CZ |
| 13T0200030 | Slavoj | 2x | CZ |
| 13T0230101 | Spurt | 2x | CZ |
| 13T0200319 | Start | 2x | CZ |
| 13T0230006 | Suez | 2x | CZ |
| 13T0200598 | Tabor | 2x | CZ |
| 13T0200039 | Trubadur | 2x | CZ |
| 13T0200020 | Van | 2x | CZ |
| 13T0230090 | Vendelin | 2x | CZ |
| 13T0200600 | Vltavin | 2x | CZ |
| 13T0230095 | Walter | 2x | CA |
| 13T0230023 | Amos | 4x | CZ |
| 13T0200058 | Astur | 4x | CH |
| 13T0200088 | Atlantis | 4x | DE |
| 13T0230132 | Beskyd | 4x | CZ |
| 13T0230024 | Bivoj | 4x | CZ |
| 13T0230021 | Blizard | 4x | CZ |
| 13T0230034 | Cyklon | 4x | CZ |
| 13T0230133 | Dolina | 4x | CZ |
| 13T0230039 | Dolly | 4x | CZ |
| 13T0230145 | Fresko | 4x | CZ |
| 13T0200571 | Hungarotetra | 4x | HU |
| 13T0200329 | Kvarta | 4x | CZ |
| 13T0200486 | Lossam | 4x | FR |
| 13T0230026 | Margot | 4x | SK |
| 13T0200476 | Radegast | 4x | CZ |
| 13T0230110 | Rezista | 4x | CZ |
| 13T0230120 | Sigord | 4x | SK |
| 13T0230134 | Sprint | 4x | CZ |
| 13T0200327 | Tatra | 4x | CZ |
| 13T0200538 | Tempus | 4x | CZ |
| 13T0230114 | Titus | 4x | CZ |
| 13T0200460 | Triton | 4x | SE |
| 13T0200597 | Vesna | 4x | CZ |
| 13T0200636 | Vulkan | 4x | CZ |
National accession numbers are those of the GeneBank of Crop Research Institute Ltd., Prague-Ruzyně, Czech Republic. Country codes: CA, Canada; CH, Switzerland; CZ, Czech Republic; DE, Germany; FR, France; HU, Hungary; JP, Japan; NZ, New Zealand; PL, Poland; SE, Sweden; SK, Slovakia; SU, Soviet Union; US, United States.
Red clover gene characteristics.
| Number of genes | 64,761 |
| Number of exons | 202,783 |
| Number of introns | 160,364 |
| Mean number of exons per gene | 3.1 |
| Mean number of introns per gene | 2.5 |
| Mean gene length (bp) | 1,480.3 |
| Mean exon length (bp) | 244.6 |
| Mean intron length (bp) | 288.5 |
| Single-exon genes | 11,559 (17.8%) |
| Number of genes | 42,996 |
| Number of exons | 160,421 |
| Number of introns | 131,968 |
| Mean number of exons per gene | 3.7 |
| Mean number of introns per gene | 3.1 |
| Mean gene length (bp) | 1,818.1 |
| Mean exon length (bp) | 244.9 |
| Mean intron length (bp) | 294.6 |
| Single-exon genes | 6,854 (15.9%) |
| Number of genes | 21,765 |
| Number of exons | 42,362 |
| Number of introns | 28,396 |
| Mean number of exons per gene | 1.9 |
| Mean number of introns per gene | 1.3 |
| Mean gene length (bp) | 813.0 |
| Mean exon length (bp) | 243.3 |
| Mean intron length (bp) | 260.1 |
| Single-exon genes | 4,705 (21.6%) |
| Number of genes | 1,316 |
| Number of exons | 2,360 |
| Number of introns | 1,382 |
| Mean number of exons per gene | 1.8 |
| Mean number of introns per gene | 1.1 |
| Mean gene length (bp) | 1,359.0 |
| Mean exon length (bp) | 655.6 |
| Mean intron length (bp) | 174.6 |
| Single-exon genes | 301 (22.9%) |
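The table above repeats the same nine rows in four blocks whose labels did not survive in this record. A quick arithmetic check still shows how the blocks relate: the per-gene means and single-exon percentages follow from the raw counts, and the counts of the second and third blocks add up to those of the first. The sketch below verifies this; the names `block_1` to `block_4` are placeholders, not labels from the source.

```python
# Counts transcribed from the four blocks of the gene-characteristics table.
# Block names are placeholders; the table as given does not label them.
blocks = {
    "block_1": {"genes": 64_761, "exons": 202_783, "introns": 160_364, "single_exon": 11_559},
    "block_2": {"genes": 42_996, "exons": 160_421, "introns": 131_968, "single_exon": 6_854},
    "block_3": {"genes": 21_765, "exons": 42_362, "introns": 28_396, "single_exon": 4_705},
    "block_4": {"genes": 1_316, "exons": 2_360, "introns": 1_382, "single_exon": 301},
}

# Reported (mean exons per gene, mean introns per gene, single-exon %).
reported = {
    "block_1": (3.1, 2.5, 17.8),
    "block_2": (3.7, 3.1, 15.9),
    "block_3": (1.9, 1.3, 21.6),
    "block_4": (1.8, 1.1, 22.9),
}


def derived(block):
    """Recompute the per-gene means and single-exon percentage from counts."""
    genes = block["genes"]
    return (
        round(block["exons"] / genes, 1),
        round(block["introns"] / genes, 1),
        round(100 * block["single_exon"] / genes, 1),
    )


for name, block in blocks.items():
    status = "matches" if derived(block) == reported[name] else "differs"
    print(f"{name}: derived {derived(block)} {status}")

# Do blocks 2 and 3 partition block 1?
partition_holds = all(
    blocks["block_2"][key] + blocks["block_3"][key] == blocks["block_1"][key]
    for key in ("genes", "exons", "introns", "single_exon")
)
print("block_2 + block_3 == block_1:", partition_holds)
```

The second block's gene count (42,996) equals the number of characterized genes given in the abstract, so the first block plausibly covers all predicted genes; that reading is an inference from the arithmetic, not something the table states.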
Figure 1. BLASTP top-hits distribution of red clover annotated genes with RefSeq database.
Figure 2. Assignment of red clover genes into Gene Ontology (GO) terms in level 2 belonging to: (A) Biological Process, (B) Molecular Function, (C) Cellular Component.
Twenty largest biosynthetic and metabolic pathways in red clover based on number of genes (enzymes) involved.
| Pathway | Number of genes |
| Purine metabolism | 1,138 |
| Starch and sucrose metabolism | 1,053 |
| Phenylalanine metabolism | 622 |
| Pentose and glucuronate interconversions | 475 |
| Phenylpropanoid biosynthesis | 469 |
| Thiamine metabolism | 428 |
| Pyrimidine metabolism | 378 |
| Glycerolipid metabolism | 364 |
| Cysteine and methionine metabolism | 354 |
| Galactose metabolism | 328 |
| Amino sugar and nucleotide sugar metabolism | 323 |
| Glycerophospholipid metabolism | 304 |
| T cell receptor signaling pathway | 297 |
| Glycolysis/Gluconeogenesis | 284 |
| Tyrosine metabolism | 273 |
| Phenylalanine, tyrosine and tryptophan biosynthesis | 272 |
| Flavonoid biosynthesis | 257 |
| Arginine and proline metabolism | 253 |
| Glutathione metabolism | 242 |
| Pyruvate metabolism | 240 |
Figure 3. Catalytic activity distribution in red clover annotated genes.
Figure 4. Comparison of gene densities and genome structure in legume model species (A) M. truncatula and (B) G. max with T. pratense. The 7 T. pratense chromosomes (DDBJ/EMBL/NCBI accession numbers: LT555306.1 - LT555312.1) are shown in orange, 8 M. truncatula chromosomes in blue, and 20 G. max chromosomes in green in the outer circles. (1) First circles represent repetitive element densities relevant to each chromosome (yellow). Gene densities (by 100 kb windows) are displayed on each chromosome as follows: (2) gene density in T. pratense (orange), M. truncatula (blue), and G. max (green) on their own chromosomes; (3) relative gene densities of T. pratense on M. truncatula and G. max chromosomes mapped on the partner's chromosomes; (4) homologous sequences and synteny regions in T. pratense with M. truncatula and T. pratense with G. max (central lines; top half is colored).
Figure 5. Statistics for 4,005 predicted SSR markers in coding sequences of red clover. (A) Basic motif frequencies in coding regions of red clover. (B) Frequencies of mono-, di-, tri-, tetra-, penta-, and hexamers plus complex SSR motifs in red clover genes containing one to six SSRs per locus. Total number of sequences containing depicted number of SSRs is shown above each column.
Statistical overview of SNP markers predicted in red clover.
| Transitions | Count | % | Transversions | Count | % |
| A <-> G | 109,284 | 50.15 | A <-> C | 32,387 | 26.75 |
| C <-> T | 108,610 | 49.85 | C <-> G | 21,276 | 17.57 |
| Total | 217,894 | 63.52 | G <-> T | 29,231 | 24.14 |
| | | | T <-> A | 38,174 | 31.53 |
| Multi-allelic SNP | 4,065 | 1.19 | Total | 121,068 | 35.29 |
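The counts in this table can be tied back to the abstract's total of 343,027 SNPs, and they also give the transition/transversion ratio, which the record does not state. The sketch below recomputes both; the variable names are illustrative. The per-substitution percentages are relative to their own class, while the three "Total" and multi-allelic percentages are relative to all SNPs.

```python
# Counts transcribed from the SNP overview table.
transitions = {"A<->G": 109_284, "C<->T": 108_610}
transversions = {"A<->C": 32_387, "C<->G": 21_276, "G<->T": 29_231, "T<->A": 38_174}
multi_allelic = 4_065

ts_total = sum(transitions.values())
tv_total = sum(transversions.values())
all_snps = ts_total + tv_total + multi_allelic

# Share of each class among all SNPs, in percent.
class_share = {
    "transitions": round(100 * ts_total / all_snps, 2),
    "transversions": round(100 * tv_total / all_snps, 2),
    "multi_allelic": round(100 * multi_allelic / all_snps, 2),
}

# Transition/transversion ratio (not reported in the record itself).
ts_tv_ratio = ts_total / tv_total

print("All SNPs:", all_snps)
print("Class shares (%):", class_share)
print(f"Ts/Tv ratio: {ts_tv_ratio:.2f}")
```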
Figure 6. SSR marker validation in red clover varieties. (A) Numbers of SSRs amplified in analyzed varieties. (B) Numbers of varieties amplified for individual SSRs. (C) Allele number distribution for SSR markers validated in red clover varieties. (D) pPIC distribution for SSR markers validated in red clover varieties.
pPIC in validated red clover SSRs with different motifs.
| Motif type | Number of SSRs | pPIC range | Mean pPIC | Standard deviation |
| Monomer | 2 | 0.567–0.693 | 0.6297 | 0.0632 |
| Dimer | 2 | 0.875–0.895 | 0.8853 | 0.0102 |
| Trimer | 67 | 0.180–0.986 | 0.6929 | 0.2227 |
| Hexamer | 2 | 0.333–0.365 | 0.3488 | 0.0160 |
| Complex | 22 | 0–0.975 | 0.6514 | 0.2734 |
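For orientation, the standard polymorphism information content of a marker with allele frequencies p_i is commonly written as PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The sketch below implements that textbook formula. It is an assumption that the pooled values (pPIC) in the table rest on the same expression applied to pooled samples, since the record does not give the computation. The formula does explain the abstract's SNP range of 0 to 0.375: a biallelic marker cannot exceed 0.375, while multi-allelic SSRs can approach 1.

```python
from itertools import combinations


def pic(allele_freqs):
    """Standard polymorphism information content for one marker.

    allele_freqs: allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity term.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


monomorphic = pic([1.0])                 # single allele, no information
biallelic_max = pic([0.5, 0.5])          # upper bound for a biallelic SNP
ten_equal_alleles = pic([0.1] * 10)      # hypothetical highly polymorphic SSR

print(monomorphic, biallelic_max, round(ten_equal_alleles, 4))
```

The ten-allele case is a made-up illustration, not a marker from the study.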
Figure 7. Study of red clover variety divergence using SSR markers (Sørensen-Dice index, .
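The Sørensen-Dice index named in this caption scores the similarity of two samples as twice the number of shared items divided by the total number of items in both. A minimal sketch for two varieties scored by presence of SSR alleles follows; the allele labels and variety names are hypothetical, and the record does not say how the resulting similarities were clustered or displayed.

```python
def dice_similarity(alleles_a, alleles_b):
    """Sørensen-Dice index between two collections of observed alleles."""
    a, b = set(alleles_a), set(alleles_b)
    if not a and not b:
        # Two empty profiles are treated as identical by convention here.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical allele profiles (marker name plus fragment size in bp).
variety_x = {"ssr1_180", "ssr1_186", "ssr2_240"}
variety_y = {"ssr1_180", "ssr2_240", "ssr2_246", "ssr3_150"}

similarity = dice_similarity(variety_x, variety_y)
distance = 1.0 - similarity  # a dissimilarity suitable as clustering input

print(f"Dice similarity: {similarity:.3f}, distance: {distance:.3f}")
```

Because the index ignores shared absences, two varieties are not counted as similar merely for both lacking an allele.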
Figure 8. SNP marker validation in red clover individuals by temperature switch method.