Souframanien Jegadeesan1,2, Avi Raizada3,4, Punniyamoorthy Dhanasekar3, Penna Suprasanna3,4.
Abstract
Blackgram [Vigna mungo (L.) Hepper] (2n = 2x = 22), an important Asiatic legume crop, is a major source of dietary protein for the predominantly vegetarian population. Here we report the first draft genome sequence of blackgram, constructed by hybrid genome assembly of Illumina reads and third-generation Oxford Nanopore reads. The final de novo assembly is ~475 Mb (82% of the genome), with a maximum scaffold length of 6.3 Mb and a scaffold N50 of 1.42 Mb. Genome analysis identified 42,115 genes with a mean coding sequence length of 1131 bp. Around 80.6% of the predicted genes were annotated. Nearly half of the assembled sequence is composed of repetitive elements: retrotransposons are the major transposable elements (47.3% of the genome), whereas DNA transposons make up only 2.29% of the genome. A total of 166,014 SSRs, including 65,180 compound SSRs, were identified, and primer pairs were designed for 34,816 SSRs. Of the 33,959 annotated proteins, 1659 contained R-gene related domains. The KIN class accounted for the majority of these proteins (905), followed by RLK (239) and RLP (188). The genome sequence of blackgram will facilitate identification of agronomically important genes and accelerate the genetic improvement of blackgram.
Year: 2021 PMID: 34045617 PMCID: PMC8160138 DOI: 10.1038/s41598-021-90683-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
De novo assembly and annotation statistics of the blackgram genome.
| Statistic | Value |
|---|---|
| Scaffolds generated | 1085 |
| Maximum scaffold length (bp) | 6,343,804 |
| Minimum scaffold length (bp) | 510 |
| Average scaffold length (bp) | 438,629 |
| Median scaffold length (bp) | 67,909 |
| Total scaffold length (bp) | 475,913,455 |
| Scaffolds ≥ 100 bp | 1085 |
| Scaffolds ≥ 200 bp | 1085 |
| Scaffolds ≥ 500 bp | 1085 |
| Scaffolds ≥ 1 Kbp | 1048 |
| Scaffolds ≥ 10 Kbp | 920 |
| Scaffolds ≥ 1 Mbp | 168 |
| N50 value (bp) | 1,426,686 |
| Number of genes | 42,115 |
| Average gene length | 1131 bp |
| Maximum gene length | 23.17 kb |
| Minimum gene length | 120 bp |
| Number of genes annotated | 33,959 |
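The N50 value reported above summarizes assembly contiguity: it is the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and are not blackgram data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length of
    the scaffold at which the running total first reaches half of the
    overall assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (illustrative lengths only).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))
```

For the toy input the total is 300, and the running sum first reaches 150 at the second-longest scaffold, so the function returns 80. Applied to the 1085 real scaffold lengths, the same procedure would be expected to give the value listed in the table.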
Figure 1. Gene ontology chart of Vigna mungo.
Figure 2. Venn diagram showing shared orthologous gene clusters among V. mungo, V. radiata, V. unguiculata and V. angularis.
Annotated repeat abundances in blackgram.
| Genome assembly size (Mbp) | 475.91 |
| 49.6 | |
| Class I: LTR Retrotransposon (RLX) | 47.3 |
| | 13.38 |
| | 31.46 |
| unclassified LTR (RLX) | 3.15 |
| Class II: TIR DNA transposon (DXX) | 2.29 |
| Helitron (DHH) | 1.3 |
| PIF-Harbinger (DTH) | 0.40 |
| Mariner (DTT) | 0.33 |
| Mutator (DTM) | 0.11 |
| hAT (DTA) | 0.1 |
| Class I/class II ratio | 15.6 |
| 0.6 |
The major represented classes, super-families, and subgroups of transposable elements as determined by automated annotation and classified according to the scheme of Wicker et al.[27], as well as other major repeat types are presented.
Number and distribution of SSRs identified in the blackgram (Vigna mungo) cv. Pant U-31 genome.
| Description | Value |
|---|---|
| Total number of sequences examined | 1085 |
| Total size of examined sequences (bp) | 475,913,455 |
| Total number of identified SSRs | 166,014 |
| Number of SSR containing sequences | 989 |
| Number of sequences containing more than 1 SSR | 953 |
| Number of compound SSRs (i.e. c) | 65,180 |
| p2 | 63,220 |
| p3 | 40,735 |
| p4 | 60,512 |
| p5 | 1146 |
| p6 | 402 |
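To illustrate how perfect SSRs of the kind counted above can be located in a sequence, here is a minimal regex-based sketch. It is not the tool or the parameter set used in the study: the function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS`, and the toy sequence are all assumptions chosen for illustration, and compound SSRs are not handled.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is 0-based. Motifs that are themselves repeats of a shorter
    unit (for example "ATAT" or "AA") are skipped so each SSR is reported
    only under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Group 1 captures one motif; \1{n,} requires at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            is_redundant = any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            )
            if is_redundant:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence (illustrative only): an (AT)7 repeat and an (AAG)5 repeat.
toy_seq = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TT"
print(find_ssrs(toy_seq))
```

On the toy sequence this reports a dinucleotide repeat, `(3, 'AT', 7)`, and a trinucleotide repeat, `(21, 'AAG', 5)`. Counts obtained on a real assembly depend strongly on the chosen thresholds, so they would not necessarily match the table.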
Resistance gene domains/motifs predicted in proteins identified from whole-genome sequencing of blackgram cultivar Pant U-31, using the DRAGO pipeline of the Plant Resistance Gene database.
| Domain/motif types | Number of proteins | Class | Number of proteins |
|---|---|---|---|
| TM-Kinase | 688 | KIN | 905 |
| TM-Kinase-LRR | 235 | RLK | 239 |
| Kinase | 219 | RLP | 188 |
| LRR-TM | 188 | CK | 102 |
| CC-TM-Kinase | 82 | N | 66 |
| NBS-TM | 47 | L | 40 |
| LRR | 40 | NL | 31 |
| NBS-LRR-TM | 23 | CN | 20 |
| CC-Kinase | 20 | CNL | 18 |
| NBS-CC-TM | 17 | CL | 14 |
| NBS-CC-TM-LRR | 17 | T | 12 |
| NBS | 16 | TNL | 9 |
| TIR | 10 | CLK | 3 |
| TM | 9 | TL | 3 |
| CC-LRR-TM | 8 | CTNL | 2 |
| NBS-LRR-TM-TIR | 7 | NK | 2 |
| CC-LRR | 6 | TRAN | 2 |
| NBS-LRR | 6 | C | 1 |
| NBS-CC | 3 | CT | 1 |
| LRR-TM-Kinase-CC | 3 | TN | 1 |
| LRR-Kinase | 2 | Total proteins | 1659 |
| TM-TIR | 2 | | |
| NBS-TM-TIR | 2 | | |
| LRR-TM-TIR | 2 | | |
| NBS-CC-TM-TIR-LRR | 2 | | |
| LRR-TIR | 1 | | |
| NBS-CC-LRR | 1 | | |
| NBS-LRR-TIR | 1 | | |
| CC-TM | 1 | | |
| CC-TM-TIR | 1 | | |
| Total proteins | 1659 | | |