Jae Il Lyu, Rahul Ramekar, Jung Min Kim, Nguyen Ngoc Hung, Ji Su Seo, Jin-Baek Kim, Ik-Young Choi, Kyong-Cheul Park, Soon-Jae Kwon.
Abstract
Faba bean (Vicia faba L.), a globally important grain legume providing a stable source of dietary protein, was one of the earliest plant cytogenetic models. However, the lack of draft genome annotations and unclear structural information on mRNA transcripts have impeded its genetic improvement. To address this, we sequenced the faba bean leaf transcriptome using the PacBio single-molecule long-read isoform sequencing platform. We identified 28,569 nonredundant unigenes, ranging from 108 to 9669 bp, with a total length of 94.5 Mb. Many unigenes (3597, 12.5%) had 2-20 isoforms, indicating a highly complex transcriptome. Approximately 96.5% of the unigenes matched sequences in public databases. The predicted proteins and transcription factors included NB-ARC, Myb_domain, C3H, bHLH, and heat shock proteins, implying that this genome has an abundance of stress resistance genes. To validate our results, we selected WCOR413-15785, DHN2-12403, DHN2-14197, DHN2-14797, COR15-14478, and HVA22-15 unigenes from the ICE-CBF-COR pathway to analyze their expression patterns in cold-treated samples via qRT-PCR. The expression of dehydrin-related genes was induced by cold stress. The assembled data provide the first insights into the deep sequencing of full-length RNA from faba bean at the single-molecule level. This study provides an important foundation to improve gene modeling and protein prediction.
Year: 2021 PMID: 34702863 PMCID: PMC8548339 DOI: 10.1038/s41598-021-00506-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Classification and cluster summary of PacBio SMRT Iso-seq data for faba bean.
| Library | 1–3 kb | > 3 kb | Total |
|---|---|---|---|
| Reads | 309,781 | 538,149 | 847,930 |
| Reads with 5′ and 3′ primers | 271,156 | 483,557 | 754,713 |
| Non-concatamer reads with 5′ and 3′ primers | 253,491 | 477,987 | 731,478 |
| Non-concatamer reads with 5′ and 3′ primers and Poly-A Tail | 253,183 | 477,118 | 730,301 |
| Reads without primers | 38,625 | 54,592 | 93,217 |
| Number of polished, high-quality isoforms | 19,609 | 25,603 | 45,212 |
| Number of polished low-quality isoforms | 42 | 438 | 480 |
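The filtering steps in the table above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the dictionary keys and function names are hypothetical labels chosen for illustration, and the counts are copied from the table. It checks that each reported total equals the sum of the two size-selected libraries and computes what share of all reads survives each filter.

```python
# Read counts copied from the table above, as
# (1-3 kb library, >3 kb library, reported total).
# The dictionary keys are illustrative labels, not names used by the authors.
READ_COUNTS = {
    "reads": (309_781, 538_149, 847_930),
    "with_primers": (271_156, 483_557, 754_713),
    "non_concatamer": (253_491, 477_987, 731_478),
    "non_concatamer_polya": (253_183, 477_118, 730_301),
    "without_primers": (38_625, 54_592, 93_217),
}


def totals_consistent(counts):
    """Return True if every reported total equals the sum of the two libraries."""
    return all(small + large == total for small, large, total in counts.values())


def fraction_of_reads(counts, key):
    """Share of all reads (total column) that fall into the given category."""
    return counts[key][2] / counts["reads"][2]


if __name__ == "__main__":
    print("row totals consistent:", totals_consistent(READ_COUNTS))
    for key in ("with_primers", "non_concatamer", "non_concatamer_polya"):
        print(f"{key}: {fraction_of_reads(READ_COUNTS, key):.1%}")
```

On these numbers, roughly 89% of reads carry both primers and roughly 86% remain after the concatamer and poly-A filters; reads with and without primers together account for every read in the total column.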
Summary of high-quality non-redundant Iso-seq data for faba bean.
| | Collapsing redundant sequence | Collapsing redundant isoforms |
|---|---|---|
| Total number of sequences | 33,880 | 28,569 |
| Total length (bp) | 94,401,312 | 74,743,641 |
| Maximum length | 9666 | 9666 |
| Minimum length | 108 | 108 |
| Total average length | 2786 | 2616 |
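The average lengths in the table above follow directly from the total length divided by the number of sequences. The short sketch below reproduces that arithmetic under the assumption that "Total average length" is a simple mean in bp; the variable and function names are hypothetical, and the values are copied from the table.

```python
# Values copied from the summary table above.
# Keys are illustrative labels for the two collapsing stages.
SUMMARY = {
    "redundant_sequences_collapsed": {
        "count": 33_880, "total_bp": 94_401_312, "reported_mean": 2786,
    },
    "redundant_isoforms_collapsed": {
        "count": 28_569, "total_bp": 74_743_641, "reported_mean": 2616,
    },
}


def mean_length(entry):
    """Mean sequence length in bp: total length divided by sequence count."""
    return entry["total_bp"] / entry["count"]


def sequences_merged(summary):
    """Number of sequences merged away between the two collapsing stages."""
    return (summary["redundant_sequences_collapsed"]["count"]
            - summary["redundant_isoforms_collapsed"]["count"])


if __name__ == "__main__":
    for stage, entry in SUMMARY.items():
        print(f"{stage}: computed {mean_length(entry):.1f} bp, "
              f"reported {entry['reported_mean']} bp")
    print("sequences merged in the second stage:", sequences_merged(SUMMARY))
```

Both computed means round to the reported values, which supports reading the last row as total length divided by sequence count.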
Figure 1. Number of isoforms identified for unigenes in faba bean.
Annotation of isoforms on the basis of public databases.
| Database | Annotation number | 300 ≤ length < 1000 | Length ≥ 1000 bp |
|---|---|---|---|
| NR | 27,811 | 3131 | 24,680 |
| UniProt | 24,976 | 2397 | 22,530 |
| Pfam | 25,630 | 2464 | 23,137 |
| EggNOG | 27,628 | 2956 | 24,618 |
| NT | 27,580 | 3017 | 24,450 |
| TAIR | 26,679 | 2672 | 23,970 |
| Common | 22,891 | 2097 | 20,772 |
Figure 2. Basic local alignment search tool (BLAST) top-hit species distribution. The substantial similarity to sequences from Medicago truncatula and Cicer arietinum may reflect a close phylogenetic relationship.
Figure 3. COG functional classification of isoforms.
Figure 4. Transcription factors among the isoforms.
Figure 5. Gene ontology classifications. The results are summarized according to the three main categories: biological process, cellular component, and molecular function. The GO terms assigned to faba bean transcripts with matches in the UniProt database revealed by a BLAST search are presented.
Top 10 protein family domains encoded by faba bean unigenes revealed by a search of the Pfam database.
| No. | Protein family | Unigenes |
|---|---|---|
| 1 | Protein kinase superfamily | 1135 |
| 2 | Protein tyrosine kinase | 517 |
| 3 | NB-ARC domain | 355 |
| 4 | RNA recognition motif | 350 |
| 5 | Chlorophyll A-B binding protein | 194 |
| 6 | ATPase family associated | 170 |
| 7 | Hydrolase | 163 |
| 8 | DEAD (DEAD box helicase) | 162 |
| 9 | KAP (Kinesin-associated protein) | 156 |
| 10 | GTP_EFTU (Elongation factor binding domain) | 134 |
Representative gene families related to stress resistance in faba bean.
| Sr. no. | Gene families | Number of unigenes |
|---|---|---|
| 1 | NB-ARC | 355 |
| 2 | Heat shock protein families | 173 |
| 3 | Myb_DNA binding | 141 |
| 4 | C3H | 95 |
| 5 | ARF | 73 |
| 6 | C2H2 | 51 |
| 7 | bHLH | 48 |
| 8 | FAR | 41 |
| 9 | bzip/ABF | 29 |
| 10 | AP2/ERF | 28 |
| 11 | LEA | 26 |
| 12 | WRKY | 23 |
| 13 | COR | 20 |
| 14 | NAC | 10 |
Isoform information for shortlisted cold tolerance genes.
| Unique Isoform ID | Unigene | Number of isoforms | Length |
|---|---|---|---|
| PB.4465.1 | WCOR413-15785 | 1 | 945 |
| PB.1874.1 | DHN2-12403 | 2 | 1276 |
| PB.3249.1 | DHN2-14197 | 1 | 1037 |
| PB.3731.1 | DHN2-14797 | 2 | 1062 |
| PB.4588.1 | HVA22-15951 | 1 | 929 |
| PB.3466.1 | COR15-14478 | 2 | 1072 |
Figure 6. Plant development and expression of cold-related genes in response to cold stress treatment. (a) Shoot and (b) root development at 4 and −7 °C. (c) Expression levels of candidate cold tolerance genes were determined from the PacBio data. PI 469181 (winter type); PI 271634 (spring type).