Literature DB >> 33148177

Genome-wide identification of the restorer-of-fertility-like (RFL) gene family in Brassica napus and expression analysis in Shaan2A cytoplasmic male sterility.

Luyun Ning1, Hao Wang2, Dianrong Li2, Yonghong Li2, Kang Chen1, Hongbo Chao1, Huaixin Li1, Jianjie He1, Maoteng Li3,4.   

Abstract

BACKGROUND: Cytoplasmic male sterility (CMS) is very important in hybrid breeding. The restorer-of-fertility (Rf) nuclear genes rescue the sterile phenotype. Most of the Rf genes encode pentatricopeptide repeat (PPR) proteins.
RESULTS: We investigated the restorer-of-fertility-like (RFL) gene family in Brassica napus. A total of 53 BnRFL genes were identified. While most of the BnRFL genes were distributed on 10 of the 19 chromosomes, gene clusters were identified on chromosomes A9 and C8. The number of PPR motifs in the BnRFL proteins varied from 2 to 19, and the majority of BnRFL proteins harbored more than 10 PPR motifs. An interaction network analysis was performed to predict the interacting partners of RFL proteins. Tissue-specific expression and RNA-seq analyses between the restorer line KC01 and the sterile line Shaan2A indicated that BnRFL1, BnRFL5, BnRFL6, BnRFL8, BnRFL11, BnRFL13 and BnRFL42 located in gene clusters on chromosomes A9 and C8 were highly expressed in KC01.
CONCLUSIONS: In the present study, identification and gene expression analysis of RFL gene family in the CMS system were conducted, and seven BnRFL genes were identified as candidates for the restorer genes in Shaan2A CMS. Taken together, this method might provide new insight into the study of Rf genes in other CMS systems.

Entities:  

Keywords:  CMS; PPR; RFL; RNA-sequencing; Rf

Mesh:

Year:  2020        PMID: 33148177      PMCID: PMC7641866          DOI: 10.1186/s12864-020-07163-z

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

The male sterile line was widely used in hybrid breeding, which mainly included chemical induced male sterility (CIMS), genic male sterility (GMS) and cytoplasmic male sterility (CMS) [1, 2]. In CMS, traits are maternally inherited, primarily due to the rearrangement of mitochondrial DNA and inability to generate normal pollen [3]. The restorer-of-fertility (Rf) nuclear genes have been used to rescue the damage induced by mitochondrial DNA rearrangements. In Brassica napus, there are four major CMS systems which have been commonly used in rapeseed production: pol CMS [4], nap CMS [5], Ogu CMS [6], and Shaan2A CMS [7]. Shaan2A CMS and pol CMS are the most widely used CMS systems in B. napus [8]. What’s more, in Shaan2A CMS system, the cytoplasm type of its restorer line KC01 belongs to pol CMS type [9]. The first Rf gene encoding a putative aldehyde dehydrogenase was cloned in the T-CMS of maize (Zea mays); the encoded protein either performs acetaldehyde detoxification or interacts with the male sterile mitochondrial proteins [10]. To date, many other Rf genes have been identified in different CMS systems. Most of these Rf genes encode pentatricopeptide repeat (PPR) proteins. Examples of such Rf genes include Rf-PPR592 in petunia (Petunia hybrida) [11], Rfo [12] and orf687 [13] in radish (Raphanus sativus), Rf4 [14], Rf5 [15] and Rf6 [16] in rice (Oryza sativa) and Rfp [17] and Rfn [18] in rapeseed. The PPR proteins were first identified as tandem repeats of degenerate 35-amino-acid motifs (PPR motifs) in Arabidopsis thaliana [19] and were classified into PLS and P subfamilies, according to the PPR motif structure [20]. The PPR gene family is a large family comprising 441 members in Arabidopsis [21], 491 members in rice [22] and 626 members in poplar (Populus alba) [23]. Except for the PPR13 in sorghum (Sorghum bicolor), most of the RF-related PPR proteins belong to the P subfamily and lack the catalytic sites for RNA editing or binding [15]. Two partner proteins of the RF-related PPR proteins have been reported, including GRP162, which associates with RF5 [15], and hexokinase 6 (HXK6), which functions together with RF6 [16] to rescue CMS in rice. To date, all reported Rf genes have been identified via genetic mapping, which is a time-consuming method and takes several years to narrow down the genomic region of interest. However, the concept of restorer-of-fertility-like (RFL) gene was put forward in 2011, and 212 RFL genes were identified based on BLAST searches using the Rf-PPR592 and Rf5 sequences against the genomes of 13 different dicot and monocot species, including Arabidopsis, soybean (Glycine max) and sorghum [24]. AtRFL2, together with RNase P, regulates the processing of mitochondrial orf291 RNA [25]. AtRFL4 is needed for processing the 5′-end of nad4 mRNA in mitochondria [26]. AtRFL9, also known as RNA PROCESSING FACTOR 4 (RPF4), participates in the generation of extra 5′ termini of ccmB transcripts in Arabidopsis [27]. These results enhanced our understanding of mitochondrial RNA processing in plants and provided novel insights into the function of RFL proteins. In the present study, we performed BLAST searches using the sequences of Rf-PPR592 and AtRFLs against the genome of rapeseed and identified 53 BnRFL genes. Based on these 53 BnRFL genes, candidate Rf genes were analyzed in the Shaan2A CMS system by RNA-seq and tissue-specific expression analyses. Our data provide a strong foundation for the study of Rf genes in other CMS systems.

Results

Identification of BnRFL genes

A total of 53 BnRFLs were identified in this study, based on the homology with the RFL genes in Arabidopsis (AtRFL1–26) and petunia (Rf-PPR592) (Table 1). First, sequences of all 26 AtRFLs were searched in the database one at a time. Of the 26 AtRFL genes, nine showed no homologs in B. napus, including AtRFL5, AtRFL6, AtRFL9, AtRFL10, AtRFL14, AtRFL15, AtRFL16, AtRFL25 and AtRFL26. Then, using Rf-PPR592 as a reference [24], 26 BnRFLs were identified (E-value <1e− 100). Taken together, there should be a total of 53 BnRFLs genes in B. napus. We also identified two known restorer genes, BnRFL6 (Rfn) and BnRFL13 (Rfp) (previously identified in the nap and pol CMS systems) and four candidate restorer genes (BnRFL2, BnRFL10, BnRFL11 and BnRFL42; previously identified in B. napus by fine genetic mapping) [31].
Table 1

Summary of the chromosomal location of BnRFL genes and characteristics and subcellular localization of the encoded proteins

A. thaliana geneA. thaliana IDB. napus geneChr.Gene positionPPR numberProtein Length (AA)pI [28]Molecular weight (Da) [28]GRAVY [28]Subcellular location
StartEndPprowler [29]TargetP [30]
AtRFL1At1G06580BnRFL1C845,301,50845,303,299114978.5556,608.610.183MbM
AtRFL2At1G12300BnRFL2A942,781,90142,784,196156375.9770,948.870.026MM
BnRFL3A942,776,62842,778,959156476.3171,868.60.038MM
BnRFL4C842,874,87642,877,570146405.6171,088.910.032MM
BnRFL5C842,856,82042,859,076156266.269,734.490.01MM
BnRFL6A942,851,65442,853,921156298.3369,640.90.16MM
BnRFL7A125,159,03825,161,294156266.1670,426.20.001MSc
BnRFL8C842,947,16542,949,404156217.3468,638.380.116MM
AtRFL3At1G12620BnRFL9A825,819,29825,821,504146126.6568,324.47−0.043MM
BnRFL10Una20,13522,471156488.2172,194.660.043dCe
BnRFL11A942,353,34542,355,681156488.272,022.510.067C
BnRFL12C817,918,17517,920,367156087.3767,610.72−0.023MM
BnRFL13A942,707,70942,710,051156508.4173,339.39−0.054MM
BnRFL14C817,950,49817,951,75983234.6735,505.170.06
BnRFL15C842,022,99942,023,77652155.2923,581.510.067MS
AtRFL4At1G12700BnRFL16C842,872,13742,874,315146045.4767,048.930.02MM
BnRFL17A942,773,73042,775,903146036.5766,969.91−0.018MM
AtRFL7At1G62680BnRFL18Un30,31232,846125388.4560,152.190.016MC
BnRFL19A97,486,2807,488,209125358.5659,967.030.012MM
BnRFL20A99,401,9819,403,866125238.3358,558.11−0.002MM
BnRFL21A99,446,8429,448,591114404.8148,891.560.127M
AtRFL8At1G62720BnRFL22A97,519,6737,521,422134858.3955,092.25−0.012MC
AtRFL11At1G62930BnRFL23C145,473,25845,480,422154876.3655,189.36−0.013SS
BnRFL24A99,204,2729,204,88441785.2920,006.320.001
AtRFL12At1G63070BnRFL25A99,203,5519,204,32841885.4721,000.35−0.047
AtRFL13At1G63080BnRFL26C453,751,55153,751,94921076.2611,658.37−0.133
BnRFL27C453,761,78453,762,10721076.2611,658.37−0.133
AtRFL17At1G63400BnRFL28C124,931,74524,934,613197966.1789,097.00−0.119MC
BnRFL29A115,383,42115,386,289187966.888,897.86−0.12MC
AtRFL18At1G64100BnRFL30A99,036,0049,038,611176815.9476,309.560.017MM
BnRFL31A99,033,3469,035,550176815.9476,309.560.017MM
BnRFL32C145,499,11345,501,687176816.2476,106.470.022MM
AtRFL19At1G64580BnRFL33A98,665,3128,667,139125079.157,063.97−0.01MM
BnRFL34C540,329,83940,331,444104459.1650,074.120.009
BnRFL35Un47,83352,673124919.0555,206.810.062MM
BnRFL36A99,201,2549,203,025124919.0455,221.770.036MM
BnRFL37Un21,88323,773104459.1650,074.120.009
AtRFL20At3G16710BnRFL38A129,342,55529,344,457125068.6457,356.770.032MM
BnRFL39Un67,69069,826104538.9451,204.740.017MC
AtRFL21At3G22470BnRFL40A825,799,16225,801,437136316.9970,631.480.008MM
BnRFL41C842,220,82342,223,242156718.4475,390.4−0.045M
BnRFL42A942,760,10842,762,458156526.7373,148.36−0.005MM
BnRFL43C114,072,48714,074,026114235.446,851.41−0.088
BnRFL44A825,805,81425,806,64152294.6325,248.08−0.064
BnRFL45C817,950,05417,950,65241658.1218,161.40.335SS
AtRFL22At4G26800BnRFL46A111,953,78811,955,598105028.8656,499.9−0.017MM
BnRFL47C120,149,19920,151,006115018.9456,441.830MM
AtRFL23At5G16640BnRFL48Un18,56220,747125018.4656,334.670.011MC
BnRFL49A1015,694,47515,696,271114988.7356,128.710.018MC
AtRFL24At5G41170BnRFL50Un83,03785,098114818.7654,110.040.023MM
BnRFL51A411,026,72411,028,448114788.8453,9240.064MM
BnRFL52A927,526,10027,529,023168117.0692,429.64−0.364MM
BnRFL53C755,839,98455,843,142148217.8993,225.71−0.308MM
Rf-PPR592 [11]145927.8167,340.37−0.071MM
Rf4 [14]187826.5686,282.74−0.037MM
Rf5 [15]177916.187,614.43−0.013MM
Rf6 [16]157988.488,617.48−0.096MM
Rfo [12]176874.9676,500.420.022MM

a Unplaced Scaffold

b Mitochondrion

c Secretory pathway

d Any other location

e Chloroplast

Summary of the chromosomal location of BnRFL genes and characteristics and subcellular localization of the encoded proteins a Unplaced Scaffold b Mitochondrion c Secretory pathway d Any other location e Chloroplast The number of PPR motifs in the BnRFL proteins varied from 2 to 19, although most of the BnRFL proteins contained at least 10 PPR motifs and the average number of PPR motifs was 12 (Table 1). Approximately one-fifth of the BnRFLs showed relatively low pI (< 6), whereas nearly half of the BnRFLs showed relatively high pI (> 8). The molecular weight of these RFL proteins ranged from 11.7–92.4 kDa. Additionally, the GRAVY value of nearly two-fifth BnRFLs and most of the selected Rf genes was less than 0, indicating that these RFL proteins were hydrophilic. Most of the BnRFLs were predicted to localize to the mitochondria, which is consistent with the subcellular localization of RF proteins (Table 1).

Chromosomal location and structural analysis of BnRFL genes

First, we downloaded the chromosomal distribution of AtRFLs from The Arabidopsis Information Resource (TAIR) (Fig. 1a). All 26 AtRFLs were located in a cluster on chromosome 1. Of the 53 BnRFLs identified in this study, 46 BnRFLs were distributed unevenly on 10 of the 19 chromosomes, and 18 and 10 BnRFLs formed highly dense clusters on chromosomes A9 and C8, respectively. The remaining seven BnRFLs were located on the unmapped scaffold (Fig. 1b and c).
Fig. 1

Distribution of the AtRFL and BnRFL genes on chromosomes. a Chromosomal distribution of AtRFLs. b Number of BnRFLs on different chromosomes. c Chromosomal distribution of BnRFLs

Distribution of the AtRFL and BnRFL genes on chromosomes. a Chromosomal distribution of AtRFLs. b Number of BnRFLs on different chromosomes. c Chromosomal distribution of BnRFLs Next, we determined the exon-intron structure of the BnRFL genes and a few known restorer genes (Additional file 1). Most of the BnRFLs were intron-less, similar to the restorer genes, such as Rf4, Rf5 and Rf6, in rice CMS line. Of the 53 BnRFL genes, 10 contained a single intron, similar to the Rf genes, Rfk1, Rfob and orf687, in radish. Notably, the intron in BnRFL23 was more than 4 kb in length, unlike other BnRFLs. Because PPR proteins generally contain tandem repeats of PPR motifs, we searched for the PPR motifs in the BnRFL proteins and a few known restorer proteins (Additional file 2). To investigate whether BnRFLs contained additional motifs, 53 BnRFLs and 9 known restorer proteins from other species were submitted to MEME. The results showed 20 motifs in the BnRFL proteins (Fig. 2). Interestingly, all of the identified RFL proteins contained motif 1, which contained 80 amino acids.
Fig. 2

Distribution of 20 motifs identified in BnRFL proteins, and sequence of the conserved motif 1

Distribution of 20 motifs identified in BnRFL proteins, and sequence of the conserved motif 1

Phylogenetic and Syntenic analysis

To identify the homologs of BnRFLs in different monocot and dicot species, multiple sequence alignments were performed and sequence similarity was determined. The rice RF5 protein was used for BLAST searches against the rice and maize genomes, and Rf-PPR592 was used for BLAST searches against the radish genome. An additional 16 OsRFLs (including 4 reported restorer genes), 9 ZmRFLs, and 22 RsRFLs (including 4 known restorer genes) were identified (E-value <1e− 100). Phylogenetic analysis revealed that RFLs mainly formed two separate clusters, and RFLs in monocot and dicot species were clustered together, respectively (Fig. 3a). Additionally, two reported restorer genes (BnRFL6 and BnRFL13) and four candidate restorer genes (BnRFL2, BnRFL10, BnRFL11 and BnRFL42) clustered together. Six additional BnRFLs (BnRFL3, BnRFL4, BnRFL5, BnRFL8, BnRFL15 and BnRFL41) clustered together with the reported and candidate restorer genes. These 12 BnRFLs have been deeply investigated in the following study.
Fig. 3

Phylogenetic and syntenic analysis of RFL genes. a Phylogenetic analysis of RFLs in rapeseed (Brassica napus), Arabidopsis thaliana, rice (Oryza sativa), maize (Zea mays) and radish (Raphanus sativus). Monocot RFLs are indicated in light blue. Dicot RFLs are indicated in light purple. RFLs in the cluster of known RF proteins are indicated in pink. Red solid circles, known RF proteins; red diamonds, candidate RF proteins in a previous study. Yellow, purple, light blue and dark blue circles and green triangles represent the RFL genes in R. sativus, B. napus, O. sativa, Z. mays and A. thaliana, respectively. Numbers at nodes (range 0–100) indicate the reliability of the corresponding branch; higher the number, higher the reliability of the branch. b Synteny analysis of the RFL genes in A. thaliana, B. napus, B. oleracea and B. rapa. AtChr1–5, A. thaliana chromosomes 1–5; BraA1-A10, B. rapa chromosomes 1–10; BolC1-C9, B. oleracea chromosomes 1–9; BnaA1-A10, B. napus chromosomes A1-A10; BnaC1-C9, B. napus chromosomes C1-C9

Phylogenetic and syntenic analysis of RFL genes. a Phylogenetic analysis of RFLs in rapeseed (Brassica napus), Arabidopsis thaliana, rice (Oryza sativa), maize (Zea mays) and radish (Raphanus sativus). Monocot RFLs are indicated in light blue. Dicot RFLs are indicated in light purple. RFLs in the cluster of known RF proteins are indicated in pink. Red solid circles, known RF proteins; red diamonds, candidate RF proteins in a previous study. Yellow, purple, light blue and dark blue circles and green triangles represent the RFL genes in R. sativus, B. napus, O. sativa, Z. mays and A. thaliana, respectively. Numbers at nodes (range 0–100) indicate the reliability of the corresponding branch; higher the number, higher the reliability of the branch. b Synteny analysis of the RFL genes in A. thaliana, B. napus, B. oleracea and B. rapa. AtChr1–5, A. thaliana chromosomes 1–5; BraA1-A10, B. rapa chromosomes 1–10; BolC1-C9, B. oleracea chromosomes 1–9; BnaA1-A10, B. napus chromosomes A1-A10; BnaC1-C9, B. napus chromosomes C1-C9 Next, we examined the synteny of BnRFLs with their homologs in Arabidopsis, B. rapa and B. oleracea (Fig. 3b). The results showed syntenic relationships between AtRFL genes on chromosome 1 and RFL genes on chromosomes BraA8, BraA9, BolC8, BnaA8, BnaA9, BnaC3, BnaC8 and BnaC9. The AtRFL gene on chromosome 3 showed synteny with RFL genes on BraA1 and BnaA1. The AtRFL genes on chromosome 4 showed synteny with RFL genes on BraA1, BolC1, BnaA1 and BnaC1, and the AtRFL genes on chromosome 5 showed synteny with RFL genes on BraA4, BnaA4, BnaC4 and BnaC9. The Ks and Ka values indicate the evolutionary pressure on species. A Ka/Ks ratio < 1indicates functional constraint, whereas Ka/Ks ratio > 1 indicates positive selection [32]. To explore the selection pressure on BnRFLs, we calculated the Ka/Ks ratios (Additional file 3). All BnRFL genes showed a Ka/Ks ratio of 0.1–0.7. The KaKs ratio of most of the BnRFLs was relatively low (< 0.4). However, BnRFL46 and BnRFL47 showed relatively high Ka/Ks ratios (> 0.6).

Interaction analysis of AtRFL proteins

Most of the RFL proteins belong to the P subfamily and need to interact with other proteins to perform RNA processing [15]. To predict the interacting partners of RFL proteins (no B. napus information in STRING database), an interaction network for AtRFLs were constructed based on STRING 10.5 and Cytoscape 3.6.1. Except AtRFL10, which did not have interaction information in the STRING database, 25 AtRFLs and their predicted partners are shown in Fig. 4 and Additional file 4. Interestingly, AtRFL11, AtRFL12 and AtRFL13 and three HXKs, including HXK1, HXK2 and HXK3, shared a common interacting protein, namely replication factor C2 (RFC2), a multi-subunit complex critical for high-speed ATP-dependent DNA synthesis [33]. No homologs of AtRFL25 were identified in B. napus. Approximately one-quarter of the BnRFL genes were homologous to AtRFL2 and AtRFL3. Further analysis revealed that AtRFL2 and AtRFL3 interact with AtRFL25, which showed interaction with the glycine-rich proteins, GRP7 and GR-RBP2 (Fig. 4). Moreover, AT1G48510, SURF1, COX15 and COX11 were predicted to interact with atp6–1, AT3G48810, NAD9, CCMH and most of the AtRFLs.
Fig. 4

Interaction analysis of AtRFLs. Purple nodes represent AtRFLs. Proteins interacting with AtRFLs, shown in other colors, were searched in the STRING database

Interaction analysis of AtRFLs. Purple nodes represent AtRFLs. Proteins interacting with AtRFLs, shown in other colors, were searched in the STRING database

Expression analysis of BnRFL genes

Based on the results of phylogenetic analysis, 12 BnRFL genes, which were mentioned in the phylogenetic analysis, were selected for tissue-specific expression analysis in the sterile line Shaan2A, the maintainer line Shaan2B and the restorer line KC01 by qRT-PCR. Although BnRFL10 and BnRFL11 were located on different chromosomes, the coding sequence (CDSs) of these genes were highly similar (identity = 1926/1947; 99%), and it was difficult to distinguish them by qRT-PCR. Therefore, we finally analyzed the expression of 11 BnRFL genes. In the restorer line, the expression of BnRFL6, BnRFL13 and BnRFL42 was lower in leaves than in the perianth (Fig. 5a, Additional file 5). The majority of the selected BnRFLs showed higher expression level in MA when compared with leaves, except for BnRFL2, BnRFL3 and BnRFL4. However, the expression of the BnRFL genes, except BnRFL41, was lower in the gynoecium and LA when compared with leaves. Compared with Shaan2A tissues, the expression of 11 BnRFL genes was higher in KC01 tissues, especially in MA (Fig. 5b, Additional file 5). What’s more, the expression of these genes in Shaan2B LA was higher than those in Shaan2A (Fig. 5b, Additional file 5). However, the expression of most of these BnRFLs was lower in the gynoecium of the restorer line than in that of Shaan2A.
Fig. 5

Expression profiles of BnRFL genes. a Expression profiles of BnRFL genes in different tissues of KC01. Gene expression in other tissues was calculated relative to that in leaves. The log2X-normalized ratios are shown. b Comparative expression analysis of BnRFL genes in different tissues of KC01, Shaan2B and Shaan2A. Gene expression levels in KC01 and Shaan2B was calculated relative to those in Shaan2A. c RNA-seq analysis of the differentially expressed BnRFLs at YB and SA stage. Gene expression levels in Shaan2A was calculated relative to those in KC01. Red indicated higher expression levels. Green represented the lower expression levels. ‘-’ indicates no significant difference in expression

Expression profiles of BnRFL genes. a Expression profiles of BnRFL genes in different tissues of KC01. Gene expression in other tissues was calculated relative to that in leaves. The log2X-normalized ratios are shown. b Comparative expression analysis of BnRFL genes in different tissues of KC01, Shaan2B and Shaan2A. Gene expression levels in KC01 and Shaan2B was calculated relative to those in Shaan2A. c RNA-seq analysis of the differentially expressed BnRFLs at YB and SA stage. Gene expression levels in Shaan2A was calculated relative to those in KC01. Red indicated higher expression levels. Green represented the lower expression levels. ‘-’ indicates no significant difference in expression To compare the expression level of all 53 BnRFL genes in Shaan2A vs. KC01, three biological replicates of RNA-seq were performed using RNA isolated from young buds (YB, < 1 mm, representing pre-meiosis) and small anthers (SA, sampled from buds 1–2 mm in length (representing tetrad stage to microspore release stage). A total of 320,892,232 raw sequence reads were generated, with approximately 50 million raw reads representing each tissue sample (SRA number: PRJNA511929). Additionally, to conduct comparative transcriptome analysis of the three lines in the Shaan2A CMS system, raw transcriptome reads representing the same stages of Shaan2A and Shaan2B were downloaded (SRA number: PRJNA502996). RNA-seq data analysis of Shaan2A and KC01 revealed only nine BnRFLs exhibited differential expression (Fig. 5c). These results provide important clues for analyzing the candidate restorer genes in the Shaan2A CMS system.

Transcriptomic analysis between Shaan2A and KC01

To investigate the differences between Shaan2A and KC01, possibly caused by the male sterile genes and restorer genes, we also identified the differentially expressed genes (DEGs) between Shaan2B and KC01, as Shaan2A and Shaan2B share the same nuclear genetic background. Thus, common DEGs identified based on Shaan2A vs. KC01 comparison and Shaan2B vs. KC01 comparison would represent the DEGs identified between different genetic backgrounds, i.e., Shaan2A (or Shaan2B) and KC01. A total of 2980 and 8243 DEGs were identified in YB and SA stage, respectively, based on the comparison between Shaan2A and KC01 (|log2 Ratio| > 1; Additional file 6). Based on GO analysis, only one GO term in the molecular function category, ‘sequence-specific DNA binding’, was significant at YB stage (Fig. 6a). By contrast, at SA stage, 8243 DEGs identified between Shaan2A and KC01 were categorized in three main categories, including molecular function, cellular component and biological process, which were further classified into many functional sub-categories (Additional file 7). The top 30 sub-categories, including ‘disaccharide metabolic process’, ‘regulation of RNA biosynthetic process’, ‘regulation of RNA metabolic process’ and ‘regulation of transcription, DNA-dependent’, are shown in Fig. 6b.
Fig. 6

Top 30 gene ontology (GO) sub-categories of the DEGs identified between Shaan2A and KC01. Asterisk indicates the corrected p-value < 0.05. a YB stage. b SA stage

Top 30 gene ontology (GO) sub-categories of the DEGs identified between Shaan2A and KC01. Asterisk indicates the corrected p-value < 0.05. a YB stage. b SA stage To validate the RNA-seq data, the expression of 11 DEGs, potentially involved in anther development, was analyzed by qRT-PCR (Additional file 8). The results of qRT-PCR analysis for most of these DEGs at the two stages were consistent with those of RNA-seq analysis, indicating that the reliability of our RNA-seq data.

Discussion

Many CMS systems have been used in B. napus, including pol, nap, Ogu and Shaan2A [4-7]. To date, Rf genes have been identified via genetic mapping only in pol and nap CMS systems [17, 18]. In the present study, 53 BnRFL genes were identified in the Shaan 2A CMS system. Most of these genes contained more than 10 PPR motifs, which is consistent with the previously reported restorer proteins such as Rf-PPR592 (14 PPR motifs), Rf4 (18 PPR motifs) and Rfo (17 PPR motifs) [11, 17, 24]. Moreover, most of the BnRFLs identified in this study were predicted to localize to mitochondria, similar to the known restorer genes [15-18]. Since the Rf genes function with the toxic chimeric genes in mitochondria to rescue male sterility [14-16], the mitochondrial localization of the proteins seems appropriate. More importantly, we also identified BnRFL6 (Rfn) and BnRFL13 (Rfp), previously confirmed as restorer genes in nap and pol CMS systems, respectively [17, 18], and four candidate restorer genes (BnRFL2, BnRFL10, BnRFL11 and BnRFL42), previously identified in B. napus via genetic mapping [31]. Taken together, analysis of the RFL gene family for the identification of candidate restorer genes were viable, which would also provide a new way to analysis the restorer genes in other CMS systems as one supplementary method except for the traditional genetic mapping to locate the candidate genes. Nearly 7500 years ago, B. napus originated from the hybridization of B. rapa and B. oleracea [34], and the Brassica plants experienced the extra whole genome triplication (WGT) event when compared with Arabidopsis [35]. The Arabidopsis genome contains 26 RFL gene family members, so considering the WGT event there should be over 78 RFL genes in B. oleracea or B. rapa genome, and finally generate even more RFL genes in B. napus. While only 53 BnRFLs were identified in the present study, which implied that nearly 50% RFL genes were lost after the WGT event. Most of the BnRFLs were unevenly distributed on 10 of the 19 chromosomes of B. napus, while a few formed gene clusters on chromosomes A9 and C8, similar to the gene cluster in Arabidopsis (chromosome 1; Table 1, Fig. 1), rice and barley (Hordeum vulgare) [36, 37]. Gene clustering has also been observed in many other gene families, such as the LEA gene family in B. napus [38] and Phyllostachys edulis [39] and laccase gene family in Citrus sinensis [40]. The LEA gene clusters on B. napus chromosomes A9 and C4 probably resulted from chromosomal rearrangement during the evolution of Brassica species [38]. The RFL genes on Arabidopsis chromosome 1 showed synteny with the RFL genes on BraA9, BolC8, BnaA9 and BnaC8. Additionally, BnRFLs maintained a syntenic relationship with RFL genes in B. rapa and B. oleracea, suggesting that a conserved role of BnRFLs located on chromosomes A9 and C8. Moreover, AtRFL2, AtRFL4 and AtRFL9 were located within the gene cluster on Arabidopsis chromosome 1 and participated in the processing of the mitochondrial RNA [25-27]. The phylogenetic analysis revealed that the BnRFLs have the closer phylogenetic relationship with AtRFLs and RsRFLs (Fig. 3), and the structural analysis showed that all of the BnRFLs and the known restorer genes in radish share a conserved motif (Fig. 2), and all BnRFL genes showed a Ka/Ks ratio < 1 (Additional file 3), which indicated that there was no positive selection on the BnRFL genes during the evolution. What’s more, BnRFL6 (Rfn) and BnRFL13 (Rfp) were located within the gene cluster on chromosome A9. These data suggest that the RFL genes within gene clusters on chromosomes A9 and C8 represent the restorer genes in the CMS system, as these likely exhibit a conserved role in mitochondrial RNA processing. Tandem repeats of a degenerate 35-amino-acid PPR motif are the most prominent feature of the PPR family, and all of the 53 BnRFL proteins showed this trait. Although 212 RFL genes in 13 different species [24] and 26 RFL genes in barley [37] have been identified previously, the conserved domain of the RFL proteins has not yet been analyzed. Therefore, we investigated motifs other than PPR in the RFL proteins, revealing 20 motifs among the 53 BnRFLs and a few known RF proteins. Interestingly, all RFLs contained motif 1, comprising 80 amino acids. We propose motif 1 as the conserved motif in the RFL protein family. This motif will serve as a reference for RFL family analysis in other species. Because RF-related PPR proteins belong to the P subfamily and do not exhibit endonuclease activity, these proteins form functional complexes with other proteins [20]. To date, only two RFL-interacting partner proteins have been identified, including GRP162 and HXK6 in the rice CMS system [15, 16]. In the present study, we constructed an interaction network for AtRFLs (Fig. 4). Interestingly, AtRFL11, AtRFL12 and AtRFL13 and HXKs (HXK1, HXK2 and HXK3) shared a common interacting partner, RFC2, which was critical for high-speed DNA synthesis [22], whereas AtRFL25 showed interaction with GRP7 and GR-RBP2. Moreover, AT1G48510, SURF1, COX11 and COX15 were predicted to interact with most of the AtRFLs. AT1G48510 is a surfeit locus 1 cytochrome c oxidase biogenesis protein. SURF1 is associated with cytochrome c oxidase assembly [41]. Both COX11 and COX15 are mitochondrial proteins and belong to the cytochrome c oxidase protein family [42]. COX11 likely plays a key role as a mitochondrial chaperone in the assembly of the COX complex and regulates pollen germination and plant growth [43]. Overall, the interaction network indicates possible partner proteins of RFL proteins in Arabidopsis. These data provide important clues for the identification of interaction factors of RF proteins in other species. Previously, it has been shown that Rf4 is constitutively expressed in different rice organs at relatively low levels [14]. Although Rf6 expression is detectable in various rice tissues, it is expressed to a higher level in the panicle than in other tissues [16]. In the pol CMS system, Rfp shows relatively high expression in flower buds and weak expression in opening flowers, leaves, stems and roots [17]. In the phylogenetic tree, two previously reported restorer genes (BnRFL6 and BnRFL13), four candidate restorer genes (BnRFL2, BnRFL10, BnRFL11 and BnRFL42) and another six BnRFLs (BnRFL3, BnRFL4, BnRFL5, BnRFL8, BnRFL15 and BnRFL41) clustered together, suggesting these genes as the more probable candidates of restorer genes in the rapeseed CMS system. Analysis of expression patterns revealed that most of these genes, except for BnRFL2, BnRFL3 and BnRFL4, were expressed to relatively higher levels in MA than in leaves in KC01. Additionally, these BnRFLs showed higher expression in KC01 tissues, especially MA, than in Shaan2A tissues. Expression profiling of BnRFL genes in Shaan2A vs. KC01 showed that BnRFL1, BnRFL6, BnRFL10, BnRFL13, BnRFL15, BnRFL38 and BnRFL49 were down-regulated in Shaan2A. However, BnRFL15 only harbored five PPR motifs, which was much lower than the number of PPR motifs in the known RF proteins. While BnRFL38 and BnRFL49 are located on chromosomes A1 and A10, respectively, BnRFL1, BnRFL5 and BnRFL8, BnRFL6, BnRFL11, BnRFL13 and BnRFL42 are located in gene clusters on chromosomes C8 and A9. Interestingly, almost all of the known rice Rf genes are located in the RFL gene cluster on chromosome 10 [36]. These data suggest BnRFL1, BnRFL5, BnRFL6, BnRFL8, BnRFL11, BnRFL13 and BnRFL42 as the more likely candidates of restorer genes in the Shaan2A CMS system. In O. sativa, RF6 with a characteristic duplication of PPR motifs in the restorer line of Honglian CMS can restore sterility, while the duplicated motifs are absent in rf6 of sterile line [16]. In the next steps, we will clone these candidate restorer genes in the restorer line and sterile line of Shaan2A CMS respectively, and compare the difference of sequences between them, for we wonder if there is the similar motif difference between these candidate restorer genes. Then we will narrow down the list of candidate genes, and conduct the transgenic work in sterile line to investigate their function. Furthermore, DEGs identified in small anthers of Shaan2A vs. KC01 were annotated as involved in the ‘regulation of RNA biosynthetic process’, ‘regulation of RNA metabolic process’ and ‘regulation of transcription, DNA-dependent’. The RF-related PPR proteins interact with their partner proteins to bind or to edit RNA [15]. Here, the regulation of RNA biosynthetic, RNA metabolic process and transcription was different between the sterile line and restorer line, which might be caused by the sterile genes in Shaan2A and restorer genes in KC01. However, the detailed mechanism needs further investigation.

Conclusions

In CMS, the Rf nuclear genes rescue the sterile phenotype and most of the Rf genes encode pentatricopeptide repeat (PPR) proteins. In the present study, a total of 53 BnRFL genes were identified in B. napus. Most of the BnRFL genes were distributed on 10 of the 19 chromosomes, and gene clusters were identified on chromosomes A9 and C8. The interaction network analysis was performed to predict the interacting partners of RFL proteins. Tissue-specific expression and RNA-seq analyses between the restorer line KC01 and the sterile line Shaan2A indicated that BnRFL1, BnRFL5, BnRFL6, BnRFL8, BnRFL11, BnRFL13 and BnRFL42 located in gene clusters on chromosomes A9 and C8 were highly expressed in KC01, which suggest these seven BnRFL genes as strong candidates for the restorer genes in Shaan2A CMS. Our results would provide new insight into the study of Rf genes in other CMS systems.

Methods

Plant materials

The sterile line Shaan2A, maintainer line Shaan2B and restorer line KC01 of B. napus, gifted by Professor Dianrong Li at the Hybrid Rape Research Center of Shaanxi Province, were used in this study. Plants were cultivated on the experimental field of the Huazhong University of Science and Technology (Wuhan, Hubei province, China). After harvest, plant samples were immediately frozen in liquid nitrogen and stored at − 80 °C until needed for total RNA isolation.

Identification of the RFL gene family in B. napus and other species

The RFL genes were identified as described previously [24]. Briefly, AtRFL1–26 [24] and Rf-PPR592 [11] sequences were used for BLAST searches against the genome of the rapeseed cultivar ‘ZS11’ [44]. The sequence of rice RF5 (also known as RF1a) [15, 45] was used for BLAST searches against the genome sequences of monocots (E-value <1e− 100), including O. sativa (RGAP, http://rice.plantbiology.msu.edu/) and Z. mays [46]. The Rf-PPR592 sequence was used for BLAST searches against the genome sequences of dicots (E-value <1e− 100), including B. rapa [47], B. oleracea [48] and R. sativus [49]. The grand average of hydropathy (GRAVY) value, isoelectric point (pI) and molecular weight of RFL proteins were calculated using ExPASy (http://www.expasy.org/tools/) [28]. The subcellular location of RFL proteins was predicted using the Protein Prowler Subcellular Localization Predictor version 1.2 (http://bioinf.scmb.uq.edu.au:8080/pprowler_webapp_1-2/) [29] and TargetP1.1 server (http://www.cbs.dtu.dk/services/TargetP/) [30].

Structural analysis of RFL genes

The exon-intron structure of BnRFL genes and a few known Rf genes were based on the alignments of the CDS with the corresponding genomic sequences, and the diagram was conducted using the Gene structure display server (GSDS, http://gsds.cbi.pku.edu.cn/) [50]. The PPR motifs in all BnRFL proteins and a few known RF proteins were analyzed using the NCBI Conserved Domain Search tool (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) [51]. Conserved motifs in RFL proteins were analyzed using Multiple Expectation Maximization for Motif Elicitation (MEME, http://alternate.meme-suite.org/) [52].

Phylogenetic and Syntenic analysis of RFL genes

Multiple sequence alignment of the predicted amino acid sequences of BnRFLs, AtRFLs, RsRFLs, OsRFLs and ZmRFLs was performed using ClustalX [53]. A phylogenetic tree of these RFL proteins was constructed with MEGA 7 using the Neighbor Joining (NJ) method [54]. Analysis of synteny among BnRFL, AtRFL, BoRFL and BrRFL genes was performed using the syntenic gene tool in the Brassica database (BRAD, http://brassicadb.org/brad/) [55]. The non-synonymous to synonymous nucleotide substitution ratio (Ka/Ks) was calculated using TBtools [56].

Interaction analysis

The interaction analysis of AtRFLs was based on the STRING 10.5 database, which included the known and predicted protein–protein interactions. First, the interaction proteins of AtRFLs were searched. After deleting the repeat proteins, the interaction network was visualized using Cytoscape 3.6.1.

RNA extraction, RNA-seq and qRT-PCR analysis

Gene expression was analyzed in various tissues of the sterile line Shaan2A and restorer line KC01 including leaves, perianths, gynoecium, medium anthers (MA) and large anthers (LA). MA were harvested from buds 2–4.5 mm in length and represented the uninuclear microspore stage, whereas LA were harvested from buds 4.5 mm in length and represented the mature pollen formation stage. Total RNA extraction, RNA-seq analysis and qRT-PCR were conducted according to our previous protocols [57], with minor modifications. Briefly, approximately 100 mg plant samples were used for total RNA extraction using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA), according to the manufacturer’s instructions. Then, cDNA sequencing libraries were constructed using TruSeq™ RNA Sample Preparation Kit (Illumina, San Diego, CA, USA). RNA-seq was performed on the Illumina NovaSeq 6000 platform. The raw data were filtered using the NGSQC toolkit (v2.2.3), and the clean reads were mapped to the reference genome of the rapeseed cultivar ‘ZS11’. The differentially expressed genes (DEGs) were evaluated using DESeq2, with normalized fold-change ≥2 and p-value < 0.05. Gene Ontology (GO) annotation was using the Web Gene Ontology Annotation Plot (WEGO) software. To perform qRT-PCR analysis, RNA was reverse transcribed using the TaKaRa PrimeScript™ RT Reagent Kit with gDNA Eraser, according to the manufacturer’s instructions. Actin was used as the internal reference gene [58]. The qRT-PCR experiments and transcript quantification were performed as described previously [57]. Primers used in this study are listed in Additional file 9. Additional file 1 Exon-intron structure of the BnRFL genes and known Rf genes. Additional file 2. Distribution of PPR motifs in the identified RFL proteins. Additional file 3 Non-synonymous (Ka) and synonymous (Ks) nucleotide substitution rates of the coding sequence of RFL genes in A. thaliana and B. napus. Additional file 4. List of the interaction proteins. Additional file 5. Original data of qRT-PCR. Additional file 6. RNA-seq analysis of DEGs identified between Shaan2A and KC01 at YB and SA stage, respectively. Additional file 7. List of GO sub-categories of the DEGs identified between Shaan2A and KC01. Additional file 8. Validation of the expression of selected DEGs by qRT-PCR. (A) Results of qRT-PCR analysis. (B) Results of RNA-seq analysis. The numbers indicate log2X-normalized ratios. Red indicated higher expression levels. Green represented the lower expression levels. ‘-’ indicates no significant difference in RNA-seq data. Additional file 9. List of primers used in this study.
  53 in total

1.  The rf2 nuclear restorer gene of male-sterile T-cytoplasm maize.

Authors:  X Cui; R P Wise; P S Schnable
Journal:  Science       Date:  1996-05-31       Impact factor: 47.728

Review 2.  Male sterility and fertility restoration in crops.

Authors:  Letian Chen; Yao-Guang Liu
Journal:  Annu Rev Plant Biol       Date:  2013-12-02       Impact factor: 26.379

3.  Selection patterns on restorer-like genes reveal a conflict between nuclear and mitochondrial genomes throughout angiosperm evolution.

Authors:  Sota Fujii; Charles S Bond; Ian D Small
Journal:  Proc Natl Acad Sci U S A       Date:  2011-01-10       Impact factor: 11.205

4.  Sequence conservation from human to prokaryotes of Surf1, a protein involved in cytochrome c oxidase assembly, deficient in Leigh syndrome.

Authors:  A Poyau; K Buchet; C Godinot
Journal:  FEBS Lett       Date:  1999-12-03       Impact factor: 4.124

5.  MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets.

Authors:  Sudhir Kumar; Glen Stecher; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2016-03-22       Impact factor: 16.240

6.  CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.

Authors:  D G Higgins; P M Sharp
Journal:  Gene       Date:  1988-12-15       Impact factor: 3.688

7.  Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis.

Authors:  Claire Lurin; Charles Andrés; Sébastien Aubourg; Mohammed Bellaoui; Frédérique Bitton; Clémence Bruyère; Michel Caboche; Cédrig Debast; José Gualberto; Beate Hoffmann; Alain Lecharny; Monique Le Ret; Marie-Laure Martin-Magniette; Hakim Mireau; Nemo Peeters; Jean-Pierre Renou; Boris Szurek; Ludivine Taconnat; Ian Small
Journal:  Plant Cell       Date:  2004-07-21       Impact factor: 11.277

8.  Plant genetics. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome.

Authors:  Boulos Chalhoub; France Denoeud; Shengyi Liu; Isobel A P Parkin; Haibao Tang; Xiyin Wang; Julien Chiquet; Harry Belcram; Chaobo Tong; Birgit Samans; Margot Corréa; Corinne Da Silva; Jérémy Just; Cyril Falentin; Chu Shin Koh; Isabelle Le Clainche; Maria Bernard; Pascal Bento; Benjamin Noel; Karine Labadie; Adriana Alberti; Mathieu Charles; Dominique Arnaud; Hui Guo; Christian Daviaud; Salman Alamery; Kamel Jabbari; Meixia Zhao; Patrick P Edger; Houda Chelaifa; David Tack; Gilles Lassalle; Imen Mestiri; Nicolas Schnel; Marie-Christine Le Paslier; Guangyi Fan; Victor Renault; Philippe E Bayer; Agnieszka A Golicz; Sahana Manoli; Tae-Ho Lee; Vinh Ha Dinh Thi; Smahane Chalabi; Qiong Hu; Chuchuan Fan; Reece Tollenaere; Yunhai Lu; Christophe Battail; Jinxiong Shen; Christine H D Sidebottom; Xinfa Wang; Aurélie Canaguier; Aurélie Chauveau; Aurélie Bérard; Gwenaëlle Deniot; Mei Guan; Zhongsong Liu; Fengming Sun; Yong Pyo Lim; Eric Lyons; Christopher D Town; Ian Bancroft; Xiaowu Wang; Jinling Meng; Jianxin Ma; J Chris Pires; Graham J King; Dominique Brunel; Régine Delourme; Michel Renard; Jean-Marc Aury; Keith L Adams; Jacqueline Batley; Rod J Snowdon; Jorg Tost; David Edwards; Yongming Zhou; Wei Hua; Andrew G Sharpe; Andrew H Paterson; Chunyun Guan; Patrick Wincker
Journal:  Science       Date:  2014-08-21       Impact factor: 47.728

9.  CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.

Authors:  Aron Marchler-Bauer; Yu Bo; Lianyi Han; Jane He; Christopher J Lanczycki; Shennan Lu; Farideh Chitsaz; Myra K Derbyshire; Renata C Geer; Noreen R Gonzales; Marc Gwadz; David I Hurwitz; Fu Lu; Gabriele H Marchler; James S Song; Narmada Thanki; Zhouxi Wang; Roxanne A Yamashita; Dachuan Zhang; Chanjuan Zheng; Lewis Y Geer; Stephen H Bryant
Journal:  Nucleic Acids Res       Date:  2016-11-29       Impact factor: 16.971

10.  Transcriptomic and Proteomic Analysis of Shaan2A Cytoplasmic Male Sterility and Its Maintainer Line in Brassica napus.

Authors:  Luyun Ning; Hao Wang; Dianrong Li; Zhiwei Lin; Yonghong Li; Weiguo Zhao; Hongbo Chao; Liyun Miao; Maoteng Li
Journal:  Front Plant Sci       Date:  2019-03-04       Impact factor: 5.753

View more
  3 in total

1.  Exploiting sterility and fertility variation in cytoplasmic male sterile vegetable crops.

Authors:  Fengyuan Xu; Xiaodong Yang; Na Zhao; Zhongyuan Hu; Sally A Mackenzie; Mingfang Zhang; Jinghua Yang
Journal:  Hortic Res       Date:  2022-01-18       Impact factor: 6.793

2.  A mitochondria-localized pentatricopeptide repeat protein is required to restore hau cytoplasmic male sterility in Brassica napus.

Authors:  Huadong Wang; Qing Xiao; Chao Wei; Hui Chen; Xiaohan Chen; Cheng Dai; Jing Wen; Chaozhi Ma; Jinxing Tu; Tingdong Fu; Jinxiong Shen; Bin Yi
Journal:  Theor Appl Genet       Date:  2021-03-16       Impact factor: 5.699

Review 3.  Mechanism and Utilization of Ogura Cytoplasmic Male Sterility in Cruciferae Crops.

Authors:  Wenjing Ren; Jinchao Si; Li Chen; Zhiyuan Fang; Mu Zhuang; Honghao Lv; Yong Wang; Jialei Ji; Hailong Yu; Yangyong Zhang
Journal:  Int J Mol Sci       Date:  2022-08-13       Impact factor: 6.208

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.