Halina Pietrykowska1, Izabela Sierocka1, Andrzej Zielezinski2, Alisha Alisha1, Juan Carlo Carrasco-Sanchez2, Artur Jarmolowski1, Wojciech M Karlowski2, Zofia Szweykowska-Kulinska1.
Abstract
MicroRNAs (miRNAs) are small non-coding endogenous RNA molecules, 18-24 nucleotides long, that control multiple gene regulatory pathways via post-transcriptional gene silencing in eukaryotes. To develop a comprehensive picture of the evolutionary history of miRNA biogenesis and action in land plants, studies on bryophyte representatives are needed. Here, we review current understanding of liverwort MIR gene structure, miRNA biogenesis, and function, focusing on the simple thalloid Pellia endiviifolia and the complex thalloid Marchantia polymorpha. We review what is known about conserved and non-conserved miRNAs, their targets, and the functional implications of miRNA action in M. polymorpha and P. endiviifolia. We note that most M. polymorpha miRNAs are encoded within protein-coding genes and provide data for 23 MIR gene structures recognized as independent transcriptional units. We identify M. polymorpha genes involved in miRNA biogenesis that are homologous to those identified in higher plants, including those encoding core microprocessor components and other auxiliary and regulatory proteins that influence the stability, folding, and processing of pri-miRNAs. We analyzed miRNA biogenesis proteins and found similar domain architecture in most cases. Our data support the hypothesis that almost all miRNA biogenesis factors in higher plants are also present in liverworts, suggesting that they emerged early during land plant evolution.
Keywords: MIR genes; Conserved and non-conserved miRNAs in plants; liverworts; miRNA biogenesis; microprocessor; proteins involved in miRNA biogenesis
Year: 2022 PMID: 35275209 PMCID: PMC9291395 DOI: 10.1093/jxb/erac098
Source DB: PubMed Journal: J Exp Bot ISSN: 0022-0957 Impact factor: 7.298
Fig. 1. miRNA biogenesis in plants. RNA polymerase II (RNA Pol II) transcribes miRNA genes. Primary transcripts (pri-miRNAs) are capped at their 5ʹ end and polyadenylated at their 3ʹ end. The DCL1 endonuclease first excises the imperfectly folded stem-loop structure from the pri-miRNA, yielding a shorter stem-loop hairpin (pre-miRNA). This reaction requires the concerted action and physical interaction of SE, HYL1, DCL1, and CBC. Throughout the process, several auxiliary proteins, among them TGH, DDL, and CHR2, interact with the microprocessor core. DCL1 then processes the pre-miRNAs further into mature miRNA/miRNA* duplexes. Next, the 3ʹ ends of the miRNA/miRNA* duplexes are methylated by HEN1. The guide miRNA strand is then loaded into the AGO1 protein with the aid of the CRM1 protein and exported to the cytoplasm, where it regulates the level of its cognate mRNA.
Fig. 2. miRNA families shared within Bryophyta. (A) Venn diagram showing the number of total (conserved and species-specific) miRNA families identified in P. endiviifolia (Pen), M. polymorpha (Mpo), P. patens (Ppat), and A. angustus (Aan), according to Alaba, Tsuzuki, Bowman, and J. Zhang. The information about the total number of P. patens miRNAs was taken from miRBase v22 (Kozomara); (B) Conserved miRNA families within Bryophyta. * Number of total (conserved and species-specific) miRNA families. The colors used in the table correspond to the colors in (A).
Fig. 3. Sequence alignments of liverwort-specific miRNAs. (A) Sequence alignments of P. endiviifolia Pen-miR8163 and M. polymorpha Mpo-miR11737a and Mpo-miR11737b; (B) Upper panel - Pen-miR8170 and putative Mpo-miR11865* corresponding to Pen-miR8170, lower panel - structure of M. polymorpha pre-Mpo-miR11865 with designated Mpo-miR11865 and putative Mpo-miR8170; (C) Sequence alignments of Pen-miR8185 and Mpo-miR11889. Pen - Pellia endiviifolia, Mpo - Marchantia polymorpha.
Length and structure of 23 characterized Marchantia polymorpha MIR genes representing independent transcriptional units
| Intronless Mpo- | ||||||
|---|---|---|---|---|---|---|
| Mpo- | chr1:739922..738280 (–) | 1642 bp | – | 739050..739217 | – | |
| Mpo- | chr1:16512588..16513713 (+) | 1125 bp | – | 16512919..16513019 | – | |
| Mpo- | chr1:23395553..23396596 (+) | 1043 bp | – | 23395668..23395757 | – | |
| Mpo- | chr3:8070378..8072664 (+) | 2286 bp | – | 8071432..8072344 | – | |
| Mpo- | chr3:9801184..9802171 (+) | 987 bp | – | 9801168..9801276 | – | |
| Mpo- | chr3:17303848..17305858 (+) | 2011 bp | – | 19805484..19805646 | – | |
| Mpo- | chr5:4930550..4928701 (–) | 1850 bp | – | 4930130..4930221 | – | |
| Mpo- | chr5:16515490..16517551 (+) | 2062 bp | – | 16516623..16516737 | – | |
| Mpo- | chr5:26156348..26154642 (–) | 1707 bp | – | 26155312..26155471 | – | |
| Mpo- | chr6:4268242..4266446 (–) | 1797 bp | – | 4267725..4267806 | – | |
| Mpo- | chr6:7960186..7959078 (–) | 1109 bp | – | 7959668..7959787 | – | |
| Mpo- | chr6:20474233..20475757 (+) | 1524 bp | – | 20474717..20474828 | – | |
| Mpo- | chr7:16671171..16669830 (–) | 3380 bp | – | 16670651..16670733 | – | |
| Mpo- | chr7:17775310..17773003 (–) | 2308 bp | – | 17774607..17774674 | – | |
| Mpo- | chr8:20036988..20037680 (+) | 693 bp | – | 20037282..20037406 | – | |
| | | | | | | |
| Mpo- | chr7:6044481..6047405 (+) | 2925 bp | – | 6044934..6045065 | – | |
| | | | | | | |
| Mpo- | chr1:10051644..10051081 (–) | 1602 bp | 1 (180 bp) | 10051152..10051390 | Exon 1 | |
| Mpo- // Mp3g01145.1 | chr3:1361154..1361555 (+) | 2565 bp | 1 (1108 bp) | 1361250..1361517 | Exon 1 | |
| // Mp3g01145.2 | chr3:1361154..1361555 (+) | 1067 bp | 1 (125 bp) | 1361250..1361517 | Exon 1 | |
| Mpo- | chr3:26326600..26324986 (–) | 2971 bp | 1 (285 bp) | 26325832..26325925 | Exon 1 | |
| Mpo- | chr4:10492516..10493163 (+) | 1757 bp | 1 (151 bp) | 10492929..10493050 | Exon 1 | |
| Mpo- | chr5:17855821..17855632 (–) | 1529 bp | 1 (463 bp) | 17855250..17855342 | Intronic | |
| Mpo- | chr6:2176984..2176614 (–) | 7581 bp | 1 (5439 bp) | 2173585..2173809 | Intronic | |
| Mpo- | chr8:2556749..2556447 (–) | 3900 bp | 1 (2265 bp) | 2553976..2554099 | Exon 2 | |
All introns are of the U2 type. Transcription start sites are taken from the combined RNA-seq and CAGE-seq data deposited in MarpolBase. The longest 3ʹ end sequence corresponds to the longest read from the RNA-seq data deposited in MarpolBase. Two alternative pri-miRNA structures were proposed based on the specific RNA-seq read profiles deposited in MarpolBase.
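The coordinate strings in the table above can be checked mechanically. The following is a minimal sketch, assuming each entry has the form `chrN:start..end (strand)` with the strand written as `+` or as a dash; the names `Locus` and `parse_locus` are hypothetical helpers introduced here for illustration and are not part of the source or of MarpolBase.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str
    start: int   # first coordinate as written in the table
    end: int     # second coordinate as written in the table
    strand: str  # normalized to "+" or "-"

    @property
    def span(self) -> int:
        """Plain difference between the two coordinates, in bp."""
        return abs(self.end - self.start)


# Accept ASCII hyphen, en dash (U+2013), and minus sign (U+2212) as the minus strand.
_LOCUS_RE = re.compile(r"^(chr\w+):(\d+)\.\.(\d+)\s*\(([+\-\u2013\u2212])\)$")


def parse_locus(text: str) -> Locus:
    """Parse a string such as 'chr1:739922..738280 (–)' into a Locus."""
    m = _LOCUS_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized locus: {text!r}")
    chrom, start, end, strand = m.groups()
    return Locus(chrom, int(start), int(end), "+" if strand == "+" else "-")


locus = parse_locus("chr1:739922..738280 (–)")
print(locus.chrom, locus.strand, locus.span)  # chr1 - 1642
```

The sketch reports the plain coordinate difference. Some rows list a length one greater than that difference (for example 1850 bp against a difference of 1849 for the chr5:4930550..4928701 entry), which would be consistent with counting both endpoints, so a comparison against the listed lengths should tolerate an offset of one.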
Proteins in Marchantia polymorpha orthologous to Arabidopsis thaliana proteins involved in miRNA biogenesis
| Marchantia polymorpha | | | | | | | |
|---|---|---|---|---|---|---|---|
| Core microprocessor proteins | |||||||
| DCL1 | AT1G01040 | ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3—DND1_DSRM | 1910 | Mp7g12090 | ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3—DND1_DSRM | 2057 | 61.5 |
| | | ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3— | | Mp1g02840 | ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3 | 1748 | 32.0 |
| HYL1 | AT1G09700 | dsrm—dsrm | 419 | Mp7g08450 | dsrm—dsrm | 353 | 27.7 |
| SE | AT2G27100 | SERRATE_Ars2_N—ARS2 | 720 | Mp1g23090 | SERRATE_Ars2_N—ARS2 | 850 | 46.9 |
| Auxiliary microprocessor proteins | |||||||
| APC10 | AT2G18290 | ANAPC10 | 192 | Mp5g21650 | ANAPC10 | 179 | 62.2 |
| AtELP4 | AT3G11220 | PAXNEB | 355 | Mp3g07590 | PAXNEB | 387 | 42.8 |
| CARP9 | AT3G21290 | Occludin_ELL | 1192 | Mp2g06020 | Occludin_ELL | 1627 | 24.9 |
| CBP20 | AT5G44200 | RRM_1 | 257 | Mp1g20560 | RRM_1 | 238 | 67.6 |
| CBP80 | AT2G13540 | MIF4G—MIF4G_like—MIF4G_like_2 | 848 | Mp1g05560 | MIF4G—MIF4G_like—MIF4G_like_2 | 876 | 51.4 |
| CDC5 | AT1G09770 | Myb_DNA-bind_6—Myb_Cef | 844 | Mp1g10310 | Myb_DNA-bind_6—Myb_Cef | 954 | 56.5 |
| CDF2 | AT5G39660 | zf-Dof | 457 | – | – | – | – |
| CHR2/BRM | AT2G46020 | SNF2-rel_dom—Helicase_C | 2193 | – | – | – | – |
| CPL1 | AT4G21670 | NIF—dsrm—dsrm | 967 | Mp4g06900 | NIF—dsrm—dsrm | 986 | 42.5 |
| DCL4 | AT5G20320 | | 1702 | Mp7g11720 | | 1932 | 29.6 |
| DDL | AT3G20550 | FHA | 314 | Mp7g08770 | FHA | 376 | 48.8 |
| ELP2 | AT1G49540 | WD40—WD40—WD40—WD40—WD40— | 840 | Mp8g01670 | WD40—WD40—WD40—WD40—WD40 | 904 | 46.0 |
| ELP5 | AT2G18410 | Elong_Iki1 | 374 | Mp7g09690 | Elong_Iki1 | 373 | 37.8 |
| EMB2765 | AT2G38770 | Aquarius_N—AAA_11—AAA_12 | 1509 | Mp6g16570 | Aquarius_N—AAA_11—AAA_12 | 1648 | 60.4 |
| FBW2 | AT4G08980 | F-box-like—LRR_6—LRR_6 | 317 | Mp4g03420 | F-box-like—LRR_6—LRR_6 | 310 | 32.2 |
| HEN1 | AT4G20910 | Hen1_Lam_C—dsRBD2— | 942 | Mp3g16010 | Hen1_Lam_C—dsRBD2— | 1018 | 32.5 |
| HEN2 | AT2G06990 | DEAD—Helicase_C—rRNA_proc-arch—DSHCT | 995 | Mp4g03900 | DEAD—Helicase_C—rRNA_proc-arch—DSHCT | 1009 | 68.6 |
| HOS5 | AT5G53060 | KH_1—KH_1—KH_1 | 652 | Mp3g0488 | KH_1—KH_1—KH_1— | 721 | 36.4 |
| ILP1 | AT5G08550 | GCFC | 908 | Mp8g18770 | GCFC | 995 | 40.9 |
| KETCH1 | AT5G19820 | Importin_rep_4—HEAT_2—Importin_rep_6—HEAT | 1116 | Mp3g07820 | Importin_rep_4—HEAT_2—Importin_rep_6—HEAT | 1120 | 69 |
| NOT2a | AT1G07705 | NOT2_3_5 | 614 | Mp1g21150 | NOT2_3_5 | 723 | 48 |
| NOT2b | AT5G59710 | NOT2_3_5 | 614 | – | – | – | – |
| NTR1 | AT1G17070 | TIP_N—G-patch—GCFC | 849 | Mp3g24100 | TIP_N—G-patch—GCFC | 848 | 49.6 |
| | | | | Mp3g09910 | TIP_N—G-patch—GCFC | 850 | 48.8 |
| PPX1 | AT4G26720 | Metallophos | 305 | Mp2g13820 | Metallophos | 304 | 85.6 |
| PRL1 | AT4G15900 | WD40—WD40—WD40—WD40— | 486 | Mp6g09620 | WD40—WD40—WD40—WD40 | 487 | 64.9 |
| RACK1A | AT1G18080 | WD40—WD40—WD40—WD40—WD40—WD40—WD40 | 327 | – | – | – | – |
| RACK1B | AT1G48630 | WD40—WD40—WD40—WD40—WD40—WD40—WD40 | 326 | – | – | – | – |
| RACK1C | AT3G18130 | WD40—WD40—WD40—WD40—WD40—WD40—WD40 | 326 | Mp3g15630 | WD40—WD40—WD40—WD40—WD40—WD40—WD40 | 316 | 71.2 |
| RBM7 | AT4G10110 | RRM_1 | 173 | Mp1g13100 | RRM_1 | 220 | 30.0 |
| RH6 | AT2G45810 | DEAD—Helicase_C | 528 | – | – | – | – |
| RH8 | AT4G00660 | DEAD—Helicase_C | 505 | Mp7g14570 | DEAD—Helicase_C | 515 | 74.7 |
| RH12 | AT3G61240 | DEAD—Helicase_C | 498 | – | – | – | – |
| RH27 | AT5G65900 | DEAD—Helicase_C—DUF4217 | 633 | – | – | – | – |
| RH42 | AT1G20920 | DEAD—Helicase_C | 1166 | Mp1g06750 | DEAD—Helicase_C | 1242 | 55.8 |
| RS40 | AT4G25500 | RRM_1—RRM_1 | 350 | – | – | – | – |
| RS41 | AT5G52040 | RRM_1—RRM_1 | 357 | – | – | – | – |
| SAC3A | AT2G39340 | SAC3_GANP | 1006 | Mp8g06380 | SAC3_GANP | 1142 | 37.9 |
| SEAP1 | AT4G24270 | RRM_1—Lsm_interact | 817 | Mp8g05890 | RRM_1—Lsm_interact | 862 | 43.2 |
| SIC | AT4G24500 | – | 319 | Mp5g17530 | – | 278 | 23.8 |
| SNRK2.6 | AT4G33950 | Pkinase | 362 | Mp1g24460 | Pkinase | 349 | 76.9 |
| TGH | AT5G23080 | DUF1604—Surp | 930 | Mp1g06200 | DUF1604—Surp | 1046 | 37.2 |
| THO2 | AT1G24706 | THOC2_N—THOC2_N—Thoc2—Tho2 | 1823 | Mp1g20320 | THOC2_N—THOC2_N—Thoc2—Tho2 | 1978 | 52.7 |
| THP1 | AT2G19560 | PCI | 413 | Mp1g04560 | PCI | 410 | 65.4 |
| XCT | AT2G21150 | XAP5 | 337 | Mp2g07350 | XAP5 | 333 | 79.2 |
| ZCCHC8A | AT5G38600 | PSP | 532 | Mp4g19730 | PSP | 675 | 25.2 |
| Regulatory microprocessor proteins | |||||||
| COP1 | AT2G32950 | zf-C3HC4_2—WD40—WD40 | 675 | Mp5g12010 | zf-C3HC4_2—WD40—WD40 | 688 | 62.1 |
| CPL2 | AT5G01270 | NIF—dsrm | 774 | – | – | – | – |
| EXPORTIN1A/XPO1A | AT5G17020 | IBN_N—Xpo1—CRM1_repeat—CRM1_repeat_2—CRM1_repeat_3—CRM1_C | 1075 | Mp7g03970 | IBN_N—Xpo1—CRM1_repeat—CRM1_repeat_2—CRM1_repeat_3—CRM1_C | 1075 | 79.1 |
| CRM1B/XPO1B | AT3G03110 | IBN_N—Xpo1—CRM1_repeat—CRM1_repeat_2—CRM1_repeat_3—CRM1_C | 1076 | | | | |
| FIP37 | AT3G54170 | Wtap | 330 | Mp1g24270 | Wtap | 364 | 44.2 |
| GRP7 | AT2G21660 | RRM_1 | 176 | Mp1g18780 | RRM_1 | 193 | 62.1 |
| HAKAI | AT5G01160 | – | 360 | Mp7g09040 | – | 566 | 25 |
| HESO1 | AT2G39740 | TUTase | 511 | – | – | – | – |
| HOS1 | AT2G39810 | ELYS | 927 | Mp7g02710 | ELYS | 780 | 35.4 |
| HST1 | AT3G05040 | Xpo1—Exportin-5 | 1203 | Mp2g09050 | Xpo1—Exportin-5 | 1216 | 41.6 |
| MOS2 | AT1G33520 | G-patch_2 | 462 | Mp5g05110 | G-patch_2 | 690 | 25.6 |
| MPK3 | AT3G45640 | Pkinase | 370 | – | – | – | – |
| MTA | AT4G10760 | MT-A70 | 685 | Mp1g08870 | MT-A70 | 816 | 50.2 |
| MTB | AT4G09980 | MT-A70 | 963 | Mp1g04450 | MT-A70 | 1520 | 29.9 |
| PIF4 | AT2G43010 | HLH | 472 | – | – | – | – |
| PPX2 | AT5G55260 | Metallophos | 346 | – | – | – | – |
| SMA1 | AT2G33730 | DEAD—Helicase_C | 733 | Mp1g21580 | DEAD—Helicase_C | 935 | 54.9 |
| STA1 | AT4G03430 | PRP1_N—TPR_14 | 1029 | Mp6g10790 | PRP1_N—TPR_14 | 942 | 69.1 |
| VIR | AT3G05680 | VIR_N | 2138 | Mp1g10290 | VIR_N | 2455 | 29.4 |
| miRISC formation proteins | |||||||
| AGO1 | AT1G48410 | | 1050 | Mp1g18110 | ArgoN—ArgoL1—PAZ—ArgoL2—ArgoMid—Piwi | 1109 | 67.3 |
| AGO4 | AT2G27040 | ArgoN—ArgoL1—PAZ—ArgoL2— | 924 | Mp1g23190 | ArgoN—ArgoL1—PAZ—ArgoL2—Piwi | 943 | 40.4 |
| AGO10 | AT5G43810 | ArgoN—ArgoL1—PAZ—ArgoL2—ArgoMid—Piwi | 988 | – | – | – | – |
Proteins are classified into four sections (core, auxiliary, regulatory microprocessor, and miRISC formation proteins) and listed alphabetically within each section. Differences in domain architecture between Arabidopsis and Marchantia polymorpha orthologs are indicated in bold. Ortholog assignments between A. thaliana and M. polymorpha proteins were predicted by OrthoFinder v. 2.5.4 using BLAST as the main sequence similarity search tool (Emms and Kelly, 2019). Pairwise alignments between orthologous protein sequences were calculated using the needle tool from the EMBOSS package (Rice). Protein domains were annotated using the PfamScan tool and the Pfam 34.0 database (Mistry).
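The pairwise alignment step can be illustrated with a small global (Needleman-Wunsch) alignment, the class of algorithm that the EMBOSS needle tool implements. This is a simplified sketch and not a reimplementation of needle: it assumes a flat match/mismatch score and a linear gap penalty, whereas needle uses a substitution matrix and affine gap penalties. The function names and the toy sequences are hypothetical. The source does not state which alignment statistic the table reports, so the percent identity computed here is only one plausible example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(x: str, y: str) -> float:
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


aligned = needleman_wunsch("ACGT", "AGT")  # toy sequences, not real proteins
print(aligned, percent_identity(*aligned))  # ('ACGT', 'A-GT') 75.0
```

Identity is computed over the full alignment length including gap columns; other denominators (for example the shorter sequence length) are also in use and give different values.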
Fig. 4. Differences in domain architecture between Arabidopsis and M. polymorpha orthologs involved in miRNA biogenesis. Protein domains were annotated using the PfamScan tool and the Pfam database (Mistry). The figure includes the following Pfam accession numbers: ArgoL1 (PF08699), ArgoL2 (PF16488), ArgoMid (PF16487), ArgoN (PF16486), DEAD (PF00270), Dicer_dimer (PF03368), DND1_DSRM (PF14709), dsRBD2 (PF17842), Gly-rich_Ago1 (PF12764), Helicase_C (PF00271), Hen1_Lam_C (PF18441), KH_1 (PF00013), Methyltransf_21 (PF05050), Methyltransf_31 (PF13847), PAZ (PF02170), Piwi (PF02171), ResIII (PF04851), Ribonuclease_3 (PF00636), and WD40 (PF00400).
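Differences between two domain architectures of the kind shown in Fig. 4 can be listed programmatically. The sketch below assumes each architecture is written as domain names joined by the `—` separator used in the table, and uses Python's standard `difflib.SequenceMatcher` to compare the ordered domain lists. The function names are hypothetical. The example pairs the Arabidopsis DCL1 architecture with the one listed for Mp1g02840; because row grouping in the table is partly lost, that pairing should be read as an assumption.

```python
from difflib import SequenceMatcher


def split_architecture(arch: str) -> list[str]:
    """Split a '—'-joined architecture string into an ordered domain list."""
    return [d for d in arch.split("—") if d]


def architecture_diff(ref: str, other: str) -> list[tuple[str, list[str], list[str]]]:
    """Return (operation, domains in ref, domains in other) for every difference."""
    a, b = split_architecture(ref), split_architecture(other)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


at_dcl1 = "ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3—DND1_DSRM"
mp1g02840 = "ResIII—Helicase_C—Dicer_dimer—PAZ—Ribonuclease_3—Ribonuclease_3"

print(architecture_diff(at_dcl1, mp1g02840))
# [('delete', ['DND1_DSRM'], [])]
```

An empty list means the two architectures are identical in domain content and order; a `delete` entry marks a domain present only in the reference, and an `insert` entry marks one present only in the compared protein.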