Noah Fahlgren, Miya D Howell, Kristin D Kasschau, Elisabeth J Chapman, Christopher M Sullivan, Jason S Cumbie, Scott A Givan, Theresa F Law, Sarah R Grant, Jeffery L Dangl, James C Carrington.
Abstract
In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks.
Year: 2007 PMID: 17299599 PMCID: PMC1790633 DOI: 10.1371/journal.pone.0000219
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Identification and analysis of Arabidopsis miRNAs and targets. (A) Flowchart for the prediction of miRNAs and their targets. (B) Validation of predicted targets for 13 non-conserved miRNAs. Positions of dominant 5′ RACE products (no. 5′ ends at position/total no. 5′ ends sequenced) are indicated by vertical arrows in the expanded regions. Predicted cleavage sites are indicated by a bolded nucleotide at position ten relative to the 5′ end of the miRNA or miRNA-variant. Positions of gene-specific primers are indicated with horizontal arrows above gene diagrams.
Table 1. Reference set of Arabidopsis MIRNA families.
| MIRNA family | Loci | Conserved | miRNA* sequenced | Reads | Target family | Target validation |
| miR156/miR157 | 12 | Y | Y | 15606 | Squamosa-promoter binding protein-like (SPL) | Y |
| miR158 | 2 | N | Y | 833 | Pentatricopeptide repeat (PPR) | N |
| miR159/miR319 | 6 | Y | Y | 24098 | MYB transcription factor | Y |
| | | | | | TCP transcription factor | Y |
| miR160 | 3 | Y | Y | 5783 | Auxin response factor (ARF) | Y |
| miR161 | 1 | N | Y | 9191 | Pentatricopeptide repeat (PPR) | Y |
| miR162 | 2 | Y | Y | 289 | Dicer-like (DCL) | Y |
| miR163 | 1 | N | Y | 263 | S-adenosylmethionine-dependent methyltransferase (SAMT) | Y |
| miR164 | 3 | Y | Y | 449 | NAC domain transcription factor | Y |
| miR166/miR165 | 9 | Y | Y | 4881 | HD-ZIPIII transcription factor | Y |
| miR167 | 4 | Y | Y | 8286 | Auxin response factor (ARF) | Y |
| miR168 | 2 | Y | Y | 1286 | Argonaute (AGO) | Y |
| miR169 | 14 | Y | Y | 42496 | HAP2 transcription factor | Y |
| miR171/miR170 | 4 | Y | Y | 6423 | Scarecrow-like transcription factor (SCL) | Y |
| miR172 | 5 | Y | Y | 2720 | Apetala2-like transcription factor (AP2) | Y |
| miR173 | 1 | N | Y | 78 | TAS1, TAS2 | Y |
| miR390/miR391 | 3 | Y | Y | 2889 | TAS3 | Y |
| miR393 | 2 | Y | Y | 94 | Transport inhibitor response 1 (TIR1)/Auxin F-box (AFB) | Y |
| | | | | | bHLH transcription factor | Y |
| miR394 | 2 | Y | Y | 47 | F-box | Y |
| miR395 | 6 | Y | Y | 14 | ATP-sulfurylase (APS) | Y |
| | | | | | Sulfate transporter (AST) | Y |
| miR396 | 2 | Y | Y | 408 | Growth regulating factor (GRF) | Y |
| miR397 | 2 | Y | N | 290 | Laccase (LAC) | Y |
| miR398 | 3 | Y | Y | 42 | Copper superoxide dismutase (CSD) | Y |
| | | | | | Cytochrome-c oxidase | Y |
| miR399 | 6 | Y | Y | 50 | E2 ubiquitin-conjugating protein (E2-UBC) | Y |
| miR400 | 1 | N | Y | 18 | Pentatricopeptide repeat (PPR) | N |
| miR402 | 1 | N | N | 10 | HhH-GPD base excision DNA repair | N |
| miR403 | 1 | Y | Y | 29 | Argonaute (AGO) | Y |
| miR408 | 1 | Y | Y | 42 | Laccase (LAC) | Y |
| | | | | | Plantacyanin-like (PCL) | N |
| miR447 | 3 | N | N | 25 | 2-phosphoglycerate kinase-related (2-PGK) | Y |
Reviewed in Jones-Rhoades et al. [11].
Conserved between A. thaliana and P. trichocarpa.
Read counts are from all libraries in the ASRP database (http://asrp.cgrb.oregonstate.edu/db).
Reads for each locus encompass the defined miRNA sequence ±4 nts on each side.
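The ±4 nt window described in the note above can be illustrated with a small Python sketch. It assumes that a read is counted when it lies entirely inside the miRNA coordinates extended by four nucleotides on each side; the source does not state the exact overlap rule, and the function name, tuple layout and coordinates are hypothetical.

```python
def count_window_reads(reads, mirna_start, mirna_end, flank=4):
    """Sum read counts for reads lying fully inside the flanked miRNA window.

    reads: iterable of (start, end, count) tuples with inclusive coordinates.
    The containment rule is an assumption, not taken from the source.
    """
    lo = mirna_start - flank
    hi = mirna_end + flank
    return sum(count for start, end, count in reads if start >= lo and end <= hi)


# Hypothetical reads around a miRNA annotated at positions 100-120.
example_reads = [
    (100, 120, 10),  # exact miRNA
    (98, 118, 3),    # shifted 2 nt upstream, still inside the window
    (90, 110, 5),    # starts outside the window, not counted
    (103, 124, 2),   # ends exactly at the window edge, counted
    (101, 126, 7),   # ends outside the window, not counted
]
total = count_window_reads(example_reads, 100, 120)
```

Counting by window rather than by exact sequence keeps length and position variants of the same miRNA in the locus total.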
Table 2. New or recently identified miRNAs
| MIRNA family | Sequence | Loci | Conserved | miRNA* sequenced | Reads (miRNA) | Reads (miRNA*) | Validated or predicted target family | Validated or top predicted targets |
| miR472 | UUUUUCCUACUCCGCCCAUACC | 1 | Y | Y | 9 (206) | 2 (51) | | |
| miR773 | UUUGCUUCCAGCUUUUGUCUCC | 1 | N | N | 11 (735) | 0 (0) | | |
| miR774 | UUGGUUACCCAUAUGGCCAUC | 1 | N | N | 0 (348) | 0 (0) | | |
| miR775 | UUCGAUGUCUAGCAGUGCCA | 1 | N | Y | 1362 (1532) | 5 (3) | | |
| miR778 | UGGCUUGGUUUAUGUACACCG | 1 | N | Y | 0 (2) | 1 (7) | | |
| miR780.1 | UCUAGCAGCUGUUGAGCAGGU | 1 | N | Y | 57 (266) | 0 (118) | | |
| miR780.2 | UUCUUCGUGAAUAUCUGGCAU | | | | | | | |
| miR824 | UAGACCAUUUGUGAGAAGGGA | 1 | N | Y | 261 (2646) | 399 (2) | | |
| miR827 | UUAGAUGACCAUCAACAAACU | 1 | N | Y | 11 (20) | 0 (1) | | |
| miR842 | UCAUGGUCAGAUCCGUCAUCC | 1 | N | Y | 2 (97) | 2 (0) | | |
| miR844 | AAUGGUAAGAUUGCUUAUAAG | 1 | N | Y | 58 (1) | 2 (0) | | |
| miR846 | UUGAAUUGAAGUGCUUGAAUU | 1 | N | N | 72 (0) | 0 (0) | | |
| miR856 | UAAUCCUACCAAUAACUUCAGC | 1 | N | Y | 62 (7) | 9 (0) | | |
| miR857 | UUUUGUAUGUUGAAGGUGUAU | 1 | N | N | 59 (0) | 0 (0) | | |
| miR858 | UUUCGUUGUCUGUUCGACCUU | 1 | N | N | 55 (4) | 0 (0) | | |
| miR859 | UCUCUCUGUUGUGAAGUCAAA | 1 | N | N | 2 (5) | 0 (0) | | At3g17265 (0.5) |
| miR771 | UGAGCCUCUGUGGUAGCCCUCA | 1 | N | Y | 16 (906) | 0 (44) | - | - |
| miR776 | UCUAAGUCUUCUAUUGAUGUUC | 1 | N | N | 1439 (487) | 0 (0) | Serine/threonine kinase | At5g62310 (3) |
| miR777 | UACGCAUUGAGUUUCGUUGCUU | 1 | N | N | 8 (80) | 0 (0) | - | - |
| miR779 | UUCUGCUAUGUUGCUGCUCAUU | 1 | N | N | 2 (98) | 0 (0) | - | - |
| miR781 | UUAGAGUUUUCUGGAUACUUA | 1 | N | Y | 0 (77) | 1 (0) | CD2-binding, MCM | At5g23480 (2.5) |
| miR823 | UGGGUGGUGAUCAUAUAAGAU | 1 | N | N | 107 (1) | 0 (0) | Chromomethylase | At1g69770 (2.5) |
| miR825 | UUCUCAAGAAGGUGCAUGAAC | 1 | N | N | 120 (0) | 0 (0) | Remorin, zinc finger homeobox family, frataxin-related | At2g41870 (2.5) |
| miR829.2 | AGCUCUGAUACCAAAUGAUGGAAU | 1 | N | Y | 134 (41) | 3 (25) | - | - |
| miR830-5p | UCUUCUCCAAAUAGUUUAGGUU | 1 | N | Y | 2 (1) | - | RanBP1 domain, kinesin motor-related | At1g52380 (3) |
| miR830-3p | UAACUAUUUUGAGAAGAAGUG | | | | - | 3 (21) | - | - |
| miR833-3p | UAGACCGAUGUCAACAAACAAG | 1 | N | Y | 5 (2) | - | - | - |
| miR833-5p | UGUUUGUUGUACUCGGUCUAG | | | | - | 2 (1) | F-box | At1g77650 (3.5) |
| miR840 | ACACUGAAGGACCUAAACUAAC | 1 | N | Y | 20 (1) | 7 (116) | WHIRLY transcription factor | At2g02740 (0) |
| miR843 | UUUAGGUCGAGCUUCAUUGGA | 1 | N | Y | 7 (0) | 2 (0) | F-box, 1-aminocyclopropane-1-carboxylate synthase | At3g13830 (0.5) |
| miR845a | CGGCUCUGAUACCAAUUGAUG | 2 | N | Y | 670 (21) | 1 (40) | - | - |
| miR845b | UCGCUCUGAUACCAAAUUGAUG | | | | | | | |
| miR851-5p | UCUCGGUUCGCGAUCCACAAG | 1 | N | Y | 3 (281) | 1 (1) | - | - |
| miR852 | AAGAUAAGCGCCUUAGUUCUGA | 1 | N | Y | 3 (84) | 0 (1) | ATPase | At5g62670 (3) |
| miR853 | UCCCCUCUUUAGCUUGGAGAAG | 1 | N | N | 2 (0) | 0 (0) | - | - |
| miR860 | UCAAUAGAUUGGACUAUGUAU | 1 | N | Y | 14 (15) | 0 (1) | Histone deacetylase, ferrochelatase, RNA recognition motif | At5g26040 (0) |
| miR861-3p | GAUGGAUAUGUCUUCAAGGAC | 1 | N | Y | 6 (2) | - | - | - |
| miR861-5p | CCUUGGAGAAAUAUGCGUCAA | | | | - | 1 (8) | - | - |
| miR862-5p | UCCAAUAGGUCGAGCAUGUGC | 1 | N | Y | 5 (0) | - | - | - |
| miR862-3p | AUAUGCUGGAUCUACUUGAAG | | | | - | 2 (0) | - | - |
| miR863-3p | UUGAGAGCAACAAGACAUAAU | 1 | N | Y | 5 (0) | - | - | - |
| miR863-5p | UUAUGUCUUGUUGAUCUCAAU | | | | - | 2 (0) | Kinase, Legumain (C13 protease) | At2g26700 (3), At1g62710 (3.5) |
| miR864-5p | UCAGGUAUGAUUGACUUCAAA | 1 | N | Y | 3 (0) | - | Triacylglycerol lipase | At1g06250 (3) |
| miR864-3p | UAAAGUCAAUAAUACCUUGAAG | | | | - | 2 (0) | Expressed protein | At4g25210 (3) |
| miR865-5p | AUGAAUUUGGAUCUAAUUGAG | 1 | N | Y | 3 (0) | - | Serine carboxypeptidase, sulfate transporter | At5g42240 (3.5) |
| miR865-3p | UUUUUCCUCAAAUUUAUCCAA | | | | - | 1 (0) | DEAD box RNA helicase, DNA-binding bromodomain-containing protein | At2g07750 (3) |
| miR866-3p | ACAAAAUCCGUCUUUGAAGA | 1 | N | Y | 2 (0) | - | Kinase, electron transport SCO1/SenC, NAD-dependent G-3-P dehydrogenase | At4g21400 (3) |
| miR866-5p | UCAAGGAACGGAUUUUGUUAA | | | | - | 0 (5) | Expressed protein, C2 domain-containing protein | At4g21700 (3) |
| miR867 | UUGAACAUGGUUUAUUAGGAA | 1 | N | N | 30 (0) | 0 (0) | PHD finger-related/SET domain, kinase, phospholipase/carboxylesterase | At4g27910 (3.5) |
| miR868 | CUUCUUAAGUGCUGAUAAUGC | 1 | N | N | 9 (1) | 0 (0) | - | - |
| miR869.1 | UCUGGUGUUGAGAUAGUUGAC | 1 | N | N | 11 (5) | 0 (0) | - | - |
| miR869.2 | AUUGGUUCAAUUCUGGUGUUG | | | | | | | |
| miR870 | UAAUUUGGUGUUUCUUCGAUC | 1 | N | N | 4 (32) | 0 (0) | - | - |
Conserved between A. thaliana and P. trichocarpa.
Read counts are from all libraries in the ASRP (http://asrp.cgrb.oregonstate.edu/db) and MPSS Plus (http://mpss.udel.edu/at/) databases, and are listed as ASRP (MPSS Plus). Reads for each locus encompass the defined miRNA/miRNA* sequence ±4 nts on each side.
Top three predicted targets with a score of 3.5 or less are listed with their score in parentheses. Targets validated by 5′ RACE are in bold. Remaining number of targets predicted with a score of 3.5 or less are listed in square brackets (Table S1). Dashes indicate no predicted targets with a score of 3.5 or less.
Targets validated by Lu et al. [23].
Target tested but failed in 5′RACE validation assays.
Seventeen nt MPSS Plus signature was extended 4 nts on the 3′ end.
Figure 2. Comparison of conserved and non-conserved MIRNA families. (A) Effect of dcl1-7 and hen1-1 mutations on levels of target transcripts for conserved (black) and non-conserved (red) miRNAs. Expression data are shown for two validated or high-confidence predicted targets, if available, for each family. Arrows indicate targets for miR824 (AGL16), miR858 (MYB12) and miR161.1 (At1g63130, a PPR gene). (B) Numbers of gene family members for conserved and non-conserved MIRNAs (Tables 1 and 2). (C) Relative numbers of miRNA target family functions for conserved and non-conserved miRNAs (Tables 1 and 2). Only target classes that have been validated experimentally are included. Note that Table 2 shows many MIRNA families with weak or no predicted targets, and these are not represented in the chart.
Figure 3. Expression profiling of MIRNA families using high-throughput pyrosequencing. (A) Comparison of most-abundant miRNA families between biological replicates of Col-0 inflorescence (inf.) tissue. Normalized reads for each miRNA family member were consolidated. Note that MIR159 and MIR319 derived members were counted separately, even though they are frequently assigned to the same family [31], [57]. (B) Fold-change of miRNAs in dcl1-7 inflorescence versus Col-0 inflorescence (left axis, bars). Total number of reads for each family is indicated (right axis, green line). As the dcl1-7 mutant contained no reads for many miRNA families, fold-change was calculated using normalized reads + 1. This had the effect of dampening fold-change values for low-abundance families. (C) Fold-change of miRNA family reads in leaves at 1 hr and 3 hr post-inoculation with P. syringae (DC3000hrcC) (left axis, bars). Fold-change relative to uninoculated leaves was calculated based on normalized reads as described in panel (B). Total number of reads in the control and inoculated samples is shown (right axis, green line). Grey dashed lines indicate the p = 0.05 upper and lower thresholds.
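The pseudocount fold-change described for panel (B) can be sketched in Python as follows. This is only an illustration: the function name and the example read values are hypothetical, and the normalization that produces the input values is not specified in this record.

```python
def fold_change(sample_norm_reads, control_norm_reads, pseudocount=1.0):
    """Fold change computed on (normalized reads + pseudocount).

    Adding 1 to both values avoids division by zero and avoids a fold
    change of exactly zero when a family has no reads in one sample.
    """
    return (sample_norm_reads + pseudocount) / (control_norm_reads + pseudocount)


# Hypothetical normalized read values: the family is absent in the mutant.
low_abundance = fold_change(0.0, 4.0)     # (0 + 1) / (4 + 1)
high_abundance = fold_change(0.0, 999.0)  # (0 + 1) / (999 + 1)
```

With these made-up values, a complete loss of a low-abundance family gives a fold change of 0.2, whereas the same loss for a high-abundance family gives 0.001, which is the dampening effect the caption mentions.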
Figure 4. Identification of MIRNA foldbacks with similarity to protein-coding genes. (A) Flowchart for identification of MIRNA foldbacks with similarity, extending beyond the miRNA target site, to protein-coding genes. (B) Arabidopsis gene or transcript hits in FASTA searches using foldback sequences for all conserved and non-conserved MIRNA loci (Tables 1 and 2). The top four hits based on E-values are shown. (C) Z-scores for the Needleman-Wunsch alignment values from MIRNA foldback arms with top four gene or transcript FASTA hits. Alignments were done with intact foldback arms (I), and with foldback arms in which miRNA or miRNA-complementary sequences were deleted (D). Z-scores were derived from standard deviation values for alignments of randomized sequences. In (B) and (C), a red symbol represents an experimentally validated target, a pink symbol indicates a gene from a validated target family, and an open symbol indicates a gene that is distinct from either the validated or predicted target family.
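The Z-score idea in panel (C) can be sketched in Python under stated assumptions. The scoring parameters (match +1, mismatch -1, linear gap -2), the choice to shuffle the foldback arm, the number of shuffles and all function names are illustrative and are not taken from the source, which does not give these details here.

```python
import random
import statistics


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffled_z_score(arm, gene, n_shuffles=200, seed=0):
    """Z-score of the observed score against scores for shuffled arms."""
    rng = random.Random(seed)
    observed = nw_score(arm, gene)
    null_scores = []
    for _ in range(n_shuffles):
        letters = list(arm)
        rng.shuffle(letters)  # randomize order, keep base composition
        null_scores.append(nw_score("".join(letters), gene))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0  # no spread in the null scores; Z-score is undefined
    return (observed - statistics.mean(null_scores)) / sd


# Toy check: a sequence aligned to itself should score above its shuffles.
z_self = shuffled_z_score("UUCGAUGUCUAGCAGUGCCA", "UUCGAUGUCUAGCAGUGCCA")
```

Shuffling preserves base composition, so a high Z-score reflects similarity in sequence order rather than in nucleotide content alone.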
Figure 5. Similarity between MIRNA foldback arms and protein-coding genes. Each alignment contains the coding strand for 1–3 genes, the miRNA* arm, and the miRNA arm. Orientation of the foldback arms is indicated by (+) for authentic polarity and (−) for the reverse complement polarity. Two alignments are given for MIR824 because the two arms are each most similar to distinct, duplicated regions within the AGL16 gene (At3g57230). Alignments were generated using T-COFFEE. Colors indicate alignment quality in a regional context.
Figure 6. Targeting specificity of recently evolved MIRNAs. Two target prediction scores are shown for each of 16 miRNAs: best overall predicted target score (blue) and target scores calculated for MIRNA foldback-similar genes (grey). Left column indicates whether or not the best overall predicted target gene is in the same family as the foldback-similar gene. A dot indicates that the predicted gene is in an experimentally validated target family. Two calculations corresponding to the two major populations from the MIR161 locus (miR161.1 and miR161.2) are shown. The identities of targets are listed in Supplemental Table S3. The plot is centered on a target prediction score of 4, as this corresponds to the upper limit of a reasonable prediction.