Stuart J Lucas, Hikmet Budak.
Abstract
Individual chromosome-based studies of bread wheat are beginning to provide valuable structural and functional information about one of the world's most important crops. As new genome sequences become available, identifying miRNA coding sequences is arguably as important a task as annotating protein coding sequences, but one that is not as well developed. We compared conservation-based identification of conserved miRNAs in 1.5× coverage survey sequences of wheat chromosome 1AL with a predictive method based on pre-miRNA hairpin structure alone. In total, 42 sequences expected to encode conserved miRNAs were identified on chromosome 1AL, including members of several miRNA families that have not previously been reported to be expressed in T. aestivum. In addition, we demonstrate that a number of sequences previously annotated as novel wheat miRNAs are closely related to transposable elements, particularly Miniature Inverted Terminal repeat Elements (MITEs). Some of these TE-miRNAs may well have a functional role, but separating true miRNA coding sequences from TEs in genomic sequences is far from straightforward. We propose a strategy for annotation to minimize the risk of mis-identifying TE sequences as miRNAs.
Year: 2012 PMID: 22815845 PMCID: PMC3398953 DOI: 10.1371/journal.pone.0040859
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Conserved miRNAs shown to be expressed from T. aestivum chromosome 1AL.
| miRNA ID (chromosome 1AL) | Sequence | Length (nt) | Conserved miRNA ID | Mature miRNA start | Mature miRNA end | Arm | Pre-miRNA length | MFE | GC% | MFEI | Matched sequence read ID |
| tae-miR171a | UGAUUGAGCCGCGCCAAUAU | 20 | zma-miR171a | 78 | 97 | 3′ | 118 | −59 | 48.31 | 1.04 | |
| tae-miR393a | UCCAAAGGGAUCGCAUUGAUCC | 22 | bdi-miR393a | 20 | 41 | 5′ | 130 | −66 | 53.08 | 0.95 | F2MIQBM01BALL6 |
| tae-miR399b | UGCCAAAGGAGAAUUGCCCUG | 21 | bdi-miR399b | 120 | 140 | 3′ | 161 | −64 | 57.14 | 0.69 | F1NBZEY01AK17M |
| tae-miR399b | UGCCAAAGGAGAAUUGCCCUG | 21 | bdi-miR399b | 111 | 131 | 3′ | 152 | −64 | 57.89 | 0.73 | F1NBZEY02GW67Y |
| tae-miR5075 | | 21 | osa-miR5075 | 20 | 40 | 5′ | 308 | −147 | 69.16 | 0.69 | F0RUNSI01CTDLK |
| tae-miR5050 | | 21 | hvu-miR5050 | 92 | 112 | 3′ | 133 | −72 | 60.90 | 0.89 | |
| tae-miR5050 | | 21 | hvu-miR5050 | 92 | 112 | 3′ | 133 | −75 | 60.90 | 0.92 | |
| tae-miR5200 | UGUAGAUACUC | 21 | bdi-miR5200 | 76 | 96 | 3′ | 117 | −39 | 38.46 | 0.86 | |
Where two similar known miRNAs gave equally close matches to a sequence, the evolutionarily closest match is given.
Matched sequence reads shown in bold were also predicted to form miRNA hairpins by miRPara.
Mismatches to the conserved miRNA sequence are underlined and in bold.
MFE = Minimum Folding free Energy of predicted hairpin secondary structure.
MFEI = Minimum Folding Energy Index, calculated as described by Yin et al. [40].
High-confidence predicted miRNA coding sequences on chromosome 1AL.
| miRNA ID (chromosome 1AL) | Sequence | Length (nt) | Conserved miRNA ID | Mature miRNA start | Mature miRNA end | Arm | Pre-miRNA length | MFE | GC% | MFEI | Matched sequence read ID |
| tae-miR156a | UGACAGAAGAGAGUGAGCAC | 20 | aly-miR156a | 20 | 39 | 5′ | 125 | −62 | 53.60 | 0.93 | |
| tae-miR156a | UGACAGAAGAGAGUGAGCAC | 20 | aly-miR156a | 20 | 39 | 5′ | 124 | −69 | 58.87 | 0.95 | |
| tae-miR164a | UGGAGAAGCAGGGCACGUGCA | 21 | aly-miR164a | 20 | 40 | 5′ | 140 | −75 | 55.71 | 0.96 | |
| tae-miR164a | UGGAGAAGCAGGGCACGUGCA | 21 | aly-miR164a | 20 | 40 | 5′ | 140 | −75 | 55.71 | 0.96 | |
| tae-miR166b* | GGAAUGUUGUCUGGUUCAAGG | 21 | zma-miR166b* | 20 | 40 | 5′ | 136 | −55 | 46.32 | 0.87 | |
| tae-miR166e | CUCGGACCAGGCUUCAUUCCC | 21 | bdi-miR166e | 94 | 114 | 3′ | As above | | | | |
| tae-miR166a | UCGGACCAGGCUUCAUUCCCC | 21 | aly-miR166a | 95 | 115 | 3′ | 136 | −54 | 47.06 | 0.84 | |
| tae-miR166b* | GGAAUGUUGUCUGGUUCAAGG | 21 | zma-miR166b* | 20 | 40 | 5′ | As above | | | | |
| tae-miR166a | UCGGACCAGGCUUCAUUCCCC | 21 | aly-miR166a | 96 | 116 | 3′ | 137 | −55 | 46.72 | 0.86 | |
| tae-miR166b* | GGAAUGUUGUCUGGUUCAAGG | 21 | zma-miR166b* | 20 | 40 | 5′ | As above | | | | |
| tae-miR171a | UGAUUGAGCCGCGCCAAUAU | 20 | zma-miR171a | 78 | 97 | 3′ | 118 | −59 | 48.31 | 1.04 | |
| tae-miR171b | UUGAGCCGUGCCAAUAUCAC | 20 | zma-miR171b | 82 | 101 | 3′ | 122 | −59 | 59.02 | 0.82 | |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 133 | 152 | 3′ | 173 | −74 | 45.09 | 0.94 | F2MIQBM01BUBWA |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 133 | 152 | 3′ | 173 | −73 | 45.66 | 0.92 | F1NBZEY02GWGQ1 |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 134 | 153 | 3′ | 174 | −69 | 45.98 | 0.86 | F1ADE5F01BAC6G |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 136 | 155 | 3′ | 176 | −70 | 44.32 | 0.90 | F1ADE5F01AFV6T |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 135 | 154 | 3′ | 175 | −73 | 44.00 | 0.95 | F1ADE5F01DARNZ |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 136 | 155 | 3′ | 176 | −74 | 44.32 | 0.95 | F1ADE5F01AJ0FQ |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 133 | 152 | 3′ | 173 | −69 | 45.66 | 0.87 | F1ADE5F01BL63C |
| tae-miR172a | AGAAUCUUGAUGAUGCUGCA | 20 | csi-miR172a | 136 | 155 | 3′ | 176 | −70 | 43.75 | 0.91 | F1ADE5F01AY2DE |
| tae-miR399b | UGCCAAAGGAGAAUUGCCCUG | 21 | bdi-miR399b | 100 | 120 | 3′ | 141 | −64 | 59.57 | 0.76 | |
| tae-miR399b | UGCCAAAGGAGAAUUGCCCUG | 21 | bdi-miR399b | 139 | 159 | 3′ | 180 | −84 | 59.44 | 0.78 | F2MIQBM01AUXGN |
| tae-miR399b | UGCCAAAGGAGAAUUGCCCUG | 21 | bdi-miR399b | 138 | 158 | 3′ | 179 | −84 | 59.22 | 0.79 | F2MIQBM01B24OQ |
| tae-miR399k | UGCCAAAGGAAAUUUGCCCC | 21 | osa-miR399k | 93 | 113 | 3′ | 134 | −54 | 58.96 | 0.68 | F1ADE5F01C5M0T |
| tae-miR1138 | G | 23 | tae-miR1138 | 20 | 42 | 5′ | 173 | −57 | 32.37 | 1.01 | F0RUNSI01BXR55 |
| tae-miR2118g | UUCCUAAUGCCUCCCAUUCCUA | 22 | osa-miR2118g | 97 | 118 | 3′ | 139 | −71 | 43.88 | 1.17 | |
| tae-miR2118b | UUCCCGAUGCCUC | 22 | osa-miR2118b | 96 | 117 | 3′ | 138 | −59 | 46.38 | 0.92 | |
| tae-miR2118e | UU | 22 | zma-miR2118e | 98 | 119 | 3′ | 140 | −53 | 42.14 | 0.90 | |
| tae-miR2118f | UU | 22 | osa-miR2118f | 96 | 117 | 3′ | 138 | −49 | 40.58 | 0.88 | |
| tae-miR2118f | UUCCUGAUGCCUCCCAUUCCUA | 22 | osa-miR2118f | 101 | 122 | 3′ | 143 | −49 | 47.55 | 0.73 | F1ADE5F01D1MVB |
| tae-miR2905 | | 21 | osa-miR2905 | 61 | 81 | 3′ | 102 | −53 | 54.90 | 0.94 | F1ADE5F01EPHEM |
| tae-miR2905 | | 21 | osa-miR2905 | 61 | 81 | 3′ | 102 | −57 | 54.90 | 1.02 | F2MIQBM02EQP10 |
| tae-miR5049 | | 21 | hvu-miR5049 | 20 | 40 | 5′ | 88 | −56 | 38.64 | 1.65 | |
| tae-miR5050 | | 21 | hvu-miR5050 | 94 | 114 | 3′ | 135 | −74 | 61.48 | 0.89 | |
Where two similar known miRNAs gave equally close matches to a sequence, the evolutionarily closest match is given.
Matched sequence reads shown in bold were also predicted to form miRNA hairpins by miRPara.
Mismatches to the conserved miRNA sequence are underlined and in bold.
MFE = Minimum Folding free Energy of predicted hairpin secondary structure.
MFEI = Minimum Folding Energy Index, calculated as described by Yin et al. [40].
miRPara did not predict these hairpins, but predicted a pre-miRNA on the complementary strand. For F003IAL01CL160, which contains two adjacent pre-miRNA hairpins, miRPara predicted the same strand for one but the complementary strand for the other.
TE-related possible miRNA coding sequences (TE-miR) on chromosome 1AL.
| miRNA ID (chromosome 1AL) | Sequence | Length (nt) | Conserved miRNA ID | Mature miRNA start | Mature miRNA end | Arm | Pre-miRNA length | MFE | GC% | MFEI | Matched sequence read ID(s) |
| tae-miR437 | AAAGUUAGAGAAGUUUGACUU | 21 | osa-miR437 | 172 | 192 | 3′ | 199 | −52 | 26.13 | 1.01 | F1ADE5F01CDHM0 |
| tae-miR818a | AAU | 22 | osa-miR818a | 65 | 86 | 3′ | 107 | −62 | 33.64 | 1.73 | |
| tae-miR1118 | | 23 | tae-miR1118 | 20 | 42 | 5′ | 106 | −46 | 38.68 | 1.12 | F1NBZEY01AFQVZ |
| tae-miR1118 | CACUACAUU | 23 | tae-miR1118 | 201 | 223 | 3′ | 234 | −178 | 46.58 | 1.63 | F1ADE5F01EHKQU |
| tae-miR1118 | CACUACAUU | 23 | tae-miR1118 | 192 | 214 | 3′ | 235 | −176 | 46.81 | 1.60 | F1ADE5F01EHKQU |
| tae-miR1121 | AGUAGUGAUCUAAACGCUCUUA | 22 | tae-miR1121 | 62 | 83 | 3′ | 104 | −58 | 32.69 | 1.69 | |
| tae-miR1121 | A | 22 | tae-miR1121 | 115 | 136 | 3′ | 157 | −64 | 32.48 | 1.25 | F0RUNSI01D7KCV |
| tae-miR1121 | A | 22 | tae-miR1121 | 115 | 136 | 3′ | 157 | −59 | 33.12 | 1.13 | F0RUNSI01D7KCV |
| tae-miR1121 | A | 22 | tae-miR1121 | 66 | 87 | 3′ | 108 | −59 | 31.48 | 1.74 | |
| tae-miR1121 | | 22 | tae-miR1121 | 64 | 85 | 3′ | 106 | −51 | 29.25 | 1.64 | |
| tae-miR1121 | AGUA | 22 | tae-miR1121 | 61 | 82 | 3′ | 103 | −40 | 29.13 | 1.34 | |
| tae-miR1125 | AACCAACGAGACC | 24 | tae-miR1125 | 20 | 43 | 5′ | 126 | −96 | 42.06 | 1.81 | F1ADE5F01EHKQU |
| tae-miR1125 | AACCAACGAGACC | 24 | tae-miR1125 | 20 | 43 | 5′ | 153 | −102 | 39.87 | 1.68 | F0RUNSI01BM4RW |
| tae-miR1127 | AACUACUCCCUCCGUCCCAUA | 21 | bdi-miR1127 | 20 | 40 | 5′ | 119 | −53 | 36.13 | 1.23 | F003IAL01CP7OA |
| tae-miR1127 | | 21 | bdi-miR1127 | 20 | 40 | 5′ | 114 | −50 | 42.11 | 1.04 | F0RUNSI02IG9O8 |
| tae-miR1128 | UACUACUCCCUCCGU | 21 | ssp-miR1128 | 20 | 40 | 5′ | 94 | −27 | 41.49 | 0.68 | F1ADE5F01DQMRT |
| tae-miR1128 | UACUACUCCCUCCGUCCCA | 21 | ssp-miR1128 | 20 | 40 | 5′ | 100 | −48 | 38.00 | 1.26 | F003IAL01C154D |
| tae-miR1133 | | 22 | tae-miR1133 | 20 | 41 | 5′ | 96 | −37 | 39.58 | 0.96 | F2MIQBM02EYQFE |
| tae-miR1139 | | 21 | bdi-miR1139 | 20 | 40 | 5′ | 70 | −25 | 40.00 | 0.91 | F2MIQBM01A78VO |
| tae-miR1139 | | 21 | bdi-miR1139 | 20 | 40 | 5′ | 84 | −29 | 25.00 | 1.37 | F1NBZEY02F39NT |
| tae-miR1139 | | 21 | bdi-miR1139 | 20 | 40 | 5′ | 90 | −39 | 40.00 | 1.08 | F1ADE5F01E4WAQ, F003IAL01BA096 |
| tae-miR1439 | UUUUGGAACGGAG | 21 | osa-miR1439 | 62 | 82 | 3′ | 103 | −38 | 38.83 | 0.95 | F0RUNSI02G0499 |
| tae-miR5203 | ACUUAUUAUGGA | 21 | bdi-miR5203 | 83 | 103 | 3′ | 124 | −52 | 42.74 | 0.98 | F0RUNSI01BLKK9 |
| tae-miR5203 | ACUUAUUAUGGA | 21 | bdi-miR5203 | 84 | 104 | 3′ | 125 | −44 | 32.80 | 1.07 | F2MIQBM02DJIDZ |
Where two similar known miRNAs gave equally close matches to a sequence, the evolutionarily closest match is given.
Matched sequence reads shown in bold were also predicted to form miRNA hairpins by miRPara.
Mismatches to the conserved miRNA sequence are underlined and in bold.
MFE = Minimum Folding free Energy of predicted hairpin secondary structure.
MFEI = Minimum Folding Energy Index, calculated as described by Yin et al. [40].
Highly represented repeat-related miRNA families in 1AL survey sequences.
| Conserved miRNA | Sequence | Hits passing hairpin criteria | Hits matching known repeats | Families of known repeats matched |
| tae-miR1117 | UAGUACCGGUUCGUGGCACGAACC | 471 | 471 | CACTA, Unknown |
| tae-miR1118 | CACUACAUUAUGGAAUGGAGGGA | 76 | 73 | Mariner |
| hvu-miR1120 tae-miR1120 | ACAUUCUUAUAUUAUGGGACGGAG ACAUUCUUAUAUUAUGAGACGGAG | 220 | 166 | Mariner, CACTA |
| bdi-miR1122 far-miR1122 | UAGAUACAUCCGUAUUUGGA UAGAUACAUCCGUAUCUAGA | 437 | 437 | Mariner |
| tae-miR1125 | AACCAACGAGACCAACUGCGGCGG | 24 | 22 | Mariner |
| tae-miR1126 | UCCACUAUGGACUACAUACGGAG | 72 | 72 | Mariner |
| bdi-miR1127 tae-miR1127 | AACUACUCCCUCCGUCCGAUA UCCUUCCGUUCGGAAUUAC | 14 | 12 | Mariner, CACTA |
| ssp-miR1128 tae-miR1128 | UACUACUCCCUCCGUCCCAAA UACUACUCCCUCCGUCCGAAA | 99 | 97 | Mariner, CACTA |
| tae-miR1130 | CCUCCGUCUCGUAAUGUAAGACG | 66 | 31 | Mariner, CACTA |
| tae-miR1131 | UAGUACCGGUUCGUGGCUAACC | 182 | 182 | CACTA |
| tae-miR1133 | CAUAUACUCCCUCCGUCCGAAA | 61 | 60 | Mariner |
| bdi-miR1135 tae-miR1135 | UUUCGACAAGUAAUUCCGACCGGA CUGCGACAAGUAAUUCCGAACGGA | 201 | 201 | Mariner |
| tae-miR1136 | UUGUCGCAGGUAUGGAUGUAUCUA | 226 | 226 | Mariner |
| tae-miR1137 | UAGUACAAAGUUGAGUCAUC | 146 | 146 | Mariner |
| tae-miR1139 | AGAGUAACAUACACUAGUAACA | 168 | 164 | Harbinger |
| hvu-miR1436 | ACAUUAUGGGACGGAGGGAGU | 397 | 247 | Mariner, CACTA |
| osa-miR1439 | UUUUGGAACGGAGUGAGUAUU | 172 | 172 | Mariner |
| ath-miR5021 | UGAGAAGAAGAAGAAGAAAA | 19 | 19 | Trinucleotide, CACTA |
| bdi-miR5203 | ACUUAUUAUGGACCGGAGGGA | 11 | 9 | Mariner |
Repeats were classified using the system proposed by Wicker et al. [29].
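The two hit counts in the table above can be combined into a repeat-associated percentage for each miRNA family. The short Python sketch below does this for four rows copied from the table; the function name `repeat_associated_percent` is illustrative and is not part of the original analysis.

```python
def repeat_associated_percent(hits_passing: int, hits_in_repeats: int) -> float:
    """Percentage of hairpin-passing hits that also match a known repeat."""
    if hits_passing <= 0:
        raise ValueError("hits_passing must be positive")
    if not 0 <= hits_in_repeats <= hits_passing:
        raise ValueError("hits_in_repeats must lie between 0 and hits_passing")
    return 100.0 * hits_in_repeats / hits_passing


# (hits passing hairpin criteria, hits matching known repeats), copied from the table
table_rows = {
    "tae-miR1117": (471, 471),
    "tae-miR1118": (76, 73),
    "tae-miR1130": (66, 31),
    "hvu-miR1436": (397, 247),
}

for name, (passing, in_repeats) in table_rows.items():
    pct = repeat_associated_percent(passing, in_repeats)
    print(f"{name}: {pct:.1f}% of hairpin-passing hits match known repeats")
```

Viewed this way, families such as tae-miR1130 and hvu-miR1436, where a sizeable share of hits does not match a known repeat, stand apart from families such as tae-miR1117, where every hit does.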
Figure 1. Representation of putative novel wheat miRNAs in 1AL survey sequences.
39 putative novel wheat miRNAs reported by Wei et al. [17] were screened for presence in the 1AL survey sequences. ‘1AL sequences’ = number of sequences similar to each putative miRNA with good hairpin characteristics. MITEs = the number of the same sequences that were identified as Miniature Inverted Terminal repeat Elements.
Putative novel wheat miRNAs discovered by Wei et al. [17] identified in chromosome 1AL survey sequences.
| Putative wheat-specific miRNA ID | Length | Sequence | Mismatch | Mature miRNA start | Mature miRNA end | Arm | Pre-miRNA length | MFE | GC% | MFEI | Matched sequence read |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 73 | 94 | 3′ | 115 | −56.6 | 44.35 | 1.11 | F0RUNSI02FRR0A |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 74 | 95 | 3′ | 116 | −60.2 | 43.10 | 1.20 | F0RUNSI01AQXNL |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 76 | 97 | 3′ | 118 | −58.9 | 42.37 | 1.18 | F1NBZEY01D4XAJ |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 74 | 95 | 3′ | 116 | −63.5 | 43.97 | 1.25 | F1ADE5F01DCEMU |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 74 | 95 | 3′ | 116 | −62.8 | 43.10 | 1.26 | F1ADE5F01A2VHK |
| tae-miR2003 | 22 | CGGUUGGGCUGUAUGAUGGCGA | 0 | 74 | 95 | 3′ | 116 | −62.8 | 43.10 | 1.26 | F1ADE5F01A9LZ0 |
| tae-miR2007 | 22 | CAAGAUAUUGGGUAUUUCUGUC | 0 | 45 | 66 | 3′ | 87 | −35.3 | 26.44 | 1.53 | F1ADE5F01CRURV |
| tae-miR2007 | 22 | CAAGAUAUUGGGUAUUUCUGUC | 0 | 46 | 67 | 3′ | 88 | −42.2 | 26.14 | 1.83 | F1ADE5F01BJ39D |
| tae-miR2007 | 22 | CAAGAUAUUGGGUAUUUCUGUC | 0 | 46 | 67 | 3′ | 88 | −40.7 | 27.27 | 1.70 | F1ADE5F01BJ39D |
| tae-miR2018 | 20 | GC | 1 | 20 | 39 | 5′ | 328 | −104 | 41.77 | 0.76 | F003IAL01AFVSB |
| tae-miR2020 | 21 | AUAGCAUCAUCCAUCCUACC | 1 | 20 | 40 | 5′ | 109 | −53.8 | 49.54 | 1.00 | F1ADE5F01DWH7W |
| tae-miR2023-a | 22 | UUUUGCCGGUUGAACGACCUCA | 0 | 20 | 41 | 5′ | 142 | −70.2 | 62.68 | 0.79 | F1ADE5F01D1QBQ |
| tae-miR2023-a | 22 | UUUUGCCGGUUGAACGACCUCA | 0 | 20 | 41 | 5′ | 140 | −74.1 | 62.14 | 0.85 | F1ADE5F01D2FWK |
| tae-miR2023-b | 22 | UUUUGCUGGUUGAACGACCUCA | 0 | 20 | 41 | 5′ | 142 | −75.1 | 61.97 | 0.85 | F1ADE5F01D2FWK |
| tae-miR2032 | 21 | UGUAGAUACUCCCUAAGGCUU | 0 | 76 | 96 | 3′ | 117 | −38.8 | 38.46 | 0.86 | F2MIQBM01ARRO6 |
This pre-miRNA had an identical wheat EST match.
pre-miRNA sequence matched a transposable element, but only one copy was found in 1AL.
miR2023 and miR5050 (see Tables 1 & 2) derive from opposite arms of the same miRNA:miRNA* duplex.
miR2032 is identical to miR5200 (see Table 1).
Mismatches to the conserved miRNA sequence are underlined and in bold.
MFE = Minimum Folding free Energy of predicted hairpin secondary structure.
MFEI = Minimum Folding Energy Index, calculated as described by Yin et al. [40].
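The MFEI column can be checked against the other pre-miRNA statistics. The sketch below assumes the widely used definition, in which the MFE is normalized to 100 nt of hairpin length and then divided by the GC percentage; the source only cites Yin et al. [40] for the calculation, so this formula is an assumption, although it reproduces the tabulated values for the rows tested. The function name `mfei` is illustrative.

```python
def mfei(mfe: float, length_nt: int, gc_percent: float) -> float:
    """Minimum Folding Energy Index under the assumed definition:
    (|MFE| / hairpin length * 100) / GC%.
    """
    if length_nt <= 0 or gc_percent <= 0:
        raise ValueError("length_nt and gc_percent must be positive")
    adjusted_mfe = abs(mfe) / length_nt * 100.0  # MFE per 100 nt of hairpin
    return adjusted_mfe / gc_percent


# (read ID, MFE, pre-miRNA length, GC%), copied from the table above
rows = [
    ("F0RUNSI02FRR0A", -56.6, 115, 44.35),
    ("F0RUNSI01AQXNL", -60.2, 116, 43.10),
    ("F1ADE5F01CRURV", -35.3, 87, 26.44),
]

for read_id, mfe, length_nt, gc_percent in rows:
    print(read_id, f"{mfei(mfe, length_nt, gc_percent):.2f}")
```

For these three rows the computed values round to 1.11, 1.20 and 1.53, matching the MFEI column. In the tables where the MFE is reported as a whole number, small differences in the last digit are to be expected.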
Figure 2. Comparison of mature miRNA locations in hairpins predicted by sequence similarity and by miRPara.
Each sequence is preceded by the unique sequence read ID with the start position of the predicted hairpin appended. The ranges of possible mature miRNA locations predicted by miRPara are shaded in blue. The location of the conserved mature miRNAs found by similarity search is shown in capital letters and underlined. Nucleotides of the conserved mature miRNA that are found outside the predicted hairpin are highlighted in red.
Figure 3. Representation of different repeat element families in BAC-end sequences and predicted hairpins from 1AL.
1AL BES sequences were obtained as described previously [26]. Hairpins were predicted using miRPara’s plant miRNA prediction model. Repeat content was calculated as the cumulative length of all nucleotides marked as being part of repetitive elements, and expressed as a percentage of the total length of each dataset. The 1AL BESs included 7,568,093 nt of which 81.97% matched known repeat elements. The predicted hairpins included 8,081,278 nt of which 72.8% matched known repeat elements.
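The repeat-content calculation described in this caption can be sketched as follows. This is a minimal illustration, assuming the sequences are soft-masked so that repeat-annotated nucleotides appear in lowercase (a common convention for repeat-masking output, but not stated in the source); the function name and the toy sequences are hypothetical.

```python
from typing import Iterable


def repeat_content_percent(soft_masked_sequences: Iterable[str]) -> float:
    """Cumulative length of repeat-marked (lowercase) nucleotides,
    expressed as a percentage of the total length of the dataset.
    """
    total_nt = 0
    repeat_nt = 0
    for seq in soft_masked_sequences:
        total_nt += len(seq)
        repeat_nt += sum(1 for base in seq if base.islower())
    if total_nt == 0:
        raise ValueError("dataset contains no nucleotides")
    return 100.0 * repeat_nt / total_nt


# Toy dataset: 20 nt in total, 10 of them lowercase (repeat-marked)
toy_dataset = ["ACGTacgtacGT", "ttttGGGG"]
print(f"{repeat_content_percent(toy_dataset):.2f}%")
```

Applied in reverse to the reported figures, 81.97% of 7,568,093 nt corresponds to roughly 6.2 million repeat-annotated nucleotides in the BESs, and 72.8% of 8,081,278 nt to roughly 5.9 million in the predicted hairpins.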