Václav Brázda, Jiří Lýsek, Martin Bartas, Miroslav Fojta.
Abstract
Chloroplasts are key organelles in the management of oxygen in algae and plants and are therefore crucial for all living beings that consume oxygen. Chloroplasts typically contain a circular DNA molecule with nucleus-independent replication and heredity. Using "palindrome analyser", we performed complete analyses of short inverted repeats (S-IRs) in all chloroplast DNAs (cpDNAs) available from the NCBI genome database. Our results provide basic parameters of cpDNAs, including comparative information on localization, frequency, and differences in S-IR presence. Across the 2,565 available cpDNA sequences, the average frequency of S-IRs in cpDNA genomes is 45 S-IRs per kbp, significantly higher than that found in mitochondrial DNA sequences. The frequency of S-IRs in cpDNAs generally decreased with S-IR length, except for S-IRs 15, 22, 24, or 27 bp long, which are significantly more abundant than S-IRs of other lengths. These results point to the importance of specific S-IRs in cpDNA genomes. Moreover, a comparison of S-IR similarities by Levenshtein distance showed that a limited number of S-IR sequences are shared by the majority of cpDNAs. S-IRs are not located randomly in cpDNAs but are enriched, in a length-dependent manner, in specific locations, including the repeat region, stem, introns, and tRNA regions. The highest enrichment was found for S-IRs of 12 bp and longer in the stem-loop region, followed by S-IRs of 12 bp and longer located before the repeat region. On the other hand, S-IRs are relatively rare in rRNA sequences and around introns. These data show nonrandom and conserved arrangements of S-IRs in chloroplast genomes.
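To make the quantities in the abstract concrete, the following is a minimal Python sketch of how perfect inverted repeats could be located, how a per-kbp frequency follows from the hit count, and how a Levenshtein distance between two S-IR sequences is computed. It is not the Palindrome analyser implementation: the function names and the parameters `min_arm`, `max_arm`, and `max_spacer` are hypothetical, their default values are assumptions, and mismatches inside the arms are not handled.

```python
# Hypothetical sketch: perfect inverted repeats only, no mismatches allowed.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pairs(a, b):
    """True if base a is the Watson-Crick complement of base b."""
    return COMPLEMENT.get(a) == b


def find_inverted_repeats(seq, min_arm=6, max_arm=30, max_spacer=10):
    """Return (start, arm_length, spacer_length) for each maximal perfect IR.

    An IR is a left arm, an optional spacer, and a right arm that is the
    reverse complement of the left arm. Default limits are assumptions.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for p in range(n - 1):                # p: last base of the left arm
        for s in range(max_spacer + 1):   # s: spacer length
            q = p + 1 + s                 # q: first base of the right arm
            if q >= n:
                break
            # If the outermost spacer bases pair, this hit is part of a
            # repeat with a shorter spacer, which is reported instead.
            if s >= 2 and pairs(seq[p + 1], seq[q - 1]):
                continue
            arm = 0
            while (p - arm >= 0 and q + arm < n and arm < max_arm
                   and pairs(seq[p - arm], seq[q + arm])):
                arm += 1
            if arm >= min_arm:
                hits.append((p - arm + 1, arm, s))
    return hits


def frequency_per_kbp(seq, **kwargs):
    """Number of detected IRs per 1000 bp of sequence."""
    return 1000 * len(find_inverted_repeats(seq, **kwargs)) / len(seq)


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

For example, `find_inverted_repeats("GAATTC", min_arm=3)` reports the EcoRI site as a single repeat with 3 bp arms and no spacer. A linear scan like this treats the sequence as linear; for a circular cpDNA, repeats spanning the origin would need the sequence to be extended by its own prefix.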
Year: 2018 PMID: 30140690 PMCID: PMC6081594 DOI: 10.1155/2018/1097018
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Numbers and frequencies of S-IRs according to size.
| IR size | Amount in dataset | IR frequency per 1000 bp | IR size | Amount in dataset | IR frequency per 1000 bp | IR size | Amount in dataset | IR frequency per 1000 bp |
|---|---|---|---|---|---|---|---|---|
| 6 | 10,351,040 | 26.899 | 15 | 13,370 | 0.035 | 24 | 1,619 | 0.004 |
| 7 | 4,157,127 | 10.803 | 16 | 6,641 | 0.017 | 25 | 1,005 | 0.003 |
| 8 | 1,656,101 | 4.304 | 17 | 5,505 | 0.014 | 26 | 760 | 0.002 |
| 9 | 637,184 | 1.656 | 18 | 3,595 | 0.009 | 27 | 1,231 | 0.003 |
| 10 | 264,249 | 0.687 | 19 | 2,783 | 0.007 | 28 | 450 | 0.001 |
| 11 | 113,649 | 0.295 | 20 | 2,676 | 0.007 | 29 | 350 | 0.001 |
| 12 | 50,229 | 0.131 | 21 | 2,108 | 0.005 | 30 | 302 | 0.001 |
| 13 | 27,833 | 0.072 | 22 | 3,090 | 0.008 | >30 | 1,105 | 0.004 |
| 14 | 13,935 | 0.036 | 23 | 1,577 | 0.004 | | | |
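The counts in this table can be cross-checked with a few lines of Python. The sketch below uses only the table's own numbers (the `>30` bin is left out because it is not a single length); it flags lengths whose count exceeds that of the next shorter length and estimates the total dataset size implied by the 6 bp row. The variable names are illustrative only.

```python
# S-IR counts by arm length, copied from the table (lengths 6 to 30 bp).
counts = {
    6: 10_351_040, 7: 4_157_127, 8: 1_656_101, 9: 637_184, 10: 264_249,
    11: 113_649, 12: 50_229, 13: 27_833, 14: 13_935, 15: 13_370,
    16: 6_641, 17: 5_505, 18: 3_595, 19: 2_783, 20: 2_676,
    21: 2_108, 22: 3_090, 23: 1_577, 24: 1_619, 25: 1_005,
    26: 760, 27: 1_231, 28: 450, 29: 350, 30: 302,
}

# Lengths that break the generally decreasing trend: more hits than the
# next shorter length.
non_monotonic = [k for k in sorted(counts) if k - 1 in counts
                 and counts[k] > counts[k - 1]]

# Total bases implied by the 6 bp row: count / (frequency per 1000 bp) * 1000.
implied_total_bp = counts[6] / 26.899 * 1000

print(non_monotonic)
print(round(implied_total_bp / 1e6, 1), "Mbp")
```

By this simple criterion, 22, 24, and 27 bp stand out. The 15 bp count does not exceed the 14 bp count, but it drops by only about 4% where neighbouring steps roughly halve, which is consistent with the abstract listing 15 bp among the over-represented lengths.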
Figure 1. Variability of length of cpDNAs. Box plots show sequence length interquartile ranges for different species groups. The whiskers represent the minimum and maximum values.
Figure 2. Frequency of S-IRs in mtDNAs for subgroups and numbers of mtDNAs. The box plot shows the interquartile ranges of S-IR frequencies per 1000 bp in different species groups. Whiskers represent the minimum and maximum values.
cpDNA sizes and S-IR frequencies and lengths.
| Group name | Number of seq. | Median size [bp] | Shortest sequence | Longest sequence | IR/kbp (range) | Longest S-IR for 50% of seq. [bp] |
|---|---|---|---|---|---|---|
| | 9 | 91,616 | Monomorphina aenigmatica (74,746 bp) | Euglena gracilis (143,171 bp) | 68 (56–79) | 18 |
| | 37 | 122,660 | Aureococcus anophagefferens (89,599 bp) | Cylindrotheca closterium (165,809 bp) | 57 (43–69) | 25 |
| | 60 | 171,284 | Cyanidioschyzon merolae (149,987 bp) | Bulboplastis apyrenoidosa (610,063 bp) | 59 (34–83) | 19 |
| | 90 | 157,916 | Ostreococcus tauri (71,666 bp) | Floydiella terrestris (521,168 bp) | 61 (27–102) | 27 |
| | 11 | 142,017 | Spirogyra maxima (129,954 bp) | Cosmarium botrytis (207,850 bp) | 51 (32–64) | 24 |
| | 8 | 123,868 | Syntrichia ruralis (122,630 bp) | Takakia lepidozioides (149,016 bp) | 67 (44–78) | 32 |
| | 49 | 151,126 | Diplazium unilobum (127,840 bp) | Lygodium japonicum (157,260 bp) | 38 (34–52) | 18 |
| | 85 | 127,659 | Cathaya argyrophylla (107,122 bp) | Macrozamia mountperriensis (166,341 bp) | 44 (38–50) | 23 |
| | 13 | 159,881 | Schisandra chinensis (146,859 bp) | Trithuria inconspicua (165,389 bp) | 40 (38–42) | 18 |
| | 41 | 159,443 | Cassytha filiformis (114,622 bp) | Piper kadsura (161,486 bp) | 40 (39–43) | 18 |
| | 14 | 163,856 | Zostera marina (143,877 bp) | Wolffiella ryophyte (169,337 bp) | 46 (40–50) | 22 |
| | 10 | 154,205 | Burmannia oblonga (39,386 bp) | Tacca leontopetaloides (162,477 bp) | 47 (42–63) | 22 |
| | 41 | 152,677 | Amana wanzhensis (150,576 bp) | Heloniopsis tubiflora (158,229 bp) | 45 (42–46) | 18 |
| | 125 | 153,953 | Oberonia japonica (142,996 bp) | Cypripedium formosanum (178,131 bp) | 44 (42–64) | 24 |
| | 290 | 139,171 | Aegilops cylindrica (113,490 bp) | Carex neurocarpa (181,397 bp) | 41 (38–52) | 17 |
| | 49 | 157,817 | Kingdonia uniflora (147,378 bp) | Berberis koreana (166,758 bp) | 42 (39–45) | 19 |
| | 9 | 128,744 | Schoepfia jasminodora (118,743 bp) | Erythropalum scandens (156,154 bp) | 45 (41–48) | 18 |
| | 10 | 152,692 | Phedimus takesimensis (147,048 bp) | Liquidambar formosana (160,410 bp) | 41 (40–43) | 20 |
| | 32 | 151,686 | Carnegiea gigantea (113,064 bp) | Drosera rotundifolia (192,912 bp) | 45 (40–47) | 19 |
| | 398 | 153,377 | Monotropa hypopitys (35,336 bp) | Adenophora divaricata (176,331 bp) | 43 (38–61) | 19 |
| | 522 | 159,441 | Cytinus hypocistis (19,400 bp) | Pelargonium transvaalense (242,575 bp) | 45 (35–75) | 20 |
| | 662 | 155,196 | Pilostyles aethiopica (11,348 bp) | Pleodorina starrii (269,857 bp) | 46 (28–192) | 20 |
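As a consistency check on this table, the short Python sketch below sums the per-group sequence counts and computes a count-weighted average of the per-group IR/kbp values. It assumes the IR/kbp column is a per-group central value (the table does not say whether it is a mean or a median), so the weighted figure is only a rough comparison against the overall 45 S-IRs per kbp quoted in the abstract.

```python
# (number of sequences, IR/kbp) for each group, in table order.
groups = [
    (9, 68), (37, 57), (60, 59), (90, 61), (11, 51), (8, 67),
    (49, 38), (85, 44), (13, 40), (41, 40), (14, 46), (10, 47),
    (41, 45), (125, 44), (290, 41), (49, 42), (9, 45), (10, 41),
    (32, 45), (398, 43), (522, 45), (662, 46),
]

total_sequences = sum(n for n, _ in groups)

# Weight each group's IR/kbp by its number of sequences (an approximation:
# it ignores differences in genome length between groups).
weighted_ir_per_kbp = sum(n * f for n, f in groups) / total_sequences

print(total_sequences)
print(round(weighted_ir_per_kbp, 1))
```

The group counts add up to the 2,565 sequences stated in the abstract, and the weighted average lands close to the reported overall frequency.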
Figure 3. Differences in S-IR frequency by DNA locus. The chart shows S-IR frequencies per 1000 bp between “gene” annotation and other annotated locations from the NCBI database. We analyzed frequencies of all S-IRs (all) and of S-IRs with lengths 8 bp and longer (8+), 10 bp and longer (10+), and 12 bp and longer (12+) within annotated locations (inside) and before (100 bp) and after (100 bp) annotated locations.
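The inside/before/after comparison described in this caption can be sketched as follows. This is an illustrative Python outline, not the authors' procedure: the function `locus_frequencies` is hypothetical, features are assumed to be 0-based, end-exclusive `(start, end)` intervals, and each S-IR is assigned to a window by its start position, which is an assumption the caption does not confirm.

```python
def locus_frequencies(ir_starts, features, seq_len, flank=100):
    """S-IR frequency per 1000 bp before, inside, and after annotated features.

    ir_starts: start positions of detected S-IRs (0-based).
    features:  list of (start, end) intervals, end-exclusive.
    flank:     size of the 'before' and 'after' windows in bp.
    """
    counts = {"before": 0, "inside": 0, "after": 0}
    lengths = {"before": 0, "inside": 0, "after": 0}
    for start, end in features:
        windows = {
            "before": (max(0, start - flank), start),   # clipped at sequence start
            "inside": (start, end),
            "after": (end, min(seq_len, end + flank)),   # clipped at sequence end
        }
        for name, (lo, hi) in windows.items():
            lengths[name] += hi - lo
            counts[name] += sum(lo <= pos < hi for pos in ir_starts)
    # Normalize each window type by the total number of bases it covers.
    return {name: (1000 * counts[name] / lengths[name]) if lengths[name] else 0.0
            for name in counts}


# Toy example: one 100 bp feature and four S-IR start positions.
freqs = locus_frequencies([50, 120, 150, 260], [(100, 200)], seq_len=1000)
print(freqs)
```

Restricting `ir_starts` to repeats of at least 8, 10, or 12 bp before calling the function would give the 8+, 10+, and 12+ series. Normalizing by window length keeps the three locations comparable even though features vary in size while flanks are fixed at 100 bp.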