Jana Cechová, Jirí Lýsek, Martin Bartas, Václav Brázda.
Abstract
Motivation: The NCBI database contains mitochondrial DNA (mtDNA) genomes from numerous species. We investigated the presence and locations of inverted repeat sequences (IRs), which are known to be important for regulating nuclear genomes, in these mtDNA sequences.
Year: 2018 PMID: 29126205 PMCID: PMC6030915 DOI: 10.1093/bioinformatics/btx729
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Variability of length and number of mtDNAs. Box plots show the interquartile ranges of sequence length for different species groups. The whiskers represent the minimum and maximum values. The number of species in each group is shown with bars (scale on the secondary vertical axis).
Fig. 2. Frequency of IRs in mtDNAs for subgroups and numbers of mtDNAs. The box plot shows the interquartile ranges of IR frequencies per 1000 bp in different species groups. Whiskers represent the minimum and maximum values.
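To make concrete what is being counted in these frequencies, the following is a minimal sketch of perfect inverted repeat detection: a left arm followed, after an optional spacer, by its reverse complement. The function names and the default parameters (`min_len`, `max_len`, `max_spacer`) are illustrative assumptions; the search settings actually used in the study are not stated in this excerpt, and mismatches are not handled here.

```python
# Hypothetical helper names; a naive scan, not the tool used in the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_len=6, max_len=30, max_spacer=10):
    """Return (start, arm_length, spacer_length) tuples for perfect IRs.

    For each start position and spacer length, only the longest matching
    arm is reported. The default limits are assumptions for illustration.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n):
        for spacer in range(max_spacer + 1):
            best = 0
            for arm in range(min_len, max_len + 1):
                right_start = start + arm + spacer
                right_end = right_start + arm
                if right_end > n:
                    break  # longer arms cannot fit either
                left = seq[start:start + arm]
                if seq[right_start:right_end] == reverse_complement(left):
                    best = arm
            if best:
                hits.append((start, best, spacer))
    return hits


if __name__ == "__main__":
    # AAACCC is followed directly by its reverse complement GGGTTT.
    print(find_inverted_repeats("AAACCCGGGTTT"))
    # Same arms separated by a 2 bp spacer (TT).
    print(find_inverted_repeats("AAACCCTTGGGTTT"))
```

The scan is quadratic in the arm length and is meant only to illustrate the definition; a production search over whole mitochondrial genomes would need a more efficient approach.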
Numbers and frequencies of IRs according to size
| IR size | Number in dataset | IR/1000bp | IR size | Number in dataset | IR/1000bp | IR size | Number in dataset | IR/1000bp |
|---|---|---|---|---|---|---|---|---|
| 6 | 4460126 | 24.8303 | 15 | 4359 | 0.0243 | 24 | 254 | 0.0014 |
| 7 | 1841110 | 10.2498 | 16 | 2849 | 0.0159 | 25 | 113 | 0.0006 |
| 8 | 717399 | 3.9939 | 17 | 1807 | 0.0101 | 26 | 108 | 0.0006 |
| 9 | 289601 | 1.6123 | 18 | 1177 | 0.0066 | 27 | 91 | 0.0005 |
| 10 | 117709 | 0.6553 | 19 | 889 | 0.0049 | 28 | 80 | 0.0004 |
| 11 | 52939 | 0.2947 | 20 | 621 | 0.0035 | 29 | 58 | 0.0003 |
| 12 | 26048 | 0.1450 | 21 | 490 | 0.0027 | 30 | 65 | 0.0004 |
| 13 | 14252 | 0.0793 | 22 | 297 | 0.0017 | >30 | 477 | 0.0027 |
| 14 | 7556 | 0.0421 | 23 | 228 | 0.0013 | | | |
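The IR/1000bp column is a simple normalization: the number of IRs divided by the total sequence length, times 1000. The sketch below shows that arithmetic and also back-calculates the total length implied by a few rows of the table. The function names are hypothetical, and the total length is an inference from the tabulated values (roughly 180 million bp), not a figure stated in this excerpt.

```python
def ir_per_kbp(count: int, total_bp: int) -> float:
    """IR frequency per 1000 bp for a given IR count and sequence length."""
    return 1000 * count / total_bp


def implied_total_bp(count: int, per_kbp: float) -> float:
    """Total length (bp) implied by a count and its per-1000-bp frequency."""
    return 1000 * count / per_kbp


# (IR size, number in dataset, IR/1000bp) copied from the table above.
rows = [
    (6, 4460126, 24.8303),
    (7, 1841110, 10.2498),
    (8, 717399, 3.9939),
]

if __name__ == "__main__":
    for size, count, freq in rows:
        total = implied_total_bp(count, freq)
        print(f"IR size {size}: implied dataset length ~{total / 1e6:.1f} Mbp")
```

That the rows agree on the implied total is a quick consistency check on the table; the less frequent IR sizes give noisier estimates because their frequencies are rounded to four decimals.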
MtDNA sizes and IR frequencies and lengths
| Group name | Number of seq. | Median size [bp] | Shortest sequence | Longest sequence | IR/kbp, mean (range) | Longest IR for 50% of seq. [bp] |
|---|---|---|---|---|---|---|
| | 24 | 5 977 | Plasmodium vivax (5 882 bp) | Babesia microti (11 109 bp) | 47 (42–56) | 14 |
| | 76 | 46 840 | Physarum polycephalum (14 503 bp) | Chromera velia (430 597 bp) | 74 (28–156) | 17 |
| | 48 | 45 175 | Polytomella parva (3 018 bp) | Pseudendoclonium akinetum (95 880 bp) | 47 (17–81) | 18 |
| | 174 | 151 983 | Vicia faba (1 478 bp) | Corchorus capsularis (1 999 602 bp) | 34 (27–59) | 18 |
| | 8 | 69 465 | Mesostigma viride (42 424 bp) | Chlorokybus atmophyticus (201 763 bp) | 46 (35–76) | 17 |
| | 183 | 35 655 | Cryphonectria parasitica (1 364 bp) | Sclerotinia borealis (203 051 bp) | 85 (21–249) | 17 |
| | 29 | 69 195 | Moniliophthora roreri (9 745 bp) | Rhizoctonia solani (235 849 bp) | 72 (37–140) | 15 |
| | 23 | 58 788 | Spizellomyces punctatus (1 136 bp) | Gigaspora rosea DAOM 194757 (97 350 bp) | 54 (28–138) | 15 |
| | 96 | 13 968 | Taenia pisiformis (13 383 bp) | Schmidtea mediterranea (27 133 bp) | 35 (11–98) | 12 |
| | 137 | 13 960 | Xiphinema americanum (12 626 bp) | Romanomermis culicivorax (26 194 bp) | 56 (19–131) | 15 |
| | 2 294 | 16 595 | Gadus ogac (15 564 bp) | Rhinochimaera pacifica (24 889 bp) | 28 (22–47) | 12 |
| | 992 | 15 534 | Anaticola crassicornis (8 118 bp) | Hydropsyche pellucidula (25 004 bp) | 89 (23–195) | 15 |
| | 231 | 17 175 | Gegeneophis ramaswamii (15 897 bp) | Breviceps adspersus (28 757 bp) | 36 (22–56) | 13 |
| | 279 | 17 107 | Sphenodon punctatus (15 181 bp) | Heteronotia binoei (25 972 bp) | 30 (19–48) | 12 |
| | 534 | 16 826 | Malurus melanocephalus (15 568 bp) | Penelopides panini (22 737 bp) | 22 (18–28) | 12 |
| | 860 | 16 543 | Macrotis lagotis (15 289 bp) | Lepus timidus (17 755 bp) | 32 (20–59) | 11 |
| | 1 074 | 15 754 | Clathrina clathrus (5 596 bp) | Anadara sativa (48 161 bp) | 48 (12–157) | 12 |
| | 73 | 35 594 | Galdieria sulphuraria (21 428 bp) | Phaeodactylum tricornutum (77 356 bp) | 48 (9–84) | 13 |
Fig. 3. Differences in IR frequency by DNA locus. The chart compares IR frequencies per 1000 bp between the ‘gene’ annotation and other annotated locations from the NCBI database. We analyzed frequencies of all IRs (all) and of IRs with lengths of 8 bp and longer (8+), 10 bp and longer (10+) and 12 bp and longer (12+) within annotated locations (inside) and before and after annotated locations. (Color version of this figure is available at Bioinformatics online.)
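The comparison in Fig. 3 can be pictured as binning IR positions relative to one annotated feature and normalizing each bin to 1000 bp, with an optional minimum IR length (8+, 10+, 12+). The sketch below is one way to do that under stated assumptions: the `IR` record, the function name, and the 100 bp flank width are all hypothetical, since the excerpt does not say how far before and after a feature the authors looked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IR:
    position: int  # 0-based start of the inverted repeat
    length: int    # arm length in bp


def frequency_by_region(irs, feat_start, feat_end, flank=100, min_len=6):
    """IR frequency per 1000 bp before, inside and after one feature.

    The feature occupies [feat_start, feat_end). The flank width is an
    assumed value for illustration, not taken from the paper.
    """
    regions = {
        "before": (max(0, feat_start - flank), feat_start),
        "inside": (feat_start, feat_end),
        "after": (feat_end, feat_end + flank),
    }
    result = {}
    for name, (lo, hi) in regions.items():
        span = hi - lo
        count = sum(
            1 for ir in irs if ir.length >= min_len and lo <= ir.position < hi
        )
        result[name] = 1000 * count / span if span > 0 else 0.0
    return result


if __name__ == "__main__":
    # Made-up IR positions around a made-up feature at 1000..2000.
    irs = [IR(950, 6), IR(1200, 8), IR(1500, 12), IR(2050, 10)]
    for threshold in (6, 8, 10, 12):
        print(threshold, frequency_by_region(irs, 1000, 2000, min_len=threshold))
```

Normalizing each region by its own length is what makes short flanks comparable with long features; in practice the counts would be pooled over all features of a given annotation type before dividing.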