Daniela Lascaro, Stefano Castellana, Giuseppe Gasparre, Giovanni Romeo, Cecilia Saccone, Marcella Attimonelli.
Abstract
BACKGROUND: To a greater or lesser extent, eukaryotic nuclear genomes contain fragments of their mitochondrial genome counterpart, deriving from the random insertion of damaged mtDNA fragments. NumtS (Nuclear mt Sequences) are not equally abundant in all species, and are redundant and polymorphic in terms of copy number. In population and clinical genetics, it is important to have a complete overview of NumtS quantity and location. Searching PubMed for NumtS or mitochondrial pseudo-genes yields hundreds of papers reporting human NumtS compilations produced by in silico or wet-lab approaches. A comparison of published compilations clearly shows significant discrepancies among the data, due both to inappropriate application of bioinformatics methods and to a nuclear genome that is not yet correctly assembled. To optimize quantification and location of NumtS, we produced a consensus compilation of human NumtS by applying various bioinformatics approaches.
Year: 2008 PMID: 18522722 PMCID: PMC2447851 DOI: 10.1186/1471-2164-9-267
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Number of NumtS reported in a selected group of published compilations. The horizontal axis reports the reference number, in this paper, of the analysed compilation.
Differences in Blastn hit numbers.
| Run and "Limits by Entrez query" setting | Description#, Graphic display#, Alignments# | Hits# |
| 1. – nothing | 100, 100, 50 | 435 |
| 2. – nothing | 5000, 5000, 1000 | 4903 |
| 3. – nothing | 10000, 10000, 1000 | 4903 |
| 4. – Homo sapiens BUT NOT mitochondrion | 100, 100, 50 | 116 |
| 5. – Homo sapiens BUT NOT mitochondrion | 5000, 5000, 1000 | 2497 |
Data were obtained by comparing the reference human mt genome (J01415.2 in GenBank) with human nuclear DNA sequences under differing conditions. The values set for Description number, Graphic display number and Alignments number do not always accommodate the Hits# obtained; the true hit number is therefore the one obtained when the set values are higher than the number of hits returned.
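The rule in this footnote can be sketched as a simple check over the five runs in the table. This is only an illustration: the function names are hypothetical, and comparing the Description# setting alone against Hits# is an assumption, since the footnote does not say which of the three settings is decisive.

```python
# Settings and hit counts copied from the table above:
# run -> (Description#, Graphic display#, Alignments#, Hits#)
RUNS = {
    1: (100, 100, 50, 435),
    2: (5000, 5000, 1000, 4903),
    3: (10000, 10000, 1000, 4903),
    4: (100, 100, 50, 116),
    5: (5000, 5000, 1000, 2497),
}


def limit_exceeds_hits(descriptions: int, hits: int) -> bool:
    """True when the Description# setting is higher than the hit count.

    Assumption: only Description# is compared; the source does not state
    which of the three settings should be used.
    """
    return descriptions > hits


def trusted_runs(runs: dict) -> list:
    """Return the run numbers whose hit count can be taken as the true one."""
    return sorted(
        run
        for run, (descriptions, _graphic, _alignments, hits) in runs.items()
        if limit_exceeds_hits(descriptions, hits)
    )


if __name__ == "__main__":
    print(trusted_runs(RUNS))
```

Under that assumption, runs 1 and 4 are flagged as unreliable because their settings (100) are below the hits reported, while runs 2 and 3 agree on 4903 hits once the settings are raised.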
Differences in Blastn hit numbers when changing the human genome datasets searched.
| Run and "Limits by Entrez query" setting | Hits# | Class of selected entries |
| 6. – nothing | 4903 | Human Complete mt genomes, Human D-loop, other species |
| 7. – Homo sapiens BUT NOT mitochondrion | 2497 | Genomic DNA, cDNA, D-loop also from other species |
| 8. – Homo sapiens [ORGN] | 4903 | Human Complete mt genomes, D-loop, |
| 9. – Homo sapiens [ORGN] NOT mitochondrion [PROP] | 4903 | Human Complete mt genomes, D-loop, other species |
| 10. – Homo sapiens [ORGN] AND genomic DNA [MOLTYPE] NOT mitochondrion [PROP] | 2154 | Human Genomic DNA, cDNA, 2 complete mt genomes |
| 11. – Homo sapiens [ORGN] NOT mitochondrion [ALL] | 2497 | Human Genomic DNA, cDNA, D-loop |
| 12. – Homo sapiens [ORGN] AND genomic DNA [MOLTYPE] NOT mitochondrion [ALL] | 123 | Human Genomic DNA, cDNA |
| 13. – Homo sapiens [ORGN] AND genomic DNA [MOLTYPE] NOT (mitochondrion OR mitochondrial) [ALL] | 119 | Human Genomic DNA |
| 14. – nothing | 16350 | Genomic, D-loop, mt genomes other organisms |
| 15. – Homo sapiens NOT mitochondrion | 2097 | Genomic DNA, D-loop, other organisms |
| 16. – Homo sapiens [ORGN] | 2106 | Human Genomic DNA, D-loop, mt complete genomes |
| 17. – Homo sapiens [ORGN] NOT (mitochondrion OR mitochondrial) [ALL] | 2097 | Human Genomic DNA, Human D-loop |
| 18. – Homo sapiens [ORGN] NOT mitochondrion [PROP] | 2154 | 2 human mt genomes, Genomic DNA |
| 20. – Homo sapiens [ORGN] NOT mitochondrion [ALL] | 2145 | HGPC+Celera |
Different hit numbers and different classes of selected entries were obtained by changing the subject sequence dataset through the "Limits by Entrez query" function. The resulting subject sequences are subsets of the non-redundant nucleotide database (queries 6 to 13), the ref_seq genome database (queries 14 to 17) and the Chromosome human genome database (queries 18 to 20), all available through Blastn at NCBI. The E-value was set at 0.001. Description#, graphic display# and alignments# were set at the maximum values allowed. Runs 19 and 20, resulting in 2145 hits, were those most suited to our needs, i.e., to select completely assembled human nuclear sequences.
Figure 2. PCR amplification of 41 selected NumtS. PCR amplification of (a) 27 selected NumtS in 4 healthy subjects from different ethnic groups (haplogroups H2b (Europe), L2a1c1 (North Africa), I3a (Latin America) and G1a1a (Japan)); (b) 14 in the H2b sample only. Primers were designed with PRIMER3 software, testing the amplification of the full NumtS (external-external primers) or part of it (external-internal primers or internal-internal primers). In NumtS 41–54, samples H2b and L2a1c1 have shorter amplicons, due to a lower number of repetitions. The triple band in NumtS 12 was due to non-specific amplification, subsequently reduced by increasing the stringency of primer annealing. Abbreviations: ext for external, int for internal; the number below each band refers to the NumtS code assigned within the RHNumtS compilation.
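Longest NumtS of Reference Human NumtS compilation (RHNumtS)
| NumtS code | Chromosome | Strand | mt start | mt end | mt length | chr start | chr end | nuclear length | difference | score |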
| 1 | 1 | + | 3914 | 9755 | 5842 | 554327 | 560167 | 5840 | 3 | 4 |
| 3 | 1 | - | 6060 | 9316 | 3257 | 107146786 | 107150029 | 3243 | 15 | 1.25 |
| 4 | 1 | - | 1051 | 3162 | 2112 | 120286496 | 120288780 | 2284 | 173 | 0.75 |
| 9 | 1 | - | 9782 | 13593 | 3812 | 233768514 | 233772288 | 3774 | 39 | 1.75 |
| 10 | 1 | - | 636 | 6189 | 5554 | 236170699 | 236176250 | 5551 | 4 | 1.75 |
| 11 | 1 | - | 12218 | 16563 | 4346 | 236177249 | 236181582 | 4333 | 14 | 1.75 |
| 14 | 2 | + | 12220 | 16475 | 4256 | 82896241 | 82900506 | 4265 | 10 | 1.75 |
| 19 | 2 | + | 596 | 5892 | 5297 | 117495259 | 117500547 | 5288 | 10 | 3.25 |
| 20 | 2 | + | 9196 | 13574 | 4379 | 120685762 | 120690928 | 5166 | 788 | 1.75 |
| 22 | 2 | + | 3799 | 15354 | 11556 | 130745853 | 130757329 | 11476 | 81 | 2 |
| 23 | 2 | - | 10657 | 15398 | 4742 | 131843104 | 131847799 | 4695 | 48 | 1.75 |
| 24 | 2 | - | 3799 | 10519 | 6721 | 131853669 | 131860205 | 6536 | 186 | 1.5 |
| 25 | 2 | - | 598 | 5892 | 5295 | 140691291 | 140698242 | 6951 | 1657 | 3.5 |
| 27 | 2 | - | 9166 | 16563 | 7398 | 143566386 | 143574013 | 7627 | 230 | 1.75 |
| 30 | 2 | + | 11801 | 15067 | 3267 | 155875844 | 155879111 | 3267 | 1 | 2 |
| 33 | 2 | - | 10440 | 13131 | 2692 | 201785264 | 201787949 | 2685 | 8 | 2 |
| 34 | 2 | + | 6966 | 11240 | 4275 | 203187200 | 203191742 | 4542 | 268 | 2 |
| 36 | 2 | - | 596 | 3105 | 2510 | 212346765 | 212349578 | 2813 | 304 | 1.75 |
| 37 | 2 | + | 4854 | 7590 | 2737 | 212350179 | 212352885 | 2706 | 32 | 1.5 |
| 58 | 3 | - | 6604 | 9316 | 2713 | 89718693 | 89721366 | 2673 | 41 | 2 |
| 61 | 3 | - | 9787 | 12340 | 2554 | 108095676 | 108098627 | 2951 | 398 | 1.25 |
| 62 | 3 | + | 13536 | 15573 | 2038 | 108100533 | 108101514 | 981 | 1058 | 1.5 |
| 71 | 4 | - | 9781 | 12301 | 2521 | 25328634 | 25331437 | 2803 | 283 | 1.75 |
| 76 | 4 | - | 9485 | 16561 | 7077 | 65155336 | 65160181 | 4845 | 2233 | 1.75 |
| 79 | 4 | + | 596 | 3105 | 2510 | 117438367 | 117440855 | 2488 | 23 | 1.75 |
| 81 | 4 | - | 672 | 15325 | 14654 | 156592474 | 156607061 | 14587 | 68 | 3.5 |
| 88 | 5 | - | 341 | 2697 | 2357 | 79981597 | 79983943 | 2346 | 12 | 4.75 |
| 89 | 5 | + | 12662 | 16124 | 3463 | 93928917 | 93932379 | 3462 | 2 | 5 |
| 91 | 5 | - | 6117 | 15183 | 9067 | 99409541 | 99418648 | 9107 | 41 | 4.5 |
| 94 | 5 | - | 10270 | 15488 | 5219 | 134286898 | 134292116 | 5218 | 2 | 5 |
| 97 | 6 | + | 8437 | 10622 | 2186 | 92493159 | 92493750 | 591 | 1596 | 1.5 |
| 100 | 6 | + | 7451 | 11649 | 4199 | 154028400 | 154032608 | 4208 | 10 | 1.75 |
| 102 | 7 | + | 8505 | 15238 | 6734 | 57238827 | 57245471 | 6644 | 91 | 3.75 |
| 103 | 7 | + | 3819 | 15924 | 12106 | 57257414 | 57269467 | 12053 | 54 | 6 |
| 104 | 7 | - | 3117 | 11880 | 8764 | 63201998 | 63210482 | 8484 | 281 | 1.5 |
| 105 | 7 | - | 5513 | 8246 | 2734 | 68433640 | 68436926 | 3286 | 553 | 1.75 |
| 106 | 7 | + | 13065 | 15369 | 2305 | 111799937 | 111802234 | 2297 | 9 | 2 |
| 107 | 7 | - | 2793 | 6553 | 3761 | 141147677 | 141151744 | 4067 | 307 | 2 |
| 108 | 7 | + | 600 | 3095 | 2496 | 142052596 | 142055088 | 2492 | 5 | 2.25 |
| 110 | 8 | + | 636 | 4888 | 4253 | 32988565 | 32992739 | 4174 | 80 | 1.5 |
| 113 | 8 | - | 656 | 4880 | 4225 | 47858273 | 47861837 | 3564 | 662 | 4 |
| 114 | 8 | - | 9176 | 16569 | 7394 | 68655653 | 68662552 | 6899 | 496 | 1.75 |
| 117 | 8 | + | 1013 | 7114 | 6102 | 104164459 | 104171823 | 7364 | 1263 | 3.5 |
| 120 | 9 | + | 1294 | 13574 | 12281 | 5082095 | 5100699 | 18604 | 6324 | 1.5 |
| 121 | 9 | + | 598 | 3093 | 2496 | 33646633 | 33649128 | 2495 | 2 | 1.5 |
| 125 | 9 | - | 4773 | 6873 | 2101 | 82368550 | 82370501 | 1951 | 151 | 1.75 |
| 126 | 9 | + | 9202 | 11598 | 2397 | 93911111 | 93913772 | 2661 | 265 | 3.25 |
| 128 | 10 | + | 2417 | 4831 | 2415 | 20075681 | 20077114 | 1433 | 983 | 2.25 |
| 131 | 10 | + | 636 | 3105 | 2470 | 57027643 | 57030440 | 2797 | 328 | 1.5 |
| 132 | 10 | - | 3821 | 7698 | 3878 | 71020912 | 71025687 | 4775 | 898 | 1.75 |
| 134 | 11 | - | 577 | 2972 | 2396 | 10486010 | 10488403 | 2393 | 4 | 6 |
| 140 | 11 | + | 9820 | 15243 | 5424 | 80940264 | 80945683 | 5419 | 6 | 1.75 |
| 142 | 11 | - | 724 | 9666 | 8943 | 102778067 | 102786933 | 8866 | 78 | 3.25 |
| 150 | 13 | + | 13052 | 16472 | 3421 | 95142796 | 95146598 | 3802 | 382 | 1.5 |
| 156 | 14 | + | 11367 | 15325 | 3959 | 83708940 | 83713093 | 4153 | 195 | 1.75 |
| 158 | 15 | + | 9786 | 15318 | 5533 | 56229853 | 56235023 | 5170 | 364 | 1.5 |
| 159 | 16 | - | 2468 | 7683 | 5216 | 3357487 | 3362068 | 4581 | 636 | 3.5 |
| 160 | 16 | - | 8688 | 15327 | 6640 | 10720543 | 10726494 | 5951 | 690 | 1.5 |
| 164 | 17 | - | 596 | 5979 | 5384 | 19442485 | 19449425 | 6940 | 1557 | 3.5 |
| 165 | 17 | + | 14365 | 16569 | 2205 | 21942648 | 21944853 | 2205 | 1 | 3.75 |
| 166 | 17 | + | 1 | 11112 | 11112 | 21944854 | 21955968 | 11114 | 3 | 3.5 |
| 171 | 20 | - | 649 | 4038 | 3390 | 55366111 | 55369449 | 3338 | 53 | 3.5 |
| 174 | X | - | 581 | 5892 | 5312 | 55221910 | 55227180 | 5270 | 43 | 4 |
| 175 | X | + | 1049 | 3161 | 2113 | 61976282 | 61978565 | 2283 | 171 | 1.75 |
| 182 | X | + | 1054 | 4415 | 3362 | 142345841 | 142349570 | 3729 | 368 | 1.75 |
| 184 | Y | + | 596 | 4477 | 3882 | 8294669 | 8300289 | 5620 | 1739 | 1.25 |
Each NumtS was assigned an identifying numeric code, in increasing order starting from chromosome 1; a letter code (A, B, or C) was assigned to only 3 NumtS, because they were located later, when all other NumtS had already been characterised. For each NumtS the table lists the chromosome and strand; the mt and nuclear coordinates ("mt start" and "mt end", "chr start" and "chr end"); the mitochondrial and nuclear fragment lengths; the "difference" between the mitochondrial and nuclear fragment lengths; and the RHNumtS quality score. Additional file 1 reports the complete RHNumtS compilation: there, NumtS exclusively identified by Blastn are shown in grey, and NumtS exclusively identified by Blastn, but only on the Human Genome Celera Assembly, are shown in black in columns "Nuc start" and "Nuc end"; repeated NumtS are in bold type and underlined in columns "Mt start" and "Mt end". NumtS with scores higher than 3 are shown in grey in column "score".
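The length and "difference" columns can be recomputed from the coordinates. The sketch below does this for three rows copied from the table. The conventions it uses (mt length counted inclusively, nuclear length as end minus start, difference as the absolute gap plus one) are inferred from those rows and are not stated in the source; the function and field names are hypothetical.

```python
# Three rows copied verbatim from the RHNumtS table above.
ROWS = """\
| 1 | 1 | + | 3914 | 9755 | 5842 | 554327 | 560167 | 5840 | 3 | 4 |
| 20 | 2 | + | 9196 | 13574 | 4379 | 120685762 | 120690928 | 5166 | 788 | 1.75 |
| 166 | 17 | + | 1 | 11112 | 11112 | 21944854 | 21955968 | 11114 | 3 | 3.5 |
"""

INT_FIELDS = ("mt_start", "mt_end", "mt_len", "chr_start", "chr_end", "chr_len", "diff")


def parse_row(line: str) -> dict:
    """Split one pipe-delimited row into named fields."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    row = {"code": cells[0], "chrom": cells[1], "strand": cells[2]}
    row.update(zip(INT_FIELDS, map(int, cells[3:10])))
    row["score"] = float(cells[10])
    return row


def derived(row: dict) -> tuple:
    """Recompute (mt length, nuclear length, difference) from coordinates.

    Assumed conventions, inferred from the rows above:
      mt length      = mt_end - mt_start + 1
      nuclear length = chr_end - chr_start
      difference     = |mt length - nuclear length| + 1
    """
    mt_len = row["mt_end"] - row["mt_start"] + 1
    chr_len = row["chr_end"] - row["chr_start"]
    return mt_len, chr_len, abs(mt_len - chr_len) + 1


def matches_table(row: dict) -> bool:
    """True when the recomputed values equal the values printed in the row."""
    return derived(row) == (row["mt_len"], row["chr_len"], row["diff"])


if __name__ == "__main__":
    for line in ROWS.splitlines():
        row = parse_row(line)
        print(row["code"], derived(row), matches_table(row))
```

A check of this kind is a quick way to catch transcription errors when the table is copied between formats.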
Amplified and sequenced NumtS
| NumtS code | Amplified in samples (mt haplogroup) | Sequenced in samples (mt haplogroup) |
| 2 | H2b | H2b |
| 12 | H2b | H2b |
| 13 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 28 | H2b | H2b |
| 38 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 41 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 42 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 43 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 44 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 45 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 46 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 47 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 48 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 49 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 50 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 51 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 52 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 53 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 54 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 63 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 72 | H2b | H2b |
| 73 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 75 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 77 | H2b | H2b |
| 82 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 87 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a, I3a |
| 101 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 109 | H2b | H2b |
| 112 | H2b | H2b |
| 115 | H2b | H2b |
| 122 | H2b, L2a1c1, G1a1a, I3a | H2b, L2a1c1, G1a1a |
| 133 | H2b | H2b |
| 139 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 151 | H2b | H2b |
| 153 | H2b | H2b |
| 157 | H2b | H2b |
| 169 | H2b, L2a1c1, G1a1a, I3a | H2b |
| 170 | H2b, L2a1c1, G1a1a, I3a | sequencing failed |
| 179 | H2b | sequencing failed |
| 187 | H2b | sequencing failed |
| C | H2b, L2a1c1, G1a1a, I3a | H2b |
For each of the 41 analysed NumtS, the table reports the mt haplogroup codes of the samples in which it was amplified and of those in which it was sequenced. NumtS 170, 179 and 187 in the H2b sample and NumtS 122 in the I3a sample were not sequenced because the primers were not optimal for sequencing.
Figure 3. Mapping of Human NumtS on the Human mitochondrial genome. Column 1 reports the mt gene name, columns 2 and 3 report the location of the mt gene inside the genome and, starting from column 4, the RHNumtS identifiers of the NumtS containing the mt gene are shown. An RHNumtS identifier present in three contiguous genes indicates that the NumtS contains the centrally located gene in its entire length; otherwise, the gene is partially contained. Green: regulatory regions; yellow: tRNA genes; red: protein coding genes; pink: rRNA genes.
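The distinction between a gene contained in its entire length and one that is only partially contained comes down to an interval comparison on mt coordinates. The following is a minimal sketch of that comparison, not the procedure used to draw the figure. The NumtS interval is taken from the RHNumtS table (NumtS 1, mt start 3914, mt end 9755); the gene intervals and all names are hypothetical, and the circular nature of the mt genome is ignored.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # first mt position covered
    end: int    # last mt position covered (inclusive)


def classify(gene: Interval, numts: Interval) -> str:
    """Say whether a NumtS covers a gene entirely, partially, or not at all."""
    if numts.start <= gene.start and gene.end <= numts.end:
        return "entire"
    if numts.start <= gene.end and gene.start <= numts.end:
        return "partial"
    return "none"


# mt coordinates of NumtS 1 as listed in the RHNumtS table.
NUMTS_1 = Interval(3914, 9755)

# Hypothetical gene intervals, chosen only to exercise each case.
GENES = {
    "gene_a": Interval(3500, 4200),   # starts before the NumtS begins
    "gene_b": Interval(5000, 6000),   # lies fully inside the NumtS
    "gene_c": Interval(9900, 10300),  # starts after the NumtS ends
}

if __name__ == "__main__":
    for name, gene in GENES.items():
        print(name, classify(gene, NUMTS_1))
```

A NumtS that spans the origin of the circular genome would need to be split into two linear intervals before applying this check.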