Yosr Bouhlal, Selena Martinez, Henry Gong, Kevin Dumas, Joseph T C Shieh.
Abstract
When applying genome-wide sequencing technologies to disease investigation, it is increasingly important to resolve sequence variation in regions of the genome that may have homologous sequences. The human mitochondrial genome challenges interpretation given the potential for heteroplasmy, somatic variation, and homologous nuclear mitochondrial sequences (numts). Identical twins share the same mitochondrial DNA (mtDNA) from early life, but whether the mitochondrial sequence remains similar is unclear. We compared an adult monozygotic twin pair using high-throughput sequencing and evaluated variants with primer extension and mitochondrial pre-enrichment. Thirty-seven variants were shared between the twin individuals, and the variants were verified on the original genomic DNA. These studies support highly identical genetic sequence in this case. Certain low-level variant calls were of high quality and homology to the mitochondrial DNA, and they were further evaluated. When we assessed calls in pre-enriched mitochondrial DNA templates, we found that these may represent numts, which can be differentiated from mtDNA variation. We conclude that twin identity extends to mitochondrial DNA, and it is critical to differentiate between numts and mtDNA in genome sequencing, particularly since significant heteroplasmy could influence genome interpretation. Further studies on mtDNA and numts will aid in understanding how variation occurs and persists.
Keywords: genome; heteroplasmy; mitochondrial; primer extension; sequencing; twins
Year: 2013 PMID: 24040623 PMCID: PMC3768015 DOI: 10.1002/mgg3.20
Source DB: PubMed Journal: Mol Genet Genomic Med ISSN: 2324-9269 Impact factor: 2.183
Short tandem repeat (STR) marker analysis for the twin pair and a DNA control. The twins share the same alleles at all 16 loci tested across the genome, confirming their monozygosity at >0.99999 confidence.
| Locus | Twin A alleles | Twin B alleles | Control alleles |
|---|---|---|---|
| AMEL | XX | XX | XX |
| CSF1PO | 12 12 | 12 12 | 10 12 |
| D13S317 | 10 11 | 10 11 | 11 11 |
| D16S539 | 10 10 | 10 10 | 11 12 |
| D18S51 | 14 21 | 14 21 | 15 19 |
| D21S11 | 31 31 | 31 31 | 30 30 |
| D3S1358 | 17 17 | 17 17 | 14 15 |
| D5S818 | 11 11 | 11 11 | 11 11 |
| D7S820 | 10 11 | 10 11 | 10 11 |
| D8S1179 | 13 13 | 13 13 | 13 13 |
| FGA | 25 27 | 25 27 | 23 24 |
| Penta_D | 9 10 | 9 10 | 12 12 |
| Penta_E | 12 14 | 12 14 | 12 13 |
| TH01 | 9.3 9.3 | 9.3 9.3 | 8 9.3 |
| TPOX | 11 11 | 11 11 | 8 8 |
| vWA | 17 17 | 17 17 | 17 18 |
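The concordance claimed in the caption can be checked directly from the table. The following is a minimal sketch, not the authors' analysis: the rows are copied from the table above, and the helper names `genotype` and `concordant_loci` are hypothetical. It only counts matching genotypes and does not reproduce the >0.99999 confidence estimate, which depends on population allele frequencies not given here.

```python
# Rows copied from the STR table: (locus, twin A, twin B, control).
STR_ROWS = [
    ("AMEL", "XX", "XX", "XX"),
    ("CSF1PO", "12 12", "12 12", "10 12"),
    ("D13S317", "10 11", "10 11", "11 11"),
    ("D16S539", "10 10", "10 10", "11 12"),
    ("D18S51", "14 21", "14 21", "15 19"),
    ("D21S11", "31 31", "31 31", "30 30"),
    ("D3S1358", "17 17", "17 17", "14 15"),
    ("D5S818", "11 11", "11 11", "11 11"),
    ("D7S820", "10 11", "10 11", "10 11"),
    ("D8S1179", "13 13", "13 13", "13 13"),
    ("FGA", "25 27", "25 27", "23 24"),
    ("Penta_D", "9 10", "9 10", "12 12"),
    ("Penta_E", "12 14", "12 14", "12 13"),
    ("TH01", "9.3 9.3", "9.3 9.3", "8 9.3"),
    ("TPOX", "11 11", "11 11", "8 8"),
    ("vWA", "17 17", "17 17", "17 18"),
]


def genotype(cell):
    """Return an order-independent genotype from a table cell such as '10 11'."""
    return tuple(sorted(cell.split()))


def concordant_loci(rows, col_a, col_b):
    """List the loci at which two sample columns carry the same genotype."""
    return [row[0] for row in rows if genotype(row[col_a]) == genotype(row[col_b])]


twin_matches = concordant_loci(STR_ROWS, 1, 2)      # twin A vs twin B
control_matches = concordant_loci(STR_ROWS, 1, 3)   # twin A vs control

print(f"Twins match at {len(twin_matches)} of {len(STR_ROWS)} loci")
print(f"Twin A and control match at {len(control_matches)} of {len(STR_ROWS)} loci")
```

Sorting the alleles within each cell makes the comparison independent of the order in which the two alleles are listed.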
Sequencing reads and mapping efficiency
| Metric (by alignment stringency) | Twin A, Alignment 1 | Twin A, Alignment 2 | Twin B, Alignment 1 | Twin B, Alignment 2 |
|---|---|---|---|---|
| Sequencing run 1: V1 flow cell, HiSeq system | | | | |
| Total number of reads | 125,989,576 | 125,989,576 | 101,559,160 | 101,559,160 |
| Number of mapped reads | 59,402,944 | 113,836,436 | 47,883,234 | 91,638,246 |
| Percentage of mapped reads | 47.15 | 90.35 | 47.15 | 90.23 |
| Sequencing run 2: V3 flow cell, HiSeq system | | | | |
| Total number of reads | 275,033,292 | 275,033,292 | 314,499,024 | 314,499,024 |
| Number of mapped reads | 130,156,770 | 249,089,720 | 148,617,794 | 284,273,752 |
| Percentage of mapped reads | 47.32 | 90.57 | 47.26 | 90.39 |
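The percentage rows are simply mapped reads divided by total reads for each twin and run. As a hedged sketch of that arithmetic (the read counts are copied from the table; the function name `percent_mapped` and the tuple layout are illustrative assumptions, not part of the original pipeline):

```python
# (run, twin, total reads, mapped reads under Alignment 1, mapped reads under Alignment 2)
# Counts copied from the table above.
RUNS = [
    ("run 1", "A", 125_989_576, 59_402_944, 113_836_436),
    ("run 1", "B", 101_559_160, 47_883_234, 91_638_246),
    ("run 2", "A", 275_033_292, 130_156_770, 249_089_720),
    ("run 2", "B", 314_499_024, 148_617_794, 284_273_752),
]


def percent_mapped(mapped, total):
    """Mapping efficiency as a percentage, rounded to two decimals like the table."""
    return round(100 * mapped / total, 2)


for run, twin, total, aln1, aln2 in RUNS:
    print(
        f"{run}, twin {twin}: "
        f"Alignment 1 = {percent_mapped(aln1, total)}%, "
        f"Alignment 2 = {percent_mapped(aln2, total)}%"
    )
```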
Figure 1. Mitochondrial DNA sequencing coverage and variant map for twin A (upper plot) and twin B (lower plot). The x-axis represents the nucleotide position on the mitochondrial genome and the y-axis shows the number of reads (depth of coverage) for each nucleotide position (maximum of 1084 reads for twin A and 1395 reads for twin B). Homoplasmic variants (colored vertical lines at specific genomic locations) in both twin A and twin B are concordant. Each variant is colored according to the base type (red for T, green for A, blue for C, and brown for G) compared to the hg19 reference base.
Figure 2. High-throughput called variants common to both twins. (A) Homoplasmic/nearly homoplasmic variants detected for twin A and twin B are concordant. The y-axis represents the ratio of variant to reference base. The x-axis represents the alignment position of the variant detected. (B) Low-level heteroplasmic variant calls detected in both twin A and twin B. (C) Distribution of novel and reported mitochondrial sequence variants detected in both twins A and B. The y-axis represents the number of variants. The x-axis represents the mitochondrial genes. Bars represent reported (light) and unreported variants (dark).
Positions and predicted effect of homoplasmic and nearly homoplasmic variants common to both twins A and B (n = 37 variants)
| mtDNA position | Gene | Reference allele | Variant | Mutation type | First AA | Second AA | Variant identifier | Coverage twin A | Variant ratio twin A (%) | Coverage twin B | Variant ratio twin B (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 150 | MT-DLOOP | T | C | Transition | NA | NA | rs62581312 | 30 | 100 | 28 | 100 |
| 153 | MT-DLOOP | A | G | Transition | NA | NA | RP | 34 | 100 | 28 | 100 |
| 195 | MT-DLOOP | C | T | Transition | NA | NA | rs2857291 | 8 | 100 | 9 | 100 |
| 408 | MT-DLOOP | A | T | Transversion | NA | NA | rs28412942 | 372 | 100 | 382 | 100 |
| 489 | MT-DLOOP | T | C | Transition | NA | NA | rs28625645 | 440 | 100 | 570 | 100 |
| 1888 | MT-RNR2 | G | A | Transition | NA | NA | rs28358577 | 860 | 100 | 1070 | 100 |
| 2353 | MT-RNR2 | C | T | Transition | NA | NA | rs28358579 | 854 | 100 | 1035 | 100 |
| 2483 | MT-RNR2 | C | T | Transition | NA | NA | rs80056772 | 881 | 100 | 1036 | 100 |
| 3552 | MT-ND1 | T | A | Transversion | A | A | rs28358587 | 713 | 100 | 907 | 100 |
| 4715 | MT-ND2 | A | G | Transition | G | G | RP | 854 | 100 | 898 | 99.9 |
| 7196 | MT-CO1 | C | A | Transversion | L | L | RP | 900 | 99.5 | 1127 | 100 |
| 8584 | MT-ATP6 | G | A | Transition | A | T | rs55728079 | 733 | 100 | 963 | 100 |
| 9377 | MT-CO3 | G | A | Transition | W | W | rs28380140 | 684 | 100 | 855 | 100 |
| 9545 | MT-CO3 | A | G | Transition | G | G | RP | 803 | 100 | 944 | 100 |
| 9962 | MT-CO3 | G | A | Transition | L | L | RP | 929 | 100 | 1086 | 100 |
| 10400 | MT-ND3 | C | T | Transition | T | T | rs28358278 | 822 | 100 | 1064 | 100 |
| 10819 | MT-ND4 | G | A | Transition | K | K | rs28358283 | 952 | 100 | 1090 | 100 |
| 11017 | MT-ND4 | C | T | Transition | S | S | rs28594904 | 858 | 100 | 975 | 100 |
| 11722 | MT-ND4 | C | T | Transition | H | Y | rs28471078 | 795 | 100 | 972 | 100 |
| 11914 | MT-ND4 | G | A | Transition | T | T | rs2853496 | 951 | 100 | 1169 | 100 |
| 12092 | MT-ND4 | C | T | Transition | L | F | RP | 912 | 100 | 1097 | 100 |
| 12414 | MT-ND5 | T | C | Transition | P | P | RP | 879 | 100 | 965 | 99.9 |
| 12850 | MT-ND5 | G | A | Transition | V | I | rs28705385 | 881 | 100 | 1251 | 100 |
| 13263 | MT-ND5 | A | G | Transition | Q | Q | rs28359175 | 852 | 100 | 1098 | 100 |
| 14212 | MT-ND6 | C | T | Transition | V | V | rs28357672 | 819 | 100 | 975 | 100 |
| 14318 | MT-ND6 | T | C | Transition | N | S | rs28357675 | 847 | 100 | 1042 | 100 |
| 14783 | MT-CYB | T | C | Transition | L1 | L2 | rs28357680 | 804 | 100 | 981 | 100 |
| 14905 | MT-CYB | A | G | Transition | M | M | rs28357682 | 770 | 100 | 1084 | 100 |
| 15043 | MT-CYB | G | A | Transition | G | G | rs56038008 | 873 | 100 | 1128 | 100 |
| 15487 | MT-CYB | A | T | Transversion | P | P | rs28357370 | 870 | 100 | 1007 | 100 |
| 15930 | MT-TT | G | A | Transition | NA | NA | rs41441949 | 836 | 100 | 947 | 100 |
| 15932 | MT-TT | C | C | Transition | NA | NA | rs28601282 | 859 | 100 | 1013 | 100 |
| 16153 | MT-DLOOP | G | A | Transition | NA | NA | rs2853512 | 189 | 100 | 273 | 100 |
| 16172 | MT-DLOOP | C | T | Transition | NA | NA | rs2853817 | 80 | 100 | 100 | 100 |
| 16190 | MT-DLOOP | C | T | Transition | NA | NA | RP | 79 | 100 | 82 | 100 |
| 16299 | MT-DLOOP | T | C | Transition | NA | NA | RP | 251 | 100 | 221 | 100 |
| 16520 | MT-DLOOP | C | T | Transition | NA | NA | rs3937033 | 47 | 100 | 101 | 100 |
RP, reported polymorphism; AA, amino acid (First AA is the reference residue; Second AA is the residue resulting from the variant); NA, not applicable.
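The "Mutation type" column depends only on the reference and variant alleles: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch of that standard definition, useful for cross-checking the column; the function name `substitution_type` is hypothetical and not taken from the study:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-nucleotide substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    pair = {ref, alt}
    if ref == alt or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # Both bases in the same chemical class means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "Transition"
    return "Transversion"


# A few allele pairs taken from the table above: (position, reference, variant).
for position, ref, alt in [(150, "T", "C"), (408, "A", "T"), (4715, "A", "G"), (7196, "C", "A")]:
    print(position, f"{ref}>{alt}", substitution_type(ref, alt))
```

Because the classification is symmetric in the two alleles, it gives the same answer even if the reference and variant columns are swapped. A row whose two alleles are identical raises an error rather than being silently classified.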
Novel and nonsynonymous variants common to both twins A and B (n = 30 variants)
| mtDNA position | Gene | Major allele | Twin A minor allele | Twin A minor allele ratio (%) | Twin A first AA | Twin A second AA | Twin A mutation type | Twin B minor allele | Twin B minor allele ratio (%) | Twin B first AA | Twin B second AA | Twin B mutation type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3455 | MT-ND1 | C | T | 0.11 | A | V | Transition | A | 0.13 | A | D | Transversion |
| 3668 | MT-ND1 | G | A | 0.12 | W | STOP | Transition | A | 0.16 | W | STOP | Transition |
| 3805 | MT-ND1 | A | G | 0.10 | T | A | Transition | T | 0.12 | T | S | Transversion |
| 4971 | MT-ND2 | G | T | 0.10 | G | C | Transversion | T | 0.12 | G | C | Transversion |
| 4998 | MT-ND2 | A | T | 0.09 | K | STOP | Transversion | G | 0.12 | K | E | Transition |
| 5014 | MT-ND2 | C | G | 0.09 | S | C | Transversion | A | 0.12 | S | Y | Transversion |
| 5055 | MT-ND2 | T | C | 0.35 | Y | H | Transition | C | 0.13 | Y | H | Transition |
| 5107 | MT-ND2 | C | T | 0.12 | T | I | Transition | A | 0.13 | T | N | Transversion |
| 5470 | MT-ND2 | C | T | 0.11 | H | M | Transition | A | 0.12 | T | K | Transversion |
| 5481 | MT-ND2 | C | A | 0.10 | P | T | Transversion | A | 0.12 | P | T | Transversion |
| 5906 | MT-CO1 | G | A | 0.11 | M | M | Transition | T | 0.14 | M | M | Transition |
| 6569 | MT-CO1 | C | A | 0.22 | P | P | Transversion | A | 0.14 | R | S | Transversion |
| 6849 | MT-CO1 | A | C | 0.08 | T | P | Transversion | T | 0.14 | T | S | Transversion |
| 6935 | MT-CO1 | C | T | 0.08 | H | Y | Transition | T/A | 0.10 | H | N | Transversion |
| 6998 | MT-CO1 | C | A | 0.08 | I | M | Transversion | A | 0.10 | I | M | Transversion |
| 7131 | MT-CO1 | G | T | 0.09 | A | S | Transversion | T | 0.11 | A | S | Transversion |
| 7219 | MT-CO1 | G | T | 0.18 | R | L | Transversion | T | 0.11 | R | L | Transversion |
| 7610 | MT-CO2 | C | A | 0.13 | L | M | Transversion | A | 0.14 | L | M | Transversion |
| 7978 | MT-CO2 | G | T | 0.09 | G | V | Transversion | T | 0.12 | G | V | Transversion |
| 8048 | MT-CO2 | A | T | 0.09 | T | S | Transversion | G | 0.12 | T | A | Transversion |
| 8085 | MT-CO2 | C | G | 0.10 | T | STOP | Transversion | A | 0.12 | T | K | Transversion |
| 8156 | MT-CO2 | G | C | 0.28 | V | L | Transversion | C | 0.33 | V | L | Transversion |
| 8243 | MT-CO2 | G | T | 0.10 | E | STOP | Transversion | A/T | 0.12 | E | K/STOP | Transition |
| 9744 | MT-CO3 | G | T | 0.09 | E | STOP | Transversion | T | 0.22 | E | STOP | Transversion |
| 10867 | MT-ND4 | C | A | 0.09 | I | M | Transversion | G | 0.11 | I | M | Transversion |
| 11351 | MT-ND4 | G | T | 0.09 | A | S | Transversion | T | 0.11 | A | S | Transversion |
| 11638 | MT-ND4 | C | T | 0.11 | H | Y | Transition | A | 0.11 | T | N | Transversion |
| 11941 | MT-ND4 | T | A | 0.08 | L | Q | Transversion | G | 0.10 | L | Q | Transversion |
| 13725 | MT-ND5 | C | A | 0.14 | F | L | Transversion | A | 0.15 | F | L | Transversion |
| 15266 | MT-CYB | A | C | 0.10 | T | P | Transversion | G | 0.17 | T | A | Transversion |
AA, amino acid effect of variant.
Comparison of SNP detection using the HiSeq and the primer extension techniques
| mtDNA position | Gene | Primer extension major allele | Primer extension minor allele | HiSeq major allele | Twin A minor allele, Run2 HS | Twin A minor allele, Run2 MS | Twin A minor allele, Run1 HS | Twin A minor allele, Run1 MS | Twin B minor allele, Run2 HS | Twin B minor allele, Run2 MS | Twin B minor allele, Run1 HS | Twin B minor allele, Run1 MS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3275 | MT-TL1 | – | C | T | A | – | – | G | G | – | – | |
| 3455 | MT-ND1 | C | A | C | A | – | – | – | T | T | – | – |
| 3668 | MT-ND1 | G | – | G | A | A | – | – | A | A | – | – |
| 3805 | MT-ND1 | A | – | A | T | T | – | – | G | G | – | – |
| 4456 | MT-TM | C | T | C | C | T | – | – | T | – | – | – |
| 4971 | MT-ND2 | G | – | G | T | T | – | – | T | T | – | – |
| 4998 | MT-ND2 | A | C | A | G | – | C | – | T | T | C | – |
| 5014 | MT-ND2 | C | A | C | A | A | A | – | G | A | – | – |
| 5055 | MT-ND2 | T | – | T | C | C | – | – | C | C | – | – |
| 5107 | MT-ND2 | C | – | C | A | A | – | – | T | A | – | – |
| 5470 | MT-ND2 | C | – | C | A | – | – | – | T | – | – | – |
| 5481 | MT-ND2 | C | – | C | A | – | – | – | – | A | – | A |
| 5906 | MT-CO1 | G | – | G | T | T | – | – | A | – | T | T |
| 6569 | MT-CO1 | C | A | C | A | A | – | – | A | A | – | – |
| 6849 | MT-CO1 | A | – | A | T | T/C | C | C | C | T | C | C |
| 6935 | MT-CO1 | C | A | C | T/A | T | T | – | T | T | T | – |
| 6998 | MT-CO1 | C | T | C | A | A | – | T | A | – | – | – |
| 7131 | MT-CO1 | G | – | G | T | T | – | – | T | T | – | – |
| 7219 | MT-CO1 | G | – | G | T | T | A | A | T | T | T | T |
| 7610 | MT-CO2 | C | – | C | A | – | T | T | A | – | – | – |
| 7978 | MT-CO2 | G | C | G | T | C | – | – | T | C | – | – |
| 8048 | MT-CO2 | A | – | A | G | T | – | – | T | T | – | – |
| 8085 | MT-CO2 | C | – | C | A | A | – | – | G | – | – | – |
| 8156 | MT-CO2 | G | – | G | C | C | C | C | C | C | C | C |
| 8243 | MT-CO2 | G | – | G | A/T | T | – | – | T | T | T/C | T/C |
| 9744 | MT-CO3 | G | – | G | T | T | T | T | T | T | – | – |
| 10867 | MT-ND4 | C | C | G | – | A | – | A | – | – | – | |
| 12258 | MT-TS2 | C | – | C | A | – | A | A | A | A | – | – |
| 11351 | MT-ND4 | G | – | G | T | T | C | – | T | T | – | T |
| 11638 | MT-ND4 | C | – | C | A | A | A | – | T | T | T | T |
| 11941 | MT-ND4 | T | A | T | G | – | G | G | A | A | – | – |
| 13725 | MT-ND5 | C | – | C | A | – | – | – | A | – | A | – |
| 15266 | MT-CYB | A | – | A | G | G | G | G | C | C | C | C |
HiSeq, high-throughput sequencing; MS, aligned with moderate stringency; HS, aligned with high stringency. Dash indicates minor allele was not detected.
Figure 3. Electropherograms showing primer extension assay for two positions from either twin A (left) or twin B (right). The blue and black peaks correspond to the major alleles (G at position 9606, C at 6998), while the small red peaks indicate the presence of a T allele at position 6998 at a very low level.
Figure 4. Electropherograms showing the genotype detected in each twin by primer extension on genomic DNA (A) or mitochondrial hemigenome templates (B). The black peak corresponds to the reference allele “C” at position 6998, while the red peak indicates the presence of a “T” allele at a low level. The low-level variant detected by primer extension from genomic DNA (A) was absent in the assay with the mitochondrial hemigenome (B), suggesting that the T allele signal may result from an extramitochondrial source (numt).
Figure 5. Mapping reads to test for nuclear mitochondrial sequences. Resulting low-level variant reads for the twins when reads are mapped to the hg19 reference genome (red dots) or to the mtDNA genome only (blue circles). The y-axis represents the number of reads with the variant allele; the x-axis represents the position on the mitochondrial sequence.