Zachary R Hanna1,2,3,4, James B Henderson3,4, Anna B Sellas4,5, Jérôme Fuchs3,6, Rauri C K Bowie1,2, John P Dumbacher3,4.
Abstract
We report here the successful assembly of the complete mitochondrial genomes of the northern spotted owl (Strix occidentalis caurina) and the barred owl (S. varia). We utilized sequence data from two sequencing methodologies: Illumina paired-end sequence data, with insert lengths ranging from approximately 250 nucleotides (nt) to 9,600 nt and read lengths from 100 to 375 nt, and Sanger-derived sequences. We employed multiple assemblers and alignment methods to generate the final assemblies. The circular genomes of S. o. caurina and S. varia comprise 19,948 nt and 18,975 nt, respectively. Both code for two rRNAs, twenty-two tRNAs, and thirteen polypeptides. They both have duplicated control region sequences with complex repeat structures. We were not able to assemble the control regions solely using Illumina paired-end sequence data. By fully spanning the control regions, Sanger-derived sequences enabled accurate and complete assembly of these mitochondrial genomes. These are the first complete mitochondrial genome sequences of owls (Aves: Strigiformes) possessing duplicated control regions. We searched the nuclear genome of S. o. caurina for copies of mitochondrial genes and found at least nine separate stretches of nuclear copies of gene sequences originating in the mitochondrial genome (Numts). The Numts ranged from 226 to 19,522 nt in length and included copies of all mitochondrial genes except tRNAPro, ND6, and tRNAGlu. Strix occidentalis caurina and S. varia exhibited an average of 10.74% (8.68% uncorrected p-distance) divergence across the non-tRNA mitochondrial genes.
Keywords: Barred owl; Bird; Control region; Mitochondrial genome; Mtgenome; Northern spotted owl; mtDNA
Year: 2017 PMID: 29038757 PMCID: PMC5639871 DOI: 10.7717/peerj.3901
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Strix specimen data.
We here provide further information regarding the datasets that archive the Strix specimens to which we refer throughout the manuscript.
| Specimen | Data publisher | Date accessed | Link to dataset |
|---|---|---|---|
| CAS:ORN:95964 | CAS Ornithology (ORN), California Academy of Sciences, San Francisco, California, United States of America | 2016 Aug 15 | |
| CAS:ORN:98821 | CAS Ornithology (ORN), California Academy of Sciences, San Francisco, California, United States of America | 2016 Aug 15 | |
| CNHM<USA-OH>:ORNITH:B41533 | Museum of Natural History & Science, Cincinnati Museum Center, Cincinnati, Ohio, United States of America | 2017 Sep 3 | |
Sequence of primers used in Sanger sequencing of control regions.
These are the sequences of all of the primers that we used to amplify control regions 1 and 2 in order to confirm the final sequence of these regions in the mitochondrial genome assemblies.
| Primer name | Relevant region | Species used on | External or internal | Primer sequence (5′ → 3′) | Source |
|---|---|---|---|---|---|
| cytb-F1 | CR1 | | External | ATCCTCATTCTCTTCCCCGT | This study |
| 17122R | CR1 | | External | GGTGGGGGTTATTATTAACTTT | This study |
| CR1-F1 | CR1 | | Internal | CTCSASCAAATCCCAAGTTT | This study |
| CR1-F1-RC | CR1 | | Internal | AAACTTGGGATTTGSTSGAG | This study |
| CR1-R2 | CR1 | | Internal | GGAGGGCGAGAATAGTTGRT | This study |
| CR1-R2-RC | CR1 | | Internal | AYCAACTATTCTCGCCCTCC | This study |
| N1 | CR1 | | Internal | AACATTGGTCTTGTAAACCAA | |
| 41R | CR2 | | External | GCATCTTCAGTGCCATGCTT | This study |
| 17572F | CR2 | | External | ATTATCCAAGGTCTGCGGCC | This study |
| 17589F | CR2 | | Internal | GCCTGAAAAACCGCCGTTAA | This study |
| 18327F | CR2 | | Internal | CACTTTTGCGCCTCTGGTTC | This study |
| 19911R | CR2 | | Internal | AGAGAGGCTCTGATTGCTTG | This study |
| ND6-ext1F | CR2 | | External | ACAACCCCATAATAYGGCGA | This study |
| 12S-ext1R | CR2 | | External | GGTAGATGGGCATTTACACT | This study |
| final-CR2F | CR2 | | Internal | TCAAACCAAACGATCGAGAA | This study |
| 18547F | CR2 | | Internal | CTCACGTGAAATCAGCAACC | This study |
| 19088R | CR2 | | Internal | ATTCAACTAAAATTCGTTACAAATCTT | This study |
| 19088R-RC | CR2 | | Internal | AAGATTTGTAACGAATTTTAGTTGAAT | This study |
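Three primers in the table have a partner whose name differs only by an "-RC" suffix. Assuming that the suffix marks the reverse complement of the base primer (the table does not state this), the pairing can be checked with a reverse complement that also handles the IUPAC ambiguity codes S, R, and Y used in these sequences. The function and dictionary names below are illustrative and not part of the study's pipeline.

```python
# Complement table for IUPAC nucleotide codes (S and W are self-complementary).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, IUPAC codes included."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Sequences copied from the primer table; pairing by "-RC" suffix is an assumption.
PRIMER_PAIRS = {
    "CR1-F1": ("CTCSASCAAATCCCAAGTTT", "AAACTTGGGATTTGSTSGAG"),
    "CR1-R2": ("GGAGGGCGAGAATAGTTGRT", "AYCAACTATTCTCGCCCTCC"),
    "19088R": ("ATTCAACTAAAATTCGTTACAAATCTT", "AAGATTTGTAACGAATTTTAGTTGAAT"),
}

for name, (primer, rc_primer) in PRIMER_PAIRS.items():
    verdict = "matches" if reverse_complement(primer) == rc_primer else "differs from"
    print(f"{name}-RC {verdict} the reverse complement of {name}")
```

All three listed pairs pass this check, which is consistent with the assumed meaning of the suffix.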
Figure 1. Ancestral avian mitochondrial gene order surrounding the control region compared with that of Strix occidentalis caurina and Strix varia.
The Chicken panel displays the gene order of Gallus gallus, which is the presumed ancestral avian gene order. The Spotted Owl panel depicts the gene order of Strix occidentalis caurina and the Barred Owl panel depicts the gene order of Strix varia. All rRNAs, tRNAs, and protein-coding genes outside of the displayed region exhibit the same order in all of these mitochondrial genomes. “CR” denotes the control region with “CR1” and “CR2” referring to control regions 1 and 2, respectively. We added 100 nucleotides to each of the tRNAs to improve visualization. Apart from the tRNAs, the annotations are to scale relative to each other with the numbers at the top of the figure denoting nucleotides.
Tandem repeat annotations.
This summarizes the repetitive regions of the northern spotted owl (Strix occidentalis caurina or S. o. caurina) and barred owl (S. varia) mitochondrial genomes annotated by Tandem Repeats Finder. “Period size” refers to the size of the repeated motif. “Copy number” refers to the number of copies of the repeat in the region. “Consensus size” is the length of the consensus sequence summarizing all copies of the repeat, which may or may not be different from the period size. “Percent matches” refers to the percentage of nucleotides that match between adjacent copies of the repeat. “Percent indels” refers to the percentage of indels between adjacent copies of the repeat. We present the percent composition of each of the four nucleotides in the repetitive region. We have included the genomic regions that intersect each repetitive span in the “Region” column. “CR1” and “CR2” refer to control region 1 and control region 2, respectively.
| Taxon | Coordinates (nt) | Region | Period size (nt) | Copy number | Consensus size (nt) | Percent matches (%) | Percent indels (%) | A (%) | C (%) | G (%) | T (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S. o. caurina | 10,267–10,309 | | 18 | 2.3 | 19 | 84 | 4 | 25 | 46 | 0 | 27 |
| S. o. caurina | 15,066–15,162 | CR1 | 22 | 4.3 | 22 | 70 | 7 | 37 | 27 | 6 | 28 |
| S. o. caurina | 15,169–15,311 | CR1 | 67 | 2.1 | 67 | 83 | 8 | 40 | 31 | 6 | 21 |
| S. o. caurina | 16,243–16,715 | CR1 | 70 | 6.8 | 70 | 98 | 1 | 39 | 21 | 4 | 33 |
| S. o. caurina | 16,245–16,715 | CR1 | 139 | 3.4 | 139 | 99 | 0 | 39 | 22 | 4 | 33 |
| S. o. caurina | 16,403–16,515 | CR1 | 37 | 3.2 | 37 | 61 | 27 | 40 | 23 | 3 | 32 |
| S. o. caurina | 17,679–17,795 | CR2 | 44 | 2.6 | 45 | 87 | 4 | 43 | 30 | 4 | 21 |
| S. o. caurina | 17,719–17,795 | CR2 | 22 | 3.5 | 22 | 89 | 0 | 45 | 31 | 3 | 19 |
| S. o. caurina | 18,798–19,076 | CR2 | 70 | 4.0 | 70 | 99 | 0 | 39 | 21 | 4 | 34 |
| S. o. caurina | 18,800–19,076 | CR2 | 139 | 2.0 | 139 | 100 | 0 | 39 | 21 | 4 | 34 |
| S. o. caurina | 18,958–19,070 | CR2 | 37 | 3.2 | 37 | 61 | 27 | 40 | 23 | 3 | 32 |
| S. o. caurina | 19,110–19,853 | CR2 | 78 | 9.5 | 78 | 99 | 0 | 41 | 15 | 15 | 27 |
| S. varia | 15,126–15,209 | CR1 | 22 | 3.8 | 22 | 82 | 4 | 36 | 27 | 4 | 30 |
| S. varia | 15,193–15,340 | CR1 | 67 | 2.2 | 68 | 83 | 1 | 37 | 32 | 8 | 22 |
| S. varia | 17,384–17,482 | CR2 | 22 | 4.4 | 23 | 87 | 5 | 41 | 34 | 5 | 19 |
| S. varia | 18,548–18,951 | CR2 | 78 | 5.2 | 77 | 93 | 2 | 40 | 17 | 15 | 26 |
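As a rough illustration of two of the quantities defined in the caption, the sketch below computes the percent composition of a sequence and a naive copy number obtained by dividing the length of a repeat span by its period size. Both function names and the toy sequence are hypothetical. Tandem Repeats Finder derives its copy number from an alignment of the copies to the consensus, so the naive ratio is only an approximation and will not always agree with the tabulated value.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T in a sequence, rounded to whole numbers."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    return {base: round(100 * counts[base] / total) for base in "ACGT"}


def approx_copy_number(start: int, end: int, period: int) -> float:
    """Span length divided by period size, for inclusive 1-based coordinates."""
    return (end - start + 1) / period


# Toy sequence for illustration only; it is not taken from either genome.
toy_repeat = "AACCGGTTAA"
print(base_composition(toy_repeat))

# Coordinates and period size of the 70-nt CR1 repeat spanning 16,243-16,715.
print(round(approx_copy_number(16243, 16715, 70), 1))
```

For the 70-nt repeat at 16,243–16,715 the naive ratio rounds to the tabulated 6.8, whereas for the 22-nt repeat at 15,066–15,162 it gives about 4.4 against a tabulated 4.3.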
Figure 2. Alignment of control regions 1 and 2 within Strix occidentalis caurina and Strix varia.
(A) depicts an alignment of the Strix occidentalis caurina control regions 1 and 2. (B) displays an alignment of the Strix varia control regions 1 and 2. The numerical coordinates at the top of each panel correspond to the coordinates of the alignment. Black rectangles for each control region denote continuous sequence, whereas intervening horizontal lines denote gaps in the alignment. The sequence identity rectangle is green at full height when there is agreement between the sequences, yellow at less than full height when the sequences disagree, and flat in gap regions. The location of the goose hairpin sequence in each control region is annotated in blue. The alignment locations of the primers we developed to amplify control regions 1 and 2 as well as the D16 primer used by Barrowclough, Gutierrez & Groth (1999) to amplify a portion of control region 1 are annotated in reddish purple.
Figure 3. Alignment of Strix occidentalis caurina control regions 1 and 2 with those of Strix varia.
(A) depicts an alignment of the Strix occidentalis caurina control region 1 with that of Strix varia. (B) displays an alignment of the Strix occidentalis caurina control region 2 with that of Strix varia. The numerical coordinates at the top of each panel correspond to the coordinates of the alignment. Black rectangles for each control region denote continuous sequence, whereas intervening horizontal lines denote gaps in the alignment. The sequence identity rectangle is green at full height when there is agreement between the sequences, yellow at less than full height when the sequences disagree, and flat in gap regions. The location of the goose hairpin sequence in each control region is annotated in blue. The alignment locations of the primers we developed to amplify control regions 1 and 2 as well as the D16 primer used by Barrowclough, Gutierrez & Groth (1999) to amplify a portion of control region 1 are annotated in reddish purple. The annotation of primer final-CR2F is elongated as it is situated across a gap region in the alignment.
Mitochondrion-derived nuclear pseudogenes (Numts) identified in the Strix occidentalis caurina nuclear genome sequence and statistics of the results of BLASTN searches.
We indicate the mitochondrial genes that a Numt spans in the “Genes included” column. If a Numt spans more than two genes, we indicate the first and last genes that it spans as well as a gene in the middle of the Numt in order to indicate the direction that the Numt extends. The Numt additionally spans all of the intervening genes in such cases. “Start mtDNA” and “End mtDNA” indicate the mitochondrial genome assembly sequence positions and “Start Scaffold” and “End Scaffold” denote the nuclear genome assembly contig/scaffold sequence positions in the alignments of the mitochondrial genome assembly to the nuclear genome assembly. “% ID” indicates the percentage of identical matches in an alignment. “E-value” is the Expect value. “Bit score” is a log-scaled version of the alignment score. We characterized some of the Numts by examining more than one alignment and concluding that a Numt spanned across those individual alignments.
| Numt | Genes included | Start mtDNA | End mtDNA | Nuclear genome scaffold | Start scaffold | End scaffold | Orientation | % ID | E-value | Bit score | Length alignment (nt) | Numt length (nt) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | | 1 | 2,225 | scaffold478 | 47,666 | 49,858 | + | 79.92 | 0.0 | 1,565 | 2,261 | 19,522 |
| | | 2,367 | 2,645 | scaffold478 | 49,871 | 50,143 | + | 87.81 | 2.16e−84 | 322 | 279 | – |
| | | 2,706 | 5,223 | scaffold478 | 50,161 | 52,680 | + | 80.66 | 0.0 | 1,921 | 2,549 | – |
| | | 5,219 | 6,932 | scaffold478 | 57,635 | 59,328 | + | 83.22 | 0.0 | 1,552 | 1,716 | – |
| | | 6,988 | 7,103 | scaffold478 | 59,382 | 59,496 | + | 87.18 | 1.41e−26 | 130 | 117 | – |
| | | 8,382 | 13,249 | scaffold478 | 59,498 | 64,306 | + | 80.59 | 0.0 | 3,672 | 4,893 | – |
| | | 14,047 | 14,733 | scaffold478 | 44,785 | 45,459 | + | 82.82 | 1.92e−169 | 604 | 687 | – |
| | | 14,729 | 14,878 | scaffold478 | 46,066 | 46,222 | + | 82.80 | 1.09e−27 | 134 | 157 | – |
| 2 | | 1,682 | 2,603 | scaffold215 | 5,517,239 | 5,518,161 | − | 81.97 | 0.0 | 773 | 932 | 923 |
| 3 | | 6,989 | 9,584 | scaffold215 | 5,513,222 | 5,515,749 | − | 79.01 | 0.0 | 1,690 | 2,615 | 2,528 |
| 4 | | 2,290 | 2,788 | scaffold632 | 1,548,886 | 1,549,372 | + | 77.50 | 6.14e−70 | 274 | 511 | 487 |
| 5 | | 2,810 | 4,646 | scaffold167 | 11,322,764 | 11,324,590 | + | 80.54 | 0.0 | 1,400 | 1,840 | 2,732 |
| | | 4,692 | 5,597 | scaffold167 | 11,324,598 | 11,325,495 | + | 83.68 | 0.0 | 846 | 907 | – |
| 6 | | 3,851 | 5,526 | scaffold1500 | 35,914 | 37,582 | − | 84.21 | 0.0 | 1,620 | 1,678 | 1,669 |
| 7 | | 4,500 | 5,348 | scaffold173 | 750,945 | 751,785 | − | 81.40 | 0.0 | 680 | 855 | 841 |
| 8 | | 12,082 | 12,310 | scaffold143 | 586,822 | 587,047 | + | 81.30 | 3.83e−42 | 182 | 230 | 226 |
| 9 | | 15,026 | 15,640 | scaffold294 | 2,356,468 | 2,357,059 | − | 83.07 | 9.17e−148 | 532 | 620 | 592 |
| | | 17,677 | 18,195 | scaffold294 | 2,356,468 | 2,356,986 | − | 80.87 | 9.70e−108 | 399 | 528 | – |
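The caption notes that some Numts were characterized from more than one alignment. One reading that is consistent with the table is that the reported length of a Numt is the span on the nuclear scaffold covered by all of its alignments. The sketch below tests that reading. It assumes that each row without a Numt number belongs to the numbered row above it, and the function name is hypothetical.

```python
# (Numt number, scaffold start, scaffold end), copied from the table above.
# Rows without a Numt number are assigned to the preceding Numt (assumption).
ALIGNMENTS = [
    (1, 47666, 49858), (1, 49871, 50143), (1, 50161, 52680),
    (1, 57635, 59328), (1, 59382, 59496), (1, 59498, 64306),
    (1, 44785, 45459), (1, 46066, 46222),
    (2, 5517239, 5518161),
    (3, 5513222, 5515749),
    (4, 1548886, 1549372),
    (5, 11322764, 11324590), (5, 11324598, 11325495),
    (6, 35914, 37582),
    (7, 750945, 751785),
    (8, 586822, 587047),
    (9, 2356468, 2357059), (9, 2356468, 2356986),
]


def numt_lengths(alignments):
    """Scaffold span (inclusive) covered by all alignments of each Numt."""
    spans = {}
    for numt, start, end in alignments:
        low, high = spans.get(numt, (start, end))
        spans[numt] = (min(low, start), max(high, end))
    return {numt: high - low + 1 for numt, (low, high) in spans.items()}


for numt, length in numt_lengths(ALIGNMENTS).items():
    print(f"Numt {numt}: {length:,} nt")
```

Under these assumptions the computed spans equal the tabulated lengths for all nine Numts, including the 226 nt and 19,522 nt extremes quoted in the abstract.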
Divergence of Strix occidentalis caurina and Strix varia at all protein-coding genes.
This table provides the number of base substitutions per site for all mitochondrial protein-coding genes and rRNAs between the mitochondrial sequences of Strix occidentalis caurina and S. varia. P-distance refers to an uncorrected pairwise distance while TN93 refers to the pairwise distance corrected by the Tamura-Nei 1993 model (Tamura & Nei, 1993).
| Gene | Number of sites in alignment (nt) | P-distance | Distance with TN93 model |
|---|---|---|---|
| | 984 | 5.79% | 6.61% |
| | 1,589 | 5.48% | 6.14% |
| | 681 | 9.10% | 11.07% |
| | 165 | 14.55% | 20.81% |
| | 1,548 | 7.88% | 9.31% |
| | 681 | 9.10% | 11.23% |
| | 783 | 7.54% | 8.89% |
| | 1,140 | 9.21% | 11.35% |
| | 957 | 10.66% | 13.46% |
| | 1,038 | 9.34% | 11.60% |
| | 174 | 10.92% | 13.96% |
| | 174 | 11.49% | 14.86% |
| | 1,377 | 10.31% | 13.12% |
| | 294 | 11.22% | 14.29% |
| | 1,818 | 9.19% | 11.29% |
| | 516 | 9.69% | 14.71% |
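The abstract reports an average divergence of 10.74% (8.68% uncorrected p-distance) across the non-tRNA genes but does not say how the average was taken. The sketch below tries one possibility, a mean weighted by the number of aligned sites. It treats the smaller percentage in each row as the p-distance and the larger as the TN93 distance, which is an assumption based on the caption, and the function name is hypothetical.

```python
# (sites in alignment, p-distance %, TN93 distance %), one tuple per table row.
ROWS = [
    (984, 5.79, 6.61), (1589, 5.48, 6.14), (681, 9.10, 11.07),
    (165, 14.55, 20.81), (1548, 7.88, 9.31), (681, 9.10, 11.23),
    (783, 7.54, 8.89), (1140, 9.21, 11.35), (957, 10.66, 13.46),
    (1038, 9.34, 11.60), (174, 10.92, 13.96), (174, 11.49, 14.86),
    (1377, 10.31, 13.12), (294, 11.22, 14.29), (1818, 9.19, 11.29),
    (516, 9.69, 14.71),
]


def site_weighted_mean(rows, column):
    """Mean of one distance column, weighted by the number of aligned sites."""
    total_sites = sum(row[0] for row in rows)
    return sum(row[0] * row[column] for row in rows) / total_sites


print(f"p-distance: {site_weighted_mean(ROWS, 1):.2f}%")
print(f"TN93:       {site_weighted_mean(ROWS, 2):.2f}%")
```

The weighted means come out at 8.68% and 10.74%, matching the abstract. That agreement suggests, but does not confirm, that the reported averages were weighted by alignment length.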