Anne-Lyse Ducrest1, Samuel Neuenschwander2, Emanuel Schmid-Siegert2, Marco Pagni2, Clément Train3,4,5, David Dylus3,4,5, Yannis Nevers3,4,5, Alex Warwick Vesztrocy6, Luis M San-Jose7, Mélanie Dupasquier8, Christophe Dessimoz3,4,5, Ioannis Xenarios4, Alexandre Roulin1, Jérôme Goudet1,5.
Abstract
New genomic tools open doors to study the ecology, evolution, and population genomics of wild animals. For the barn owl species complex, a cosmopolitan nocturnal raptor, a very fragmented draft genome was assembled for the American species (Tyto furcata pratincola) (Jarvis et al. 2014). To improve the genome, we assembled de novo Illumina and Pacific Biosciences (PacBio) long-read sequences of its European counterpart (Tyto alba alba). This genome assembly of 1.219 Gbp comprises 21,509 scaffolds and has an N50 of 4,615,526 bp. BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis revealed an assembly completeness of 94.8%, with only 1.8% of the genes missing out of the 4,915 avian orthologs searched, a proportion similar to that found in the genomes of the zebra finch (Taeniopygia guttata) and the collared flycatcher (Ficedula albicollis). By mapping the reads of the female American barn owl to the male European barn owl reads, we detected several structural variants and identified 70 Mbp of the Z chromosome. The barn owl scaffolds were further mapped to the chromosomes of the zebra finch. In addition, the completeness of the European barn owl genome is demonstrated by the retrieval, in the European barn owl transcripts, of 94 of the 128 proteins missing in the chicken genome. This improved genome will help future barn owl population genomic investigations.
Keywords: Strigiformes; Tytonidae; assembly; barn owl; bird; genome
Year: 2020 PMID: 32184981 PMCID: PMC7069322 DOI: 10.1002/ece3.5991
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. European barn owl (Tyto alba alba). ©Guillaume Rapin, Switzerland
Comparison of the European barn owl genome assembly metrics and genome completeness using BUSCO to the American barn owl, the zebra finch, the collared flycatcher, and the chicken genome assemblies
| | European barn owl | American barn owl | Zebra finch | Collared flycatcher | Chicken |
|---|---|---|---|---|---|
| Number of scaffolds | 21,509 | 62,122 | 37,096 | 21,428 | 23,475 |
| Number of scaffolds (≥500 bp) | 21,509 | 57,936 | 37,096 | 9,718 | 23,208 |
| Number of scaffolds (≥1,000 bp) | 10,312 | 47,332 | 37,094 | 4,033 | 22,945 |
| Largest scaffold | 22,155,979 | 502,267 | 156,412,533 | 157,563,209 | 196,202,544 |
| Assembly length | 1,219,191,878 | 1,120,143,088 | 1,232,135,591 | 1,118,343,587 | 1,230,258,557 |
| N50 | 4,615,526 | 52,818 | 62,374,962 | 64,724,594 | 82,310,166 |
| N75 | 1,861,816 | 25,700 | 15,652,063 | 21,727,166 | 14,109,371 |
| L50 | 72 | 5,943 | 7 | 6 | 5 |
| L75 | 177 | 13,502 | 18 | 13 | 16 |
| Genome size (Gbp) | 1.59 | 1.59 | 1.22 | 1.20 | 1.25 |
| NG50 | 2,701,956 | 29,716 | 62,374,962 | 64,724,594 | 82,310,166 |
| GC (%) | 42 | 40 | 41 | 44 | 43 |
| N's (%) | 0.79 | 0.79 | 0.75 | 1.43 | 0.96 |
| BUSCO completeness | | | | | |
| Complete (%) | 94.8 | 84.3 | 93.6 | ND | 94.8 |
| Single‐copy (%) | 94.0 | 83.9 | 90.8 | ND | 93.8 |
| Duplicated (%) | 0.8 | 0.4 | 2.8 | ND | 1.0 |
| Fragmented (%) | 3.4 | 10.6 | 3.8 | ND | 2.9 |
| Missing (%) | 1.8 | 5.1 | 2.6 | ND | 2.3 |
All statistics were computed for scaffolds >500 bp. To evaluate genome completeness, the European barn owl genome was compared with the other genomes with BUSCO, using 4,915 conserved avian orthologous genes. Except for genome size (Gbp), all lengths are given in bp. The genome size of the barn owl is the mean derived from C‐values of 1.73 and 1.53 pg of DNA, which correspond to genome sizes of 1.69 and 1.50 Gbp, with an average of 1.59 Gbp (De Vita et al., 1994; Venturini et al., 1986).
Abbreviation: ND, not done.
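The contiguity metrics and the genome-size estimate in the table above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the conversion factor of 0.978 Gbp per pg is the commonly used value and is an assumption here (the source only reports the resulting sizes), and the scaffold lengths in `toy` are hypothetical.

```python
# Assumption: 1 pg of DNA corresponds to about 0.978 Gbp (not stated in the source).
PG_TO_GBP = 0.978


def c_value_to_gbp(c_value_pg):
    """Convert a C-value in picograms to a genome size in Gbp."""
    return c_value_pg * PG_TO_GBP


def nx_lx(lengths, fraction, total=None):
    """Return (Nx, Lx) for a list of scaffold lengths.

    With total=None the target is a fraction of the assembly length (N50, N75).
    Passing an estimated genome size as `total` gives NGx instead (NG50).
    Returns (None, None) if the assembly never reaches the target.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * (sum(ordered) if total is None else total)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    return None, None


# Genome size as the mean of the two C-value-based estimates.
sizes = [c_value_to_gbp(1.73), c_value_to_gbp(1.53)]
mean_size = sum(sizes) / len(sizes)

# Hypothetical scaffold lengths, for illustration only.
toy = [50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy, 0.5)              # target is half of the assembly (75)
ng50, lg50 = nx_lx(toy, 0.5, total=200)  # target is half of an assumed genome size (100)
```

Because NG50 is computed against the estimated genome size rather than the assembly length, it is lower than N50 whenever the assembly is shorter than the genome estimate, as is the case for both barn owl assemblies in the table.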
Metrics of the libraries used for the de novo assembly of the European barn owl
| Library type | Length (bp) | Insert size (bp) | Number of reads | Total size (Gbp) | Coverage |
|---|---|---|---|---|---|
| Illumina paired‐end | 2 × 100 | 180 | 243,335,851 | 48.67 | 41× |
| Illumina paired‐end | 2 × 100 | 500 | 187,046,962 | 37.41 | 31× |
| Illumina paired‐end | 2 × 100 | 500 | 175,190,557 | 35.04 | 29× |
| Illumina mate‐pair | 2 × 100 | 2,000 | 38,906,455 | 7.78 | 6× |
| Illumina mate‐pair | 2 × 100 | 5,000 | 67,900,282 | 13.58 | 11× |
| PacBio | 500–49,386 | | 3,169,413 | 15.03 | 12× |
| Total | | | | 158.00 | 129× |
The coverage was computed assuming a genome size of 1.219 Gbp equal to the assembly size.
These libraries were used for assembling and scaffolding.
These libraries were used solely for scaffolding.
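As a worked illustration of how the coverage column relates to the other columns, the sketch below recomputes the yield and coverage of one 500 bp paired-end library. It assumes that "Number of reads" counts read pairs, which is consistent with the reported total size but is not stated explicitly in the source; the function names are illustrative.

```python
# Genome size assumed in the table note (equal to the assembly size).
ASSEMBLY_SIZE_GBP = 1.219


def paired_end_yield_gbp(n_pairs, read_length=100):
    """Total sequenced bases in Gbp, assuming each entry is a pair of reads."""
    return n_pairs * 2 * read_length / 1e9


def coverage(total_gbp, genome_gbp=ASSEMBLY_SIZE_GBP):
    """Mean depth of coverage: sequenced bases divided by genome size."""
    return total_gbp / genome_gbp


# Second library in the table: 187,046,962 entries of 2 x 100 bp.
yield_gbp = paired_end_yield_gbp(187_046_962)
depth = coverage(yield_gbp)
print(f"{yield_gbp:.2f} Gbp, about {depth:.0f}x")
```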
Figure 2. Relation between the number of scaffolds and the percentage of the genome assembly they cover for the American barn owl (red), the European barn owl (black), the zebra finch (green), the collared flycatcher (blue), and the chicken (gray). The horizontal dashed line represents 90% of the genome assembly
Summary of the repetitive elements present in the European barn owl assembly
| | Number of elements | Length (bp) | % assembly |
|---|---|---|---|
| Total SINEs | 2,046 | 180,446 | 0.01 |
| ALUs | 0 | 0 | 0.00 |
| MIRs | 0 | 0 | 0.00 |
| Total LINEs | 73,341 | 30,014,401 | 2.46 |
| LINE1 | 0 | 0 | 0.00 |
| LINE2 | 0 | 0 | 0.00 |
| L3/CR1 | 73,341 | 30,014,401 | 2.46 |
| Total LTR elements | 6,081 | 2,932,372 | 0.24 |
| ERVL | 1,553 | 1,560,499 | 0.13 |
| ERVL‐MaLRs | 0 | 0 | 0.00 |
| ERV_classI | 1,223 | 768,947 | 0.06 |
| ERV_classII | 528 | 340,257 | 0.03 |
| Total DNA elements | 0 | 0 | 0.00 |
| hAT‐Charlie | 0 | 0 | 0.00 |
| TcMar‐Tigger | 0 | 0 | 0.00 |
| Unclassified | 81,182 | 30,363,917 | 2.49 |
| Small RNA | 0 | 0 | 0.00 |
| Satellites | 1 | 473 | 0.00 |
| Simple repeats | 360,048 | 15,018,487 | 1.23 |
| Low complexity | 63,493 | 3,550,806 | 0.29 |
Repeats that contain an insertion or a deletion were counted as one element.
Summary metrics and quality assessments of the European barn owl annotations compared with the available American barn owl annotation
| | American barn owl | European barn owl |
|---|---|---|
| Number of proteins | 14,905 | 38,895 |
| Min length | 21 | 13 |
| Mean length | 489 | 308 |
| Median length | 362 | 471 |
| Max length | 22,559 | 23,122 |
| Similar chicken proteins | | |
| Total | 9,109 | 10,392 |
| Unique | 8,357 | 8,946 |
| Duplicates | 752 | 1,446 |
| BUSCO | | |
| Complete (%) | 73.9 | 73.9 |
| Single‐copy (%) | 70.4 | 72.8 |
| Duplicated (%) | 3.5 | 1.1 |
| Fragmented (%) | 10.4 | 13.0 |
| Missing (%) | 15.6 | 13.1 |
The quality assessments were based on the search for chicken and metazoa BUSCO proteins.
Global‐global search for similar chicken proteins (Gallus gallus, Ensembl release 88) in the European barn owl gene annotations.
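The BUSCO rows in the table above follow the usual BUSCO bookkeeping: complete equals single-copy plus duplicated, and complete, fragmented, and missing together account for all searched orthologs. A minimal sketch of that consistency check is given below; the function name and the rounding tolerance are assumptions chosen for illustration.

```python
def busco_consistent(single, duplicated, fragmented, missing, complete, tol=0.15):
    """Check that BUSCO percentages are internally consistent.

    `tol` absorbs rounding of the published values to one decimal place.
    """
    complete_ok = abs((single + duplicated) - complete) <= tol
    total_ok = abs((complete + fragmented + missing) - 100.0) <= tol
    return complete_ok and total_ok


# Values taken from the annotation table above.
american = busco_consistent(70.4, 3.5, 10.4, 15.6, complete=73.9)
european = busco_consistent(72.8, 1.1, 13.0, 13.1, complete=73.9)
```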
Figure 3. Effect of GC content on genome sequencing of the European and American barn owls. (a) GC content of 57 Sanger‐sequenced genes of the European barn owl. The gene sequences were split into the 5'UTR, the coding sequence (CDS), and the 3'UTR. The mean and the standard deviation for the 57 genes are plotted. (b) Percentage of first exons that were sequenced by Illumina (American barn owl, blue) or Illumina/PacBio (European barn owl, red) sequencing out of the 57 Sanger‐sequenced genes, binned according to their GC content. The number of genes in each first‐exon GC content bin (%) is as follows: ≥70: 15; 60–69: 16; 50–59: 9; 40–49: 14; <40: 3
Figure 4. Detection of scaffold rearrangements in the European and American barn owls. (a) Raw read coverage of the American (blue) and the European (red) barn owls for scaffold 97, which contains the RAB32 and androglobin (ADGB) genes (written in red) that may be partially duplicated in the American barn owl. The chicken genes surrounding the duplicated region are written in black. The relative coverage, that is, the raw reads of the European barn owl over the sum of the raw reads of the European and American barn owls, is shown as black dots. A drop in the relative coverage indicates a duplication in the American barn owl genome. (b) Detection of the duplication in the American barn owl at contig 97 by real‐time PCR (qPCR). The DNA copy numbers of exon 2 (in the region duplicated in the American barn owl) and of exon 1 (unduplicated) of the RAB32 gene were quantified by qPCR with primers and probes located in exon 2 and exon 1 of RAB32. (c) Mean value and standard deviation (bars) of the relative copy number of exon 2 over exon 1 of RAB32 in 4 male (M) and 4 female (F) European barn owls and in 3 male and 4 female American barn owls
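The relative coverage described in panel (a) of Figure 4 can be illustrated with a short sketch. It assumes that the two read sets have been normalized to the same overall sequencing depth, and the depth values used are hypothetical, not data from the study.

```python
def relative_coverage(european_depth, american_depth):
    """European depth over the sum of European and American depths.

    Returns None where neither sample has any coverage.
    """
    total = european_depth + american_depth
    if total == 0:
        return None
    return european_depth / total


# Hypothetical depths: equal copy number in both owls gives 0.5.
baseline = relative_coverage(30, 30)

# If a region is duplicated in the American barn owl, its depth there roughly
# doubles, so the relative coverage drops from 1/2 towards 1/3.
duplicated = relative_coverage(30, 60)
```

Under these assumptions, a sustained drop below the baseline over a stretch of the scaffold is the signal that the figure interprets as a duplication in the American barn owl.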
Figure 5. Comparison of European and American barn owl contigs with the zebra finch chromosomes. Chromosomal synteny plots between the zebra finch genome assembled at the chromosome level (black) and the European barn owl scaffolds (green) (a) for all zebra finch chromosomes and (b) for the zebra finch Z chromosome. The innermost part represents the localization of the European barn owl scaffolds on the zebra finch chromosomes (gray lines). The outermost line plot represents the breadth of coverage of the European (red) and American (blue) barn owl scaffolds. In (b), the innermost part represents the localization of the European barn owl scaffolds (gray and orange) on the zebra finch Z chromosome, with inversions denoted by orange lines
Figure 6. Avian phylogenetic trees based on the American and European barn owl proteins predicted with the American and European barn owl annotations. Depending on the dataset, the position of Tytonidae varies on the tree. The left tree uses the protein predictions produced by the American annotation; the right tree uses the protein predictions produced in this work. Shades of blue show positions at which differences in topology are detected. Nodes without a number have a bootstrap support of 100%. Small trees on the sides show the real branch lengths. The purple background represents the Passerimorphae, the orange the Coraciimorphae, and the light blue the Accipitrimorphae