Eva Puerma, Dorcas J Orengo, Fernando Cruz, Jèssica Gómez-Garrido, Pablo Librado, David Salguero, Montserrat Papaceit, Marta Gut, Carmen Segarra, Tyler S Alioto, Montserrat Aguadé.
Abstract
Drosophila guanche is a member of the obscura group that originated in the Canary Islands archipelago upon its colonization by D. subobscura. It evolved into a new species in the laurisilva, a laurel forest present in wet regions that in the islands have only minor long-term weather fluctuations. Oceanic island endemic species such as D. guanche can become model species to investigate not only the relative role of drift and adaptation in speciation processes but also how population size affects nucleotide variation. Moreover, the previous identification of two satellite DNAs in D. guanche makes this species attractive for studying how centromeric DNA evolves. As a prerequisite for its establishment as a model species suitable to address all these questions, we generated a high-quality D. guanche genome sequence composed of 42 cytologically mapped scaffolds, which are assembled into six super-scaffolds (one per chromosome). The comparative analysis of the D. guanche proteome with that of twelve other Drosophila species identified 151 genes that were subject to adaptive evolution in the D. guanche lineage, with a subset of them being involved in flight and genome stability. For example, the Centromere Identifier (CID) protein, directly interacting with centromeric satellite DNA, shows signals of adaptation in this species. Both genomic analyses and FISH of the two satellites would support an ongoing replacement of centromeric satellite DNA in D. guanche.
Year: 2018 PMID: 29947749 PMCID: PMC6101566 DOI: 10.1093/gbe/evy135
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Genome Assembly Statistics
| | Contigs | Scaffolds | Super-Scaffolds |
|---|---|---|---|
| Number | 33,372 | 13,506 | 6 |
| N10 | 520.04 kb | 18.89 Mb | 29.60 Mb |
| N20 | 347.02 kb | 12.80 Mb | 29.60 Mb |
| N50 | 168.18 kb | 7.25 Mb | 23.03 Mb |
| N80 | 40.96 kb | 1.02 Mb | 22.90 Mb |
| N90 | 3.46 kb | 0.01 Mb | 19.46 Mb |
| Length | 137.97 Mb | 140.63 Mb | 121.04 Mb |
Note that 86.1% of the assembly was assigned to chromosomes (42 scaffolds) and the other 13.9% remains in the 13,464 unplaced scaffolds.
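The Nx values in the table follow the usual assembly-statistics definition: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches x% of the assembly length. The minimal Python sketch below illustrates that definition and rechecks the placed fraction quoted in the note. The function name `nx` and the toy length list are hypothetical and for illustration only; this is not the authors' pipeline, and only the two totals (121.04 Mb and 140.63 Mb) are taken from the table.

```python
def nx(lengths, x):
    """Return Nx: the length of the sequence at which the cumulative sum of
    lengths (sorted longest first) first reaches x% of the total length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Cross-multiplied comparison avoids dividing at the threshold.
        if running * 100 >= total * x:
            return length


# Toy scaffold lengths (hypothetical values, not the D. guanche data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)

# Fraction of the assembly placed on chromosomes, from the table totals:
# super-scaffold length (121.04 Mb) over scaffold length (140.63 Mb).
placed_pct = round(121.04 / 140.63 * 100, 1)

print(n50, n90, placed_pct)
```

Because a higher x requires covering more of the assembly with progressively shorter sequences, Nx can only decrease or stay flat as x grows, which matches the N10 to N90 trend in each column of the table.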
Fig. 1. Super-scaffolds obtained for each chromosome of D. guanche by placing 42 scaffolds via FISH on the species' polytene chromosomes. The name of each D. guanche chromosome (A, J, U, E, O, and dot) is indicated on the left side of the corresponding super-scaffold, followed in parentheses by the name of the corresponding Muller element (A, D, B, C, E, and F, respectively). Colored arrows indicate the orientation of each scaffold included in a super-scaffold, whereas nonoriented scaffolds are represented by colored boxes. *The breakage-prone nature of the most centromere-proximal part of polytene chromosomes in cytological preparations introduces some uncertainty in the orientation of these centromere-proximal scaffolds.
Fig. 2. CIRCOS representation of the distribution of genes, repeats, and the subset of repeats corresponding to the SGM and sat290 sequences on each assembled Drosophila guanche chromosome as well as on unplaced scaffolds. The number of elements in each nonoverlapping 100-kb window is plotted as a histogram. The y-axis range is set to the maximum value observed per track, with the exception of the SGM track, which uses the same scale as the sat290 track to better visualize the relative abundance of these satellites. The x-axis is labeled in units of Mb for each chromosome.
Genome Annotation Statistics
| | Protein Coding | lncRNAs |
|---|---|---|
| Number of genes | 13,453 | 3,324 |
| Median gene length (bp) | 2,262 | 624 |
| Number of transcripts | 21,088 | 3,732 |
| Median transcript length (bp) | 1,719 | 587 |
| Median coding sequence length (bp) | 1,203 | – |
| Median exon length (bp) | 282 | 411 |
| Median intron length (bp) | 70 | 72 |
| Median UTR length (bp) | 1,020 | – |
| Coding GC content | 55.12% | – |
| Exons/transcript | 4.16 | 1.36 |
| Transcripts/gene | 1.56 | 1.12 |
| Multiexonic transcript (%) | 82 | 25 |
Fig. 3. Schematic representation of the biological processes related to genome stability that involve some of the candidate genes for adaptive evolution in the Drosophila guanche lineage.