Cristian Cuevas-Caballé1, Joan Ferrer Obiol1,2, Joel Vizueta1,3, Meritxell Genovart4, Jacob Gonzalez-Solís5, Marta Riutort1, Julio Rozas1.
Abstract
The Balearic shearwater (Puffinus mauretanicus) is the most threatened seabird in Europe and a member of the most speciose group of pelagic seabirds, the order Procellariiformes, which exhibit extreme adaptations to a pelagic lifestyle. The fossil record suggests that human colonisation of the Balearic Islands resulted in a sharp decrease of the Balearic shearwater population size. Currently, populations of the species continue to be decimated, mainly due to predation by introduced mammals and bycatch in longline fisheries, with some studies predicting its extinction by 2070. Here, using a combination of short and long reads, we generate the first high-quality reference genome for the Balearic shearwater, with a completeness amongst the highest across available avian species. We used this reference genome to study critical aspects relevant to the conservation status of the species and to gain insights into the adaptation to a pelagic lifestyle of the order Procellariiformes. We detected relatively high levels of genome-wide heterozygosity in the Balearic shearwater despite its reduced population size. However, the reconstruction of its historical demography uncovered an abrupt population decline potentially linked to a reduction of the neritic zone during the Penultimate Glacial Period (∼194-135 ka). Comparative genomics analyses uncover a set of candidate genes that may have played an important role in the adaptation of Procellariiformes to a pelagic lifestyle, including genes for the enhancement of fishing capabilities, night vision, and the development of natriuresis. The reference genome obtained will be crucial to the future development of genetic tools for conservation efforts in this Critically Endangered species.
Keywords: Balearic shearwater; Procellariiformes marine adaptation; comparative genomics; conservation genomics
Year: 2022 PMID: 35524941 PMCID: PMC9117697 DOI: 10.1093/gbe/evac067
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 4.065
Sequencing Data, Library Information, and Samples Used in this Study
| Library | Total Number of Base Pairs | Number of Reads | Coverage | Individual Code | Location |
|---|---|---|---|---|---|
| HiSeq X Ten – TruSeq DNA PCR Free. 2 × 150 bp | 143,765,593,200 | 958,437,288 | 118× | Male-Mll | Sa Cella (Mallorca) |
| ONT – Ligation kit SQK-LSK109 1D | 12,142,789,693 | 2,576,486 | 10× | Unsexed-Ei | Sa Conillera (Eivissa) |
| NovaSeq 6000 – TruSeq RNA Sample Prep Kit v2. 2 × 100 bp | 14,997,592,000 | 149,975,920 | – | Chick-Mll | Conills islet (Mallorca) |
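The coverage column can be sanity-checked with a short calculation. The sketch below assumes that depth is simply the total number of sequenced bases divided by the assembly length reported in the assembly metrics table; the source does not state which genome size was used, so this divisor is an assumption, as are the dictionary keys and the helper name `mean_coverage`.

```python
# Assumption: coverage = total sequenced bases / assembly length.
# The source does not say which genome size the authors divided by.
ASSEMBLY_LENGTH_BP = 1_218_519_395  # "Assembly length (bp)" from the metrics table

# Total base pairs per DNA library, copied from the sequencing table.
# The keys are illustrative labels, not identifiers from the study.
libraries = {
    "illumina_hiseq_x_ten": 143_765_593_200,
    "ont_ligation": 12_142_789_693,
}


def mean_coverage(total_bases, genome_size_bp=ASSEMBLY_LENGTH_BP):
    """Return the mean sequencing depth as total bases over genome size."""
    return total_bases / genome_size_bp


coverage = {name: mean_coverage(bases) for name, bases in libraries.items()}

for name, depth in coverage.items():
    print(f"{name}: {depth:.2f}x")
```

Under this assumption the two DNA libraries round to the 118× and 10× values given in the table. The RNA-seq library has no coverage entry because depth over a genome is not a meaningful summary for transcriptome reads.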
Balearic Shearwater Genome Assembly Metrics
| Metric | Value |
|---|---|
| Assembly length (bp) | 1,218,519,395 |
| Number of scaffolds | 4,169 |
| Longest scaffold (Mbp) | 11.06 |
| N50 (Mbp) | 2.13 |
| L50 | 164 |
| GC content (%) | 42.52 |
| Repetitive content (%) | 9.95 |
| Mitogenome (bp) | 19,855 |
| No. protein-coding genes | 21,959 |
| BUSCO complete (%) | 95.9 |
| BUSCO complete, single copy (%) | 95.6 |
| BUSCO complete, duplicated (%) | 0.3 |
| BUSCO fragmented (%) | 1.1 |
| BUSCO missing (%) | 3.0 |
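For readers unfamiliar with the contiguity metrics in this table, the following is a minimal sketch of the standard definitions of N50 and L50. It is not the tool the authors used. The function name `n50_l50` and the toy scaffold lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken
    over scaffolds sorted from longest to shortest, first reaches half of
    the assembly length. L50 is how many scaffolds that takes.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one scaffold")


# Toy scaffold lengths (hypothetical, not taken from the assembly).
toy_n50, toy_l50 = n50_l50([10, 8, 6, 4, 2])
print(toy_n50, toy_l50)
```

Read this way, the reported values mean that the 164 longest scaffolds, each at least 2.13 Mbp long, together span half of the 1.22 Gbp assembly.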
Fig. 1. (a) Map depicting the known Balearic shearwater breeding colonies in the Balearic Islands. Circle size is proportional to population size as shown in the legend. Modified from Arcos (2011). (b) Snail plot summarizing genome assembly statistics (Challis et al. 2020). From inside to outside, the light-grey spiral shows the cumulative scaffold count on a log scale with white scale lines depicting changes of order of magnitude. Dark-grey segments show the distribution of scaffold lengths, and the plot radius is scaled to the longest scaffold (shown in red). Orange and light-orange rings represent the N50 and N90 scaffold lengths, respectively. Blue and light-blue rings show GC, AT, and N percentages along the genome assembly. (c) MSMC2 reconstruction of effective population size estimates (Ne) over time, estimated using a generation time of 12.8 years and a mutation rate (μ) of 2.89 × 10⁻⁹ substitutions per nucleotide per generation. Light-brown vertical bars represent interglacial periods. Upper panel represents global temperature changes as inferred from the EPICA (European Project for Ice Coring in Antarctica) Dome C ice core (Augustin et al. 2004). Lower panel represents sea level changes inferred from a stack of 57 globally distributed benthic δ¹⁸O records (Lisiecki and Raymo 2005). Balearic shearwater illustration by Martí Franch reproduced with permission.
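The generation time and mutation rate in panel (c) are what turn the raw MSMC2 output into years and effective population sizes. The sketch below applies the conversion conventionally used for MSMC-style output, assuming time is reported in mutation-scaled units and each interval has a scaled coalescence rate λ: years = t / μ × g and Ne = 1 / (2μλ). Only μ and g come from the caption. The function name `scale_msmc` and the example inputs are hypothetical, and the source does not describe the authors' exact post-processing.

```python
MU = 2.89e-9             # substitutions per nucleotide per generation (caption)
GENERATION_YEARS = 12.8  # generation time in years (caption)


def scale_msmc(scaled_time, coalescence_rate, mu=MU, gen=GENERATION_YEARS):
    """Convert one MSMC-style interval to (years before present, Ne).

    Assumes `scaled_time` is in mutation-scaled units and
    `coalescence_rate` is the scaled coalescence rate (lambda).
    """
    years = scaled_time / mu * gen
    ne = (1.0 / coalescence_rate) / (2.0 * mu)
    return years, ne


# Hypothetical interval, chosen only to illustrate the arithmetic.
years, ne = scale_msmc(scaled_time=2.89e-5, coalescence_rate=1000.0)
print(f"{years:.0f} years ago, Ne = {ne:.0f}")
```

Because both axes scale with μ, and the time axis also with g, the absolute dates and population sizes in the plot depend directly on these two assumed parameters.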
Fig. 2. Comparison of genome-wide heterozygosity among Procellariiformes. (a) Density plots showing the distribution of individual nucleotide diversity (π) values in nonoverlapping 25 kb windows for each of the eight Procellariiformes species with an available reference genome. Scientific names of large-bodied (>450 g) and small-bodied (<200 g) species are shown in green and orange, respectively. The color scale represents tail probabilities of π values, as shown in the legend. The white line depicts median values and black lines depict the 25th and 75th percentiles. (b) Density plots showing the distribution of π values in the large-bodied and small-bodied species groups.
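As a rough illustration of what a per-window π value means for a single diploid individual, the sketch below counts heterozygous sites in nonoverlapping 25 kb windows along one scaffold and divides by the window size. It makes a strong simplifying assumption that every site in a window is callable. The source does not describe the authors' actual pipeline, and the function name and input positions are hypothetical.

```python
from collections import Counter

WINDOW_BP = 25_000  # window size stated in the figure caption


def windowed_heterozygosity(het_positions, scaffold_length, window=WINDOW_BP):
    """Return per-window heterozygosity along one scaffold.

    `het_positions` are 1-based coordinates of heterozygous sites.
    Simplifying assumption: all `window` sites in each window are callable.
    The trailing partial window, if any, is dropped.
    """
    n_windows = scaffold_length // window
    counts = Counter((pos - 1) // window for pos in het_positions)
    return [counts.get(i, 0) / window for i in range(n_windows)]


# Hypothetical heterozygous sites on a 60 kb toy scaffold.
pi_windows = windowed_heterozygosity([1, 25_000, 25_001, 60_000], 60_000)
print(pi_windows)
```

A real analysis would divide by the number of callable sites in each window rather than by the full window length, which matters in regions of low coverage or high repeat content.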
Fig. 3. Ultrametric tree based on the 4D CDS ML tree calibrated with r8s. The minimum numbers of gene gains (green) and losses (red) per branch, as inferred by the BadiRate analysis, are shown. Numbers at ancestral nodes and at the tips (in parentheses) indicate the inferred number of genes. Illustrations of seabird species were reproduced with permission from Lynx Edicions and Martí Franch.