Literature DB >> 33693605

Characterization of a pericentric inversion in plateau fence lizards (Sceloporus tristichus): evidence from chromosome-scale genomes.

Ana M Bedoya1, Adam D Leaché2.   

Abstract

Spiny lizards in the genus Sceloporus are a model system among squamate reptiles for studies of chromosomal evolution. While most pleurodont iguanians retain an ancestral karyotype formula of 2n = 36 chromosomes, Sceloporus exhibits substantial karyotype variation ranging from 2n =  22 to 46 chromosomes. We present two annotated chromosome-scale genome assemblies for the Plateau Fence Lizard (Sceloporus tristichus) to facilitate research on the role of pericentric inversion polymorphisms on adaptation and speciation. Based on previous karyotype work using conventional staining, the S. tristichus genome is characterized as 2n =  22 with six pairs of macrochromosomes and five pairs of microchromosomes and a pericentric inversion polymorphism on chromosome 7 that is geographically variable. We provide annotated, chromosome-scale genomes for two lizards located at opposite ends of a dynamic hybrid zone that are each fixed for different inversion polymorphisms. The assembled genomes are 1.84-1.87 Gb (1.72 Gb for scaffolds mapping to chromosomes) with a scaffold N50 of 267.5 Mb. Functional annotation of the genomes resulted in ∼15K predicted gene models. Our assemblies confirmed the presence of a 4.62-Mb pericentric inversion on chromosome 7, which contains 62 annotated coding genes with known functions. In addition, we collected population genomics data using double digest RAD-sequencing for 44 S. tristichus to estimate population structure and phylogeny across the Colorado Plateau. These new genomic resources provide opportunities to perform genomic scans and investigate the formation and spread of pericentric inversions in a naturally occurring hybrid zone.
© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.

Entities:  

Keywords:  zzm321990 Sceloporuszzm321990 ; chromosome rearrangement; ddRADseq; genome sequencing; hybrid zone; pericentric inversion

Mesh:

Year:  2021        PMID: 33693605      PMCID: PMC8022919          DOI: 10.1093/g3journal/jkab036

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Introduction

Chromosomal rearrangements play important roles in adaptation, divergence, and speciation (Wellenreuther and Bernatchez 2018). Pericentric chromosome inversions, which are classified as inversions that include a centromere, impose significant evolutionary and ecological constraints on genome evolution by trapping alleles on inverted chromosome segments. This greatly reduces recombination with noninverted regions and increases linkage disequilibrium within inverted regions (Kirkpatrick 2010). Furthermore, when locally adapted alleles are located inside of an inversion, such as those that are ecologically relevant, the inversion can spread to fixation in a population (Kirkpatrick and Barton 2006). The phrynosomatid lizard genus Sceloporus is a diverse clade containing 108 species with a broad distribution across North America (Leaché ). Differentiation in the fundamental number of chromosomes is hypothesized to be a factor responsible for driving the rapid diversification of the genus (Hall 2009; Leaché and Sites Jr. 2009). While many species have the ancestral karyotype formula of 2n =  36 chromosomes, Sceloporus exhibits substantial karyotype variation (ranging from 2n =  22 to 46), sex chromosome evolution (Lisachov ), and large genome rearrangements (Leaché and Sites Jr. 2009). Existing genomic resources for Sceloporus include an annotated genome for S. undulatus (Westfall ), a de novo assembled shotgun genome for S. occidentalis and partial genomes for 34 other species (Arthofer ). Here, we present two annotated chromosome-scale genome assemblies for the Plateau Fence Lizard (Sceloporus tristichus), which is 2n = 22 with six pairs of macrochromosomes and five pairs of microchromosomes, a karyotype formula shared with all 10 member of the undulatus species group (Leaché ). An early study of the karyological differences within the group using conventional staining techniques identified a distinctive pericentric inversion polymorphism on chromosome 7 that varies geographically (Cole 1972). This inversion polymorphism produces distinctly recognizable variants classified by the position of the centromere (Cole 1972), and individuals with heteromorphic pairs of distinct chromosome 7 inversions can be found in a dynamic hybrid zone (Leaché and Cole 2007). The hybrid zone occurs in Arizona’s Colorado Plateau at the ecotone between Great Basin Conifer Woodland and Grassland habitats (Leaché and Cole 2007). Temporal comparisons of clinal variation in kayotypes, morphology, mitochondrial DNA, and SNPs suggests that the hybrid zone is moving, possibly as a response to habitat modification (Leaché and Cole 2007; Leaché ). The new genomic resources, reported here, provide a framework for increasing our understanding of the formation and spread of large chromosome inversions across broad temporal and spatial scales.

Materials and methods

Sampling, library preparation, and sequencing

We sequenced two female specimens of S. tristichus collected from a hybrid zone in Navajo County, Arizona, USA in 2002. The two individuals were karyotyped to determine the position of the centromere on chromosome 7 in a previous study (Leaché and Cole 2007). One specimen (NCBI BioSample ID SAMN15450893; voucher specimen AMNH 153954) is from the northern end of the hybrid zone (Holbrook; HOL), and is submetacentric at chromosome 7 (a nonmedial centromere that is closer to the middle than to either end of the chromosome). The second specimen (NCBI BioSample ID SAMN15450894; voucher specimen AMNH 154032) is from the southern end of the hybrid zone (Snowflake; SNOW), and is telocentric at chromosome 7 (a terminal centromere that is close to the end of the chromosome) (Table S1). Genome sequencing was performed by Dovetail Genomics (Santa Barbara, CA, USA). High molecular weight genomic DNA was extracted from flash frozen liver tissues and validated for large fragment sizes (50–100 kb) using pulsed field gel electrophoresis. 10X Chromium libraries were prepared using the Chromium Genome Reagent Kit (10x Genomics, Pleasanton, CA, USA) followed by Illumina HiSeqX sequencing (150-bp paired-end reads). One Chicago (Putnam ) and one Hi-C library (Belton ) were prepared for each sample. Quality control for these libraries was performed by mapping reads (75-bp paired-end MiSeq reads) to draft 10x Supernova assemblies. Finally, both libraries were sequenced on an Illumina HiSeqX (150-bp paired-end reads). Raw sequencing reads are deposited in the NCBI Sequence Read Archive (BioProjectID PRJNA644186).

Genome assembly and annotation

For each individual, a draft genome was de novo assembled from 709.35 million 10X Chromium sequence reads using the SuperNova assembly pipeline. Chromosome-scale scaffolding was achieved by mapping the Chicago and Dovetail Hi-C libraries back to the 10x Supernova assembly (Weisenfeld ) using the HiRise software pipeline (Putnam ). Functional annotation of the two genomes was conducted with Funannotate v1.7.2. (Palmer and Stajich 2019) on a 64-core Ubuntu machine and the Hyak NextGen cluster supercomputer at the University of Washington. Briefly, Funnanotate aligns raw RNASeq reads to a genome sequence with minimap2 and assembles them using Trinity Grabherr . Such assemblies are used along with PASA predictions Haas to build consensus gene model predictions, and train Augustus Stanke and GeneMark-ES/ET Lomsadze . Repetitive regions are softmasked in Funannotate and the softmasked genomes are passed to the ab initio predictors. We used the Tetrapoda BUSCO (Benchmarking Universal Single-Copy Orthologs) dataset for gene prediction (Simão ) and specified the option –repeats2evm, which passes repeat information to Evidence Modeler. Finally, Funnanotate performs genome functional annotations using the outputs of PFAM (Finn ), MEROPS (Rawlings ), InterProScan5 (Jones ), eggNOG-mapper (Huerta-Cepas ), and Phobius (Käll ). We ran the pipeline for each of the two HiRise genome assemblies and included a raw RNASeq dataset prepared from skeletal muscle of an adult female of S. undulatus (BioSample SAMN06312743) to build consensus gene model predictions. Functional annotation was performed on the 11 scaffolds or putative chromosomes, which included 86.4% and 91.3% of the total length of the HiRise assemblies (Table 1).
Table 1

Summary statistics for the HiRise genome assemblies

MetricHOLSNOW
Assembly size1,870.611,844.30
Scaffolds >10 kb25,615.68X32,060.90X
L5033
N50269.061267.474
L901510
N9017.36736.595
Chromosome 1366.851366.429
Chromosome 2304.053319.091
Chromosome 3269.060267.474
Chromosome 4249.456247.712
Chromosome 5173.679170.938
Chromosome 6164.133162.452
Chromosome 750.42450.665
Chromosome 838.60138.926
Chromosome 936.60236.595
Chromosome 1019.41316.366
Chromosome 1111.80911.585

Sequence lengths are given in megabases. L50, minimum number of scaffolds that make up 50% of the total assembly length; N50, minimum scaffold size of fragments that make up 50% of the total assembly length; L90, minimum number of scaffolds that make up 90% of the total assembly length; N90, minimum scaffold size of fragments that make up 90% of the total assembly length.

Summary statistics for the HiRise genome assemblies Sequence lengths are given in megabases. L50, minimum number of scaffolds that make up 50% of the total assembly length; N50, minimum scaffold size of fragments that make up 50% of the total assembly length; L90, minimum number of scaffolds that make up 90% of the total assembly length; N90, minimum scaffold size of fragments that make up 90% of the total assembly length.

Pericentric inversion polymorphism

To identify the locations of inversions on chromosome 7 between HOL and SNOW we used progressive Mauve v.2.4.0 (Darling ). The program was run with default “seed families” and default values for all other parameters. We determined the identity of large scaffolds by comparing S. tristichus assemblies to the assembled chromosome-scale genome of the closely related species S. undulatus (Westfall ) (Table S3 and S4). Furthermore, HiCUP v0.8.0 (Wingett et al. 2015) was used to map the Hi-C paired end raw sequencing reads of HOL against the assembled chromosome 7 of SNOW using bowtie2 v.2.4.2. Briefly, HiCUP was used to digest the SNOW genome with DpnII, truncate the HOL raw reads at the putative Hi-C ligation junction, map Hi-C raw reads to the SNOW digested reference genome, and remove Hi-C artefacts and putative PCR duplicates. A pairwise interaction matrix was computed using 500-kb windows with HOMER (Heinz ). The matrix was normalized for sequence depth. Finally, a normalized linkage density heatmap of the first 25 Mb was generated with TreeView3 (Page 1996).

Population genomics of the plateau fence lizard

To investigate the population structure of S. tristichus, we collected ddRADseq data for 44 samples from across the Colorado Plateau, USA (Table S5). The demultiplexed data are available at the NCBI Sequence Read Archive (SAMN14488621–SAMN14488665). The SNOW genome presented here was used as a reference to guide the assembly and alignment. To estimate the number of populations (K), we used PCA and discriminant analysis of principal components (DAPCs). A concatenated phylogenetic analysis of the SNP data using maximum likelihood with an acquisition correction model was estimated using one sample of Sceloporus cowlesi to root the phylogenetic tree. Detailed methods for sequencing and analysis of the population genomics data are provided in the Supplementary Materials.

Data availability

The raw data for genome sequencing underlying this article are available at the NCBI Sequence Read Archive (SRA) under Bioproject ID: PRJNA616379 (SRR12147734–SRR12147739). GenBank accession numbers for the two genome assemblies are JACSCI000000000 (HOL) and JACSCJ000000000 (SNOW). ddRADseq raw reads are under NCBI SRA accessions (SAMN14488621–SAMN14488665). Supplementary data including annotations and Supplemental Methods, Tables S1–S8, and Figures S1–S10 are available at figshare: https://doi.org/10.6084/m9.figshare.13180268.

Results and discussion

Assembly and annotation

10X SuperNova assemblies included 31,453 and 30,454 scaffolds at 53.06X and 53.88X coverage, with a total assembly length of 1870.17 and 1843.78 Mb. In turn, the final HiRise assemblies resulted in 27,095 and 25,281 scaffolds at 25,615X and 32,060X coverage, adding to a total length of 1,870.61 and 1,844.30 Mb for HOL and SNOW, respectively. Genome assembly statistics are provided in Table 1 and in the Supplementary Material (Figures S1 and S2; Table S2 and S3). The top 11 scaffolds account for 86.44% and 91.79% of the total length of the HOL and SNOW assemblies, and comparisons to the S. undulatus genome confirmed that these scaffolds are the 11 chromosomes (Supplementary Figures S3 and S4). The top six scaffolds are on average ∼206 Mb longer than the subsequent five, which is expected given that S. tristichus has 6 macrochromsomes. Functional annotation of the genomes resulted in up to 14,128 and 15,469 predicted gene models for HOL and SNOW, respectively (https://doi.org/10.6084/m9.figshare.13180268). Chromosome 7 spanned 864 annotated genes for HOL, whereas 1,030 were identified for SNOW. The higher number of annotated genes in SNOW is likely the result of a higher quality and more complete assembly compared to HOL as interpreted from the assembly summary statistics in Table 1. Alignment of chromosome 7 with Mauve and HOL Hi-C reads mapping to chromosome 7 in SNOW resulted in the detection of a 4.62-Mb pericentric inversion (Figures 1 and 2). A chromosomal rearrangement at chromosome 7 distinguishes populations of S. tristichus located at opposite ends of the hybrid zone and is expected to reduce recombination in hybrids with heteromorphic pairs of chromosome 7, which are found at the center of the hybrid zone.
Figure 1

Pericentric inversion polymorphism on chromosome 7 in Sceloporus tristichus. (A) Karyotype of a naturally occurring hybrid from Arizona with a heteromorphic pair of chromosome 7 (telocentric + submetacentric; specimen voucher AMNH 112492; photo courtesy of Charles J. Cole). (B) Chromosome 7 alignment using Mauve for the two S. tristichus genomes sequenced in this study. The samples used for genome sequencing are from opposite sides of a hybrid zone (Holbrook and Snowflake). The two scaffolds that span chromosome 7 in HOL are indicated with a red line. Color bars indicate syntenic blocks, and connecting lines indicate correspondence of blocks across genomes. Blocks below the center line indicate regions that are inverted. The height of the bars indicates the average level of conservation in that region of the genome. (C) Population structure results using ddRADseq data for 44 samples assuming a K = 2 model supports northern and southern populations that come into close geographic proximity at a hybrid zone in Arizona between Holbrook (northern population) and Snowflake (southern population). (D) Annotated genes located inside the pericentric inversion. Genes marked with an asterisk were not annotated for HOL so their position in SNOW is given.

Figure 2

Normalized linkage density heatmap of the first 25 Mb of chromosome 7 of S. tristichus. Bins of 500 kb were used. The heatmap is visualized as log2 ratios of observed interactions relative to expected interactions. Raw Hi-C sequence reads of HOL were mapped to SNOW. Blue squares indicate high interactions that are visible at the length reported for the inversion in this study (4.62MB). Interactions along the diagonal are the highest, as expected. The blue arrow indicates the approximate number of interaction counts at the inversion.

Pericentric inversion polymorphism on chromosome 7 in Sceloporus tristichus. (A) Karyotype of a naturally occurring hybrid from Arizona with a heteromorphic pair of chromosome 7 (telocentric + submetacentric; specimen voucher AMNH 112492; photo courtesy of Charles J. Cole). (B) Chromosome 7 alignment using Mauve for the two S. tristichus genomes sequenced in this study. The samples used for genome sequencing are from opposite sides of a hybrid zone (Holbrook and Snowflake). The two scaffolds that span chromosome 7 in HOL are indicated with a red line. Color bars indicate syntenic blocks, and connecting lines indicate correspondence of blocks across genomes. Blocks below the center line indicate regions that are inverted. The height of the bars indicates the average level of conservation in that region of the genome. (C) Population structure results using ddRADseq data for 44 samples assuming a K = 2 model supports northern and southern populations that come into close geographic proximity at a hybrid zone in Arizona between Holbrook (northern population) and Snowflake (southern population). (D) Annotated genes located inside the pericentric inversion. Genes marked with an asterisk were not annotated for HOL so their position in SNOW is given. Normalized linkage density heatmap of the first 25 Mb of chromosome 7 of S. tristichus. Bins of 500 kb were used. The heatmap is visualized as log2 ratios of observed interactions relative to expected interactions. Raw Hi-C sequence reads of HOL were mapped to SNOW. Blue squares indicate high interactions that are visible at the length reported for the inversion in this study (4.62MB). Interactions along the diagonal are the highest, as expected. The blue arrow indicates the approximate number of interaction counts at the inversion. A total of 62 annotated genes with known function are found within the region that spans the inversion (Figure 1 and Supplementary Material Table S4). Whether any of these genes are involved in local adaptation, functional enrichment, or fixation of the inversion remains to be examined. The large size of this inversion makes it more likely that it would span multiple loci involved in local adaptation. However, it also makes it more likely to suffer meiotic costs that could reduce hybrid fitness (Kirkpatrick and Barton 2006). Adding sequencing data to refine our assemblies would help to confirm the extent of the inversion, as chromosome 7 in HOL included two scaffolds. As we used S. undulatus as a reference to confirm our chromosome-scale assemblies, we expect to find 1:1 synteny with S. undulatus (Westfall ). Therefore, the pericentric inversion could be much larger than reported here, since it is possible that either one or both of the chromosome 7 scaffolds in HOL are reversed. Reference-based assembly of the ddRADseq data for 44 S. tristichus provided 8,652 loci mapping to scaffolds and a total of 20,299 SNPs (Supplementary Table S7). PCA analysis revealed regional population structuring across the Colorado Plateau, and DAPC results provide biologically plausible models for K = 2, 3, and 4 (Supplementary Figures S5–S9). The hybrid zone samples (HOL and SNOW) belong to distinct populations under each of these models; HOL is a member of a northern population that extends north across the Colorado Plateau into Utah, Colorado, and Wyoming, while SNOW clusters with other southern samples from the Mogollon Rim and central Arizona. The phylogenetic analysis used 8,130 biallelic SNPs and supported clades that are concordant with the populations identified using PCA and DAPC (Supplementary Figure S10 and Supplementary Table S8). This analysis provides an evolutionary context for studying the formation of the hybrid zone, which appears to be the result of secondary contact and not primary divergence (Supplementary Figure S10). These new genomic resources provide a starting point for understanding how a pericentric inversion polymorphism could cycle in frequency in a dynamic hybrid zone that is moving through time. Identifying the location, size, and gene content of this inversion polymorphism provides the necessary tools for investigating the role of inversions on local adaptation and speciation. Click here for additional data file.
  26 in total

1.  Mauve: multiple alignment of conserved genomic sequence with rearrangements.

Authors:  Aaron C E Darling; Bob Mau; Frederick R Blattner; Nicole T Perna
Journal:  Genome Res       Date:  2004-07       Impact factor: 9.043

2.  Chromosome variation, genomics, speciation and evolution in Sceloporus lizards.

Authors:  W P Hall
Journal:  Cytogenet Genome Res       Date:  2010-03-25       Impact factor: 1.636

3.  Genomic resources notes accepted 1 August 2014-30 September 2014.

Authors:  Wolfgang Arthofer; B L Banbury; Miguel Carneiro; Francesco Cicconardi; Thomas F Duda; R B Harris; David S Kang; A D Leaché; Viola Nolte; Coralie Nourisson; Nicola Palmieri; Birgit C Schlick-Steiner; Christian Schlötterer; Fernando Sequeira; Cheolho Sim; Florian M Steiner; Marcelo Vallinoto; David A Weese
Journal:  Mol Ecol Resour       Date:  2014-11-25       Impact factor: 7.090

4.  Evidence for concerted movement of nuclear and mitochondrial clines in a lizard hybrid zone.

Authors:  Adam D Leaché; Jared A Grummer; Rebecca B Harris; Ian K Breckheimer
Journal:  Mol Ecol       Date:  2017-02-18       Impact factor: 6.185

5.  Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities.

Authors:  Sven Heinz; Christopher Benner; Nathanael Spann; Eric Bertolino; Yin C Lin; Peter Laslo; Jason X Cheng; Cornelis Murre; Harinder Singh; Christopher K Glass
Journal:  Mol Cell       Date:  2010-05-28       Impact factor: 17.970

6.  Gene identification in novel eukaryotic genomes by self-training algorithm.

Authors:  Alexandre Lomsadze; Vardges Ter-Hovhannisyan; Yury O Chernoff; Mark Borodovsky
Journal:  Nucleic Acids Res       Date:  2005-11-28       Impact factor: 16.971

7.  InterProScan 5: genome-scale protein function classification.

Authors:  Philip Jones; David Binns; Hsin-Yu Chang; Matthew Fraser; Weizhong Li; Craig McAnulla; Hamish McWilliam; John Maslen; Alex Mitchell; Gift Nuka; Sebastien Pesseat; Antony F Quinn; Amaia Sangrador-Vegas; Maxim Scheremetjew; Siew-Yit Yong; Rodrigo Lopez; Sarah Hunter
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937

8.  Phylogenomics of a rapid radiation: is chromosomal evolution linked to increased diversification in north american spiny lizards (Genus Sceloporus)?

Authors:  Adam D Leaché; Barbara L Banbury; Charles W Linkem; Adrián Nieto-Montes de Oca
Journal:  BMC Evol Biol       Date:  2016-03-22       Impact factor: 3.260

9.  The Pfam protein families database: towards a more sustainable future.

Authors:  Robert D Finn; Penelope Coggill; Ruth Y Eberhardt; Sean R Eddy; Jaina Mistry; Alex L Mitchell; Simon C Potter; Marco Punta; Matloob Qureshi; Amaia Sangrador-Vegas; Gustavo A Salazar; John Tate; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2015-12-15       Impact factor: 16.971

10.  HiCUP: pipeline for mapping and processing Hi-C data.

Authors:  Steven Wingett; Philip Ewels; Mayra Furlan-Magaril; Takashi Nagano; Stefan Schoenfelder; Peter Fraser; Simon Andrews
Journal:  F1000Res       Date:  2015-11-20
View more
  2 in total

1.  A chromosome-level genome assembly for the eastern fence lizard (Sceloporus undulatus), a reptile model for physiological and evolutionary ecology.

Authors:  Aundrea K Westfall; Rory S Telemeco; Mariana B Grizante; Damien S Waits; Amanda D Clark; Dasia Y Simpson; Randy L Klabacka; Alexis P Sullivan; George H Perry; Michael W Sears; Christian L Cox; Robert M Cox; Matthew E Gifford; Henry B John-Alder; Tracy Langkilde; Michael J Angilletta; Adam D Leaché; Marc Tollis; Kenro Kusumi; Tonia S Schwartz
Journal:  Gigascience       Date:  2021-10-01       Impact factor: 6.524

2.  The recombination landscapes of spiny lizards (genus Sceloporus).

Authors:  Cyril J Versoza; Julio A Rivera; Erica Bree Rosenblum; Cuauhcihuatl Vital-García; Diana K Hews; Susanne P Pfeifer
Journal:  G3 (Bethesda)       Date:  2022-02-04       Impact factor: 3.542

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.