
Highly dense linkage maps from 31 full-sibling families of turbot (Scophthalmus maximus) provide insights into recombination patterns and chromosome rearrangements throughout a newly refined genome assembly.

F Maroso1, M Hermida2, A Millán1, A Blanco2, M Saura3, A Fernández3, G Dalla Rovere4, L Bargelloni4, S Cabaleiro5, B Villanueva3, C Bouza2, P Martínez2.   

Abstract

Highly dense linkage maps enable positioning thousands of landmarks useful for anchoring the whole genome and for analysing genome properties. Turbot is the most important cultured flatfish worldwide, and breeding programs, now in the fifth generation of selection, aim to improve growth rate, obtain disease-resistant broodstock and understand sex determination to control sex ratio. Using a Restriction-site Associated DNA approach, we genotyped 18,214 single nucleotide polymorphisms in 1,268 turbot individuals from 31 full-sibling families. Individual linkage maps were combined to obtain male, female and species consensus maps. The turbot consensus map contained 11,845 markers distributed across 22 linkage groups representing a total normalised length of 3,753.9 cM. The turbot genome was anchored to this map, and scaffolds representing 96% of the assembly were ordered and oriented to obtain the expected 22 megascaffolds according to its karyotype. Recombination rate was lower in males, especially around centromeres, and pairwise comparison of 44 individual maps suggested chromosome polymorphism at specific genomic regions. Genome comparison across flatfish provided new evidence on karyotype reorganisations occurring across the evolution of this fish group.


Year:  2018        PMID: 29897548      PMCID: PMC6105115          DOI: 10.1093/dnares/dsy015

Source DB:  PubMed          Journal:  DNA Res        ISSN: 1340-2838            Impact factor:   4.458


1. Introduction

The information provided by genomes has become invaluable for research fields that deal with animal production, conservation biology or evolution. High-density (HD) genetic maps provide information on genome organisation by establishing the cartography of thousands of markers, which can help to disentangle the genetic architecture of productive or evolutionary traits. Moreover, these maps represent an essential tool for comparative genomics and can facilitate whole genome assembly. The advent of next generation sequencing (NGS) has dramatically reduced the cost and time required for genomic analysis, thus contributing to a fast increase of genomic resources. Nevertheless, while DNA sequencing has become an automated task at sequencing platforms, genome assembly still relies on the hierarchical ordering of short DNA fragments (contigs, scaffolds), currently being facilitated by the novel long-read assembly approaches (Nanopore, PacBio, Dovetail, Bionano). HD genetic mapping can assist the process of genome assembly by positioning and ordering scaffolds or by validating bioinformatic assemblies. This is a particularly challenging task in teleosts, considering the ancestral teleost-specific whole genome duplication event in this group of fish. A huge number of markers and samples can now be analysed at very low cost using genotyping by sequencing (GBS) methods, which make it possible to simultaneously identify and genotype thousands of single nucleotide polymorphisms (SNPs) across samples. A popular GBS method is Restriction-site Associated DNA sequencing (RADseq), a technique that combines the power of NGS with the simplification of genomes through restriction enzyme digestion. RADseq is being increasingly used for the construction of HD genetic maps in fish comprising thousands of markers.
These maps have allowed the identification of genomic regions associated with productive traits or with adaptation to specific environments. The pattern of samples and families used in fish for map construction has been quite similar, and only very recently has the number of families increased. However, when many families have been analysed, the number of offspring per family has been much smaller (e.g. Tsai et al. 2016: 60 families with fewer than 10 offspring each; Palaiokostas et al. 2016: 75 families with 10 offspring on average). This is a way to achieve a consensus reference map at species level by averaging inter-individual variation, but the small family size can result in suboptimal estimation of inter-marker distances and marker order. The presence of genomic reorganisations and the variation of mapping parameters across individual or sex-specific maps have hardly been addressed to date in fish despite their importance from both evolutionary and practical perspectives. For example, inversions might help to maintain blocks of coadapted genetic variants in specific environments, as reported in fish like cod or stickleback. Moreover, sex-specific mapping features are important when tracking markers associated with economically relevant traits, since recombination patterns in males and females may differ and should therefore be considered for accurate evaluations. NGS technologies now enable studying intra-specific variation in genome organisation and recombination with enough depth to improve the analysis of associations with traits of interest. This would allow more accurate mining to identify the responsible genes and mutations and to choose the best strategy for marker-assisted selection. The turbot (Scophthalmus maximus) is the main aquaculture flatfish worldwide, with the People’s Republic of China currently the most important producer (>60,000 tons/year). Genetic breeding programs in this species began in the 1990s. 
Nowadays, a broad battery of genetic markers and genomic tools is available in turbot, which can help to understand the genetic basis of the main economically important traits (growth rate, disease resistance, sex control) and to apply more efficient selection. The first genetic maps in turbot, based on anonymous or gene-associated microsatellites and SNPs, were later integrated in a consensus map. More recently, a high-density map was reported by Wang et al., but the scarce information provided on the sequences associated with its genetic markers made it unfeasible to integrate it with previously reported maps. Centromeres have been located in the turbot genetic map using half-tetrad analysis, and recently, a high coverage draft genome has been assembled and anchored to the reported consensus map. Moreover, the development of genomic resources for other important flatfish species and their integration in a common framework has allowed comparative analyses to transfer genomic information on production-related traits between species and to look for insights into flatfish evolution. In this study, we genotyped 18,214 SNPs using the 2b-RAD technology in 1,268 individuals from 31 turbot full-sibling (sib) families with the aim of: (i) obtaining a robust consensus genetic map for the species; (ii) integrating the genetic and physical maps in a common framework to facilitate the transfer of segregation data onto the physical genome; (iii) refining the turbot genome assembly and comparative mapping with other flatfish; and (iv) assessing the variation in genome organisation and recombination between sexes and individuals. Our results rendered much deeper information on intra-specific mapping variation than previously reported in fish and provided useful integrated genomic resources to understand the architecture of economically important traits for further applications in industrial breeding programs.

2. Materials and methods

2.1. Families for mapping

The broodstock of CETGA (Aquaculture Cluster of Galicia, Ribeira, Spain), a population representative of turbot from the Atlantic area, was used to found a set of 44 families within the goals of the FISHBOOST project (EU 613611). This project is focused on improving the efficiency and profitability of European finfish aquaculture by advancing selective breeding to the next level, including the application of genomic selection. This goal also involved the refinement and improvement of existing mapping resources and their integration in a common framework within the reported turbot genome assembly. The experiment was performed under approval of the ethics committee of CETGA to comply with animal welfare legislation. A total of 31 full-sib families were selected among the 44 families considering both successful genotyping and informativeness (Fig. 1). The partly factorial mating of 22 males × 22 females made available 27 half-sib families, either maternal or paternal (eight males and nine females contributed to more than one family). A total of 1,224 offspring were used for linkage analysis, representing on average 39.5 individuals per family (range: 36–45).
Figure 1.

Families selected for linkage analysis in S. maximus. Male (M) × female (F) crosses and the number of offspring per family are indicated. Note the presence of several half-sib families sharing the father or the mother.


2.2. SNP calling and genotyping

2.2.1. DNA extraction and library preparation

Genomic DNA was extracted from fin clips using SSTNE buffer (a TNE buffer modified by adding spermidine and spermine) and a standard NaCl isopropanol precipitation. Library preparation followed the 2b-RAD protocol slightly modified to include replicated polymerase chain reactions (PCRs) per sample (to reduce PCR duplicate bias) and to quantify DNA sampling before pooling (to increase sample coverage homogeneity). Detailed information about the protocol can be found in Supplementary Material text. Briefly, DNA of each sample was separately digested using AlfI restriction enzyme, adaptors including specific sample barcodes ligated, and the resulting fragments amplified. After PCR purification, samples were merged in pools of 128 samples for sequencing and genotyping the offspring, while the 44 parents were genotyped in a single pool, to increase coverage and genotyping accuracy. A total of 11 pools (1 for the parents and 10 for the offspring) were sequenced on an Illumina NextSeq sequencer using a 50 bp single end protocol at BMR Genomics (Padova, Italy).

2.2.2. Raw data filtering and bioinformatics

Raw data were demultiplexed at the sequencing facility. All reads were trimmed to 36 bp and centred on the enzyme recognition site. The process_radtags program of Stacks was used to filter raw reads according to sequencing quality. All reads with an uncalled base were removed, and quality was checked using a sliding window of 25% of the sequence length. If the mean quality score of this window dropped below 20, the read was removed. Filtered reads were mapped against the turbot genome using Bowtie 1.2.0, allowing a maximum of three mismatches in the first 33 bases. Only reads with a unique match to the turbot genome were retained; all others were discarded. Mapped reads from parent samples were fed into the Stacks module ref_map.pl using the bounded model with a significance threshold of 0.05 and an upper bound of 0.05. As a further filter, only SNPs genotyped in 80% of the parents with a minimum read depth of 10 reads were retained using the populations module. When a RAD-tag contained more than one SNP, only the first SNP from the 5' end was retained. The set of SNPs obtained from parents was used as a reference for offspring genotyping using the Stacks module sstacks. The obtained dataset (parents and offspring) was further filtered according to population parameters and Mendelian segregation using the 26 most informative families. Markers showing consistent deviations from Mendelian segregation (P < 0.01) across families, or extreme departures from Hardy–Weinberg equilibrium (HWE) in the parental population (P < 0.001) due to heterozygote excess (FIS < −0.5, P < 0.0001; paralogous genes) or deficit (FIS > 0.5, P < 0.0001; null alleles), were filtered out. Finally, RAD-tags containing SNPs were mapped against the turbot gene catalog using Blastn (E-value < 1e−20) to obtain sequence annotations.
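The FIS-based part of the HWE filter can be illustrated with a minimal sketch. It assumes genotype counts per biallelic SNP in the parental population and applies only the ±0.5 FIS thresholds quoted above; the accompanying significance tests are not reproduced, and the function names and return labels are hypothetical, not part of Stacks or of the authors' scripts.

```python
def fis(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Inbreeding coefficient FIS = 1 - Ho/He for one biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    he = 2 * p * (1 - p)                 # expected heterozygosity under HWE
    if he == 0:
        return 0.0                       # monomorphic marker: no deviation measurable
    ho = n_ab / n                        # observed heterozygosity
    return 1 - ho / he


def classify_marker(n_aa: int, n_ab: int, n_bb: int, threshold: float = 0.5) -> str:
    """Label a SNP using the FIS thresholds only (illustrative labels)."""
    value = fis(n_aa, n_ab, n_bb)
    if value < -threshold:
        return "heterozygote_excess"     # pattern expected for paralogous loci
    if value > threshold:
        return "heterozygote_deficit"    # pattern expected for null alleles
    return "keep"


# Three hypothetical SNPs genotyped in 44 parents
print(classify_marker(0, 44, 0))    # every parent heterozygous
print(classify_marker(22, 0, 22))   # no heterozygotes at all
print(classify_marker(11, 22, 11))  # exact HWE proportions
```

In a real pipeline the FIS value would be combined with a significance test, as described in the text, so that markers with few informative genotypes are not discarded on the point estimate alone.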

2.3. Construction of genetic maps

Genotypes of filtered SNP markers were coded according to the JoinMap cross-pollinator (CP) population type with unknown linkage phase. Markers genotyped in less than 90% of the offspring were excluded from linkage analysis in each family. All genetic maps (individual, family, male consensus, female consensus and species consensus) were constructed using JoinMap 4.1. First, markers were assigned to linkage groups (LGs) using the Grouping function of JoinMap based on a series of LOD scores (the log10 of the ratio between the likelihood that two loci are linked and the likelihood that they segregate independently) increasing by one, from 3 to 10. The LOD score was then selected in each family based on the number of turbot chromosomes (n = 22); in all cases, a LOD > 6.0 was used. Unlinked markers and markers assigned to very small LGs in excess of the haploid number of 22 chromosomes were excluded from further analysis. Second, marker ordering was performed using the Maximum Likelihood algorithm with default parameters, and the recombination frequencies were converted into map distances in centimorgans (cM) using the Kosambi mapping function. For each family, individual female and male maps, as well as a consensus family map, were generated. Due to computational limitations, only the three most informative families were selected to construct the species consensus, the female consensus and the male consensus maps using MergeMap. All maps were drawn using MapChart ver. 2.3. In order to investigate the heterogeneity of recombination frequency (RF) between sexes, RFs were extracted from the JoinMap analysis. The pattern of RF in the father and the mother could not be straightforwardly compared within each family because each parent had a different panel of informative SNPs; consequently, the set of markers shared between both parents (heterozygous in both parents, with 1:2:1 segregation) was selected in each family for a detailed evaluation of RF variation between sexes across LGs.
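For reference, the Kosambi mapping function mentioned above converts a recombination frequency r into a map distance d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The sketch below implements this standard formula and its inverse; it illustrates the conversion only and is not JoinMap's code, and the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for a recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_rf(d_cm: float) -> float:
    """Inverse conversion: recombination frequency for a distance in cM."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


# For small r the distance is close to 100 * r; it grows faster as r nears 0.5
for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```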

2.4. Genetic versus physical map: anchoring scaffolds to linkage groups

The consensus genetic map was used to refine the turbot genome assembly and to obtain a physical map consisting of 22 megascaffolds, representing the complete sequence of each of the 22 LGs of the turbot map. This task involved (i) assigning previously unlinked scaffolds to specific LGs in the new HD genetic map; (ii) correcting scaffold misassemblies by comparing the genetic and physical maps; and (iii) establishing the correct orientation of scaffolds within each megascaffold. All contigs including at least one marker significantly assigned to an LG were anchored in the new map. When markers of a specific scaffold were assigned to more than one LG, the scaffold was split into fragments pertaining to different LGs or to the same LG but in different positions (see Results). To establish the most confident splitting of these scaffolds, we compared the sequence of the gap between the flanking markers assigned to different LGs with the genomes of two model fish, Gasterosteus aculeatus and Oryzias latipes, which had shown extensive synteny with turbot. Then, we looked for the turbot genes in the gap that were assigned to different LGs in the model species in order to narrow down the gap as much as possible (Supplementary Fig. S1). Following this approach, the narrowest gap was finally cut in the middle and the halves were assigned to each LG. Reorganisations were only validated when the information was consistent with both model species. To determine the correct order and orientation of scaffolds within LGs, we obtained the average map position of the markers belonging to each scaffold using the maps of the 10 most informative families. Then, we checked the correlation between the physical position of markers and their genetic position. When the correlation was positive, the original orientation was maintained; when negative, the reverse complementary sequence of the scaffold was retrieved for assembly.
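The orientation rule described above can be sketched as follows, assuming each scaffold comes with the physical positions (bp) of its markers and their averaged genetic positions (cM). The use of a Pearson correlation, the function names and the example values are assumptions made for illustration; the text does not specify which correlation coefficient was used.

```python
from math import sqrt

# Complement table that preserves case, so lowercase (low-confidence) bases stay lowercase
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined without variation")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def orient_scaffold(physical_bp, genetic_cm):
    """Keep the scaffold as is when positions agree, flip it when they oppose."""
    r = pearson(physical_bp, genetic_cm)
    if r > 0:
        return "forward"
    if r < 0:
        return "reverse_complement"
    return "undetermined"


def reverse_complement(seq: str) -> str:
    """Reverse complementary sequence of a scaffold."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical scaffold whose genetic positions decrease along the sequence
print(orient_scaffold([1_000, 250_000, 900_000], [42.1, 40.3, 37.8]))
print(reverse_complement("AACGn"))
```

A scaffold carrying a single marker cannot be oriented this way, which is why the sketch raises an error for fewer than two points.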

2.5. Integrating genetic and physical maps with cytogenetic data

Genetic and physical maps were then integrated with the turbot cytogenetic map using previous information. Starting from centromere positions and considering the relationship between genetic and cytogenetic maps, we established the correspondence between LGs, megascaffolds and chromosomes as far as possible. Then, we compared the consensus genetic map of each chromosome with its physical assembly (megascaffold) to assess their correspondence with centromere, telomere and interstitial heterochromatin locations. When no marker was available close to the centromeric region, the closest markers around the centromere were used as reference or, in a few cases, the farthest marker from the centromere was used to roughly locate the other chromosome end, considering the acrocentric morphology of most turbot chromosomes. Finally, this comparison was also carried out between sexes using the sex-specific consensus maps, in order to assess RF differences across chromosomes between males and females.

2.6. Inter-individual genomic reorganisations in turbot

The availability of 31 family maps and 44 parental maps (Fig. 1) enabled an exhaustive comparison of genetic maps between individuals and sexes to assess inter-individual variation in RF across the genome. This variation, which might be related to chromosome polymorphisms existing in turbot populations, had not been previously addressed in fish with the number of markers and families evaluated here. In order to identify putative genomic reorganisations between individuals, we performed a pairwise comparison between all parental maps by plotting the positions of each marker. Plots were created with R, and the different LGs were labelled with different colours for better visualisation. To handle the amount of data obtained (62 individual maps at 22 LGs, due to half-sib families; Fig. 1), each LG was divided into stretches of 50 SNPs representing ∼1.6 Mb on average (e.g. 23 blocks for LG02, the longest LG with 1,132 SNPs). Then, the correlation between the physical and genetic maps was analysed for each block in each individual map in order to identify potential inversions. A heat map was generated to provide a more straightforward view of the results.
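The block-wise scan can be sketched as below, assuming the markers of one LG in one individual map are sorted by physical position and that a negative correlation within a 50-SNP block marks a candidate inversion. The choice of Pearson correlation, the zero cut-off and all names are illustrative assumptions rather than the authors' exact procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def block_correlations(physical_bp, genetic_cm, block_size=50):
    """Return (block number, correlation) for consecutive stretches of markers."""
    results = []
    for start in range(0, len(physical_bp), block_size):
        xs = physical_bp[start:start + block_size]
        ys = genetic_cm[start:start + block_size]
        results.append((start // block_size + 1, pearson(xs, ys)))
    return results


def candidate_inversions(physical_bp, genetic_cm, block_size=50):
    """Blocks whose genetic order runs against the physical order."""
    return [block for block, r in block_correlations(physical_bp, genetic_cm, block_size)
            if r is not None and r < 0]


# Simulated LG with 150 markers in which the middle block is inverted
physical = [i * 30_000 for i in range(150)]
genetic = ([0.1 * i for i in range(50)]
           + [10.0 - 0.1 * i for i in range(50)]
           + [10.0 + 0.1 * i for i in range(50)])
print(candidate_inversions(physical, genetic))
```

With 1,132 markers and a block size of 50, this partition gives 23 blocks, the last one being shorter, which matches the number of blocks quoted for LG02.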

2.7. Interspecific genomic reorganisations between turbot and other flatfish species

The improved turbot genome was further exploited to complete and refine previous information about genomic reorganisations and chromosome evolution of flatfish. For this, we compared the improved turbot genome with those of Japanese flounder (Paralichthys olivaceus) and half-smooth tongue sole (Cynoglossus semilaevis) at chromosome level using LASTZ (using options ‘--notransition --step=400 --nogapped --format=rdotplot’) and compared each turbot chromosome with the orthologues of the other two species. Further, in order to assess the relationship between flatfish megascaffolds and a model fish genome outside Pleuronectiformes, an additional LASTZ analysis was carried out with stickleback (Gasterosteus aculeatus) using the same parameters.

3. Results

3.1. SNP calling, filtering and genotyping

A total of 3,946,354,960 reads (3.9 Gb) were produced at the BMR sequencing platform, corresponding to 1,282 individuals represented by 3,078,280 reads on average (range: 312–9,610,380). Fourteen offspring were removed due to the low number of usable reads (< 75,000), leaving a total of 1,268 individuals (44 parents and 1,224 offspring) pertaining to 31 full-sib families for subsequent analyses. After quality filtering and alignment to the turbot genome, the average number of reads per sample was 6,663,809 (range: 2,601,955–8,946,621) for parents and 2,774,579 (range: 283,203–5,596,610) for offspring. After the Stacks populations filter, 139,293 RAD-tags were retained, of which 25,511 (18.3%) contained at least one SNP. Since only one SNP was retained per RAD-tag, a total of 25,511 SNPs were considered after eliminating the remaining SNPs in the same RAD-tag (Supplementary Table S1). Among these 25,511 SNPs, 18,214 were finally retained after filtering loci by: (i) consistent deviations from Mendelian segregation (5,866 loci: P < 0.01 in at least two families and representing more than 20% of the informative families); (ii) very low minor allele frequency in the parental population (< 0.015), suggestive of genotyping errors (991 SNPs); and (iii) extreme deviations from HWE suggesting either paralogous genes (364 loci) or high null allele frequency (91 loci) (Supplementary Table S1). This set of 18,214 SNPs represented a frequency of one SNP every 163 bp in the reduced portion of the genome sequenced with 2b-RAD, a frequency slightly lower than that reported in other studies in turbot, and a density of one SNP every 31,130 bp in the whole turbot genome (567 Mb). The average proportion of genotyping errors (Mendelian inconsistencies in families) of our 2b-RAD genotyping approach was 0.68% throughout all families in the filtered SNP dataset. Out of the 18,214 SNPs finally retained, nearly half matched to turbot genes (9,028). 
Among those lying within exons (834), 400 corresponded to non-synonymous and 430 to synonymous allelic variants (Supplementary Table S1).
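Two of the summary figures reported in this section follow directly from the counts given in the text, as the short check below shows. The helper names are hypothetical; the input values are the ones quoted above (567 Mb genome, 18,214 SNPs, 139,293 RAD-tags of which 25,511 were polymorphic).

```python
def snp_spacing_bp(genome_bp: int, n_snps: int) -> float:
    """Average number of base pairs per SNP across the genome."""
    return genome_bp / n_snps


def polymorphic_fraction(n_polymorphic: int, n_tags: int) -> float:
    """Share of RAD-tags that carry at least one SNP."""
    return n_polymorphic / n_tags


spacing = snp_spacing_bp(567_000_000, 18_214)
share = polymorphic_fraction(25_511, 139_293)
print(f"one SNP every {spacing:,.0f} bp")
print(f"{100 * share:.1f}% of RAD-tags polymorphic")
```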

3.2. Construction of genetic maps

On average, 5,385 SNPs were used for map construction in each of the 31 families analysed (range: 2,079 in Fam55 to 6,757 in Fam30; Supplementary Table S2). Maps consisted of 22 LGs corresponding to the 22 chromosomes of the turbot haploid karyotype, averaging 140.5 cM and 245 markers per LG and family (Supplementary Tables S2 and S3). Note that LG nomenclature follows previous publications and, accordingly, LG18 is missing due to its merging with LG08 (LG08 + 18, here LG08). LG22 was on average the shortest LG (94.9 cM; range: 28.1 cM in Fam55 to 223.5 cM in Fam19) and included the lowest number of markers (189.9; range: 56 in Fam16 to 257 in Fam39). LG02 was on average the longest LG (207.7 cM; range: 49.9 cM in Fam59 to 414.0 cM in Fam19) with the highest number of markers (341.7; range: 97 in Fam55 to 457 in Fam39). A total of 17,169 SNPs were successfully mapped in at least one family, with 1,045 remaining unmapped. Of these, 1,029 were located in scaffolds where other markers were anchored to the linkage map and could therefore be assigned to specific LGs. Only 16 SNPs could not be assigned to any LG. Marker interval across all families averaged 0.6 cM and ranged from 0.43 cM in Fam11 to 1.08 cM in Fam19. When unique map positions rather than markers were considered, marker interval averaged 2.0 cM and ranged from 1.81 cM in Fam13 to 2.74 cM in Fam55. The turbot consensus map constructed with the three most informative families using MergeMap included 11,845 markers distributed over 22 LGs representing a total length of 8,532.6 cM (Supplementary Table S3A; Fig. 2). LG length varied from 282.9 cM (LG22) to 588.7 cM (LG02), with an average length of 387.8 cM. The average marker interval was 0.72 cM and ranged from 0.57 cM in LG01 to 0.89 cM in LG04. 
Because of the inflated length observed for this consensus map, we also estimated the average LG lengths from the three families used to construct the consensus, as these values represent more realistic lengths than those rendered by the MergeMap consensus algorithm (see Discussion) (Table 1). These LG lengths sum to 3,753.9 cM (i.e. an average marker interval of 0.32 cM).
Figure 2.

Consensus genetic map of S. maximus using the three most informative full-sib families; genetic distance in cM in the left bar; the position of centromeres (C) or genetic markers far from centromere (T) in acrocentric chromosomes are indicated.

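Table 1

Average length and constitution (SNP and scaffold number) of LGs in the species, female and male S. maximus consensus maps

| Chromosome ID | LG | Megascaffold length (bp) | No SNPs | No scaffolds | Average contig length (bp) | Species consensus: mapped markers | Species consensus: average length (cM) | Species consensus: Mb/cM | Female consensus: mapped markers | Female consensus: average length (cM) | Female consensus: Mb/cM | Male consensus: mapped markers | Male consensus: average length (cM) | Male consensus: Mb/cM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C7 | 01 | 26,874,605 | 1,031 | 32 | 839,831.4 | 671 | 178.3 | 0.151 | 487 | 176.6 | 0.152 | 409 | 162.9 | 0.165 |
| C1 | 02 | 31,897,078 | 1,144 | 46 | 708,824.0 | 761 | 255.8 | 0.125 | 513 | 261.6 | 0.122 | 546 | 215.0 | 0.148 |
| C17 | 03 | 21,400,935 | 785 | 23 | 930,475.4 | 506 | 180.7 | 0.118 | 317 | 179.1 | 0.119 | 395 | 156.3 | 0.137 |
| C5 | 04 | 29,497,696 | 907 | 31 | 951,538.6 | 610 | 220.8 | 0.134 | 412 | 251.0 | 0.118 | 439 | 171.7 | 0.172 |
| C12 | 05 | 24,809,483 | 707 | 20 | 1,240,474.2 | 443 | 125.0 | 0.198 | 300 | 129.1 | 0.192 | 275 | 107.2 | 0.231 |
| C10 | 06 | 25,205,370 | 851 | 26 | 969,437.3 | 587 | 186.1 | 0.135 | 395 | 205.3 | 0.123 | 455 | 153.3 | 0.164 |
| C13 | 07 | 24,510,356 | 699 | 20 | 1,225,517.7 | 412 | 168.3 | 0.146 | 317 | 179.4 | 0.137 | 301 | 143.3 | 0.171 |
| C4 | 08 | 30,930,439 | 1,066 | 38 | 813,958.9 | 640 | 266.3 | 0.116 | 434 | 303.3 | 0.102 | 427 | 196.0 | 0.158 |
| C8 | 09 | 25,788,341 | 929 | 20 | 1,289,417.1 | 593 | 200.0 | 0.129 | 415 | 209.6 | 0.123 | 406 | 165.1 | 0.156 |
| C11 | 10 | 25,098,921 | 922 | 25 | 1,003,956.8 | 650 | 169.1 | 0.148 | 419 | 166.8 | 0.150 | 497 | 156.5 | 0.160 |
| C6 | 11 | 27,408,095 | 808 | 33 | 830,548.3 | 494 | 150.9 | 0.182 | 406 | 174.1 | 0.157 | 323 | 108.9 | 0.252 |
| C9 | 12 | 25,240,469 | 796 | 21 | 1,201,927.1 | 536 | 194.9 | 0.130 | 395 | 204.2 | 0.124 | 376 | 162.2 | 0.156 |
| C19 | 13 | 20,256,549 | 726 | 20 | 1,012,827.5 | 472 | 154.0 | 0.132 | 349 | 176.3 | 0.115 | 346 | 119.2 | 0.170 |
| C16 | 14 | 21,450,603 | 778 | 20 | 1,072,530.2 | 514 | 145.1 | 0.148 | 374 | 158.2 | 0.136 | 402 | 114.3 | 0.188 |
| C3 | 15 | 24,127,324 | 887 | 33 | 731,131.0 | 593 | 146.4 | 0.165 | 413 | 156.7 | 0.154 | 383 | 113.0 | 0.214 |
| C2 | 16 | 23,833,596 | 813 | 32 | 768,825.7 | 531 | 197.8 | 0.120 | 375 | 201.1 | 0.119 | 371 | 171.1 | 0.139 |
| C21 | 17 | 15,881,475 | 680 | 16 | 992,591.8 | 467 | 127.6 | 0.124 | 335 | 141.3 | 0.112 | 332 | 103.9 | 0.153 |
| C15 | 19 | 21,781,057 | 755 | 22 | 990,048.0 | 492 | 128.7 | 0.169 | 372 | 123.0 | 0.177 | 345 | 118.0 | 0.185 |
| C14 | 20 | 22,751,219 | 776 | 23 | 989,183.4 | 456 | 145.5 | 0.156 | 304 | 163.1 | 0.139 | 322 | 116.4 | 0.195 |
| C18 | 21 | 21,354,662 | 760 | 21 | 1,016,888.7 | 501 | 136.2 | 0.157 | 356 | 138.6 | 0.154 | 386 | 123.8 | 0.172 |
| C22 | 22 | 14,917,408 | 578 | 14 | 1,065,529.1 | 408 | 121.0 | 0.123 | 291 | 128.6 | 0.116 | 259 | 102.4 | 0.146 |
| C20 | 23 | 19,910,129 | 796 | 22 | 905,005.9 | 508 | 155.4 | 0.128 | 378 | 151.4 | 0.132 | 378 | 129.4 | 0.154 |
| Sub-total | | 524,925,810 | 18,194 | 558 | | 11,845 | 3,753.9 | 0.140 | 8,357 | 3,978.4 | 0.132 | 8,373 | 3,109.9 | 0.169 |
| Not anchored | | 19,312,992 | 16 | 15,936 | 1,211.9 | 6,369 | | | 9,857 | | | 9,841 | | |
| Total | | 544,238,802 | 18,214 | 16,494 | | 18,214 | | | 18,214 | | | 18,214 | | |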
Female and male consensus genetic maps differed significantly, totalling 8,274.8 cM and 6,791.6 cM (F:M ratio = 1.22), with average LG lengths of 376.1 and 308.7 cM, respectively. In both females and males, LG02 was the longest LG (545.8 vs 474.9 cM) and LG22 the shortest one (292.1 vs 198.7 cM) (Supplementary Fig. S2). As for the species consensus map, Table 1 also reports average LG lengths for the female and male consensus maps. Individual female and male genetic maps (62 in total; Supplementary Tables S2 and S3) were also constructed for a more detailed comparison of RF between individuals and sexes, and to look for putative chromosome polymorphisms in the turbot population (44 parents). Female genetic maps were longer than male maps in all families and for all LGs (average F:M ratio = 1.36). For a more detailed comparison, 10,168 markers shared between sexes and mapped in at least one family were selected (Supplementary Table S4). As expected, the average inter-marker distance for the same pair of markers was significantly higher in females than in males (2.2 and 1.7 cM, respectively; P < 0.001, t-test for related samples). When F:M intervals were compared at each LG, the difference was significant for all groups except LG02, LG17 and LG22 (P > 0.05) (Supplementary Table S4). When F:M intervals were compared across all LGs within each family, inter-marker distances were again significantly higher in females in all but four families (families 16, 22, 47 and 55, P > 0.05; Supplementary Table S4).

3.3. Anchoring turbot genetic and physical maps: megascaffolds

Scaffolds were assigned to the 22 LGs of the turbot consensus map, then ordered and oriented, thus constituting 22 megascaffolds. Runs of 100 Ns were used in the megascaffold sequence to mark the gaps between adjacent scaffolds, denoting missing information. Additionally, the short uncertain regions around the boundaries of split scaffolds were written in lowercase to highlight their lower confidence. The complete nucleotide sequence of the 22 megascaffolds, with their corresponding genes located and annotated, has been uploaded to NCBI (SUB3239805). A total of 28 scaffolds were split and reallocated according to the very consistent mapping information in all the 31 families analysed (Supplementary Table S5). Most scaffolds (22) were split into two fragments and reallocated to different LGs. In three cases, scaffolds were split among three different LGs (sm5_s00043, sm5_s00056 and sm5_s00084), and in three other cases, the two fragments of the scaffold were reallocated within the same LG (sm5_s00006, sm5_s00007 and sm5_s00049). In all cases, the boundaries of the breakage point were assigned with notable accuracy (Supplementary Fig. S1), and uncertain gap regions averaged 38,010 bp, ranging from 549 bp (sm5_s00086) to 160,155 bp (sm5_s00084) (Supplementary Table S5). Rearranged scaffolds represented a total of 108.5 Mb, i.e. ∼20% of the entire turbot genome. A total of 17,169 SNPs corresponding to 558 scaffolds (after splitting 28) rendered a significant match to the turbot genetic map (Supplementary Table S1). On average, each scaffold contained 32.7 SNPs, ranging from 610 SNPs in sm5_s00003_17 (LG17) to one SNP in 204 scaffolds. The total length of the anchored sequences was 524,925,810 bp, representing 96.5% of the turbot genome assembly (544 Mb). The average megascaffold length was 23,860,264 bp, ranging from 14,917,408 bp (LG22) to 31,897,078 bp (LG02) (Table 1). 
Megascaffolds contained on average 827 SNPs (range 578–1144) and 25.4 scaffolds (range 14–46) (Table 1). All scaffolds assigned to chromosomes were consistently ordered within the megascaffold according to the mapping information across families; the orientation within each chromosome could be verified for those 336 that contained at least two informative markers. The remaining 204 scaffolds with only one marker could not be oriented, but they represented a very minor proportion of the total assembly (2.0%) because of their small size. Interestingly, they were mostly located at the end of chromosomes (telomeres), suggesting assembling problems caused by repetitive motifs.

3.4. Integrating cytogenetic information: chromosomes vs LGs vs megascaffolds

Turbot genetic, physical and cytogenetic information obtained in this work and from previous studies was integrated as far as possible and, accordingly, a unified nomenclature is proposed in this study (Table 1). Four chromosomes (C1, C2, C3 and C22) were unequivocally assigned to specific LGs: the two metacentric C1 (LG02) and C2 (LG16), the submetacentric nucleolus organiser region (NOR)-bearing chromosome C3 (LG15) and the smallest acrocentric chromosome C22 (LG22), constituting the four marker chromosomes identifiable in the turbot karyotype. The remaining megascaffolds, which correspond to acrocentric chromosomes of decreasing size, were named according to their physical length from C4 to C21 (Table 1). The consensus genetic and physical maps showed a linear correspondence along most of the chromosome length in all LGs, excluding the regions around centromeres and telomeres (Fig. 3A and Supplementary Fig. S3). Clouds of dots indicative of poor mapping accuracy were observed around most mapped centromeres (i.e. LG01, LG10, LG11, LG19 and LG20). Dot clouds were located in intermediate regions of the metacentric chromosomes (LG02 and LG16) or at the terminal regions of the acrocentrics, in accordance with their cytogenetic structure. LG15, the NOR-bearing chromosome, showed a moderate dispersion of dots throughout the centromere-proximal half of the chromosome, likely related to the tandem rDNA clusters and their associated heterochromatin in that region. Also, close to telomeres, although not so markedly, dots were either dispersed (i.e. LG03, LG08, LG11 and LG14) or showed particular patterns (LG02, LG06 and LG15) suggestive of wrong physical assembly or genetic mapping. Finally, the pattern of the acrocentric chromosome C5 (LG04) was remarkable because of the presence of an internal cloud of dots in addition to that around the centromere at the chromosome end. 
This LG very likely correspond to the chromosome 5 of turbot karyotype, which showed an interstitial heterochromatic block with C-banding. When the same comparison between genetic and physical maps was performed for the female and male consensus maps, dot clouds around centromeres were very prominent in the male map and subtle, and sometimes inexistent, in the female maps (Fig. 3B and Supplementary Fig. S4; i.e. LG01, LG05, LG09, LG15, LG19 and LG22). So, it seems that the length difference between male and female maps is mainly related to the lower recombination rates around centromeres characterising male maps.
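To make the male/female contrast around centromeres concrete, the following Python sketch computes a regional recombination rate (cM/Mb) from paired physical and genetic coordinates. The helper name `recombination_rate`, the window limits and the toy marker values are all hypothetical; they are not data from this study.

```python
from typing import List, Optional, Tuple

Marker = Tuple[float, float]  # (physical position in Mb, genetic position in cM)


def recombination_rate(markers: List[Marker],
                       start_mb: float,
                       end_mb: float) -> Optional[float]:
    """Regional recombination rate in cM/Mb within [start_mb, end_mb]."""
    inside = sorted(m for m in markers if start_mb <= m[0] <= end_mb)
    if len(inside) < 2:
        return None  # not enough markers to measure a distance
    span_mb = inside[-1][0] - inside[0][0]
    if span_mb == 0:
        return None
    span_cm = inside[-1][1] - inside[0][1]
    return span_cm / span_mb


# Toy acrocentric chromosome with the centromere near 0 Mb (invented values).
male = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.2), (3.0, 3.0), (4.0, 6.0)]
female = [(0.0, 0.0), (1.0, 2.5), (2.0, 5.0), (3.0, 8.0), (4.0, 11.0)]

male_pericentromeric = recombination_rate(male, 0.0, 2.0)      # low rate
female_pericentromeric = recombination_rate(female, 0.0, 2.0)  # higher rate
```

In a plot of genetic against physical position, a region with a rate close to zero appears as a flat stretch in which marker order is poorly resolved, which is one way to read the dot clouds described above.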
Figure 3.

Correspondence between physical (abscissae) and genetic (ordinates) maps showing the dispersion of dots around the centromere at LG11 (A) and the difference between male and female maps at LG11 (B). The vertical bar in both panels shows the position of the centromere according to Martínez et al. (2008). Genetic markers corresponding to different scaffolds are shown in different colours.

3.5. Intraspecific chromosome reorganisations

Marker positions showed a good correspondence between the 62 parental genetic maps (including ‘replicates’ from half-sib families), although particular patterns were suggestive of different marker order between individual maps (Fig. 4 and Supplementary Fig. S5). The negative correlations between physical and genetic maps at these regions supported this observation, and putative chromosome rearrangements involving either single blocks (50 SNPs, ∼1.6 Mb) or longer chromosome tracts were identified (Supplementary Fig. S6). Inverted single blocks were very frequent at LG02 (blocks 7 and 23), LG05 (block 12), LG08 (block 2), LG13 (blocks 14 and 15) and LG22 (block 12), while larger inverted chromosome tracts were observed in some individuals at LG06 (blocks 5–9) and LG09 (blocks 13–16) (Supplementary Figs S5 and S6). However, these putative inversions, particularly those involving single blocks, were found close to telomeres or centromeres, where mapping accuracy is poorer due both to assembly problems and to the lower ratio of genetic to physical distance. This suggests that they are more likely technical artefacts than true reorganisations. In contrast, the longer tracts observed at LG06 and LG09 were located in interstitial regions, and thus the inversion signatures there should be more reliable.
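The block-wise reasoning above (a negative correlation between physical and genetic position within a block hints at an inverted marker order) can be sketched in Python as follows. This is only an illustration, not the authors' procedure: the function names, the cut-off of -0.5 and the toy data are assumptions, and the default block size of 50 simply mirrors the 50-SNP blocks mentioned in the text.

```python
from typing import List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def flag_inverted_blocks(markers: List[Tuple[float, float]],
                         block_size: int = 50,
                         threshold: float = -0.5) -> List[int]:
    """Return 1-based indices of blocks whose correlation is below threshold.

    markers: (physical position, genetic position) pairs for one LG in one
    individual map. A trailing incomplete block is ignored.
    """
    ordered = sorted(markers)  # order markers by physical position
    flagged = []
    for start in range(0, len(ordered) - block_size + 1, block_size):
        block = ordered[start:start + block_size]
        r = pearson([m[0] for m in block], [m[1] for m in block])
        if r < threshold:
            flagged.append(start // block_size + 1)
    return flagged


# Toy LG with three 5-marker blocks; the middle block has reversed genetic order.
toy = ([(float(i), float(i)) for i in range(5)]
       + [(float(i), float(14 - i)) for i in range(5, 10)]
       + [(float(i), float(i)) for i in range(10, 15)])
candidates = flag_inverted_blocks(toy, block_size=5)
```

As the text cautions, a flagged block near a centromere or telomere may reflect low recombination or assembly problems rather than a true inversion, so a flag of this kind is at most a starting point for closer inspection.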
Figure 4.

Distinctive patterns denoting a lack of collinearity between two individual genetic maps of S. maximus: Female Fam09 vs Male Fam07 maps; consecutive LGs are represented in different colours from LG01 (bottom left) to LG23 (top right).

3.6. Interspecific chromosome reorganisations

The whole turbot megascaffold sequences were aligned against the genomes of Japanese flounder and tongue sole using the LASTZ program, and a 1:1 macrosyntenic pattern was observed (Supplementary Fig. S7A and B), as reported previously. However, some Robertsonian translocations (fusions or fissions) and other minor inter-chromosomal rearrangements were observed. The most notable cases when comparing the turbot (n = 22) and Japanese flounder (n = 24) genomes were the correspondence of turbot LG02 with Japanese flounder Chr06 and Chr14 (Fig. 5A) and that of LG16 with Chr09 and Chr16 (Supplementary Fig. S7A). In the same way, turbot LG02 was related to tongue sole Chr13 and Chr14 (Fig. 5B), which suggests that the LG02 fusion is a derived condition (apomorphy) in the turbot lineage. The position of the turbot centromere at LG02 was the one expected if a centric fusion between two chromosomes had occurred in the turbot lineage (Fig. 5), as was also the case for LG16 (Supplementary Fig. S7A). These chromosome fusions, starting from the ancestral teleost karyotype (2n = 48), would explain the karyotype differences between turbot and Japanese flounder, as suggested by Robledo et al. Despite the general macrosyntenic pattern observed, pairwise alignment of chromosomal sequences indicated that extensive intrachromosomal reorganisations (mainly inversions) have taken place throughout the evolution of these species (Supplementary Fig. S7A and B). Reorganisations were much more extensive when comparing turbot to half-smooth tongue sole than to Japanese flounder (Fig. 5C and D). The results of the comparison with stickleback are presented in Supplementary Figure S7C. Fusion patterns similar to those identified in the comparison with the other flatfish species are visible, for example at LG16, which showed macrosyntenic patterns with stickleback ChrVII and ChrIV.
Figure 5.

LASTZ plots between S. maximus linkage groups (LG) and chromosomes (Chr) of P. olivaceus and C. semilaevis: (A) LG02 vs Chr06 and Chr14 of P. olivaceus; (B) LG02 vs Chr13 and Chr14 of C. semilaevis; (C) LG04 vs Chr07 of P. olivaceus; (D) LG04 vs Chr05 of C. semilaevis. The thick vertical bar marks the position of the centromere according to Martínez et al. (2008), and the thin bars mark the limits between consecutive scaffolds.

4. Discussion

The reduced cost of the new genotyping by sequencing (GBS) techniques has facilitated the development of highly dense genetic maps in aquaculture fish species, including turbot. Here, we combined for the first time a high number of markers and families for a deep mapping analysis aimed at integrating genetic and physical map resources in turbot to understand its genome organisation and variation, a study not performed to date in fish. The average number of SNPs in the 31 full-sib families analysed in our work was in the upper range reported in the most recent linkage mapping studies (5–6k SNPs), and ∼40 individuals were genotyped in each family. The turbot consensus map (8,532.6 cM) was much longer than any previous linkage map reported for the species (6,647 SNPs, 2,622.1 cM), but this elongation is a consequence of combining a high number of markers with several mapping families and the MergeMaps software, an artefact highlighted previously by different authors. A direct consequence of this ‘inflated’ map length is that the average ratio between physical and genetic distance is much lower than usual (0.066 Mb/cM vs ∼0.33 Mb/cM in teleosts). One way to compensate for this bias is to normalise the consensus map coordinates of each LG to the average genetic distance obtained from the individual maps used to construct the consensus (see Table 1). Accordingly, the average length and average inter-marker distance of the three single-family maps used to construct the consensus were 3,753.9 cM and 0.32 cM, respectively, figures much closer to those found by Wang et al. using RADseq. Interestingly, the sex-determining region-bearing chromosome (LG05) showed one of the lowest RF ratios (cM/Mb), which agrees with theoretical expectations on the evolution of the sex-determining chromosome.
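The normalisation described above can be read as a linear rescaling of each LG, so that the consensus length matches the mean length of the individual maps from which it was built. The Python sketch below illustrates that reading under stated assumptions: the function name `normalise_consensus` is hypothetical, the consensus coordinates are assumed to start at 0 cM, and the numbers are toy values, not those of Table 1.

```python
from typing import List, Sequence


def normalise_consensus(consensus_cm: Sequence[float],
                        individual_lengths_cm: Sequence[float]) -> List[float]:
    """Rescale consensus coordinates of one LG to the mean individual map length.

    consensus_cm: marker positions on the (inflated) consensus map, from 0 cM.
    individual_lengths_cm: total length of the same LG in each individual map.
    """
    target = sum(individual_lengths_cm) / len(individual_lengths_cm)
    raw_length = max(consensus_cm)
    if raw_length == 0:
        return list(consensus_cm)  # nothing to rescale
    factor = target / raw_length
    return [pos * factor for pos in consensus_cm]


# Toy LG: consensus spans 400 cM, while three individual maps average 160 cM.
normalised = normalise_consensus([0.0, 100.0, 400.0], [150.0, 170.0, 160.0])
```

A single scale factor per LG preserves marker order and relative spacing, so only the units of the consensus map change, not its structure.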
The anchorage of the assembled turbot genome to a robust and dense genetic map is essential to transfer marker associations detected by segregation analysis in families to physical positions in the genome for gene mining. In our study, ∼20% of the turbot genome assembly was refined thanks to the mapping information obtained across the 10 most informative families. Furthermore, scaffolds representing 96.5% of the turbot genome assembly were anchored to chromosomes, ordered and oriented, and thus a megascaffold was set up for each LG. This represents a significant improvement over the 80% anchored with the previous medium-density turbot map. Also, the available cytogenetic information was integrated as far as possible in a common framework with genetic and physical maps: four marker chromosomes of the turbot karyotype were associated with their respective megascaffolds, and the remaining 18 were roughly assigned according to their physical length. Using this information, a new nomenclature is proposed here for the 22 chromosomes/LGs of the species, which will facilitate the integration of the different levels of genome organisation for comparative studies within flatfish and teleosts.
The high correlation observed between genetic and physical maps supports confidence in the marker ordering and the physical assembly of the turbot genome. However, this general trend was lost at specific chromosome structures such as centromeres and telomeres. Around centromeres, extensive dot clouds were visualised, suggestive of poor mapping accuracy very likely related to the decay of RF in the vicinity of heterochromatic blocks. This was observed not only for pericentromeric heterochromatin but also for interstitial heterochromatic blocks, such as the most conspicuous one detected at LG04 (C5). A remarkable RF difference was observed between males and females around heterochromatic structures, especially the centromeres, where females hardly showed any RF decay. This is the main cause of the length difference between male and female maps of turbot, as reported previously in other fish species. It is tempting to ascribe the different RF between males and females observed in fish to different crossing-over patterns around heterochromatic blocks. However, other explanations have been proposed for different species, such as GC content in Asian sea bass; stronger selection pressure on male gametes during the haploid life stage in zebrafish or in some plants; and differences in chromatin distribution between sexes, as for example in humans. More information is needed for a comprehensive explanation of these observations, and NGS technologies can provide suitable data for this task.
The lack of correspondence between physical and genetic maps observed at telomeres in several families and LGs is likely a consequence of wrong assembly and/or mapping at chromosome ends. These regions were characterised by the presence of a number of short contigs, suggestive of assembly problems likely due to repetitive DNA. Similarly, when many markers and a comparatively small number of offspring per family are handled, mapping programs tend to push the least confident markers to LG ends, which leads to the loss of collinearity between physical and genetic maps. Improving genome assembly at chromosome ends by using long-read sequencing methods and increasing the progeny number per family will be needed to solve this problem.
Genetic maps of bony fish reported to date have been based on a few families, and thus the presence of inter-individual rearrangements affecting marker order has scarcely been addressed. Previous information showed that, despite a general macrosyntenic 1:1 pattern in Acanthopterygii and specifically in Pleuronectiformes, extensive intrachromosomal reorganisation (mainly inversions) has taken place during flatfish evolution. This observation was confirmed in our study, where the genomic comparison between turbot and tongue sole, species belonging to the distant families Scophthalmidae and Cynoglossidae, revealed major intrachromosomal reorganisations. Accordingly, polymorphic inversions could occur within flatfish species, and the comparison between individual maps could shed some light on this issue. Using nine families and far fewer markers, Bouza et al. suggested a general collinearity among individual maps, and only minor discrepancies were detected at 11 LGs involving markers at short distances (< 3 cM). In our study, two medium-sized chromosome tracts (∼7 Mb) at LG06 and LG09 showed features that could be compatible with polymorphic inversions, although further work will be necessary to confirm this observation. Possible strategies include combining fluorescence in situ hybridisation with BAC probes on these chromosomes, or checking linkage between markers within and around the potential inversions using a much higher number of offspring. If confirmed, this information should be considered not only for evolutionary studies but also for marker-assisted selection in turbot breeding programs.
Flatfish represent one of the most extraordinary examples of anatomical specialisation to a particular lifestyle in vertebrates, and a bitter evolutionary controversy took place in the past about their origin and phylogeny. The 1:1 macrosyntenic pattern observed among Acanthopterygii was confirmed when comparing the available flatfish genomes: turbot, tongue sole and Japanese flounder. The integration of our data with previous information, which includes the recently assembled genome of P. olivaceus, enabled us to confirm and refine the suggested chromosome fusions during flatfish evolution. Our results highlight that the fusion of ancestral chromosomes in the evolution of the turbot genome resulted in the formation of new chromosomes, whose centromeres should correspond to the ancestral centromere of one of the fusing chromosomes, as reported for the origin of human chromosome 2. The precise location of centromere-associated markers in the genetic and physical maps strongly supports two main centric fusions in the origin of the turbot karyotype (LG02 and LG16) from the ancestral 2n = 48 observed in Japanese flounder. Flatfish show large variation in chromosome number even across closely related taxa, and more than half of flatfish species present a chromosome number lower than the ancestral one, suggestive of several fusions, with the minimum diploid number (2n = 26) reported in Citharichthys spilopterus. It would be worth investigating in more detail whether the high number of chromosome fusions in the families with lower diploid numbers (e.g. Soleidae, Bothidae, Achiridae and some Paralichthyidae) shows a preferential fusion pattern, similar to that recently found in notothenioids. In addition to chromosome fusions, other minor reorganisations between non-orthologous chromosomes of turbot, Japanese flounder and half-smooth tongue sole were identified in our study. These chromosome rearrangements are thought to be important in karyotypic evolution and species differentiation, and thus further comparative mapping studies are encouraged to understand the reasons behind these rearrangements in the evolution of flatfish.
The turbot HD consensus genetic map constructed here using a set of 18,124 SNPs was a suitable reference to anchor the turbot genome and to integrate all previous genomic information. The comparison of the male and female consensus maps made it possible to identify sharp RF differences around centromeres that explained their length differences. The 62 individual genetic maps constructed allowed the detection of suggestive polymorphic rearrangements in the species, putatively related to coadapted gene blocks. Finally, inter- and intra-chromosomal reorganisations in Pleuronectiformes were identified by comparing the chromosome sequences available for three flatfish species, taking the turbot map as reference. The consistent HD genetic map reported in turbot represents an invaluable genomic tool for further genome-wide association and evolutionary genomic studies.

Acknowledgements

This study has been supported by the FISHBOOST project (ref. 613611) from the European Community's Seventh Framework Programme (FP7/2007-2013) and by Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia local government (ref. GRC2014/010). Computational support for bioinformatic analysis was provided by Centro de Supercomputación de Galicia (CESGA).

Accession numbers

CP026243-CP026264.

Conflict of interest

None declared.