SNP mining in Crassostrea gigas EST data: transferability to four other Crassostrea species, phylogenetic inferences and outlier SNPs under selection.

Xiaoxiao Zhong, Qi Li, Hong Yu, Lingfeng Kong.

Abstract

Oysters, with high levels of phenotypic plasticity and wide geographic distribution, are a challenging group for taxonomists and phylogeneticists. This study aimed to generate new EST-SNP markers and to evaluate their potential for cross-species use in phylogenetic studies of the genus Crassostrea. In this study, 57 novel SNPs were developed from an EST database of C. gigas by the HRM (high-resolution melting) method. Transferability of 377 SNPs developed for C. gigas was examined on four other Crassostrea species: C. sikamea, C. angulata, C. hongkongensis and C. ariakensis. Among the 377 primer pairs tested, 311 (82.5%) primers showed amplification in C. sikamea, 353 (93.6%) in C. angulata, 254 (67.4%) in C. hongkongensis and 253 (67.1%) in C. ariakensis. A total of 214 SNPs were found to be transferable to all four species. Phylogenetic analyses showed that C. hongkongensis was a sister species of C. ariakensis and that this clade was sister to the clade containing C. sikamea, C. angulata and C. gigas. Within this clade, C. gigas and C. angulata had the closest relationship, with C. sikamea being the sister group. In addition, we detected eight SNPs as potentially being under selection by two outlier tests (fdist and hierarchical methods). The SNPs studied here should be useful for genetic diversity, comparative mapping and phylogenetic studies across species in Crassostrea, and the candidate outlier SNPs are worth exploring in more detail through association genetics and functional studies.

Year:  2014        PMID: 25238392      PMCID: PMC4169597          DOI: 10.1371/journal.pone.0108256

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Oysters are widely distributed throughout tropical and subtropical regions, inhabiting near-shore areas, shallow waters, bays, and estuaries [1]. Crassostrea oysters are important commercial species and account for most of the world's oyster production. Approximately 20 species make up the genus Crassostrea, of which C. gigas has become the leading species in world shellfish culture because of its rapid growth and capacity to adapt to various environmental conditions. Besides C. gigas, C. hongkongensis, C. ariakensis, C. sikamea and C. angulata are locally important species in China, Japan, Korea, the United States and some European countries. The rapid growth of the oyster aquaculture industry as well as intentional introduction or transplantation of oysters pressingly requires an appropriate understanding of the genetic variation within and among various oyster species. However, conventional taxonomic and phylogenetic studies based on morphology and geographic range information have proved problematic because of highly plastic shell patterns and overlapping geographic distributions [2]–[4]. There are ongoing debates as to the species designations in the genus Crassostrea, such as the specific status of C. gigas and C. angulata, and the nomenclature of C. hongkongensis and C. ariakensis. The ongoing confusion about oyster taxonomy and identification has become an impediment to further investigation of the genetics and conservation of oysters. In recent years, relationships and identification of oyster species have been investigated by using allozymes, randomly amplified polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP) and DNA sequences such as mitochondrial and nuclear genes [5]–[11]. Particularly, the ability to sequence and compare whole mitochondrial genomes provides a new insight into phylogenetic relationships of oysters [12]–[14]. 
However, mtDNA loci are uniparentally inherited and cannot alone represent all historical and contemporary processes acting upon a population [15]. Moreover, because mtDNA is fast evolving and nucleotide mutations may return to an earlier state, its sequences may not allow deep phylogenetic reconstruction [12]. Hence, incorporating nuclear markers appears necessary to increase confidence in determining the relationships of Crassostrea species. Single-nucleotide polymorphisms (SNPs) have become cornerstone markers for a wide variety of genetic applications because they are the most abundant class of polymorphisms in genomes, and can be genotyped cost-effectively [16], [17]. In addition, SNPs can be found within the genomic sequences of candidate genes for artificial or natural selection, and therefore they might be more informative for evolutionary biology than markers such as microsatellites and AFLPs. They offer a wide range of applications such as association studies, high-density linkage maps, traceability of genealogies and phylogenetic inference [18], [19]. The rapid increase in the availability of EST sequences of Crassostrea gigas provides abundant resources for obtaining SNP markers [20]–[23]. To date, 320 SNPs have been developed for C. gigas by mining expressed sequence tag data, using the HRM method [24]–[26]. Nevertheless, SNP markers for C. hongkongensis, C. ariakensis, C. sikamea and C. angulata have not been documented. SNPs transferred from C. gigas provide a valuable source of SNP markers for the four species. Such cross-species EST–SNPs will be useful for comparative mapping and phylogenetic studies among species in Crassostrea. Here, 57 novel SNPs were developed from the NCBI EST database (http://www.ncbi.nlm.nih.gov/) of C. gigas and the cross-species transferability of 377 SNPs of C. gigas was tested among C. hongkongensis, C. ariakensis, C. sikamea and C. angulata. 
Meanwhile, through the use of the cross-species SNPs, we reconstructed the phylogenetic relationships among the five Crassostrea species. Moreover, through the use of Fst outlier analysis, we identified candidate SNPs that may have been targets of selection.

Materials and Methods

Ethics Statement

The field studies did not involve endangered or protected species. No specific permissions were required for the locations. The locations are not privately-owned or protected in any way.

Oyster Materials and DNA Extraction

Thirty-two C. gigas individuals from 2 populations (Pop1: 16 individuals from Weihai, Shandong province, China; Pop2: 16 individuals from Rizhao, Shandong province, China) were used for validation of SNP polymorphisms. Five Crassostrea species collected from China were used for the examination of the transferability of SNPs, namely C. sikamea (from Nantong, Jiangsu Province), C. angulata (from Yueqing, Zhejiang Province), C. hongkongensis (from Xiamen, Fujian Province), C. ariakensis (from Shantou, Guangdong Province) and C. gigas (from Rushan, Shandong Province) (Table 1). A set of species-specific COI primers was used for species identification according to the study of Wang & Guo [10].
Table 1

Species included in this study, and the statistics of amplification success and polymorphism.

Species | Number of individuals | Sample location (latitude, longitude) | Number amplified | Percent amplified | Number polymorphic | Percent polymorphic
C. sikamea | 20 | Nantong, Jiangsu (31.91°N, 121.88°E) | 311 | 82.50 | 256 | 67.90
C. angulata | 19 | Yueqing, Zhejiang (28.15°N, 121.08°E) | 353 | 93.60 | 306 | 81.20
C. hongkongensis | 19 | Xiamen, Fujian (24.43°N, 118.15°E) | 254 | 67.40 | 133 | 35.30
C. ariakensis | 19 | Shantou, Guangdong (23.35°N, 116.63°E) | 253 | 67.10 | 119 | 31.60
C. gigas | 19 | Rushan, Shandong (36.90°N, 121.80°E) | 377 | 100 | 335 | 88.90

DNA was extracted from frozen adductor muscle tissue by a modification of the standard phenol-chloroform procedure previously described by Li et al. [27] and stored at −30°C prior to genetic analysis.

Data Mining for SNP Markers

Sequences containing SNPs were annotated using BLASTx software [28], and sequence homology was accepted based on a cut-off E value of 1.0×10^−6. The informative strand and reading frame were identified by using the sequence with highest homology. The NCBI ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) was used to determine whether SNPs were synonymous, non-synonymous or from untranslated regions (UTRs).
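
The E-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTx search was exported in the standard 12-column tabular format and uses the bit score as a stand-in for "highest homology"; the function name, the contig and protein identifiers, and the example rows are hypothetical and are not taken from the study.

```python
import csv
import io

E_VALUE_CUTOFF = 1.0e-6  # cut-off E value stated in the text

# Column order of the standard 12-column BLAST tabular output.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query id: best hit row}, keeping only hits with evalue <= cutoff.

    "Best" is taken to mean the highest bit score (an assumption).
    """
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) > cutoff:
            continue  # hit is not accepted as homologous
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical example rows (not real search results).
example = "\n".join([
    "contig1\tprotA\t85.0\t120\t18\t0\t1\t360\t10\t129\t2e-40\t160",
    "contig1\tprotB\t60.0\t100\t40\t0\t1\t300\t5\t104\t1e-10\t70",
    "contig2\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t25",
])
hits = best_hits(example)
print({query: hit["sseqid"] for query, hit in hits.items()})
```

In this toy input, contig2 has no hit below the cut-off, so it would be reported as unannotated.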

Primer Design and PCR Conditions

Primers were designed using the Primer Premier 5.0 program (PREMIER Biosoft International, Palo Alto, CA, USA). SNP markers were developed according to the procedure described by Zhong et al. [25] and genotyped using the high-resolution melting (HRM) method on the LightCycler 480 real-time PCR instrument (Roche Diagnostics, Burgess Hill, UK). A total of 46,171 Pacific oyster EST sequences were downloaded from the GenBank EST database (http://www.ncbi.nlm.nih.gov/). The sequences were assembled and clustered into contigs with SeqMan Pro software (DNASTAR Inc., Madison, WI, USA). A single-base mutation that occurred in four or more ESTs and that was surrounded by good flanking sequences was identified as a potential SNP for further analysis. The 10-µl reaction mixture contained 0.25 U Taq DNA polymerase (Takara, Dalian, China), 10× PCR buffer, 0.2 mM dNTP mix, 0.2 µM of each primer set, 1.5 mM MgCl2, 5 µM SYTO9 (Invitrogen, Foster City, CA, USA) and 10 ng template DNA. The concentration of DNA was measured by a Nanodrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). The PCR cycling conditions included an activation step at 95°C for 5 minutes followed by 45–50 cycles of 95°C for 20 seconds, a touchdown annealing step from 68°C to 58°C for 20 seconds (0.5°C/cycle) and 72°C for 20 seconds. Following amplification, the products were denatured at 95°C for 1 min, and then annealed at 40°C for 1 min to randomly form DNA duplexes. Melting curves were generated by heating samples from 60°C to 90°C with 25 data acquisitions per degree. Data were analyzed using the LightCycler 480 Gene Scanning Software 1.5 (Roche Diagnostics).
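
The rule "a single-base mutation that occurred in four or more ESTs" can be sketched as a column scan over aligned reads. This is only an illustration under stated assumptions: the reads are assumed to be pre-aligned strings of equal length, the rule is read as requiring the second most common base to be supported by at least four reads, and the flanking-sequence quality check is not modelled. The function name and the toy reads are hypothetical.

```python
from collections import Counter

MIN_MINOR_READS = 4  # "four or more ESTs" from the text


def candidate_snp_columns(aligned_reads, min_minor=MIN_MINOR_READS):
    """Return (position, major base, minor base) for candidate SNP columns.

    A column qualifies when its second most common base is seen in at
    least `min_minor` reads. Gaps and ambiguous bases are ignored.
    """
    results = []
    for pos in range(len(aligned_reads[0])):
        counts = Counter(read[pos] for read in aligned_reads
                         if read[pos] in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic column
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n >= min_minor:
            results.append((pos, major, minor))
    return results


# Hypothetical toy contig: ten aligned ESTs, four of which carry an A at index 2.
reads = ["ACGTAC"] * 6 + ["ACATAC"] * 4
print(candidate_snp_columns(reads))
```

With only three supporting reads the same column would be skipped, which is how the threshold guards against sequencing errors in single ESTs.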

Data Analysis

Shannon's Information index, expected heterozygosity (He), observed heterozygosity (Ho) and Nei's genetic distance [29] were calculated using POPGENE 1.32 software [30]. Phylogenetic trees were constructed using the neighbor joining (NJ) method implemented in MEGA 5.05 and POPTREE2 [31], [32]. Bootstrap analyses with 1000 replicates were performed to test the support for the branches of a phylogenetic tree. Arlequin version 3.5.1.3 software was used to calculate pairwise Fst between all pairs of species using 10000 permutations to test for significance (α = 0.01). Outlier SNPs were tested using two island models, as implemented in Arlequin. We conducted 50000 coalescent simulations with 5 demes under a finite island model. The analysis was also performed utilizing a hierarchical island model based on 3 groups of 3 demes with 50000 simulations to generate the joint distribution of Fst versus heterozygosity. Pre-defined population groupings were set as three groups (group 1: C. sikamea, C. angulata and C. gigas; group 2: C. hongkongensis; group 3: C. ariakensis) based on the pairwise Fst values. Loci that fell outside the 99% confidence intervals of the distribution were identified as outliers putatively under selection. The putative function of genes with outlier SNPs was identified using the Gene Ontology (GO) annotation by mining the Swiss-Prot database.
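
For readers unfamiliar with the summary statistics named above, the following is a minimal sketch of how they can be computed for a single biallelic locus. It is not the POPGENE implementation: it uses the uncorrected formulas (He = 1 − Σp², Shannon index = −Σp ln p, and Nei's standard distance D = −ln(Jxy / √(Jx·Jy))), whereas POPGENE may apply small-sample corrections. The function names and the toy genotypes are hypothetical.

```python
import math
from collections import Counter


def locus_stats(genotypes):
    """Return (allele frequencies, Ho, He, Shannon index) for one locus.

    genotypes: list of two-character strings such as "AA", "AG", "GG".
    """
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    ho = sum(1 for g in genotypes if g[0] != g[1]) / n   # observed heterozygosity
    he = 1.0 - sum(p * p for p in freqs.values())        # expected heterozygosity
    shannon = -sum(p * math.log(p) for p in freqs.values() if p > 0)
    return freqs, ho, he, shannon


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance from two lists of per-locus frequency dicts."""
    jx = jy = jxy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        for allele in set(fx) | set(fy):
            px, py = fx.get(allele, 0.0), fy.get(allele, 0.0)
            jx += px * px
            jy += py * py
            jxy += px * py
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical genotypes at one SNP in two populations.
pop_a = ["AA", "AG", "AG", "GG"]
pop_b = ["AA", "AA", "AA", "AG"]
freqs_a, ho_a, he_a, shannon_a = locus_stats(pop_a)
freqs_b, ho_b, he_b, shannon_b = locus_stats(pop_b)
d = nei_distance([freqs_a], [freqs_b])
print(round(ho_a, 4), round(he_a, 4), round(shannon_a, 4), round(d, 4))
```

In practice the per-locus values would be averaged over all 214 loci, and the distance would be accumulated over all loci rather than one.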

Results

Development and Transferability of SNPs

In the study, 262 putative SNPs were selected for validation. Among these, 57 SNPs (22%) were polymorphic and considered as validated. Information about the panel of loci is summarized in Table 2. The 57 substitutions included 41 transitions and 16 transversions. Of the polymorphic SNPs, 30 (52.6%) could not be annotated, 53 (93.0%) were located in the coding region, and 4 (7.0%) in the UTR. Eighteen of the 53 SNPs located within the coding region were nonsynonymous and 35 synonymous.
Table 2

Characterization of 57 polymorphic EST–SNPs derived from Crassostrea gigas.

SNP name | Accession no. | Primer sequences (5′-3′) | Amplicon length (bp) | SNP type and location | Type | Annotation
CgSNP879 | HS148847 | F: ACTGGTCTCACCCCCATCAC; R: AGTCCTATTCACTTCACTGCTGC | 60 | C/T (487) | S (Pro) | Unknown
CgSNP880 | HS140594 | F: AAGTGGTCATCGAAAAAGGTCTTC; R: CGGCGAGGTATTTAGACTTCTCC | 90 | G/T (632) | S (Leu) | Glutathione S-transferase theta-1
CgSNP882 | HS243771 | F: TTAGAACCGATAATCCAAGGAAGTC; R: ACAATCATCTTTACTATTTTCTCTGCC | 76 | A/G (209) | S (Glu) | ADP-ribosylation factor-like protein 15
CgSNP886 | HS210205 | F: TCTGGAAATACAATCTGCTGGC; R: CCTGGCTTTGATGAGGGCTT | 71 | C/T (221) | S (Asp) | hypothetical protein CGI_10018860
CgSNP890 | HS236510 | F: CGGAGTCGAATGAAACAGGAT; R: TAGGTCTGATACATTGAAGTAAGCG | 77 | A/G (112) | S (Pro) | Unknown
CgSNP891 | HS236510 | F: TCTACATCGAAGGACAATTTTCAAG; R: TTCCCGTTTCGGATATACAGACT | 70 | G/T (250) | N (Ser-Arg) | Unknown
CgSNP895 | FP008693 | F: CTCGGTCTCAGTCATTGCGG; R: GATTTCTCCTCTATCCTGCTTTCC | 67 | A/G (82) | S (Met) | Unknown
CgSNP900 | HS238336 | F: TCCTGATAACATTGCTGTGTTTG; R: GTAGTTCATTGCTACCCATGATGC | 70 | A/C (166) | S (Gly) | Protein BAT5
CgSNP909 | CU682103 | F: TTACAATTCAGAACAGGACAATGG; R: ACAAACTTTGAGTCTATGACTCGGT | 74 | A/T (207) | S (Leu) | Macrophage mannose receptor 1
CgSNP913 | HS167108 | F: TGTTGGGAACGATTCATACGG; R: CATTTCGGTGTTCACGATTGG | 77 | C/T (271) | S (Asp) | hypothetical protein CGI_10025728
CgSNP915 | FQ661219 | F: CCAATCCAGTGCCAAAGTCTC; R: CAGCAACTAAATGGTCCACATAAC | 80 | A/G (317) | S (Glu) | Unknown
CgSNP917 | HS175405 | F: TTGTCCTTGTTAATTACTGCATTGC; R: GCCTAGTTTGCGTAGGAGAGAG | 70 | C/T (226) | S (Cys) | Unknown
CgSNP924 | HS175248 | F: GCGGAGTCGGAGCATCAG; R: TCAGGTCGTGGTTCCTCTTCAT | 58 | C/T (261) | S (Cys) | ELKS/RAB6-interacting/CAST family member 1
CgSNP936 | CU993732 | F: CACACAAGAAGAAAACGCACAAGAT; R: TGGTAAAAGATGTCAGGAACAAGGT | 86 | A/G (604) | S (Glu) | Phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase TPTE2
CgSNP940 | HS223847 | F: ATCACGACTGTAGGGCAGAGATTAT; R: AGGTTTGGATTGAGCTTTTGTCTAG | 81 | G/T (202) | N (Gln-His) | Unknown
CgSNP942 | HS191752 | F: CCTCGGATCTGTTGATTGCTATT; R: TGTTCTGCCAGGGTATGTTCG | 72 | A/G (564) | S (Pro) | Complement C1q tumor necrosis factor-related protein 3
CgSNP949 | HS231194 | F: CATCTCAGGGAAATGGAAGG; R: AAGAAACAAAATAATGAAGAGCG | 72 | C/T (491) | S (Tyr) | Tetratricopeptide repeat protein 17
CgSNP958 | HS109673 | F: AATCCTTGATGAGCCGACG; R: CCCTCCCTGGAATTTCAGTAT | 82 | C/T (715) | S (Ala) | ATP-binding cassette sub-family F member 3
CgSNP970 | HS206217 | F: AAGAGATTTTATTGTAGAAGTTGACATAT; R: CATACCAAAAGAATCAATGAATACTC | 88 | G/T (125) | S (Ser) | Unknown
CgSNP980 | HS201459 | F: AAGACTGTGTGACGGTTCAGATG; R: AGCAGTGAAATGTTGGCGAT | 82 | A/G (471) | S (Ser) | Unknown
CgSNP989 | HS242001 | F: GCAGTGCATGTGGATGAGTAAGT; R: CGCCATAAAGTTGAAAGTATTGAAC | 81 | G/T (184) | UTR | Unknown
CgSNP990 | HS227296 | F: GGTTCCATTAAGCCATCCATTG; R: GCAGACAGTATCAGCAGTCGTTG | 71 | C/T (586) | UTR | Unknown
CgSNP994 | HS227373 | F: TGTATTTCAAGGCGTGTTACAGTG; R: ACTCATCAGTCAAGGGACAACAAG | 84 | C/T (694) | N (Cys-Arg) | Unknown
CgSNP1003 | CU996515 | F: GTGAGAGACTGATGAGTGCCTGT; R: TATGAGTGATCAGGAATTCTGTAGC | 72 | C/T (529) | S (Gly) | Unknown
CgSNP1010 | FQ668992 | F: TCAAATCAAATCTGAACGGCG; R: CCAGTTATTGTACGGTCCCCAT | 74 | C/T (580) | S (Gly) | Fibrinogen C domain-containing protein 1
CgSNP1016 | CU997800 | F: ATGTGATTGTCTCTTGAGAATGTGT; R: CAGAGATGAAACCAGTATGTCTGAT | 75 | C/T (588) | N (Val-Ala) | Unknown
CgSNP1019 | HS116482 | F: TCAGACACGGAGGGAAAATG; R: TCTTTGTCCTCTTTCCAAGTGTG | 98 | C/T (430) | S (Pro) | 39S ribosomal protein L45, mitochondrial
CgSNP1021 | HS122227 | F: AGCCCACTGGAGGAAGAACC; R: GGTATTCGGGATTGAATCTGTG | 58 | C/T (154) | N (Val-Ala) | Unknown
CgSNP1023 | HS235875 | F: GCACTACATATCATACCAGACTGTG; R: GTTTGTAAAATAATGCCCATAACTG | 104 | A/G (377) | N (Asn-Asp) | Putative arylformamidase
CgSNP1024 | FQ666947 | F: GTCTAGGAGTTATTTCCCTTTGATG; R: TGGATTTAGTGTTCACCAGTACAAG | 98 | A/C (554) | UTR | Unknown
CgSNP1028 | CU997294 | F: ACAGACAAAATGACAAGAAAACAAC; R: CAGTGACCTCAGCAGCCATC | 76 | C/T (606) | S (Ser) | Unknown
CgSNP1029 | CU997294 | F: CTCTCACACCAGATATTTCCAGCAT; R: CTTCCTTTCAAGGTCACAATCACAC | 82 | A/G (630) | N (Lys-Glu) | Unknown
CgSNP1034 | HS180370 | F: CCTGTCTTTTAACACTGTTTCTGAT; R: GTCAGGACGTTTTCTGCTTTC | 97 | A/T (322) | N (Trp-Ser) | Unknown
CgSNP1037 | HS248681 | F: CCAAAGTGTACGCTGTAAGGAACC; R: CGTCAATGCTGATGGACAAGG | 81 | G/T (161) | S (Ser) | Unknown
CgSNP1038 | HS239387 | F: GCTATACCTTGTCCATCAGCATTG; R: CATTAGTGTTGTTCACGGGGAG | 60 | C/T (708) | N (Lys-Glu) | Ufm1-specific protease 1
CgSNP1042 | HS229886 | F: AAGTCAGTGAAGAGCCACAAAC; R: AAACCTCATTAAATCCCAAGTGT | 84 | A/C (280) | S (Ser) | Interleukin-1 receptor-associated kinase 1
CgSNP1043 | FP004709 | F: CAAGTTCCGAATGAAATACCTTCT; R: CTCAAAATAGCTGTCCCTGTGTG | 85 | C/T (554) | N (Tyr-Cys) | hypothetical protein CGI_10008375
CgSNP1045 | FP004709 | F: GACAGATAACAACTCTCAAGCAAAC; R: CACATATCGTTACGAAACCGAG | 67 | A/C (688) | S (Leu) | hypothetical protein CGI_10008375
CgSNP1047 | HS233108 | F: TCTGGAGGCTGTATGCTGAGTT; R: CTTTTGTTGTGTTTCCGCTGT | 65 | A/G (364) | S (Gln) | Tetratricopeptide repeat protein 27
CgSNP1050 | CU986514 | F: CAAGTGTCCTGTATGTTGACAGTC; R: GATAAAATTACATCCCCACTCTCTT | 64 | A/G (754) | N (Met-Val) | Mu-crystallin-like protein
CgSNP1052 | HS170919 | F: TCCTGTTGCATCAGTATTCAAGATT; R: AAGCCTCAAAGTATGACCAGCAC | 87 | A/T (233) | N (Leu-Ter) | Unknown
CgSNP1054 | FP010213 | F: GTAGCTTGGATATTACTGTGAGGC; R: CATGGAAATCTCGGTATAAACTTG | 77 | G/T (205) | UTR | Unknown
CgSNP1055 | FP010213 | F: GATGAGTGCTTACATCAATCTGAGT; R: CAAGACACAAAAACACATGCTTATAC | 92 | C/T (371) | N (Met-Thr) | Unknown
CgSNP1056 | HS162699 | F: GCTGTTTGGTCTGGTGTTTGT; R: TTGAAAGCATGAAGATTTCTATCAC | 79 | A/G (567) | N (Asn-Asp) | Unknown
CgSNP1058 | CU997792 | F: AAGGAAATTCCCTGCACAAAC; R: GTCCACACAAGATAAAAGAGAAGAG | 78 | A/C (967) | N (Ile-Leu) | GTP-binding protein GEM
CgSNP1061 | CU986467 | F: CAGAGGACCAGTTTGAGGCTT; R: CTTGTTTGAGTTTGTCTGCGG | 63 | A/G (798) | S (Val) | BCCIP-like protein
CgSNP1069 | HS137887 | F: CGTGGAAATTCTGTGTAAATAGGAC; R: CTTCGGTTCGATTATGCTGC | 76 | C/T (377) | N (Ser-Gly) | Unknown
CgSNP1073 | HS139503 | F: GCTGCCAGTTTTTCTCATTCAC; R: AACCAAGGACACATACGGACAAC | 74 | C/T (523) | S (Thr) | 15-hydroxyprostaglandin dehydrogenase [NAD+]
CgSNP1074 | HS220139 | F: CATGGTGACTAAATCTTCAATGTTGT; R: AAGGCTGTGAGTAGAGGTTTGGC | 87 | A/G (465) | S (Ser) | Exosome component 10
CgSNP1077 | AM853850 | F: CTGAGGCACAAAGTCTGGGTAGT; R: GGAGGAGTAGGTGACCGCTTC | 73 | C/T (356) | N (Met-Thr) | Unknown
CgSNP1082 | FP009397 | F: TATTAGGACCACATTCAGCTATGTC; R: ATTGATGGGGGTGGAGGTAC | 88 | A/G (256) | S (Pro) | Unknown
CgSNP1105 | FP008773 | F: CAAGAGTTGACACCAGAGGGAG; R: CATCAAATACACGATGACCTGAG | 83 | C/T (285) | S (Thr) | Unknown
CgSNP1115 | HS204076 | F: TCGGTCACTGTTGGATTTCTG; R: GAACAACCCGAATTCACGACC | 84 | A/G (378) | S (Leu) | Heat shock 70 kDa protein 12B
CgSNP1117 | HS116629 | F: TAGTAAAGGCTAAACAAAGTGTGCT; R: AGGGAGAGTCCGAGATGTCAC | 71 | G/T (337) | S (Val) | Unknown
CgSNP1118 | CU998279 | F: GACGAGTGAACGAGTACGGC; R: TGGTCTATACGCAGAATAAGGAAT | 65 | C/T (198) | S (Tyr) | Protocadherin-19
CgSNP1130 | HS142312 | F: CAAGGGACAGAGTTCAATGTCTTCT; R: TGACAGGATTTCTTGCATCTTTACC | 86 | A/G (585) | S (Leu) | Unknown
CgSNP1131 | HS225071 | F: ATGTGCTTTTTACCCGAACTGC; R: ACCTGTTTTGGTTGCTCGTCTT | 63 | A/G (477) | N (Asp-Asn) | Poly [ADP-ribose] polymerase 12

Note: S, synonymous; N, non-synonymous; UTR, untranslated region.
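
The split of the 57 substitutions into transitions and transversions can be checked directly from the "SNP type" column of Table 2. The sketch below is an illustration, not part of the original analysis: a substitution is treated as a transition when both alleles are purines (A, G) or both are pyrimidines (C, T), and as a transversion otherwise. The function and variable names are hypothetical; the allele pairs are copied from Table 2 in row order.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(snp):
    """Classify an 'X/Y' SNP as 'transition' or 'transversion'."""
    first, second = snp.upper().split("/")
    pair = {first, second}
    if len(pair) != 2 or not pair <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a biallelic nucleotide SNP: {snp!r}")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Allele pairs of the 57 SNPs, in the row order of Table 2.
TABLE2_SNP_TYPES = """
C/T G/T A/G C/T A/G G/T A/G A/C A/T C/T
A/G C/T C/T A/G G/T A/G C/T C/T G/T A/G
G/T C/T C/T C/T C/T C/T C/T C/T A/G A/C
C/T A/G A/T G/T C/T A/C C/T A/C A/G A/G
A/T G/T C/T A/G A/C A/G C/T C/T A/G C/T
A/G C/T A/G G/T C/T A/G A/G
""".split()

class_counts = Counter(substitution_class(s) for s in TABLE2_SNP_TYPES)
print(dict(class_counts))
```

Tallying the column this way gives 41 transitions and 16 transversions, in agreement with the counts reported in the text.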

A total of 377 SNPs of C. gigas, including 320 previously developed SNPs [24]–[26] and the 57 new SNPs developed here, were used to test transferability in four other Crassostrea species: C. sikamea, C. angulata, C. hongkongensis and C. ariakensis. The basic information obtained with each SNP is shown in Table S1. Out of the 377 primer pairs tested, 311 (82.5%) primers showed amplification in C. sikamea, 353 (93.6%) in C. angulata, 254 (67.4%) in C. hongkongensis, 253 (67.1%) in C. ariakensis and 377 (100%) in C. gigas. Using the 377 primer pairs, 256 (67.9%) SNP loci were polymorphic in C. sikamea, 306 (81.2%) in C. angulata, 133 (35.3%) in C. hongkongensis, 119 (31.6%) in C. ariakensis and 335 (88.9%) in C. gigas (Table 1). In total, 214 SNPs amplified successfully in all five Crassostrea species and 48 SNPs were polymorphic in all five species.
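
The percentages above are simple proportions of the 377 primer pairs tested, and the set of loci transferable to every species is the intersection of the per-species sets of amplified loci. The sketch below recomputes the percentages from the Table 1 counts; the per-locus sets used for the intersection are hypothetical toy data, because the per-locus results are given only in Table S1. All names are illustrative.

```python
TOTAL_LOCI = 377  # primer pairs tested

# (number amplified, number polymorphic) per species, copied from Table 1.
TABLE1_COUNTS = {
    "C. sikamea": (311, 256),
    "C. angulata": (353, 306),
    "C. hongkongensis": (254, 133),
    "C. ariakensis": (253, 119),
    "C. gigas": (377, 335),
}


def percent(count, total=TOTAL_LOCI):
    """Percentage of the tested loci, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


rates = {species: (percent(amplified), percent(polymorphic))
         for species, (amplified, polymorphic) in TABLE1_COUNTS.items()}


def shared_loci(amplified_by_species):
    """Loci that amplified in every species (set intersection)."""
    return set.intersection(*amplified_by_species.values())


# Hypothetical toy example with four loci (L1..L4), not data from Table S1.
toy_amplified = {
    "C. sikamea": {"L1", "L2", "L3"},
    "C. angulata": {"L1", "L2", "L3", "L4"},
    "C. hongkongensis": {"L1", "L3"},
    "C. ariakensis": {"L1", "L3", "L4"},
}
print(rates["C. sikamea"], sorted(shared_loci(toy_amplified)))
```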

Phylogenetic Relationships

A total of 214 SNPs were used for the phylogenetic analysis. Information on the 214 SNPs evaluated in the 5 species is shown in Table 3. The values of observed heterozygosity (Ho) and expected heterozygosity (He) ranged from 0.0792 (C. hongkongensis) to 0.2895 (C. gigas) and from 0.1026 (C. hongkongensis) to 0.3229 (C. gigas), respectively. Shannon's Information index and the number of polymorphic loci ranged from 0.1664 (C. ariakensis) to 0.4749 (C. gigas) and from 99 (C. ariakensis) to 201 (C. gigas), respectively. Nei's genetic distance values ranged from 0.0738 (C. angulata and C. gigas) to 0.2728 (C. hongkongensis and C. gigas) (Table 4). All Fst estimates were statistically significant (P<0.01). Pairwise Fst ranged from 0.1230 (C. angulata and C. gigas) to 0.5257 (C. hongkongensis and C. ariakensis). The phylogenetic tree separated the five species into two clusters (Figure 1). The first cluster included two species, C. hongkongensis and C. ariakensis. This clade was sister to the clade containing C. sikamea, C. angulata and C. gigas. In this clade, C. gigas and C. angulata had the closest relationship, with C. sikamea being the sister group. Phylogenetic analysis using the unweighted pair-group method with arithmetic mean (UPGMA) generated an identical topology with high support values (data not shown).
Table 3

Characterization of 214 polymorphic EST-SNPs evaluated from 5 Crassostrea species.

Species | SI | Ho | He | Number polymorphic | Percent polymorphic
C. sikamea | 0.3617±0.2442 | 0.1942±0.1792 | 0.2394±0.1827 | 176 | 82.24
C. angulata | 0.4196±0.2325 | 0.2538±0.1924 | 0.2829±0.1763 | 186 | 86.92
C. hongkongensis | 0.1691±0.2040 | 0.0792±0.1197 | 0.1026±0.1383 | 111 | 51.87
C. ariakensis | 0.1664±0.2166 | 0.1021±0.1627 | 0.1039±0.1469 | 99 | 46.26
C. gigas | 0.4749±0.2000 | 0.2895±0.1860 | 0.3229±0.1569 | 201 | 93.93

Note: SI, Shannon's Information index; He, expected heterozygosity; Ho, observed heterozygosity.

Table 4

Pairwise Nei's genetic distance (lower diagonal) and Fst values (upper diagonal) among 5 Crassostrea species using 214 SNPs.

Species | C. sikamea | C. angulata | C. hongkongensis | C. ariakensis | C. gigas
C. sikamea | - | 0.2486 | 0.5250 | 0.5045 | 0.2662
C. angulata | 0.1327 | - | 0.4827 | 0.4824 | 0.1230
C. hongkongensis | 0.2641 | 0.2525 | - | 0.5257 | 0.4707
C. ariakensis | 0.2420 | 0.2535 | 0.1396 | - | 0.4402
C. gigas | 0.1607 | 0.0738 | 0.2728 | 0.2380 | -
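
As an illustration of how a distance matrix of this kind translates into a tree, the following sketch applies UPGMA (average-linkage clustering) to the Nei's distances in the lower diagonal of Table 4. It is a simplified stand-in: the published tree was built with the NJ method in MEGA 5.05 and POPTREE2, and no bootstrap support is computed here. The function names and the nested-tuple tree representation are choices made for this example only.

```python
from itertools import combinations

SPECIES = ["C. sikamea", "C. angulata", "C. hongkongensis",
           "C. ariakensis", "C. gigas"]

# Nei's genetic distances from the lower diagonal of Table 4.
NEI = {
    ("C. angulata", "C. sikamea"): 0.1327,
    ("C. hongkongensis", "C. sikamea"): 0.2641,
    ("C. hongkongensis", "C. angulata"): 0.2525,
    ("C. ariakensis", "C. sikamea"): 0.2420,
    ("C. ariakensis", "C. angulata"): 0.2535,
    ("C. ariakensis", "C. hongkongensis"): 0.1396,
    ("C. gigas", "C. sikamea"): 0.1607,
    ("C. gigas", "C. angulata"): 0.0738,
    ("C. gigas", "C. hongkongensis"): 0.2728,
    ("C. gigas", "C. ariakensis"): 0.2380,
}


def pair_distance(a, b):
    """Look up the distance between two species regardless of key order."""
    return NEI[(a, b)] if (a, b) in NEI else NEI[(b, a)]


def cluster_distance(members_a, members_b):
    """UPGMA distance: mean of all between-cluster pairwise distances."""
    total = sum(pair_distance(a, b) for a in members_a for b in members_b)
    return total / (len(members_a) * len(members_b))


def upgma(labels):
    """Return the UPGMA topology as nested tuples."""
    clusters = [(label, (label,)) for label in labels]  # (subtree, members)
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_distance(clusters[ij[0]][1],
                                                   clusters[ij[1]][1]))
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i + members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


tree = upgma(SPECIES)
print(tree)
```

The clustering first joins C. angulata with C. gigas, then C. hongkongensis with C. ariakensis, and then attaches C. sikamea to the C. angulata/C. gigas pair, which is the same grouping described in the text.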
Figure 1

Phylogenetic tree of five Crassostrea species using neighbor joining (NJ) method based on Nei's genetic distance derived from 214 SNPs.

Numbers above branches indicate bootstrap values from NJ analysis using both MEGA 5.05 and POPTREE2 software.


Outlier SNPs

Loci showing higher or lower differentiation with respect to the simulated confidence intervals are identified as candidates for positive or balancing selection [33]. The Arlequin fdist method revealed 10 candidate SNPs (CgSNP28, CgSNP230, CgSNP273, CgSNP415, CgSNP420, CgSNP515, CgSNP524, CgSNP544, CgSNP669 and CgSNP805) for selection, including 7 for positive selection and 3 for balancing selection (Table 5 and Figure 2a). In addition, the hierarchical method detected 11 outlier loci (CgSNP14, CgSNP203, CgSNP803, CgSNP273, CgSNP415, CgSNP420, CgSNP515, CgSNP524, CgSNP544, CgSNP669 and CgSNP805) for selection, including 9 for positive selection and 2 for balancing selection (Table 5 and Figure 2b). Both approaches revealed 8 SNPs lying outside the 99% confidence region of the conditional joint distribution of Fst and heterozygosity, including 6 for positive selection and 2 for balancing selection. Among the 8 SNPs, 5 located within the coding region were synonymous and 3 nonsynonymous. The putative function of three genes (UPF0686 protein, ankyrin repeat domain-containing protein 60, and hypothetical protein CGI_10016494) could not be identified using GO searches. The other five proteins (endoglucanase, rho-related GTP-binding protein, flap endonuclease 1-A, hypothetical protein CGI_10023940 and Chlorophyllase-2) were respectively involved in carbohydrate metabolism, GTPase-mediated signal transduction, DNA repair, DNA binding and chlorophyll catabolic process.
Table 5

Outlier SNPs detected using the finite island model and hierarchical island model for Fst calculation.

Locus | Finite island model: Ho | Fst | P | Hierarchical island model: Ho | Fst | P | Type | Annotation
CgSNP28 | 0.5384 | 0.7438 | 0.0063 | - | - | - | S (Asn) | Pannexin3
CgSNP230* | 0.3396 | 0.0297 | 0.0047 | - | - | - | N (Ser/Glu) | Unknown
CgSNP14 | - | - | - | 0.5687 | 0.8787 | 0.0090 | N (Pro/Leu) | Unknown
CgSNP203 | - | - | - | 0.7503 | 0.7636 | 0.0089 | N (Arg/Cys) | Adenylate kinase 2, mitochondrial-like
CgSNP803 | - | - | - | 0.6996 | 0.7928 | 0.0028 | S (Ala) | Eukaryotic translation initiation factor 6
CgSNP273 | 0.4505 | 0.7763 | 0.0140 | 0.5985 | 0.8316 | 0.0100 | S (Val) | UPF0686 protein C11orf1-like protein
CgSNP415 | 0.5893 | 0.7905 | 0.0079 | 0.7922 | 0.8442 | 0.0024 | S (Ala) | Endoglucanase
CgSNP420* | 0.1962 | 0.0085 | 0.0023 | 0.1947 | 0.0011 | 0.0003 | S (Ala) | Rho-related GTP-binding protein RhoU
CgSNP515 | 0.4607 | 0.7619 | 0.0117 | 0.6008 | 0.8174 | 0.0116 | N (Ile-Met) | Flap endonuclease 1-A
CgSNP524 | 0.5822 | 0.9492 | 0.0000 | 0.5492 | 0.9462 | 0.0035 | S (Arg) | Ankyrin repeat domain-containing protein 60
CgSNP544 | 0.5825 | 0.7967 | 0.0070 | 0.7682 | 0.8459 | 0.0021 | S (Ser) | Hypothetical protein CGI_10023940
CgSNP669* | 0.3672 | 0.0135 | 0.0014 | 0.3717 | 0.0254 | 0.0063 | N (Asp-Gly) | Chlorophyllase-2
CgSNP805 | 0.5832 | 0.8139 | 0.0046 | 0.7913 | 0.8628 | 0.0016 | N (Ser-Ala) | Hypothetical protein CGI_10016494

Note: * balancing selection; Ho, observed heterozygosity; -, locus not detected as an outlier under that model.
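
The overlap between the two outlier tests can be reproduced with plain set operations on the locus lists given in the text. The sketch below is only a bookkeeping illustration: it does not rerun the Arlequin simulations, and the variable names are chosen for this example. Loci marked with an asterisk in Table 5 are treated as candidates for balancing selection.

```python
# Outlier loci reported by each method (from the text and Table 5).
FDIST = {"CgSNP28", "CgSNP230", "CgSNP273", "CgSNP415", "CgSNP420",
         "CgSNP515", "CgSNP524", "CgSNP544", "CgSNP669", "CgSNP805"}
HIERARCHICAL = {"CgSNP14", "CgSNP203", "CgSNP803", "CgSNP273", "CgSNP415",
                "CgSNP420", "CgSNP515", "CgSNP524", "CgSNP544", "CgSNP669",
                "CgSNP805"}
# Loci flagged with "*" in Table 5.
BALANCING = {"CgSNP230", "CgSNP420", "CgSNP669"}


def locus_number(name):
    """Numeric suffix of a locus name, used for natural sorting."""
    return int(name.replace("CgSNP", ""))


shared = FDIST & HIERARCHICAL            # detected by both methods
shared_balancing = shared & BALANCING    # low-Fst outliers among them
shared_positive = shared - BALANCING     # high-Fst outliers among them

print(sorted(shared, key=locus_number))
print(len(shared_positive), len(shared_balancing))
```

The intersection contains the eight loci discussed in the text, six of them candidates for positive selection and two for balancing selection.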

Figure 2

Plot of Fst against heterozygosity for 214 SNPs analysed with the fdist (a) and hierarchical (b) methods.

The upper and lower lines are the 99% confidence intervals.


Discussion

A total of 48769 potential SNPs were detected by mining the C. gigas EST database [25]. In our studies, the 1283 putative SNPs selected for validation allowed the development of 57 new SNPs bringing the total to 377 SNPs that have been validated in this species [24]–[26],. Among the 377 SNPs, 66 SNPs are known to be distributed in 8 linkage groups of C. gigas [26]. Compared to the use of several (often partial) genes, the adequate number of EST–SNPs, distributed in almost all linkage groups of C. gigas, may provide more genetic information which is valuable for phylogenetic analyses. The high cross-species transferability of the set of 377 EST-SNPs of C. gigas tested in four other Crassostrea species also suggests their potential utilization in evolutionary analysis across taxa of the genus Crassostrea. Moreover, mutations resulting in some SNPs can be responsible for an adaptive phenotype or the direct target of selection. Studies have shown that variation in allele frequencies at some outlier SNP loci can be correlated with environmental variables, such as salinity and temperature [34], [35]. Consequently, the SNP markers offer a valuable opportunity to understand the genetic basis of phenotypic variation in relation to environmental variation. In general, the more evolutionarily distant the taxa, the less successful is cross amplification [36], [37]. In a previous study, 15 EST-SSRs developed for C. gigas amplified successfully in at least one species, with C. sikamea sharing 14 (93.3%) primer pairs, C. hongkongensis 12 (80.0%), and C. ariakensis 11(73.3%) [38]. Hedgecock et al. [39] tested 86 genomic SSRs developed for C. gigas in cross-species amplification, 83 (96.5%) were likely useful for C. angulata, 71 (82.6%) for C. sikamea and 31 (36.0%) for C. ariakensis. Our data also showed C. angulata (93.6%) and C. sikamea (82.5%) had higher cross-amplification rates than both C. hongkongensis (67.4%) and C. ariakensis (67.1%). These results suggest that C. 
gigas has a closer relationship with C. angulata and C. sikamea than with C. hongkongensis and C. ariakensis. The taxonomy of Crassostrea has been studied for many years, but confusions still exist. There is an open debate as to whether C. gigas and C. angulata are distinct species [9], [13], [40], [41]. Some experts have argued that they are different species but genetically closely related [12], [40], [41], but other phylogenetic analyses suggest that the two should be considered one species [9], [42]. In our study, C. gigas and C. angulata were recovered as separate clades, suggesting that C. gigas and C. angulata may be two distinct species. However, the low Nei's genetic distance value between C. angulata and C. gigas (0.0738) indicates a very close relationship between them. Furthermore, C. angulata and C. gigas can cross-fertilize without any difficulty in the laboratory and form viable, fertile offspring [43]–[45]. Therefore, we still can not conclude that C. gigas and C. angulata are two distinct species. A large amount of the two species sampled from a wide geographic range and the same locations are required to better resolve this problem. Another species, C. hongkongensis has been routinely misidentified as C. ariakensis for a long time. In our study, C. hongkongensis and C. ariakensis were recovered as separate clades. Moreover, the Nei's genetic distance between C. hongkongensis and C. ariakensis (0.1396) was a little higher than that observed between two closely related sister species (between C. angulata and C. sikamea, 0.1327). The above data suggest that C. hongkongensis and C. ariakensis are two distinct species. Yu & Li [14] analyzed the complete mitochondrial DNA sequence and determined that C. hongkongensis and C. ariakensis are two separate species. Reece et al. [9] also suggested that the C. ariakensis sequences formed a distinct clade from C. hongkongensis in the COI tree. Therefore, we can conclude that C. hongkongensis and C. 
ariakensis are two separate species. Identifying the regions of the genome that are shaped by adaptation to different environments can be relevant to answering several important questions in evolutionary biology. Among many selection detection strategies, Fst outlier approaches are becoming widely used in identifying genes without known phenotypes that are under selection [33], [46], [47]. These methods can identify relatively highly differentiated markers (so-called outlier loci) in comparison to expected levels under neutrality inferred from coalescent simulations [48], [49]. Strong outlier patterns have been classically interpreted as being caused by divergent selection affecting the loci themselves or genes strongly linked with them [50]. Indeed, an alternative explanation for strong genetic divergence at some loci exists and is difficult to rule out when the tests are being made on comparisons of distinct species. Bierne et al. [51] advocate the role of pre- or postzygotic genetic barriers in genetic divergence. Such endogenous barriers could be the consequence of incompatibilities between combinations of alleles, established through selective mechanisms that are independent from adaptation to habitats [35]. To increase confidence in the conclusions reached, two-island models and a high confidence level (99%) were used in the Fst outlier analysis. Eight loci were identified as being possible targets of selection following two Fst outlier tests. Among the 8 SNPs, 5 located within the coding region were synonymous and 3 nonsynonymous. While nonsynonymous outlier SNPs are particularly interesting due to the potential effect of amino acid changes on protein structure and function, synonymous SNPs should not be simply dismissed as false-positives. This is because natural selection may affect synonymous codon usage in some genes, leading to codon usage bias [52], [53]. 
Furthermore, there is increasing evidence that silent mutations may have functional effects on translational efficiency and accuracy, or on mRNA stability and splicing. Another explanation is that synonymous outlier SNPs might carry the footprint of selection on a closely linked beneficial allele.

In marine environments, environmental factors such as temperature, salinity, pH and dissolved oxygen often interact in complex ways, leading to a complicated ‘fitness landscape’. In our study, C. angulata and C. gigas were sampled from coastal zones, whereas C. sikamea, C. hongkongensis and C. ariakensis were sampled from estuarine zones. Moreover, the five species were collected from 5 sites across 13° of latitude along the coast of China. Therefore, water temperature and salinity may be environmental variables relevant to fitness. The importance of the cytoskeleton in adaptation to water temperature and salinity is well known [54]–[56]. Major players in cytoskeletal remodeling are rho-GTPases, upstream molecular switches that trigger signaling cascades targeting cytoskeletal effector proteins to induce morphological change [57]. Another key aspect of the cell stress response is modulation of pathways of energy metabolism [58]. The data presented here reveal that two genes with outlier SNPs (endoglucanase and rho-related GTP-binding protein) are involved in carbohydrate metabolism and GTPase-mediated signal transduction, respectively. Furthermore, the ankyrin repeat domain-containing protein 60 may be involved in cytoskeletal motility regulation [59]. Although the genomic scan provides an encouraging result, association genetics and functional studies are ultimately required to confirm that particular loci are involved in responding to environmental variation.

In summary, a total of 57 SNPs were developed from EST sequences of C. gigas using the HRM method. The study confirmed high cross-species transferability of the set of 377 C. gigas EST-SNPs tested in four other Crassostrea species. Additionally, the current study represents an initial attempt at resolving phylogenetic relationships in Crassostrea species using a large collection of cross-species SNP markers. The NJ analysis revealed two main groups among the five Crassostrea species. The first clade comprised C. hongkongensis and C. ariakensis, which were sister species. This clade was sister to the clade containing C. sikamea, C. angulata and C. gigas, within which C. gigas and C. angulata had the closest relationship, with C. sikamea being the sister group. Finally, the work, using Fst outlier approaches, presented evidence for adaptive genetic divergence in Crassostrea species. Further functional studies are needed to confirm the role of these outlier loci or genome segments in Crassostrea species.

Cross-species amplification of 377 SNPs from C. gigas in four other Crassostrea species, including C. sikamea, C. angulata, C. hongkongensis and C. ariakensis. (XLS)
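As a rough illustration of the per-locus differentiation that Fst outlier approaches build on, the following sketch computes a Nei-style Fst for biallelic SNPs from population allele frequencies and flags the most differentiated loci with a simple empirical cutoff. This is a simplification and not the method used in the study, which relied on the fdist and hierarchical tests with neutral expectations from coalescent simulations. The function names, the equal weighting of populations, the quantile cutoff and the allele frequencies below are all hypothetical.

```python
from statistics import mean


def fst_biallelic(freqs):
    """Nei-style Fst for one biallelic locus.

    freqs -- allele frequency of one allele in each population
             (populations are weighted equally; an assumption).
    """
    p_bar = mean(freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is monomorphic overall
    h_within = mean(2 * p * (1 - p) for p in freqs)  # mean within-population
    return (h_total - h_within) / h_total


def flag_outliers(locus_freqs, quantile=0.99):
    """Return per-locus Fst and loci at or above an empirical quantile.

    The empirical quantile stands in for a simulated neutral
    distribution, which a real outlier test would use instead.
    """
    fst = {name: fst_biallelic(f) for name, f in locus_freqs.items()}
    ranked = sorted(fst.values())
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    flagged = [name for name, value in fst.items() if value >= cutoff and value > 0]
    return fst, flagged


# Hypothetical allele frequencies in two populations for four loci.
example = {
    "snp_a": [0.50, 0.50],
    "snp_b": [0.40, 0.60],
    "snp_c": [0.45, 0.50],
    "snp_d": [0.05, 0.95],
}
fst_values, outliers = flag_outliers(example)
print(fst_values)
print(outliers)
```

Ranking loci against their own empirical distribution always flags something, which is why model-based tests compare observed Fst with values simulated under neutrality.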
  33 in total

Review 1.  Joint analysis of demography and selection in population genetics: where do we stand and where could we go?

Authors:  Junrui Li; Haipeng Li; Mattias Jakobsson; Sen Li; Per Sjödin; Martin Lascoux
Journal:  Mol Ecol       Date:  2011-10-14       Impact factor: 6.185

2.  Adaptation and speciation: what can F(st) tell us?

Authors:  Mark A Beaumont
Journal:  Trends Ecol Evol       Date:  2005-06-13       Impact factor: 17.712

3.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

4.  POPTREE2: Software for constructing population trees from allele frequency data and computing other population statistics with Windows interface.

Authors:  Naoko Takezaki; Masatoshi Nei; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2009-12-18       Impact factor: 16.240

5.  INFERRING PHYLOGENIES FROM mtDNA VARIATION: MITOCHONDRIAL-GENE TREES VERSUS NUCLEAR-GENE TREES REVISITED.

Authors:  Guy A Hoelzer
Journal:  Evolution       Date:  1997-04       Impact factor: 3.694

6.  New resources for marine genomics: bacterial artificial chromosome libraries for the Eastern and Pacific oysters (Crassostrea virginica and C. gigas).

Authors:  Charles Cunningham; Jun-ichi Hikima; Matthew J Jenny; Robert W Chapman; Guang-Chen Fang; Chris Saski; Mats L Lundqvist; Rod A Wing; Pauline M Cupit; Paul S Gross; Greg W Warr; Jeff P Tomkins
Journal:  Mar Biotechnol (NY)       Date:  2006-08-11       Impact factor: 3.619

7.  Transcriptome-wide polymorphisms of red abalone (Haliotis rufescens) reveal patterns of gene flow and local adaptation.

Authors:  Pierre De Wit; Stephen R Palumbi
Journal:  Mol Ecol       Date:  2012-10-29       Impact factor: 6.185

8.  Outlier SNP markers reveal fine-scale genetic structuring across European hake populations (Merluccius merluccius).

Authors:  Ilaria Milano; Massimiliano Babbucci; Alessia Cariani; Miroslava Atanassova; Dorte Bekkevold; Gary R Carvalho; Montserrat Espiñeira; Fabio Fiorentino; Germana Garofalo; Audrey J Geffen; Jakob H Hansen; Sarah J Helyar; Einar E Nielsen; Rob Ogden; Tomaso Patarnello; Marco Stagioni; Fausto Tinti; Luca Bargelloni
Journal:  Mol Ecol       Date:  2013-11-18       Impact factor: 6.185

9.  Transcriptomic responses to salinity stress in the Pacific oyster Crassostrea gigas.

Authors:  Xuelin Zhao; Hong Yu; Lingfeng Kong; Qi Li
Journal:  PLoS One       Date:  2012-09-27       Impact factor: 3.240

10.  Unusual conservation of mitochondrial gene order in Crassostrea oysters: evidence for recent speciation in Asia.

Authors:  Jianfeng Ren; Xiao Liu; Feng Jiang; Ximing Guo; Bin Liu
Journal:  BMC Evol Biol       Date:  2010-12-28       Impact factor: 3.260

  4 in total

1.  A second generation SNP and SSR integrated linkage map and QTL mapping for the Chinese mitten crab Eriocheir sinensis.

Authors:  Gao-Feng Qiu; Liang-Wei Xiong; Zhi-Ke Han; Zhi-Qiang Liu; Jian-Bin Feng; Xu-Gan Wu; Yin-Long Yan; Hong Shen; Long Huang; Li Chen
Journal:  Sci Rep       Date:  2017-01-03       Impact factor: 4.379

2.  Genome-wide comparisons reveal evidence for a species complex in the black-lip pearl oyster Pinctada margaritifera (Bivalvia: Pteriidae).

Authors:  Monal M Lal; Paul C Southgate; Dean R Jerry; Kyall R Zenger
Journal:  Sci Rep       Date:  2018-01-09       Impact factor: 4.379

3.  Genetic Characterization of Cupped Oyster Resources in Europe Using Informative Single Nucleotide Polymorphism (SNP) Panels.

Authors:  Sylvie Lapègue; Serge Heurtebise; Florence Cornette; Erwan Guichoux; Pierre-Alexandre Gagnaire
Journal:  Genes (Basel)       Date:  2020-04-21       Impact factor: 4.096

4.  Analysis of Genome-Wide Differentiation between Native and Introduced Populations of the Cupped Oysters Crassostrea gigas and Crassostrea angulata.

Authors:  Pierre-Alexandre Gagnaire; Jean-Baptiste Lamy; Florence Cornette; Serge Heurtebise; Lionel Dégremont; Emilie Flahauw; Pierre Boudry; Nicolas Bierne; Sylvie Lapègue
Journal:  Genome Biol Evol       Date:  2018-09-01       Impact factor: 3.416
