Kord M Kober, Grant H Pogson.
Abstract
BACKGROUND: Comparative genomics studies investigating the signals of positive selection among groups of closely related species are still rare and limited in taxonomic breadth. Such studies show great promise in advancing our knowledge about the proportion and the identity of genes experiencing diversifying selection. However, methodological challenges have led to high levels of false positives in past studies. Here, we use the well-annotated genome of the purple sea urchin, Strongylocentrotus purpuratus, as a reference to investigate the signals of positive selection at 6520 single-copy orthologs from nine sea urchin species belonging to the family Strongylocentrotidae, paying careful attention to minimizing false positives.
Keywords: Comparative genomics; Pathogens; Positive selection; Sea urchins; Sexual conflict; Strongylocentrotus purpuratus; dN/dS ratios
Year: 2017 PMID: 28732465 PMCID: PMC5521101 DOI: 10.1186/s12864-017-3944-7
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of genomic DNA sequencing and alignment coverage
| Species | Illumina platform | Read length | No. reads post-filter | No. properly paired mates | % Bases covered (a) | Mean coverage, complete genome | Mean coverage, single-copy orthologs (b) |
|---|---|---|---|---|---|---|---|
| | HiSeq 2000 | 100 bp PE | 292,505,508 | 207,144,870 | 77.4 | 35.1X | 57.6X |
| | HiSeq 2000 | 100 bp PE | 373,338,930 | 269,556,782 | 80.7 | 45.0X | 58.9X |
| | HiSeq 2000 | 100 bp PE | 146,289,354 | 94,967,354 | 77.7 | 16.2X | 16.2X |
| | HiSeq 2000 | 100 bp PE | 333,848,336 | 241,604,806 | 76.9 | 39.7X | 57.9X |
| | HiSeq 2000 | 100 bp PE | 327,231,340 | 216,769,754 | 71.2 | 39.2X | 63.5X |
| | HiSeq 2000 | 100 bp PE | 348,104,257 | 223,613,387 | 64.1 | 41.9X | 67.0X |
| | HiSeq 2000 | 100 bp PE | 323,937,718 | 240,552,762 | 53.9 | 39.2X | 52.3X |
| | HiSeq 2000 | 100 bp PE | 308,218,933 | 189,969,224 | 56.8 | 37.1X | 91.9X |
| | GA IIX | 150 bp PE | 239,280,430 | 164,442,076 | 96.8 | 44.8X | n.d. (c) |

(a) Percentage of bases in the S. purpuratus reference genome covered with at least one read
(b) Mean coverage across the protein-coding sequences of 6520 single-copy orthologs
(c) Not determined
Fig. 1 Phylogeny of nine species of strongylocentrotid urchins examined in the present study (reproduced from [44]). The species tree was generated from four-fold degenerate sites from 2301 concatenated genes not exhibiting positive selection. Bayesian, maximum-likelihood, and maximum parsimony trees produced identical topologies. Next to each species is information on their distributions (CIR = Circumpolar, NWP = North West Pacific, NEP = North East Pacific) and adult depth ranges [S = Shallow (0–50 m), M = Medium (0–200 m), D = Deep (0–1600 m)]
Comparison of candidate positively selected genes (PSGs) and those not exhibiting positive selection (non-PSGs)
| Category | No. of genes | No. of base pairs | Mean No. of codons | Mean % codons with dN/dS > 1 (a) | Mean LRT score (b) | Mean dN | Mean dS | Mean dN/dS (c) |
|---|---|---|---|---|---|---|---|---|
| PSGs | 1008 | 2,188,170 | 502.4 | 0.0399 | 13.67 | 0.0844 | 0.333 | 0.253 |
| Non-PSGs | 5512 | 8,487,210 | 341.8 | 0.0544 | 1.35 | 0.0503 | 0.340 | 0.148 |
| TOTAL | 6520 | 10,675,380 | 366.6 | 0.0521 | 3.26 | 0.0556 | 0.339 | 0.164 |

(a) Mean percentage of codons with dN/dS ratios > 1 identified by PAML model M8
(b) Mean Likelihood Ratio Test score comparing PAML models M7 and M8
(c) Calculated following Wolf et al. (2009)
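
The likelihood ratio test (LRT) score comparing PAML models M7 and M8 can be illustrated with a minimal sketch. It assumes the usual definition of the statistic, twice the difference in log-likelihood between the two models, and assumes a chi-square reference distribution with 2 degrees of freedom, which is the convention commonly applied to the M7 versus M8 comparison. The source does not state the threshold or degrees of freedom used, the function names are hypothetical, and the log-likelihood values are invented for illustration.

```python
from math import exp


def lrt_statistic(lnl_m7: float, lnl_m8: float) -> float:
    """Return 2 * (lnL_M8 - lnL_M7), the likelihood ratio test statistic."""
    return 2.0 * (lnl_m8 - lnl_m7)


def chi2_sf_2df(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For 2 df the survival function has the closed form exp(-x / 2).
    The choice of 2 df is an assumption, not taken from the source.
    """
    return exp(-stat / 2.0) if stat > 0 else 1.0


# Hypothetical log-likelihoods for one gene (not from the study).
lnl_m7, lnl_m8 = -10250.0, -10240.0
stat = lrt_statistic(lnl_m7, lnl_m8)
p_value = chi2_sf_2df(stat)
print(f"LRT = {stat:.2f}, P = {p_value:.2e}")
```

Under these assumptions, a larger LRT score means that model M8, which allows a class of codons with dN/dS > 1, fits the alignment better than M7.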
Fig. 2 Distributions of rates of nonsynonymous (dN) and synonymous (dS) substitutions and dN/dS ratios for (a) the positively selected genes (PSGs; n = 1008), (b) genes not showing positive selection (non-PSGs; n = 5512), and (c) all genes (n = 6520)
Gene ontology (GO) categories significantly enriched in genes experiencing positive selection
| Category | Description | No. of genes tested | No. of PSGs (a) | E[PSG] (b) | Fold enrichment (c) | P-value (d) |
|---|---|---|---|---|---|---|
| Molecular Function | | | | | | |
| GO:0004252 | Serine-type endopeptidase activity | 36 | 15 | 5.6 | 2.7 | 0.0003 |
| GO:0004930 | G-protein coupled receptor activity | 45 | 15 | 7.0 | 2.2 | 0.0020 |
| GO:0005328 | Neurotransmitter sodium symporter activity | 17 | 9 | 2.6 | 3.4 | 0.0001 |
| GO:0005509 | Calcium ion binding | 140 | 42 | 21.6 | 1.9 | <0.0001 |
| GO:0008484 | Sulfuric ester hydrolase activity | 23 | 10 | 3.6 | 2.8 | 0.0013 |
| Cellular Component | | | | | | |
| GO:0016020 | Membrane | 546 | 120 | 84.4 | 1.4 | 0.0001 |
| Biological Process | | | | | | |
| GO:0006508 | Proteolysis | 147 | 36 | 22.7 | 1.6 | 0.0027 |
| GO:0006836 | Neurotransmitter transport | 17 | 9 | 2.6 | 3.4 | 0.0002 |
| GO:0007156 | Homophilic cell adhesion | 14 | 8 | 2.2 | 3.7 | 0.0003 |

(a) Positively Selected Gene
(b) Expected number of PSGs
(c) Fold enrichment of observed PSGs to E[PSGs]
(d) Empirical P-values were determined using the hypergeometric distribution with 10,000 re-samplings from a total of 6520 genes tested and 1008 identified as PSGs. All remain significant at a False Discovery Rate of 10%
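
The enrichment columns can be reproduced with a short sketch. The expected count E[PSG] is taken here as the number of genes tested in a category multiplied by the overall PSG fraction (1008 / 6520), and fold enrichment as observed over expected; both match the tabulated values. The P-value function is a swapped-in technique: it computes an exact hypergeometric upper tail rather than the empirical 10,000-resampling procedure described in the footnote, so its output is not expected to match the reported P-values exactly. All function names are hypothetical.

```python
from math import comb

TOTAL_GENES = 6520  # genes tested (from the table footnote)
TOTAL_PSGS = 1008   # genes identified as PSGs (from the table footnote)


def expected_psgs(n_tested: int) -> float:
    """Expected number of PSGs in a category of n_tested genes."""
    return n_tested * TOTAL_PSGS / TOTAL_GENES


def fold_enrichment(n_psg: int, n_tested: int) -> float:
    """Observed PSGs divided by the expected number of PSGs."""
    return n_psg / expected_psgs(n_tested)


def hypergeom_upper_tail(n_psg: int, n_tested: int) -> float:
    """Exact P(X >= n_psg) when drawing n_tested genes without replacement.

    This is an assumption-labelled stand-in for the empirical resampling
    P-values reported in the source.
    """
    non_psgs = TOTAL_GENES - TOTAL_PSGS
    upper = min(n_tested, TOTAL_PSGS)
    favourable = sum(
        comb(TOTAL_PSGS, k) * comb(non_psgs, n_tested - k)
        for k in range(n_psg, upper + 1)
    )
    return favourable / comb(TOTAL_GENES, n_tested)


# GO:0004252 (serine-type endopeptidase activity): 36 tested, 15 PSGs.
print(round(expected_psgs(36), 1))        # 5.6
print(round(fold_enrichment(15, 36), 1))  # 2.7
print(hypergeom_upper_tail(15, 36))
```

Using exact integer binomial coefficients keeps the tail probability free of floating-point overflow even for large categories such as GO:0016020 (546 genes).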
Significant enrichment tests for genes experiencing positive selection using the Tu et al. (2012) custom sea urchin gene ontology (GO) level 2 (L2) functional categories
| Category | Subcategory | No. of genes tested | No. of PSGs (a) | E[PSG] (b) | Fold enrichment (c) | P-value (d) |
|---|---|---|---|---|---|---|
| Adhesion | Adhesion_ECMCollagen | 15 | 10 | 2.3 | 4.3 | <0.0001 |
| | Adhesion_ECMFibropellin | 5 | 3 | 0.8 | 3.9 | 0.0632 |
| | Adhesion_ECMNeural | 13 | 5 | 2.0 | 2.5 | 0.0690 |
| | Adhesion_ECMReceptorCadherin | 14 | 8 | 2.2 | 3.7 | 0.0013 |
| | Adhesion_ECMReceptorIgFN3 | 21 | 9 | 3.2 | 2.8 | 0.0056 |
| Biomineralization | Biomineralization_Collagen | 6 | 4 | 0.9 | 4.3 | 0.0168 |
| Defensome | Defensome_TransporterABC | 19 | 6 | 2.9 | 2.0 | 0.0928 |
| Immunity | Immunity_ReceptorScavenger | 21 | 7 | 3.2 | 2.2 | 0.0566 |
| Metabolism | Metabolism_InorganicIon | 107 | 27 | 16.5 | 1.6 | 0.0158 |
| Metalloprotease | Metalloprotease | 84 | 20 | 13.0 | 1.5 | 0.0599 |
| Signaling | Signaling_Notch | 10 | 4 | 1.5 | 2.6 | 0.0947 |

(a) Positively Selected Gene
(b) Expected number of PSGs
(c) Fold enrichment of observed PSGs to E[PSGs]
(d) Empirical P-values were determined using the hypergeometric distribution with 10,000 re-samplings from a total of 6520 genes tested and 1008 identified as PSGs
Summary of the top 15 ranked positively selected genes (PSGs)
| Rank | SPU gene | Name | Protein | Location (a) | No. of codons | dN | dS | dN/dS | LRT score (b) | No. of positively selected codons (c) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SPU_003768 | Sp-3Apcol | Type IV collagen | ECM | 1183 | 0.159 | 0.345 | 0.459 | 233.04 | 77 |
| 2 | SPU_010829 | Sp-Ef2 | Elongation factor 2 | RIB | 800 | 0.153 | 0.473 | 0.322 | 102.62 | 20 |
| 3 | SPU_008159 | Sp-Kcnk13 | Potassium channel | MEM | 405 | 0.079 | 0.209 | 0.378 | 91.68 | 11 |
| 4 | SPU_018532 | Sp-EgfibL | Fibropellin 1b-like | ECM | 439 | 0.268 | 0.307 | 0.873 | 79.67 | 38 |
| 5 | SPU_006534 | Sp-Ebr1_5 | Egg bindin receptor 1-5 | MEM | 691 | 0.174 | 0.368 | 0.473 | 66.79 | 12 |
| 6 | SPU_008462 | Sp-Hypp_548 | Usherin-like | ECM | 684 | 0.212 | 0.298 | 0.710 | 61.85 | 21 |
| 7 | SPU_006645 | Sp-Gdi1_1 | GDP dissociation inhibitor 1 | CYT | 184 | 0.254 | 0.278 | 0.914 | 61.38 | 12 |
| 8 | SPU_000526 | Sp-Ebr1 | Egg bindin receptor 1 | MEM | 1428 | 0.185 | 0.518 | 0.357 | 59.84 | 9 |
| 9 | SPU_009154 | Sp-Hypp_42 | Unknown protein | UNK | 1017 | 0.217 | 0.346 | 0.629 | 58.28 | 19 |
| 10 | SPU_003671 | Sp-Lrp12 | Low density lipoprotein | ECF | 761 | 0.143 | 0.453 | 0.315 | 55.58 | 13 |
| 11 | SPU_018517 | Sp-Hypp_915 | Fibrosurfin-like | ECM | 424 | 0.282 | 0.346 | 0.814 | 54.02 | 15 |
| 12 | SPU_002551 | Sp-Egfiii | Fibropellin c | ECM | 357 | 0.194 | 0.415 | 0.468 | 53.88 | 6 |
| 13 | SPU_016836 | Sp-EMI/Egf | MEGF10-like | MEM | 1709 | 0.110 | 0.271 | 0.408 | 53.41 | 7 |
| 14 | SPU_005187 | Sp-3Acolf | Type IV collagen | ECM | 384 | 0.116 | 0.289 | 0.400 | 51.59 | 5 |
| 15 | SPU_003825 | Sp-14-3-3e | 14-3-3 epsilon | CYT | 152 | 0.090 | 0.255 | 0.354 | 51.20 | 4 |

(a) Cellular location of protein. ECM extracellular matrix, ECF extracellular fluid, MEM membrane or intrinsic to membrane, CYT cytosolic, RIB ribosome, UNK unknown
(b) Mean Likelihood Ratio Test score comparing PAML models M7 and M8
(c) Number of codons with Bayes Empirical Bayes posterior probabilities > 0.95
Summary of the branch-sites test results
| Species | No. of tests | Minimum LRT score | No. of sig. tests | No. overlapping with sites tests (a) | No. of GO categories tested | No. of significantly enriched GO categories |
|---|---|---|---|---|---|---|
| | 5343 | 5.629 | 85 | 23 | 113 | 18 |
| | 5283 | 5.759 | 64 | 18 | 90 | 29 |
| | 5300 | 5.788 | 143 | 54 | 166 | 49 |
| | 5340 | 6.038 | 79 | 28 | 89 | 22 |
| | 5383 | 6.253 | 57 | 17 | 47 | 16 |
| | 5762 | 5.891 | 87 | 26 | 89 | 26 |
| | 5652 | 5.525 | 118 | 42 | 139 | 25 |
| | 5472 | 6.296 | 71 | 25 | 106 | 32 |
| | 5925 | 6.033 | 120 | 44 | 121 | 26 |

(a) Number of PSGs also showing significant branch-sites tests