Shotaro Hirase, Haruka Ozaki, Wataru Iwasaki.
Abstract
BACKGROUND: Understanding the genetic basis of adaptive evolution is one of the major goals in evolutionary biology. Recently, it has been revealed that gene copy number variations (GCNVs) constitute a significant proportion of the genomic diversity within natural populations. However, it has been unclear whether GCNVs are under positive selection and contribute to adaptive evolution. Parallel evolution refers to adaptive evolution of the same trait in related but independent lineages, and the three-spined stickleback (Gasterosteus aculeatus) is a well-known model organism for studying it. Identifying genetic variations under parallel selection, i.e., variations shared among related but independent lineages, provides evidence of positive selection. In this study, we investigated whole-genome resequencing data from the marine and freshwater groups of three-spined sticklebacks from diverse areas along the Pacific and Atlantic Ocean coastlines, and searched for GCNVs under parallel selection.
Year: 2014 PMID: 25168270 PMCID: PMC4159527 DOI: 10.1186/1471-2164-15-735
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schematic diagram of the method for identifying GCNVs likely under parallel selection. (A) Re-sequenced reads (thin lines) from each individual were mapped to the stickleback reference genome (thick lines). (B) The numbers of mapped reads that overlapped with genes were counted, and we searched for genes that showed significant differences in the normalized read numbers between the freshwater (closed circles) and marine groups (open circles) with a false discovery rate (FDR) < 0.05. Genes that showed significant differences under the three mapping options were regarded as GCNVs likely under parallel selection. (C) The number of different allelic sequences was counted for each of the identified GCNVs by enumerating every pair of SNV positions that was located within the read length. If three or more allelic sequences were observed for a gene, the GCNV involved duplications or multiplications.
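Step (C) of Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the read representation (a dict from SNV position to observed base), the function names, and the toy data are all hypothetical, and the sketch assumes the reads come from a single diploid individual, so that three or more distinct two-SNV sequences cannot be explained by two alleles of a single-copy gene.

```python
from itertools import combinations


def max_allelic_sequences(reads, snv_positions, read_length):
    """Largest number of distinct two-SNV sequences seen at any SNV pair.

    reads: list of dicts mapping SNV position -> base observed in that read
           (hypothetical representation, chosen for illustration only).
    """
    best = 0
    for p, q in combinations(sorted(snv_positions), 2):
        if q - p >= read_length:
            continue  # a single read cannot cover both positions
        # Distinct base combinations among reads covering both SNVs.
        sequences = {(r[p], r[q]) for r in reads if p in r and q in r}
        best = max(best, len(sequences))
    return best


def involves_duplication(reads, snv_positions, read_length, min_alleles=3):
    """Apply the 'three or more allelic sequences' rule from the caption."""
    return max_allelic_sequences(reads, snv_positions, read_length) >= min_alleles


# Toy reads (invented data): three distinct sequences at positions 100 and 130.
reads = [
    {100: "A", 130: "C"},
    {100: "A", 130: "T"},
    {100: "G", 130: "C"},
    {130: "C", 400: "G"},
]
result = involves_duplication(reads, [100, 130, 400], read_length=100)
print(result)
```

Only SNV pairs that fit within one read are compared, because phase between two positions can be observed directly only when a single read covers both.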
Figure 2. GCNVs likely under parallel selection. The normalized numbers of mapped reads per 1-Mb gene length for each gene across the genomes of the (A) freshwater and (B) marine groups. Each black point represents the number for each gene in each individual, and the green lines represent the mean values for each gene across individuals. (C) The false discovery rate of the edgeR analysis on the differences in the numbers of mapped reads between the freshwater and marine groups for each gene. Asterisks indicate the positions of the GCNVs under parallel selection (FDR < 0.05).
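The FDR < 0.05 cutoff can be made concrete with a short Python sketch of the Benjamini-Hochberg adjustment. This is an assumption about the procedure: Benjamini-Hochberg is the default adjustment in edgeR's `topTags`, but this record does not state which adjustment the authors used. The gene labels and p-values below are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values (not taken from the study).
genes = ["geneA", "geneB", "geneC", "geneD"]
pvalues = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvalues)
significant = [g for g, q in zip(genes, fdr) if q < 0.05]
print(significant)
```

Each adjusted value is the smallest FDR level at which that gene would still be called significant, which is why a single threshold can be applied across all genes.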
Gene copy number variations likely under parallel selection
| Ensembl gene ID | Linkage group | Start | End | Group having more copies | In divergent regions | Gene annotation |
|---|---|---|---|---|---|---|
| ENSGACG00000014268 | groupI | 21,543,442 | 21,565,537 | Freshwater | Yes | Tensin 1 (TNS1) |
| ENSGACG00000014289 | groupI | 21,600,545 | 21,614,802 | Freshwater | Yes | Serine/threonine kinase 11 interacting protein (STK11IP) |
| ENSGACG00000018214 | groupIV | 11,925,723 | 11,934,224 | Freshwater | No | Kinesin family member 3A (KIF3A) |
| ENSGACG00000019313 | groupIV | 23,928,955 | 23,953,125 | Freshwater | No | Tubulin tyrosine ligase-like family member 12 (TTLL12) |
| ENSGACG00000019321 | groupIV | 23,968,608 | 23,982,358 | Freshwater | Yes | Sulfotransferase family 4A member 1 (SULT4A1) |
| ENSGACG00000020171 | groupVII | 12,721,951 | 12,727,083 | Freshwater | No | Protein phosphatase 1 regulatory (inhibitor) subunit 14A (PPP1R14A) |
| ENSGACG00000014553 | groupXI | 15,607,308 | 15,613,431 | Freshwater | No | Apolipoprotein L 2 (APOL2) |
| ENSGACG00000002886 | groupXIX | 2,446,925 | 2,473,806 | Freshwater | Yes | NLR family CARD domain containing 5 (NLRC5) |
| ENSGACG00000002902 | groupXIX | 2,484,537 | 2,497,605 | Freshwater | Yes | *Myosin heavy chain (MyHC) |
| ENSGACG00000002933 | groupXIX | 2,501,529 | 2,511,962 | Freshwater | Yes | *Myosin heavy chain (MyHC) |
| ENSGACG00000006397 | groupXX | 6,176,973 | 6,190,798 | Freshwater | No | Dopa decarboxylase (aromatic L-amino acid decarboxylase)(DDC) |
| ENSGACG00000002551 | groupXXI | 5,808,646 | 5,870,440 | Freshwater | No | *Rab effector MyRIP-like (MYRIP) |
| ENSGACG00000002682 | groupXXI | 6,189,464 | 6,240,135 | Freshwater | No | Neuropilin (NRP) and tolloid (TLL)-like 1 (NETO1) |
| ENSGACG00000002744 | groupXXI | 6,534,938 | 6,558,550 | Freshwater | No | Junctophilin 1 (JPH1) |
| ENSGACG00000002857 | groupXXI | 7,179,938 | 7,191,684 | Freshwater | No | Carboxypeptidase A6 (CPA6) |
| ENSGACG00000002913 | groupXXI | 7,252,896 | 7,262,425 | Freshwater | No | Minichromosome maintenance domain containing 2 (MCMDC2) |
| ENSGACG00000002918 | groupXXI | 7,255,256 | 7,257,350 | Freshwater | No | *Unknown |
| ENSGACG00000003408 | groupXXI | 7,994,019 | 7,996,973 | Freshwater | No | *Neoverrucotoxin |
| ENSGACG00000015099 | scaffold_68 | 405,524 | 407,382 | Freshwater | No | LSM14B SCD6 homolog B (S. cerevisiae) (LSM14B) |
| ENSGACG00000019508 | groupIV | 25,553,051 | 25,563,391 | Marine | No | Neurexophilin and PC-esterase domain family member 3 (NXPE3) |
| ENSGACG00000020238 | groupVII | 14,778,775 | 14,788,878 | Marine | No | *Gag-Pol polyprotein-like |
| ENSGACG00000003374 | groupVIII | 1,526,335 | 1,528,158 | Marine | No | *Unknown |
| ENSGACG00000003379 | groupVIII | 1,528,722 | 1,530,746 | Marine | No | *Unknown |
| ENSGACG00000005313 | groupXI | 1,204,843 | 1,206,464 | Marine | No | *Heat shock protein (HSP) |
*Gene annotations were based on BlastX search if Ensembl annotations were unavailable.
Figure 3. Segmental duplications/multiplications or deletions underlying the clusters of GCNVs likely under parallel selection. Gene clusters that included GCNVs likely under parallel selection located in the linkage groups (A) VIII and (B) XIX are shown with three genes upstream or downstream. Each point represents the ratio of the average of the normalized numbers of the mapped reads between the two groups. The identified GCNVs with more copies in the marine and freshwater groups are colored orange and blue, respectively. Genes were excluded from visualization if the median of the numbers of mapped reads per 100 bp of the gene length was less than one or if no reads were mapped in at least one individual. The error bars indicate standard deviations of the ratios that were calculated for pairs of freshwater and marine groups derived from the same geographic regions. (If multiple samples were derived from the same geographic region for either group, the average of the normalized number of reads was used for the calculation.)
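The exclusion rule in the Figure 3 caption can be sketched in Python as follows. The function name and the toy counts are hypothetical, and the sketch assumes that the median is taken across individuals and that one mapped-read count per individual is available for each gene; the caption does not specify either point.

```python
from statistics import median


def keep_for_plot(read_counts, gene_length_bp):
    """Return True if a gene passes the visualization filter.

    read_counts: mapped-read count for this gene in each individual
                 (assumed layout, for illustration only).
    """
    # Exclude genes with no mapped reads in at least one individual.
    if any(count == 0 for count in read_counts):
        return False
    # Exclude genes whose median coverage is below one read per 100 bp.
    per_100bp = [count * 100 / gene_length_bp for count in read_counts]
    return median(per_100bp) >= 1


# Invented counts for three individuals and a 2,000-bp gene.
well_covered = keep_for_plot([30, 40, 50], 2000)
has_zero = keep_for_plot([10, 12, 0], 2000)
low_coverage = keep_for_plot([10, 12, 15], 2000)
print(well_covered, has_zero, low_coverage)
```

Both conditions guard the plotted ratios: a zero count makes a ratio undefined or extreme, and very low coverage makes it noisy.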
Figure 4. Numbers of mapped reads in two freshwater-increased and one freshwater-decreased GCNVs. Each point and line represent the normalized numbers and average normalized numbers, respectively, of the mapped reads per 200-bp non-overlapping window for 10 freshwater (black) and 10 marine (red) individuals. (A and B) Two freshwater-increased GCNVs and (C) one freshwater-decreased GCNV, each confirmed by three or more different allelic sequences, are shown. Gene models are shown at the bottom of each panel.