Kristin L Sikkink, Megan E Kobiela, Emilie C Snell-Rood.
Abstract
BACKGROUND: Agricultural environments have long presented an opportunity to study evolution in action, and genomic approaches are opening doors for testing hypotheses about adaptation to crops, pesticides, and fertilizers. Here, we begin to develop the cabbage white butterfly (Pieris rapae) as a system to test questions about adaptation to novel, agricultural environments. We focus on a population in the north central United States as a unique case study: here, canola, a host plant, has been grown during the entire flight period of the butterfly over the last three decades.
Keywords: Population divergence; Single nucleotide polymorphism; de novo transcriptome
Year: 2017 PMID: 28549454 PMCID: PMC5446745 DOI: 10.1186/s12864-017-3787-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Larval performance on different host plant species using a general linear model
| | Forewing length | Forewing area | Development time |
|---|---|---|---|
| Population | | | |
| Host | | | |
| Population x Host | | | |
| Sex | | | |
***P < 0.0001; **P < 0.01; *P < 0.10
Fig. 1 Differences in performance metrics for the nonagricultural (purple) and agricultural (green) populations when raised on different plant hosts. *Significant phenotypic differences between populations from a general linear model (P < 0.05)
Summary of sequencing and transcriptome assembly results
| Sequencing (for | |
|---|---|
| Raw reads (101 nt paired-end) | 179472918 pairs |
| Cleaned reads | 159812653 pairs |
| | 14974970 orphans |
| Sequenced bases (cleaned) | 31.8 Gb |
| Assembly | |
| Number of transcripts (contigs) | 31624 |
| Number of unigenes | 17595 |
| Mean length (unigene) | 1420.69 bp |
| Median length (unigene) | 909 bp |
| N50 (unigene) | 2416 bp |
| Assembled length (unigenes) | 25.0 Mb |
| GC content (unigenes) | 38.8% |
| Number of protein-coding ORFs | 13991 |
| Mean ORF length | 1058.4 bp |
| BUSCO arthropod gene set (2675 genes) | |
| Complete, single-copy | 1875 (70.1%) |
| Complete, duplicated | 72 (2.7%) |
| Partial | 165 (6.2%) |
| Missing | 563 (21.0%) |
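The derived figures in this table can be cross-checked with simple arithmetic. The sketch below is only an illustration using values copied from the table; the variable and function names are hypothetical, and it assumes that the assembled length is the number of unigenes multiplied by the mean unigene length and that the BUSCO percentages are taken relative to the 2675-gene arthropod set.

```python
# Values copied from the summary table; names are illustrative only.
BUSCO_TOTAL = 2675
BUSCO_COUNTS = {
    "Complete, single-copy": 1875,
    "Complete, duplicated": 72,
    "Partial": 165,
    "Missing": 563,
}
N_UNIGENES = 17595
MEAN_UNIGENE_LEN_BP = 1420.69


def percentages(counts, total):
    """Return each category as a percentage of total, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def assembled_length_mb(n_unigenes, mean_len_bp):
    """Assumed relation: total assembled length = count x mean length."""
    return n_unigenes * mean_len_bp / 1e6


busco_pct = percentages(BUSCO_COUNTS, BUSCO_TOTAL)
assembled_mb = assembled_length_mb(N_UNIGENES, MEAN_UNIGENE_LEN_BP)

if __name__ == "__main__":
    for name, pct in busco_pct.items():
        print(f"{name}: {pct}%")
    print(f"Assembled length: {assembled_mb:.1f} Mb")
```

Under these assumptions the four BUSCO categories sum to the full gene set, and the recomputed percentages and assembled length agree with the reported values to the precision shown in the table.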
Summary of Pieris rapae transcriptome annotation
| Unigene annotation | |
|---|---|
| | 11,049 (62.8%) |
| | 9041 (51.3%) |
| NCBI nr database (E < 10⁻⁵) | 12,358 (70.2%) |
| UniProt/Swiss-Prot database (E < 10⁻⁵) | 8651 (49.2%) |
| UniProt/TrEMBL database (E < 10⁻⁵) | 12,230 (69.5%) |
| InterProScan | 11,537 (65.6%) |
| Pfam domainᵃ | 11,428 (65.0%) |
| GO annotationᵇ | 7464 (42.4%) |
| KEGG annotationᵃ | 4554 (25.9%) |
ᵃ Pfam and KEGG searches included only sequences from 13979 protein coding ORFs
ᵇ GO matches were identified for 9595 unigenes, of which 7464 met significance cutoff requirements for annotation
Fig. 2 Number of annotated unigenes assigned to major KEGG ontologies from the Pieris rapae transcriptome
Summary of biallelic SNPs identified in the final transcriptome assembly
| SNP summary | Total | Filteredᵃ |
|---|---|---|
| Number of biallelic SNPs | 524184 | 63595 |
| SNPs/bp (all unigenes) | 0.0025 | |
| Number of unigenes containing variants | 5105 | |
| SNPs/bp in variant unigenes | 0.0055 | |
| % transitionsᵇ | 60.55% | |
| % in predicted coding transcripts | 98.59% | |
| % in exons | 71.51% | |
| % nonsynonymous substitutions | 12.50% | |
| % synonymous substitutions | 59.01% | |
ᵃ SNPs passing filtering criteria have a called genotype (Q > 20) in at least 16 individuals per population with a minor allele frequency >1%
ᵇ Expect 33% if transitions occur at random
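The two table notes can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the input layout (a dict of per-population genotype qualities, with `None` for an uncalled genotype), and the use of the 25.0 Mb assembled length as the denominator for SNP density are all assumptions made for illustration. It shows why 33% is the random expectation for transitions (4 of the 12 possible single-base changes), how the stated filter could be applied to one site, and that dividing the filtered SNP count by the assembled length gives a value consistent with the reported 0.0025 SNPs/bp.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    if ref == alt:
        return False
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def expected_transition_fraction():
    """Fraction of the 12 possible single-base changes that are transitions."""
    changes = list(permutations("ACGT", 2))
    return sum(is_transition(r, a) for r, a in changes) / len(changes)


def passes_filter(quals_by_pop, minor_allele_freq,
                  min_qual=20, min_called=16, min_maf=0.01):
    """Apply the stated criteria to one SNP (hypothetical input layout).

    quals_by_pop maps a population name to a list of genotype qualities,
    one per individual, with None marking an uncalled genotype.
    """
    for quals in quals_by_pop.values():
        called = sum(1 for q in quals if q is not None and q > min_qual)
        if called < min_called:
            return False
    return minor_allele_freq > min_maf


# Assumed denominator: the 25.0 Mb assembled unigene length from the assembly table.
FILTERED_SNPS = 63595
ASSEMBLED_BP = 25.0e6
snp_density = FILTERED_SNPS / ASSEMBLED_BP

if __name__ == "__main__":
    print(f"Expected transition fraction: {expected_transition_fraction():.3f}")
    print(f"Filtered SNPs per bp: {snp_density:.4f}")
```

A site with 16 high-quality calls in one population but only 15 in the other would be rejected by `passes_filter`, regardless of its allele frequency.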
Fig. 3 GO categories enriched among genes containing significantly differentiated SNPs between populations. Red shading signifies significant enrichment of genes mapping to the GO term (FDR < 0.05). The intensity of the shading scales with the significance of that term
Summary of 33 nonsynonymous SNPs that are significantly different between populations
| Unigene | Position | GST | G’ST | D | ϕST | Description | GO annotationᵃ |
|---|---|---|---|---|---|---|---|
| c5933_g1 | 548 | 0.14 | 0.33 | 0.12 | 0.58 | Transport and Golgi organization 11 | C:cellular_component; F:isomerase activity |
| c7209_g1 | 1211 | 0.15 | 0.44 | 0.23 | 0.53 | FK506-binding 59 isoform X1 | P:protein folding |
| c10952_g1 | 794 | 0.18 | 0.44 | 0.19 | 0.51 | Ankyrin repeat domain-containing 11 | C:cellular_component |
| c11274_g1 | 630 | 0.17 | 0.43 | 0.19 | 0.54 | UDP-galactose 4-epimerase | P:biological_process; F:molecular_function |
| c11590_g2 | 235 | 0.14 | 0.33 | 0.10 | 0.38 | BCL-6 corepressor 1 | |
| c11789_g1 | 423 | 0.14 | 0.38 | 0.17 | 0.45 | hypothetical protein KGM_20694 | C:cellular_component |
| c11993_g1 | 616 | 0.16 | 0.39 | 0.15 | 0.60 | alpha-tocopherol transfer -like | P:transport; F:transporter activity; C:intracellular |
| c12584_g1 | 1268 | 0.13 | 0.37 | 0.18 | 0.44 | PREDICTED: uncharacterized protein LOC106138810, partial | |
| c12711_g1 | 1584 | 0.16 | 0.35 | 0.11 | 0.53 | Recombination repair 1 | C:nucleus; F:DNA binding; F:nuclease activity; F:lyase activity; F:ion binding; P:DNA metabolic process; P:response to stress |
| c13334_g1 | 1471 | 0.16 | 0.45 | 0.24 | 0.49 | aspartate--tRNA ligase, cytoplasmic | C:cytoplasm; F:ion binding; P:tRNA metabolic process; P:cellular amino acid metabolic process; F:ligase activity; P:translation |
| c13731_g1 | 1607 | 0.13 | 0.35 | 0.14 | 0.51 | Chorion b-ZIP transcription factor | F:DNA binding; F:nucleic acid binding transcription factor activity; P:cellular nitrogen compound metabolic process; P:biosynthetic process |
| c14347_g1 | 644 | 0.11 | 0.26 | 0.06 | 0.46 | aldehyde dehydrogenase X, mitochondrial-like | F:oxidoreductase activity |
| c15052_g1 | 791 | 0.16 | 0.35 | 0.12 | 0.43 | glutamic acid-rich -like | P:biological_process; C:extracellular region; F:molecular_function |
| c15052_g1 | 873 | 0.15 | 0.37 | 0.14 | 0.45 | glutamic acid-rich -like | P:biological_process; C:extracellular region; F:molecular_function |
| c15359_g2 | 775 | 0.13 | 0.28 | 0.08 | 0.38 | PREDICTED: uncharacterized protein LOC101740601 | C:membrane; C:integral component of membrane |
| c15377_g1 | 1122 | 0.19 | 0.43 | 0.16 | 0.53 | prion-like-(Q N-rich) domain-bearing 25 | C:membrane; C:integral component of membrane |
| c15434_g1 | 67 | 0.15 | 0.44 | 0.25 | 0.56 | cholinesterase 1-like | P:metabolic process; F:hydrolase activity |
| c15846_g1 | 1999 | 0.16 | 0.39 | 0.16 | 0.63 | nicastrin | C:cellular_component |
| c15990_g1 | 1062 | 0.14 | 0.42 | 0.24 | 0.51 | pancreatic triacylglycerol lipase-like | P:biological_process; C:extracellular region; F:molecular_function |
| c16008_g1 | 531 | 0.22 | 0.55 | 0.30 | 0.71 | nuclear pore complex Nup50 | C:nuclear envelope; P:nucleocytoplasmic transport; P:protein targeting; P:vesicle-mediated transport; P:signal transduction; P:cell differentiation; P:anatomical structure development; P:neurological system process; F:molecular_function |
| c16008_g1 | 789 | 0.24 | 0.58 | 0.32 | 0.78 | nuclear pore complex Nup50 | C:nuclear envelope; P:nucleocytoplasmic transport; P:protein targeting; P:vesicle-mediated transport; P:signal transduction; P:cell differentiation; P:anatomical structure development; P:neurological system process; F:molecular_function |
| c16008_g1 | 853 | 0.22 | 0.49 | 0.21 | 0.73 | nuclear pore complex Nup50 | C:nuclear envelope; P:nucleocytoplasmic transport; P:protein targeting; P:vesicle-mediated transport; P:signal transduction; P:cell differentiation; P:anatomical structure development; P:neurological system process; F:molecular_function |
| c16231_g1 | 2477 | 0.19 | 0.45 | 0.19 | 0.74 | serine palmitoyltransferase 1 | F:ion binding; P:biosynthetic process; C:cellular_component |
| c16497_g2 | 1985 | 0.18 | 0.44 | 0.20 | 0.57 | adenosylhomocysteinase | P:sulfur compound metabolic process; P:cofactor metabolic process; C:cytosol; P:cellular amino acid metabolic process; P:cellular nitrogen compound metabolic process; F:molecular_function |
| c16508_g2 | 391 | 0.12 | 0.32 | 0.12 | 0.57 | phosphoenolpyruvate carboxykinase | P:small molecule metabolic process; P:carbohydrate metabolic process; F:lyase activity; F:ion binding; P:biosynthetic process; F:kinase activity |
| c16613_g1 | 749 | 0.18 | 0.41 | 0.14 | 0.65 | translocator -like isoform X2 | C:intracellular |
| c17019_g1 | 1437 | 0.16 | 0.44 | 0.23 | 0.54 | FAM114A2 isoform X1 | |
| c17117_g1 | 1294 | 0.13 | 0.32 | 0.11 | 0.57 | saccharopine dehydrogenase-like oxidoreductase | F:oxidoreductase activity; C:cellular_component |
| c17365_g1 | 1932 | 0.14 | 0.34 | 0.12 | 0.49 | otopetrin-2-like isoform X1 | C:cellular_component |
| c17954_g2 | 1970 | 0.15 | 0.36 | 0.12 | 0.61 | probable uridine nucleosidase 2 isoform X2 | P:metabolic process; F:hydrolase activity |
| c18004_g1 | 465 | 0.22 | 0.56 | 0.32 | 0.67 | phosphatase 1 regulatory subunit 15A | C:cellular_component |
| c18004_g1 | 566 | 0.18 | 0.48 | 0.26 | 0.59 | phosphatase 1 regulatory subunit 15A | C:cellular_component |
| c18452_g1 | 386 | 0.14 | 0.31 | 0.08 | 0.55 | serine-rich adhesin for platelets-like isoform X1 | F:calcium ion binding; C:membrane; C:integral component of membrane; P:cell adhesion; P:homophilic cell adhesion via plasma membrane adhesion molecules; C:plasma membrane |
ᵃ GO descriptions are designated as cellular component (C), molecular function (F), or biological process (P)
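For readers unfamiliar with the differentiation columns, the following sketch computes textbook versions of GST, G’ST, and D for a single biallelic SNP from per-population allele frequencies. It assumes Nei's GST, Hedrick's standardized G’ST, and Jost's D in their simplest forms, with equal population weights and no sample-size correction. The record does not state which estimators the authors used, so this sketch is not expected to reproduce the values in the table, and ϕST is not covered. The function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def differentiation(freqs):
    """Return GST, G'ST and D for one biallelic SNP.

    freqs: allele frequency of the same allele in each population
    (equal weights assumed, no sample-size correction).
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    # Mean within-population heterozygosity.
    hs = sum(expected_heterozygosity(p) for p in freqs) / k
    # Heterozygosity of the pooled population (mean allele frequency).
    ht = expected_heterozygosity(sum(freqs) / k)
    if ht == 0.0:
        # Site is monomorphic everywhere: no differentiation to measure.
        return {"GST": 0.0, "G'ST": 0.0, "D": 0.0}
    gst = (ht - hs) / ht
    # Hedrick's standardization: GST divided by its maximum given hs and k.
    g_prime = gst * (k - 1 + hs) / ((k - 1) * (1.0 - hs))
    # Jost's D.
    d = (ht - hs) / (1.0 - hs) * k / (k - 1)
    return {"GST": gst, "G'ST": g_prime, "D": d}


if __name__ == "__main__":
    stats = differentiation([0.2, 0.8])
    for name, value in stats.items():
        print(f"{name} = {value:.3f}")
```

With identical frequencies in both populations all three statistics are zero, and with a fixed difference (frequencies 0 and 1) all three equal one, which is the behavior the standardized measures are designed to have.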
Fig. 4 (a) Principal components analysis using the subset of SNPs genotyped in all individuals. Points are shaded according to the proportion of ancestry from Cluster 1 for each individual in the fastStructure analysis with k = 2. (b) fastStructure analysis based on all SNPs for k = 2 and k = 4