Khalid Mahmood, Dorte H Højland, Torben Asp, Michael Kristensen.
Abstract
BACKGROUND: Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next-generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome, with a focus on P450 genes.
Year: 2016 PMID: 27019205 PMCID: PMC4809514 DOI: 10.1371/journal.pone.0151434
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of run statistics and assembly.
| Statistic | Value |
|---|---|
| Large run results | |
| Total number of reads | 666,537 |
| Total number of bases w/o keys, tags and bad quality bases | 315,617,305 |
| Average read length w/o keys, tags and bad quality bases | 473 |
| Assembly results | |
| Number assembled | 387,594 |
| Number too short | 3,446 |
| Sum of large contigs | |
| Total number of reads | 243,685 |
| Number of large contigs | 8,061 |
| Total number of bases | 13,356,970 |
| N50 | 1,679 |
| Sum of all contigs | |
| Total number of reads | 387,594 |
| Number of all contigs | 35,834 |
| Total number of bases | 30,645,588 |
| Average contig length | 855 |
| Shortest contig length | 40 |
| Longest contig length | 6,150 |
| N50 | 986 |
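The N50 and average contig length in the table above can be recomputed from a list of contig lengths. The following is a minimal sketch, not the assembler's own code: the function names `n50` and `mean_length` are hypothetical, the toy lengths are invented for illustration, and N50 is taken here as the contig length at which the length-sorted contigs first cover at least half of all assembled bases. Only the totals in the last line come from the table.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


def mean_length(total_bases, n_contigs):
    """Average contig length from summary totals."""
    return total_bases / n_contigs


# Toy data (hypothetical): total = 20, and 6 + 5 = 11 covers half of it.
toy_n50 = n50([2, 3, 4, 5, 6])

# Totals for "all contigs" taken from the table: bases and contig count.
all_contigs_mean = mean_length(30_645_588, 35_834)
```

With the table's totals, the mean rounds to the reported average contig length of 855.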
Fig 1. Length distribution of the assembled contigs.
Fig 2. Distributions of the annotated sequences in three GO categories (Level Two).
A: Biological process, B: Cellular component, C: Molecular function.
Over-represented GO terms from resistant housefly transcriptome.
GO term enrichment was performed using the Expression Analysis Systematic Explorer (EASE) implemented in the Database for Annotation, Visualization, and Integrated Discovery (DAVID 6.7).
| Term | Count | P value | Bonferroni | Benjamini | FDR |
|---|---|---|---|---|---|
| Biological Processes (BP) | |||||
| GO:0016043~cellular component organization | 155 | 5.59E-14 | 1.07E-10 | 1.07E-10 | 9.56E-11 |
| GO:0009987~cellular process | 331 | 6.12E-12 | 1.17E-08 | 5.87E-09 | 1.05E-08 |
| GO:0019222~regulation of metabolic process | 109 | 2.24E-10 | 4.30E-07 | 1.43E-07 | 3.84E-07 |
| GO:0065007~biological regulation | 191 | 1.21E-09 | 2.31E-06 | 5.79E-07 | 2.06E-06 |
| GO:0006996~organelle organization | 91 | 1.69E-09 | 3.25E-06 | 6.50E-07 | 2.90E-06 |
| GO:0050789~regulation of biological process | 175 | 7.14E-09 | 1.37E-05 | 2.28E-06 | 1.22E-05 |
| GO:0060255~regulation of macromolecule metabolic process | 98 | 8.29E-09 | 1.59E-05 | 2.27E-06 | 1.42E-05 |
| GO:0031323~regulation of cellular metabolic process | 94 | 4.96E-08 | 9.53E-05 | 1.19E-05 | 8.49E-05 |
| GO:0051179~localization | 132 | 9.34E-08 | 1.79E-04 | 1.99E-05 | 1.60E-04 |
| GO:0050794~regulation of cellular process | 160 | 1.93E-07 | 3.71E-04 | 3.71E-05 | 3.30E-04 |
| GO:0010468~regulation of gene expression | 86 | 2.49E-07 | 4.78E-04 | 4.35E-05 | 4.26E-04 |
| GO:0033036~macromolecule localization | 54 | 2.98E-07 | 5.71E-04 | 4.76E-05 | 5.09E-04 |
| GO:0051234~establishment of localization | 114 | 3.35E-07 | 6.43E-04 | 4.95E-05 | 5.73E-04 |
| GO:0034641~cellular nitrogen compound metabolic process | 110 | 4.46E-07 | 8.57E-04 | 6.12E-05 | 7.64E-04 |
| GO:0007010~cytoskeleton organization | 50 | 4.83E-07 | 9.27E-04 | 6.18E-05 | 8.26E-04 |
| Molecular Function (MF) | |||||
| GO:0005515~protein binding | 308 | 1.22E-06 | 7.86E-04 | 7.86E-04 | 0.0018253 |
| GO:0005488~binding | 430 | 4.34E-06 | 0.0027792 | 0.0013905 | 0.0064616 |
| GO:0008092~cytoskeletal protein binding | 28 | 1.05E-05 | 0.0067290 | 0.0022480 | 0.0156750 |
| GO:0016563~transcription activator activity | 16 | 4.96E-05 | 0.0313181 | 0.0079232 | 0.0738502 |
| GO:0016818~hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides | 50 | 2.01E-04 | 0.12109596 | 0.0254855 | 0.2992483 |
| GO:0016817~hydrolase activity, acting on acid anhydrides | 50 | 2.21E-04 | 0.13236004 | 0.0233853 | 0.3291028 |
| GO:0016462~pyrophosphatase activity | 49 | 2.77E-04 | 0.16303795 | 0.0251047 | 0.4123734 |
| GO:0003779~actin binding | 17 | 3.45E-04 | 0.19870473 | 0.0273108 | 0.5130181 |
| GO:0000166~nucleotide binding | 86 | 4.73E-04 | 0.26216982 | 0.0332181 | 0.7034381 |
| GO:0017111~nucleoside-triphosphatase activity | 47 | 6.60E-04 | 0.34563135 | 0.0415217 | 0.9798076 |
| GO:0030528~transcription regulator activity | 54 | 7.25E-04 | 0.37221362 | 0.0414400 | 1.0751055 |
| GO:0032555~purine ribonucleotide binding | 67 | 0.0015268 | 0.62505230 | 0.0784952 | 2.25187183 |
| GO:0032553~ribonucleotide binding | 67 | 0.0015268 | 0.62505230 | 0.0784952 | 2.25187183 |
| GO:0008641~small protein activating enzyme activity | 5 | 0.0021777 | 0.75331878 | 0.1020725 | 3.197491 |
| GO:0000175~3'-5'-exoribonuclease activity | 5 | 0.0021777 | 0.75331878 | 0.1020725 | 3.197491 |
| Cellular Components (CC) | |||||
| GO:0043234~protein complex | 133 | 2.56E-10 | 9.68E-08 | 9.68E-08 | 3.54E-07 |
| GO:0044428~nuclear part | 83 | 4.77E-10 | 1.80E-07 | 9.02E-08 | 6.60E-07 |
| GO:0032991~macromolecular complex | 158 | 6.34E-10 | 2.40E-07 | 7.99E-08 | 8.77E-07 |
| GO:0005634~nucleus | 152 | 1.94E-08 | 7.33E-06 | 1.83E-06 | 2.68E-05 |
| GO:0005622~intracellular | 312 | 2.48E-08 | 9.38E-06 | 1.88E-06 | 3.43E-05 |
| GO:0043229~intracellular organelle | 246 | 5.63E-08 | 2.13E-05 | 3.55E-06 | 7.79E-05 |
| GO:0043226~organelle | 246 | 6.45E-08 | 2.44E-05 | 3.48E-06 | 8.92E-05 |
| GO:0044446~intracellular organelle part | 154 | 7.12E-08 | 2.69E-05 | 3.36E-06 | 9.85E-05 |
| GO:0044422~organelle part | 154 | 7.70E-08 | 2.91E-05 | 3.23E-06 | 1.06E-04 |
| GO:0044424~intracellular part | 291 | 1.06E-07 | 4.00E-05 | 4.00E-06 | 1.46E-04 |
| GO:0005654~nucleoplasm | 36 | 3.85E-06 | 0.0014545 | 1.32E-04 | 0.0053268 |
| GO:0005694~chromosome | 39 | 5.60E-06 | 0.0021162 | 1.77E-04 | 0.0077524 |
| GO:0044451~nucleoplasm part | 32 | 2.21E-05 | 0.0083347 | 6.44E-04 | 0.0306246 |
| GO:0000178~exosome (RNase complex) | 7 | 2.65E-05 | 0.0099551 | 7.14E-04 | 0.0366074 |
| GO:0000176~nuclear exosome (RNase complex) | 7 | 2.65E-05 | 0.0099551 | 7.14E-04 | 0.0366074 |
This table contains only the top 15 significantly enriched terms from each of Biological Processes (BP), Molecular Function (MF) and Cellular Component (CC). The table includes the GO term, count (number of genes in the list of significant genes with a given term), P value, Bonferroni, Benjamini and FDR values.
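The Bonferroni and Benjamini columns are multiple-testing corrections of the raw P value. As a rough illustration of what such corrections do, here is a minimal sketch of the textbook Bonferroni and Benjamini-Hochberg adjustments. It is not DAVID's implementation, the function names are hypothetical, and the input p-values are invented rather than taken from the table.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Go from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

Bonferroni is the more conservative of the two, which is consistent with the Bonferroni column being at least as large as the Benjamini column in every row of the table.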
Functional annotation clusters of metabolism-related genes involved in epigenetic regulation, transcription and gene expression, identified by DAVID in the transcriptome of the insecticide-resistant housefly.
| Cluster | Category | GO terms | Enrichment Score | Count | P value | Fold enrichment |
|---|---|---|---|---|---|---|
| 1 | BP | GO:0045449~regulation of transcription | 22.78 | 62 | 8.45E-34 | 5.7 |
| 1 | BP | GO:0006355~regulation of transcription, DNA-dependent | 22.78 | 50 | 5.44E-27 | 6 |
| 1 | MF | GO:0030528~transcription regulator activity | 22.78 | 45 | 1.00E-21 | 5.22 |
| 1 | MF | GO:0003700~transcription factor activity | 22.78 | 22 | 9.04E-09 | 4.47 |
| 2 | BP | GO:0006350~transcription | 20.06 | 35 | 4.05E-17 | 5.6 |
| 2 | PIR | Transcription regulation | 20.06 | 34 | 4.50E-25 | 10.65 |
| 2 | PIR | Transcription | 20.06 | 34 | 7.38E-25 | 10.49 |
| 3 | BP | GO:0045941~positive regulation of transcription | 11.35 | 16 | 2.90E-11 | 10.13 |
| 3 | BP | GO:0010628~positive regulation of gene expression | 11.35 | 16 | 3.29E-11 | 10.05 |
| 3 | BP | GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 11.35 | 7 | 5.12E-05 | 10.28 |
| 3 | MF | GO:0016563~transcription activator activity | 11.35 | 15 | 5.87E-12 | 12.77 |
| 5 | BP | GO:0045892~negative regulation of transcription, DNA-dependent | 8.85 | 17 | 3.16E-09 | 6.68 |
| 5 | BP | GO:0016481~negative regulation of transcription | 8.85 | 17 | 1.37E-08 | 6.04 |
| 5 | BP | GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 8.85 | 7 | 1.59E-04 | 8.43 |
| 5 | MF | GO:0016564~transcription repressor activity | 8.85 | 9 | 6.04E-05 | 6.53 |
| 10 | BP | GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 3.78 | 7 | 5.12E-05 | 10.29 |
| 16 | MF | GO:0008134~transcription factor binding | 3.30 | 9 | 1.38E-05 | 8.01 |
| 16 | MF | GO:0003712~transcription cofactor activity | 3.30 | 6 | 0.001 | 7.54 |
| 16 | MF | GO:0003713~transcription coactivator activity | 3.30 | 4 | 0.008 | 9.60 |
| 18 | BP | GO:0016441~posttranscriptional gene silencing | 2.95 | 5 | 0.002 | 9.19 |
| 18 | BP | GO:0035194~posttranscriptional gene silencing by RNA | 2.95 | 5 | 0.002 | 9.19 |
| 24 | MF | GO:0016566~specific transcriptional repressor activity | 2.11 | 5 | 9.96E-04 | 11.0 |
| 29 | BP | GO:0006366~transcription from RNA polymerase II promoter | 1.82 | 6 | 0.01 | 4.45 |
| 29 | BP | GO:0006351~transcription, DNA-dependent | 1.82 | 6 | 0.034 | 3.32 |
| 29 | CC | GO:0008023~transcription elongation factor complex | 1.82 | 4 | 0.001 | 17.09 |
| 29 | MF | GO:0003711~transcription elongation regulator activity | 1.82 | 3 | 0.01 | 4.45 |
| 29 | MF | GO:0016251~general RNA polymerase II transcription factor activity | 1.82 | 5 | 0.025 | 4.45 |
| 13 | BP | GO:0040029~regulation of gene expression, epigenetic | 3.45 | 14 | 3.72E-07 | 6.09 |
| 13 | BP | GO:0045814~negative regulation of gene expression, epigenetic | 3.45 | 6 | 0.013 | 4.20 |
| 7 | CC | GO:0000790~nuclear chromatin | 4.06 | 7 | 9.98E-06 | 13.29 |
| 7 | CC | GO:0000785~chromatin | 4.06 | 10 | 2.87E-05 | 6.05 |
| 11 | PIR | Chromatin regulator | 3.68 | 10 | 2.17E-09 | 18.95 |
| 11 | BP | GO:0016570~histone modification | 3.68 | 7 | 8.84E-05 | 9.35 |
| 11 | BP | GO:0016571~histone methylation | 3.68 | 4 | 0.003 | 4.14 |
| 11 | BP | GO:0016568~chromatin modification | 3.68 | 14 | 2.60E-09 | 9.19 |
| 11 | BP | GO:0006325~chromatin organization | 3.68 | 16 | 6.73E-09 | 6.92 |
| 11 | BP | GO:0016569~covalent chromatin modification | 3.68 | 7 | 8.84E-05 | 9.35 |
| 11 | BP | GO:0006338~chromatin remodeling | 3.68 | 6 | 3.58E-04 | 9.59 |
| 11 | MF | GO:0018024~histone-lysine N-methyltransferase activity | 3.68 | 4 | 0.002 | 16.67 |
| 11 | MF | GO:0042054~histone methyltransferase activity | 3.68 | 4 | 0.002 | 15.08 |
| 11 | MF | GO:0046974~histone methyltransferase activity (H3-K9 specific) | 3.68 | 3 | 0.004 | 29.69 |
| 11 | CC | GO:0035097~histone methyltransferase complex | 3.68 | 3 | 0.007 | 22.79 |
| 11 | CC | GO:0034708~methyltransferase complex | 3.68 | 3 | 0.007 | 22.79 |
| 11 | MF | GO:0016278~lysine N-methyltransferase activity | 3.68 | 4 | 0.002 | 16.67 |
| 11 | MF | GO:0016279~protein-lysine N-methyltransferase activity | 3.68 | 4 | 0.002 | 16.67 |
| 11 | MF | GO:0008276~protein methyltransferase activity | 3.68 | 4 | 0.007 | 9.90 |
| 11 | MF | GO:0008170~N-methyltransferase activity | 3.68 | 4 | 0.009 | 9.05 |
| 13 | BP | GO:0006342~chromatin silencing | 3.45 | 6 | 0.013 | 4.20 |
| 11 | BP | GO:0006479~protein amino acid methylation | 3.68 | 4 | 0.006 | 10.50 |
| 11 | BP | GO:0043414~biopolymer methylation | 3.68 | 4 | 0.02 | 6.84 |
| 11 | BP | GO:0032259~methylation | 3.68 | 4 | 0.034 | 5.55 |
| 13 | BP | GO:0016458~gene silencing | 3.45 | 10 | 2.30E-04 | 4.74 |
| 18 | BP | GO:0016441~posttranscriptional gene silencing | 2.95 | 5 | 0.002 | 9.19 |
| 18 | BP | GO:0035194~posttranscriptional gene silencing by RNA | 2.95 | 5 | 0.002 | 9.19 |
| 18 | BP | GO:0031047~gene silencing by RNA | 2.95 | 5 | 0.003 | 8.17 |
| 34 | BP | GO:0007307~eggshell chorion gene amplification | 1.62 | 3 | 0.019 | 13.78 |
Functional annotation groups with a geometric p-value less than 0.05 are listed. Class ontology: BP = biological processes, MF = molecular function, CC = cellular component, PIR = protein information resource. Count = number of genes in the ontology.
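Two of the quantities in this table can be illustrated with a short calculation. The sketch below assumes the usual definitions: fold enrichment as the proportion of annotated genes in the gene list divided by the proportion in the background, and a cluster enrichment score as the negative log10 of the geometric mean of the member p-values. Both function names and all input numbers are hypothetical; the source does not give the list or background sizes, so the table values are not reproduced here.

```python
import math


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the term's frequency in the gene list to its background frequency."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def cluster_enrichment_score(pvalues):
    """Negative log10 of the geometric mean of the member p-values."""
    mean_log = sum(math.log10(p) for p in pvalues) / len(pvalues)
    return -mean_log


# Hypothetical counts: 10 of 100 listed genes carry the term,
# against 50 of 5000 genes in the background.
fe = fold_enrichment(10, 100, 50, 5000)

# Hypothetical cluster with two member terms.
score = cluster_enrichment_score([1e-2, 1e-4])
```

Under these definitions, a cluster score above about 1.3 corresponds to a geometric-mean p-value below 0.05, which matches the cut-off stated in the table note.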
Fig 3. Distribution of contig sequences among KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways.
The top 20 most highly represented pathways are shown. Analysis was performed using Blast2GO and the KEGG database.
Differential expression, CpG islands, regulatory motifs and SNPs in selected P450s.
| CYP P450 | Copy number | CpG island frequency | CpG island size (bp) | CpG island location and coverage | Promoter Motif (PM) | mRNA Motif (MM) | SNP location | SNP NA | SNP AA |
|---|---|---|---|---|---|---|---|---|---|
| | 2100±497 (2.2) | 1 | 677 | 3’-end, covers exon 3 | 1, 3–6, 9, 13 | 1–3, 8, 13, 15, 16, 18, 19 | 67(CTG), 281(GGT), 139 | C→T, G→T | L→L, G→V |
| | 24.3±8.8 (2.7) | 1 | 997 | 5’-end, covers exon 1 and promoter part | 1–6, 8, 9, 11, 12 | 1–7, 9, 16 | | | |
| | 2.4±0.7 (0.4) | 1 | 785 | 5’-end, exon 1 and promoter part | 1, 3, 7, 8, 10 | 1–9, 11, 14 | | | |
| | 327±71.7 (2.8) | 1 | 795 | 5’-end, exon 1 and promoter part | 1, 5–7 | 1–7, 9, 11, 12, 15, 17, 20, 23 | 1102(CAT) | C→T | H→Y |
| | 797±144 (0.7) | 1 | 983 | 5’-end, covers all three exons, two introns and promoter part | 2–6 | 1, 3, 4, 8, 10, 13, 23 | 1(ATG), 3(ATG), 9(GTA), 10(GAA), 11(GAA), 13(TTA), 14(TTA), 11 | A→C, G→T, A→G, G→T, A→C, T→C, T→C | M→L, M→L, V→V, E→S, E→S, L→P, L→P |
| | 405±49.8 (5.8) | 1 | 649 | 5’-end, covers 1st exon and intron | 1–5, 7, 10–13 | 1, 3–5, 8–10, 12, 13, 18, 19, 21 | | | |
| | 363±95.5 (2.4) | 1 | 850 | Intergenic, covers exon 2 | 2, 3, 5, 7, 10, 12 | 1, 2, 4, 6, 7, 14, 17, 20–22 | | | |
| | 26.3±6.0 (0.7) | 3 | 749, 675, 601 | Intergenic and 3’-end, covers two introns and exon 3 | 3–5 | 3, 4, 6, 8, 13, 22 | 230(AAG), 1114(ACG) | A→T, G→A | K→M, T→T |
1 Obtained from Højland et al. (2014) [29]
2 Promoter Motifs (PM) along with transcription factors with their significance can be found in Fig 4.
3 mRNA Motifs (MM) along with transcription factors can be found in Fig 5.
* Nucleic Acid
** Amino Acid
§ Insertion after the specified nucleotide. Bold nucleotide letters mark the site where the mutation occurs.
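The amino-acid changes in the SNP columns can be checked against the standard genetic code. The sketch below does this for five of the listed substitutions. It is an illustration only: the function names are hypothetical, the codon table is a small excerpt of the standard code, and the position of each substitution within its codon is an assumption chosen so that the listed nucleotide change yields the listed amino-acid change.

```python
# Excerpt of the standard genetic code covering only the codons used below.
CODON_TABLE = {
    "CTG": "L", "TTG": "L",
    "GGT": "G", "GTT": "V",
    "CAT": "H", "TAT": "Y",
    "AAG": "K", "ATG": "M",
    "ACG": "T", "ACA": "T",
}


def substitute(codon, position, new_base):
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def classify(codon, position, new_base):
    """Return (amino acid before, amino acid after, kind of change)."""
    mutant = substitute(codon, position, new_base)
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    kind = "synonymous" if before == after else "nonsynonymous"
    return before, after, kind


# (codon, assumed 0-based position, new base) for five table entries.
snps = [
    ("CTG", 0, "T"),  # 67(CTG), C->T
    ("GGT", 1, "T"),  # 281(GGT), G->T
    ("CAT", 0, "T"),  # 1102(CAT), C->T
    ("AAG", 1, "T"),  # 230(AAG), A->T
    ("ACG", 2, "A"),  # 1114(ACG), G->A
]
results = [classify(*snp) for snp in snps]
```

Under these assumed positions, the results agree with the table: L→L and T→T are synonymous, while G→V, H→Y and K→M change the encoded amino acid.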
Fig 4. Motifs found in the 1000 bp upstream promoter region of selected P450 CYP genes in the MEME analysis.
The combination of TOMTOM and GOMO provides information on the novelty of each motif and the probability that it is involved in transcription regulation. Each motif is presented with its E-value. GOMO indicates which GO terms could be associated with the DNA motif, using the Drosophila melanogaster sequence as reference. The TOMTOM hits give the number of known transcription factors whose motifs are close in sequence to this motif, using the Pearson correlation coefficient with an E-value <10. The p value is the probability that the match occurred by random chance according to the null model, the E value is the expected number of false positives in the matches up to this point, and the q value is the minimum false discovery rate required to include the match.
Fig 5. Motifs found in the mRNA of selected P450 CYP genes in the MEME analysis.
The combination of TOMTOM and GOMO provides information on the novelty of each motif and the probability that it is involved in transcription regulation. Each motif is presented with its E-value. GOMO indicates which GO terms could be associated with the DNA motif, using the Drosophila melanogaster sequence as reference. The TOMTOM hits give the number of known transcription factors whose motifs are close in sequence to this motif, using the Pearson correlation coefficient with an E-value <10. The p value is the probability that the match occurred by random chance according to the null model, the E value is the expected number of false positives in the matches up to this point, and the q value is the minimum false discovery rate required to include the match.
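The relation between the p value, the E value and the E-value <10 cut-off can be sketched in a few lines. This is an illustration under a stated assumption, namely that the E value of a match is its p value multiplied by the number of target motifs searched; the function names, motif names, p-values and database size below are all hypothetical.

```python
def e_value(p_value, n_targets):
    """Expected number of false positives, assuming E = p * number of targets."""
    return p_value * n_targets


def keep_matches(matches, n_targets, threshold=10.0):
    """Keep (name, E value) pairs whose E value is below the threshold."""
    kept = []
    for name, p in matches:
        e = e_value(p, n_targets)
        if e < threshold:
            kept.append((name, e))
    return kept


# Hypothetical matches against a hypothetical database of 500 target motifs.
matches = [("TF_A", 1e-4), ("TF_B", 0.05)]
kept = keep_matches(matches, n_targets=500)
kept_names = [name for name, _ in kept]
```

Here `TF_A` has an E value of about 0.05 and is kept, while `TF_B` has an E value of 25 and is filtered out.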