Alexie Papanicolaou, Steffi Gebauer-Jung, Mark L Blaxter, W Owen McMillan, Chris D Jiggins.
Abstract
With over 100 000 species and a large community of evolutionary biologists, population ecologists, pest biologists and genome researchers, the Lepidoptera are an important insect group. Genomic resources [expressed sequence tags (ESTs), genome sequence, genetic and physical maps, proteomic and microarray datasets] are growing, but there has up to now been no single access and analysis portal for this group. Here we present ButterflyBase (http://www.butterflybase.org), a unified resource for lepidopteran genomics. A total of 273 077 ESTs from more than 30 different species have been clustered to generate stable unigene sets, and robust protein translations derived from each unigene cluster. Clusters and their protein translations are annotated with BLAST-based similarity, gene ontology (GO), enzyme classification (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms, and are also searchable using similarity tools such as BLAST and MS-BLAST. The database supports many needs of the lepidopteran research community, including molecular marker development, orthologue prediction for deep phylogenetics, and detection of rapidly evolving proteins likely involved in host-pathogen or other evolutionary processes. ButterflyBase is expanding to include additional genomic sequence, ecological and mapping data for key species.
Year: 2007 PMID: 17933781 PMCID: PMC2238913 DOI: 10.1093/nar/gkm853
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The content of ButterflyBase (September 2007)
| Species (ButterflyBase code) | Taxon/Family | Proteins @ NCBI (a) | mRNAs @ ButterflyBase | Gene objects @ ButterflyBase | Similar to known proteins (b) | Found only in Lepidoptera (c) | Found in 2+ ButterflyBase species (d) | Clusters with putative SNPs (total SNPs) (e) |
|---|---|---|---|---|---|---|---|---|
| Total: 33 | Lepidoptera | 6907 | 273 077 | 70 867 | 37 962 | 25 204 | 9583 (41 093) | 4821 (27 808) |
| | Pyralidae | 3 | 28 | 23 | 14 | 6 | 5 (14) | 5 (14) |
| | Saturniidae | 45 | 22 | 17 | 17 | 0 | 0 (17) | N/A |
| | Saturniidae | 51 | 3912 | 1433 | 943 | 509 | 535 (1432) | 47 (140) |
| | Saturniidae | 65 | 40 | 37 | 37 | 0 | 0 (37) | N/A |
| | Saturniidae | 35 | 610 | 325 | 157 | 82 | 88 (226) | 9 (19) |
| | Nymphalidae | 11 | 9848 | 5726 | 2375 | 1207 | 1012 (3099) | 81 (234) |
| | Bombycidae | 3623 | 184 577 | 35 876 | 17 162 | 19 174 | 4776 (17 194) | 3756 (22 445) |
| | Bombycidae | 54 | 261 | 205 | 105 | 97 | 90 (194) | 3 (3) |
| | Tortricidae | 74 | 652 | 618 | 359 | 82 | 72 (379) | N/A |
| | Noctuidae | N/A | 570 | 259 | 138 | 2 | 2 (122) | 18 (50) |
| | Pyralidae | 95 | 93 | 84 | 68 | 8 | 4 (65) | N/A |
| | Noctuidae | 207 | 1221 | 733 | 634 | 53 | 50 (663) | 19 (118) |
| | Saturniidae | 57 | 20 | 16 | 16 | 0 | 0 (16) | N/A |
| | Nymphalidae | 157 | 17 573 | 6859 | 4787 | 1118 | 856 (5019) | 464 (3236) |
| | Nymphalidae | 443 | 4976 | 1965 | 1262 | 408 | 422 (1531) | 99 (369) |
| | Noctuidae | 152 | 90 | 83 | 83 | 0 | 0 (83) | N/A |
| | Noctuidae | 80 | 40 | 38 | 38 | 0 | 0 (38) | N/A |
| | Saturniidae | 133 | 1635 | 671 | 503 | 60 | 58 (514) | 25 (63) |
| | Sphingidae | 582 | 3683 | 2291 | 1256 | 412 | 301 (1469) | 22 (56) |
| | Crambidae | 146 | 1761 | 543 | 309 | 137 | 133 (418) | 40 (162) |
| | Pieridae | 17 | 5 | 5 | 5 | 0 | 0 (4) | N/A |
| | Papilionidae | 14 | 708 | 307 | 236 | 22 | 20 (248) | 27 (102) |
| | Pyralidae | 47 | 6219 | 3788 | 1879 | 483 | 414 (2079) | 28 (80) |
| | Papilionidae | 41 | 25 | 24 | 24 | 0 | 0 (24) | N/A |
| | Plutellidae | 188 | 1286 | 1021 | 701 | 108 | 124 (747) | 3 (11) |
| | Saturniidae | 49 | 27 | 27 | 27 | 0 | 0 (27) | N/A |
| | Noctuidae | 64 | 48 | 42 | 42 | 0 | 0 (42) | N/A |
| | Noctuidae | 241 | 31 538 | 6993 | 4172 | 1116 | 1204 (4741) | 149 (528) |
| | Noctuidae | 66 | 154 | 100 | 85 | 7 | 8 (90) | 1 (1) |
| | Noctuidae | 28 | 23 | 20 | 20 | 0 | 0 (20) | N/A |
| | Tineidae | 1 | 921 | 240 | 170 | 39 | 14 (162) | 30 (177) |
| | Noctuidae | 138 | 511 | 498 | 338 | 74 | 61 (379) | N/A |
*designates those species with no public ESTs but public full-length mRNA sequences.
(a) Nuclear sequences only; this total includes segmented sequences and is not limited to RefSeq (August 2007). The B. mori proteins were limited to 1025 before January 2007.
(b) BLASTx of nucleotide consensus and BLASTp of predicted proteins versus UniRef100, or versus proteins released by the Apis mellifera, D. melanogaster, Tribolium castaneum and Anopheles gambiae genomes or other Arthropoda proteins in EBI, with an E-value cutoff of 1E-4 (source: EBI, July 2007). We also used in-house clusters of the public EST data for Aedes aegypti, Anopheles gambiae, Culex pipiens, Drosophila ananassae, Drosophila erecta, Drosophila grimshawi, Drosophila simulans, Drosophila yakuba and Tribolium castaneum (E-value cutoff 1E-4; source: EBI, September 2007).
(c) Clusters with a BLASTn hit of the nucleotide consensus versus Lepidoptera nuclear nucleotides, the B. mori genome from EBI or ButterflyBase EST consensuses, but with no significant similarity to the databases mentioned above (EBI, July 2007; E-value cutoff 1E-4).
(d) Lepidoptera-specific clusters with a significant hit in at least one other organism in ButterflyBase, using BLASTn for nucleotide consensuses or BLASTp for protein predictions (July 2007; E-value cutoff 1E-3). Gene objects present in more than one organism facilitate annotation and marker design. The number in brackets is the corresponding count for all clusters, regardless of similarity to any protein.
(e) Most Lepidoptera cDNA libraries are constructed from relatively outbred individuals, hence the relatively high number of SNPs. Although the number of clusters containing putative SNPs is accurate, the total number of SNPs may be inflated because the data here are pooled from all cDNA libraries.
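The similarity categories described in the table footnotes amount to thresholding BLAST hits by E-value. The following is a minimal Python sketch of that idea, not the ButterflyBase pipeline itself. It assumes BLAST results in the standard 12-column tabular format (query ID in column 1, E-value in column 11) and treats the cutoff as inclusive; the function names `queries_with_hits` and `classify`, the category labels, and the toy cluster IDs are hypothetical.

```python
import csv
import io

# E-value cutoff quoted in the table footnotes for the protein and
# Lepidoptera nucleotide searches (assumed inclusive here).
EVALUE_CUTOFF = 1e-4


def queries_with_hits(blast_tsv, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below cutoff.

    blast_tsv is the text of a BLAST tabular report with the 12 standard
    columns: column 1 is the query ID and column 11 is the E-value.
    """
    hits = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[10]) <= cutoff:
            hits.add(row[0])
    return hits


def classify(cluster_ids, protein_hits, lepidoptera_hits):
    """Assign each cluster to one similarity category (hypothetical labels).

    A cluster with a significant protein hit counts as similar to known
    proteins; otherwise, a significant hit against Lepidoptera nucleotide
    data makes it Lepidoptera-only; otherwise it has no significant hit.
    """
    labels = {}
    for cid in cluster_ids:
        if cid in protein_hits:
            labels[cid] = "similar to known proteins"
        elif cid in lepidoptera_hits:
            labels[cid] = "Lepidoptera only"
        else:
            labels[cid] = "no significant similarity"
    return labels


# Toy reports with made-up cluster and subject IDs.
protein_report = (
    "c1\tP1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200\n"
    "c2\tP2\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30\n"
)
lepidoptera_report = "c2\tL1\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-10\t150\n"

labels = classify(
    ["c1", "c2", "c3"],
    queries_with_hits(protein_report),
    queries_with_hits(lepidoptera_report),
)
```

Counting the labels per species would give figures analogous to the "similar to known proteins" and Lepidoptera-only columns, under the stated assumptions.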