Amy T Walsh¹, Deborah A Triant¹, Justin J Le Tourneau¹, Md Shamimuzzaman¹, Christine G Elsik¹,²,³.
Abstract
We report an update of the Hymenoptera Genome Database (HGD; http://HymenopteraGenome.org), a genomic database of hymenopteran insect species. The number of species represented in HGD has nearly tripled, with fifty-eight hymenopteran species, including twenty bees, twenty-three ants, eleven wasps and four sawflies. With a reorganized website, HGD continues to provide the HymenopteraMine genomic data mining warehouse and JBrowse/Apollo genome browsers integrated with BLAST. We have computed Gene Ontology (GO) annotations for all species, greatly enhancing the GO annotation data gathered from UniProt with more than a ten-fold increase in the number of GO-annotated genes. We have also generated orthology datasets that encompass all HGD species and provide orthologue clusters for fourteen taxonomic groups. The new GO annotation and orthology data are available for searching in HymenopteraMine, and as bulk file downloads.
Year: 2022 PMID: 34747465 PMCID: PMC8728238 DOI: 10.1093/nar/gkab1018
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Genomes in HGD
| Group | Family | Species | New/Updatedᵃ | Assembly accessionᵇ | Assembly name | Refᶜ |
|---|---|---|---|---|---|---|
| Ants | Formicidae | | | GCF_204515.1 | Aech_3.9 | ( |
| | | | | GCF_143395.1 | Attacep1.0 | ( |
| | | | N | GCF_1594045.1 | Acol1.0 | ( |
| | | | U | GCF_3227725.1 | Cflo_v7.5 | ( |
| | | | | | Cobs_1.4 | ( |
| | | | N | GCF_1594065.1 | Ccosl1.0 | ( |
| | | | N | GCF_1313825.1 | ASM131382v1 | ( |
| | | | N | GCF_3651465.1 | ASM365146v1 | ( |
| | | | U | GCF_3227715.1 | Hsal_v8.5 | ( |
| | | | | GCF_217595.1 | Lhum_UMD_V04 | ( |
| | | | N | GCF_3260585.2 | ASM326058v2 | ( |
| | | | N | GCF_5281655.1 | TAMU_Nfulva_1 | NP |
| | | | N | GCF_10583005.1 | Obru_v1 | NP |
| | | | U | GCF_3672135.1 | Obir_v5.4 | ( |
| | | | U | GCF_187915.1 | Pbar_UMD_V03 | ( |
| | | | N | GCF_2006095.1 | ASM200609v1 | ( |
| | | | U | GCF_188075.2 | Si_gnH | ( |
| | | | N | GCF_3070985.1 | ASM307098v1 | NP |
| | | | N | GCF_1594075.1 | Tcor1.0 | ( |
| | | | N | GCF_1594115.1 | Tsep1.0 | ( |
| | | | N | GCF_1594055.1 | Tzet1.0 | ( |
| | | | N | GCF_949405.1 | V.emery_V1.0 | ( |
| | | | | GCF_956235.1 | wasmannia.A_1 | NP |
| Bees | Apidae | | N | GCF_1442555.1 | ACSNU-2.0 | ( |
| | | | N | GCF_469605.1 | Apis_dorsata_1.3 | ( |
| | | | N | GCF_184785.2 | Aflo_1.1 | ( |
| | | | U | GCF_3254395.2 | Amel_HAv3.1 | ( |
| | | | N | GCF_11952205.1 | Bbif_JDL3187 | ( |
| | | | U | GCF_188095.3 | BIMP_2.2 | ( |
| | | | U | GCF_214255.1 | Bter_1 | ( |
| | | | N | GCF_11952275.1 | Bvanc_JDL1245 | ( |
| | | | N | GCF_11952255.1 | Bvos_JDL3184-5_v1.1 | ( |
| | | | N | GCF_1652005.1 | ASM165200v1 | ( |
| | | | U | GCF_1483705.1 | ASM148370v1 | ( |
| | | | U | GCF_1263275.1 | ASM126327v1 | ( |
| | | | U | GCA_1276565.1 | ASM127656v1 | ( |
| | Halictidae | | U | GCF_1272555.1 | ASM127255v1 | ( |
| | | | | | Lalb_v2 | ( |
| | | | N | GCF_11865705.1 | USU_MGEN_1.2 | ( |
| | | | N | GCF_3710045.1 | USU_Nmel_1.2 | ( |
| | Megachilidae | | | GCF_220905.1 | MROT_1 | ( |
| | | | N | GCF_4153925.1 | Obicornis_v3 | ( |
| | | | N | GCF_12274295.1 | USDA_Olig_1 | NP |
| Sawflies | Cephidae | | N | GCF_341935.1 | Ccin1 | ( |
| | Diprionidae | | N | GCF_1263575.1 | Nlec1.0 | ( |
| | Orussidae | | N | GCF_612105.2 | Oabi_2 | ( |
| | Tenthredinidae | | N | GCF_344095.2 | Aros_2 | ( |
| Wasps (non-parasitoid) | Vespidae | | N | GCF_1313835.1 | ASM131383v1 | ( |
| | | | N | GCF_1465965.1 | Pdom_r1.2 | ( |
| Wasps (parasitoid) | Agaonidae | | N | GCF_503995.1 | CerSol_1 | ( |
| | Braconidae | | N | GCF_13357705.1 | ASM1335770v1 | NP |
| | | | N | GCF_1412515.2 | Dall2.0 | ( |
| | | | N | GCF_806365.1 | ASM80636v1 | ( |
| | | | N | GCF_572035.2 | Mdem2 | ( |
| | Cynipidae | | N | GCF_10883055.1 | B_treatae_v1 | NP |
| | Encyrtidae | | N | GCF_648655.2 | Cflo_2 | ( |
| | Pteromalidae | | U | GCF_9193385.2 | Nvit_psr_1.1 | ( |
| | Trichogrammatidae | | N | GCF_599845.2 | Tpre_2 | ( |
| Fly (Dipteran outgroup) | Drosophilidae | | N | GCF_1215.4 | Release_6_plus_ISO1_MT | ( |
ᵃ New (N) genome or updated (U) genome assembly and/or gene set since the previous update report (1). A blank cell in the New/Updated column indicates no changes in genome assembly or gene set.
ᵇ A blank cell in the assembly accession column indicates the genome assembly is not available at NCBI.
ᶜ NP denotes ‘not published’. Links for data usage policies for these species are provided on the HGD ‘Genome Publications’ page.
Figure 1. The number of genes with GO annotations for each species in the UniProt-GOA and HGD GO Annotation datasets.
Taxonomic groups in the HGD-Ortho dataset
| Taxonomic groupᵃ | Rank | Number of species |
|---|---|---|
| Holometabola | superorder | 59 |
| >Hymenoptera | order | 58 |
| >>Aculeata | infraorder | 45 |
| >>>Apoidea | superfamily | 20 |
| >>>>Apidae | family | 13 |
| >>>>>>Apis | genus | 4 |
| >>>>>>Bombus | genus | 5 |
| >>>>Halictidae | family | 4 |
| >>>Formicoideaᵃ | superfamily | 23 |
| >>>>Formicidaeᵇ | family | 23 |
| >>>>>Formicinae | subfamily | 4 |
| >>>>>Myrmicinae | subfamily | 14 |
| >>Parasitoida | infraorder | 9 |
| >>>>Chalcidoidea | family | 4 |
| >>>>Ichneumonoidea | family | 4 |
ᵃ Indentations shown as ‘>’ represent taxonomic hierarchy.
ᵇ In HGD, all species of the superfamily Formicoidea are within the family Formicidae, and are labeled as the latter in HymenopteraMine.
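The ‘>’ markers in the table above encode depth, and some levels are skipped (for example, the genus rows sit two levels below Apidae). As a minimal sketch, assuming the labels are available as plain strings, the parent of each group can be recovered by treating the nearest preceding row with fewer ‘>’ characters as the parent. The function name `parse_hierarchy` is hypothetical and is not part of HGD or HymenopteraMine.

```python
def parse_hierarchy(labels):
    """Map each taxon name to its parent, using the count of leading '>' as depth."""
    parents = {}
    stack = []  # (depth, name) pairs forming the current chain of ancestors
    for label in labels:
        name = label.lstrip(">")
        depth = len(label) - len(name)
        # Drop entries that are at the same depth or deeper: they cannot be ancestors.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parents[name] = stack[-1][1] if stack else None
        stack.append((depth, name))
    return parents


# A few rows copied from the table above.
labels = [
    "Holometabola",
    ">Hymenoptera",
    ">>Aculeata",
    ">>>Apoidea",
    ">>>>Apidae",
    ">>>>>>Apis",
    ">>>>>>Bombus",
    ">>>>Halictidae",
]
parents = parse_hierarchy(labels)
for name, parent in parents.items():
    print(name, "<-", parent)
```

Because the parent is found by depth comparison and not by assuming consecutive levels, the skipped level between Apidae and its genera does not need special handling.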
Figure 2. An example of the use of the HGD-Ortho dataset to gather orthologue sequences for later molecular evolution analyses. (A) ‘Gene ID → Homologues’ template query and output identifying orthologues among Parasitoida, along with the orthologue cluster identifier, for a gene of Nasonia vitripennis. The N. vitripennis RefSeq gene id is entered into the box, and ‘Parasitoida’ is selected in the pulldown menu. (B) ‘Orthologue Cluster ID → Genes’ template query and output, showing all pairwise gene relationships in the orthologue cluster identified in the previous query. The ‘Save as List’ menu in the output page is used to save a list of genes in the orthologue cluster. (C) ‘Gene ID → Protein and Coding Sequences’ template query and output. Rather than entering a single ‘Gene DB Identifier’, the box next to ‘constrain to be IN’ is checked and the gene list saved previously is selected in the pulldown menu. The output includes coding sequences, protein sequences, identifiers and sequence lengths, which can then be used to select the longest coding sequence of genes with multiple transcripts for downstream molecular evolution analyses. The protein sequences are in the rightmost column and are not visible in this figure. The ‘Export’ button in the top right corner can be used to export the sequences.
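The last step of the workflow in Figure 2, selecting the longest coding sequence for genes with multiple transcripts, could be sketched as follows once the results have been exported. This is an illustration under stated assumptions, not the authors' procedure: the column names (`gene_id`, `transcript_id`, `cds_length`), the tab-separated layout and the example rows are all hypothetical, and a real HymenopteraMine export will use its own column headers.

```python
import csv
import io


def longest_cds_per_gene(rows):
    """Return {gene_id: row}, keeping the row with the longest coding sequence.

    On ties, the first transcript encountered is kept.
    """
    best = {}
    for row in rows:
        gene = row["gene_id"]
        length = int(row["cds_length"])
        if gene not in best or length > int(best[gene]["cds_length"]):
            best[gene] = row
    return best


# Hypothetical export; identifiers and lengths are invented for illustration.
EXAMPLE_TSV = (
    "gene_id\ttranscript_id\tcds_length\n"
    "geneA\ttxA1\t900\n"
    "geneA\ttxA2\t1200\n"
    "geneB\ttxB1\t450\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
selected = longest_cds_per_gene(rows)
for gene_id, row in sorted(selected.items()):
    print(gene_id, row["transcript_id"], row["cds_length"])
```

To run this on a real export, replace `io.StringIO(EXAMPLE_TSV)` with an open file handle and adjust the three column names to match the exported header.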