Steven T Hill, Ramcharan Sudarsanam, John Henning, David Hendrix.
Abstract
Hop (Humulus lupulus L. var. lupulus) is a dioecious plant of worldwide significance, used primarily for bittering and flavoring in brewing beer. Studies on the medicinal properties of several unique compounds produced by hop have led to additional interest from the pharmacy and healthcare industries, as well as from livestock production, where hop serves as a natural antibiotic. Genomic research in hop has resulted in a published draft genome and transcriptome assemblies. As research into the genomics of hop has gained interest, there is a critical need for centralized online genomic resources. To support the growing research community, we report the development of an online resource, "HopBase.org." In addition to providing a gene annotation for the existing Shinsuwase draft genome, HopBase makes available genome assemblies and annotations for both the cultivar "Teamaker" and male hop accession number USDA 21422M. These genome assemblies and gene annotations, along with other common data and coupled with a genome browser and BLAST database, enable the hop community to enter the genomic age. The HopBase genomic resource is accessible at http://hopbase.org and http://hopbase.cgrb.oregonstate.edu.
Year: 2017 PMID: 28415075 PMCID: PMC5467566 DOI: 10.1093/database/bax009
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Sequencing libraries used for the Teamaker genome assembly
| Mate pair insert size (bp) | Number of sequenced reads | Number of single-copy + QC Reads | Portion removed from dedup + QC | Estimated coverage |
|---|---|---|---|---|
| 9000 | 796 503 434 | 164 452 668 | 0.794 | 6.091 |
| 6000 | 363 664 930 | 96 117 630 | 0.736 | 3.560 |
| 5000 | 830 281 020 | 611 993 950 | 0.263 | 22.666 |
| 3000 | 618 181 114 | 379 821 668 | 0.386 | 14.067 |
| Subtotal (9000–3000 bp libraries) | 2 608 630 498 | 1 252 385 916 | 0.520 | 46.385 |
| 143 | 1 655 421 082 | 708 994 796 | 0.572 | 26.259 |
| 173 | 1 176 857 672 | 606 418 512 | 0.485 | 22.460 |
| 250 | 419 621 690 | 388 494 910 | 0.074 | 14.389 |
| Subtotal (143–250 bp libraries) | 3 251 900 444 | 1 703 908 218 | 0.476 | 63.108 |
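The derived columns of the library table can be recomputed from the read counts. The sketch below is illustrative only: the table does not state the read length or genome size behind "Estimated coverage", so the values `READ_LENGTH_BP = 100` and `GENOME_SIZE_BP = 2.7e9` are assumptions chosen because they reproduce the tabulated figures, and the function names are hypothetical.

```python
# Assumed parameters (not stated in the table); they reproduce the
# "Estimated coverage" column to three decimal places.
READ_LENGTH_BP = 100
GENOME_SIZE_BP = 2.7e9

# insert size (bp) -> (sequenced reads, reads kept after dedup + QC)
LIBRARIES = {
    9000: (796_503_434, 164_452_668),
    6000: (363_664_930, 96_117_630),
    5000: (830_281_020, 611_993_950),
    3000: (618_181_114, 379_821_668),
    143: (1_655_421_082, 708_994_796),
    173: (1_176_857_672, 606_418_512),
    250: (419_621_690, 388_494_910),
}


def portion_removed(sequenced: int, kept: int) -> float:
    """Fraction of reads discarded by deduplication and quality control."""
    return 1 - kept / sequenced


def estimated_coverage(kept: int,
                       read_length: int = READ_LENGTH_BP,
                       genome_size: float = GENOME_SIZE_BP) -> float:
    """Sequencing depth contributed by the retained reads."""
    return kept * read_length / genome_size


if __name__ == "__main__":
    for insert_size, (sequenced, kept) in LIBRARIES.items():
        print(insert_size,
              round(portion_removed(sequenced, kept), 3),
              round(estimated_coverage(kept), 3))
```

Summing the sequenced and retained reads over the first four and the last three libraries gives the two subtotal rows of the table.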
Figure 1. The transcriptome-guided assembly (TGA) pipeline. Transcripts are combined to form a union model consisting of all exons present for each isoform. The resulting sequence is used as the initial "assembly." The assembly is aligned to the DNA reads using BLAST, and all aligning reads are retrieved. The reads are assembled using Velvet and ordered according to the order of the corresponding exons in the transcript models. After gap filling, this process is repeated until subsequent applications do not change the assembly.
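The first step of the pipeline, building a union model from the exons of every isoform, can be illustrated with a simple interval merge. This is a minimal sketch under stated assumptions: the caption does not say how the union is computed, so merging overlapping or touching exon intervals is one plausible reading, and the function name `union_exons` and the 0-based half-open `(start, end)` coordinates are hypothetical.

```python
from typing import Iterable, List, Tuple

Exon = Tuple[int, int]  # assumed 0-based, half-open (start, end)


def union_exons(isoforms: Iterable[Iterable[Exon]]) -> List[Exon]:
    """Merge the exons of all isoforms into sorted, non-overlapping intervals."""
    # Pool every exon from every isoform and sort by start coordinate.
    intervals = sorted(exon for isoform in isoforms for exon in isoform)
    merged: List[Exon] = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    isoform_a = [(100, 200), (300, 400)]
    isoform_b = [(150, 250), (500, 600)]
    print(union_exons([isoform_a, isoform_b]))
```

For the two toy isoforms above, the union model contains three exons: the overlapping first exons collapse into one interval, and the exons unique to each isoform are kept.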
Distribution of repeats in Teamaker assembly by length
| Repeat class | 100–200 bp | 201–300 bp | 301+ bp | Total |
|---|---|---|---|---|
| LTR | 2094 | 780 | 353 | 3227 |
| Unclear | 155 | 51 | 52 | 258 |
| DNA Transposon | 1024 | 240 | 60 | 1324 |
| Retro | 1533 | 545 | 212 | 2290 |
| LINE | 303 | 601 | 618 | 1522 |
| SINE | 621 | 62 | 3 | 686 |
| nonLTR | 235 | 67 | 6 | 308 |
Gene annotation for Shinsuwase and Teamaker assemblies
| Annotation step | Shinsuwase | Teamaker |
|---|---|---|
| StringTie Transcripts | 1 120 693 | 1 137 597 |
| StringTie w/SVM Transcripts | 97 288 | 77 118 |
| MAKER genes | 46 735 | 39 831 |
| MAKER after pseudogene removal | 39 672 | 28 434 |
| MAKER after repeat removal | 35 482 | 24 919 |
| Genes with unknown protein homology | 13 281 | 8758 |
| Genes with protein homology | 22 201 | 16 161 |
| Total remaining genes | 35 482 | 24 919 |
Figure 2. A schematic representation of HopBase. HopBase consists of three genome assemblies: Teamaker, Shinsuwase and the male accession number 21422M. There is a JBrowse genome browser for each assembly, as well as an FTP site for downloading sequences and annotation files for each assembly. We also provide a BLAST interface for aligning sequences to mRNA, protein and genome collections.
Figure 3. HopBase provides a JBrowse genome browser consisting of multiple tracks such as gene models, ESTs, alignments from TAIR, and RNA-seq data.
Comparison of Shinsuwase assembly and Teamaker assembly
| Metric | Shinsuwase v1 | HopBase Teamaker v1 (current) |
|---|---|---|
| Transcriptome Assembly alignments | 70% | 76% |
| Public ESTs alignments | 94% | 96% |
| CEGMA genes | 89% | 85% |
| NG50 (without Ns) | 5050 | 9231 |
| NG50 (with Ns) | N/A | 41 006 |
| Assembly size (with Ns) | 2 049 209 000 | 2 770 850 934 |
| Assembly size (without Ns) | 1 775 776 000 | 1 766 890 029 |
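NG50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the estimated genome size (N50 uses the assembly size instead). The sketch below shows the standard calculation on toy data; the function name `ng50` and the example lengths are hypothetical, and the genome size estimate used for the table is not given here.

```python
from typing import Iterable, Optional


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return NG50 for the given sequence lengths and genome size estimate.

    Returns None when the sequences cover less than half the genome.
    """
    target = genome_size / 2
    covered = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return None


if __name__ == "__main__":
    toy_lengths = [10, 8, 6, 4, 2]
    print(ng50(toy_lengths, genome_size=40))                # NG50
    print(ng50(toy_lengths, genome_size=sum(toy_lengths)))  # N50
```

Because NG50 is normalized by genome size rather than assembly size, it allows assemblies of different total lengths, such as the two compared in the table, to be judged against the same target.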
Estimates of co-ancestry as calculated by use of phi (Manichaikul et al., 2010)
| INDV1 | INDV2 | N_AaAa | N_AAaa | N1_Aa | N2_Aa | PHI |
|---|---|---|---|---|---|---|
| USDA21422M | USDA21422M | 316 418 | 0 | 316 418 | 316 418 | 0.5 |
| USDA21422M | Cordifolius | 26 110 | 79 853 | 316 418 | 53 545 | −0.361106 |
| USDA21422M | Shinsuwase | 237 926 | 1821 | 316 418 | 389 779 | 0.331754 |
| USDA21422M | Teamaker | 245 479 | 18 564 | 316 418 | 324 193 | 0.325238 |
| Cordifolius | USDA21422M | 26 110 | 79 853 | 53 545 | 316 418 | −0.361106 |
| Cordifolius | Cordifolius | 53 545 | 0 | 53 545 | 53 545 | 0.5 |
| Cordifolius | Shinsuwase | 20 020 | 20 108 | 53 545 | 389 779 | −0.0455558 |
| Cordifolius | Teamaker | 23 928 | 55 859 | 53 545 | 324 193 | −0.23241 |
| Shinsuwase | USDA21422M | 237 926 | 1821 | 389 779 | 316 418 | 0.331754 |
| Shinsuwase | Cordifolius | 20 020 | 20 108 | 389 779 | 53 545 | −0.0455558 |
| Shinsuwase | Shinsuwase | 389 779 | 0 | 389 779 | 389 779 | 0.5 |
| Shinsuwase | Teamaker | 247 013 | 963 | 389 779 | 324 193 | 0.343273 |
| Teamaker | USDA21422M | 245 479 | 18 564 | 324 193 | 316 418 | 0.325238 |
| Teamaker | Cordifolius | 23 928 | 55 859 | 324 193 | 53 545 | −0.23241 |
| Teamaker | Shinsuwase | 247 013 | 963 | 324 193 | 389 779 | 0.343273 |
| Teamaker | Teamaker | 324 193 | 0 | 324 193 | 324 193 | 0.5 |
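The PHI column can be recomputed from the four count columns. The sketch below assumes the estimator phi = (N_AaAa − 2·N_AAaa) / (N1_Aa + N2_Aa), where N_AaAa counts sites heterozygous in both individuals, N_AAaa counts sites with opposite homozygous genotypes, and N1_Aa and N2_Aa count heterozygous sites in each individual. This form is an assumption inferred from the column names; it reproduces the tabulated values, but the table itself does not state the formula. The function name `kinship_phi` is hypothetical.

```python
def kinship_phi(n_aa_aa_het: int, n_opposite_hom: int,
                n1_het: int, n2_het: int) -> float:
    """Kinship estimate from shared-heterozygote and opposite-homozygote counts.

    n_aa_aa_het    -- N_AaAa: sites heterozygous in both individuals
    n_opposite_hom -- N_AAaa: sites with opposite homozygous genotypes
    n1_het, n2_het -- N1_Aa, N2_Aa: heterozygous sites per individual
    """
    return (n_aa_aa_het - 2 * n_opposite_hom) / (n1_het + n2_het)


if __name__ == "__main__":
    # Counts copied from the table rows.
    rows = {
        ("USDA21422M", "USDA21422M"): (316_418, 0, 316_418, 316_418),
        ("USDA21422M", "Cordifolius"): (26_110, 79_853, 316_418, 53_545),
        ("USDA21422M", "Teamaker"): (245_479, 18_564, 316_418, 324_193),
        ("Shinsuwase", "Teamaker"): (247_013, 963, 389_779, 324_193),
    }
    for pair, counts in rows.items():
        print(pair, round(kinship_phi(*counts), 6))
```

Under this estimator an individual compared with itself scores 0.5, and negative values, as seen for the comparisons involving Cordifolius, indicate that opposite homozygous genotypes outnumber what the shared heterozygous sites can offset.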