D A Triant, J J Le Tourneau, C M Diesh, D R Unni, M Shamimuzzaman, A T Walsh, J Gardiner, A K Goldkamp, Y Li, H N Nguyen, C Roberts, Z Zhao, L J Alexander, J E Decker, R D Schnabel, S G Schroeder, T S Sonstegard, J F Taylor, R M Rivera, D E Hagen, C G Elsik.
Abstract
With the availability of a new highly contiguous Bos taurus reference genome assembly (ARS-UCD1.2), it is an opportune time to upgrade the bovine gene set by seeking input from researchers. Furthermore, advances in graphical genome annotation tools now make it possible for researchers to leverage sequence data generated with the latest technologies to collaboratively curate genes. For many years the Bovine Genome Database (BGD) has provided tools such as the Apollo genome annotation editor to support manual bovine gene curation. The goal of this paper is to explain the reasoning behind the decisions made in the manual gene curation process while providing examples using the existing BGD tools. We will describe the sources of gene annotation evidence provided at the BGD, including RNA-seq and Iso-Seq data. We will also explain how to interpret various data visualizations when curating gene models, and will demonstrate the value of manual gene annotation. The process described here can be applied to manual gene curation for other species with similar tools. With a better understanding of manual gene annotation, researchers will be encouraged to edit gene models and contribute to the enhancement of livestock gene sets.
Keywords: Bos taurus; RNA-seq; gene prediction; genome annotation; genome annotation tools
Year: 2020 PMID: 32537769 PMCID: PMC7540445 DOI: 10.1111/age.12962
Source DB: PubMed Journal: Anim Genet ISSN: 0268-9146 Impact factor: 3.169
Figure 1. Bovine Genome Database (BGD) JBrowse. This view of the BGD JBrowse genome browser shows an example of a split/merge disagreement between Ensembl and RefSeq genes. Ensembl shows two genes, one of which has two transcripts, whereas RefSeq shows one gene. The ‘Iso‐Seq Combined PASA’ track includes two transcripts that are similar to the RefSeq transcript, with identifiers that include the two‐letter code ‘JE’ for jejunum. The green arcs in the ‘Jejunum SE RNAseq junctions’ track highlight RNA‐seq splice junctions. The ‘Jejunum SE BAM dense’ track shows RNA‐seq read alignments as small red or blue bars and connections between parts of spliced reads as gray lines. Both the Iso‐Seq and the RNA‐seq data support either the RefSeq transcript or a merge of the Ensembl gene models into one gene.
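The split/merge disagreement described in this caption can also be flagged programmatically before opening the browser. The following is a minimal sketch, assuming gene models have already been reduced to simple genomic intervals; the `Gene` record, the `find_split_merge` helper, the identifiers and the coordinates are all hypothetical and are not part of the BGD tools.

```python
from typing import Dict, List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (1-based, inclusive coordinates)."""
    gene_id: str
    start: int
    end: int
    strand: str


def overlaps(a: Gene, b: Gene) -> bool:
    """True if two genes share at least one base on the same strand."""
    return a.strand == b.strand and a.start <= b.end and b.start <= a.end


def find_split_merge(set_a: List[Gene], set_b: List[Gene]) -> Dict[str, List[str]]:
    """Map each gene in set_a to the set_b genes it overlaps, keeping only
    cases where one set_a gene spans two or more set_b genes."""
    candidates = {}
    for a in set_a:
        hits = [b.gene_id for b in set_b if overlaps(a, b)]
        if len(hits) >= 2:
            candidates[a.gene_id] = hits
    return candidates


# Illustrative, made-up coordinates: one gene in one set, two in the other.
refseq = [Gene("refseq_gene_1", 1000, 9000, "+")]
ensembl = [
    Gene("ensembl_gene_1", 1000, 4000, "+"),
    Gene("ensembl_gene_2", 5000, 9000, "+"),
]

conflicts = find_split_merge(refseq, ensembl)
print(conflicts)
```

A candidate flagged this way only marks a locus for inspection; deciding between the split and the merged model still depends on the Iso‐Seq and RNA‐seq evidence, as in the figure.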
Figure 2. BGD Apollo. After logging in to Apollo, an Information Panel appears on the right. To the left of the Information Panel is the browser, which now includes the Evidence Area (equivalent to the JBrowse view) and the Editing Area, with a light yellow background, above the Evidence Area. Here, the RefSeq transcript and an Iso‐Seq transcript have been dragged to the Editing Area. Notice that the identifiers of the transcripts in the Editing Area both resemble the RefSeq identifier, because the RefSeq transcript was the first one added. One transcript has been clicked and is now outlined in red. Exon boundaries in the gene prediction and Iso‐Seq tracks are highlighted in red if they agree with the exon boundaries in the outlined annotation. The ‘Jejunum SE BAM dense’ track has been configured to hide unspliced alignments using a pulldown menu available by clicking the track label. The Information Panel can be hidden from view to increase the browser width by clicking the greater‐than sign near the upper left of the panel. The ‘Select Tracks’ tab seen on the left of the browser in Fig. 1 can be brought back into view by clicking the icon that resembles a list under the greater‐than sign.
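The boundary highlighting described in this caption amounts to comparing exon coordinates between the selected annotation and each evidence feature. The sketch below illustrates that comparison under simplifying assumptions: exons are plain `(start, end)` tuples on the same strand, and a boundary "agrees" when its coordinate exactly equals a start or end in the annotation. The function name and the coordinates are hypothetical, and this is not a description of how Apollo implements the feature.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive


def matching_boundaries(
    annotation_exons: List[Exon], evidence_exons: List[Exon]
) -> List[Tuple[bool, bool]]:
    """For each evidence exon, report whether its start and its end coincide
    with an exon start or exon end in the selected annotation."""
    annotation_starts = {start for start, _ in annotation_exons}
    annotation_ends = {end for _, end in annotation_exons}
    return [
        (start in annotation_starts, end in annotation_ends)
        for start, end in evidence_exons
    ]


# Made-up coordinates: the evidence transcript differs at its outer ends only.
annotation = [(100, 200), (300, 400), (500, 600)]
evidence = [(120, 200), (300, 400), (500, 650)]

agreement = matching_boundaries(annotation, evidence)
print(agreement)
```

In this made-up case the internal splice boundaries agree while the outer transcript ends differ.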