PomBase 2015: updates to the fission yeast database.

Mark D McDowall, Midori A Harris, Antonia Lock, Kim Rutherford, Daniel M Staines, Jürg Bähler, Paul J Kersey, Stephen G Oliver, Valerie Wood.

Abstract

PomBase (http://www.pombase.org) is the model organism database for the fission yeast Schizosaccharomyces pombe. PomBase provides a central hub for the fission yeast community, supporting both exploratory and hypothesis-driven research. It provides users easy access to data ranging from the sequence level, to molecular and phenotypic annotations, through to the display of genome-wide high-throughput studies. Recent improvements to the site extend annotation specificity, improve usability and allow for monthly data updates. Both in-house curators and community researchers provide manually curated data to PomBase. The genome browser provides access to published high-throughput data sets and the genomes of three additional Schizosaccharomyces species (Schizosaccharomyces cryophilus, Schizosaccharomyces japonicus and Schizosaccharomyces octosporus).
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25361970      PMCID: PMC4383888          DOI: 10.1093/nar/gku1040

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The fission yeast Schizosaccharomyces pombe is a unicellular eukaryote that has been used as a model organism for studying a diverse array of biological processes, from the cell cycle to signaling, for over 60 years (1). It was the sixth eukaryotic organism to have its genome completely sequenced (2). With a thriving community generating data from small and large-scale projects, a central hub to curate and integrate information is vital to facilitate data interpretation and hypothesis generation, and to guide further research. PomBase (http://www.pombase.org) was launched in 2011 as the model organism database for fission yeast (3). The PomBase portal provides centralized access to gene- and genome-scale information, emphasizing data acquired by manual literature curation. In a novel community curation initiative, fission yeast researchers now contribute significantly to gene annotation, using the Canto online curation tool (4). PomBase presents information in gene-specific pages that include summary data on each gene and its product, such as its biological functions, cellular localization, phenotype data, modifications, interactions, regulation and gene expression. PomBase offers a customized Ensembl Genome browser (5) to provide access to the genome sequence and features, and to visualize high-throughput data sets in a genomic context.

BIOLOGICAL DATA

PomBase curators focus on extracting data from historical papers and on providing help and guidance to researchers who curate their own papers using Canto. The inclusion of genome-scale datasets has resulted in a large increase in the volume of data curated.

High-throughput datasets

PomBase gene pages typically include data from various types of large-scale experiments, such as gene expression data (6,7), phenotypic analysis (8,9) and interaction data (10). Within the genome browser, PomBase hosts sequence-based datasets from a variety of high-throughput experimental techniques, such as nucleosome positioning (11), transcriptomic data (see Figure 1A) (6,11–12), replication profiling (13), polyadenylation sites (14,15) and chromatin binding (16). The datasets included to date are those requested by the fission yeast community, and for which the publication authors have provided data to PomBase.
Figure 1.

Views available in the genome browser. (A) Region display with two tracks enabled, displaying RNA-Seq coverage data (12). The two tracks show the forward (top) and reverse (bottom) strand reads along with the genes that are mapped to the region. (B) Region comparison display showing the regions of alignment between Schizosaccharomyces pombe and Schizosaccharomyces japonicus (top) and Schizosaccharomyces octosporus (bottom) in green. (C) Gene tree view generated using the Compara framework, with the gene of interest, pom1 (SPAC2F7.03c), highlighted in red.

Additional species

In addition to S. pombe, the genomes of Schizosaccharomyces cryophilus, Schizosaccharomyces japonicus and Schizosaccharomyces octosporus (12) are now accessible via the genome browser, as the result of collaboration with the Ensembl Genomes project (17). All-against-all DNA alignments between the four Schizosaccharomyces species can also be displayed in the PomBase browser (see Figure 1B). The new Schizosaccharomyces species are among the 52 fungal species included in a protein multiple sequence alignment (see Figure 1C). S. pombe also represents the fission yeast clade in protein multiple sequence alignments of a further set of 178 species covering a broad taxonomic range including human and bacteria, which are visible in the genome browser.

INFRASTRUCTURE AND PROCEDURAL IMPROVEMENTS

Manually curated data is stored within a Chado relational database (18). During the release procedure, a snapshot of the curation database is created and the annotations are imported into an Ensembl schema on a MySQL database. The Ensembl schema provides the back-end architecture for the genome browser and houses the annotations for the PomBase site. The import pipeline that transfers data from the PomBase Chado curation database to the Ensembl database has been altered to accommodate increasing annotation complexity (as described below). Update procedures have been improved to implement data consistency checks on the database and to use the Selenium testing framework (http://www.seleniumhq.org) on the web interface. This more robust infrastructure has enabled PomBase to implement a monthly release cycle. Improved back-end data storage and retrieval has reduced the gene page loading time, even as the amount of data presented has increased, enabling users to navigate through multiple genes with minimal delays.
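The text does not say which consistency checks are run, so the following is only a minimal sketch of one plausible kind: comparing the annotation rows in the curation snapshot with the rows that arrive after import. The function name `check_import`, the `(gene_id, term_id)` row shape and the in-memory sets are assumptions made for illustration; the real pipeline works against Chado and Ensembl schemas.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical row shape: (gene systematic ID, ontology term ID)
Row = Tuple[str, str]


def check_import(curated: Iterable[Row],
                 imported: Iterable[Row],
                 known_genes: Set[str]) -> List[str]:
    """Return human-readable problems found when comparing two row sets."""
    curated_set, imported_set = set(curated), set(imported)
    problems: List[str] = []
    # Rows present in the curation snapshot but lost during import
    for row in sorted(curated_set - imported_set):
        problems.append(f"missing after import: {row}")
    # Rows that appear only on the imported side
    for row in sorted(imported_set - curated_set):
        problems.append(f"unexpected after import: {row}")
    # Imported rows that point at a gene the target database does not know
    for gene, _term in sorted(imported_set):
        if gene not in known_genes:
            problems.append(f"unknown gene: {gene}")
    return problems


if __name__ == "__main__":
    curated = [("SPBC32C12.02", "GO:0045944"), ("SPAC1D4.06c", "GO:0004672")]
    imported = [("SPBC32C12.02", "GO:0045944")]
    for problem in check_import(curated, imported, {"SPBC32C12.02", "SPAC1D4.06c"}):
        print(problem)
```

An empty result would mean the two sides agree under these particular checks; a release script could refuse to publish otherwise.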

INTEGRATING AND VISUALIZING DATA

To maintain a readable display of increasingly complex data, and incorporate new data types, there have been major improvements to the organization and presentation of data on the gene pages. The most extensive changes have affected three key regions of the gene page: the displays of Gene Ontology (GO) (19,20) annotations, Fission Yeast Phenotype Ontology (FYPO) annotations (21) and gene expression data. More subtle changes have also been introduced throughout the gene pages.

Ontology annotations and extensions

The most significant change affecting annotation complexity in PomBase is the introduction of ‘annotation extensions’ that increase the expressivity of annotations to ontology terms. With active participation by PomBase curators, the GO Consortium introduced annotation extensions in 2013 (22) to enable curators to capture additional contextual details such as effector–target relationships and temporal or spatial aspects of biological processes. Whereas previously each GO annotation combined a single gene product with a single GO term, and was independent from any other GO annotations, extended annotations can capture interconnections between multiple annotations as well as links to additional ontologies. Each annotation extension consists of a relation and an identifier that refers to another ontology term (GO, SO (23), ChEBI (24), PSI-MOD (25), etc.) or another gene. An annotation may have one or more extensions, each with its own relationships and sources, and ‘compound’ extensions can be made by combining single extensions. To date, PomBase curators have added extensions to over 2000 GO annotations. PomBase has also adopted the annotation extension model for phenotype (FYPO) and gene expression annotations, as described below. Table 1 shows the total number of annotations in PomBase as of August 2014, and the number that have extensions, for four ontologies plus gene expression.
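The structure described above (a relation plus an identifier, optionally combined into compound extensions) can be sketched as a small data model. This is an illustration only, not the PomBase or GO Consortium schema: the class names, the relation strings `has_substrate` and `happens_during`, and the placeholder identifier `GO:0000000` are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Extension:
    relation: str  # e.g. "has_substrate" (illustrative relation name)
    target: str    # an ontology term ID or a gene ID


@dataclass
class Annotation:
    gene: str
    term: str
    # Each inner tuple is one extension; a tuple holding more than one
    # part models a "compound" extension built from single extensions.
    extensions: List[Tuple[Extension, ...]] = field(default_factory=list)

    def has_extensions(self) -> bool:
        return bool(self.extensions)

    def is_compound(self) -> bool:
        return any(len(group) > 1 for group in self.extensions)


# A plain annotation: one gene product, one term, no extension.
plain = Annotation(gene="SPBC32C12.02", term="GO:0045944")

# An extended annotation: csk1 protein kinase activity acting on cdc2.
extended = Annotation(
    gene="SPAC1D4.06c",
    term="GO:0004672",  # protein kinase activity
    extensions=[(Extension("has_substrate", "SPBC11B10.09"),)],
)

# A compound extension; "GO:0000000" is a placeholder, not a real term.
compound = Annotation(
    gene="SPAC1D4.06c",
    term="GO:0004672",
    extensions=[(
        Extension("has_substrate", "SPBC11B10.09"),
        Extension("happens_during", "GO:0000000"),
    )],
)
```

Keeping each extension as a tuple lets a single annotation carry several independent extensions while still distinguishing them from the parts of one compound extension.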
Table 1.

Summary of annotations and extensions in PomBase as of September 2014

Curated data type                Annotation count   Annotation extensions   Gene coverage
Gene expression                  40 403             40 403                  7017
Phenotype (FYPO)                 36 382             11 722                  4942
Gene Ontology (GO)               37 224             2149                    5301
Modifications (MOD)              11 265             7255                    2009
Protein sequence features (SO)   943                N/A                     764

Annotation Count: total number of annotations of each type, including those with extensions; Annotation extensions: number of annotations that have one or more extensions apiece; Gene coverage: number of genes that have at least one annotation of the given type.

To accommodate annotation extensions, PomBase has adapted its Chado and Ensembl relational database schemata and loading procedures and enhanced the gene page ontology annotation displays. On PomBase gene pages, annotation extensions are shown in rows below the ontology term, with the relevant evidence code and annotation source. Identifiers and relation strings are converted to human-friendly text, such as a gene name or ontology term, wherever possible. For example, Figure 2A shows annotations to GO:0045944 from the ste11 (SPBC32C12.02) gene page. Annotations without extensions are displayed first, followed by those with extensions, and the bottom row shows a compound annotation extension (26).
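The display rules just described (plain annotations first, then extended ones, with identifiers turned into readable labels) might be sketched as follows. The function `display_rows`, the relation strings and the example rows are hypothetical; placing compound extensions last is an assumption that mirrors the Figure 2A example rather than a documented rule.

```python
from typing import Dict, List, Tuple

# One row = a list of (relation, identifier) pairs; an empty list means
# the annotation has no extension. This row shape is an assumption.
Row = List[Tuple[str, str]]


def display_rows(rows: List[Row], labels: Dict[str, str]) -> List[str]:
    """Order rows and render them as human-friendly strings."""
    def sort_key(row: Row) -> int:
        # 0 = no extension, 1 = single extension, 2 = compound extension
        return min(len(row), 2)

    rendered: List[str] = []
    for row in sorted(rows, key=sort_key):  # sorted() is stable
        if not row:
            rendered.append("(no extension)")
            continue
        parts = [
            # Fall back to the raw identifier when no label is known
            f"{relation.replace('_', ' ')} {labels.get(identifier, identifier)}"
            for relation, identifier in row
        ]
        rendered.append(", ".join(parts))
    return rendered


if __name__ == "__main__":
    labels = {"SPBC11B10.09": "cdc2"}
    rows = [
        [("has_substrate", "SPBC11B10.09"), ("happens_during", "GO:0000000")],
        [],
        [("has_substrate", "SPBC11B10.09")],
    ]
    for line in display_rows(rows, labels):
        print(line)
```

The fallback to the raw identifier corresponds to the "wherever possible" caveat: an ID without a known label is shown unchanged instead of being dropped.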
Figure 2.

PomBase gene page views for example annotations: (A) GO (gene ste11 - SPBC32C12.02); (B) FYPO (gene cdc2 - SPBC11B10.09); (C) Gene expression (gene clr3 - SPBC800.03). These examples highlight the display of annotation extensions and their use within the context of different ontologies.

Phenotype annotations have also considerably increased both in number and complexity. FYPO annotations include allele details, expression level, experimental conditions, the evidence and source, and annotation extensions that represent penetrance and severity. Phenotype annotation extensions also capture specific genes used in assays for phenotypes such as protein localization or gene expression level. On each gene page, FYPO annotations are grouped by whether the phenotype is relevant at the level of a cell population or an individual cell, and display annotation extensions similarly to the GO tables. Figure 2B shows alleles of cdc2 (SPBC11B10.09) that have been annotated to ‘decreased protein binding’ (FYPO:0001645), affecting several other gene products.

Targets

A new table on the gene pages, ‘Target Of’, reports effects of other genes on the gene of interest, such as modification or regulation. ‘Target Of’ annotations are the reciprocal of GO and FYPO annotation extensions. The ‘Target Of’ display includes the relevant gene, a relationship and the annotation source. For example, cdc2 (SPBC11B10.09) is annotated as the substrate of csk1 (SPAC1D4.06c) protein kinase activity (27) and cdc25 (SPAC24H6.05) protein phosphatase activity (28).
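Because ‘Target Of’ entries are the reciprocal of annotation extensions, they can in principle be derived by inverting the extended annotations. The sketch below illustrates that idea with the cdc2 example; the record layout, the `INVERSE` label table and the function name `target_of` are assumptions, not the PomBase implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class ExtendedAnnotation(NamedTuple):
    gene: str      # gene that carries the annotation
    term: str      # ontology term name
    relation: str  # extension relation
    target: str    # gene named in the extension
    source: str    # annotation source


# Assumed mapping from an extension relation to its reciprocal label
INVERSE = {"has_substrate": "substrate of"}

TargetRow = Tuple[str, str, str, str]


def target_of(annotations: Iterable[ExtendedAnnotation]) -> Dict[str, List[TargetRow]]:
    """Group reciprocal entries by the gene that is acted upon."""
    table: Dict[str, List[TargetRow]] = defaultdict(list)
    for a in annotations:
        label = INVERSE.get(a.relation, f"target of ({a.relation})")
        table[a.target].append((label, a.gene, a.term, a.source))
    return dict(table)


if __name__ == "__main__":
    annotations = [
        ExtendedAnnotation("csk1", "protein kinase activity", "has_substrate", "cdc2", "ref 27"),
        ExtendedAnnotation("cdc25", "protein phosphatase activity", "has_substrate", "cdc2", "ref 28"),
    ]
    for row in target_of(annotations)["cdc2"]:
        print(row)
```

Deriving the table from the extensions, instead of curating it separately, means the two views cannot drift apart.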

Gene expression

Quantitative gene expression data have been imported from two large datasets covering the expression of 3175 (7) and 7016 (6) gene products. Gene expression annotations may include extensions indicating that the expression level was measured in a specific phase of the cell cycle or under specific growth conditions. Further qualitative data have also been manually imported into PomBase from the literature. When available, this information is displayed on the gene pages, providing details of the experimental conditions, evidence, scale and source. Figure 2C shows the display of gene expression data for clr3 (SPBC800.03).

Data visualization in genome browser

In the PomBase genome browser, data tracks are presented with curated metadata, links to the relevant publication via the Europe PMC portal (29) and, where appropriate, links to external source databases. Users have the option of viewing their own data privately within the context of the genome browser, or submitting data to be hosted by PomBase for public viewing.

OTHER IMPROVEMENTS

PomBase now offers a motif finder that can retrieve lists of genes that match a particular protein sequence pattern. In the PomBase advanced search, the interfaces for constructing custom queries and retrieving results have been enhanced. User-experience testing conducted after the initial PomBase release identified several opportunities for usability improvements. Accordingly, changes to the navigation and organization of the gene pages, such as collapsible intra-page menus, now make data more intuitively visible. Interfaces requiring user interaction are now also more intuitive.
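As an illustration of what a motif search does, the sketch below filters a set of protein sequences with a regular expression. It uses Python's standard `re` syntax, which may differ from the pattern syntax the PomBase motif finder accepts; the gene names and sequences are invented toy data.

```python
import re
from typing import Dict, List


def find_genes_with_motif(proteins: Dict[str, str], pattern: str) -> List[str]:
    """Return the sorted names of genes whose protein sequence matches."""
    motif = re.compile(pattern)
    return sorted(gene for gene, seq in proteins.items() if motif.search(seq))


if __name__ == "__main__":
    # Toy sequences, not real S. pombe proteins
    proteins = {
        "geneA": "MKTNGSAV",  # contains N-G-S
        "geneB": "MKTNPSAV",  # N is followed by P, so the pattern fails
        "geneC": "MAAAL",
    }
    # N, then any residue except P, then S or T
    print(find_genes_with_motif(proteins, r"N[^P][ST]"))
```

Compiling the pattern once before scanning keeps the search cheap even across a whole proteome.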

OUTREACH AND USAGE

PomBase includes documentation for all gene page sections, links to Ensembl documentation for the genome browser and a Frequently Asked Questions section. Various web forms offer convenient links for users to contact curators to ask questions or submit high-throughput datasets to be included in the genome browser or on the gene pages. PomBase curators invite all authors of new fission yeast publications to curate their own papers using Canto. PomBase also sends announcements and help to a dedicated mailing list and to various social media outlets including Twitter (@PomBase), LinkedIn (http://www.linkedin.com/company/pombase) and Google+ (+PombaseOrg).

FUTURE DIRECTIONS

Canto and the gene pages will be extended to support the curation and display, respectively, of multiple-gene phenotypes (double mutants, triple mutants, etc.). Work has also begun to create pages for non-gene sequence features, such as the centromeres, which at present can only be viewed in the genome browser.
  29 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Cdc2 activation in fission yeast depends on Mcs6 and Csk1, two partially redundant Cdk-activating kinases (CAKs).

Authors:  K M Lee; J E Saiz; W A Barton; R P Fisher
Journal:  Curr Biol       Date:  1999-04-22       Impact factor: 10.834

3.  Regulation of p34cdc2 protein kinase during mitosis.

Authors:  S Moreno; J Hayles; P Nurse
Journal:  Cell       Date:  1989-07-28       Impact factor: 41.582

4.  Schizosaccharomyces pombe ste11+ encodes a transcription factor with an HMG motif that is a critical regulator of sexual development.

Authors:  A Sugimoto; Y Iino; T Maeda; Y Watanabe; M Yamamoto
Journal:  Genes Dev       Date:  1991-11       Impact factor: 11.361

5.  The PSI-MOD community standard for representation of protein modification data.

Authors:  Luisa Montecchi-Palazzi; Ron Beavis; Pierre-Alain Binz; Robert J Chalkley; John Cottrell; David Creasy; Jim Shofstahl; Sean L Seymour; John S Garavelli
Journal:  Nat Biotechnol       Date:  2008-08       Impact factor: 54.908

6.  Absolute proteome and phosphoproteome dynamics during the cell cycle of Schizosaccharomyces pombe (Fission Yeast).

Authors:  Alejandro Carpy; Karsten Krug; Sabine Graf; André Koch; Sasa Popic; Silke Hauf; Boris Macek
Journal:  Mol Cell Proteomics       Date:  2014-04-23       Impact factor: 5.911

7.  The genome sequence of Schizosaccharomyces pombe.

Authors:  V Wood; R Gwilliam; M-A Rajandream; M Lyne; R Lyne; A Stewart; J Sgouros; N Peat; J Hayles; S Baker; D Basham; S Bowman; K Brooks; D Brown; S Brown; T Chillingworth; C Churcher; M Collins; R Connor; A Cronin; P Davis; T Feltwell; A Fraser; S Gentles; A Goble; N Hamlin; D Harris; J Hidalgo; G Hodgson; S Holroyd; T Hornsby; S Howarth; E J Huckle; S Hunt; K Jagels; K James; L Jones; M Jones; S Leather; S McDonald; J McLean; P Mooney; S Moule; K Mungall; L Murphy; D Niblett; C Odell; K Oliver; S O'Neil; D Pearson; M A Quail; E Rabbinowitsch; K Rutherford; S Rutter; D Saunders; K Seeger; S Sharp; J Skelton; M Simmonds; R Squares; S Squares; K Stevens; K Taylor; R G Taylor; A Tivey; S Walsh; T Warren; S Whitehead; J Woodward; G Volckaert; R Aert; J Robben; B Grymonprez; I Weltjens; E Vanstreels; M Rieger; M Schäfer; S Müller-Auer; C Gabel; M Fuchs; A Düsterhöft; C Fritzc; E Holzer; D Moestl; H Hilbert; K Borzym; I Langer; A Beck; H Lehrach; R Reinhardt; T M Pohl; P Eger; W Zimmermann; H Wedler; R Wambutt; B Purnelle; A Goffeau; E Cadieu; S Dréano; S Gloux; V Lelaure; S Mottier; F Galibert; S J Aves; Z Xiang; C Hunt; K Moore; S M Hurst; M Lucas; M Rochet; C Gaillardin; V A Tallada; A Garzon; G Thode; R R Daga; L Cruzado; J Jimenez; M Sánchez; F del Rey; J Benito; A Domínguez; J L Revuelta; S Moreno; J Armstrong; S L Forsburg; L Cerutti; T Lowe; W R McCombie; I Paulsen; J Potashkin; G V Shpakovski; D Ussery; B G Barrell; P Nurse; L Cerrutti
Journal:  Nature       Date:  2002-02-21       Impact factor: 49.962

8.  A Chado case study: an ontology-based modular schema for representing genome-associated biological information.

Authors:  Christopher J Mungall; David B Emmert
Journal:  Bioinformatics       Date:  2007-07-01       Impact factor: 6.937

9.  UKPMC: a full text article resource for the life sciences.

Authors:  Johanna R McEntyre; Sophia Ananiadou; Stephen Andrews; William J Black; Richard Boulderstone; Paula Buttery; David Chaplin; Sandeepreddy Chevuru; Norman Cobley; Lee-Ann Coleman; Paul Davey; Bharti Gupta; Lesley Haji-Gholam; Craig Hawkins; Alan Horne; Simon J Hubbard; Jee-Hyub Kim; Ian Lewin; Vic Lyte; Ross MacIntyre; Sami Mansoor; Linda Mason; John McNaught; Elizabeth Newbold; Chikashi Nobata; Ernest Ong; Sharmila Pillai; Dietrich Rebholz-Schuhmann; Heather Rosie; Rob Rowbotham; C J Rupp; Peter Stoehr; Philip Vaughan
Journal:  Nucleic Acids Res       Date:  2010-11-09       Impact factor: 16.971

10.  The Sequence Ontology: a tool for the unification of genome annotations.

Authors:  Karen Eilbeck; Suzanna E Lewis; Christopher J Mungall; Mark Yandell; Lincoln Stein; Richard Durbin; Michael Ashburner
Journal:  Genome Biol       Date:  2005-04-29       Impact factor: 13.583
