Margaret Staton1, Ethalinda Cannon2, Lacey-Anne Sanderson3, Jill Wegrzyn4, Tavis Anderson5, Sean Buehler1, Irene Cobo-Simón4, Kay Faaberg5, Emily Grau4, Valentin Guignon6, Jessica Gunoskey4, Blake Inderski5, Sook Jung7, Kelly Lager5, Dorrie Main7, Monica Poelchau8, Risharde Ramnath4, Peter Richter4, Joe West1, Stephen Ficklin7.
Abstract
Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here, we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles.
Keywords: FAIR; Tripal; breeding; community databases; community governance; genetics; genomics; open source software
Year: 2021 PMID: 34251419 PMCID: PMC8574961 DOI: 10.1093/bib/bbab238
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1
Timeline of events in Tripal’s software and community development. The progression of Tripal software versions can be seen along the top and the community milestones are shown along the bottom. Tripal version 1.0, the first version to provide generic support for all of Chado, was released in 2013. Version 2, which represented an upgrade from Drupal 6 to Drupal 7, was released in 2015, and version 3, with a redesigned data storage based on controlled vocabularies, was released in October 2018. The current version of Tripal, version 3.3, was released in April 2020, and Tripal version 4 (an upgrade to support Drupal 8 and 9) is under development and scheduled for alpha release in 2021. Abbreviations: PAG (Plant and Animal Genome Conference).
Data portals partially or wholly utilizing the Tripal software platform. This table highlights the diversity in data types, described organisms and location of the developer community found among existing databases utilizing Tripal as of October 2020.
| Data Portal | Kingdom | Survey Response (Y/N) | Reported Data Types | Primary Location |
|---|---|---|---|---|
| Banana Genome Hub | Plantae | Y | Gene families, Genomes, Genotypes, Ontologies, Phylogenetics, Synteny, Transcriptomes | Montpellier, France |
| Cacao Genome Database | Plantae | N | Genetic maps, Genomes, Genotypes, Ontologies, Pathways, Transcriptomes | Ames, IA, USA |
| CGD | Plantae | Y | Contacts, Expression, Genetic maps, Genomes, Genotypes, Germplasm, Pedigrees, Phenotypes, Synteny, Transcriptomes, QTLs | Pullman, WA, USA |
| CorkOakDB | Plantae | N | Expression, Genomes, Publications, Transcriptomes | Oeiras, Portugal |
| CottonGen | Plantae | Y | Contacts, Genetic maps, Genomes, Genotypes, Germplasm, Images, Ontologies, Pedigrees, Phenotypes, Publications, Synteny, Transcriptomes, QTLs | Pullman, WA, USA |
| Cucurbit Genomics | Plantae | N | Genomes, Genotypes, Ontologies, Pathways, Synteny, Transcriptomes | Ithaca, NY, USA |
| GeneNet Engine | Plantae | N | Gene co-expression networks, Ontologies | Pullman, WA, USA |
| GDR | Plantae | Y | Contacts, Environmental data, Expression, Gene families, Genetic maps, Genomes, Genotypes, Germplasm, Haplotypes, Images, Ontologies, Pedigrees, Phenotypes, Publications, Synteny, Transcriptomes, QTLs | Pullman, WA, USA |
| GDV | Plantae | Y | Contacts, Expression, Genetic maps, Genomes, Genotypes, Germplasm, Pedigrees, Phenotypes, Publications, Transcriptomes, QTLs | Pullman, WA, USA |
| GrainGenomes (development version) | Plantae | Y | Genetic maps, Genomes, Genotypes, Germplasm, Images, Pedigrees, Phenotypes, Publications, QTLs | Ithaca, NY, USA |
| Grass Genome Hub | Plantae | Y | Gene families, Genomes, Synteny, Transcriptomes, QTLs | Montpellier, France |
| Hardwood Genomics Project | Plantae | Y | Expression, Gene families, Genomes, Ontologies, Transcriptomes | Knoxville, TN, USA |
| I5K Workspace | Animalia | Y | Genomes | Beltsville, MD, USA |
| Kiwifruit Genome Database | Plantae | N | Genomes, Ontologies, Synteny, Transcriptomes | Hefei, China |
| KnowPulse | Plantae | Y | Genetic maps, Genomes, Genotypes, Germplasm, Ontologies, Pedigrees, Phenotypes, Publications, QTLs | Saskatoon, SK, Canada |
| Legume Information System | Plantae | Y | Expression, Gene families, Genetic maps, Genomes, Genotypes, Germplasm, Ontologies, Pan-genomes, Phenotypes, Phylogenetics, Publications, Synteny, Transcriptomes, QTLs | Ames, IA, USA |
| LiceBase | Animalia | N | Contacts, Genomes, Genotypes, Images, Phenotypes, Publications, Stocks | Bergen, Norway |
| Mimubase | Plantae | Y | Contacts, Genomes, Genotypes, Ontologies, Publications | Storrs, CT, USA |
| Musa Germplasm Information System | Plantae | Y | Genotypes, Germplasm, Phenotypes, Stocks | Montpellier, France |
| NanDeSyn Database | Plantae | N | Expression, Gene families, Genomes, Genotypes, Ontologies, Phenotypes, Publications, Synteny | Beijing, China |
| PeanutBase | Plantae | Y | Expression, Gene families, Genetic maps, Genomes, Genotypes, Germplasm, Images, Ontologies, Pan-genomes, Phenotypes, Phylogenetics, Publications, Stocks, Synteny, Transcriptomes, QTLs | Ames, IA, USA |
| Planarian Educational Resource | Animalia | Y | Expression, Genes, Ontologies, Transcriptomes | Kansas City, MO, USA |
| Planosphere | Animalia | Y | Expression, Genomes, Images, Ontologies, Transcriptomes | Kansas City, MO, USA |
| PCD | Plantae | Y | Contacts, Genetic maps, Genomes, Genotypes, Germplasm, Pedigrees, Phenotypes, Publications, Synteny, QTLs | Pullman, WA, USA |
| Rice Genome Hub | Plantae | Y | Genomes, Ontologies, Publications, Synteny, Transcriptomes | Montpellier, France |
| RNAStructurome | Human | N | RNA structures | Ames, IA, USA |
| SeriolaDB | Animalia | N | Genomes, Ontologies, QTLs | Ames, IA, USA |
| SIMRBase | Animalia | Y | Genomes, Transcriptomes | Kansas City, MO, USA |
| SpinachBase | Plantae | N | Genomes, Ontologies, Pathways, Transcriptomes | Ithaca, NY, USA |
| TreeGenes | Plantae | Y | Contacts, Environmental data, Gene families, Genetic maps, Genomes, Genotypes, Ontologies, Phenotypes, Publications, Stocks, Transcriptomes, QTLs | Storrs, CT, USA |
| US-SPD | Animalia | Y | Genomes | Ames, IA, USA |
| Zeamap | Plantae | N | Expression, Genetic maps, Genomes, Genotypes, Phenotypes, Synteny, Transcriptomes, QTLs | Wuhan, China |
Figure 2
Tripal software architecture (middle panel) depends on Drupal, a popular content management system (CMS) for building websites. Tripal Core communicates with the standard Drupal database as well as the Chado database or another user-installed database. Extension modules build on and extend the Core module to support new functions, such as importing or displaying new data types, providing analysis or search services to users, or interacting with outside software packages such as JBrowse. Tripal Core is directly governed (left panel) by a Project Management Committee (PMC), which receives input and recommendations from the Tripal Advisory Committee (TAC). The Tripal community works together to develop Tripal and contribute extension modules (right panel), which are given bronze, silver or gold badges indicating compliance with a list of best practices and standards.
Figure 3
Tripal Governance Committee responsibilities and their interaction with the Tripal Community. The structure of Tripal Governance is shown in detail, with interaction occurring between the Tripal Community, the TAC and the Tripal PMC. Specifically, the TAC surveys the needs of the Tripal Community to develop recommendations for the Tripal PMC. The PMC provides feedback on recommendations to the TAC as both work together to guide the future direction of Tripal. Additionally, the Tripal PMC interacts directly with the Tripal Community with respect to bug reports, feature requests and other code-focused concerns. The inclusivity of the Tripal Community is highlighted, as it consists of all people who interact with Tripal either directly or indirectly.
Figure 4
An overview of requirements for Tripal Extension Module Badges. These badges give extension module developers guidance on best practices for developing sharable extension modules, and give Tripal administrators quality information about the extension modules they are considering. The Tripal PMC awards badges to modules that meet the requirements and adds the badge to the official extension module listing on the tripal.info website.
Figure 5
Gigwa interface on Musa Germplasm Information System offers users an interface to select variants by genomic location, gene and variant effect (A) and to filter individuals from a population based on various criteria (B). Results for each variant display the functional annotations due to the variant as well as the genotype calls and other metadata for individual accessions (C).
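The kind of selection described in panel (A) can be sketched in a few lines of Python. This is only an illustration of filtering by genomic location, gene and variant effect; it is not Gigwa's API or data model, and the `Variant` record, the `select_variants` function and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not Gigwa's schema)."""
    chrom: str
    pos: int
    gene: Optional[str]
    effect: str


def select_variants(
    variants: Iterable[Variant],
    chrom: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
    gene: Optional[str] = None,
    effects: Optional[Set[str]] = None,
) -> List[Variant]:
    """Keep variants that satisfy every criterion that was supplied."""
    selected = []
    for v in variants:
        if chrom is not None and v.chrom != chrom:
            continue
        if start is not None and v.pos < start:
            continue
        if end is not None and v.pos > end:
            continue
        if gene is not None and v.gene != gene:
            continue
        if effects is not None and v.effect not in effects:
            continue
        selected.append(v)
    return selected


# Made-up example data, for illustration only.
VARIANTS = [
    Variant("chr01", 1200, "geneA", "missense_variant"),
    Variant("chr01", 5400, "geneA", "synonymous_variant"),
    Variant("chr01", 9800, "geneB", "missense_variant"),
    Variant("chr02", 1500, None, "intergenic_variant"),
]

hits = select_variants(
    VARIANTS, chrom="chr01", start=1000, end=6000,
    effects={"missense_variant"},
)
```

Criteria left as `None` are ignored, which mirrors an interface where each filter is optional and the filters combine with a logical AND.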
Figure 6
Tripal germplasm accession page designed using Display Suite integration. The KnowPulse CDC Greenstar AGL germplasm accession page (left) was designed using the Drupal Display Suite drag and drop administrative interface (right). Diverse data types provided by multiple Tripal extension modules were brought together into a cohesive display through data type focused categories. Each discrete piece of content is represented as a row in the administrative interface and can be rearranged and categorized easily by the administrator. The highlighted boxes indicate the configuration on the right with the generated display on the left. The configuration can be a single row as shown with the pedigree or multiple rows combined with a custom formatter as shown with the phenotypic summary depending on the level of configuration supported by the data type.
Figure 7
The Tripal Phylotree.js interface is capable of displaying large circular phylogenetic or taxonomic trees. Users can click to highlight paths and hover to see the distance measure between two leaves.
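One common distance measure between two leaves is the sum of branch lengths along the path that connects them through their most recent common ancestor. The sketch below illustrates that calculation in Python; it assumes this definition of distance and is not taken from the Tripal Phylotree.js code. The `Node` class, the `leaf_distance` function and the example tree are hypothetical.

```python
class Node:
    """Hypothetical tree node; branch_length is the edge to the parent."""

    def __init__(self, name, branch_length=0.0):
        self.name = name
        self.branch_length = branch_length
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child


def leaf_distance(a, b):
    """Sum of branch lengths on the path between nodes a and b."""
    # Record the cumulative distance from `a` to each of its ancestors.
    dist_from_a = {}
    node, total = a, 0.0
    while node is not None:
        dist_from_a[node] = total
        total += node.branch_length
        node = node.parent
    # Climb from `b` until reaching the first ancestor shared with `a`.
    node, total = b, 0.0
    while node is not None:
        if node in dist_from_a:
            return dist_from_a[node] + total
        total += node.branch_length
        node = node.parent
    raise ValueError("nodes do not belong to the same tree")


# Made-up example tree: ((A:1.0, B:2.0):0.5, C:3.0)
root = Node("root")
inner = root.add_child(Node("inner", 0.5))
leaf_a = inner.add_child(Node("A", 1.0))
leaf_b = inner.add_child(Node("B", 2.0))
leaf_c = root.add_child(Node("C", 3.0))

print(leaf_distance(leaf_a, leaf_b))  # 3.0
print(leaf_distance(leaf_a, leaf_c))  # 4.5
```

Walking up from each leaf costs time proportional to the tree depth, which stays practical for an on-hover lookup even in large trees.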
Figure 8
Anonymous survey results identifying architectural priorities for the community.
Figure 9
Anonymous survey results identifying common challenges to Tripal development.