Huaiyu Mi, Betty Lazareva-Ulitsky, Rozina Loo, Anish Kejariwal, Jody Vandergriff, Steven Rabkin, Nan Guo, Anushya Muruganujan, Olivier Doremieux, Michael J Campbell, Hiroaki Kitano, Paul D Thomas.
Abstract
PANTHER is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise. These subfamilies model the divergence of specific functions within protein families, allowing more accurate association with function (ontology terms and pathways), as well as inference of amino acids important for functional specificity. Hidden Markov models (HMMs) are built for each family and subfamily for classifying additional protein sequences. The latest version, 5.0, contains 6683 protein families, divided into 31,705 subfamilies, covering approximately 90% of mammalian protein-coding genes. PANTHER 5.0 includes a number of significant improvements over previous versions, most notably (i) representation of pathways (primarily signaling pathways) and association with subfamilies and individual protein sequences; (ii) an improved methodology for defining the PANTHER families and subfamilies, and for building the HMMs; (iii) resources for scoring sequences against PANTHER HMMs both over the web and locally; and (iv) a number of new web resources to facilitate analysis of large gene lists, including data generated from high-throughput expression experiments. Efforts are underway to add PANTHER to the InterPro suite of databases, and to make PANTHER consistent with the PIRSF database. PANTHER is now publicly available without restriction at http://panther.appliedbiosystems.com.
Year: 2005 PMID: 15608197 PMCID: PMC540032 DOI: 10.1093/nar/gki078
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Number of genes from each organism classified using PANTHER HMMs
| Genome | No. of genes | No. of genes with PANTHER HMM hit | No. of genes with MF association | No. of genes with BP association |
|---|---|---|---|---|
| LocusLink human | 16 232 | 14 533 (89.5%) | 10 453 (64.4%) | 10 410 (64.1%) |
| LocusLink mouse | 15 020 | 13 147 (87.5%) | 10 012 (66.7%) | 9933 (66.1%) |
| LocusLink rat | 4516 | 4391 (97.2%) | 3967 (87.8%) | 3969 (87.9%) |
| FlyBase | 13 654 | 9325 (68.3%) | 6253 (45.8%) | 5719 (41.9%) |
These classifications can be searched on the PANTHER website. For LocusLink, only genes associated with at least one reviewed RefSeq (accession no. beginning with ‘NP’) were considered. Genes encoding proteins that hit a PANTHER HMM can be classified to a family or subfamily, and most but not all of these are associated with meaningful molecular function (MF) or biological process (BP) classifications.
Figure 1. CellDesigner (6) diagram of the insulin/IGF receptor signaling pathway. Proteins (blue and brown boxes) are mapped onto PANTHER HMMs. Active forms (dashed-line boxes) and phosphorylated forms (small circles around the letter ‘P’) of proteins are clearly indicated in the diagram. Over 60 pathways (mostly signaling pathways) are currently available.
Figure 2. Statistical analysis of gene expression experiment results for a liver cancer versus normal cell line. Users can upload a list of genes/transcripts, along with an associated value (e.g. fold change, but can be any continuous variable). The list is divided automatically into groups sharing the same function (molecular function, biological process or pathway), and the distribution of values for each group is compared statistically with the overall distribution using the Mann–Whitney U-test to look for coordinated changes across each group (10).
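The group-wise comparison described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the PANTHER implementation. It assumes each functional group is compared against all remaining genes in the uploaded list (the caption says only "the overall distribution"), and it uses the normal approximation to the Mann–Whitney U statistic without tie or continuity corrections. The function names, gene identifiers, fold-change values and category labels are all hypothetical.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group, reference):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Assumption: no tie or continuity correction is applied, so p-values
    are only rough for small or heavily tied samples.
    """
    n1, n2 = len(group), len(reference)
    ranks = average_ranks(list(group) + list(reference))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, z, p


def score_categories(values, categories):
    """Test each category's values against the rest of the uploaded list.

    values: gene ID -> continuous value (e.g. fold change)
    categories: gene ID -> list of category names (hypothetical labels)
    """
    members = defaultdict(set)
    for gene, cats in categories.items():
        if gene in values:
            for cat in cats:
                members[cat].add(gene)
    results = {}
    for cat, genes in members.items():
        group = [values[g] for g in genes]
        rest = [values[g] for g in values if g not in genes]
        if group and rest:
            results[cat] = mann_whitney_u(group, rest)
    return results


# Hypothetical input: six genes with fold changes and one category each.
fold_change = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0, "g5": 5.0, "g6": 6.0}
annotation = {"g1": ["kinase"], "g2": ["kinase"], "g3": ["kinase"],
              "g4": ["receptor"], "g5": ["receptor"], "g6": ["receptor"]}

for category, (u, z, p) in sorted(score_categories(fold_change, annotation).items()):
    print(f"{category}: U={u:.1f} z={z:.2f} p={p:.3f}")
```

Because every category is tested separately, a real analysis would normally also apply a multiple-testing correction. That step is left out here.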
Figure 3. Pie chart of molecular functions represented in a list of genes. Users can upload protein IDs to the PANTHER website, and pie charts can be drawn from any list.