Nicola J Mulder, Rolf Apweiler, Teresa K Attwood, Amos Bairoch, Alex Bateman, David Binns, Peer Bork, Virginie Buillard, Lorenzo Cerutti, Richard Copley, Emmanuel Courcelle, Ujjwal Das, Louise Daugherty, Mark Dibley, Robert Finn, Wolfgang Fleischmann, Julian Gough, Daniel Haft, Nicolas Hulo, Sarah Hunter, Daniel Kahn, Alexander Kanapin, Anish Kejariwal, Alberto Labarga, Petra S Langendijk-Genevaux, David Lonsdale, Rodrigo Lopez, Ivica Letunic, Martin Madera, John Maslen, Craig McAnulla, Jennifer McDowall, Jaina Mistry, Alex Mitchell, Anastasia N Nikolskaya, Sandra Orchard, Christine Orengo, Robert Petryszak, Jeremy D Selengut, Christian J A Sigrist, Paul D Thomas, Franck Valentin, Derek Wilson, Cathy H Wu, Corin Yeats.
Abstract
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on the website and in the match XML file on the FTP site. Additional matches to proteins in UniParc (UniProt archive) are now available for download in the new match XML files only. The latest InterPro release (13.0) contains more than 13 000 entries, covering over 78% of all proteins in UniProtKB. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). The InterProScan search tool is now also available via a web service at http://www.ebi.ac.uk/Tools/webservices/WSInterProScan.html.
Year: 2007 PMID: 17202162 PMCID: PMC1899100 DOI: 10.1093/nar/gkl841
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
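The abstract refers to match XML files that list signature matches per protein. As a rough illustration of how such a file might be read, the sketch below parses a tiny in-memory sample. The element and attribute names (`protein`, `match`, `ipr`, `lcn`, `dbname`, `start`, `end`) are assumptions about the layout rather than something stated in this record, and the accessions in the sample are invented, so check the actual file from the FTP site before relying on any of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element/attribute names are assumed and the
# accessions are made up purely for illustration.
SAMPLE = """\
<interpromatch>
  <protein id="P00001" length="120">
    <match id="PF00001" dbname="PFAM">
      <ipr id="IPR000001"/>
      <lcn start="5" end="60"/>
    </match>
    <match id="PTHR00001" dbname="PANTHER">
      <lcn start="1" end="118"/>
    </match>
  </protein>
</interpromatch>
"""


def matches_by_protein(xml_text):
    """Return {protein_id: [(dbname, signature_id, interpro_id, start, end), ...]}.

    interpro_id is None when a match has no <ipr> child, i.e. the
    signature is not integrated into an InterPro entry.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for protein in root.iter("protein"):
        rows = []
        for match in protein.iter("match"):
            ipr = match.find("ipr")
            ipr_id = ipr.get("id") if ipr is not None else None
            # One row per matched location on the sequence.
            for lcn in match.iter("lcn"):
                rows.append((
                    match.get("dbname"),
                    match.get("id"),
                    ipr_id,
                    int(lcn.get("start")),
                    int(lcn.get("end")),
                ))
        result[protein.get("id")] = rows
    return result


if __name__ == "__main__":
    for protein_id, rows in matches_by_protein(SAMPLE).items():
        for row in rows:
            print(protein_id, *row)
```

Loading the whole document with `ET.fromstring` is only reasonable for small samples; for a full release file, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be the safer choice.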
Coverage of protein sequences and amino acid residues for each member database
| Member database | Number of methods in InterPro (a) | Total number of proteins hit by database | Total number of residues covered | Number of unique proteins hit by database (b) |
|---|---|---|---|---|
| Gene3D | 1465 | 1 736 593 | 395 970 746 | 18 504 |
| PANTHER | 39 648 | 582 799 | 173 969 368 | 6355 |
| PIRSF | 1347 | 161 248 | 58 525 186 | 851 |
| PRINTS | 1900 | 645 272 | 55 137 257 | 3936 |
| PROSITE patterns | 1336 | 766 422 | 16 861 589 | 14 229 |
| PROSITE profiles | 632 | 763 334 | 153 498 831 | 2131 |
| Pfam | 8296 | 2 502 476 | 570 591 566 | 281 062 |
| ProDom | 3538 | 506 284 | 61 153 722 | 19 926 |
| SMART | 706 | 514 466 | 94 310 609 | 2252 |
| SUPERFAMILY | 1122 | 1 929 112 | 484 789 136 | 51 282 |
| TIGRFAMs | 2625 | 501 897 | 170 121 752 | 7306 |
(a) Not all the methods are integrated into InterPro entries (e.g. those of PANTHER), but InterPro provides matches to them in the match XML file.
(b) This is the number of proteins hit by that database and by no other member database.
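The coverage figures above can be combined into a few derived ratios. The sketch below computes, for each member database, the average number of residues covered per protein hit and the share of its hit proteins that no other database hits. The numbers are copied from the table; the two derived measures and their names are an illustrative choice, not metrics reported by the authors.

```python
# (methods, proteins_hit, residues_covered, unique_proteins), from the table above
COVERAGE = {
    "Gene3D": (1465, 1_736_593, 395_970_746, 18_504),
    "PANTHER": (39_648, 582_799, 173_969_368, 6355),
    "PIRSF": (1347, 161_248, 58_525_186, 851),
    "PRINTS": (1900, 645_272, 55_137_257, 3936),
    "PROSITE patterns": (1336, 766_422, 16_861_589, 14_229),
    "PROSITE profiles": (632, 763_334, 153_498_831, 2131),
    "Pfam": (8296, 2_502_476, 570_591_566, 281_062),
    "ProDom": (3538, 506_284, 61_153_722, 19_926),
    "SMART": (706, 514_466, 94_310_609, 2252),
    "SUPERFAMILY": (1122, 1_929_112, 484_789_136, 51_282),
    "TIGRFAMs": (2625, 501_897, 170_121_752, 7306),
}


def summarize(coverage):
    """Return per-database ratios, sorted by residues covered per protein hit."""
    rows = []
    for name, (_methods, proteins, residues, unique) in coverage.items():
        rows.append({
            "database": name,
            # Average covered residues per hit protein (illustrative measure).
            "residues_per_protein": residues / proteins,
            # Share of hit proteins that only this database hits.
            "unique_fraction": unique / proteins,
        })
    return sorted(rows, key=lambda r: r["residues_per_protein"], reverse=True)


if __name__ == "__main__":
    for row in summarize(COVERAGE):
        print(f'{row["database"]:18s} '
              f'{row["residues_per_protein"]:8.1f} '
              f'{row["unique_fraction"]:7.2%}')
```

Read this way, databases that model whole proteins, such as PIRSF, cover far more residues per hit protein than short-motif methods such as PROSITE patterns, which is what one would expect from the kinds of signatures involved.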
Number of InterPro entries with cross-references to each database that InterPro links to
| Database | Number of InterPro entries with links |
|---|---|
| UniProtKB | 13 131 |
| BLOCKS | 6134 |
| CAZy | 119 |
| COMe | 204 |
| IntEnz | 2336 |
| IUPHAR receptor | 113 |
| MEROPS | 548 |
| PANDIT | 7702 |
| PROSITE doc | 1479 |
| Pfam Clans | 1544 |
| CluSTr | 6818 |
| IntAct | 135 |
| GO | 7131 |
| MSDsite | 1313 |
| PDB | 68 021 |
| SCOP | 6537 |
| CATH | 6212 |