Olivier Arnaiz, Jean Cohen, Anne-Marie Tassin, France Koll.
Abstract
BACKGROUND: New-generation technologies in cell and molecular biology generate large amounts of data that are hard to exploit for individual proteins. This is particularly true for ciliary and centrosomal research. Cildb is a multi-species knowledgebase gathering high throughput studies, which allows advanced searches to identify proteins involved in centrosome, basal body or cilia biogenesis, composition and function. Combined with the localization of genetic diseases on human chromosomes given by OMIM links, Cildb searches can be used to compile candidate ciliopathy proteins.
Year: 2014 PMID: 25422781 PMCID: PMC4242763 DOI: 10.1186/2046-2530-3-9
Source DB: PubMed Journal: Cilia ISSN: 2046-2530
Figure 1. The species whose whole proteome is included in Cildb V3.0 are grouped by taxonomy, with an indication of whether they are centric or not and of the number of high throughput studies, ciliary or not, performed in each species. The species included in Cildb were chosen as 1) species in which high throughput ciliary studies have been performed, 2) species routinely used as models in ciliary studies in general, and 3) centric and acentric species, because the presence or absence of certain proteins may be relevant for the conservation of ciliary proteins through evolution. The case of the Bug22/GTL3/C16orf80 protein, which is composed of a domain called DUF667 and is essential for ciliary motility [6], was carefully examined when choosing the fungi to add to Cildb for comparative genomics. Bug22 is highly conserved in all centric species, be they metazoans, protozoa, plants or fungi, and curiously also highly conserved in the acentric land plants, but it was absent from the genomes of the higher fungi already sequenced at the time of publication, i.e. acentric ascomycetes [6]. As novel fungal whole proteomes became available through ongoing genome sequencing, the occurrence of Bug22 turned out to differ from what was thought earlier. It is still undetectable in ascomycetes, but it is conserved in the acentric Mortierella verticillata (accession MVEG_01915), and a more divergent Bug22 with a recognizable DUF667 domain is found in several basidiomycetes, represented in Cildb by Laccaria bicolor (accession 598201). This property was one of the reasons to include these two fungal proteomes in Cildb V3.0. It also emphasizes that the constant arrival of new knowledge as new genomes are sequenced can call into question former assumptions, such as the absence of particular proteins in some species, here Bug22 in fungi.
High throughput studies compiled in Cildb V3.0
| Study | Type of study | Ciliary | |
|---|---|---|---|
| Andersen et al., 2003 [ | Centriole proteome | yes | |
| Arnaiz et al., 2009 [ | Cilium proteome | yes | |
| Arnaiz et al., 2010 [ | Expression during ciliogenesis | yes | |
| Avidor-Reiss et al., 2004 [ | Comparative genomics | yes | |
| Baker et al., 2008a [ | Spermatozoa proteome | no | |
| Baker et al., 2008b [ | Spermatozoa proteome | no | |
| Bechstedt et al., 2010 [ | Expression in tissues containing sensory cilia | yes | |
| Blacque et al., 2005 [ | Differential expression between ciliated and non ciliated cells | yes | |
| Blacque et al., 2005 [ | Genomic screening for X-boxes in promoters | yes | |
| Boesger et al., 2009 [ | Flagellum phosphoproteome | yes | |
| Broadhead et al., 2006 [ | Flagellum proteome | yes | |
| Cachero et al., 2011 [ | Expression in early development of future neural cells | no | |
| Cao et al., 2006 [ | Sperm flagellar axonemes proteome | yes | |
| Chen et al., 2006 [ | Expression in daf-19 mutant | yes | |
| Datta et al., 2011 [ | Gene expression with HIPPI expression modulation | no | |
| Dorus et al., 2006 [ | Spermatozoa proteome | no | |
| Efimenko et al., 2005 [ | Genomic screening for X-boxes in promoters | yes | |
| Fritz-Laylin and Cande, 2010 [ | Flagellum proteome | yes | |
| Geremek et al., 2011 [ | Expression in primary ciliary dyskinesia patients | yes | |
| Geremek et al., 2014 [ | Expression in primary ciliary dyskinesia patients | yes | |
| Guo et al., 2010 [ | Proteomics associated with spermiogenesis | no | |
| Hodges et al., 2011 [ | Comparative genomics | yes | |
| Hoh et al., 2012 [ | Expression in multiciliated cells from trachea | yes | |
| Huang et al., 2008 [ | Proteomics associated with spermiogenesis | no | |
| Hughes et al., 2008 [ | Proteome of Microtubule-Associated Proteins | no | |
| Ishikawa et al., 2012 [ | Primary cilium proteome | yes | |
| Ivliev et al., 2012 [ | Expression profile in different tissues | yes | |
| Jakobsen et al., 2011 [ | Centrosome proteomics | yes | |
| Keller et al., 2005 [ | Expression during ciliogenesis | yes | |
| Keller et al., 2005 [ | Basal body proteome | yes | |
| Kilburn et al., 2007 [ | Basal body proteome | yes | |
| Kim et al., 2010 [ | Ciliogenesis modulation | yes | |
| Kubo et al., 2008 [ | Expression in ciliated tissues | yes | |
| Laurençon et al., 2007 [ | Genomic screening for X-boxes in promoters | yes | |
| Lauwaet et al., 2011 [ | Homology search for basal body proteins | yes | |
| Lauwaet et al., 2011 [ | Basal body proteome | yes | |
| Li et al., 2004 [ | Comparative genomics | yes | |
| Liu et al., 2007 [ | Cilium proteome | yes | |
| Martínez-Heredia et al., 2006 [ | Spermatozoa proteome | no | |
| Mayer et al., 2008 [ | Cilium proteome | yes | |
| Mayer et al., 2009 [ | Cilium proteome | yes | |
| McClintock et al., 2008 [ | Expression in ciliated tissues | yes | |
| Merchant et al., 2007 [ | Comparative genomics | yes | |
| Müller et al., 2010 [ | Centrosome proteome | yes | |
| Nakachi et al., 2011 [ | Sperm tail proteome | yes | |
| Nogales-Cadenas et al., 2009 [ | Centrosome human curation | yes | |
| Ostrowski et al., 2002 [ | Cilium proteome | yes | |
| Pazour et al., 2005 [ | Expression during ciliogenesis | yes | |
| Pazour et al., 2005 [ | Flagellum proteome | yes | |
| Phirke et al., 2011 [ | Down and upregulated genes in daf-19 mutant | yes | |
| Reinders et al., 2006 [ | Nuclear-associated body proteome | no | |
| Ross et al., 2007 [ | Expression during ciliogenesis | yes | |
| Sakamoto et al., 2008 [ | Proteome of Microtubule-Associated Proteins | no | |
| Sauer et al., 2005 [ | Mitotic spindle proteome | no | |
| Smith et al., 2005 [ | Cilium proteome | yes | |
| Stolc et al., 2005 [ | Expression during ciliogenesis | yes | |
| Stubbs et al., 2008 [ | Expression Under FoxJ1 silencing | yes | |
| Wigge et al., 1998 [ | Spindle pole body proteome | no | |
| Yano et al., 2013 [ | Ciliary membrane proteome | yes |
The high throughput studies present in Cildb V3.0 are summarized in the table. For each study, the table indicates whether it is a proteomic, gene expression, or genomic study, the species in which it was performed, and whether it is ciliary (i.e. concerns cilia, flagella, basal bodies, centrioles, centrosomes or spindle pole bodies) or not. The table is ordered alphabetically by the first author of each publication.
Figure 2. An advanced search on Cildb V3.0 is started by clicking on the ‘Search’ button at the right of the top row. The species whose proteome is to be searched must then be chosen. The filter window then appears, in which the filters are adjusted in the left panel (no filter means that the full proteome will be retrieved). Similarly, the output window allows particular properties (attributes) to be displayed in columns for each filtered protein. A summary on the right reminds the user of all the filters and attributes currently used. It also allows the order of the columns in the output to be modified directly by moving the attributes up and down in the list. The last step is to show the results. The results are displayed in pages of 20 items, with a maximum of 1000 items. To see all results, they have to be downloaded as a file. At any time, if the output seems incomplete or inappropriate, the filters and attributes can be modified by using the ‘Back’ button (edit results) to refine the search and show the results again. The quick search allows a rapid search by keywords. Its result can be processed in the same way as described above, with the possibility to add attributes through ‘Edit results’ and to download the file. Note the direct access to BLAST, the Human genome Gbrowse, Motif search, Help and older versions of Cildb through the buttons on the right of the top row.
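The paging rule stated in the caption (pages of 20 items, at most 1000 items shown) can be illustrated with a small sketch. The function names are hypothetical and this is not Cildb code; it assumes that the 1000-item limit applies to the total number of results displayed on screen, with the remainder available only through the downloaded file.

```python
import math

PAGE_SIZE = 20        # items per result page, as stated in the caption
MAX_DISPLAYED = 1000  # assumed cap on the total number of items shown on screen


def pages_shown(n_hits):
    """Number of result pages displayed for a search returning n_hits items."""
    displayed = min(n_hits, MAX_DISPLAYED)
    return math.ceil(displayed / PAGE_SIZE)


def needs_download(n_hits):
    """True when some results can only be seen in the downloaded file."""
    return n_hits > MAX_DISPLAYED


if __name__ == "__main__":
    for n in (0, 133, 1000, 5000):
        print(n, pages_shown(n), needs_download(n))
```

Under these assumptions, a search returning 133 proteins fits on 7 pages, whereas a search returning 5000 proteins shows 50 pages and requires a download to see the rest.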
Evolutionary conservation of centrosomal proteins viewed through Cildb V3.0
| # | Protein ID | Gene names | Mouse | Rat | Fish | Bee | Fly | Class |
|---|---|---|---|---|---|---|---|---|
| 1 | ENSP00000380378 | PAFAH1B1,LIS1,LIS2,MDCR | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 2 | ENSP00000364691 | CROCC,ROLT,ROLT,rootletin | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 3 | ENSP00000309591 | PRKACA,PKACA,PKACa | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 4 | ENSP00000263710 | CLASP1,MAST1 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 5 | ENSP00000263811 | DYNC1I2,DNCI2,IC2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 6 | ENSP00000216911 | AURKA,AIK,ARK1,AURA,AURORA2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 7 | ENSP00000364721 | MAPRE1,EB1,EB1 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 8 | ENSP00000265563 | PRKAR2A,PKR2,PRKAR2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 9 | ENSP00000355966 | NEK2,HsPK21,NEK2A,NLK1 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 10 | ENSP00000261965 | TUBGCP3,GCP3,SPBC98 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 11 | ENSP00000252936 | TUBGCP2,GCP2,Grip103,h103p | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 12 | ENSP00000251413 | TUBG1,CDCBM4,GCP-1 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 13 | ENSP00000456648 | TUBGCP4,76P,GCP-4,GCP4 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 14 | ENSP00000323302 | POC1B,PIX1,TUWD12,WDR51B | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 15 | ENSP00000324464 | CSNK1D,ASPS,CKIdelta,FASPS2,HCKID | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 16 | ENSP00000270861 | PLK4,SAK,STK18,Sak | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 17 | ENSP00000356785 | NME7,MN23H7,NDK7 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 18 | ENSP00000273130 | DYNC1LI1,DNCLI1,LIC1 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 19 | ENSP00000359300 | CETN2,CALT,CEN2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 20 | ENSP00000287380 | TBC1D31,Gm85,WDR67 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 21 | ENSP00000287482 | SASS6,SAS-6,SAS6 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 22 | ENSP00000300093 | PLK1,PLK,STPK13 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 23 | ENSP00000257287 | CEP135,CEP4,MCPH8 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 24 | ENSP00000439376 | DCTN2,DCTN50,DYNAMITIN,RBP50 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 25 | ENSP00000395302 | CKAP5,ch-TOG,CHTOG,MSPS | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 26 | ENSP00000342510 | CEP97,LRRIQ2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 27 | ENSP00000348965 | DYNC1H1,DHC1,DHC1a | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 28 | ENSP00000469720 | CETN2,CALT,CEN2 | yes | yes | yes | yes | yes | 1 (yyyyy) |
| 29 | ENSP00000317156 | CEP192,PPP1R62 | yes | yes | yes | yes | no | 2 (yyyyn) |
| 30 | ENSP00000270708 | WRAP73,WDR8 | yes | yes | yes | yes | no | 2 (yyyyn) |
| 31 | ENSP00000248846 | TUBGCP6,GCP-6,GCP6,MCCRP,MCPHCR | yes | yes | yes | yes | no | 2 (yyyyn) |
| 32 | ENSP00000393583 | AZI1,AZ1,Cep131,ZA1 | yes | yes | yes | yes | **no** | 2 (yyyyn) |
| 33 | ENSP00000283645 | TUBGCP5,GCP5 | yes | yes | yes | yes | no | 2 (yyyyn) |
| 34 | ENSP00000303058 | CEP120,CCDC100 | yes | yes | yes | yes | no | 2 (yyyyn) |
| 35 | ENSP00000313752 | SSNA1,N14,NA-14 | yes | yes | yes | yes | no | 2 (yyyyn) |
| 36 | ENSP00000355812 | FGFR1OP,FOP | yes | yes | yes | yes | no | 2 (yyyyn) |
| 37 | ENSP00000343818 | CDK5RAP2,C48,Cep215,MCPH3 | yes | yes | yes | **no** | yes | 3 (yyyny) |
| 38 | ENSP00000344314 | OFD1,CXorf5,JBTS10,RP23 | yes | yes | yes | no | no | 4 (yyynn) |
| 39 | ENSP00000317144 | PIBF1,C13orf24,CEP90 | yes | yes | yes | no | no | 4 (yyynn) |
| 40 | ENSP00000204726 | GOLGA3,GCP170,MEA-2,golgin-160 | yes | yes | yes | no | no | 4 (yyynn) |
| 41 | ENSP00000206474 | HAUS4,C14orf94 | yes | yes | yes | no | no | 4 (yyynn) |
| 42 | ENSP00000281129 | CEP128,C14orf145,C14orf61,LEDP/132 | yes | yes | yes | no | no | 4 (yyynn) |
| 43 | ENSP00000262127 | CEP76,C18orf9,HsT1705 | yes | yes | yes | no | no | 4 (yyynn) |
| 44 | ENSP00000370803 | CCP110,Cep110,CP110 | yes | yes | yes | no | no | 4 (yyynn) |
| 45 | ENSP00000263284 | CCDC61 | yes | yes | yes | no | no | 4 (yyynn) |
| 46 | ENSP00000223208 | CEP41,JBTS15,TSGA14 | yes | yes | yes | no | no | 4 (yyynn) |
| 47 | ENSP00000303769 | AKNA | yes | yes | yes | no | no | 4 (yyynn) |
| 48 | ENSP00000302537 | MDM1 | yes | yes | yes | no | no | 4 (yyynn) |
| 49 | ENSP00000264935 | CEP72,FLJ10565 | yes | yes | yes | no | no | 4 (yyynn) |
| 50 | ENSP00000419231 | CEP70,BITE | yes | yes | yes | no | no | 4 (yyynn) |
| 51 | ENSP00000306105 | CEP89,CCDC123 | yes | yes | yes | no | no | 4 (yyynn) |
| 52 | ENSP00000380661 | CEP250,C-NAP1,CEP2,CNAP1 | yes | yes | yes | no | no | 4 (yyynn) |
| 53 | ENSP00000356579 | CEP350,CAP350,GM133 | yes | yes | yes | no | no | 4 (yyynn) |
| 54 | ENSP00000260372 | HAUS2,C15orf25,CEP27,HsT17025 | yes | yes | yes | no | no | 4 (yyynn) |
| 55 | ENSP00000360540 | CEP55,C10orf3,CT111,URCC6 | yes | yes | yes | no | no | 4 (yyynn) |
| 56 | ENSP00000355500 | CEP170,FAM68A,KAB | yes | yes | yes | no | no | 4 (yyynn) |
| 57 | ENSP00000369871 | HAUS6,Dgt6,FAM29A | yes | yes | yes | no | no | 4 (yyynn) |
| 58 | ENSP00000371308 | CENPJ,BM032,CENP-J,CPAP,LAP,LIP1,MCPH6,Sas-4 | yes | yes | yes | no | no | 4 (yyynn) |
| 59 | ENSP00000282058 | HAUS1,CCDC5,HEI-C,HEIC | yes | yes | yes | no | no | 4 (yyynn) |
| 60 | ENSP00000283122 | CETN3,CDC31,CEN3 | yes | yes | yes | no | no | 4 (yyynn) |
| 61 | ENSP00000352572 | PCNT,KEN,MOPD2,PCN,PCNT2,PCNTB | yes | yes | yes | no | no | 4 (yyynn) |
| 62 | ENSP00000295872 | SPICE1,CCDC52,SPICE | yes | yes | yes | no | no | 4 (yyynn) |
| 63 | ENSP00000317902 | CEP57,MVA2,PIG8,TSP57 | yes | yes | yes | no | no | 4 (yyynn) |
| 64 | ENSP00000426129 | CEP63 | yes | yes | yes | no | no | 4 (yyynn) |
| 65 | ENSP00000308021 | CEP290,BBS14,JBTS5,LCA10,MKS4,NPHP6,POC3,rd16,SLSN6 | yes | yes | yes | no | no | 4 (yyynn) |
| 66 | ENSP00000439056 | HAUS5,dgt5 | yes | yes | yes | no | no | 4 (yyynn) |
| 67 | ENSP00000462740 | CEP41,JBTS15,TSGA14 | yes | yes | yes | no | no | 4 (yyynn) |
| 68 | ENSP00000265717 | PRKAR2B,PRKAR2,RII-BETA | yes | yes | no | yes | yes | 5 (yynyy) |
| 69 | ENSP00000345892 | NDE1,HOM-TES-87,LIS4,NDE,NUDE | yes | yes | no | yes | yes | 5 (yynyy) |
| 70 | ENSP00000358921 | ACTR1A,ARP1,CTRN1 | yes | yes | no | yes | yes | 5 (yynyy) |
| 71 | ENSP00000447907 | DYNLL1,DLC1,DLC8,DNCL1,DNCLC1,hdlc1,LC8 | yes | yes | no | no | no | 6 (yynnn) |
| 72 | ENSP00000278935 | CEP164,NPHP15 | yes | yes | no | no | no | 6 (yynnn) |
| 73 | ENSP00000264448 | ALMS1,ALSS | yes | yes | no | no | no | 6 (yynnn) |
| 74 | ENSP00000316681 | KIAA1731 | yes | yes | no | no | no | 6 (yynnn) |
| 75 | ENSP00000456335 | CNTROB,LIP8,PP1221 | yes | yes | no | no | no | 6 (yynnn) |
| 76 | ENSP00000348573 | AKAP9,AKAP350,AKAP450,CG-NAP,HYPERION,LQT11 | yes | no | no | no | no | 7 (ynnnn) |
| 77 | ENSP00000384844 | DCTN1,DAP-150,P135 | **no** | **no** | yes | yes | yes | 8 (nnyyy) |
This table presents the list of 77 human proteins obtained from a BioMart search described in the text. The output gives a total of 133 proteins encoded by 77 genes, due to the presence of splice variants. For clarity, only one protein ID per gene is presented in the table, after verification that all the splice variants of each gene display the same orthology relationships with the species presented here. This table illustrates evolutionary conservation: a “yes” indicates that the human protein has an Inparalog in the corresponding species in Cildb, and a “no” that no Inparanoid orthology was found. The column ‘class’ serves to order the output genes in the table, from 5× ‘yes’ at the top to far fewer ‘yes’ at the bottom, on the criterion that certain species are closer to each other than others: the order from left to right goes human-mouse-rat (mammals), then fish (vertebrate), then bee and fly (insects). All instances of lacking orthology (“no”) were individually verified by BLAST searches using the Cildb BLAST. The BLAST results were consistent with the absence of orthologs in the species, with only three exceptions that contradict the Inparanoid results, highlighted in bold in the table.
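The ‘class’ ordering described above can be reproduced with a short sketch. It assumes that the class number is simply the rank of the yes/no pattern when patterns are sorted so that a ‘yes’ in a more leftward column comes first; this rule is inferred from the table and is not stated by the authors. The species labels and function names are illustrative.

```python
# Column order of the orthology calls (labels are illustrative).
SPECIES = ("mouse", "rat", "fish", "bee", "fly")


def pattern(calls):
    """Collapse a sequence of 'yes'/'no' calls into a string such as 'yyynn'."""
    return "".join("y" if call == "yes" else "n" for call in calls)


def assign_classes(patterns):
    """Map each distinct pattern to a class number.

    Assumption: classes follow reverse lexicographic order of the pattern
    strings ('y' sorts after 'n', so reverse=True puts 'yyyyy' first).
    """
    ranked = sorted(set(patterns), reverse=True)
    return {p: rank + 1 for rank, p in enumerate(ranked)}


# One representative row per class, copied from the table.
rows = {
    "PAFAH1B1": ["yes", "yes", "yes", "yes", "yes"],
    "CEP192":   ["yes", "yes", "yes", "yes", "no"],
    "CDK5RAP2": ["yes", "yes", "yes", "no", "yes"],
    "OFD1":     ["yes", "yes", "yes", "no", "no"],
    "NDE1":     ["yes", "yes", "no", "yes", "yes"],
    "CEP164":   ["yes", "yes", "no", "no", "no"],
    "AKAP9":    ["yes", "no", "no", "no", "no"],
    "DCTN1":    ["no", "no", "yes", "yes", "yes"],
}

classes = assign_classes(pattern(calls) for calls in rows.values())

if __name__ == "__main__":
    for gene, calls in rows.items():
        p = pattern(calls)
        print(f"{gene}\t{classes[p]} ({p})")
```

With these eight representative rows, the sketch assigns the same class numbers (1 to 8) as those listed in the table.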
1- Human Azi1 (ENSP00000393583) has no Inparalog in Drosophila, although an ortholog called dilatory exists. A BLAST search on the Drosophila genome indeed retrieves dilatory, with a score very close to the one found for the Apis Inparalogs by BLAST. The difference between these outputs may result from the default thresholds used by the Inparanoid program and from the different lengths of the proteins.
2- Human cdk5rap2 (ENSP00000343818) has no Inparalog in Apis, although homologs are found by BLAST. The top three Apis proteins in the list (XP_006563202.1, XP_006563201.1, XP_392107.3) appear to be Inparalogs of Drosophila centrosomin (cnn, cdk5rap2), for which 8 of 12 splice variant proteins display human Inparalogs. However, no direct Inparanoid relationship exists between the Apis proteins and any human protein.
3- Human dynactin/dctn1 (ENSP00000384844) surprisingly has no Inparalogs in mouse and rat, whereas some are found in fish, bee and fly. However, mouse and rat homologs are easily found by BLAST search. After careful examination, it appears that ENSP00000384844, the only dynactin protein common to the three human centrosomal studies, is one of the splice variants excluded from the Inparalog groups. Indeed, the 16 splice variants of the human dynactin gene ENSG00000204843 and the seven splice variants of its mouse counterpart ENSMUSG00000031865 are related by Inparanoid orthology through three groups: hsap_mmus.17187 (one human and one mouse variant), hsap_mmus.1073 (four human and one mouse variant) and hsap_mmus.977 (one human and two mouse variants). The remaining ten human protein variants (among which is ENSP00000384844) and three mouse protein variants encoded by these genes are not included in the orthology groups, probably because their exon composition is too different from that of the other protein variants.
These three examples illustrate the limits of Inparanoid orthology prediction. They highlight that reciprocal BLAST searches cannot be avoided and remain an important complementary approach for the analysis of individual proteins.
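One common way to carry out the reciprocal BLAST check mentioned above is the reciprocal best hit criterion, sketched below. This is an illustration under stated assumptions, not the procedure used by the authors or by Inparanoid: the function names are hypothetical, the input is assumed to be BLAST bit scores already parsed into dictionaries, and the protein IDs and scores are invented placeholders.

```python
def best_hit(hits):
    """Return the subject ID with the highest score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=hits.get)


def reciprocal_best_hits(forward, backward):
    """Return (query, subject) pairs that are each other's best hit.

    forward:  query ID -> {subject ID: score}, species A searched against B
    backward: subject ID -> {query ID: score}, species B searched against A
    """
    pairs = []
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is None:
            continue
        # Keep the pair only if the reverse search points back to the query.
        if best_hit(backward.get(subject, {})) == query:
            pairs.append((query, subject))
    return pairs


# Placeholder IDs and scores, invented for illustration only.
forward = {
    "human_A": {"fly_X": 310.0, "fly_Y": 95.0},
    "human_B": {"fly_Y": 120.0},
}
backward = {
    "fly_X": {"human_A": 305.0},
    "fly_Y": {"human_A": 98.0, "human_B": 90.0},
}

if __name__ == "__main__":
    print(reciprocal_best_hits(forward, backward))
```

In this toy input, human_A and fly_X are each other's best hit and are reported as a pair, whereas human_B is not paired because the best hit of fly_Y in the reverse search is human_A.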