Osamu Nishimura, Yukako Hirao, Hiroshi Tarui, Kiyokazu Agata.
Abstract
BACKGROUND: Planarians are considered to be among the extant animals close to one of the earliest groups of organisms that acquired a central nervous system (CNS) during evolution. Planarians have a bilobed brain with nine lateral branches from which a variety of external signals are projected into different portions of the main lobes. Various interneurons process different signals to regulate behavior and learning/memory. Furthermore, planarians have robust regenerative ability and are attracting attention as a new model organism for the study of regeneration. Here we conducted large-scale EST analysis of the head region of the planarian Dugesia japonica to construct a database of the head-region transcriptome, and then performed comparative analyses among related species.
Year: 2012 PMID: 22747887 PMCID: PMC3507646 DOI: 10.1186/1471-2164-13-289
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the materials used in the EST analysis
| Eye | Dj_aE | 000 | Plasmid | SK | ABI 3700 | 918* | - | - | - |
| Head | Dj_aH | 000 | Plasmid | SK, M13 | ABI 3700 | 6,444* | 3,163 | 689 | 495 |
| Head | Dj_aH | 001 - 022 | Plasmid | SK | ABI 3700 | 5,024 | - | - | 21 |
| Head | Dj_aH | 101 - 140 | Plasmid | SK | ABI 3700 | 2,364 | - | - | 4 |
| Head | Dj_aH | 201 - 227 | TempliPhi | SK | ABI 3730xl | 8,426 | - | - | 52 |
| Head | Dj_aH | 301 - 327 | TempliPhi | SK, M13 | ABI 3730xl | 8,366 | 7,107 | 4,516 | 21 |
| Head | Dj_aH | 401 - 406 | TempliPhi | T3 | ABI 3730xl | 2,056 | - | - | - |
| Head | Dj_aH | 501 - 530 | TempliPhi | T3, M13 | ABI 3730xl | 9,462 | 8,191 | 5,888 | - |
| Total | 43,060 | 18,461 | 11,093 | 593 |
* DDBJ entries registered by previous research.
Statistics of the transcriptome assembly
| Contigs | 4,883 | 6,642,939 | 1360.4 | 1,252 |
| Singletons | 8,284 | 5,919,589 | 701.1 | 672 |
| Unique sequences | 13,167 | 12,562,528 | 940.5 | 846 |
Summary of the de novo transcriptome assembly using filter-passed and vector-trimmed reads, pre-assembly contigs, and full-insert sequences.
Figure 1. Distribution of the number of sequences per contig in the unigenes.
Histogram of cluster sizes for unigenes
| Cluster size (sequences) | Number of unigenes | Percentage | Cumulative percentage |
| 1 | 8,488 | 63.81% | 63.48% |
| 2 | 1,686 | 12.67% | 76.09% |
| 3 - 4 | 1,423 | 10.70% | 86.73% |
| 5 - 8 | 922 | 6.93% | 93.63% |
| 9 - 16 | 471 | 3.54% | 97.15% |
| 17 - 32 | 217 | 1.63% | 98.77% |
| 33 - 64 | 96 | 0.72% | 99.49% |
| 65 - 128 | 46 | 0.35% | 99.84% |
| 129 - 256 | 17 | 0.13% | 99.96% |
| 257 - 512 | 3 | 0.02% | 99.99% |
| 513 - 1,024 | 0 | 0.00% | 99.99% |
| 1,025 - 2,048 | 1 | 0.01% | 99.99% |
| 2,049 - 4,096 | 1 | 0.01% | 100.00% |
Contigs with a depth of fewer than 5 sequences account for most of the unigene variation.
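The doubling bin widths in the cluster-size histogram above (1, 2, 3 - 4, 5 - 8, ...) can be reproduced with a short sketch. This is only an illustration: the function names and the toy input are hypothetical, and it assumes that a cluster size is simply the number of EST sequences assembled into a unigene.

```python
from collections import Counter


def bin_label(size: int) -> str:
    """Return the power-of-two bin label (1, 2, 3 - 4, 5 - 8, ...) for a cluster size."""
    if size < 1:
        raise ValueError("cluster size must be >= 1")
    if size <= 2:
        return str(size)
    upper = 1 << (size - 1).bit_length()  # smallest power of two >= size
    lower = upper // 2 + 1
    return f"{lower:,} - {upper:,}"


def histogram(cluster_sizes):
    """Return rows of (bin, count, percentage, cumulative percentage)."""
    counts = Counter(bin_label(s) for s in cluster_sizes)
    total = len(cluster_sizes)

    def lower_bound(label: str) -> int:
        # Sort bins numerically by their lower edge.
        return int(label.split(" - ")[0].replace(",", ""))

    rows, running = [], 0
    for label in sorted(counts, key=lower_bound):
        running += counts[label]
        rows.append((label, counts[label],
                     100 * counts[label] / total,
                     100 * running / total))
    return rows


# Hypothetical toy input: sequences per unigene for six unigenes.
for row in histogram([1, 1, 2, 3, 4, 9]):
    print(row)
```

Because the cumulative column is a running sum of the per-bin counts over one shared total, the last row always reaches 100%.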
Figure 2. Unigene accumulation curve. The x-axis shows the number of EST sequences in the group used for unigene assembly, and the y-axis shows the number of unigenes consisting of contigs and singletons.
Figure 3. Flow chart for the generation and annotation of unigenes.
Figure 4. Species distribution of significant homologous matches of the unique sequences. A total of 7,334 sequences had hits in the RefSeq protein database using BLASTX with an E-value threshold of 1e-10. The top 30 species and the number of sequences for each are shown.
The top 40 Pfam domains and families in unigenes
| Pfam accession | Domain or family | Count |
| pfam00069 | Protein kinase domain | 307 |
| pfam12796 | Ankyrin repeats | 101 |
| pfam00076 | RNA recognition motif | 71 |
| pfam00071 | Ras family | 62 |
| pfam07690 | Major Facilitator Superfamily | 49 |
| pfam07714 | Protein tyrosine kinase | 38 |
| pfam00876 | Innexin | 34 |
| pfam00443 | Ubiquitin carboxyl-terminal hydrolase | 33 |
| pfam00067 | Cytochrome P450 | 32 |
| pfam00270 | DEAD/DEAH box helicase | 29 |
| pfam00102 | Protein-tyrosine phosphatase | 28 |
| pfam00001 | 7 transmembrane receptor (rhodopsin family) | 27 |
| pfam00022 | Actin | 27 |
| pfam00112 | Papain family cysteine protease | 27 |
| pfam00350 | Dynamin family | 26 |
| pfam00240 | Ubiquitin family | 23 |
| pfam00335 | Tetraspanin family | 23 |
| pfam11901 | Protein of unknown function (DUF3421) | 23 |
| pfam03028 | Dynein heavy chain and region D6 of dynein motor | 22 |
| pfam00226 | DnaJ domain | 21 |
| pfam00271 | Helicase conserved C-terminal domain | 21 |
| pfam00012 | Hsp70 protein | 20 |
| pfam02931 | Neurotransmitter-gated ion-channel ligand binding domain | 20 |
| pfam00153 | Mitochondrial carrier protein | 19 |
| pfam00179 | Ubiquitin-conjugating enzyme | 19 |
| pfam00620 | RhoGAP domain | 19 |
| pfam01576 | Myosin tail | 19 |
| pfam03953 | Tubulin C-terminal domain | 19 |
| pfam00004 | ATPase family associated with various cellular activities | 16 |
| pfam00017 | SH2 domain | 16 |
| pfam00060 | Ligand-gated ion channel | 16 |
| pfam00089 | Trypsin | 16 |
| pfam00155 | Aminotransferase class I and II | 16 |
| pfam00168 | C2 domain | 16 |
| pfam01145 | SPFH domain/Band 7 family | 16 |
| pfam00046 | Homeobox domain | 15 |
| pfam00091 | Tubulin/FtsZ family, GTPase domain | 15 |
| pfam00307 | Calponin homology (CH) domain | 15 |
| pfam00501 | AMP-binding enzyme | 15 |
| pfam00782 | Dual specificity phosphatase, catalytic domain | 15 |
Figure 5. Algorithm for the identical match ratio calculation between conserved proteins. (A) Alignment output of Protein-BLAST. The colored box shows a conserved region between query and subject sequences defined by BLAST. Match types i and h indicate identical matches and homologous substitutions, respectively, which are based on the BLOSUM62 matrix, and - means no similarity or a gap. (B) The equation used to calculate the identical match ratio. A large substitution ratio within a conserved region leads to a decreased identical match ratio, which indicates that a homologous pair exhibits a high rate of diversification. In the case of A, the identical match ratio is 0.78.
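The calculation described in the Figure 5 caption can be sketched as follows. The equation in panel B is not reproduced in this text, so the denominator used here (identical matches plus homologous substitutions within the conserved region, with gap or no-similarity positions optionally included) is an assumption. The function name, the `count_gaps` parameter, and the encoding of match types as a string of `i`, `h`, and `-` are likewise hypothetical.

```python
def identical_match_ratio(match_types: str, count_gaps: bool = False) -> float:
    """Fraction of identical matches within a conserved region.

    match_types: one character per aligned position in the conserved region,
                 'i' = identical match, 'h' = homologous substitution,
                 '-' = no similarity or gap.
    count_gaps:  ASSUMPTION switch; if True, '-' positions are also counted
                 in the denominator.
    """
    identical = match_types.count("i")
    homologous = match_types.count("h")
    other = match_types.count("-")
    if identical + homologous + other != len(match_types):
        raise ValueError("match_types may contain only 'i', 'h' and '-'")
    denominator = identical + homologous + (other if count_gaps else 0)
    if denominator == 0:
        raise ValueError("no aligned positions in the conserved region")
    return identical / denominator


# Toy match string, not the alignment shown in the figure.
toy = "iiihi-ih"
print(round(identical_match_ratio(toy), 3))
print(round(identical_match_ratio(toy, count_gaps=True), 3))
```

Under either choice of denominator, more substitutions within the conserved region lower the ratio, which is the behavior the caption describes.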
Figure 6. Classification of the planarian conserved/identical proteins using KOG annotation. (A) The definitions of conserved protein and identical protein were derived from the identical match ratio calculation using the BLOSUM62 substitution matrix and the conserved region between homologous proteins in the 2 planarian species, which were predicted by protein BLAST. After a conserved domain search using the KOG database, each gene was classified in accordance with KOG functions. The columns of Conserved and Identical proteins show the numbers of genes that were classified into each function. The heat plot shows the log2 conserved/identical match ratio, with red indicating a high proportion of proteins with substitutions and green indicating that the majority of proteins are identical for the indicated function. * indicates a function that contained only conserved proteins (shown in red). (B) KOG category classification shows obvious patterns that were clearly distinguishable for each category. Many relatively conserved genes were concentrated in ‘Metabolism’, whereas ‘Information Storage and Processing’ contained many identical genes.
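The heat-plot value described in the Figure 6 caption can be illustrated with a small sketch. It assumes the plotted quantity is the base-2 logarithm of the number of conserved proteins divided by the number of identical proteins for each KOG function. The function name and the example counts are hypothetical and are not taken from the figure.

```python
from math import log2
from typing import Optional


def log2_conserved_identical(conserved: int, identical: int) -> Optional[float]:
    """log2(conserved / identical) for one KOG function.

    Returns None when the ratio is undefined, for example when a function
    contains only conserved proteins (identical == 0), the case the caption
    marks with '*'.
    """
    if conserved < 0 or identical < 0:
        raise ValueError("counts must be non-negative")
    if conserved == 0 or identical == 0:
        return None
    return log2(conserved / identical)


# Hypothetical counts: (conserved, identical) per KOG function.
examples = {"function A": (8, 2), "function B": (2, 8), "function C": (3, 0)}
for name, (n_conserved, n_identical) in examples.items():
    print(name, log2_conserved_identical(n_conserved, n_identical))
```

Positive values correspond to functions dominated by proteins with substitutions (red in the heat plot), and negative values to functions where most proteins are identical (green).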
Comparative analysis of CNS development genes with S. mediterranea and the schistosoma
| Dj_CL2868_001_b2 | Protein numb | 2.00E-68 | |||||
| Dj_CL0775_001_b2 | Ras-related protein Rac1 | 2.00E-85 | |||||
| Dj_aH_314_P04.full | Protein nubbin | 7.00E-21 | | ||||
| Dj_aH_000_00626HN.full | Zinc finger protein jing | 1.00E-12 | | ||||
| Dj_CL0266_002_b2 | Dynein heavy chain, cytoplasmic | 0.00E+00 | |||||
| Dj_aH_323_P16 | DNA topoisomerase 2-beta | 1.00E-129 | |||||
| Dj_CL2992_001_b2 | Probable global transcription activator SNF2L1 | 1.00E-104 | |||||
| Dj_CL0800_001_b2 | Transcriptional regulator ATRX | 5.00E-84 | |||||
| Dj_aH_133136_J24 | Myosin-10 | 2.00E-80 | |||||
| Dj_CL1927_001_b2 | Protein Wnt-4 | 3.00E-62 | |||||
| Dj_CL1142_001_b2 | Brain tumor protein | 5.00E-35 | |||||
| Dj_CL2438_001_b2 | Ubiquitin carboxyl-terminal hydrolase isozyme L5 | 3.00E-72 | |||||
| Dj_CL1575_001_b2 | Alpha-soluble NSF attachment protein | 1.00E-77 | |||||
| Dj_aH_121124_O06 | cGMP-dependent protein kinase 1 | 4.00E-81 | |||||
| Dj_aH_203_F09 | Neurofibromin | 2.00E-40 | |||||
| Dj_aH_517_H15.double | Receptor tyrosine-protein kinase erbB-4 | 3.00E-83 | |||||
| Dj_aH_526_D23.double | Hypoxanthine-guanine phosphoribosyltransferase | 4.00E-61 | |||||
| Dj_aH_214_D06.full | Homeobox protein SIX3 | 2.00E-90 | |||||
| Dj_aH_202_C21 | Plasma membrane calcium-transporting ATPase 3 | 8.00E-47 | |||||
| Dj_CL0788_001_b2 | Neural cell adhesion molecule 1 | 5.00E-18 | |||||
| Dj_aH_204_E12 | Protein kinase C iota type | 1.00E-49 | |||||
| Dj_CL3335_001_b2 | Nuclear factor 1 B-type | 4.00E-14 | |||||
| Dj_aH_307_L04.rev | cGMP-dependent protein kinase 1 | 8.00E-55 | |||||
| Dj_CL3457_001_b2 | RNA binding protein fox-1 homolog 1 | 5.00E-33 | |||||
| Dj_CL1765_001_b2 | Zinc finger protein ZIC 2 | 2.00E-73 | |||||
| Dj_aH_503_H01 | Epidermal growth factor receptor | 9.00E-38 | |||||
| Dj_aH_511_L17 | Tubby-related protein 3 | 2.00E-44 | |||||
| Dj_CL1678_001_b2 | NADH dehydrogenase [ubiquinone] iron-sulfur protein 4, mitochondrial | 2.00E-35 | |||||
| Dj_CL2162_001_b2 | Menin | 1.00E-37 | |||||
| Dj_CL3785_001_b2 | Excitatory amino acid transporter 2 | 5.00E-28 | |||||
| Dj_aH_514_N16 | DNA topoisomerase 2-beta | 2.00E-33 | |||||
| Dj_aH_325_L12.double | Paired mesoderm homeobox protein 2B | 2.00E-28 | |||||
| Dj_aH_305_P01 | Transcriptional regulator ATRX | 9.00E-29 | |||||
| Dj_aH_520_E14 | Epidermal growth factor receptor | 9.00E-15 | |||||
| Dj_aH_000_01325HH.double | NADH dehydrogenase [ubiquinone] iron-sulfur protein 4, mitochondrial | 1.00E-24 | |||||
| Dj_aH_217_D18 | Cysteine string protein | 7.00E-28 | |||||
| Dj_aH_402_H01 | LIM homeobox transcription factor 1-alpha | 2.00E-29 | |||||
| Dj_CL3423_001_b2 | Lethal(2) giant larvae protein homolog 1 | 2.00E-14 | |||||
| Dj_aH_222_K23 | Leishmanolysin-like peptidase | 2.00E-38 | |||||
| Dj_aH_505_J10 | Ski oncogene | 1.00E-32 | |||||
| Dj_aH_519_D15.double | Bardet-Biedl syndrome 2 protein homolog | 4.00E-51 | |||||
| Dj_CL0001_086_b2 | Paired box protein Pax-6 | 5.00E-72 | |||||
| Dj_CL3673_001_b2 | Glutamate [NMDA] receptor subunit epsilon-1 | 2.00E-37 | |||||
| Dj_aH_000_04532HH.full | Protocadherin-18 | 3.00E-39 | | ||||
| Dj_aH_523_G03.rev | Tubby-related protein 3 | 4.00E-23 | |||||
| Dj_CL1956_001_b2 | Homeobox protein meis3-A | 3.00E-44 | |||||
| Dj_CL1514_001_b2 | Protocadherin-like wing polarity protein stan | 2.00E-20 | |||||
| Dj_CL2141_001_b2 | Large proline-rich protein BAG6 | 2.00E-11 | |||||
| Dj_aH_306_A21 | Sphingosine kinase 2 | 3.00E-14 | | ||||
| Dj_CL3244_001_b2 | Neuroligin-4, X-linked | 1.00E-21 | | ||||
| Dj_CL4115_001_b2 | Neuroglian | 2.00E-41 | | ||||
| Dj_aH_306_H01 | Protein phosphatase Slingshot | 6.00E-14 | | ||||
| Dj_CL1452_001_b2 | SH3 and multiple ankyrin repeat domains protein 2 | 3.00E-18 | | ||||
| Dj_CL1757_001_b2 | Intraflagellar transport protein 88 homolog | 4.00E-74 | | ||||
| Dj_aH_527_L21 | Cytosolic carboxypeptidase 1 | 3.00E-76 | | ||||
| Dj_aH_522_B06 | Slit homolog 2 protein (Fragment) | 7.00E-15 | | ||||
| Dj_aH_137140_C22 | Bardet-Biedl syndrome 4 protein | 6.00E-47 | | ||||
| Dj_aH_303_K17 | Zinc finger protein Dzip1 | 1.00E-33 | | ||||
| Dj_aH_207_O05 | Potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 2 | 8.00E-88 | | ||||
| Dj_CL0854_001_b2 | Secreted frizzled-related protein 5 | 3.00E-29 | | ||||
| Dj_CL3133_001_b2 | Cytosolic carboxypeptidase 1 | 3.00E-35 | | ||||
| Dj_aH_313_M09.double | Cyclin-dependent kinase 5 activator 1 | 2.00E-48 | | ||||
| Dj_CL0876_001_b2 | Sodium/calcium exchanger 1 | 1.00E-161 | | ||||
| Dj_aH_303_F09 | Frizzled-8 | 1.00E-21 | | ||||
| Dj_CL4143_001_b2 | Intraflagellar transport protein 88 homolog | 2.00E-81 | | ||||
| Dj_aH_530_O19 | Paired box protein Pax-6 | 9.00E-55 | | ||||
| Dj_aH_007_F21 | Protein Hook homolog 3 | 6.00E-13 | | | |||
| Dj_aH_309_N15.double | Dixin | 6.00E-11 | | | |||
| Dj_CL4393_001_b2 | Endothelin-converting enzyme 2 | 7.00E-34 | | | |||
| Dj_aH_325_L06 | Centrosomal protein of 290 kDa | 3.00E-31 | | | |||
| Dj_CL0234_001_b2 | Reticulon-4 | 2.00E-13 | | | |||
| Dj_aH_518_F04.double | Ceroid-lipofuscinosis neuronal protein 5 | 2.00E-74 | | | |||
| Dj_aH_203_F09 | Neurofibromin | 2.00E-40 | |||||
| Dj_aH_000_02581HH | Calpain-A | 1.00E-51 | |||||
| Dj_aH_502_D11 | Calpain-A | 2.00E-67 | |||||
| Dj_aH_511_L17 | Tubby-related protein 3 | 2.00E-44 | |||||
| Dj_CL3713_001_b2 | Suppressor of fused homolog | 2.00E-21 | |||||
| Dj_aH_000_01210HH.full | Suppressor of fused homolog | 3.00E-15 | |||||
| Dj_CL3673_001_b2 | Glutamate [NMDA] receptor subunit epsilon-1 | 2.00E-37 | |||||
| Dj_aH_523_G03.rev | Tubby-related protein 3 | 4.00E-23 | |||||
| Dj_aH_522_B06 | Slit homolog 2 protein (Fragment) | 7.00E-15 | | ||||
| Dj_aH_000_03614HH | DNA-binding protein SMUBP-2 | 4.00E-23 | | | |||
| Dj_CL2485_001_b2 | Spastin | 1.00E-83 | |||||
| Dj_aH_526_D23.double | Hypoxanthine-guanine phosphoribosyltransferase | 4.00E-61 | |||||
| Dj_aH_511_L17 | Tubby-related protein 3 | 2.00E-44 | |||||
| Dj_aH_402_H01 | LIM homeobox transcription factor 1-alpha | 2.00E-29 | |||||
| Dj_aH_523_G03.rev | Tubby-related protein 3 | 4.00E-23 | |||||
| Dj_aH_527_L21 | Cytosolic carboxypeptidase 1 | 3.00E-76 | | ||||
| Dj_CL3133_001_b2 | Cytosolic carboxypeptidase 1 | 3.00E-35 | | ||||
| Dj_aH_000_03614HH | DNA-binding protein SMUBP-2 | 4.00E-23 | | | |||
| Dj_aH_203_F09 | Neurofibromin | 2.00E-40 | |||||
| Dj_CL0341_002_b2 | Porphobilinogen deaminase | 4.00E-69 | |||||
| Dj_CL0341_001_b2 | Porphobilinogen deaminase | 2.00E-64 | |||||
| Dj_aH_203_F09 | Neurofibromin | 2.00E-40 | |||||
| | Dj_aH_513_H15.rev | Exocyst complex component 4 | 7.00E-18 | ||||
| Dj_CL1495_001_b2 | Translation initiation factor eIF-2B subunit delta | 4.00E-42 | |||||
+ indicates the presence of a gene homologous to the D. japonica CNS-development gene in S. mediterranea and the schistosoma.