Norihiro Futamura, Yasushi Totoki, Atsushi Toyoda, Tomohiro Igasaki, Tokihiko Nanjo, Motoaki Seki, Yoshiyuki Sakaki, Adriano Mari, Kazuo Shinozaki, Kenji Shinohara.
Abstract
BACKGROUND: Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages.
Year: 2008 PMID: 18691438 PMCID: PMC2568000 DOI: 10.1186/1471-2164-9-383
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of characteristics of the full-length cDNA library from male strobili of C. japonica
| Total sequences | 36,011 |
| Number of 5' sequences | 18,843 |
| Number of 3' sequences | 17,168 |
| Number of cDNA clones | 19,437 |
| Number of contigs (a) | 7,686 |
| Number of singlets (a) | 15,972 |
| Number of unique transcripts (b) | 10,463 |
| Number of unique transcripts corresponding to | |
| 1 clone | 6,320 |
| 2 clones | 2,078 |
| 3–5 clones | 1,686 |
| 6–10 clones | 353 |
| 11–20 clones | 21 |
| >21 clones | 5 |
a ESTs were assembled using the PHRAP program.
b Contigs and singlets were grouped using the BLASTN program, and then the 5'- and 3'-end sequences derived from the same clone were grouped together.
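The second step of footnote b (joining the 5'- and 3'-end sequences that come from the same clone) can be illustrated with a small union-find sketch. This is only a minimal illustration under stated assumptions: the input format (each contig or singlet mapped to the set of clone IDs it contains), the names `UnionFind` and `group_by_clone`, and the toy clone IDs are all hypothetical, and the BLASTN-based grouping used in the study is not reproduced here.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure (hypothetical helper, not from the paper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen items as their own root, then walk up to the root.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def group_by_clone(seq_to_clones):
    """Merge sequences that share at least one clone ID.

    seq_to_clones: dict mapping a contig/singlet name to the set of clone IDs
    whose 5'- or 3'-end reads it contains (assumed input format).
    Returns a list of sets of sequence names, one per putative transcript.
    """
    uf = UnionFind()
    first_seq_for_clone = {}
    for seq, clones in seq_to_clones.items():
        uf.find(seq)  # make sure sequences sharing nothing still form a group
        for clone in clones:
            if clone in first_seq_for_clone:
                uf.union(first_seq_for_clone[clone], seq)
            else:
                first_seq_for_clone[clone] = seq
    groups = defaultdict(set)
    for seq in seq_to_clones:
        groups[uf.find(seq)].add(seq)
    return list(groups.values())


# Toy data (invented): clone c1 links a 5' contig and a 3' singlet.
toy = {
    "contig_5p_A": {"c1", "c2"},
    "singlet_3p_B": {"c1"},
    "singlet_5p_C": {"c3"},
}
transcripts = group_by_clone(toy)
```

With the toy input, the 5' contig and the 3' singlet that share clone c1 end up in one group, and the unrelated singlet stays on its own.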
Base composition at each position in the 5'-end sequences
| Position | A | G | C | T | %purine (A+G) |
| 1 | 505 (2.7%) | 16977 (90%) | 350 (1.9%) | 1011 (5.4%) | 93 |
| 2 | 8185 (43%) | 8613 (46%) | 1066 (5.7%) | 979 (5.2%) | 89 |
| 3 | 5146 (27%) | 5237 (28%) | 2978 (16%) | 5482 (29%) | 55 |
| 4 | 5408 (29%) | 5645 (30%) | 3545 (19%) | 4249 (23%) | 59 |
| 5 | 5517 (29%) | 5455 (29%) | 3225 (17%) | 4646 (25%) | 58 |
| 10 | 5456 (29%) | 3879 (21%) | 3849 (20%) | 5659 (30%) | 50 |
The occurrence of each base was counted at each position.
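The percentages in the table follow directly from the per-position counts. The sketch below recomputes them for position 1; the function name `base_percentages` is a hypothetical helper, and only the four counts are taken from the table.

```python
def base_percentages(counts):
    """Return the percentage of each base plus the purine (A+G) percentage.

    counts: dict with keys "A", "G", "C", "T" holding the number of
    5'-end sequences that carry that base at one position.
    """
    total = sum(counts.values())
    pct = {base: 100.0 * n / total for base, n in counts.items()}
    pct["purine"] = 100.0 * (counts["A"] + counts["G"]) / total
    return pct


# Counts for position 1, copied from the table above.
position_1 = {"A": 505, "G": 16977, "C": 350, "T": 1011}
pct1 = base_percentages(position_1)
total_1 = sum(position_1.values())
```

For position 1 the counts sum to 18,843, the number of 5' sequences in the library summary, and the rounded purine share matches the 93% given in the table.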
Figure 1. Functional classification and relative levels, as percentages of unique transcripts in the pool, of ESTs derived from male strobili of C. japonica. We assigned 7,369 clusters by reference to databases of KOGs, TWOGs and LSEs using a BLAST-based algorithm (E-value ≤ 10⁻⁵). Designations of functional categories: A, RNA processing and modification; B, chromatin structure and dynamics; C, energy production and conversion; D, cell cycle control and mitosis; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure and biogenesis; K, transcription; L, replication and repair; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, post-translational modification, protein turnover and chaperone functions; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; T, signal transduction; U, intracellular trafficking, secretion, and vesicular transport; V, defense mechanisms; Y, nuclear structure; Z, cytoskeleton; R, prediction of general function only; S, function unknown; and X, unassigned.
Figure 2. Sequence similarities. Numbers of transcripts from C. japonica male strobili that are similar to sequences in the UniProt, pine, spruce, poplar, Arabidopsis, and rice databases at different BLASTX E-value cutoffs.
Occurrence of the 25 most common Pfam domains in the predicted proteins of unique transcripts from male strobili of C. japonica
| Description of Pfam domain | Pfam accession | Number of transcripts in C. japonica (a) | Number of genes in A. thaliana (b) |
| Protein kinase domain | PF00069 | 132 | 789 |
| Cytochrome P450 | PF00067 | 120 | 242 |
| RNA recognition motif (a.k.a. RRM, RBD, or RNP domain) | PF00076 | 78 | 172 |
| NAD-dependent epimerase/dehydratase family | PF01370 | 54 | 52 |
| Zinc finger; C3HC4 type (RING finger) | PF00097 | 48 | 18 |
| GDSL-like lipase/acylhydrolase | PF00657 | 44 | 107 |
| Myb-like DNA-binding domain | PF00249 | 43 | 53 |
| WD domain; G-beta repeat | PF00400 | 41 | 17 |
| Ras family | PF00071 | 35 | 73 |
| UDP-glucuronosyl and UDP-glycosyl transferase | PF00201 | 35 | 86 |
| Short-chain dehydrogenase | PF00106 | 34 | 55 |
| Alpha/beta hydrolase fold | PF00561 | 34 | 37 |
| Sugar (and other) transporter | PF00083 | 33 | 55 |
| 2OG-Fe(II) oxygenase superfamily | PF03171 | 33 | 105 |
| Core histone H2A/H2B/H3/H4 | PF00125 | 32 | 46 |
| Peroxidase | PF00141 | 32 | 82 |
| DnaJ domain | PF00226 | 32 | 88 |
| Eukaryotic aspartyl protease | PF00026 | 31 | 9 |
| Aldo/keto reductase family | PF00248 | 30 | 21 |
| Transferase family | PF02458 | 28 | 55 |
| Alcohol dehydrogenase, GroES-like domain | PF08240 | 28 | 25 |
| Mitochondrial carrier protein | PF00153 | 27 | 55 |
| Ubiquitin-conjugating enzyme | PF00179 | 27 | 41 |
| Zinc-binding dehydrogenase | PF00107 | 26 | 39 |
| AMP-binding enzyme | PF00501 | 26 | 45 |
a Protein families were identified by BLASTX searches with an E-value < 1e-10 in the Pfam database.
b Protein families were assigned on the basis of data in TAIR.
Pfam domains found in transcripts of both A. thaliana stamen- or male gametophyte-specific genes and in transcripts from male strobili of C. japonica
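| Description of Pfam domain | Pfam accession number | Number of transcripts in C. japonica | Number of A. thaliana stamen-specific transcripts (a) | Number of A. thaliana male gametophyte-specific transcripts (b) |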
| Protein kinase domain | PF00069 | 132 | 35 | 39 |
| Plant invertase/pectin methylesterase inhibitor | PF04043 | 12 | 22 | 15 |
| Protein tyrosine kinase | PF07714 | 19 | 16 | 12 |
| GDSL-like lipase/acylhydrolase | PF00657 | 44 | 15 | 4 |
| Pectinesterase | PF01095 | 14 | 15 | 11 |
| ABC transporter | PF00005 | 20 | 13 | 5 |
| Cytochrome P450 | PF00067 | 120 | 13 | 3 |
| Glycosyl hydrolase family 28 | PF00295 | 10 | 12 | 11 |
| Sodium/hydrogen exchanger family | PF00999 | 6 | 9 | 10 |
| ABC-2 type transporter | PF01061 | 8 | 9 | 3 |
| Oleosin | PF01277 | 4 | 9 | 4 |
| No apical meristem (NAM) protein | PF02365 | 12 | 9 | 2 |
| RNA recognition motif (a.k.a. RRM, RBD, or RNP domain) | PF00076 | 78 | 8 | 8 |
| Sugar (and other) transporter | PF00083 | 33 | 8 | 3 |
| Calcineurin-like phosphoesterase | PF00149 | 25 | 8 | 6 |
| E1-E2 ATPase | PF00122 | 4 | 7 | 4 |
| Multicopper oxidase | PF00394 | 4 | 7 | 4 |
| Haloacid dehalogenase-like hydrolase | PF00702 | 19 | 7 | 3 |
| Multicopper oxidase | PF07731 | 14 | 7 | 4 |
| Multicopper oxidase | PF07732 | 17 | 7 | 4 |
| Glycosyl hydrolase family 1 | PF00232 | 20 | 6 | 4 |
| Pectate lyase | PF00544 | 5 | 6 | 3 |
| FAD-binding domain | PF01565 | 6 | 6 | 2 |
| Galactose-binding lectin domain | PF02140 | 5 | 6 | 4 |
| MtN3/saliva family | PF03083 | 14 | 6 | 1 |
a 1,145 transcripts were selected on the basis of a previous report by Wellmer et al. (2004) [29].
b 1,274 transcripts were selected on the basis of a previous report by Honys and Twell (2004) [30].
Products of ESTs that resemble pollen allergens
| Accession No. (a) | Allergen | Species | Putative product | E-value (b) | No. (c) |
| BY894724 | Cry j 1 | | Pectate lyase | 1E-127 | 6 |
| BY895894 | Cry j 2 | | Polymethylgalacturonase | 1E-119 | 10 |
| BY891770 | Cry j 3.8 | | PR-5 protein | 3E-98 | 16 |
| BY896705 | CJP-4 | | Class IV chitinase | 1E-112 | 4 |
| BY888350 | CJP-6 | | Isoflavone reductase family | 4E-87 | 7 |
| BY894232 | Jun o 4 | | Calcium-binding protein | 2E-67 | 16 |
| BY912188 | Amb a 3 | | Plastocyanin-like protein | 2E-8 | 2 |
| BY881070 | Cat r 1 | | Cyclophilin | 9E-79 | 10 |
| BY882008 | Che a 1 | | Trypsin inhibitor | 3E-28 | 1 |
| BY911759 | Cor a 1.04 | | PR-10 protein | 2E-14 | 1 |
| BY896550 | Cor a 10 | | Luminal-binding protein | 1E-107 | 10 |
| BY882008 | Cro s 1 | | LAT52 protein | 3E-13 | 3 |
| BY883628 | Cyn d 22 | | Enolase | 9E-25 | 1 |
| BY893554 | Cyn d 24 | | PR-1 protein | 2E-33 | 3 |
| BY895449 | Hum j Profilin | | Profilin | 6E-63 | 3 |
| BY899168 | Hum j 1 | | Uncharacterized protein | 1E-10 | 3 |
| BY892250 | Lol p 1 | | Expansin | 6E-14 | 11 |
| BY896420 | Ole e 5 | | Superoxide dismutase | 7E-66 | 5 |
| BY894301 | Ole e 9 | | β-1,3-glucanase | 2E-46 | 10 |
| BY905708 | Ole e 10 | | Glycosyl hydrolase | 4E-23 | 3 |
| BY889873 | Sal k 1.03 | | Pectin esterase | 2E-35 | 2 |
| BY911213 | Sal k 2 | | Protein kinase | 6E-47 | 53 |
a The accession number of an EST that exhibited strongest similarity.
b The E-value of an EST that exhibited strongest similarity.
c Total number of transcripts whose products were similar to the respective allergens (BLASTX score > 50 and E-value < 10⁻⁷).
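The counting rule in footnote c can be sketched as a simple filter over BLASTX hits. This is a minimal sketch under stated assumptions: the tuple layout `(transcript, allergen, score, evalue)`, the function name `count_allergen_like`, and the toy hits are all hypothetical and are not data from the study. Only the two thresholds come from the footnote.

```python
from collections import Counter

SCORE_MIN = 50.0    # BLASTX score must exceed this (from footnote c)
EVALUE_MAX = 1e-7   # E-value must be below this (from footnote c)


def count_allergen_like(hits):
    """Count distinct transcripts per allergen that pass both thresholds.

    hits: iterable of (transcript_id, allergen, score, evalue) tuples
    (assumed layout; real BLASTX output would need parsing first).
    """
    passing = {
        (transcript, allergen)
        for transcript, allergen, score, evalue in hits
        if score > SCORE_MIN and evalue < EVALUE_MAX
    }
    # A transcript hitting the same allergen several times is counted once.
    return Counter(allergen for _, allergen in passing)


# Toy hits (invented values, for illustration only).
toy_hits = [
    ("t1", "Cry j 1", 450.0, 1e-127),
    ("t1", "Cry j 1", 300.0, 1e-90),   # duplicate transcript/allergen pair
    ("t2", "Cry j 1", 60.0, 1e-9),
    ("t3", "Cry j 1", 45.0, 1e-8),     # fails the score threshold
    ("t4", "Amb a 3", 55.0, 2e-8),
    ("t5", "Amb a 3", 52.0, 1e-6),     # fails the E-value threshold
]
allergen_counts = count_allergen_like(toy_hits)
```

Both conditions are strict inequalities, matching the wording of the footnote, so a hit with a score of exactly 50 would be excluded.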
Identification of transcripts encoding putative transcription factors in male strobili of C. japonica
| Description of Pfam domains | Pfam accession | Number of transcripts |
| Zinc finger, C3HC4 type (RING finger) | PF00097 | 48 |
| Myb-like DNA-binding domain | PF00249 | 43 |
| AP2 domain | PF00847 | 18 |
| No apical meristem (NAM) protein; NAC domain | PF02365 | 12 |
| Homeobox domain | PF00046 | 11 |
| PHD-finger | PF00628 | 11 |
| SRF-type transcription factor; MADS box | PF00319 | 10 |
| Histone-like transcription factor (CBF/NF-Y) and archaeal histone | PF00808 | 7 |
| Helix-loop-helix DNA-binding domain | PF00010 | 6 |
| HSF-type DNA-binding | PF00447 | 6 |
| B-box zinc finger | PF00643 | 6 |
| Dof domain, zinc finger | PF02701 | 4 |
| WRKY DNA-binding domain | PF03106 | 3 |
| Response regulator receiver domain | PF00072 | 2 |
| bZIP transcription factor | PF00170 | 2 |
| GATA zinc finger | PF00320 | 2 |
| B3 DNA binding domain; ABI3/VP1 transcription factor | PF02362 | 2 |
| SBP domain | PF03110 | 2 |
| GRAS family transcription factor | PF03514 | 2 |
| ZF-HD protein dimerisation region | PF04770 | 2 |
| CCT motif; CO-like protein | PF06203 | 2 |
| Auxin response factor | PF06507 | 2 |
| ARID/BRIGHT DNA binding domain | PF01388 | 1 |
| CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B | PF02045 | 1 |
| TCP family transcription factor | PF03634 | 1 |
| CG-1 domain; CAMTA protein | PF03859 | 1 |
| YABBY protein | PF04690 | 1 |
| Plant protein of unknown function; BZR1/LAT61 family | PF05687 | 1 |
| Whirly transcription factor | PF08536 | 1 |
Figure 3. A phylogenetic tree based on the MADS-, I- and K-domains of deduced proteins. The local bootstrap probability is shown on branches where available. The tree is unrooted. MIKCc-type genes are divided into 13 subfamilies, including the gymnosperm-specific DAL10-like subfamily, on the basis of this tree. The genus from which each transcript was isolated is indicated after the deduced protein name. Deduced proteins from C. japonica that were identified in the present study are indicated in red. Deduced proteins from gymnosperms other than C. japonica are indicated in blue, and deduced proteins from basal angiosperms are indicated in green.