Large-scale collection and annotation of full-length enriched cDNAs from a model halophyte, Thellungiella halophila.

Teruaki Taji, Tetsuya Sakurai, Keiichi Mochida, Atsushi Ishiwata, Atsushi Kurotani, Yasushi Totoki, Atsushi Toyoda, Yoshiyuki Sakaki, Motoaki Seki, Hirokazu Ono, Yoichi Sakata, Shigeo Tanaka, Kazuo Shinozaki.

Abstract

BACKGROUND: Thellungiella halophila (also known as Thellungiella salsuginea) is a model halophyte with a small plant size, short life cycle, and small genome. It easily undergoes genetic transformation by the floral dipping method used with its close relative, Arabidopsis thaliana. Thellungiella genes exhibit high sequence identity (approximately 90% at the cDNA level) with Arabidopsis genes. Furthermore, Thellungiella not only shows tolerance to extreme salinity stress, but also to chilling, freezing, and ozone stress, supporting the use of Thellungiella as a good genomic resource in studies of abiotic stress tolerance.
RESULTS: We constructed a full-length enriched Thellungiella (Shandong ecotype) cDNA library from various tissues and whole plants subjected to environmental stresses, including high salinity, chilling, freezing, and abscisic acid treatment. We randomly selected about 20,000 clones and sequenced them from both ends to obtain a total of 35171 sequences. CAP3 software was used to assemble the sequences and cluster them into 9569 nonredundant cDNA groups. We named these cDNAs "RTFL" (RIKEN Thellungiella Full-Length) cDNAs. Information on functional domains and Gene Ontology (GO) terms for the RTFL cDNAs was obtained using InterPro. The 8289 genes assigned to InterPro IDs were classified according to the GO terms using Plant GO Slim. Categorical comparison between the whole Arabidopsis genome and Thellungiella genes showing low identity to Arabidopsis genes revealed that the population of Thellungiella transport genes is approximately 1.5 times the size of the corresponding Arabidopsis population. This suggests that these genes regulate a unique ion transportation system in Thellungiella.
CONCLUSION: As the number of Thellungiella halophila (Thellungiella salsuginea) expressed sequence tags (ESTs) was 9388 in July 2008, the number of ESTs has increased to approximately four times the original value as a result of this effort. Our sequences will thus contribute to correct future annotation of the Thellungiella genome sequence. The full-length enriched cDNA clones will enable the construction of overexpressing mutant plants by introduction of the cDNAs driven by a constitutive promoter, the complementation of Thellungiella mutants, and the determination of promoter regions in the Thellungiella genome.

Year:  2008        PMID: 19014467      PMCID: PMC2621223          DOI: 10.1186/1471-2229-8-115

Source DB:  PubMed          Journal:  BMC Plant Biol        ISSN: 1471-2229            Impact factor:   4.215


Background

Thellungiella halophila (also known as Thellungiella salsuginea) is well known as a model halophyte for studying abiotic stress tolerance, as the plant exhibits extreme salt and freezing tolerance [1-9]. Thellungiella is closely related to Arabidopsis, and its genes share approximately 90% identity to those of Arabidopsis [1,10,11]. Moreover, Thellungiella is characterized by good features from the perspective of genetic studies, such as small plant size, a short life cycle, a high seed number, and the ability to self-pollinate. Furthermore, as in Arabidopsis, transformation of Thellungiella plants can be accomplished by means of the floral dipping method. Since the sequence identities between Thellungiella and Arabidopsis are very high at the cDNA level, Arabidopsis cDNA microarrays or oligo-microarrays can be used for transcriptome analysis of Thellungiella plants. We previously compared expression levels of various genes between Thellungiella and Arabidopsis plants under normal or high-salinity conditions using an Arabidopsis cDNA microarray composed of 7,000 Arabidopsis genes. Interestingly, a large number of genes known to be inducible by abiotic and biotic stresses were highly expressed in Thellungiella under normal growth conditions [5]. The use of a 70-mer oligoarray with 25 000 Arabidopsis genes revealed that Arabidopsis exhibited a global defense strategy required for bulk protein synthesis, whereas induced genes in Thellungiella were involved in protein folding, modification, and redistribution [2]. However, because of failed hybridization or a low hybridization rate between Arabidopsis DNAs and Thellungiella mRNAs, the data obtained from heterologous microarrays cannot provide an accurate evaluation of the expression levels. Recently, Thellungiella plants (Yukon ecotype) treated with drought, salinity, and freezing stresses were used to construct expressed sequence tag (EST) libraries with a total of 3628 unique genes [9]. 
A cDNA microarray was established with these cDNAs, and the transcriptional profiles of Thellungiella plants under various stress conditions were obtained [8]. Full-length cDNAs are useful genomic resources not only for genome annotation, but also for the identification of promoter regions, transgenic analyses, biochemical analyses, and determination of the three-dimensional structure of proteins [12]. Full-length enriched cDNA libraries from Arabidopsis [13,14], rice [15], poplar [16,17], wheat [18], maize [19], humans [20], mice [21,22], and Drosophila [23-25] have contributed enormously to elucidating biological processes in these organisms. In previous work, we reported the development of full-length enriched Arabidopsis cDNA libraries from plants grown under different conditions [13,26] using the biotinylated CAP trapper method with trehalose-thermoactivated reverse transcriptase [27-30]. A total of 155 144 RIKEN Arabidopsis Full-Length (RAFL) clones were isolated and clustered into 14 668 non-redundant cDNA groups [14]. Using the full-length cDNAs, we also created a microarray to analyze the expression profiles of Arabidopsis genes under various stress conditions or in various mutants and transgenic plants [12,26]. Using ectopic expression of full-length cDNAs, a novel gain-of-function system, termed the "FOX hunting system" (Full-length cDNA Over-eXpressing gene hunting system) was developed [31]. The Arabidopsis genome sequence and resources, including full-length cDNAs, also provide powerful tools for comparative genomics in furthering the understanding of the biology and evolution of other plant species [2,5,10,32]. In the present study, we constructed a full-length enriched cDNA library from whole Thellungiella plants and various tissues, in addition to cDNAs from seedlings subjected to high salinity, chilling, or freezing stress or to abscisic acid (ABA) treatment. 
We determined their DNA sequences from both the 5'- and the 3'-ends to permit the functional annotation of the Thellungiella full-length cDNAs, and we discuss their predicted functions related to abiotic stress tolerance.

Results and Discussion

Full-length enriched cDNA library construction and sequencing of 20 000 cDNAs

We used the biotinylated CAP trapper method [29] to construct a full-length cDNA library of Thellungiella halophila (Shandong ecotype) from whole plants as well as from various tissues, including leaves, roots, flowers, siliques, and mature seeds, of plants treated with high salinity, chilling, freezing stress, and ABA (Table 1). The λFLCIII vector [33], which accommodates cDNAs in a broad range of sizes and is useful for the high-efficiency cloning of long cDNA fragments, was used for the construction of the cDNA library. To reduce the frequency of representation of highly expressed mRNAs in the library, normalization procedures [29] were employed in the construction process. The 20000 recombinant clones were randomly selected and sequenced from both ends. We determined 18636 and 16535 sequences from the forward and reverse directions, respectively, and from among the 20000 clones we obtained the forward or reverse sequences of a total of 19429 clones (Table 2). A total of 35171 sequences have been deposited in the DDBJ public sequence database (accession numbers, BY800476 to BY835646). We have named these "RTFL" (RIKEN Thellungiella Full-Length) cDNAs.
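The read counts reported above can be cross-checked against one another and against the deposited accession range. The following is a minimal sketch of that arithmetic; the helper name `accession_count` is hypothetical, and the number of clones read from both ends is derived under the assumption that each clone contributes at most one forward and one reverse read.

```python
def accession_count(first: str, last: str, prefix_len: int = 2) -> int:
    """Count accessions in an inclusive range such as BY800476..BY835646."""
    if first[:prefix_len] != last[:prefix_len]:
        raise ValueError("accessions must share the same letter prefix")
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1


# Values reported in the text and in Table 2.
forward_reads = 18636
reverse_reads = 16535
clones_with_any_read = 19429

total_reads = forward_reads + reverse_reads
deposited = accession_count("BY800476", "BY835646")

# Assumption: each clone yields at most one forward and one reverse read,
# so inclusion-exclusion gives the number of clones read from both ends.
clones_with_both_reads = total_reads - clones_with_any_read

print(total_reads, deposited, clones_with_both_reads)
```

Under these assumptions the forward and reverse counts sum to the 35171 deposited sequences, and the accession range spans the same number of entries.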
Table 1

Collection of RNA samples for constructing a Thellungiella full-length cDNA library

| Sample name | Condition (treatment) | Time course | Condition (growth medium) | Tissues |
|---|---|---|---|---|
| salt stress | NaCl, 250 mM | 1, 2, 3, 7 and 14 days | agar medium | whole plants |
| cold stress | 4°C | 2, 4, 8 and 24 hours | soil | rosette leaves |
| freezing stress | -6°C | 1, 2, 4 and 8 hours | soil | rosette leaves |
| ABA | ABA 50 μM | 1, 2, 4 and 8 hours | agar medium | whole plants |
| various tissues | normal condition | | soil | siliques, mature seeds |
Table 2

Characteristics of full-length Thellungiella cDNA library

| Source of cDNA | Total no. clones | No. forward sequences | No. reverse sequences | Total no. sequences | Total no. singletons after CAP3 analysis | Total no. contigs after CAP3 analysis | No. gene clusters |
|---|---|---|---|---|---|---|---|
| RTFL (a) | 19429 | 18636 | 16535 | 35171 | 6556 | 7402 | 9569 |

(a) RTFL, RIKEN Thellungiella halophila full-length cDNA

Figure 1 shows the size distribution of the Thellungiella cDNA inserts from 1161 randomly selected clones. The average size was approximately 1.54 kbp. Our group previously determined 20683 full-read cDNA sequences from the RAFL (RIKEN Arabidopsis Full-Length) cDNA collection, and these sequences are available in the RARGE database [34]. The estimated average size of the RAFL cDNA inserts was 1.495 kbp (Motoaki Seki et al., RIKEN Plant Science Center, unpublished results). The average size of the Thellungiella cDNA inserts was thus slightly larger than that of the cDNA inserts from Arabidopsis libraries and similar to that in other plants; for example, the average rice and wheat cDNA lengths are both about 1.5 kbp [10,18].
Figure 1

Size distribution of the RTFL clones. The sequence lengths of the Thellungiella cDNA inserts were determined from a total of 1161 clones by digestion with SfiI (in the cDNA cloning site) or PCR amplification using T3 and T7 primers.


Sequence assembly and the proportion of full-length cDNA clones in the library

The 35171 sequences were assembled using the CAP3 program [35] to evaluate the level of sequence redundancy. Assembly generated 7402 contigs and 6556 singletons, and the 20000 sequenced cDNAs were clustered into 9569 nonredundant scaffolds that represented distinct genes (Table 2). Figure 2 shows the degree of redundancy in the sequences from the cDNA library. The majority of the clusters (6024, 63% of the total) consisted of a single cDNA, and only 5% contained more than six cDNAs, indicating that the normalization procedures were successful.
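The redundancy summary described above amounts to counting how many clones fall into each cluster. The sketch below illustrates this on a small invented list of cluster sizes (`toy_sizes` is hypothetical, not data from this study) and then recomputes the reported single-cDNA share from the two counts given in the text; the function names are illustrative only.

```python
from collections import Counter


def clones_per_cluster_histogram(cluster_sizes):
    """Map each cluster size to the number of clusters of that size."""
    return dict(sorted(Counter(cluster_sizes).items()))


def share_of_clusters(cluster_sizes, predicate):
    """Fraction of clusters whose size satisfies the predicate."""
    sizes = list(cluster_sizes)
    if not sizes:
        raise ValueError("no clusters given")
    return sum(1 for size in sizes if predicate(size)) / len(sizes)


# Hypothetical toy data: number of cDNA clones in each of eight clusters.
toy_sizes = [1, 1, 1, 1, 1, 2, 3, 7]
histogram = clones_per_cluster_histogram(toy_sizes)
singleton_share = share_of_clusters(toy_sizes, lambda size: size == 1)
large_share = share_of_clusters(toy_sizes, lambda size: size > 6)

# Counts reported in the text: 6024 single-cDNA clusters out of 9569.
reported_singleton_percent = round(100 * 6024 / 9569)

print(histogram, singleton_share, large_share, reported_singleton_percent)
```

A well-normalized library is expected to show a histogram dominated by small clusters, which is the pattern the 63% and 5% figures describe.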
Figure 2

Sequence redundancy in the normalized cDNA library. A total of 35171 sequences were assembled and divided into 9569 clusters. The graph represents the number of clones per assembled scaffold. Clusters containing a single cDNA accounted for 63% of the identified sequences, whereas clusters that contained more than six cDNAs accounted for only 5% of the total.

The cDNA sequence data were submitted to a BLASTN search against the green plant (Viridiplantae) mRNA databases in GenBank. Of the 19429 clones that we obtained as clean sequences, 18295 clones (94%) showed > 80% identity, whereas the remaining clones (6%) had no significant identity to any plant sequences in GenBank. We examined the proportion of full-length cDNA clones in our library. We selected clones that (1) had sequences from both ends, and (2) showed an Expected Value (E) < 1.0e-20 on the basis of a fastx search using forward sequences as queries against Arabidopsis proteins (TIGR v5, ATH1.pep), with the correct direction of the reading frame. We considered clones to be full-length if they met the following criteria: (1) they contained the first methionine, and (2) the reverse sequence contained the polyA sequence. Consequently, we selected 12878 clones as calculation objects and classified 10880 (84.5%) of these clones as full-length clones. This frequency is nearly identical to the reported values from the Arabidopsis [14], rice [15], and wheat [18] libraries.

Functional annotation of RTFL cDNAs

The 9569 nonredundant genes were submitted to InterPro [36] to obtain functional domain information. InterPro is an integrated resource for protein families, domains, and functional sites that integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D, and PANTHER. Protein matches in InterPro are pre-calculated using InterProScan software, which combines the different protein signature recognition methods offered by the InterPro member databases into one resource and provides the corresponding InterPro accession numbers and Gene Ontology (GO) annotations [37]. A total of 8289 sequences were assigned to InterPro IDs and GO terms [see Additional file 1]. According to the obtained GO terms, the 8289 genes were remapped and classified using Plant GO Slim [38]. Figures 3 and 4 show the categorization of Arabidopsis genes and the 8289 Thellungiella genes assigned to the GO terms. In most categories of biological process, cellular component, and molecular function, we observed no obvious differences between the numbers of sequences from Arabidopsis and Thellungiella. Notably, the number of sequences classified as transcription genes under biological processes in Arabidopsis was approximately twice that in Thellungiella (Fig. 4). Furthermore, the number of Arabidopsis genes associated with the nucleus under cellular components was also much higher than that in Thellungiella (Fig. 3A). Although the total number of transcription factors in the Thellungiella genome is not known, there seems to be no significant difference in the proportion of transcription factor genes between the Arabidopsis and Thellungiella genomes. The population of cDNAs may reflect the levels of gene expression. Thus, the expression level of Thellungiella transcription factors may be lower than that of Arabidopsis. Arabidopsis responds strongly to abiotic and biotic stresses at the transcriptional level. 
In contrast, Thellungiella does not initiate immediate changes in transcription in response to abiotic stresses, and instead constitutively expresses a large number of genes that correspond to stress-inducible genes in Arabidopsis [5]. Thus, the difference between Arabidopsis and Thellungiella in their responses to various stimuli at the transcriptional level may reflect differences in the number of transcription-related genes between the organisms and may depend partly on the number of transcription factors.
Figure 3

Comparison of the categories of Arabidopsis and nonredundant Thellungiella genes. The 28227 Arabidopsis genes and the 8298 nonredundant Thellungiella genes that were assigned InterPro IDs were classified according to the GO terms using Plant GO Slim into categories based on (A) cellular components and (B) molecular function.

Figure 4

Biological process categories for Arabidopsis genes, nonredundant Thellungiella genes, and Thellungiella genes showing low identity to Arabidopsis genes. The 28227 Arabidopsis genes, the 8298 nonredundant Thellungiella genes, and the 763 Thellungiella genes that showed low identity to Arabidopsis genes assigned to InterPro IDs were classified according to the GO terms using Plant GO Slim for biological processes. † indicates the categories in which the number of Thellungiella genes was more than 1.5 times the number of Arabidopsis genes. * indicates categories in which the number of Thellungiella genes was more than 1.5 times the number of Arabidopsis genes. The number of Thellungiella genes under the categories of transport, DNA metabolic process, generation of precursor metabolites and energy, response to abiotic stimulus, multicellular organismal development, response to external stimulus, and cell differentiation was more than 1.5 times that in Arabidopsis.


Features of Thellungiella-specific genes

Most Thellungiella genes have a high sequence identity (approximately 90% at the cDNA level) to Arabidopsis genes. Numerous studies of salt tolerance in Arabidopsis suggest that this plant contains most, if not all, of the salt-tolerance-related genes that might be found in halophytes [39]. The current hypothesis is that halophytes employ salt-tolerance mechanisms similar to those found in glycophytes, including Arabidopsis. However, subtle differences in this regulation result in large variations in salt tolerance between glycophytes and halophytes [11]. In addition, halophytes are hypothesized to exhibit specific salt-tolerance mechanisms resulting from the induction of halophyte-specific genes. We divided the 8298 genes into two groups on the basis of their sequence identities, using BLASTX searches against the Arabidopsis database. The group with high sequence identity to Arabidopsis genes (E value ≤ 1.0e-50) included 7535 genes, and the group with low identity to Arabidopsis genes (E value > 1.0e-50) included 763 genes. Previous studies revealed that a plasma membrane Na+/H+ antiporter (SOS1), a vacuolar Na+/H+ antiporter (NHX1), and a plasma membrane Na+ transporter (HKT1) are essential for the salt tolerance of Arabidopsis [40-42], and mutants of these genes exhibit a salt-hypersensitive phenotype. In contrast, plants that overexpress SOS1 and NHX1 show higher salt tolerance than wild-type plants [43,44]. The co-orthologous Thellungiella genes belong to the first gene group, exhibiting high identity to Arabidopsis genes. This suggests that some salt-tolerance mechanisms are common to both glycophytes and halophytes. We compared the categorization of the whole Arabidopsis genome with the categories of the 763 Thellungiella genes that exhibited low identity to Arabidopsis genes (Fig. 4). 
Among the biological process categories, the number of genes in the categories for transport, DNA metabolic process, generation of precursor metabolites and energy, response to abiotic stimulus, multicellular organismal development, response to external stimulus, and cell differentiation in Thellungiella was more than 1.5 times the number in Arabidopsis (Fig. 4). Moreover, with regard to molecular function, the proportion of genes involved in transporter activity in Thellungiella was also higher than in Arabidopsis. Less NaCl accumulates in Thellungiella plants than in Arabidopsis under similar salinity conditions, suggesting that Thellungiella has a superior system for suppressing Na+ influx or for excreting Na+ [5]. Electrophysiological analysis indicates that Thellungiella also exhibits high potassium/sodium selectivity, implying that Thellungiella has specific ion channel features that lead to superior homeostasis with respect to sodium and potassium [7]. Arabidopsis plants that overexpress a plasma membrane Na+/H+ antiporter gene, SOS1, show salinity tolerance and reduced sodium uptake compared with wild-type plants [44]. Likewise, the expression level of SOS1 in Thellungiella is higher than in Arabidopsis [5,45]. Although the high expression of SOS1 suggests a contribution of this gene to the salt tolerance of Thellungiella, the large proportion of transport genes may imply that Thellungiella has a distinct ion transportation system regulated by these specific genes.
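The division of genes into high- and low-identity groups described in this section is a simple threshold on the best BLASTX E value. A minimal sketch of such a split is given below. The function name, the two clone names marked as hypothetical, and the choice to place genes without any hit in the low-identity group are assumptions for illustration; the two RTFL entries use E values listed in Table 3.

```python
from typing import Dict, List, Optional, Tuple


def split_by_identity(
    best_hit_evalues: Dict[str, Optional[float]],
    threshold: float = 1.0e-50,
) -> Tuple[List[str], List[str]]:
    """Split genes into (high identity, low identity) by best-hit E value.

    High identity: E value <= threshold. Low identity: E value > threshold.
    Assumption: a gene with no hit (None) is treated as low identity.
    """
    high: List[str] = []
    low: List[str] = []
    for gene, evalue in best_hit_evalues.items():
        if evalue is not None and evalue <= threshold:
            high.append(gene)
        else:
            low.append(gene)
    return high, low


example_hits = {
    "hypothetical-conserved-clone": 1.0e-120,  # invented for illustration
    "RTFL01-07-H15": 1.0e-49,                  # E value from Table 3
    "RTFL01-20-G20": 1.2,                      # E value from Table 3
    "hypothetical-no-hit-clone": None,         # invented for illustration
}
high_identity, low_identity = split_by_identity(example_hits)

# The two reported group sizes add up to the number of classified genes.
reported_total = 7535 + 763

print(high_identity, low_identity, reported_total)
```

Note that a clone with an E value of 1.0e-49 falls just outside the high-identity group, which shows how sensitive the grouping is to the chosen cutoff.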

Salt tolerance system using Thellungiella-specific transporter genes

Table 3 [see Additional file 2] lists the Thellungiella genes with low identity (E value > 1.0e-50) to the Arabidopsis genes classified under transporter genes. Several transporters, including chloride channels and P-type H+-ATPase, play important roles in the salt tolerance of plants. Homeostasis of Na+ and Cl- is an important mechanism to reduce NaCl stress in higher plants. Chloride channels (CLCs) are a group of voltage-gated Cl- channels originally identified in animals [46]; they have diverse cellular functions such as stabilizing cell membrane potential and regulating cell volume and transcellular chloride transport [47]. Recently, a chloride channel gene, GmCLC1, was cloned from soybeans [48]. Transgenic tobacco BY-2 cells expressing GmCLC1 were able to drain Cl- more efficiently from vacuoles than was the case in untransformed BY-2 cells, and the transgenics showed a higher NaCl tolerance [48]. The plant cell membrane is energized by an electrochemical gradient produced by P-type H+-ATPase (proton pump). These pumps are encoded by at least 12 genes in Arabidopsis. One of the Arabidopsis P-type H+-ATPase genes, AHA4, was expressed most strongly in the root endodermis [49]. The aha4 mutant plants exhibited a clear growth reduction under a mild stress of 75 mM NaCl compared with wild-type plants, and the ratio of Na+ to K+ in the aha4 mutants increased to between four and five times the values in wild-type plants. These results suggest that the aha4 mutants were compromised in their ability to exclude Na+ under salinity stress [49]. P-type H+-ATPases were also found in a halotolerant cyanobacterium, Aphanothece halophytica, and a marine alga, Tetraselmis viridis [50,51]. Aphanothece halophytica grows under a wide range of salinity conditions (from 0.25 to 3.0 M NaCl), and Na+/H+ antiporters in A. halophytica play a crucial role in Na+ efflux to provide enhanced salt tolerance. 
Since the efflux of Na+ mediated by Na+/H+ antiporters utilizes protons as the motive force provided by a primary proton pump, H+-ATPase, the P-type H+-ATPase is thought to contribute to the salt tolerance of this species [52]. On the other hand, vacuolar ATPase (V-ATPase) is the major proton pump that establishes and maintains an electrochemical proton gradient across the tonoplast. Expression of several V-ATPase subunits or an increase in V-ATPase activity induced by salt stress has been observed in a number of glycophytic species [53], suggesting that increased V-ATPase levels or activity are required to drive Na+ sequestration under salt stress. Recently, the V-ATPase-deficient det3 Arabidopsis mutant was shown to be extremely salt sensitive. Moreover, SOS2, a protein kinase that phosphorylates SOS1, interacted directly with the V-ATPase regulatory subunits B1 and B2 [54]. These studies indicate that V-ATPase activity plays a key role in salt tolerance. Although most Thellungiella genes show approximately 90% identity with Arabidopsis genes, the Thellungiella genes encoding transporters appear to be remarkably different from their Arabidopsis co-orthologs. Whether the sequence diversities among these genes are the source of the large differences in salt tolerance between Thellungiella and Arabidopsis is a topic of great interest.
Table 3

Thellungiella genes showing low identity to Arabidopsis genes, classified in 'transport' using GO Slim (a)

| Clone name | InterPro ID | Description | AGI code (b) | E value (c) |
|---|---|---|---|---|
| RTFL01-07-H15 | IPR001807 | Chloride channel, voltage gated | AT5G40890.2 | 1.00E-49 |
| RTFL01-12-M19 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT1G60190.1 | 6.00E-45 |
| RTFL01-29-P17 | IPR000463 | Cytosolic fatty-acid binding | AT2G25590.1 | 8.00E-45 |
| RTFL01-05-O18 | IPR000803 | Facilitated glucose transporter | AT3G58130.2 | 5.00E-43 |
| RTFL01-21-M15 | IPR003612 | Plant lipid transfer protein/seed storage/trypsin-alpha amylase inhibitor | AT2G38540.1 | 5.00E-43 |
| RTFL01-24-G14 | IPR000264 | Serum albumin | AT5G09460.1 | 5.00E-43 |
| RTFL01-49-P05 | IPR003612 | Plant lipid transfer protein/seed storage/trypsin-alpha amylase inhibitor | AT2G38540.1 | 5.00E-43 |
| RTFL01-36-E03 | IPR007271 | Nucleotide-sugar transporter | AT5G65000.2 | 3.00E-41 |
| RTFL01-05-G14 | IPR001993 | Mitochondrial substrate carrier | AT5G42130.1 | 3.00E-38 |
| RTFL01-14-G02 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT1G08010.2 | 1.00E-34 |
| RTFL01-43-C04 | IPR002075 | Nuclear transport factor 2 | AT1G69250.1 | 4.00E-34 |
| RTFL01-21-H04 | IPR004240 | Nonaspanin (TM9SF) | AT4G12650.1 | 5.00E-34 |
| RTFL01-18-D17 | IPR006455 | Homeobox domain, ZF-HD class | AT5G42780.1 | 2.00E-31 |
| RTFL01-07-P02 | IPR005829 | Sugar transporter superfamily | AT4G10410.1 | 2.00E-30 |
| RTFL01-08-J20 | IPR001757 | ATPase, P-type, K/Mg/Cd/Cu/Zn/Na/Ca/Na/H-transporter | AT2G31150.1 | 3.00E-28 |
| RTFL01-06-N05 | IPR003612 | Plant lipid transfer protein/seed storage/trypsin-alpha amylase inhibitor | AT3G18840.2 | 1.00E-27 |
| RTFL01-40-M18 | IPR004240 | Nonaspanin (TM9SF) | AT1G10950.1 | 1.00E-27 |
| RTFL01-11-J21 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT3G24503.1 | 7.00E-27 |
| RTFL01-39-I23 | IPR008389 | ATPase, V0 complex, subunit H | AT4G26710.2 | 2.00E-23 |
| RTFL01-52-J14 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT3G54760.1 | 5.00E-21 |
| RTFL01-33-P14 | IPR000568 | ATPase, F0 complex, subunit A | AT4G13740.1 | 6.00E-15 |
| RTFL01-01-D06 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT1G29760.1 | 6.00E-15 |
| RTFL01-20-A07 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT1G29760.1 | 6.00E-15 |
| RTFL01-28-P24 | IPR001622 | Voltage-dependent potassium channel | AT5G55430.1 | 3.00E-13 |
| RTFL01-25-D12 | IPR000245 | ATPase, V0 complex, proteolipid subunit C | AT1G75630.1 | 7.00E-08 |
| RTFL01-22-P12 | IPR000264 | Serum albumin | AT5G09460.1 | 1.00E-07 |
| RTFL01-03-P24 | IPR006121 | Heavy metal transport/detoxification protein | AT5G11890.1 | 2.00E-06 |
| RTFL01-40-P02 | IPR002946 | Intracellular chloride channel | AT5G08450.3 | 3.00E-04 |
| RTFL01-03-G04 | IPR003663 | Sugar transporter | AT5G50540.1 | 0.001 |
| RTFL01-11-N06 | IPR000109 | TGF-beta receptor, type I/II extracellular region | AT3G55610.1 | 0.019 |
| RTFL01-13-J08 | IPR005829 | Sugar transporter superfamily | AT5G49665.1 | 0.073 |
| RTFL01-11-G05 | IPR007114 | Major facilitator superfamily | AT1G05300.2 | 0.075 |
| RTFL01-17-J21 | IPR000194 | ATPase, F1/V1/A1 complex, alpha/beta subunit, nucleotide-binding | AT4G19830.1 | 0.25 |
| RTFL01-38-L07 | IPR011116 | SecA Wing and Scaffold | AT3G55160.1 | 0.76 |
| RTFL01-20-G20 | IPR004100 | ATPase, F1/V1/A1 complex, alpha/beta subunit, N-terminal | AT5G60470.1 | 1.2 |

(a) The 9,569 nonredundant Thellungiella genes were submitted to InterPro. The 763 Thellungiella genes exhibiting low identity (E value > 1.0e-50) against Arabidopsis genes assigned to InterPro ID were classified according to the GO terms using GO Slim.

(b) Arabidopsis gene showing the highest identity with the Thellungiella cDNA clone.

(c) Sequence identity between the Thellungiella clone and the Arabidopsis counterpart (shown by AGI code) using BLASTX searches against the Arabidopsis database.

Conclusion and cDNA resources

We generated a full-length enriched cDNA library of Thellungiella halophila from various tissues and whole plants treated with salinity, chilling, or freezing stress, or with ABA. We isolated about 20000 full-length enriched cDNA clones (RTFL cDNAs) and sequenced them from both ends, and we outlined the features of their predicted functions (coding Thellungiella proteins) by comparing them with those of Arabidopsis. Moreover, the 35171 RTFL cDNA sequences have been deposited in the DDBJ public data center. The number of T. halophila (T. salsuginea) EST entries was 9388 as of July 2008, which means that our effort has increased the number of ESTs to approximately four times the number available before our study. Our sequences will thus contribute to correct annotation of the Thellungiella genome sequence in the near future. The RTFL cDNA clones will also enable the construction of overexpressing mutant plants by introduction of the cDNAs driven by a constitutive promoter, as well as the complementation of Thellungiella mutants and the determination of promoter regions in the Thellungiella genome. The RTFL clones will be available for distribution through the RIKEN Bioresource Center.

Methods

Plant materials and stress treatments

Thellungiella halophila (Shandong ecotype) seeds were sown on Murashige and Skoog (MS) plates containing 0.8% (wt/vol) agar and 1% sucrose. The seeds were stratified at 4°C for two weeks and then transferred to 22°C under continuous light for germination and growth. Three weeks after germination, seedlings were either transferred to 250 mM NaCl (salt stress) or 50 μM ABA (ABA treatment) solution, or transplanted into separate 9-cm plastic pots filled with a 1:1 mixture of perlite and vermiculite and watered with 1000-fold diluted Hyponex™ (Hyponex, Osaka, Japan). One week after transfer to the pots, during which the plants were kept under 16 hours light/8 hours darkness at 22°C, the seedlings were subjected to 4°C (cold stress) or -6°C (freezing stress) in a growth chamber under 24 hours darkness. A Thellungiella full-length cDNA library was constructed from a mixture of mRNA extracted from stress-treated plants and various tissues of Thellungiella. Thellungiella plants were subjected to the following stress treatments: high salinity (250 mM NaCl for 1, 2, 3, 7, and 14 days), cold temperatures (4°C for 2, 4, 8, and 24 hours), freezing temperatures (-6°C for 1, 2, 4, and 8 hours), or ABA (50 μM ABA for 1, 2, 4, and 8 hours). Control plants were grown under unstressed conditions under 16 hours light/8 hours darkness at 22°C. After the stress treatments, mRNA was extracted from whole plants (salt stress and ABA treatment) or rosette leaves (cold and freezing stresses) collected at each time point. Rosette leaves and cauline leaves, roots, flowers, and siliques were collected from 7- to 10-week-old plants, and mature seeds were collected from 12- to 20-week-old plants.

RNA extraction and construction of a full-length cDNA library

Total RNA was prepared by using TRIZOL Reagent (Life Technologies, Rockville, MD, USA) from the treated samples. A full-length cDNA library was constructed as previously reported [14,27,28] by means of the biotinylated CAP trapper method using trehalose-thermoactivated reverse transcriptase [28]. We used the λFLCIII [33] vector, which accommodates cDNAs in a broad range of sizes and is thus useful for the high-efficiency cloning of long cDNA fragments, for construction of the cDNA libraries [33]. The λFLCIII vectors can also be bulk-excised by a Cre-lox-based system free of size bias to generate the plasmid libraries. Normalization [29] was also introduced in the construction of the full-length cDNA library to reduce the frequency of highly expressed mRNAs in the library. The method is based on hybridization of first-strand, full-length cDNA as the tester and cellular biotinylated RNA extracted from stress-treated plants and various tissues of Thellungiella as the normalizing driver.

Sequencing of Thellungiella cDNA clones

The DNA of each clone was amplified directly from each of the 384 bacterial cultures in a glycerol stock plate by the rolling circle amplification (RCA) method [55] using a TempliPhi HT DNA amplification kit (GE Healthcare, United Kingdom). End sequencing of 20,000 clones was performed using ABI 3700 capillary sequencers (Applied Biosystems, Foster City, CA, USA). The M13 (-21) primer (5'-TGTAAAACGACGGCCAGT-3') and the 1233 primer (5'-AGCGGATAACAATTTCACACAGGA-3') were used for forward and reverse sequencing, respectively.

Trimming of sequence data and assembly

Raw sequence data were base-called using Phred [57,58], and vector sequences were detected using sim4 [56]. Low-quality regions at both ends of each raw sequence were discarded, and only the high-quality region (Phred quality score > 14 over more than 20 consecutive bases) was retained. After this initial evaluation, sequences shorter than 100 bases, or in which low-quality bases (Phred quality score ≤ 14) made up more than 50% of the total length, were omitted. The ESTs were assembled using CAP3 [35] with its default parameters. All sequences were submitted to the DNA Databank of Japan (DDBJ) under accession numbers BY800476 to BY835646.
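
The trimming and filtering rules above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, and it assumes that the high-quality region extends from the first to the last run of more than 20 consecutive bases with a quality score above 14, and that the 50% low-quality rule is applied to the trimmed read.

```python
from typing import List, Optional, Tuple

MIN_QUALITY = 14        # bases must score strictly above this to count as high quality
MIN_RUN = 20            # a high-quality run must be longer than this many bases
MIN_LENGTH = 100        # trimmed reads shorter than this are discarded
MAX_LOW_FRACTION = 0.5  # discard if more than half of the trimmed read is low quality


def high_quality_span(quals: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) from the first to the last qualifying run, or None."""
    runs = []
    start = None
    for i, q in enumerate(quals + [0]):  # the trailing 0 closes a run at the read end
        if q > MIN_QUALITY:
            if start is None:
                start = i
        elif start is not None:
            if i - start > MIN_RUN:
                runs.append((start, i))
            start = None
    if not runs:
        return None
    return runs[0][0], runs[-1][1]


def trim_read(seq: str, quals: List[int]) -> Optional[str]:
    """Trim low-quality ends; return None if the read fails the filters."""
    span = high_quality_span(quals)
    if span is None:
        return None
    start, end = span
    if end - start < MIN_LENGTH:
        return None
    low = sum(1 for q in quals[start:end] if q <= MIN_QUALITY)
    if low / (end - start) > MAX_LOW_FRACTION:
        return None
    return seq[start:end]
```

A read that returns `None` here corresponds to a sequence omitted before CAP3 assembly; the thresholds are the only values taken from the text.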

cDNA insert size of the RTFL clones

The sequence lengths of the Thellungiella cDNA inserts were determined from a total of 1,161 clones by digestion with SfiI (in the cDNA cloning site) or PCR amplification using T3 (5'-TGTAAAACGACGGCCAGT-3') and T7 primers (5'-AATACGACTCACTATAGGG-3').

Full-length cDNA library quality

To estimate the proportion of full-length cDNA clones in this library, we selected clones that met the following conditions as the evaluation set: (1) sequences were available from both ends, and (2) the forward sequence gave an expected value (E) < 1.0e-20 in a fastx search against Arabidopsis proteins (TIGR v5, ATH1.pep), with the reading frame in the correct direction. We used the following criteria to classify clones as full-length: (1) the clone must include the first methionine, and (2) the reverse read must include the polyA sequence.
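
The selection and classification criteria above amount to two nested filters, sketched here in Python. The `CloneEvidence` record and its field names are hypothetical; they stand in for values that would be parsed from the fastx output and the end reads, which the text does not describe in detail.

```python
from dataclasses import dataclass
from typing import Iterable

E_VALUE_CUTOFF = 1.0e-20  # threshold stated in the text


@dataclass
class CloneEvidence:
    """Per-clone evidence (hypothetical fields, assumed to be precomputed)."""
    has_forward_read: bool
    has_reverse_read: bool
    best_evalue: float        # best hit of the forward read against Arabidopsis proteins
    frame_is_forward: bool    # hit lies in the expected reading-frame direction
    covers_first_met: bool    # forward read includes the first methionine
    reverse_has_polya: bool   # reverse read includes the polyA sequence


def is_candidate(c: CloneEvidence) -> bool:
    """Clone belongs to the set used to estimate the full-length proportion."""
    return (c.has_forward_read and c.has_reverse_read
            and c.best_evalue < E_VALUE_CUTOFF and c.frame_is_forward)


def is_full_length(c: CloneEvidence) -> bool:
    """Candidate clone that satisfies both full-length criteria."""
    return is_candidate(c) and c.covers_first_met and c.reverse_has_polya


def full_length_fraction(clones: Iterable[CloneEvidence]) -> float:
    """Fraction of candidate clones classified as full-length."""
    candidates = [c for c in clones if is_candidate(c)]
    if not candidates:
        return 0.0
    return sum(is_full_length(c) for c in candidates) / len(candidates)
```

Clones that fail the first filter are excluded from the denominator, so the fraction describes only clones with a confident Arabidopsis protein match.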

Scaffold construction

To obtain a non-redundant set of transcripts, the clones were clustered according to clone names. To accomplish this, we parsed the .ace file output by CAP3 to build scaffolds, which are groups of sequences that represent a unique transcript and for which the relative position and orientation of the fragments can be inferred. Using clone names, the contigs or singletons corresponding to the two ends of a given clone were joined by inserting 20 N's between the two sequences. Since 20 is more than the default window size in BLAST searches, these N's did not interfere with the BLAST analyses.
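
The joining step can be sketched in Python as follows. This is a simplified illustration, not the authors' parser: it assumes the .ace file has already been reduced to two dictionaries (hypothetical inputs) that map each clone name to the contig or singleton sequence containing its 5' or 3' end read, with both sequences already in transcript orientation. It does not merge scaffolds that share a contig across different clones.

```python
from typing import Dict

LINKER = "N" * 20  # spacer length stated in the text


def build_scaffolds(five_prime: Dict[str, str],
                    three_prime: Dict[str, str]) -> Dict[str, str]:
    """Join the 5' and 3' sequences of each clone with a 20-N linker."""
    scaffolds = {}
    for clone in sorted(set(five_prime) | set(three_prime)):
        left = five_prime.get(clone)
        right = three_prime.get(clone)
        if left and right and left != right:
            # Ends fall in different contigs: link them into one scaffold.
            scaffolds[clone] = left + LINKER + right
        else:
            # Only one end is available, or both ends lie in the same contig.
            scaffolds[clone] = left or right
    return scaffolds
```

For example, a clone whose ends lie in contigs `ATGGC` and `TTTAA` would yield `ATGGC`, 20 N's, then `TTTAA`, while a clone with a single end read keeps that sequence unchanged.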

Functional annotation of the sequences

Once these scaffolds were created, the sequences were submitted to InterPro [36] to obtain functional domain information. Protein matches in InterPro were pre-calculated with the InterProScan software [37,59]. InterProScan reported the corresponding InterPro accession numbers and GO annotations in its results [37]. The genes assigned InterPro IDs were classified according to the GO terms provided by InterPro using Plant GO Slim [38].

Authors' contributions

TT contributed to and participated in the entire study and drafted the manuscript. TS, KM, AI, and AK performed the bioinformatics analyses (assembly, clustering, annotation and comparative analysis). YT carried out annotation and registration in DDBJ. AT and YS conducted sequencing of the cDNA clones. MS assisted in the construction of the cDNA library. HO checked the length distribution of the cDNA inserts. YS and ST helped draft the manuscript. KS coordinated the project and helped draft the manuscript.

Additional file 1

List of the 9,569 clusters with accession numbers and annotations.

Additional file 2

Thellungiella genes showing low identity to Arabidopsis genes and classified as 'transport' using GO Slim, listed with the most homologous Arabidopsis gene.
  59 in total

1.  Plant salt tolerance.

Authors:  J K Zhu
Journal:  Trends Plant Sci       Date:  2001-02       Impact factor: 18.313

Review 2.  Genetic analysis of plant salt tolerance using Arabidopsis.

Authors:  J K Zhu
Journal:  Plant Physiol       Date:  2000-11       Impact factor: 8.340

3.  Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes.

Authors:  P Carninci; Y Shibata; N Hayatsu; Y Sugahara; K Shibata; M Itoh; H Konno; Y Okazaki; M Muramatsu; Y Hayashizaki
Journal:  Genome Res       Date:  2000-10       Impact factor: 9.043

4.  Balanced-size and long-size cloning of full-length, cap-trapped cDNAs into vectors of the novel lambda-FLC family allows enhanced gene discovery rate and functional analysis.

Authors:  P Carninci; Y Shibata; N Hayatsu; M Itoh; T Shiraki; T Hirozane; A Watahiki; K Shibata; H Konno; M Muramatsu; Y Hayashizaki
Journal:  Genomics       Date:  2001-09       Impact factor: 5.736

5.  Rapid amplification of plasmid and phage DNA using Phi 29 DNA polymerase and multiply-primed rolling circle amplification.

Authors:  F B Dean; J R Nelson; T L Giesler; R S Lasken
Journal:  Genome Res       Date:  2001-06       Impact factor: 9.043

6.  The Arabidopsis thaliana salt tolerance gene SOS1 encodes a putative Na+/H+ antiporter.

Authors:  H Shi; M Ishitani; C Kim; J K Zhu
Journal:  Proc Natl Acad Sci U S A       Date:  2000-06-06       Impact factor: 11.205

7.  A Drosophila complementary DNA resource.

Authors:  G M Rubin; L Hong; P Brokstein; M Evans-Holm; E Frise; M Stapleton; D A Harvey
Journal:  Science       Date:  2000-03-24       Impact factor: 47.728

8.  Functional annotation of a full-length mouse cDNA collection.

Authors:  J Kawai; A Shinagawa; K Shibata; M Yoshino; M Itoh; Y Ishii; T Arakawa; A Hara; Y Fukunishi; H Konno; J Adachi; S Fukuda; K Aizawa; M Izawa; K Nishi; H Kiyosawa; S Kondo; I Yamanaka; T Saito; Y Okazaki; T Gojobori; H Bono; T Kasukawa; R Saito; K Kadota; H Matsuda; M Ashburner; S Batalov; T Casavant; W Fleischmann; T Gaasterland; C Gissi; B King; H Kochiwa; P Kuehl; S Lewis; Y Matsuo; I Nikaido; G Pesole; J Quackenbush; L M Schriml; F Staubli; R Suzuki; M Tomita; L Wagner; T Washio; K Sakai; T Okido; M Furuno; H Aono; R Baldarelli; G Barsh; J Blake; D Boffelli; N Bojunga; P Carninci; M F de Bonaldo; M J Brownstein; C Bult; C Fletcher; M Fujita; M Gariboldi; S Gustincich; D Hill; M Hofmann; D A Hume; M Kamiya; N H Lee; P Lyons; L Marchionni; J Mashima; J Mazzarelli; P Mombaerts; P Nordone; B Ring; M Ringwald; I Rodriguez; N Sakamoto; H Sasaki; K Sato; C Schönbach; T Seya; Y Shibata; K F Storch; H Suzuki; K Toyo-oka; K H Wang; C Weitz; C Whittaker; L Wilming; A Wynshaw-Boris; K Yoshida; Y Hasegawa; H Kawaji; S Kohtsuki; Y Hayashizaki
Journal:  Nature       Date:  2001-02-08       Impact factor: 49.962

9.  Evidence for a role in growth and salt resistance of a plasma membrane H+-ATPase in the root endodermis.

Authors:  V Vitart; I Baxter; P Doerner; J F Harper
Journal:  Plant J       Date:  2001-08       Impact factor: 6.417

10.  Construction of a full-length cDNA library from young spikelets of hexaploid wheat and its characterization by large-scale sequencing of expressed sequence tags.

Authors:  Yasunari Ogihara; Keiichi Mochida; Kanako Kawaura; Koji Murai; Motoaki Seki; Asako Kamiya; Kazuo Shinozaki; Piero Carninci; Yoshihide Hayashizaki; Tadasu Shin-I; Yuji Kohara; Yukiko Yamazaki
Journal:  Genes Genet Syst       Date:  2004-08       Impact factor: 1.517
