Lila O Vodkin, Anupama Khanna, Robin Shealy, Steven J Clough, Delkin Orlando Gonzalez, Reena Philip, Gracia Zabala, Françoise Thibaud-Nissen, Mark Sidarous, Martina V Strömvik, Elizabeth Shoop, Christina Schmidt, Ernest Retzel, John Erpelding, Randy C Shoemaker, Alicia M Rodriguez-Huete, Joseph C Polacco, Virginia Coryell, Paul Keim, George Gong, Lei Liu, Jose Pardinas, Peter Schweitzer.
Abstract
BACKGROUND: Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. To accelerate both gene discovery and hypothesis-driven research in soybean, global expression resources needed to be developed. Microarrays have wide-ranging applications for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays.
Year: 2004 PMID: 15453914 PMCID: PMC526184 DOI: 10.1186/1471-2164-5-73
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Steps in the construction and documentation of cDNA microarrays using a low redundancy soybean 'unigene' set of 27,513 cDNA clones. See Methods for details. (a) NSF Plant Genome Program project "A Functional Genomics Program for Soybean" (NSF DBI #9872565); (b) Soybean Public EST project [5]; (c) Washington University Genome Center, St. Louis, MO; (d) Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN [20]; (e) Genome Systems, St. Louis, MO, until its closure; (f) Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL [21]; (g) Soybean Functional Genomics, Department of Crop Sciences, University of Illinois, Urbana, IL [30]; (h) databases maintained by the National Center for Biotechnology Information, Bethesda, MD [22].
Figure 2. Gene discovery is increased by selection of weakly expressed cDNA clones from a cDNA library made from immature cotyledons. (A) Phosphorimager pattern: a high density membrane containing 18,432 double spotted colonies from the Gm-c1007 cDNA library made from immature cotyledons was hybridized with 33P-labeled cDNAs transcribed from mRNAs isolated from immature cotyledons. (B) Graphical representation of the new cDNAs selected by the normalization process using filter hybridization. Circles represent the 931 total sequences obtained from the non-normalized source cDNA library Gm-c1007 versus the 1799 sequences of Gm-r1030 that were selected as weakly expressed sequences from the filter hybridization experiments shown in part (A). The intersection of the circles represents sequences common to both sets. H = number of sequences with hits in the databases; N = number of sequences that did not have a hit in the databases; and T = total number of sequences.
Comparison of the percent unique sequences as determined by either CAP3 or Phrap analysis for the 5' and 3' ESTs represented in each of the four successive reracked clone subsets that constitute the low redundancy soybean 'unigene' set
| Rerack order & name | Number cDNAs | No. ESTs clustered a | Cap3 b | Phrap b | % Unique ESTs c Cap3 or Phrap |
| 1. Gm-r1021 | 4,089 | 2,797 5' | 2,202 s 259 c | 2,054 s 334 c | 88.0% : 80.4% |
| 1. Gm-r1021 | 4,089 | 2,797 3' | 1,836 s 413 c | 1,682 s 505 c | 85.4% : 78.2% |
| 2. Gm-r1070 | 9,216 | 6,938 5' | 5,566 s 620 c | 5,116 s 831 c | 89.2% : 78.0% |
| 2. Gm-r1070 | 9,216 | 6,938 3' | 4,284 s 1,124 c | 3,900 s 1,340 c | 85.7% : 75.5% |
| 3. Gm-r1083 | 4,992 | 3,879 5' | 3,426 s 200 c | 3,289 s 260 c | 93.5% : 79.7% |
| 3. Gm-r1083 | 4,992 | 3,879 3' | 2,474 s 599 c | 2,256 s 723 c | 91.5% : 76.8% |
| 4. Gm-r1088 | 9,216 | 7,434 5' | 6,295 s 521 c | 5,909 s 745 c | 91.7% : 89.5% |
| 4. Gm-r1088 | 9,216 | 7,434 3' | 4,719 s 1,173 c | 4,152 s 1,513 c | 79.3% : 76.2% |
| Entire set, 1–4 | 27,513 | 27,513 5' d | 21,873 s 2,402 c | 18,663 s 3,966 c | 88.2% : 81.2% |
| Entire set, 1–4 | 27,513 | 21,048 3' | 11,959 s 4,156 c | 8,341 s 5,641 c | 73.0% : 63.3% |
a Unless otherwise noted, the ESTs included in the cluster analysis represent only the cDNAs for which both the 5' and 3' sequences are known and for which the read length is over 200 bases.
b The number of singletons (s) and number of contigs (c) are shown.
c The % unique sequences is the number of singletons plus the number of contigs divided by the total number of ESTs.
d In this analysis, all 5' sequences were included even if the corresponding 3' sequence was not known.
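The definition in footnote c can be checked with a few lines of Python. This is only a sketch: the function name `percent_unique` is made up for illustration, and the input numbers are the Gm-r1088 5' singleton and contig counts taken from the table above.

```python
def percent_unique(singletons: int, contigs: int, total_ests: int) -> float:
    """Footnote c: (singletons + contigs) / total ESTs, as a percentage."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    return 100.0 * (singletons + contigs) / total_ests


# Gm-r1088, 5' ESTs (7,434 clustered), counts copied from the table.
cap3 = percent_unique(singletons=6295, contigs=521, total_ests=7434)
phrap = percent_unique(singletons=5909, contigs=745, total_ests=7434)

print(f"Cap3:  {cap3:.1f}%")   # 91.7%
print(f"Phrap: {phrap:.1f}%")  # 89.5%
```

Both values agree with the Gm-r1088 5' row of the table.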
Information contained in a comprehensive cross list of soybean unigene clone IDs. Shown are various identifiers and annotations for 27,513 reracked cDNAs used in microarray construction. The full list is provided with arrays and available upon request.
| Cross List Identifiers (for each cDNA clone) | Example (one of 27,513 cDNAs) | Comments |
| Reracked Clone ID | Gm-r1021-12 | The individual cDNA clone ID in the 384-well destination plates after reracking or rearraying of the selected clones from the cDNA source library plates. |
| Reracked Plate ID | Gm-r1021 #1 | The 384-well reracked plate name in increments of 384 (i.e., 1, 385, 769, etc.) |
| Reracked row_column position | A12 | Position of the clone in the 384-well reracked plate |
| Reracked 3' Keck Sequence ID | GM210001A21A6 | Sequence identifier assigned by the Keck Center for the 3' EST |
| Reracked 3' GenBank Accession | AW348131 | GenBank assigned accession number for the 3' EST |
| Reracked 3' Annotation | glutathione S-transferase GST 22 [Glycine max] | Top BLASTX hit for the 3' EST, at an E-value of 10⁻⁶ or lower |
| Source Clone ID | Gm-c1004-464 | The individual cDNA clone ID in the 384-well source plate. |
| Source Plate ID | Gm-c1004 #385 | The 384-well source plate name in increments of 384 (i.e., 1, 385, 769, etc.) |
| Source row_column position | D8 | Position of the clone in the 384-well source plate |
| Source WashU Sequence ID | sa26h04.y1 | Sequence identifier assigned by Washington University, 5' EST |
| Source 5' GenBank Accession | AI442436 | GenBank assigned accession number for the 5' EST |
| Source 5' Annotation | glutathione S-transferase GST 22 [Glycine max] | Top BLASTX hit for the 5' EST, at an E-value of 10⁻⁶ or lower |
| Source Library | Gm-c1004 | Name of the cDNA source library |
| Cultivar/Genotype | Williams | Specific information on the soybean variety or genotype |
| Tissue/Developmental Stage | Entire roots of 8-day old seedlings | Tissue/organ system/stage from which the cDNA library was constructed |
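The relationship between a clone ID, its plate ID, and its row_column position can be sketched in Python. The sketch assumes that clones are numbered consecutively across 384-well plates (16 rows A to P, 24 columns) in row-major order; that ordering is an assumption, although it reproduces both examples in the table. The helper name `locate_clone` is hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:16]  # rows A..P of a 384-well plate
COLS = 24                           # columns 1..24
WELLS = len(ROWS) * COLS            # 384 wells per plate


def locate_clone(clone_id: str) -> tuple:
    """Return (plate ID, well position) for an ID such as 'Gm-c1004-464'.

    Assumes row-major well numbering (A1..A24, B1..B24, ...), which is an
    assumption that matches the two examples in the cross list table.
    """
    library, number = clone_id.rsplit("-", 1)
    n = int(number)
    if n < 1:
        raise ValueError("clone numbers start at 1")
    plate_start = ((n - 1) // WELLS) * WELLS + 1  # 1, 385, 769, ...
    offset = (n - 1) % WELLS                      # 0-based index in plate
    row = ROWS[offset // COLS]
    col = offset % COLS + 1
    return f"{library} #{plate_start}", f"{row}{col}"


print(locate_clone("Gm-r1021-12"))   # ('Gm-r1021 #1', 'A12')
print(locate_clone("Gm-c1004-464"))  # ('Gm-c1004 #385', 'D8')
```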
Soybean microarrays and low redundancy unigene sets built from the public EST collection
| Microarrays and Reracked Unigene cDNA sets a | Source cDNA Library a | No. of cDNAs on array | Soybean Variety | Soybean tissues b |
| Gm-r1070 | Gm-c1016 | 2242 | Williams 82 | immature flowers |
| " | Gm-c1015 | 1696 | Williams 82 | mature flowers |
| " | Gm-c1008 | 869 | Williams | whole young pods (2 cm) |
| " | Gm-c1029 | 589 | Williams | immature cotyledons from 25–50 mg fresh weight seed |
| " | Gm-c1010 | 234 | Williams | immature cotyledons 100–200 mg seed fresh wt. |
| " | Gm-c1011 | 88 | Williams | immature cotyledons 100–200 mg seed fresh wt. |
| " | Gm-c1007 | 528 | Williams | immature cotyledons 100–300 mg seed fresh wt. |
| " | Gm-c1030 | 1200 | Williams | immature cotyledons 100–300 mg seed fresh wt. low expressing cDNAs from Gm-c1007 filter hybridizations |
| " | Gm-c1023 | 89 | T157 | immature seed coats from seed of 100–200 mg fresh wt. |
| " | Gm-c1019 | 1681 | Williams | immature seed coats from seed of 200–300 mg fresh wt. |
| Gm-r1021 c | Gm-c1004 | 4224 | Williams | roots of 8-day-old seedlings |
| Gm-r1083 | Gm-c1009 | 1117 | Williams | roots, 2 month old plants |
| " | Gm-c1028 | 3055 | Supernod | roots inoculated with B. japonicum |
| " | Gm-c1013 | 820 | Williams | whole 2–3 week old seedlings |
| Gm-r1088 | Gm-c1019 | 426 | Williams | immature seed coats from seed of 200–300 mg fresh wt. |
| " | Gm-c1023 | 929 | T157 | immature seed coats from seed of 100–200 mg fresh wt. |
| " | Gm-c1027 | 2706 | Williams | cotyledons of 3- and 7-day-old seedlings |
| " | Gm-c1036 | 613 | Jack | somatic embryos cultured on MSD 20 for 2 to 9 mo. |
| " | Gm-c1075 | 304 | Jack | differentiating somatic embryos cultured on MSM6AC |
| " | Gm-c1064 | 707 | Williams | epicotyl, 2 week old seedling, auxin treatment |
| " | Gm-c1065 | 1309 | Williams | germinating shoot, cold stressed, 3 day old seedlings |
| " | Gm-c1066 | 191 | Williams | leaf and shoot tip, salt stressed, 2 wk. old seedling |
| " | Gm-c1067 | 438 | Williams 82 | germinating shoot, 3 day old seedling, auxin treatment |
| " | Gm-c1068 | 630 | Williams 82 | leaf, drought stressed, 1 month old plants |
| " | Gm-c1072 | 365 | PI 567.374 | leaves and shoots from 2–3 week old seedlings induced for SDS symptoms |
| " | Gm-c1073 | 324 | Williams 82 | leaves and shoots from 2–3 week old seedlings induced for SDS symptoms |
| " | Gm-c1074 | 274 | Williams 82 | 9–11 day old seedlings induced for HR response by P. syringae carrying avrB gene |
a Further descriptions of the reracked and source libraries are available in GenBank at
b Tissues were collected from plants grown in a greenhouse or growth chamber, except for the immature and mature flowers, which were collected from plants grown in the field.
c Since the Gm-r1021 reracked library contains 4089 cDNAs, a total of 135 were repeated to obtain an even 9216 when combined with the Gm-r1083 cDNAs.
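The arithmetic in footnote c can be verified directly from the counts in the table. The following is a minimal sketch; the variable names are illustrative, and the observation that 9216 equals 24 plates of 384 wells is simple arithmetic rather than a statement from the source.

```python
# Counts copied from the table and footnote c.
GM_R1021_UNIQUE = 4089    # cDNAs in the Gm-r1021 reracked library
GM_R1021_REPEATED = 135   # cDNAs spotted a second time
GM_R1083_BY_SOURCE = {"Gm-c1009": 1117, "Gm-c1028": 3055, "Gm-c1013": 820}

ARRAY_SIZE = 9216         # target number of cDNAs per array (24 * 384)

r1021_on_array = GM_R1021_UNIQUE + GM_R1021_REPEATED
r1083_on_array = sum(GM_R1083_BY_SOURCE.values())
total = r1021_on_array + r1083_on_array

print(r1021_on_array)  # 4224, the value listed for Gm-c1004
print(r1083_on_array)  # 4992
print(total)           # 9216
```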
Figure 3. Scatter plots of the log values of expression data from two duplicate microarray slides before (left) and after (right) flagging and normalization. RNAs were extracted from seed coats of the 50–75 mg per seed fresh weight range by standard methods [13]. Replicate 1 was hybridized with Cy5 labeled RNA from seed coats of the T/T genotype and Cy3 labeled RNA from seed coats of the isogenic t*/t* mutant line. Replicate 2 is a dye swap experiment in which the mRNA from the T/T genotype is labeled with Cy3 and the isogenic t*/t* line is labeled with Cy5. The lines in each graph indicate expression either two-fold higher or two-fold lower than equivalent levels of expression. The dots encircled by the box represent repeats of flavonoid 3' hydroxylase cDNAs on the array that are overexpressed in the RNA samples from seed coats of the T/T genotype.
Differentially expressed cDNAs detected in dual labeling microarray experiments comparing isogenic lines of the T locus in soybean.
| Clone ID | GenBank 3' accession | Rep 1 intensity a, XB22A (Cy5) | Rep 1 intensity a, 37609 (Cy3) | Rep 2 intensity a, XB22A (Cy3) | Rep 2 intensity a, 37609 (Cy5) | Ratio XB22A/37609, Rep 1 b | Ratio XB22A/37609, Rep 2 b | Ratio XB22A/37609, Ave b,c | Functional annotation d |
| Gm-b10BB-23 | AF499730 | 28686 | 10847 | 38350 | 8194 | 2.645 | 4.680 | 3.520 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 26794 | 9746 | 34956 | 7839 | 2.749 | 4.459 | 3.512 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 23979 | 9018 | 33272 | 7626 | 2.659 | 4.363 | 3.440 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 26094 | 9231 | 23580 | 5264 | 2.827 | 4.479 | 3.427 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 26812 | 10670 | 35600 | 7963 | 2.513 | 4.471 | 3.350 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 25663 | 9746 | 32520 | 7685 | 2.633 | 4.232 | 3.338 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 24020 | 9468 | 34329 | 8241 | 2.537 | 4.166 | 3.295 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 24578 | 10218 | 35465 | 8158 | 2.405 | 4.347 | 3.267 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 24662 | 9850 | 26041 | 5957 | 2.504 | 4.371 | 3.208 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 22240 | 9526 | 31641 | 7643 | 2.335 | 4.140 | 3.138 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 24548 | 11084 | 35328 | 8233 | 2.215 | 4.291 | 3.100 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 19465 | 8343 | 30897 | 7912 | 2.333 | 3.905 | 3.098 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 26572 | 12004 | 37720 | 8844 | 2.214 | 4.265 | 3.084 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 21583 | 9921 | 31828 | 7410 | 2.175 | 4.295 | 3.082 | Flavonoid-3' hydroxylase |
| Gm-b10BB-22 | AF499731 | 21122 | 10431 | 37273 | 8852 | 2.025 | 4.211 | 3.028 | Flavonoid-3' hydroxylase |
| Gm-b10BB-23 | AF499730 | 26053 | 12897 | 37691 | 8164 | 2.020 | 4.617 | 3.027 | Flavonoid-3' hydroxylase |
| Gm-r1070-484 | BE819850 | 2843 | 6235 | 3931 | 8702 | 0.456 | 0.452 | 0.454 | Bowman-Birk inhibitor |
| Gm-r1070-8083 | BE823467 | 2471 | 5070 | 3498 | 8557 | 0.487 | 0.409 | 0.438 | Ribonucleoprotein homolog |
| Gm-r1070-8006 | BE823540 | 3353 | 7237 | 4145 | 12219 | 0.463 | 0.339 | 0.385 | Trypsin inhibitor, Kunitz |
| Gm-r1070-9195 | BE824378 | 1889 | 4384 | 2383 | 7562 | 0.431 | 0.315 | 0.358 | No hits found |
| Gm-r1070-120 | BE657237 | 3083 | 7548 | 3769 | 11877 | 0.408 | 0.317 | 0.353 | Trypsin inhibitor, Kunitz |
| Gm-r1070-8909 | BE824331 | 3771 | 10945 | 3756 | 13568 | 0.345 | 0.277 | 0.307 | Beta conglycinin |
| Gm-r1070-9099 | BE824364 | 1806 | 19569 | 1663 | 31399 | 0.092 | 0.053 | 0.068 | Albumin precursor/leginsulin |
(a) Intensities after background subtraction and global normalization between replicates and within each slide are shown.
(b) The mean ratios of the 16 flavonoid 3' hydroxylase cDNAs differ significantly from a mean of 2.0 in a t-test (p < 0.0001).
(c) The average ratio of both slides is calculated as follows: (XB22A Rep 1 intensity + XB22A Rep 2 intensity) / (37609 Rep 1 intensity + 37609 Rep 2 intensity)
(d) The matches for all of the functional annotations were to soybean (Glycine max) sequences except for the ribonucleoprotein homolog which was to Arabidopsis thaliana.
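The ratio columns can be reproduced from the intensity columns using the formula in footnote (c). The sketch below uses a hypothetical function name and argument names, and takes its input from the Gm-r1070-9099 row. Because the intensities in the table are rounded, recomputed ratios for other rows may differ from the printed values in the last decimal place.

```python
def expression_ratios(xb22a_rep1, l37609_rep1, xb22a_rep2, l37609_rep2):
    """Return (Rep 1 ratio, Rep 2 ratio, average ratio) for XB22A/37609.

    The average follows footnote (c): summed XB22A intensities divided by
    summed 37609 intensities, not the mean of the two per-slide ratios.
    """
    rep1 = xb22a_rep1 / l37609_rep1
    rep2 = xb22a_rep2 / l37609_rep2
    average = (xb22a_rep1 + xb22a_rep2) / (l37609_rep1 + l37609_rep2)
    return rep1, rep2, average


# Gm-r1070-9099 (albumin precursor/leginsulin), intensities from the table.
rep1, rep2, average = expression_ratios(1806, 19569, 1663, 31399)
print(f"{rep1:.3f} {rep2:.3f} {average:.3f}")  # 0.092 0.053 0.068
```

Pooling the intensities before dividing weights each slide by its signal strength, which is why the average is not simply the midpoint of the two per-slide ratios.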
Figure 4. One of the scatter plots of the log values of expression data from microarray slides hybridized with Cy3 labeled RNA from leaves and Cy5 labeled RNA from roots. Many cDNAs have differential expression above or below the two-fold level as indicated by the lines.
A selection of genes that are differentially expressed in leaves or in roots
| Clone Identification | GenBank accession no. | Average Ratio 1 | Function 2 | Annotation (BLAST hit and organism) |
| Gm-b10BB-41 | AI495218 | 0.103 | en | Rubisco ( |
| Gm-r1088-7900 | BU550654 | 0.294 | en | Light harvesting chlorophyll a/b binding protein ( |
| Gm-r1088-8981 | BU549821 | 0.261 | en | Photosystem I subunit ( |
| Gm-r1088-3538 | BU546899 | 0.365 | en | Thylakoid lumen protein ( |
| Gm-b10BB-47 | AW185639 | 0.171 | en | Plastocyanin precursor ( |
| Gm-r1088-2905 | BU546179 | 0.258 | en | Trehalose-6-phosphate phosphatase ( |
| Gm-r1088-7106 | BU548940 | 0.150 | st | Vegetative Storage Protein ( |
| Gm-r1088-5827 | BU548964 | 0.465 | df | Acidic chitinase ( |
| Gm-r1088-6724 | BU549206 | 0.217 | cmg | Putative calreticulin ( |
| Gm-r1088-3756 | BU546067 | 0.415 | cmg | Cytochrome P450 ( |
| Gm-r1088-8229 | BU550097 | 0.454 | cmg | Catalase ( |
| Gm-r1088-5243 | BU547961 | 0.143 | cmg | Putative serine carboxypeptidase II-3 precursor ( |
| Gm-r1088-1433 | BU545435 | 0.200 | cmg | H protein ( |
| Gm-b10BB-12 | AI900038 | 0.211 | cmg | F3H (Flavanone-3-Hydroxylase) ( |
| Gm-r1088-4994 | BU547986 | 0.205 | cmg | Matrix metalloproteinase MMP2 ( |
| Gm-r1088-2794 | BU547254 | 0.138 | cmg | Putative lipoic acid synthase (LIP1) ( |
| Gm-r1088-4578 | BU547484 | 0.165 | cmg | Lipid transfer protein-like protein ( |
| Gm-r1088-5321 | BU547784 | 3.806 | no | Nodulin-26 ( |
| Gm-r1088-7410 | BU550525 | 2.304 | no | MtN19 homolog ( |
| Gm-r1088-6955 | BU551008 | 4.630 | no | Similar to nodulins and lipase homolog ( |
| Gm-r1088-6384 | BU550458 | 4.431 | to | bZIP transcription factor ( |
| Gm-b10BB-11 | AI930858 | 3.008 | df | Chalcone isomerase ( |
| Gm-r1088-6204 | BU547671 | 2.541 | cmg | Putative aquaporin (tonoplast intrinsic protein) ( |
| Gm-r1088-2818 | BU546503 | 2.848 | cmg | Phosphoenolpyruvate carboxykinase ( |
| Gm-r1088-1741 | BU544616 | 2.724 | cmg | Similar to sucrose synthase ( |
| Gm-b10BB-37 | AW309104 | 6.228 | cmg | Proline-rich protein ( |
| Gm-b10BB-38 | AI442449 | 2.233 | cmg | DAD-1 (Defender Against apoptotic cell Death) ( |
| Gm-r1088-5369 | BU547794 | 8.372 | cmg | Ripening related protein ( |
| Gm-r1088-7112 | BU548943 | 6.661 | cmg | Germin-like protein ( |
| Gm-r1088-5330 | BU547868 | 4.974 | cmg | Pectinesterase (EC 3.1.1.11) precursor ( |
| Gm-r1088-6104 | BU549267 | 4.554 | cmg | Asparagine synthase (glutamine-hydrolyzing) ( |
| Gm-r1088-741 | BU544257 | 2.455 | cmg | Cationic peroxidase ( |
| Gm-b10BB-45 | AW318233 | 2.135 | cmg | Tubulin (β chain) ( |
| Gm-r1088-5315 | BU547781 | 4.334 | u | Specific tissue protein 1 ( |
| Gm-r1088-5332 | BU547869 | 4.267 | oth | Auxin-repressed protein ( |
1 The average ratios of individual values from two slides after normalization and using a dye swap procedure.
2 en – energy; st – storage; to – transcription; cmg – cell growth and maintenance; u – unknown; oth – other; df – defense; no – nodulation related
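The two-fold criterion used in the scatter plots can be applied to the average ratios in this table. The sketch below assumes that the ratio is root signal (Cy5) divided by leaf signal (Cy3); that direction is inferred from the values (photosynthesis genes fall below 1, nodulins above 1) and is not stated explicitly in the table. The function name and labels are illustrative.

```python
import math


def classify_ratio(ratio: float, fold: float = 2.0) -> str:
    """Label a root/leaf ratio using a symmetric fold-change cutoff.

    Works on the log2 scale so that two-fold up (ratio 2.0) and
    two-fold down (ratio 0.5) are treated symmetrically.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    log_ratio = math.log2(ratio)
    cutoff = math.log2(fold)
    if log_ratio >= cutoff:
        return "higher in roots"
    if log_ratio <= -cutoff:
        return "higher in leaves"
    return "below two-fold"


# Average ratios copied from the table.
examples = {"Rubisco": 0.103, "Nodulin-26": 3.806, "Acidic chitinase": 0.465}
for name, ratio in examples.items():
    print(f"{name}: {classify_ratio(ratio)}")
```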