Hong Fang, Joshua Xu, Don Ding, Scott A Jackson, Isha R Patel, Jonathan G Frye, Wen Zou, Rajesh Nayak, Steven Foley, James Chen, Zhenqiang Su, Yanbin Ye, Steve Turner, Steve Harris, Guangxu Zhou, Carl Cerniglia, Weida Tong.
Abstract
BACKGROUND: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed a genomics tool, ArrayTrack™, which provides extensive functionalities to manage, analyze, and interpret genomic data for mammalian species. ArrayTrack™ has been widely adopted by the research community and used for pharmacogenomics data review in the FDA's Voluntary Genomics Data Submission program.
Year: 2010 PMID: 20946615 PMCID: PMC3026378 DOI: 10.1186/1471-2105-11-S6-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Screenshot of ArrayTrack. Among other functionalities, the center spreadsheet displays annotations for all the genes contained in the Microbial Library. The panel on the left can be used to search for a specific list of genes simultaneously.
Figure 2. Flag-based hierarchical clustering analysis of all 41 samples in the FDA-ECSG dataset. Flag-based hierarchical clustering analysis (HCA) of the FDA-ECSG data using a subset of 658 genes known to be associated with the E. coli 536 strain. The green and red colors within the HCA denote absent and present genes, respectively. The bacterial isolates colored in blue, purple, and gold denote the E. coli 536_EC1381, E. coli F11_EC1519, and E. coli CFT073_EC1521 samples, respectively. Their strain-specific genes are colored accordingly.
Figure 3. Flag concordance heat map of all 41 samples. Heat map created using ArrayTrack™'s default settings. The samples are reorganized using a clustering algorithm that places similar samples together. Red areas within the heat map indicate high similarity. The E. coli O157:H7 samples are circled in red, while the E. coli 536_EC1381, E. coli F11_EC1519, and E. coli CFT073_EC1521 samples are circled in green. Both groups show high within-group similarity values.
Figure 4. Two mixed scatter plots from the FDA-ECSG dataset. Mixed scatter plot A compares E. coli 536_EC1381 vs. E. coli F11_EC1519, and B compares E. coli 536_EC1381 vs. Shigella sonnei 53G_SH20009. Each gene is plotted according to its log2 intensity value in each sample and color coded by its absent/marginal/present calls, following the key in the upper left: pink genes are considered present in the E. coli 536_EC1381 sample and absent/marginal in the other; yellow genes are absent in the E. coli 536_EC1381 sample and present in the other; blue genes are absent/marginal in both; and green genes are present in both.
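The color key in the Figure 4 caption can be read as a small decision rule over a gene's two flag calls. The sketch below is illustrative only and is not ArrayTrack code: the function name `scatter_color` and the single-letter flags "P", "M", and "A" (present, marginal, absent) are assumptions. The caption does not say how a gene that is marginal in E. coli 536_EC1381 but present in the other sample is colored; the sketch assumes marginal is grouped with absent in that case.

```python
def scatter_color(ref_flag: str, other_flag: str) -> str:
    """Return the Figure 4 color for one gene.

    ref_flag is the call in the reference sample (E. coli 536_EC1381),
    other_flag is the call in the compared sample. Flags are assumed to be
    "P" (present), "M" (marginal), or "A" (absent).
    """
    for flag in (ref_flag, other_flag):
        if flag not in ("P", "M", "A"):
            raise ValueError(f"unknown flag: {flag!r}")

    ref_present = ref_flag == "P"
    other_present = other_flag == "P"

    if ref_present and other_present:
        return "green"   # present in both samples
    if ref_present:
        return "pink"    # present in reference, absent/marginal in the other
    if other_present:
        # Assumption: a marginal reference call is treated like an absent one.
        return "yellow"  # absent in reference, present in the other
    return "blue"        # absent/marginal in both samples
```

For example, `scatter_color("P", "M")` gives "pink" and `scatter_color("A", "M")` gives "blue".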
Figure 5. Flag concordance heat map of 34 Salmonella isolates. Red cells indicate a high percentage of matching gene absent/present calls, while blue cells indicate a low percentage of matching gene calls. The gold outline indicates a subset of 20 isolates that show relatively high similarity to each other.
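The quantity behind each heat map cell, as described in the caption, is the percentage of genes whose absent/present calls match between two isolates. A minimal sketch of that calculation follows. It is not ArrayTrack's implementation; the function names and the input layout (one list of calls per isolate, in the same gene order) are assumptions made for illustration.

```python
from typing import Dict, Mapping, Sequence, Tuple


def flag_concordance(calls_a: Sequence[str], calls_b: Sequence[str]) -> float:
    """Percentage of genes with identical calls in two isolates."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both isolates must be called on the same genes")
    if not calls_a:
        raise ValueError("at least one gene is required")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)


def concordance_matrix(
    samples: Mapping[str, Sequence[str]]
) -> Dict[Tuple[str, str], float]:
    """Pairwise concordance for every ordered pair of isolates."""
    names = list(samples)
    return {
        (x, y): flag_concordance(samples[x], samples[y])
        for x in names
        for y in names
    }
```

The resulting matrix is symmetric with 100 on the diagonal, so a heat map drawn from it would show each isolate as fully concordant with itself.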
Figure 6. Hierarchical clustering analysis of 34 Salmonella isolates. Flag-based HCA of the 34 Salmonella isolates. To identify the dissimilarity among isolates, only those genes (297) that showed differences in absent and present calls between the isolates were used. The green and red colors indicate absent and present calls, respectively. The isolates with a "Rough" serotype are colored in blue, and their specific gene cluster is outlined in blue.
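The gene selection step described in the Figure 6 caption, keeping only genes whose calls differ between isolates, might be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the function name `variable_genes` and the nested mapping layout (gene to isolate to call) are hypothetical, and any difference in call is treated as qualifying.

```python
from typing import Dict, List, Mapping


def variable_genes(calls: Mapping[str, Mapping[str, str]]) -> List[str]:
    """Return genes whose call is not identical across all isolates.

    calls maps gene name -> {isolate name -> call}, for example "P" or "A".
    Genes with the same call in every isolate carry no information about
    dissimilarity and are dropped before clustering.
    """
    return [
        gene
        for gene, by_isolate in calls.items()
        if len(set(by_isolate.values())) > 1
    ]


# Toy input with hypothetical gene and isolate names.
example: Dict[str, Dict[str, str]] = {
    "geneA": {"iso1": "P", "iso2": "P", "iso3": "P"},
    "geneB": {"iso1": "P", "iso2": "A", "iso3": "P"},
    "geneC": {"iso1": "A", "iso2": "A", "iso3": "A"},
}
selected = variable_genes(example)
```

In this toy input only `geneB` is retained, since it is the only gene with differing calls.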