Komal S Sandhu1, Vamsi Veeramachaneni1, Xiang Yao2, Alex Nie3, Peter Lord4, Dhammika Amaratunga5, Michael K McMillian6, Geert R Verheyen7. 1. Strand Life Sciences Pvt Ltd, Bangalore, India. 2. Data Sciences, R&D IT, Janssen Pharmaceutical Research & Development, LLC, 3120 Merryfield Row, San Diego, CA 92121, USA. 3. Special Counsel, Patent Attorney, Sheppard, Mullin, Richter & Hampton LLP, 379 Lytton Ave, Palo Alto, CA 94301, USA. 4. Discotox Ltd, Hebden Bridge, West Yorkshire, UK. 5. Independent Consultant & Researcher, Bridgewater, NJ 08807, USA. 6. InvitroCue Pte Ltd, 81 Science Park Drive, Singapore 118257, Singapore. 7. Radius Group, Thomas More University College, Kleinhoefstraat 4, 2440 Geel, Belgium.
Abstract
AIM: We release the Janssen Toxicogenomics database. This rat liver gene-expression database was generated using Codelink microarrays and has been used within Janssen over the past years to derive signatures for multiple end points and to classify proprietary compounds. MATERIALS & METHODS: The release consists of gene-expression responses to 124 compounds, selected to give broad coverage of liver-active compounds. A selection of the compounds was also analyzed on Affymetrix microarrays. RESULTS: The release includes the results of an in-house pipeline that reannotates probes to Entrez gene annotations and classifies them into different confidence classes. High-confidence, unambiguously annotated probes were used to create gene-level data, which served as the starting point for cross-platform comparisons. Connectivity map-based similarity methods show excellent agreement between Codelink and Affymetrix runs of the same samples. We also compared our dataset with the Japanese Toxicogenomics Project and observed reasonable agreement, especially for compounds with stronger gene signatures. We describe an R package containing the gene-level data and show how it can be used for expression-based similarity searches. CONCLUSION: When the same biological samples are run on the Affymetrix and Codelink platforms, connectivity mapping approaches show good correspondence. As expected, this correspondence is weaker when the data are compared with an independent dataset such as TG-GATE. We hope that users will incorporate this collection of gene-expression profiles into their toxicogenomics pipelines.