
OligoRAP - an Oligo Re-Annotation Pipeline to improve annotation and estimate target specificity.

Pieter B T Neerincx, Han Rauwerda, Haisheng Nie, Martien A M Groenen, Timo M Breit, Jack A M Leunissen.

Abstract

BACKGROUND: High throughput gene expression studies using oligonucleotide microarrays depend on the specificity of each oligonucleotide (oligo or probe) for its target gene. However, target-specific probes can only be designed when a reference genome of the species at hand has been completely sequenced, when this genome has been completely annotated and when the genetic variation of the sampled individuals is completely known. Unfortunately there is not a single species for which such a complete data set is available. Therefore, it is important that probe annotation can be updated frequently for optimal interpretation of microarray experiments.
RESULTS: In this paper we present OligoRAP, a pipeline to automatically update the annotation of oligo libraries and estimate oligo target specificity. OligoRAP uses a reference genome assembly with Ensembl and Entrez Gene annotation supplemented with a set of unmapped transcripts derived from RefSeq and UniGene to handle assembly gaps. OligoRAP produces alignments of each oligo with the reference assembly as well as with unmapped transcripts. These alignments are re-mapped to the annotation sources, which results in a concise, as complete as possible and up-to-date annotation of the oligo library. The building blocks of this pipeline are BioMoby web services, creating a highly modular and distributed system with a robust, remote programmatic interface. OligoRAP was used to update the annotation for a subset of 791 oligos from the ARK-Genomics 20 K chicken array, which were selected as starting material for the oligo annotation session of the EADGENE/SABRE Post-analysis workshop. Based on the updated annotation, about one third of these oligos are problematic with regard to target specificity. In addition, the accession numbers or ids the oligos were originally designed for no longer exist in the updated annotation for almost half of the oligos.
CONCLUSION: As microarrays are designed on incomplete data, it is important to update probe annotation and check target specificity regularly. OligoRAP provides both and, due to its design based on BioMoby web services, it can easily be embedded as an oligo annotation engine in customised applications for microarray data analysis. The dramatic difference in updated annotation and target specificity for the ARK-Genomics 20 K chicken array as compared to the original data emphasises the need for regular updates.


Year: 2009 PMID: 19615117 PMCID: PMC2712747 DOI: 10.1186/1753-6561-3-S4-S4

Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561

Background

DNA microarray technology has evolved rapidly to become the most popular platform for high throughput gene expression analysis, as it allows biologists to measure the expression of entire transcriptomes at relatively high speed and low cost. This makes microarrays ideal for applications like sample clustering/fingerprinting, genome annotation, detection of differential gene expression, detection of polymorphisms and re-sequencing [1,2]. Microarrays contain oligonucleotides (probes) that can hybridise with the labelled reverse complement of mRNA. Since the probes are immobilised on the surface of an array and it is known which probes are located where on the array, signal at a certain spot can be used as a measure for gene expression. This requires that probes are unique for their target genes and hence optimal microarray design requires 1) a completely sequenced reference genome, 2) complete annotation for this reference genome to know what parts may be expressed and 3) complete knowledge about the natural variation amongst the sampled individuals. Unfortunately there is currently not a single species for which such complete information is available. Although some reference genomes are now close to completion, annotation of these reference genomes as well as information on how individuals differ from these reference genomes is far from complete. Hence, microarray design is currently sub-optimal even for species with a rather complete reference genome. Probe design based on incomplete or erroneous data can lead to serious problems like non-specific probes causing cross hybridisation, orphan probes designed for non-existing targets, missing probes and misleading probes due to erroneous annotation. Therefore, it is important to update the annotation for arrays regularly to improve the functional annotation of the targets as well as the reliability of probe-target assignments.
Several tools have been developed for this purpose [3-12], but these provide either limited annotation, require complicated local installations with many dependencies, do not scale well or do not support our species of interest. We have developed OligoRAP (Oligo ReAnnotation Pipeline) to overcome these issues.

Implementation

The pipeline consists of 5 steps: I. Convert oligo library data into BioMoby objects, II. Align oligos with a reference genome assembly and with a set of unmapped transcripts (UMTs), III. Analyse oligo annotation, IV. Analyse oligo quality and V. Make summary charts (see Figure 1). Implementation details are described and illustrated in Additional files 1, 2, 3, 4, 5, 6. In this section we will only focus on the key advantages of OligoRAP.

Figure 1

Summarizing OligoRAP flowchart. Blue blocks represent user input, green blocks databases, pink blocks output and finally orange blocks represent one or more BioMoby web services. For a more detailed description see Additional files 1, 2, 3, 4, 5, 6.

Firstly, OligoRAP does not rely solely on a reference genome or solely on transcripts (or sequences derived thereof), but uses both where possible. For the genome OligoRAP uses reference assemblies and annotation as provided by the Ensembl [13] project. Ensembl was chosen as primary annotation source, because it is the largest and richest resource of its kind with support for most popular model species in the animal kingdom. In addition to reference assemblies OligoRAP uses a set of unmapped transcripts (UMTs) to get a more complete picture. The UMT set contains RefSeq [14] and UniGene [14] entries, which failed to map to the reference assembly. Where available annotation derived from Ensembl (for hits on the genome) and from RefSeq or UniGene (for hits on UMTs) can be expanded with links to Entrez Gene [14] and GO [15]. The combination of reference genome supplemented with UMTs provides optimally complete annotation for well-annotated species whilst keeping redundancy at a minimum. At the same time this strategy is flexible enough to support less well-annotated species even if there is no reference assembly available. In that case all of a species' transcripts simply become part of the UMT set.

Secondly, OligoRAP provides annotation for all hits instead of only for the best hit. This allows OligoRAP to provide not only updated annotation, but also oligo target specificity based on the amount and type of hits. OligoRAP can differentiate between primary hits (high hybridisation potential) and secondary hits (low hybridisation potential). Hybridisation potential is determined using three filters, which users can adjust based on their experimental setup. Based on their target specificity oligos are divided into six target specificity classes (TSCs): 1. Gene-specific probes with maximum signal potential, 2. Gene-specific probes with reduced signal potential, 3. Non-specific probes with maximum signal potential, 4. Non-specific probes with mixed signal potential, 5. Non-specific probes with reduced signal potential and 6. Orphan probes with background signal potential.

Finally, each of the steps is implemented as one or more web services [16], which were built using the BioMoby framework [17,18]. These web services provide remote programmatic access and can be glued together using a variety of BioMoby clients like the Taverna Workbench [19] or custom code built with the BioMoby Perl or Java framework. Using web services we created a highly customisable and modular annotation pipeline with a robust interface. This allows for OligoRAP to be embedded in microarray data analysis workflows for improved scalability without tedious, local installations suffering from complex dependencies.
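The distinction between primary and secondary hits and the six target specificity classes can be illustrated with a minimal Python sketch. It is not OligoRAP's implementation: the threshold values are placeholders, the rule that a hit must pass all three filters at once is an assumption, and the mapping from hit counts to classes is inferred from the class names (the exact definitions are given in Additional file 4). The names `Thresholds`, `Hit`, `classify_hit` and `assign_tsc` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class HitType(Enum):
    PRIMARY = "primary"        # high hybridisation potential
    SECONDARY = "secondary"    # low hybridisation potential
    BACKGROUND = "background"  # fails both threshold sets


@dataclass(frozen=True)
class Thresholds:
    """One set of cut-offs for the three filters (placeholder values only)."""
    max_mismatches: int
    min_identity: float   # percentage sequence identity
    min_contiguous: int   # longest contiguous stretch of matches


@dataclass(frozen=True)
class Hit:
    mismatches: int
    identity: float
    contiguous: int


def passes(hit: Hit, t: Thresholds) -> bool:
    # Assumption: a hit must satisfy all three filters simultaneously.
    return (hit.mismatches <= t.max_mismatches
            and hit.identity >= t.min_identity
            and hit.contiguous >= t.min_contiguous)


def classify_hit(hit: Hit, primary: Thresholds, secondary: Thresholds) -> HitType:
    if passes(hit, primary):
        return HitType.PRIMARY
    if passes(hit, secondary):
        return HitType.SECONDARY
    return HitType.BACKGROUND


def assign_tsc(n_primary: int, n_secondary: int) -> int:
    """Map hit counts to a class number 1-6 (mapping inferred from class names)."""
    if n_primary == 0 and n_secondary == 0:
        return 6                              # orphan, background signal
    if n_secondary == 0:
        return 1 if n_primary == 1 else 3     # maximum signal only
    if n_primary == 0:
        return 2 if n_secondary == 1 else 5   # reduced signal only
    return 4                                  # non-specific, mixed signal


# Placeholder thresholds, not OligoRAP defaults.
PRIMARY = Thresholds(max_mismatches=2, min_identity=97.0, min_contiguous=40)
SECONDARY = Thresholds(max_mismatches=10, min_identity=85.0, min_contiguous=15)
```

Counting how many hits of each type an oligo has in one context (genome or transcriptome) and passing the two counts to `assign_tsc` would then give that oligo's class for that context.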

Results and discussion

OligoRAP was used to update annotation and target specificity for the subset of 791 oligos from the ARK-Genomics 20 K chicken array (see Methods in Additional file 1). Figure 2 shows how these oligos are distributed over OligoRAP's target specificity classes (TSCs), with transcriptome-based target specificity (TbTS) in Figure 2A and genome-based target specificity (GbTS) in Figure 2B.

Figure 2

Distribution of Oligos over Target Specificity Classes (TSCs). Distribution of the 791 oligos selected for the workshop over the 6 TSCs with transcriptome-based (A) and genome-based target specificity (B). The status of the link between the oligo and the accession number/identifier it was originally designed for is indicated by a tint difference in the colour for TSC 1, 3 and 4: accession/id still present in the annotation, hence "target unchanged" (dark tint) or accession/id absent, hence "target changed" (light tint). For TSC 2, 5 and 6 the target status is always "changed".

Transcriptome-based versus genome-based target specificity

Until recently the transcriptome of higher eukaryotes was thought to comprise only a very small subset of the genome. For example, in Ensembl 50 less than 5% of the chicken genome is annotated as exon. Since only potentially expressed sequences can hybridise to probes on a microarray, most oligo design and annotation efforts have focused on known and/or predicted transcripts without taking the rest of the genome into account. Apart from a few structural elements like the centromeres and telomeres, it is still not clear what the function of the other 95% or more of the DNA is, but evidence is slowly accumulating that the size of the transcriptome has been vastly underestimated. In particular, the pilot phase of the ENCODE project showed that the human "genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts" [20]. It remains unclear whether all these transcripts are biologically functional or whether they just represent noise, but it is clear that all transcripts can potentially hybridise with the oligos on microarrays. Therefore it is probably more appropriate to evaluate target specificity in the context of the entire genome than in the context of what is currently annotated as transcriptome. For the 791 ARK-Genomics chicken oligos, the proportion of gene-specific oligos differs by only 2.3 percentage points between TbTS and GbTS: 69.5% and 67.2%, respectively. Hence taking the entire genome into account, rather than only the transcriptome, does not lead to a dramatic decrease in gene-specific probes. Unfortunately, roughly one third of the probes are not gene-specific. For these problematic probes the TbTS and GbTS pictures look quite different.

Annotation quality

For most of the oligos it is extremely difficult to verify their predicted target specificity; the orphan oligos of TSC 6 are the exception. The 791 oligos selected as starting material for this EADGENE/SABRE workshop were picked because they show a high differential signal on the microarrays. Hence these oligos clearly bind labelled cDNA derived from one or more target genes, yet OligoRAP classifies 3.5% and 16.1% of the oligos as orphans with GbTS and TbTS, respectively. These numbers indicate that OligoRAP's TSC assignments are currently more an indicator of the relatively immature status of the chicken genome assembly and its annotation than of target specificity. Furthermore, for almost half of the oligos, the sequence identifier they were originally designed for is no longer present in their updated annotation, which is indicated with "target changed" in Figure 2. The fact that these identifiers no longer link to these oligos does not necessarily mean that the oligo no longer represents expression of the same gene as before, but it does indicate at least major changes in the annotation. On the other hand, annotation associated with certain identifiers may have received considerable "minor" updates that kept the sequence identifier intact. Hence, the large number of oligos with changed targets is still an underestimate of the total amount of changed annotation.

Future work

Although the ENCODE pilot study covered only approximately 1% of the human genome, it is clear that our view of the transcriptome will change dramatically over the next years. This will have a big impact on oligo annotation and target specificity, making it more important than ever to be able to update oligo annotation quickly and regularly. In addition to regular updates of the data, annotation pipelines like OligoRAP will themselves need to be updated to adapt the annotation strategies to our changing insights into gene expression.

Conclusion

Microarray probes are designed on incomplete data. Therefore it is important to update probe annotation and estimate target specificity regularly. OligoRAP provides such functionality for Ensembl species and, due to its design based on BioMoby web services, can easily be embedded in customised applications for microarray data analysis. The rather high number of oligos with changed targets shows the importance of updated annotation and reflects the limited amount and quality of the annotation available at the time the ARK-Genomics 20 K chicken array was designed.

Further information

ZIP-archive containing the final results of the OligoRAP pipeline run as well as all intermediate results. See included README for details.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

PBTN designed and programmed the pipeline of web services and drafted the manuscript. HR conceived the pipeline, participated in its design and helped to improve the manuscript. HN helped with data analysis for debugging and helped to improve the manuscript. TMB, MAMG and JAML secured funding, managed the project and helped to improve the manuscript. All authors read and approved the final manuscript.

Additional file 1

Implementation details, Availability & Requirements and Methods. Text describing details with regard to Implementation, Availability & Requirements and Methods.

Additional file 2

Detailed flowchart. Figure in PDF format. (A) OligoRAP components. User inputs are in blue, databases in green and results in orange. Yellow blocks represent a single synchronous web service or a set of two asynchronous services for a specific task (one service for job submission and one for requesting a job's status). Some steps are executed multiple times. BLAT, BLAST, Concatenate, Analyse Annotation and Merge Hits and Analyse Quality are executed multiple times for multiple chunks as indicated by sets of three connecting lines starting with a filled circle (●). The Create Chart step is executed multiple times for different inputs (not chunks) as indicated by a set of two connecting lines starting with a filled square (■). (B) Some examples of how OligoRAP can be extended or linked to downstream analyses tools.

Additional file 3

Relationships between filter thresholds, primary & secondary hits and estimated signal intensity. Figure in PDF format. Primary hits (green) represent (near) perfect alignments of oligos with their targets. Secondary hits (orange) are defined as worse than primary hits, but still capable of generating signal above background. Relative signal intensity is shown on the vertical axis and the 3 filters – mismatches, sequence identity and longest contiguous stretch – on 3 horizontal axes. Signal intensity drops as the amount of mismatches increases and as the percentage sequence identity or the length of the longest contiguous stretch decreases. Estimated signal intensity above the primary hit threshold (green) is defined as "maximum signal". Estimated signal below the primary and above the secondary hit threshold (orange) is defined as "reduced". Finally estimated signal below the secondary hit threshold is defined as "background signal".

Additional file 4

Overview of OligoRAP's six target specificity classes, which are defined by the amount of primary and secondary hits. Figure in PDF format showing how target specificity classes are defined based on hits. Primary hits (green) represent (near) perfect alignments of oligos with their targets. Secondary hits (orange) are defined as worse than primary hits, but still capable of generating signal above background. Classes are named after the combination of probe type (gene-specific, non-specific or orphan) and estimated potential signal (maximum, reduced, mixed or background).

Additional file 5

OligoQualityAnalyser example output for Compugen oligo CGEN-MOUSE_30000003_1. Figure in PDF format showing an example of OligoQualityAnalyser output with an OligoQualityRecord (blue) containing 2 hits/alignments (green) and 2 target specificity assignments (orange). The first hit overlaps with Ensembl features resulting in annotation in the form of cross-references (purple), while the second hit targets 'intergenic' space resulting in a lack of cross-references. Each target specificity block contains the oligo's specificity for two contexts – genome and transcriptome – and refers by id attribute to the filter settings (thresholds) described elsewhere in the XML (not shown). The Cigar Like Line (CLL) is derived from the Ensembl Cigar line [21] and is used to store alignment details (matches, mismatches, insertions, deletions & intron gaps) in a compact string. See Additional file 6 for detailed CLL examples. Together with subject sequence accession/ID, start, and stop, the CLL provides all information necessary to reconstruct the alignments. It can be used for example to create UCSC "custom tracks" [22,23] for visualization in the UCSC or Ensembl genome browsers.
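As a rough illustration of the custom-track use case, the sketch below converts hit coordinates into BED lines, one of the formats accepted for UCSC custom tracks. It assumes that hit start and stop are 1-based and inclusive (BED uses 0-based, half-open intervals); the coordinate convention of the OligoRAP output is not stated here, so that is an assumption. The function names, the oligo name and the coordinates are hypothetical.

```python
def hit_to_bed_line(chrom: str, start: int, stop: int, name: str,
                    strand: str, score: int = 0) -> str:
    """Format one hit as a six-column BED line.

    Assumes 1-based inclusive start/stop, converted here to the
    0-based, half-open interval that BED expects.
    """
    if start < 1 or stop < start:
        raise ValueError(f"invalid interval {start}-{stop}")
    if strand not in ("+", "-"):
        raise ValueError(f"invalid strand {strand!r}")
    fields = [chrom, str(start - 1), str(stop), name, str(score), strand]
    return "\t".join(fields)


def custom_track(hits, track_name: str) -> str:
    """Build the text of a custom track: a track line plus one BED line per hit.

    `hits` is an iterable of (chrom, start, stop, name, strand) tuples.
    """
    header = f'track name="{track_name}" description="Oligo hits"'
    lines = [header] + [hit_to_bed_line(*hit) for hit in hits]
    return "\n".join(lines) + "\n"


# Hypothetical example hit; not taken from the OligoRAP results.
example = custom_track([("chr1", 1001, 1070, "oligo_1", "+")], "oligo_hits")
```

Chromosome naming differs between browsers (for example with or without a `chr` prefix), so names may need to be adapted before upload.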

Additional file 6

Detailed CLL examples. Figure in PDF format showing five example alignments with their accompanying cigar like lines (CLLs). A CLL is a compact way to represent the alignment of a first sequence (query) with a second one (subject/DB). In these examples the single stranded oligo is the query and the double stranded DNA the subject. Example with matches and substitutions of an oligo hit on the forward DNA strand (1). Similar example of an oligo hit on the reverse DNA strand (2). Note that the CLL describes the alignment from the perspective of the oligo in terms of insertions and deletions, but is always read from left to right with the forward strand of the subject written from 5' on the left to 3' on the right side. Hence in the case of example 2 the CLL corresponds to 3' on the left to 5' on the right for the oligo sequence. Examples of insertions & deletions (3) and of an intron gap (4). In this context introns are special cases of deletions and usually the result of merging multiple smaller hits into one larger alignment. Example of mixed case nucleotides (5): a number followed by 2 or more characters (m/s/i/d/n) indicates that this number of nucleotides can be a mix of the corresponding classes. In this case there are 25 "ns" nucleotides, which corresponds to a mix of substitutions with an intron gap. Due to the substitutions it is not possible to determine exactly where the intron gap starts and ends in the oligo sequence. Hence alignments corresponding to 7m5s20n9m, 7m3s20n2s9m (shown) and 7m20n5s9m can all be written as 7m25ns9m.
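The CLL strings quoted above can be tokenised with a few lines of Python. This is a minimal sketch, not OligoRAP code: it assumes a CLL is a plain concatenation of tokens, each a count followed by one or more of the class letters m, s, i, d and n, which is the grammar suggested by the examples. The function names `parse_cll` and `total_length` are hypothetical.

```python
import re

# One token: a count followed by one or more class letters (m/s/i/d/n).
_TOKEN = re.compile(r"(\d+)([msidn]+)")


def parse_cll(cll: str) -> list[tuple[int, str]]:
    """Split a CLL into (count, classes) tokens, rejecting unparsable input."""
    tokens = []
    pos = 0
    for match in _TOKEN.finditer(cll):
        if match.start() != pos:
            # Characters were skipped, so the string is not a valid CLL.
            raise ValueError(f"unexpected text at position {pos} in {cll!r}")
        tokens.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if pos != len(cll):
        raise ValueError(f"unexpected text at position {pos} in {cll!r}")
    return tokens


def total_length(cll: str) -> int:
    """Sum of all counts in a CLL, regardless of class."""
    return sum(count for count, _ in parse_cll(cll))


tokens = parse_cll("7m25ns9m")
# tokens == [(7, "m"), (25, "ns"), (9, "m")]
```

All four CLLs from example 5 (7m5s20n9m, 7m3s20n2s9m, 7m20n5s9m and 7m25ns9m) sum to the same total of 41 positions, consistent with the mixed token 25ns standing in for any of the three explicit arrangements of 5 substitutions and a 20-position intron gap.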

References (22 in total)

1. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors: M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal: Nat Genet Date: 2000-05 Impact factor: 38.330

2. Taverna: a tool for the composition and enactment of bioinformatics workflows.

Authors: Tom Oinn; Matthew Addis; Justin Ferris; Darren Marvin; Martin Senger; Mark Greenwood; Tim Carver; Kevin Glover; Matthew R Pocock; Anil Wipat; Peter Li
Journal: Bioinformatics Date: 2004-06-16 Impact factor: 6.937

3. Evolution of web services in bioinformatics. (Review)

Authors: Pieter B T Neerincx; Jack A M Leunissen
Journal: Brief Bioinform Date: 2005-06 Impact factor: 11.622

4. Microarrays: an overview. (Review)

Authors: Norman H Lee; Alexander I Saeed
Journal: Methods Mol Biol Date: 2007

5. AffyProbeMiner: a web resource for computing or retrieving accurately redefined Affymetrix probe sets.

Authors: Hongfang Liu; Barry R Zeeberg; Gang Qu; A Gunes Koru; Alessandro Ferrucci; Ari Kahn; Michael C Ryan; Antej Nuhanovic; Peter J Munson; William C Reinhold; David W Kane; John N Weinstein
Journal: Bioinformatics Date: 2007-07-27 Impact factor: 6.937

6. Interoperability with Moby 1.0--it's better than sharing your toothbrush! (Review)

Authors: Mark D Wilkinson; Martin Senger; Edward Kawas; Richard Bruskiewich; Jerome Gouzy; Celine Noirot; Philippe Bardou; Ambrose Ng; Dirk Haase; Enrique de Andres Saiz; Dennis Wang; Frank Gibbons; Paul M K Gordon; Christoph W Sensen; Jose Manuel Rodriguez Carrasco; José M Fernández; Lixin Shen; Matthew Links; Michael Ng; Nina Opushneva; Pieter B T Neerincx; Jack A M Leunissen; Rebecca Ernst; Simon Twigger; Bjorn Usadel; Benjamin Good; Yan Wong; Lincoln Stein; William Crosby; Johan Karlsson; Romina Royo; Iván Párraga; Sergio Ramírez; Josep Lluis Gelpi; Oswaldo Trelles; David G Pisano; Natalia Jimenez; Arnaud Kerhornou; Roman Rosset; Leire Zamacola; Joaquin Tarraga; Jaime Huerta-Cepas; Jose María Carazo; Joaquin Dopazo; Roderic Guigo; Arcadi Navarro; Modesto Orozco; Alfonso Valencia; M Gonzalo Claros; Antonio J Pérez; Jose Aldana; M Mar Rojano; Raul Fernandez-Santa Cruz; Ismael Navas; Gary Schiltz; Andrew Farmer; Damian Gessler; Heiko Schoof; Andreas Groscurth
Journal: Brief Bioinform Date: 2008-01-31 Impact factor: 11.622

7. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Authors: Ewan Birney; John A Stamatoyannopoulos; Anindya Dutta; Roderic Guigó; Thomas R Gingeras; Elliott H Margulies; Zhiping Weng; Michael Snyder; Emmanouil T Dermitzakis; Robert E Thurman; Michael S Kuehn; Christopher M Taylor; Shane Neph; Christoph M Koch; Saurabh Asthana; Ankit Malhotra; Ivan Adzhubei; Jason A Greenbaum; Robert M Andrews; Paul Flicek; Patrick J Boyle; Hua Cao; Nigel P Carter; Gayle K Clelland; Sean Davis; Nathan Day; Pawandeep Dhami; Shane C Dillon; Michael O Dorschner; Heike Fiegler; Paul G Giresi; Jeff Goldy; Michael Hawrylycz; Andrew Haydock; Richard Humbert; Keith D James; Brett E Johnson; Ericka M Johnson; Tristan T Frum; Elizabeth R Rosenzweig; Neerja Karnani; Kirsten Lee; Gregory C Lefebvre; Patrick A Navas; Fidencio Neri; Stephen C J Parker; Peter J Sabo; Richard Sandstrom; Anthony Shafer; David Vetrie; Molly Weaver; Sarah Wilcox; Man Yu; Francis S Collins; Job Dekker; Jason D Lieb; Thomas D Tullius; Gregory E Crawford; Shamil Sunyaev; William S Noble; Ian Dunham; France Denoeud; Alexandre Reymond; Philipp Kapranov; Joel Rozowsky; Deyou Zheng; Robert Castelo; Adam Frankish; Jennifer Harrow; Srinka Ghosh; Albin Sandelin; Ivo L Hofacker; Robert Baertsch; Damian Keefe; Sujit Dike; Jill Cheng; Heather A Hirsch; Edward A Sekinger; Julien Lagarde; Josep F Abril; Atif Shahab; Christoph Flamm; Claudia Fried; Jörg Hackermüller; Jana Hertel; Manja Lindemeyer; Kristin Missal; Andrea Tanzer; Stefan Washietl; Jan Korbel; Olof Emanuelsson; Jakob S Pedersen; Nancy Holroyd; Ruth Taylor; David Swarbreck; Nicholas Matthews; Mark C Dickson; Daryl J Thomas; Matthew T Weirauch; James Gilbert; Jorg Drenkow; Ian Bell; XiaoDong Zhao; K G Srinivasan; Wing-Kin Sung; Hong Sain Ooi; Kuo Ping Chiu; Sylvain Foissac; Tyler Alioto; Michael Brent; Lior Pachter; Michael L Tress; Alfonso Valencia; Siew Woh Choo; Chiou Yu Choo; Catherine Ucla; Caroline Manzano; Carine Wyss; Evelyn Cheung; Taane G Clark; James B Brown; Madhavan Ganesh; Sandeep Patel; Hari Tammana; 
Jacqueline Chrast; Charlotte N Henrichsen; Chikatoshi Kai; Jun Kawai; Ugrappa Nagalakshmi; Jiaqian Wu; Zheng Lian; Jin Lian; Peter Newburger; Xueqing Zhang; Peter Bickel; John S Mattick; Piero Carninci; Yoshihide Hayashizaki; Sherman Weissman; Tim Hubbard; Richard M Myers; Jane Rogers; Peter F Stadler; Todd M Lowe; Chia-Lin Wei; Yijun Ruan; Kevin Struhl; Mark Gerstein; Stylianos E Antonarakis; Yutao Fu; Eric D Green; Ulaş Karaöz; Adam Siepel; James Taylor; Laura A Liefer; Kris A Wetterstrand; Peter J Good; Elise A Feingold; Mark S Guyer; Gregory M Cooper; George Asimenos; Colin N Dewey; Minmei Hou; Sergey Nikolaev; Juan I Montoya-Burgos; Ari Löytynoja; Simon Whelan; Fabio Pardi; Tim Massingham; Haiyan Huang; Nancy R Zhang; Ian Holmes; James C Mullikin; Abel Ureta-Vidal; Benedict Paten; Michael Seringhaus; Deanna Church; Kate Rosenbloom; W James Kent; Eric A Stone; Serafim Batzoglou; Nick Goldman; Ross C Hardison; David Haussler; Webb Miller; Arend Sidow; Nathan D Trinklein; Zhengdong D Zhang; Leah Barrera; Rhona Stuart; David C King; Adam Ameur; Stefan Enroth; Mark C Bieda; Jonghwan Kim; Akshay A Bhinge; Nan Jiang; Jun Liu; Fei Yao; Vinsensius B Vega; Charlie W H Lee; Patrick Ng; Atif Shahab; Annie Yang; Zarmik Moqtaderi; Zhou Zhu; Xiaoqin Xu; Sharon Squazzo; Matthew J Oberley; David Inman; Michael A Singer; Todd A Richmond; Kyle J Munn; Alvaro Rada-Iglesias; Ola Wallerman; Jan Komorowski; Joanna C Fowler; Phillippe Couttet; Alexander W Bruce; Oliver M Dovey; Peter D Ellis; Cordelia F Langford; David A Nix; Ghia Euskirchen; Stephen Hartman; Alexander E Urban; Peter Kraus; Sara Van Calcar; Nate Heintzman; Tae Hoon Kim; Kun Wang; Chunxu Qu; Gary Hon; Rosa Luna; Christopher K Glass; M Geoff Rosenfeld; Shelley Force Aldred; Sara J Cooper; Anason Halees; Jane M Lin; Hennady P Shulha; Xiaoling Zhang; Mousheng Xu; Jaafar N S Haidar; Yong Yu; Yijun Ruan; Vishwanath R Iyer; Roland D Green; Claes Wadelius; Peggy J Farnham; Bing Ren; Rachel A Harte; Angie S Hinrichs; Heather 
Trumbower; Hiram Clawson; Jennifer Hillman-Jackson; Ann S Zweig; Kayla Smith; Archana Thakkapallayil; Galt Barber; Robert M Kuhn; Donna Karolchik; Lluis Armengol; Christine P Bird; Paul I W de Bakker; Andrew D Kern; Nuria Lopez-Bigas; Joel D Martin; Barbara E Stranger; Abigail Woodroffe; Eugene Davydov; Antigone Dimas; Eduardo Eyras; Ingileif B Hallgrímsdóttir; Julian Huppert; Michael C Zody; Gonçalo R Abecasis; Xavier Estivill; Gerard G Bouffard; Xiaobin Guan; Nancy F Hansen; Jacquelyn R Idol; Valerie V B Maduro; Baishali Maskeri; Jennifer C McDowell; Morgan Park; Pamela J Thomas; Alice C Young; Robert W Blakesley; Donna M Muzny; Erica Sodergren; David A Wheeler; Kim C Worley; Huaiyang Jiang; George M Weinstock; Richard A Gibbs; Tina Graves; Robert Fulton; Elaine R Mardis; Richard K Wilson; Michele Clamp; James Cuff; Sante Gnerre; David B Jaffe; Jean L Chang; Kerstin Lindblad-Toh; Eric S Lander; Maxim Koriabine; Mikhail Nefedov; Kazutoyo Osoegawa; Yuko Yoshinaga; Baoli Zhu; Pieter J de Jong
Journal: Nature Date: 2007-06-14 Impact factor: 49.962

8. The UCSC Genome Browser Database: 2008 update.

Authors: D Karolchik; R M Kuhn; R Baertsch; G P Barber; H Clawson; M Diekhans; B Giardine; R A Harte; A S Hinrichs; F Hsu; K M Kober; W Miller; J S Pedersen; A Pohl; B J Raney; B Rhead; K R Rosenbloom; K E Smith; M Stanke; A Thakkapallayil; H Trumbower; T Wang; A S Zweig; D Haussler; W J Kent
Journal: Nucleic Acids Res Date: 2007-12-17 Impact factor: 16.971

9. Database resources of the National Center for Biotechnology Information.

Authors: David L Wheeler; Tanya Barrett; Dennis A Benson; Stephen H Bryant; Kathi Canese; Vyacheslav Chetvernin; Deanna M Church; Michael Dicuccio; Ron Edgar; Scott Federhen; Michael Feolo; Lewis Y Geer; Wolfgang Helmberg; Yuri Kapustin; Oleg Khovayko; David Landsman; David J Lipman; Thomas L Madden; Donna R Maglott; Vadim Miller; James Ostell; Kim D Pruitt; Gregory D Schuler; Martin Shumway; Edwin Sequeira; Steven T Sherry; Karl Sirotkin; Alexandre Souvorov; Grigory Starchenko; Roman L Tatusov; Tatiana A Tatusova; Lukas Wagner; Eugene Yaschenko
Journal: Nucleic Acids Res Date: 2007-11-27 Impact factor: 16.971

10. Ensembl 2008.

Authors: P Flicek; B L Aken; K Beal; B Ballester; M Caccamo; Y Chen; L Clarke; G Coates; F Cunningham; T Cutts; T Down; S C Dyer; T Eyre; S Fitzgerald; J Fernandez-Banet; S Gräf; S Haider; M Hammond; R Holland; K L Howe; K Howe; N Johnson; A Jenkinson; A Kähäri; D Keefe; F Kokocinski; E Kulesha; D Lawson; I Longden; K Megy; P Meidl; B Overduin; A Parker; B Pritchard; A Prlic; S Rice; D Rios; M Schuster; I Sealy; G Slater; D Smedley; G Spudich; S Trevanion; A J Vilella; J Vogel; S White; M Wood; E Birney; T Cox; V Curwen; R Durbin; X M Fernandez-Suarez; J Herrero; T J P Hubbard; A Kasprzyk; G Proctor; J Smith; A Ureta-Vidal; S Searle
Journal: Nucleic Acids Res Date: 2007-11-13 Impact factor: 16.971
