John E Collins, Charmain L Wright, Carol A Edwards, Matthew P Davis, James A Grinham, Charlotte G Cole, Melanie E Goward, Begoña Aguado, Meera Mallya, Younes Mokrab, Elizabeth J Huckle, David M Beare, Ian Dunham.
Abstract
We have developed a systematic approach to generating cDNA clones containing full-length open reading frames (ORFs), exploiting knowledge of gene structure from genomic sequence. Each ORF was amplified by PCR from a pool of primary cDNAs, cloned and confirmed by sequencing. We obtained clones representing 70% of genes on human chromosome 22, whereas searching available cDNA clone collections found at best 48% from a single collection and 60% for all collections combined.
Year: 2004 PMID: 15461802 PMCID: PMC545604 DOI: 10.1186/gb-2004-5-10-r84
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Table 1. Analysis of genome-wide collections. The five match columns give the number of matches to the 398 chromosome 22 ORFs at more than 95% identity*.
| cDNA collection | Total cDNAs available | Exact match | Gapped match | 5' end match | 3' end match | Internal match |
| MGC | 15,454 | 193 | 14 | 21 | 23 | 17 |
| FLJ | 25,696 | 72 | 24 | 25 | 75 | 25 |
| DKFZ | 9,271 | 25 | 10 | 3 | 49 | 16 |
| KIAA | 2,035 | 28 | 1 | 1 | 18 | 13 |
| Invitrogen | 4,361 | 16 | 0 | 61 | 1 | 17 |
| Combined | 56,817 | 240† | 25 | 27 | 39 | 14 |
*For definitions of match types see Materials and methods. Values are not significantly altered by raising the identity required to >99%. †Only 227 (57% of the total ORFs) of these clones maintain the correct reading frame at the amino acid level.
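The percentages quoted in the abstract and in the footnote can be checked against the counts with simple arithmetic. The sketch below is illustrative only: the counts come from the table and from the Figure 2 caption, but pairing each count with a particular quoted percentage (for example, reading the abstract's 70% as the 278 acceptable clones out of 398 ORFs) is an assumption, and the names `counts` and `percent_of_orfs` are hypothetical.

```python
# Counts are taken from the table and the Figure 2 caption. Which count backs
# which percentage in the abstract is an assumption of this sketch.
TOTAL_ORFS = 398

counts = {
    "MGC exact matches (best single collection)": 193,
    "combined exact matches": 240,
    "combined, correct reading frame": 227,
    "acceptable ORF clones isolated in this study": 278,
}


def percent_of_orfs(count, total=TOTAL_ORFS):
    """Return count as a percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


percentages = {label: percent_of_orfs(n) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL_ORFS} = {pct}%")
```

Under these assumptions the four ratios round to 48%, 60%, 57% and 70%, which agree with the figures quoted in the abstract and the footnote.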
Figure 1. Summary of the ORF cloning method.
Figure 2. Sequence characteristics of cloned ORFs. (a) Plot of the distribution of the 398 chromosome 22 ORFs by GC content (%) and length (bases). Closed circles are the 331 ORFs that were isolated as acceptable clones (278) or as clones with the correct ORF but currently with a problem in the sequence (53). Dotted circles are the rest of the ORFs which were not amplifiable or clonable (67). (b) Overlap of chromosome 22 ORF clones isolated here with cDNA collections. Analysis of GC content and length for 398 chromosome 22 ORFs, split according to whether the gene has been isolated only by the strategy described here (SANGER, red circles), only in the cDNA collections (OTHER, green triangles), in both (BOTH, black circles), or not at all (NOT, yellow triangles).
Figure 3. Schematic Venn diagram showing how the set of ORF clones isolated here relates to the full-length cDNA clones in current high-throughput clone collections (the 227 clones from Table 1 that maintain the correct reading frame at the amino acid level) for the 398 annotated full-length chromosome 22 ORFs. The four different classes of genes are labeled as in the text and Figure 2b.