Alex S Nord, Karen Vranizan, Whittemore Tingley, Alexander C Zambon, Kristina Hanspers, Loren G Fong, Yan Hu, Peter Bacchetti, Thomas E Ferrin, Patricia C Babbitt, Scott W Doniger, William C Skarnes, Stephen G Young, Bruce R Conklin.
Abstract
BACKGROUND: High-throughput mutagenesis of the mammalian genome is a powerful means to facilitate analysis of gene function. Gene trapping in embryonic stem cells (ESCs) is the most widely used form of insertional mutagenesis in mammals. However, the rules governing its efficiency are not fully understood, and the effects of vector design on the likelihood of gene-trapping events have not been tested on a genome-wide scale.
Year: 2007 PMID: 17637833 PMCID: PMC1910612 DOI: 10.1371/journal.pone.0000617
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Diagram of major mechanisms of gene trapping of an endogenous gene with two exons.
(A) In the SA-trap, the SA site allows trapping when the vector is inserted into any part of the gene via plasmid or viral integration. (B) The poly-A trap relies on the poly-A (pA) signal of the endogenous gene because the neomycin-resistance gene does not have its own poly-A signal. Note that the poly-A trap has its own constitutive promoter (prom). Also indicated are the splice donor (SD), splice acceptor (SA), and neomycin resistance (NeoR). The major components of each trap were excluded from this diagram to emphasize the essential elements needed to understand the trapping models. Detailed maps of each major vector type are referenced in the Methods section.
Summary of gene trap data sets
| Vector | Lines | Traps | Genes | % in Genes |
| --- | --- | --- | --- | --- |
| SA-plasmid | 8410 | 5857 | 2683 | 69.60% |
| SA-viral | 3033 | 1989 | 708 | 65.60% |
| Poly-A | 4879 | 1748 | 998 | 35.80% |
| All IGTC | 49258 | 29147 | 5788 | 59.20% |
Lines, number of cell lines in public gene trap database; Traps, number of gene-trap events mapped to a gene; Genes, total number of unique genes trapped; % in Genes, percent of gene-trap events mapped to exon/intron regions (including UTR) of known genes.
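The "% in Genes" column can be reproduced from the other columns if one assumes it is the Traps count divided by the Lines count. The table note does not spell out the denominator, so the sketch below is a consistency check under that assumption rather than the authors' stated calculation. The names `datasets` and `percent_in_genes` are illustrative.

```python
# Counts copied from the data-set summary table: (Lines, Traps).
datasets = {
    "SA-plasmid": (8410, 5857),
    "SA-viral": (3033, 1989),
    "Poly-A": (4879, 1748),
    "All IGTC": (49258, 29147),
}


def percent_in_genes(lines, traps):
    """Percent of cell lines whose trap event maps to a known gene.

    Assumption: the denominator is the Lines column; the source does not
    state this explicitly.
    """
    return 100.0 * traps / lines


# Round to one decimal place, matching the precision of the table.
percentages = {
    name: round(percent_in_genes(lines, traps), 1)
    for name, (lines, traps) in datasets.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

Under this assumption the computed values agree with the table to one decimal place, which suggests that poly-A lines map to known genes roughly half as often as SA-plasmid lines.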
Figure 2. Trapped genes by length and expression.
For each vector type, genes were plotted according to their size and level of expression in ESCs. Genes that have been trapped are shown in red. The circle size is proportional to the number of times a gene has been trapped.
Figure 3. Models of trap likelihood for gene-trap vectors.
Models of the likelihood of trapping a gene with particular length (x-axis) and expression (y-axis) values for each gene-trap event were created through an iterative process, in which outliers (P<0.001) were removed before the final model was created. Probability (z-axis) is given as events per million traps.
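The iterative idea in this caption (fit, drop outliers at P<0.001, refit) can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption and not the authors' model: the expected trap count is taken to be proportional to gene length only (the paper's models also use expression), counts are treated as Poisson, and the gene names and numbers are invented toy data.

```python
import math


def poisson_upper_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = term
    for j in range(1, k):
        term *= mu / j  # P(X = j) from P(X = j - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def fit_with_outlier_removal(genes, alpha=0.001):
    """Refit a length-proportional model until no gene is an outlier.

    genes: dict mapping gene name -> (length_bp, observed_traps).
    Returns (events per million traps for retained genes, removed outliers).
    Assumption: expected traps = total traps * gene length / total length.
    """
    kept = dict(genes)
    outliers = {}
    while True:
        total_len = sum(length for length, _ in kept.values())
        total_traps = sum(traps for _, traps in kept.values())
        flagged = [
            name
            for name, (length, traps) in kept.items()
            if poisson_upper_tail(traps, total_traps * length / total_len) < alpha
        ]
        if not flagged:
            break
        for name in flagged:
            outliers[name] = kept.pop(name)
    # Express the fitted probabilities as events per million traps.
    per_million = {
        name: 1e6 * length / total_len for name, (length, _) in kept.items()
    }
    return per_million, outliers


# Toy data (hypothetical): four well-behaved genes and one heavily trapped gene.
toy_genes = {
    "geneA": (10_000, 2),
    "geneB": (20_000, 4),
    "geneC": (30_000, 6),
    "geneD": (40_000, 8),
    "hot": (10_000, 60),
}
per_million, outliers = fit_with_outlier_removal(toy_genes)
print(per_million)
print(outliers)
```

In the toy data the gene named `hot` is removed in the first pass, after which the remaining genes fit the length-proportional model and the loop stops. Refitting after each removal matters because a single heavily trapped gene inflates the total trap count and therefore the expected count of every other gene.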
Hotspot Effects and Model Summary
| Vector | Modeled Events | Modeled Genes | Expression P Value | Length P Value | Explained Deviance | Hotspot Genes | Traps in Hotspots | % Total Traps |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SA-plasmid | 3513 | 1545 | <0.0001 | <0.0001 | 34% | 26 | 366 | 10.42% |
| SA-retroviral | 1187 | 400 | <0.0001 | <0.0001 | 19% | 18 | 358 | 30.16% |
| Poly-A | 805 | 442 | 0.013 | <0.0001 | 6% | 9 | 170 | 21.12% |
Modeled events and genes represent the number of trap events and unique trapped genes considered in the modeling process. P values for expression and length represent likelihood ratio significance tests. Explained deviance is analogous to the percent of the variance that is explained in a linear regression model. Hotspots are reported as the number of genes that fell outside the hotspot cut-off, the number of trap events in the hotspot gene set, and the percent of modeled traps that fall in hotspot genes.
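Two quantities in this table are simple ratios and can be sketched directly. The "% Total Traps" column matches Traps in Hotspots divided by Modeled Events, as the note describes. For explained deviance, the source gives no formula, so the sketch assumes the common definition of one minus the ratio of residual to null deviance. The deviance values in the example call are hypothetical and are not taken from the paper.

```python
# Counts copied from the table: (Modeled Events, Traps in Hotspots).
hotspot_rows = {
    "SA-plasmid": (3513, 366),
    "SA-retroviral": (1187, 358),
    "Poly-A": (805, 170),
}


def hotspot_share(modeled_events, traps_in_hotspots):
    """Percent of modeled trap events that fall in hotspot genes."""
    return 100.0 * traps_in_hotspots / modeled_events


def explained_deviance(null_deviance, residual_deviance):
    """Fraction of deviance explained by a fitted model.

    Assumption: the usual definition 1 - residual / null, which plays the
    role that R-squared plays in linear regression.
    """
    return 1.0 - residual_deviance / null_deviance


shares = {
    name: round(hotspot_share(events, hot), 2)
    for name, (events, hot) in hotspot_rows.items()
}
for name, share in shares.items():
    print(f"{name}: {share:.2f}% of modeled traps are in hotspot genes")

# Hypothetical deviances chosen only to illustrate the formula.
example = explained_deviance(null_deviance=200.0, residual_deviance=132.0)
print(f"explained deviance: {example:.0%}")
```

The hotspot shares show that a small number of genes account for a large part of the retroviral and poly-A data: 18 genes hold about 30% of the modeled SA-retroviral traps, compared with about 10% in 26 genes for SA-plasmid.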