Anda M Cornea, David W Russell.
Abstract
The effects of chromosomal position and neighboring genomic elements on gene targeting in human cells remain largely unexplored. To study these, we used a shuttle vector system in which murine leukemia virus (MLV)-based proviral targets present at different chromosomal locations and containing mutations in the neomycin phosphotransferase (neo) gene were corrected by adeno-associated virus (AAV)-mediated gene targeting. Sixteen identical target loci present in HT-1080 human sarcoma cells were all successfully corrected by gene targeting. The gene targeting frequencies varied by as much as 10-fold, and there was a clear bias for correction of one of the targets in clones containing two target sites. The targeting frequency at each site was correlated to the proximity and density of various genomic elements, and we found a significant association of higher targeting frequencies at loci near a subset of dinucleotide microsatellite repeats (r = -0.55, P < 0.05), in particular GT repeats (r = -0.87, P < 0.0001). Additionally, there was a correlation between meiotic recombination rates and targeting frequencies at the target loci (r = 0.52, P < 0.05). There was no correlation between surrounding chromosomal transcription units and targeting frequencies. Our results indicate that certain chromosomal positions are preferred sites for gene targeting in human cells.
Year: 2010 PMID: 20185563 PMCID: PMC2887962 DOI: 10.1093/nar/gkq095
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Shuttle vector rescue of targeted loci. (A) Maps of AAV targeting vector AAV2-HSN5′ containing a neo gene truncated at bp 629 and an MLV vector LHSN37Δ4O provirus containing a 4-bp deletion in neo at bp 37. The locations of the AAV inverted terminal repeats (ITRs), simian virus 40 (SV40) and Tn5 promoters, transcriptional start sites (arrows), hph and neo genes, p15A replication origin, and retrovirus long terminal repeats (LTR) are shown. The fragment used to probe Southern blots is indicated. The strategy used for recovering targeted loci is shown below. (B) Southern blot of genomic DNA from HT-1080-derived clonal cell lines containing a single copy of the LHSN37Δ4O provirus, digested with EcoRI and probed for hph sequences. The positions of size standards are shown on the left. (C) Southern blot of genomic DNA from HT-1080-derived clonal cell lines containing two copies of the LHSN37Δ4O provirus, with clone 17 as a single-copy control, digested and probed as in (B).
Table 1. Targeting frequencies and site locations in single-target clones
| Clone | Targeting frequency | Chromosomal location | Nearest RefSeq gene | Location of integration site | Distance to transcription start siteᵃ | Predicted size of hph-hybridizing fragmentᵇ |
|---|---|---|---|---|---|---|
| 14 | 2.19 × 10⁻⁴ | Chr 2: 157,621,301 | | Intergenic | −201.3 kb | 5.8 kb |
| 17 | 3.04 × 10⁻⁴ | Chr 6: 45,515,506 | | Intron 5 | +17.6 kb | 4.3 kb |
| 21 | 1.55 × 10⁻³ | Chr 12: 78,950,944 | | Intergenic | +259.5 kb | 7.2 kb |
| 24 | 2.69 × 10⁻⁴ | Chr 7: 129,782,991 | | Intron 6 | +11.1 kb | 8.6 kb |
| 27 | 1.81 × 10⁻⁴ | Chr 9: 16,326,741 | | Intergenic | −72.8 kb | 4.1 kb |
| 36 | 5.77 × 10⁻⁴ | Chr 6: 150,220,289 | | Intron 6 | −31.0 kb | 13.1 kb |
| 42 | 1.07 × 10⁻³ | Chr 2: 15,149,860 | | Intergenic | −74.6 kb | 12.6 kb |
| 45 | 1.06 × 10⁻³ | Chr 2: 207,966,780 | | Intergenic | −136.1 kb | 9.6 kb |
| 48 | 2.21 × 10⁻⁴ | Chr 14: 37,194,933 | | Intron 2 | +66.0 kb | 4.2 kb |
| 61 | 9.44 × 10⁻⁴ | Chr 19: 18,560,834 | | Exon 2 | +0.3 kb | 18.7 kb |
ᵃ Positive and negative distances indicate that the integration site is downstream or upstream, respectively, relative to the nearest RefSeq transcription start site.
ᵇ See Figure 1B for bands of corresponding sizes on Southern blots.
Table 2. Targeting frequencies and site locations in double-target clones
| Clone | Overall targeting frequency | Proportion targeted at each siteᵃ | Partial targeting frequency | Chromosomal location | Nearest RefSeq gene | Location of integration site | Distance to transcription start siteᵇ | Predicted size of hph-hybridizing fragmentᶜ |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.15 × 10⁻³ | A: 13/20 | 7.46 × 10⁻⁴ | Chr 16: 19,636,191 | | Intron 1 | −0.9 kb | 4.4 kb |
| | | B: 7/20 | 4.01 × 10⁻⁴ | Chr 9: 12,804,291 | | Intron 1 | −39.3 kb | 7.1 kb |
| 38 | 2.10 × 10⁻³ | A: 18/20 | 1.89 × 10⁻³ | Chr 3: 173,242,513 | | Intron 1 | −1.5 kb | 8.4 kb |
| | | B: 2/20 | 2.10 × 10⁻⁴ | Chr 16: 30,576,276 | | Intergenic | −6.5 kb | 10.3 kb |
| 49 | 1.32 × 10⁻³ | A: 16/21 | 1.00 × 10⁻³ | Chr 5: 153,581,833 | | Intron 1 | −31.3 kb | 5.5 kb |
| | | B: 5/21 | 3.14 × 10⁻⁴ | Chr 4: 167,577,088 | | Intergenic | +314.0 kb | 7.5 kb |
ᵃ Represents the number of rescued target plasmids from site A or B over the total analyzed for each clone.
ᵇ Positive and negative distances indicate that the integration site is downstream or upstream, respectively, relative to the nearest RefSeq transcription start site.
ᶜ See Figure 1C for bands of corresponding sizes on Southern blot.
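The partial targeting frequencies in the double-target table appear to be each clone's overall frequency apportioned by the fraction of rescued plasmids that came from each site. The Python sketch below recomputes them under that assumption and then takes the ratio of the highest to the lowest per-site frequency across all 16 sites. The relation itself and the names `single_site`, `double_site` and `partial_frequencies` are illustrative assumptions, not taken from the paper.

```python
# Single-target clone frequencies, transcribed from the single-target table.
single_site = {14: 2.19e-4, 17: 3.04e-4, 21: 1.55e-3, 24: 2.69e-4, 27: 1.81e-4,
               36: 5.77e-4, 42: 1.07e-3, 45: 1.06e-3, 48: 2.21e-4, 61: 9.44e-4}

# Double-target clones: overall frequency and rescued-plasmid counts per site.
double_site = {
    2:  (1.15e-3, {"A": 13, "B": 7}),
    38: (2.10e-3, {"A": 18, "B": 2}),
    49: (1.32e-3, {"A": 16, "B": 5}),
}


def partial_frequencies(overall, rescued):
    """Assumed relation: split the overall frequency by each site's share
    of the rescued plasmids analyzed for that clone."""
    total = sum(rescued.values())
    return {site: overall * count / total for site, count in rescued.items()}


partials = {clone: partial_frequencies(overall, rescued)
            for clone, (overall, rescued) in double_site.items()}

# Pool the 10 single-site and 6 recomputed per-site frequencies.
per_site = list(single_site.values()) + [
    freq for sites in partials.values() for freq in sites.values()
]
fold_range = max(per_site) / min(per_site)

for clone, sites in partials.items():
    for site, freq in sites.items():
        print(f"clone {clone}, site {site}: {freq:.2e}")
print(f"sites: {len(per_site)}, highest/lowest frequency: {fold_range:.1f}-fold")
```

Recomputed values can differ from the tabulated ones in the last digit because the overall frequencies are themselves rounded. The ratio across the 16 sites comes out at roughly 10, in line with the variation described in the abstract.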
Figure 2. Gene targeting frequencies of clonal cell lines. (A) Frequencies in clones containing one LHSN37Δ4O target locus infected with AAV2-HSN5′ at an MOI of 10 000 vector particles/cell. (B) Frequencies in clones containing two LHSN37Δ4O target loci infected as in (A). (C) Partial gene targeting frequencies for each target (a or b) present in cell lines containing two targets.
Figure 3. Northern blot analysis and effect of trichostatin A (TSA) on gene targeting. (A) Northern blot of total RNA extracted from the HT-1080-derived clonal cell lines containing a single copy of the LHSN37Δ4O provirus having the two lowest (clones 27 and 14) and two highest (clones 42 and 21) gene targeting frequencies. The blot was probed for neo transcripts and for GAPDH transcripts to check loading. The positions of size standards are shown on the left, and the three expected neo transcript forms (full-length, spliced or short) and corresponding sizes are indicated on the right. (B) Comparison of targeting frequencies (calculated relative to targeting frequency of clone 27) and neo transcript levels (calculated relative to full-length transcript of clone 27) for the four clones. (C) Gene targeting frequencies in four clones treated with or without 125 nM TSA from 4.5 h after infection with AAV2-HSN5′ at an MOI of 10 000 vector particles/cell until splitting.
Table 3. Correlation between targeting frequencies and distance to or density of genomic elements
| Genomic element | Distanceᵃ (bp): correlation coefficient, r | Distanceᵃ (natural log bp): correlation coefficient, r | Densityᵇ (±1 kb interval): correlation coefficient, r | Densityᵇ (±10 kb interval): correlation coefficient, r | Densityᵇ (±100 kb interval): correlation coefficient, r |
|---|---|---|---|---|---|
| RefSeq transcripts | 0.01 | −0.15 | 0.15 | −0.03 | 0.02 |
| CpG islands | −0.23 | −0.21 | 0.00 | 0.04 | −0.17 |
| Simple repeats | −0.11 | −0.13 | 0.05 | 0.39 | 0.32 |
| Microsatellites (all) | −0.33 | −0.37 | n/a | 0.31 | 0.34 |
| SINEs | −0.12 | −0.03 | −0.04 | 0.25 | 0.07 |
| LINEs | −0.07 | −0.08 | 0.17 | −0.01 | −0.05 |
| DNA transposons | 0.03 | 0.05 | −0.17 | 0.10 | 0.17 |
| LTR retrotransposons | −0.23 | −0.18 | −0.18 | 0.29 | 0.26 |
| RNA repeats | 0.21 | 0.19 | n/a | n/a | −0.28 |
ᵃ Distance was measured from bp 37 of the neo gene in the MLV target locus to the center of the nearest respective genomic element, with the exception of RefSeq transcripts, where distance was measured to the start site of the nearest RefSeq transcription unit; correlation coefficients were computed using either distance (in bp) or the natural logarithm of distance.
ᵇ Density was measured as the total number of base pairs covered by the corresponding genomic element within a given interval surrounding the target integration site.
ᶜ See Figure 4 for representative scatter plots.
ᵈ n/a denotes the absence of the specified genomic element within the given interval.
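As a rough illustration of how coefficients like those above can be obtained, the Python sketch below computes Pearson's r between targeting frequency and distance to the nearest RefSeq transcription start site, using the 16 per-site frequencies and distances listed in the single-target and double-target clone tables of this document. It assumes that Pearson's coefficient is the statistic reported and that the sign of the distance is ignored; neither is stated explicitly here. The tabulated distances are rounded and measured from the integration site rather than from bp 37 of neo, so the results need not match the RefSeq transcripts row exactly, particularly on the log scale, where sub-kilobase distances carry a lot of weight. `sites` and `pearson_r` are illustrative names.

```python
import math

# (targeting frequency, |distance to nearest RefSeq TSS| in kb) for the
# 10 single-target sites followed by the 6 sites in double-target clones.
# Assumption: the upstream/downstream sign of the distance is dropped.
sites = [
    (2.19e-4, 201.3), (3.04e-4, 17.6), (1.55e-3, 259.5), (2.69e-4, 11.1),
    (1.81e-4, 72.8), (5.77e-4, 31.0), (1.07e-3, 74.6), (1.06e-3, 136.1),
    (2.21e-4, 66.0), (9.44e-4, 0.3),
    (7.46e-4, 0.9), (4.01e-4, 39.3), (1.89e-3, 1.5), (2.10e-4, 6.5),
    (1.00e-3, 31.3), (3.14e-4, 314.0),
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


freqs = [freq for freq, _ in sites]
dist_bp = [kb * 1000 for _, kb in sites]           # kb -> bp
dist_ln_bp = [math.log(bp) for bp in dist_bp]      # natural log of bp

r_bp = pearson_r(freqs, dist_bp)
r_ln = pearson_r(freqs, dist_ln_bp)
print(f"r (bp) = {r_bp:.2f}, r (ln bp) = {r_ln:.2f}")
```

Both coefficients are small in magnitude, which agrees with the abstract's statement that targeting frequency did not correlate with surrounding transcription units.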
Figure 4. Effects of neighboring genetic elements on gene targeting. In each panel, scatter plots on the left graph the targeting frequency at each target site and the natural logarithm of the distance from bp 37 of the neo gene in each target to the center of the indicated genetic element. Scatter plots on the right graph the density (total number of base pairs covered) of each genomic element within 100 kb (open circle), 10 kb (filled triangle) or 1 kb (open square) on either side of each target locus integration site. The lack of density values within ±1 kb in some panels is due to the absence of the respective genomic element in that interval. Correlation coefficients (r) and P-values (for significant r) are to the right of the scatter plots.
Table 4. Correlation between targeting frequencies and distance to or density of dinucleotide repeats
| Dinucleotide repeatᵃ | Distanceᵇ (bp): correlation coefficient, r | Distanceᵇ (natural log bp): correlation coefficient, r | Densityᵇ (±1 kb interval): correlation coefficient, r | Densityᵇ (±10 kb interval): correlation coefficient, r | Densityᵇ (±100 kb interval): correlation coefficient, r |
|---|---|---|---|---|---|
| GT, TG, AC and CA repeats | −0.44 | −0.55 | n/a | 0.43 | 0.36 |
| GT | −0.62 | −0.87 | n/a | 0.75 | 0.85 |
| TG | 0.26 | 0.26 | n/a | −0.21 | −0.12 |
| AC | −0.19 | −0.15 | n/a | 0.06 | −0.09 |
| CA | −0.15 | −0.16 | n/a | n/a | 0.08 |
| GA, AG, TC and CT repeats | −0.20 | −0.05 | n/a | n/a | −0.02 |
| GA | −0.11 | −0.01 | n/a | n/a | n/a |
| AG | 0.00 | −0.01 | n/a | n/a | 0.16 |
| TC | −0.14 | −0.12 | n/a | n/a | 0.20 |
| CT | −0.17 | 0.00 | n/a | n/a | −0.26 |
| AT and TA repeats | 0.49 | 0.40 | n/a | −0.24 | −0.07 |
| AT | 0.39 | 0.30 | n/a | n/a | 0.44 |
| TA | 0.36 | 0.41 | n/a | −0.24 | −0.24 |
ᵃ Microsatellites GT, TG, AC, CA, GA, AG, TC, CT, AT and TA represent sequences of at least 15 perfect dinucleotide repeats.
ᵇ Distance and density were computed as in Table 3.
ᶜ See Figure 4 for representative scatter plots.
ᵈ Represents significant correlation (P < 0.05).
ᵉ n/a denotes the absence of the specified genomic element within the given interval.
ᶠ Represents significant correlation (P = 0.01).
ᵍ Represents significant correlation (P < 0.0001).
ʰ Represents significant correlation (P < 0.001).
ⁱ Represents significant correlation (P < 0.0001).
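The significance levels listed above can be sanity-checked against the coefficients. A common test for a Pearson coefficient converts r to a t statistic with n − 2 degrees of freedom. The Python sketch below assumes that test and n = 16 (the number of target sites given in the abstract); neither assumption is confirmed by this text, and sites with no element in range may reduce n for some cells. The value 2.145 is the standard two-tailed 5% critical value of Student's t for 14 degrees of freedom. `t_from_r` and `is_significant` are illustrative names.

```python
import math

N_SITES = 16       # assumption: all sixteen target loci enter each correlation
T_CRIT_05 = 2.145  # two-tailed 5% critical value of Student's t, 14 degrees of freedom


def t_from_r(r, n=N_SITES):
    """t statistic for testing a Pearson coefficient against zero (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def is_significant(r, n=N_SITES, t_crit=T_CRIT_05):
    """True if |t| exceeds the supplied critical value (which must match n - 2 df)."""
    return abs(t_from_r(r, n)) > t_crit


# Coefficients for natural-log distance, taken from the dinucleotide repeat table.
reported = {
    "GT, TG, AC and CA repeats": -0.55,
    "GT": -0.87,
    "TG": 0.26,
}

for label, r in reported.items():
    verdict = "significant" if is_significant(r) else "not significant"
    print(f"{label}: r = {r:+.2f}, t = {t_from_r(r):+.2f}, {verdict} at P < 0.05")
```

Under these assumptions the two coefficients highlighted in the abstract (−0.55 and −0.87) clear the 5% threshold and the TG coefficient does not.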
Figure 5. Correlation between gene targeting frequencies and the sex-averaged meiotic recombination rates. The scatter plot graphs meiotic recombination rates from the deCODE (open circle), Marshfield (filled triangle) and Genethon (open square) genetic maps at each provirus target site in centiMorgans/Megabase. Correlation coefficients (r) and P-values (for significant r) are to the right of the corresponding legends.