Athurva Gore, Zhe Li, Ho-Lim Fung, Jessica E Young, Suneet Agarwal, Jessica Antosiewicz-Bourget, Isabel Canto, Alessandra Giorgetti, Mason A Israel, Evangelos Kiskinis, Je-Hyuk Lee, Yuin-Han Loh, Philip D Manos, Nuria Montserrat, Athanasia D Panopoulos, Sergio Ruiz, Melissa L Wilbert, Junying Yu, Ewen F Kirkness, Juan Carlos Izpisua Belmonte, Derrick J Rossi, James A Thomson, Kevin Eggan, George Q Daley, Lawrence S B Goldstein, Kun Zhang.
Abstract
Defined transcription factors can induce epigenetic reprogramming of adult mammalian cells into induced pluripotent stem cells. Although DNA factors are integrated during some reprogramming methods, it is unknown whether the genome remains unchanged at the single nucleotide level. Here we show that 22 human induced pluripotent stem (hiPS) cell lines reprogrammed using five different methods each contained an average of five protein-coding point mutations in the regions sampled (an estimated six protein-coding point mutations per exome). The majority of these mutations were non-synonymous, nonsense or splice variants, and were enriched in genes mutated or having causative effects in cancers. At least half of these reprogramming-associated mutations pre-existed in fibroblast progenitors at low frequencies, whereas the rest occurred during or after reprogramming. Thus, hiPS cells acquire genetic modifications in addition to epigenetic modifications. Extensive genetic screening should become a standard procedure to ensure hiPS cell safety before clinical use.
Year: 2011 PMID: 21368825 PMCID: PMC3074107 DOI: 10.1038/nature09805
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962
Sequencing statistics for mutation discovery
Quality filtered sequence represents the total amount of sequencer data generated that passed the Illumina GA IIx quality filter. Number of high quality coding variants is the number of variants found with sequencing depth of at least 8 and consensus quality score of at least 30. dbSNP percentage represents the percentage of identified variants present in the dbSNP database. Shared coding region is the portion of the genome, in base pairs, that was sequenced at high depth and quality in both the iPS line and its progenitor fibroblast. The number of coding mutations lists the number of identified coding mutations and, in parentheses, a projection of the total number of mutations based on the fraction of CCDS variants (out of ~17,000 expected variants; ref. 16) successfully identified in both the hiPS line and the fibroblast.
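The thresholds and the projection described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `VariantCall` record, the function names, and in particular the scaling formula (observed count divided by the fraction of expected variants recovered) are assumptions inferred from the caption wording. Only the depth threshold (8), the quality threshold (30), and the ~17,000 expected variants come from the text.

```python
from dataclasses import dataclass

MIN_DEPTH = 8                     # depth threshold stated in the caption
MIN_CONSENSUS_QUALITY = 30        # consensus quality threshold stated in the caption
EXPECTED_CCDS_VARIANTS = 17_000   # approximate expected variant count from the caption


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record layout, used only for illustration."""
    position: str
    depth: int
    consensus_quality: int


def is_high_quality(call: VariantCall) -> bool:
    """Return True when a call meets both thresholds from the caption."""
    return (call.depth >= MIN_DEPTH
            and call.consensus_quality >= MIN_CONSENSUS_QUALITY)


def project_mutations(observed: int, shared_variants: int,
                      expected: int = EXPECTED_CCDS_VARIANTS) -> float:
    """Scale an observed mutation count up to the whole exome.

    Assumed formula: observed / (shared_variants / expected), where
    shared_variants is the number of known variants recovered in both
    the hiPS line and its fibroblast progenitor.
    """
    if shared_variants <= 0:
        raise ValueError("shared_variants must be positive")
    return observed * expected / shared_variants
```

Under this assumed formula, a line in which all expected variants were recovered would have a projection equal to its observed count, and lower recovery would scale the count up proportionally.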
| Cell Line | Exome Capture | Quality Filtered Sequence | # of High-Quality Coding Variants | dbSNP Percentage | Shared High-Quality Coding Region (bp) | # Coding Mutations (Projected) |
|---|---|---|---|---|---|---|
| CV-hiPS-F | Padlock+SeqCapEZ | 9,928,014,640 | 15,595 | 98% | 16,374,878 | 14 (15) |
| CV-hiPS-B | SeqCap EZ | 7,977,894,480 | 14,876 | 98% | 21,891,518 | 10 (12) |
| CV-Fibroblast | Padlock+SeqCapEZ | 7,586,731,600 | 15,442 | 98% | | |
| | | | | | | |
| DF-6-9-9 | Padlock+SeqCapEZ | 9,289,593,520 | 14,366 | 95% | 17,806,151 | 6 (7) |
| DF-19-11 | SeqCap EZ | 3,212,662,880 | 13,792 | 95% | 21,342,017 | 7 (9) |
| iPS4.7 | SeqCap EZ | 3,132,462,400 | 14,154 | 95% | 21,729,562 | 4 (5) |
| Foreskin | Padlock+SeqCapEZ | 8,430,654,720 | 14,819 | 95% | | |
| | | | | | | |
| PGP1-iPS | SeqCap EZ | 4,599,556,400 | 14,105 | 95% | 19,681,915 | 3 (4) |
| PGP1- | SureSelect | 3,504,437,120 | 14,781 | 95% | | |
| | | | | | | |
| dH1F-iPS8 | SeqCap EZ | 3,950,994,160 | 13,552 | 96% | 16,874,057 | 8 (10) |
| dH1F-iPS9 | SeqCap EZ | 3,945,196,800 | 14,191 | 95% | 21,536,158 | 3 (4) |
| dH1F Fibroblast | SeqCap EZ | 3,373,535,920 | 13,838 | 95% | | |
| | | | | | | |
| iPS11a | SureSelect | 1,836,303,440 | 13,845 | 95% | 18,557,098 | 4 (5) |
| iPS11b | SureSelect | 3,378,603,200 | 15,152 | 95% | 17,206,934 | 7 (8) |
| Hib11 Fibroblast | SureSelect | 5,660,864,960 | 13,579 | 95% | | |
| | | | | | | |
| iPS17a | SureSelect | 4,805,756,800 | 15,039 | 95% | 17,888,773 | 4 (5) |
| iPS17b | SureSelect | 7,129,037,520 | 15,400 | 95% | 19,902,076 | 5 (6) |
| Hib17 Fibroblast | SureSelect | 3,962,506,880 | 13,365 | 96% | | |
| | | | | | | |
| iPS29A | SureSelect | 4,112,237,360 | 13,464 | 94% | 17,328,182 | 2 (3) |
| iPS29e | SureSelect | 1,669,916,080 | 13,800 | 94% | 18,985,791 | 7 (9) |
| Hib29 Fibroblast | SureSelect | 4,388,388,320 | 14,445 | 95% | | |
| | | | | | | |
| dH1cF16-iPS1 | SeqCap EZ | 4,321,661,440 | 15,061 | 95% | 19,601,528 | 2 (2) |
| dH1cF16-iPS4 | SeqCap EZ | 4,668,085,920 | 14,958 | 95% | 23,956,732 | 6 (7) |
| dH1cF16 | SeqCap EZ | 4,178,664,160 | 14,879 | 95% | | |
| | | | | | | |
| CF-RiPS1.4 | SeqCap EZ | 4,733,743,840 | 11,344 | 96% | 21,272,233 | 1 (2) |
| CF-RiPS1.9 | SeqCap EZ | 3,143,591,760 | 13,674 | 95% | 21,165,013 | 3 (4) |
| CF Fibroblast | SeqCap EZ | 3,204,874,880 | 11,855 | 96% | | |
| | | | | | | |
| FiPS3F1 | SeqCap EZ | 3,397,397,360 | 13,333 | 94% | 20,723,620 | 3 (4) |
| FiPS4F7 | SeqCap EZ | 3,346,801,280 | 14,584 | 94% | 21,608,258 | 2 (3) |
| HFFxF | SeqCap EZ | 3,331,494,880 | 13,040 | 94% | | |
| | | | | | | |
| FiPS4F2p9 | SeqCap EZ | 4,725,258,400 | 18,033 | 92% | 25,188,054 | 7 (7) |
| FiPS4F2p40 | SeqCap EZ | 4,848,006,000 | 18,376 | 92% | 25,411,595 | 4 (4) |
| FiPS4F- | SeqCap EZ | 4,911,008,400 | 19,491 | 92% | 25,240,944 | 8 (8) |
| IMR90 | SeqCap EZ | 5,019,916,240 | 18,220 | 92% | | |
For DF-6-9-9 and FS, mutation calling was performed individually using both Padlock Probe data and hybridization capture data. Each method found five mutations, four of which were shared, leading to a total of six mutations. Padlock probe and hybridization capture have separate strengths (specificity vs. unbiased coverage); it appears these factors directly affect the ability to find separate mutations.
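The arithmetic in this note (five calls per method, four shared, six in total) is a set union. A minimal sketch follows; the mutation identifiers `m1` to `m6` are hypothetical placeholders, and only the counts come from the text.

```python
# Hypothetical mutation identifiers; only the counts (5, 5, 4 shared) are from the text.
padlock_calls = {"m1", "m2", "m3", "m4", "m5"}
hybridization_calls = {"m1", "m2", "m3", "m4", "m6"}

# Mutations found by both capture methods.
shared_calls = padlock_calls & hybridization_calls

# Mutations found by at least one method.
combined_calls = padlock_calls | hybridization_calls


def combined_count(a: set, b: set) -> int:
    """Inclusion-exclusion: |A ∪ B| = |A| + |B| - |A ∩ B|."""
    return len(a) + len(b) - len(a & b)


print(len(shared_calls), len(combined_calls))
```

Each method contributes one call the other missed, which is why combining the two yields one more mutation than either method alone.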
Figure 1: hiPS acquired protein-coding somatic mutations
Somatic mutations in the gene NTRK3 were found in two independent hiPS lines but were not present in their fibroblast progenitors. Detailed information for all mutations is in the Supplementary Materials.
List of genes found to be mutated in coding regions in hiPS cells
The full details of each mutation are in Supplementary Table 1.
| Cell Line | Mutated Genes | Number of | Detectable at Low |
|---|---|---|---|
| CF-RiPS1.4 | | 1 | N/A |
| CF-RiPS1.9 | | 3 | N/A |
| CV-hiPS-B | | 7 | 7/8 |
| CV-hiPS-F | | 12 | 4/7 |
| DF19.11 | | 5 | N/A |
| DF6-9-9 | | 5 | 0/5 |
| dH1CF16-iPS1 | | 1 | N/A |
| dH1CF16-iPS4 | | 4 | N/A |
| dH1F-iPS8 | | 6 | N/A |
| dH1F-iPS9 | | 3 | N/A |
| FiPS3F1 | | 2 | N/A |
| FiPS4F7 | | 2 | N/A |
| iPS11A | | 3 | 1/1 |
| iPS11B | | 5 | 0/1 |
| iPS17A | | 4 | N/A |
| iPS17B | | 5 | 1/1 |
| iPS29A | | 2 | 2/2 |
| iPS29E | | 6 | 1/4 |
| iPS4.7 | | 2 | N/A |
| PGP1-iPS | | 1 | 1/3 |
| FiPS4F2 | | 7 | N/A |
| FiPS4F-shpRB4.5 | | 5 | N/A |
Mutation was observed at passage 40 but not at passage 9. FiPS4F2 was sequenced at both passage 9 and passage 40. Six mutations were present after reprogramming (FiPS4F2P9), while four more became fixed after extended culture (FiPS4F2P40). All six mutations found after reprogramming were also present after extended culture.