Next-generation sequencing traces human induced pluripotent stem cell lines clonally generated from heterogeneous cancer tissue.

Tetsuya Ishikawa

Abstract

AIM: To investigate genotype variation among induced pluripotent stem cell (iPSC) lines that were clonally generated from heterogeneous colon cancer tissues using next-generation sequencing.
METHODS: Human iPSC lines were clonally established by selecting independent single colonies expanded from heterogeneous primary cells of S-shaped colon cancer tissues by retroviral gene transfer (OCT3/4, SOX2, and KLF4). The ten iPSC lines, their starting cancer tissues, and the matched adjacent non-cancerous tissues were analyzed using next-generation sequencing and bioinformatics analysis using the human reference genome hg19. Non-synonymous single-nucleotide variants (SNVs) (missense, nonsense, and read-through) were identified within the target region of 612 genes related to cancer and the human kinome. All SNVs were annotated using dbSNP135, CCDS, RefSeq, GENCODE, and 1000 Genomes. The SNVs of the iPSC lines were compared with the genotypes of the cancerous and non-cancerous tissues. The putative genotypes were validated using allelic depth and genotype quality. For final confirmation, mutated genotypes were manually curated using the Integrative Genomics Viewer.
RESULTS: In eight of the ten iPSC lines, one or two non-synonymous SNVs in EIF2AK2, TTN, ULK4, TSSK1B, FLT4, STK19, STK31, TRRAP, WNK1, PLK1 or PIK3R5 were identified as novel SNVs and were not identical to the genotypes found in the cancer and non-cancerous tissues. This result suggests that the SNVs were de novo or pre-existing mutations that originated from minor populations, such as multifocal pre-cancer (stem) cells or pre-metastatic cancer cells from multiple, different clonal evolutions, present within the heterogeneous cancer tissue. The genotypes of all ten iPSC lines were different from the mutated ERBB2 and MKNK2 genotypes of the cancer tissues and were identical to those of the non-cancerous tissues and the human reference genome hg19. Furthermore, two of the ten iPSC lines did not have any confirmed mutated genotypes, despite being derived from cancerous tissue. These results suggest that the starting single cells are traceable and were preferentially derived from pre-cancer (stem) cells, stromal cells such as cancer-associated fibroblasts, and immune cells that co-existed in the tissues along with the mature cancer cells.
CONCLUSION: The genotypes of iPSC lines derived from heterogeneous cancer tissues can provide information on the type of starting cell that the iPSC line was generated from.

Keywords:  Cancer associated fibroblast; Clonal evolution; Colon cancer; Genotype; Heterogeneous cancer tissue; Induced pluripotent stem cell; Next-generation sequencing; Pre-cancer cell; Single cell; Single-nucleotide variant

Year:  2017        PMID: 28596815      PMCID: PMC5440771          DOI: 10.4252/wjsc.v9.i5.77

Source DB:  PubMed          Journal:  World J Stem Cells        ISSN: 1948-0210            Impact factor:   5.326


Core tip: Ten induced pluripotent stem cell (iPSC) lines were clonally generated from heterogeneous colon cancer tissues and analyzed with next-generation sequencing. Non-synonymous single-nucleotide variants (SNVs) of the iPSC lines were not identical to the genotypes of the cancer tissues. The SNVs were de novo or pre-existing mutations that originated from a minor population within the cancer tissue. Meanwhile, the iPSC lines did not carry the mutated genotypes of the cancer tissues, suggesting that the starting cells for the iPSC lines were not mature cancer cells. Thus, the genotypes of iPSC lines can be used to trace the genomic origins of single cells within heterogeneous cancer tissue.

INTRODUCTION

Gene transfer of OCT3/4, SOX2, KLF4, and c-MYC to somatic cells generates human induced pluripotent stem cells (iPSCs)[1-3], although c-MYC is not required for iPSC generation[4]. Human iPSCs are indistinguishable from human embryonic stem cells (ESCs) in terms of their long-term self-renewal ability and their in vivo pluripotency[3,5]. The starting cells for iPSC generation should be chosen appropriately to generate normal or aberrant iPSC lines for the purpose of regenerative medicine or cancer research/therapy. Human iPSC lines for regenerative medicine would ideally be generated from normal neonatal tissues[3] that are typically free of postnatal aberrant mutations and epigenetic changes. Human iPSCs (or iPSC-like cells) have also been generated from cancer cell lines[6,7], the somatic cells from familial cancer patients[8,9], and pancreatic ductal adenocarcinomas[10]. For cancer research/therapy, it is of great interest to generate iPSCs from heterogeneous cancer tissues. In our recent study[11], human iPSC lines were clonally generated from a heterogeneous mixture of primary cells derived from gastric tissues or colon cancer tissues and were subjected to microarray gene expression analysis. The resultant iPSC lines expressed all ESC-enriched genes, including POU5F1, SOX2 and NANOG, which are essential for self-renewal ability and pluripotency[5,12], at levels equivalent to those of the typical human iPSC line (201B7)[1]. Genome-wide gene expression patterns were used to categorize the reference iPSC line 201B7 and the iPSC lines derived from distinct cancer tissues into three different groups. The gene expression profiles of these iPSC lines showed differences attributable to their distinct starting tissues, together with both similarity and heterogeneity attributable to their common heterogeneous starting tissues.
More recently, reference component analysis (RCA), an algorithm that substantially improves clustering accuracy, was developed to robustly cluster single-cell transcriptomes[13]. The RCA of single-cell transcriptomes elucidated cellular heterogeneity in human colorectal cancer[13]. In this study, iPSC technology and next-generation sequencing were used to resolve genotype variation among single cells within a heterogeneous cancer tissue. The genomic DNA of ten iPSC lines that were clonally generated from human colon cancer tissue was analyzed and compared with the genomic DNA from their cancer tissue of origin and matched adjacent non-cancerous tissue.

MATERIALS AND METHODS

Tissues derived from a single colon cancer patient

This study was conducted with the approval of the Institutional Review Boards of the National Cancer Center of Japan and the Japanese Collection of Research Bioresources (JCRB), National Institutes of Biomedical Innovation, Health and Nutrition. Written informed consent from a single donor was obtained for the use of the tissues for research. The anonymous remnant non-cancerous and cancerous tissues were provided by the JCRB Tissue Bank. The tissues were derived from the surgical waste material from an operation performed on a 55-year-old Japanese male S-shaped colon cancer patient.

Primary cell culture from cancer tissues

Heterogeneous primary cell culture from the colon cancer tissues was prepared as previously described[11]. Briefly, the tissues were washed with Hank’s balanced salt solution (HBSS) and minced into pieces with scissors. The pieces were further washed with HBSS. DMEM with collagenase was added to the tissue precipitates and mixed at 37 °C for 1 h on a shaker. After washing with DMEM, cells were seeded on collagen-coated dishes and cultured in DMEM supplemented with 10% FBS.

Generation of human iPSC lines

The study was approved by the Institutional Recombinant DNA Advisory Committee. Heterogeneous primary cells from the cancer tissue were cultured for 24 h at approximately 5%-10% confluency and then incubated with a pantropic retrovirus vector solution (OCT3/4, KLF4, and SOX2) at 37 °C for an additional 24 h. The vector solution was prepared as previously described[14]. Mitomycin C-treated mouse embryonic fibroblasts (MEFs, ReproCell) were seeded and co-cultured with the primary cells following the retroviral infection. The culture medium was replaced with MEF-conditioned ESC medium every 3 d until the cell layer was fully confluent and then further refreshed with mTeSR1 medium (STEMCELL Technologies) every day. Each independent colony was isolated from the culture using forceps under a microscope. The independent iPSC lines were sub-cultured with MEF in gelatin-coated 24-well plates.

Expansion and passage culture of human iPSC lines

Human iPSC lines were cultured with the MEFs in primate ESC, ReproStem (ReproCell) or mTeSR1 medium in gelatin-coated dishes[11]. The expanded iPSC lines were treated with a dissociation solution (ReproCell) or 0.25% trypsin-EDTA (Gibco) and passaged in media supplemented with 10-20 μmol/L Y-27632 to avoid cell death[3]. Independent iPSC lines were passaged from the 24-well plates into 6-well plates, further expanded into 100-mm dishes, and minimally passaged in 100-mm dishes under similar culture conditions. Each genomic DNA sample was prepared from independent iPSC lines.

Real-time RT-PCR analysis

Total RNA was prepared using the miRNeasy Mini Kit (Qiagen). Reverse transcription of the RNA was carried out using an iScript™ Advanced cDNA Synthesis Kit for RT-qPCR (Bio-Rad). Quantitative PCR was carried out with an SsoAdvanced Universal SYBR® Green Supermix using the CFX96 Real-Time PCR Detection System (Bio-Rad). PCR primer sets for OCT3/4, SOX2, NANOG, ZFP42, and GAPDH are listed in Supplemental Table 1. PCR data were analyzed using CFX Manager Software (Bio-Rad). PCR data from the iPSC 201B7[1] RNA were used as a positive control, and PCR data from cancer tissue-derived iPSC lines are presented as quantification cycle (Cq) values.

Target selection and sequencing

Target sequencing was conducted for twelve DNA samples from the cancer tissues, the non-cancerous tissues, and the ten iPSC lines. Genomic DNA was extracted from each of the twelve samples using the DNeasy Blood & Tissue Kit (Qiagen), sheared into approximately 150-bp fragments, and used to make a library for multiplexed paired-end sequencing with the SureSelectXT Reagent Kit (Agilent Technologies). The constructed library was hybridized to biotinylated cRNA oligonucleotide baits from the SureSelectXT Human Kinome Kit (Agilent Technologies) for target enrichment. Targeted sequences were purified by magnetic beads, amplified, and sequenced on an Illumina HiSeq2000 platform in a paired-end 101 bp configuration.

Mapping and single-nucleotide variant calling

Adapter sequences were removed by cutadapt (v1.2.1). After quality control, reads were mapped to the human reference genome hg19 using BWA (ver. 0.6.2). Mapping results were corrected using Picard (ver. 1.73) to remove duplicates and GATK (ver. 1.5-32) for local realignment and base quality score recalibration. Single-nucleotide variants (SNVs) were called across all samples together (multi-sample calling) using GATK (UnifiedGenotyper), and the calls were filtered to positions with a variant call quality score ≥ 30 and a depth ≥ 8. SNVs were further classified by predicted function as missense, nonsense or read-through. For final confirmation, SNVs were manually curated using the Integrative Genomics Viewer. Annotations of SNVs were based on dbSNP135, CCDS (NCBI release 20111122), RefSeq (UCSC Genome Browser, dumped 20111122), GENCODE (UCSC Genome Browser, ver. 7), and 1000 Genomes (release 20111011) sequences.
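The filtering criteria described above can be illustrated with a minimal sketch. It assumes that each call has already been reduced to a simple record holding a quality score and a depth; the `Call` record, the `passes_filter` helper, and the example quality and depth values are hypothetical and are not part of the GATK pipeline used in the study. Only the two thresholds (quality ≥ 30, depth ≥ 8) come from the text.

```python
from typing import NamedTuple

MIN_QUAL = 30.0  # variant call quality threshold stated in the text
MIN_DEPTH = 8    # minimum depth stated in the text


class Call(NamedTuple):
    """Hypothetical, simplified SNV call record (not a GATK data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float
    depth: int


def passes_filter(call: Call) -> bool:
    """Keep a call only if it meets both the quality and the depth threshold."""
    return call.qual >= MIN_QUAL and call.depth >= MIN_DEPTH


# Illustrative records: the quality scores and depths are invented
# for this example and are not values reported in the study.
calls = [
    Call("chr2", 37336419, "C", "T", qual=85.0, depth=250),
    Call("chr3", 41705179, "A", "C", qual=12.0, depth=249),  # fails quality
    Call("chr17", 8789811, "G", "A", qual=60.0, depth=5),    # fails depth
]
kept = [c for c in calls if passes_filter(c)]
```

In this sketch both conditions must hold at once, so a deeply covered position with a low quality score is dropped just like a high-quality call at a shallow position.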

RESULTS

Human iPSC lines derived from colon cancer tissues

The human iPSC lines CC1-1, CC1-2, CC1-7, CC1-8, CC1-9, CC1-11, CC1-12, CC1-17, CC1-18, and CC1-25 were clonally generated from heterogeneous primary cells cultured from colon cancer tissue. The iPSC lines were expanded serially with MEFs in gelatin-coated dishes. The cancer tissue-derived iPSCs were indistinguishable in morphology from typical (fibroblast-derived) human iPSCs under conventional culture with MEFs (upper panels in Figure 1 and Supplemental Figure 1). The human iPSCs formed colonies consisting of very small cells and were efficiently passaged at a high recovery ratio with the addition of 10-20 μmol/L Y-27632 to the cell culture medium. Human iPSCs were also cultured with feeder-free mTeSR1 medium in BD Matrigel™-coated 100-mm dishes and showed a high nucleus-to-cytoplasm ratio (lower panels in Figure 1 and Supplemental Figure 1).
Figure 1

Phase contrast micrographs of colon cancer tissue-derived induced pluripotent stem cells. The human iPSC line CC1-1 was expanded with mitomycin C-treated mouse embryonic fibroblasts in gelatin-coated dishes (upper left panel: × 10, upper right panel: × 20) and cultured with feeder-free mTeSR1 medium in BD Matrigel™-coated dishes (lower left panel: × 10, lower right panel: × 20). iPSC: Induced pluripotent stem cell; MEF: Mouse embryonic fibroblast.

Expression of human ESC-essential genes

ESC-essential gene expression of the cancer tissue-derived iPSC lines was quantitatively analyzed by real-time RT-PCR. All ten iPSC lines expressed POU5F1, SOX2, and NANOG, which are essential for self-renewal and pluripotency, at levels equivalent to those of the reference iPSC line (Supplemental Table 2). These results support previously published microarray data showing that cancer tissue-derived iPSCs equally express ESC-enriched genes[11].

Next-generation sequencing

The target region (SureSelect Human Kinome) in genomic DNA samples from the ten iPSC lines, their starting cancer tissues, and the matched adjacent non-cancerous tissues was analyzed using next-generation sequencing. The target region of approximately 3.1 Mb covers the coding regions of all known protein kinase genes and selected oncogenes and tumor suppressor genes, for a total of 612 genes (Supplemental Table 3). The original reads (2.6-4.0 Gb of sequence) were obtained from each genomic DNA sample by sequencing (Table 1). The modified reads were generated from the original reads by removing adapter sequences and low-quality bases (Table 2). The mapped reads, sequencing depth, and target capture results are summarized in Tables 3, 4 and 5. The average depth on the target region ranged from 317 to 496. In every sample, at least 99.76% of the target region was covered at a depth of at least 8 ×, the depth required for high-quality genotype calls (variant call quality score ≥ 30).
Table 1

Read number (original)

Original
Sample    No. of reads    Read length (b)    No. of bases (Gb)
NCC1      18260718        101                3.7
          18260718        101
CC1       16706190        101                3.4
          16706190        101
CC1-1     13045740        101                2.6
          13045740        101
CC1-2     17725772        101                3.6
          17725772        101
CC1-7     14780507        101                3.0
          14780507        101
CC1-8     17311972        101                3.5
          17311972        101
CC1-9     16664067        101                3.4
          16664067        101
CC1-11    15455638        101                3.1
          15455638        101
CC1-12    15391361        101                3.1
          15391361        101
CC1-17    19009957        101                3.8
          19009957        101
CC1-18    19746313        101                4.0
          19746313        101
CC1-25    15492560        101                3.1
          15492560        101

NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line.

Table 2

Read number (modified)

Modified¹
Sample    No. of reads    Read length (b)    Ratio (%) (Mod/Ori)
NCC1      18146940        101²               99.38
          18146940        101²               99.38
CC1       16597436        101²               99.35
          16597436        101²               99.35
CC1-1     12942132        101²               99.21
          12942132        101²               99.21
CC1-2     17614866        101²               99.37
          17614866        101²               99.37
CC1-7     14687008        101²               99.37
          14687008        101²               99.37
CC1-8     17180329        101²               99.24
          17180329        101²               99.24
CC1-9     16545785        101²               99.29
          16545785        101²               99.29
CC1-11    15346749        101²               99.30
          15346749        101²               99.30
CC1-12    15281269        101²               99.28
          15281269        101²               99.28
CC1-17    18880292        101²               99.32
          18880292        101²               99.32
CC1-18    19618808        101²               99.35
          19618808        101²               99.35
CC1-25    15378555        101²               99.26
          15378555        101²               99.26

¹Modified read file is a data set from the original read file with the adapter sequences and low-quality bases removed.

²Therefore, a portion of the reads in the modified read file were shorter than the indicated read length (b). NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line.

Table 3

Mapped reads

Item                                    NCC1      CC1       CC1-1     CC1-2     CC1-7     CC1-8     CC1-9     CC1-11    CC1-12    CC1-17    CC1-18    CC1-25
No. of total reads ①                    36293880  33194872  25884264  35229732  29374016  34360658  33091570  30693498  30562538  37760584  39237616  30757110
No. of mapped reads ② (③ + ④ + ⑤)       36210841  33066194  25545096  34845596  28911555  33842665  32976780  29898986  30464308  36670526  38078508  30173511
No. of mapped reads with Paired-End ③   26935180  26333748  21884450  25981140  21498822  26807330  26704280  25386830  25103580  29513694  30869186  21622036
No. of mapped reads with Single-End ④   15741     24050     14051     16508     12336     17984     14651     27429     14354     27336     25456     19687
No. of discarded reads ⑤¹               9259920   6708396   3646595   8847948   7400397   7017351   6257849   4484727   5346374   7129496   7183866   8531788
No. of unmapped reads (① - ②)           83039     128678    339168    384136    462461    517993    114790    794512    98230     1090058   1159108   583599
No. of effective reads (③ + ④)          26950921  26357798  21898501  25997648  21511158  26825314  26718931  25414259  25117934  29541030  30894642  21641723

¹Discarded reads were as follows: Reads mapped to chromosomes other than the target; Reads where each paired-end is mapped to a different chromosome; Reads not used for single-nucleotide variant/InDel detection such as PCR duplicates. Each number of ② consists of each total number of "③ plus ④ plus ⑤"; "① - ②" means "each number of ① minus each number of ②"; "③ + ④" means "each number of ③ plus each number of ④". NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the iPSC lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line.
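The relationships among the read counts in Table 3 can be checked with simple arithmetic. The sketch below uses the NCC1 column; the variable names and the `accounting_is_consistent` helper are introduced here for illustration only.

```python
# NCC1 column of Table 3 (circled numbers follow the table labels).
total_reads = 36_293_880   # ① total reads
mapped_reads = 36_210_841  # ② mapped reads
paired_end = 26_935_180    # ③ mapped reads with Paired-End
single_end = 15_741        # ④ mapped reads with Single-End
discarded = 9_259_920      # ⑤ discarded reads

unmapped = total_reads - mapped_reads  # ① - ②
effective = paired_end + single_end    # ③ + ④


def accounting_is_consistent(mapped: int, paired: int, single: int, dropped: int) -> bool:
    """Check that ② equals ③ + ④ + ⑤ for one sample."""
    return mapped == paired + single + dropped


consistent = accounting_is_consistent(mapped_reads, paired_end, single_end, discarded)
```

The same check can be repeated for any other column of the table by substituting that sample's values.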

Table 4

Sequence depth

Sample    Theoretical¹ total bases (Mb)    Theoretical¹ depth³    Observed² effective bases on target (Mb)    Observed² average depth on target
NCC1      3689                             1173.31                1399                                        445.06
CC1       3375                             1073.43                1084                                        344.89
CC1-1     2635                             838.23                 999                                         317.89
CC1-2     3581                             1138.94                1279                                        406.72
CC1-7     2986                             949.69                 1195                                        380.13
CC1-8     3497                             1112.35                1352                                        430.01
CC1-9     3366                             1070.72                1355                                        431.06
CC1-11    3122                             993.07                 1077                                        342.50
CC1-12    3109                             988.94                 1359                                        432.12
CC1-17    3840                             1221.45                1407                                        447.66
CC1-18    3989                             1268.76                1559                                        495.99
CC1-25    3129                             995.45                 1083                                        344.37

¹Theoretical depth calculated from the total number of bases obtained by DNA sequencing.

²Observed depth used for single-nucleotide variant/InDel identification.

³Theoretical depth = [Total bases (Mb)]/[Target region (Mb)]. Target region: SureSelect Human Kinome Kit (approximately 3.1 Mb). NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line.
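The theoretical depth can be reproduced from the read counts as a worked example. This sketch assumes that the two rows listed per sample in Table 1 are the two mates of the paired-end run, and it uses the "Initial bases on target" value from Table 5 as the size of the target region; the function and variable names are hypothetical.

```python
TARGET_BASES = 3_143_812  # "Initial bases on target" in Table 5 (approx. 3.1 Mb)


def theoretical_depth(reads_per_row: int, read_length: int,
                      rows: int = 2, target_bases: int = TARGET_BASES) -> float:
    """Total sequenced bases divided by the size of the target region.

    rows=2 reflects the assumption that Table 1 lists both mates of each pair.
    """
    total_bases = reads_per_row * read_length * rows
    return total_bases / target_bases


# Read counts and read length taken from Table 1.
ncc1_depth = theoretical_depth(18_260_718, 101)   # compare with 1173.31 in Table 4
cc1_1_depth = theoretical_depth(13_045_740, 101)  # compare with 838.23 in Table 4
```

The observed depth is lower than the theoretical depth because only effective bases that fall on the target region are counted.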

Table 5

Target capture

Item                                                        NCC1      CC1       CC1-1     CC1-2     CC1-7     CC1-8     CC1-9     CC1-11    CC1-12    CC1-17    CC1-18    CC1-25
Initial bases on target ①                                   3143812   3143812   3143812   3143812   3143812   3143812   3143812   3143812   3143812   3143812   3143812   3143812
Initial bases near target ②                                 3790645   3790645   3790645   3790645   3790645   3790645   3790645   3790645   3790645   3790645   3790645   3790645
Initial bases on or near target ③                           6934457   6934457   6934457   6934457   6934457   6934457   6934457   6934457   6934457   6934457   6934457   6934457
Total effective reads ④                                     26950921  26357798  21898501  25997648  21511158  26825314  26718931  25414259  25117934  29541030  30894642  21641723
Total effective bases (Mb) ⑤                                2663      2602      2157      2573      2125      2643      2637      2503      2480      2914      3047      2133
Read length mean (b)                                        98.91     98.83     98.58     99.02     98.84     98.61     98.77     98.58     98.79     98.74     98.69     98.65
Read length median (b)                                      101       101       101       101       101       101       101       101       101       101       101       101
Effective bases on target (Mb) ⑥                            1399      1084      999       1279      1195      1352      1355      1077      1359      1407      1559      1083
Effective bases near target (Mb) ⑦                          440       342       355       405       380       438       448       327       445       430       450       342
Effective bases on or near target (Mb) ⑧                    1839      1426      1354      1683      1575      1790      1804      1403      1804      1837      2009      1425
Fraction of effective bases on target (%) (⑥/⑤)             52.54     41.67     46.32     49.70     56.25     51.14     51.38     43.01     54.78     48.30     51.18     50.75
Fraction of effective bases near target (%) (⑦/⑤)           16.52     13.14     16.45     15.73     17.88     16.56     17.00     13.05     17.96     14.75     14.77     16.05
Fraction of effective bases on or near target (%) (⑧/⑤)     69.06     54.82     62.78     65.43     74.12     67.70     68.38     56.06     72.74     63.05     65.95     66.80
Average sequencing depth on target (⑥/①)                    445.06    344.89    317.89    406.72    380.13    430.01    431.06    342.50    432.12    447.66    495.99    344.37
Average sequencing depth near target (⑦/②)                  116.05    90.22     93.63     106.76    100.21    115.48    118.31    86.16     117.51    113.43    118.68    90.33
Average sequencing depth on or near target (⑧/③)            265.21    205.68    195.30    242.75    227.11    258.07    260.10    202.38    260.14    264.96    289.74    205.50
Base covered on target ⑨                                    3143221   3143152   3142784   3143540   3143280   3143263   3143277   3142887   3143338   3143035   3143296   3142818
Coverage of target region (%) (⑨/①)                         99.98     99.98     99.97     99.99     99.98     99.98     99.98     99.97     99.98     99.98     99.98     99.97
Base covered near target ⑩                                  3775671   3773869   3774076   3771915   3768892   3773823   3776942   3760218   3776060   3766215   3762031   3762823
Coverage of near target region (%) (⑩/②)                    99.60     99.56     99.56     99.51     99.43     99.56     99.64     99.20     99.62     99.36     99.25     99.27
Fraction of target covered with at least 15 × (%)           99.72     99.62     99.59     99.69     99.65     99.68     99.70     99.55     99.69     99.60     99.68     99.58
Fraction of target covered with at least 8 × (%)            99.86     99.78     99.78     99.83     99.81     99.82     99.83     99.76     99.83     99.78     99.83     99.77
Fraction of target covered with at least 4 × (%)            99.93     99.89     99.89     99.90     99.90     99.90     99.91     99.87     99.92     99.89     99.91     99.89
Fraction of flanking region covered with at least 15 × (%)  86.88     84.24     87.52     85.84     84.62     87.92     89.85     79.85     90.17     84.50     80.72     82.36
Fraction of flanking region covered with at least 8 × (%)   94.32     93.13     94.66     93.66     92.95     94.70     95.77     89.93     95.87     92.55     89.96     91.39
Fraction of flanking region covered with at least 4 × (%)   97.85     97.33     97.91     97.50     97.15     97.87     98.27     95.76     98.32     96.86     95.71     96.37

The target region, as covered by the SureSelect Human Kinome Kit, was approx. 3.1 Mb. Near target region was 200 bases forward and backward of the target region. "⑥/⑤" means "each number of ⑥ divided by each number of ⑤"; the other ratios (⑦/⑤, ⑧/⑤, ⑥/①, ⑦/②, ⑧/③, ⑨/① and ⑩/②) are defined in the same way. NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line.
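The ratio rows of Table 5 follow directly from the count rows. The sketch below recomputes two of them for the NCC1 sample; the `percent` helper is introduced here for illustration. Because the table reports effective bases rounded to whole Mb, the recomputed on-target fraction agrees with the tabulated value only approximately.

```python
def percent(numerator: float, denominator: float) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# NCC1 column of Table 5.
initial_bases_on_target = 3_143_812  # ①
bases_covered_on_target = 3_143_221  # ⑨
effective_bases_mb = 2663            # ⑤ (rounded to Mb in the table)
effective_on_target_mb = 1399        # ⑥ (rounded to Mb in the table)

coverage_pct = percent(bases_covered_on_target, initial_bases_on_target)  # ⑨/①
on_target_pct = percent(effective_on_target_mb, effective_bases_mb)       # ⑥/⑤
```

Here `coverage_pct` corresponds to the "Coverage of target region (%)" row and `on_target_pct` to the "Fraction of effective bases on target (%)" row.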

Non-synonymous SNVs compared with hg19

After sequencing, the reads underwent bioinformatics analysis (Figure 2). Through comparison with the human reference genome hg19, the non-synonymous SNVs (missense, nonsense or read-through) were called on the target region (on and near DNA target enrichment baits) of twelve samples (the ten iPSC lines, their starting cancer tissues, and the matched adjacent non-cancerous tissues). Of the resulting 378 non-synonymous SNVs (Supplemental Table 4), 50 were novel SNVs (not reported in dbSNP135 or 1000 Genomes, Supplemental Table 5).
Figure 2

Pipeline of bioinformatics analysis following next-generation sequencing. The thirteen confirmed SNVs are shown in Tables 6-8. SNVs: Single-nucleotide variants.

Table 6

Chromosome number, genome position, reference vs single-nucleotide variant, novelty vs dbSNP135, gene symbol, and mutation types of single-nucleotide variants

SNV No.  Chromosome No.  Genome position  Ref.|SNV  Novel/known  Gene symbol  Mutation type
1        chr2            37336419         C|T       Novel        EIF2AK2      Missense
2        chr2            179408086        A|G       Novel        TTN          Missense
3        chr3            41705179         A|C       Novel        ULK4         Missense
4        chr5            112769527        C|T       Novel        TSSK1B       Missense
5        chr5            180048626        C|T       Novel        FLT4         Missense
6        chr6            31947203         T|C       Novel        STK19        Missense
7        chr7            23808650         G|T       Novel        STK31        Missense
8        chr7            98490141         G|C       Novel        TRRAP        Missense
9        chr12           1009680          C|T       Novel        WNK1         Missense
10       chr16           23690401         C|T       Novel        PLK1         Missense
11       chr17           8789811          G|A       Novel        PIK3R5       Nonsense
12       chr17           37881392         A|G       Novel        ERBB2        Missense
13       chr19           2046399          G|A       Novel        MKNK2        Missense

Ref.: The allele of the human reference genome hg19; SNV: Single-nucleotide variant.

Table 8

Genotypes of single-nucleotide variants among the matched adjacent non-cancerous tissue, the starting cancer tissue, and the cancer tissue-derived induced pluripotent stem cell lines

Genotypes of SNVs
SNV No.  NCC1     CC1      CC1-1    CC1-2    CC1-7    CC1-8    CC1-9    CC1-11   CC1-12   CC1-17   CC1-18   CC1-25
1        C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/T      C/C
2        A/A      A/A      A/A      A/A      A/A      A/A      A/G      A/A      A/A      A/A      A/A      A/A
3        A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/C
4        C/C      C/C      C/C      C/C      C/C      C/T      C/C      C/C      C/C      C/C      C/C      C/C
5        C/C      C/C      C/T      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C
6        T/T      T/T      T/T      T/C      T/T      T/T      T/T      T/T      T/T      T/T      T/T      T/T
7        G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/T      G/G      G/G      G/G      G/G
8        G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/C      G/G      G/G      G/G
9        C/C      C/C      C/C      C/C      C/C      C/T      C/C      C/C      C/C      C/C      C/C      C/C
10       C/C      C/C      C/C      C/T      C/C      C/C      C/C      C/C      C/C      C/C      C/C      C/C
11       G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/A
12       A/A      A/G      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A      A/A
13       G/G      G/A      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G      G/G

NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line; SNV: Single-nucleotide variant.
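The comparison of iPSC genotypes with the two tissue samples can be sketched as follows. The genotypes are transcribed from Table 8, stored compactly as the genotype shared by most samples plus the samples that differ; the data layout and the function names are assumptions made for this illustration and do not represent the software used in the study.

```python
IPSC_LINES = ["CC1-1", "CC1-2", "CC1-7", "CC1-8", "CC1-9",
              "CC1-11", "CC1-12", "CC1-17", "CC1-18", "CC1-25"]

# SNV No. -> (genotype shared by most samples, {sample: differing genotype})
TABLE8 = {
    1: ("C/C", {"CC1-18": "C/T"}),
    2: ("A/A", {"CC1-9": "A/G"}),
    3: ("A/A", {"CC1-25": "A/C"}),
    4: ("C/C", {"CC1-8": "C/T"}),
    5: ("C/C", {"CC1-1": "C/T"}),
    6: ("T/T", {"CC1-2": "T/C"}),
    7: ("G/G", {"CC1-11": "G/T"}),
    8: ("G/G", {"CC1-12": "G/C"}),
    9: ("C/C", {"CC1-8": "C/T"}),
    10: ("C/C", {"CC1-2": "C/T"}),
    11: ("G/G", {"CC1-25": "G/A"}),
    12: ("A/A", {"CC1": "A/G"}),
    13: ("G/G", {"CC1": "G/A"}),
}


def genotype(snv: int, sample: str) -> str:
    """Look up the genotype of one sample at one SNV."""
    common, exceptions = TABLE8[snv]
    return exceptions.get(sample, common)


def line_specific_snvs(line: str) -> list:
    """SNVs whose genotype in an iPSC line differs from both NCC1 and CC1."""
    return [snv for snv in sorted(TABLE8)
            if genotype(snv, line) != genotype(snv, "NCC1")
            and genotype(snv, line) != genotype(snv, "CC1")]


per_line = {line: line_specific_snvs(line) for line in IPSC_LINES}
lines_without_snvs = [line for line, snvs in per_line.items() if not snvs]
tissue_only_snvs = [snv for snv in sorted(TABLE8)
                    if genotype(snv, "CC1") != genotype(snv, "NCC1")]
```

With the Table 8 data, `lines_without_snvs` contains the two lines with no confirmed mutated genotype, and `tissue_only_snvs` contains the two SNVs (ERBB2 and MKNK2) that are mutated in the cancer tissue sample only.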

Confirmed genotypes of the twelve samples

Of the 378 non-synonymous SNVs from the twelve samples, 40, including known SNVs, had heteroallelic genotypes that were distinct among the samples within the target region of 612 genes. Supplemental Table 6 lists the forty SNVs that were distinct among the human iPSC lines, their starting cancer tissue, and the matched adjacent non-cancerous tissues. For thirteen of the forty SNVs, the allelic depth and genotype quality were validated and the genotypes were manually curated using the Integrative Genomics Viewer (Figure 2).

Mutated genotypes of cancer tissue-derived iPSC lines

The chromosome number, genome position, novelty, gene symbol, and mutation type of the thirteen confirmed SNVs are shown in Table 6; the allelic depth is shown in Table 7; and the genotype is shown in Table 8. The respective SNVs of the ten iPSC lines were compared to those of their starting cancer tissue and the matched non-cancerous tissue. The iPSC-line genotypes with nonsense or missense mutations in EIF2AK2, TTN, ULK4, TSSK1B, FLT4, STK19, STK31, TRRAP, WNK1, PLK1, or PIK3R5 (Table 6) were different from those of the non-cancerous tissue sample (Table 8). Nevertheless, these genotypes were also different from those of the starting cancer tissue sample. In the starting cancer tissue sample, the reads at the ULK4, TRRAP, and WNK1 positions consisted of 247|2 of A|C, 240|1 of G|C, and 246|2 of C|T, respectively (Table 7). Although the major reads indicated the genotypes of the non-cancerous tissues, the minor reads indicated missense mutations. These potential heteroallelic genotypes were identical to the mutated genotypes of the CC1-25, CC1-12, and CC1-8 iPSC lines, respectively. Meanwhile, the genotypes of all ten of the iPSC lines were different from the mutated genotypes in ERBB2 and MKNK2 of the cancer tissues and were identical to those of the non-cancerous tissues and human reference genome hg19 (Table 8). Thus, all analyzed iPSC lines appear to have been preferentially generated from starting cells without mutations in ERBB2 and MKNK2 rather than from the mature cancer cells. Furthermore, the iPSC lines CC1-7 and CC1-17 did not have any confirmed mutated genotypes despite originating from the cancer tissue.
Table 7

Allelic depth of single-nucleotide variants among the matched adjacent non-cancerous tissue, the starting cancer tissue, and the cancer tissue-derived induced pluripotent stem cell lines

Allelic depth of SNVs
SNV No.  NCC1     CC1      CC1-1    CC1-2    CC1-7    CC1-8    CC1-9    CC1-11   CC1-12   CC1-17   CC1-18   CC1-25
1        250|0    246|0    232|0    250|0    250|0    250|0    250|0    248|0    250|0    250|0    129|121  250|0
2        249|0    240|0    240|0    248|0    248|1    250|0    129|121  248|0    242|0    250|0    250|0    244|0
3        246|0    247|2    249|0    238|1    246|0    248|0    233|0    241|0    238|1    241|0    245|0    132|106
4        250|0    239|0    243|0    248|0    245|0    120|129  250|0    236|0    250|0    250|0    250|0    249|0
5        216|0    150|0    75|79    189|0    184|0    180|0    200|1    131|0    176|0    221|0    207|0    179|0
6        249|0    238|0    250|0    132|114  250|0    250|0    242|0    248|0    248|0    250|0    250|0    249|0
7        250|0    248|0    250|0    250|0    245|0    246|0    245|0    135|111  249|0    250|0    250|0    246|1
8        233|0    240|1    243|0    250|0    245|0    242|0    247|0    248|0    132|113  240|1    247|0    241|0
9        249|0    246|2    250|0    250|0    249|0    220|30   244|0    249|0    250|0    249|1    250|0    249|0
10       247|0    177|0    188|0    119|121  198|0    244|0    241|0    176|0    221|0    224|0    249|0    174|0
11       246|1    172|0    181|0    208|0    209|0    198|0    189|0    175|0    244|0    182|0    233|0    95|87
12       249|1    195|54   241|0    249|0    249|0    249|1    249|0    250|0    249|0    250|0    249|1    250|0
13       137|0    91|10    79|0     131|0    102|0    103|0    103|0    83|0     106|0    111|0    142|0    90|0

NCC1: The matched adjacent non-cancerous tissue; CC1: The starting cancer tissue of the induced pluripotent stem cell lines; CC1-1 to CC1-25: Each induced pluripotent stem cell line; SNV: Single-nucleotide variant.
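The allelic depths in Table 7 can be converted to variant allele fractions to make the difference between the major and minor read populations explicit. This sketch assumes that each cell reads "reference-allele reads | variant-allele reads", in line with the Ref.|SNV notation of Table 6; the function names are hypothetical, and no calling threshold from the study is implied.

```python
def parse_allelic_depth(cell: str) -> tuple:
    """Split a Table 7 cell such as "132|106" into (ref_reads, variant_reads)."""
    ref_reads, variant_reads = (int(part) for part in cell.split("|"))
    return ref_reads, variant_reads


def variant_allele_fraction(cell: str) -> float:
    """Fraction of reads that carry the variant allele (0.0 if no reads)."""
    ref_reads, variant_reads = parse_allelic_depth(cell)
    total = ref_reads + variant_reads
    return variant_reads / total if total else 0.0


# Selected cells from Table 7.
examples = {
    "ULK4, CC1-25": "132|106",  # iPSC line with the heteroallelic genotype
    "ULK4, CC1": "247|2",       # minor reads in the starting cancer tissue
    "ERBB2, CC1": "195|54",     # mutated genotype of the cancer tissue
    "WNK1, CC1-8": "220|30",
}
fractions = {label: variant_allele_fraction(cell) for label, cell in examples.items()}
```

In this representation, the ULK4 variant accounts for close to half of the reads in CC1-25 but for fewer than 1% of the reads in the starting cancer tissue, which is the contrast that the text describes as major and minor read sequences.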

DISCUSSION

The ten iPSC lines were clonally generated from a heterogeneous mixture of primary cells derived from the colon cancer tissue of a single patient. The genomes of the ten iPSC lines were analyzed using next-generation sequencing, as were the genomes of the starting cancer tissue and the matched adjacent non-cancerous tissue from the same donor. The target region for analysis comprised the human kinome and cancer-related genes that are typically mutated in human tumors. A total of 378 non-synonymous SNVs were identified in the ten iPSC lines and the cancerous and non-cancerous tissues by comparing the sequence reads to the human reference genome hg19. Most of the non-synonymous SNVs showed the genotype of the non-cancerous tissue, suggesting a germline origin. The SNVs of the ten iPSC lines were compared with those of the cancerous and non-cancerous tissues. Forty of the SNVs showed genotypes that differed among the twelve samples. Thirteen of the forty SNVs were confirmed using allelic depth, genotype quality, and the Integrative Genomics Viewer. In eight of the ten iPSC lines, one or two novel, non-synonymous SNVs (heteroallelic missense or nonsense mutations) in EIF2AK2, TTN, ULK4, TSSK1B, FLT4, STK19, STK31, TRRAP, WNK1, PLK1 or PIK3R5 were identified as genotypes different from those of the non-cancerous tissue. Unexpectedly, none of these SNVs was identical to the genotypes found in the cancer tissues. However, minor read sequences in the cancer tissues implied a potential genotype for ULK4, TRRAP or WNK1. These sequences indicated a missense mutation in ULK4, TRRAP or WNK1 identical to that found in the iPSC lines CC1-25, CC1-12, and CC1-8. Accordingly, it is possible that each of these iPSC lines was generated from a starting cell belonging to a minor cell population with a mutation in ULK4, TRRAP, or WNK1 that was present within the cancer tissue. 
The minor read sequences could be confirmed by ultra-deep sequencing to support the potential heteroallelic genotypes. Interestingly, two iPSC lines, CC1-7 and CC1-17, did not have any confirmed mutated genotypes despite originating from the cancer tissues. Therefore, these two iPSC lines might have been generated from non-cancerous cells such as pre-cancer (stem) cells and cancer-associated fibroblasts[15,16]. The SNVs of the ten iPSC lines could be de novo or pre-existing mutations that originated from minor cell populations, such as multifocal cancer (stem) cells and pre-metastatic cancer cells, present within the heterogeneous cancer tissue. Primary cancer tissues include multifocal pre-cancer, mature cancer and pre-metastatic cancer cells, so their genomes are expected to be heterogeneous. The genotypes of pre-cancer (stem) cells would not be identical to those of germline or mature cancer cells, as colon cancer develops from an adenoma to a carcinoma through the accumulation of a number of genetic mutations and epigenetic aberrations[17]. It is likely that the genotypes of pre-metastatic cancer cells in multiple clonal evolutions would be different from those of non-metastatic cancer cells. Meanwhile, the genotypes of the major mature cancer cells would be identical to those of the cancer tissues; therefore, it was expected that the genotypes of cancer tissue-derived iPSC lines would be identical to those of their starting cancer tissues. ERBB2 mutations have been reported in 3.6% of patients with colorectal cancer[18]. Indeed, a mutated genotype in ERBB2 of the colon cancer tissues was also identified in this study. Nevertheless, the genotypes of the ten iPSC lines were different from the mutated ERBB2 and MKNK2 genotypes in the cancer tissues and were identical to those of the non-cancerous tissues and the human reference genome hg19. 
This result suggests that the starting cells for the iPSC lines did not carry the mutations in ERBB2 and MKNK2 present in the cancer tissues. It is conceivable that the non-mutated genotypes of each iPSC line were identical to those of non-cancerous cells such as pre-cancer (stem) cells, stromal cells and immune cells that existed within the tissue. Each iPSC line was clonally established by selecting an independent single colony expanded from a putative single starting cell originating from heterogeneous cancer tissue. The genome sequence of each iPSC line was therefore derived from its starting single cell. As a result, each iPSC line conserved the non-mutated ERBB2 and MKNK2 genotypes of its respective starting single cell. Interestingly, none of the ten iPSC lines was generated from a cell population carrying a mutated ERBB2 and/or a mutated MKNK2. Thus, the genotypes of each iPSC line provide information on the genomic origin of the starting single cell derived from the heterogeneous cancer tissue. Although the cause of the preference for the genomic origin of the starting cells was not clarified in this study, it seems that chemicals[19], gene sets[1,4], gene transfer[20,21], or inventive pre-culture[22,23], in which the starting cells might be preferentially specified, can affect iPSC generation. Accordingly, materials and methods can be optimized to generate normal or aberrant iPSC lines for the purposes of regenerative medicine or cancer research/therapy. Cancer tissues comprise (pre-) cancer (stem) cells, pre-metastatic cancer cells, stromal cells (such as mesenchymal stem cells, cancer-associated fibroblasts[15,16,24] and tumor endothelial cells) and immune cells (such as tumor-associated macrophages[25], dendritic cells[26] and tumor-infiltrating T cells[23]). 
Therefore, an iPSC line derived from such cells might be useful for immune-cell therapy[27] with cellular vaccines[28], dendritic cells[29-32] or tumor antigen-specific cytotoxic T cells[23], in addition to the development of models of carcinogenesis[33-35] and drug discovery tools[36,37]. For the purposes of regenerative medicine, human iPSCs are ideally generated from normal neonatal tissues[3,38-40], which are typically free of postnatal aberrant mutations or epigenetic changes. By contrast, aging and sun-exposed skin harbors thousands of evolving clones carrying cancer-causing mutations[41,42]. Indeed, genetic mutations accumulate gradually over a lifetime, even in human somatic stem cells[43]. For this reason, cell sources for iPSC generation should be selected based on the given field of research. Furthermore, iPSC lines with few or no mutations need to be established by modifying existing methodology[39,44,45], as cell lines with de novo mutations not originating from the starting cells are not desired[46-50]. Nevertheless, cancer tissue-derived iPSCs might give rise to such de novo mutations, as their starting cells might already carry aberrations (in epigenetics or gene expression) associated with de novo mutations or cancer. Indeed, colon cancer tissue-derived iPSC lines exhibited unique gene expression profiles, with particular upregulation of FAM19A5 and SLC39A7[11], in comparison with those of the typical iPSC line 201B7[1]. According to a free online expression atlas (Amazonia!, http://amazonia.transcriptome.eu/search.php)[51], FAM19A5 and SLC39A7 are expressed at lower levels in many human iPSC and ESC lines. FAM19A5 was reported as a novel cholangiocarcinoma biomarker[52], while SLC39A7 is an intracellular zinc transporter and a hub for tyrosine kinase activation related to diseases such as cancer[53]. 
The analysis of iPSC genomes might expose rare single cells, such as authentic cancer stem cells, present within cancer tissues. Thus, next-generation sequencing of heterogeneous cancer tissue-derived iPSC lines might reveal potential aberrations or changes originating from the cancer tissue. In conclusion, the genotypes of iPSC lines can be used to trace the genotype of the original single cells derived from heterogeneous cancer tissues.
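The comparison logic described in this Discussion can be sketched as follows. This is an illustrative outline rather than the pipeline used in the study: the function names, the category labels, and the symbolic genotype strings (`"ref/ref"`, `"ref/alt"`) are assumptions introduced here, and the zygosity shown for the cancer-tissue ERBB2 genotype is a placeholder.

```python
from collections import Counter


def classify_snv(ipsc: str, non_cancer: str, cancer: str) -> str:
    """Compare an iPSC-line genotype with the genotypes of the two tissues.

    Genotypes are opaque strings (hypothetical encoding, e.g. 'ref/ref').
    """
    if ipsc == non_cancer and ipsc == cancer:
        return "shared_by_all"          # same call in all three samples
    if ipsc == non_cancer:
        return "matches_non_cancerous"  # cancer-tissue mutation absent from the line
    if ipsc == cancer:
        return "matches_cancer_tissue"  # line carries the major cancer genotype
    return "unique_to_ipsc_line"        # de novo, or from a minor cell population


def summarize_line(calls: dict[str, tuple[str, str, str]]) -> Counter:
    """Count categories for one iPSC line; values are (ipsc, NCC1, CC1) genotypes."""
    return Counter(classify_snv(*genotypes) for genotypes in calls.values())


if __name__ == "__main__":
    # Symbolic placeholders only; not the alleles reported in the tables.
    example_line = {
        "ULK4": ("ref/alt", "ref/ref", "ref/ref"),
        "ERBB2": ("ref/ref", "ref/ref", "ref/alt"),
    }
    for gene, genotypes in example_line.items():
        print(gene, classify_snv(*genotypes))
    print(summarize_line(example_line))
```

In this sketch, a line whose ERBB2 genotype matches the non-cancerous tissue while the cancer tissue is mutated falls into `matches_non_cancerous`, which corresponds to the pattern the study reports for all ten iPSC lines.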

ACKNOWLEDGMENTS

The author would like to thank Dr. Takanori Washio, Mr. O Kobayashi and Mr. Wataru Kurihara of Riken Genesis for supporting the informatics analysis of the sequencing data, Dr. Momoko Kobayashi and Ms. Natsumi Suda for supporting the iPSC culture and genomic DNA preparation, and the members of the Fundamental Innovative Oncology Core Center, the Division of Molecular and Cellular Medicine, the Division of Genetics, and the Division of Carcinogenesis and Prevention for the helpful discussions. The author would also like to acknowledge the professional language editing service of Springer Nature Author Services. The author also acknowledges Dr. Toshio Kitamura, the JCRB Tissue Bank, and the RIKEN BioResource Center for providing the Plat-GP packaging cells, remnant tissues from cancer patients, and the iPSC line (201B7), respectively.

COMMENTS

Background

Starting cells for induced pluripotent stem cell (iPSC) generation should be appropriately selected to generate normal or aberrant iPSC lines for use in regenerative medicine or cancer research/therapy. Human iPSC lines for regenerative medicine would ideally be generated from normal neonatal tissues, as they are typically free of postnatal aberrant mutations and epigenetic changes. For cancer research/therapy, it is of great interest to generate iPSCs that originate from heterogeneous cancer tissues.

Research frontiers

Microarray experiments have profiled the gene expression of human iPSC lines clonally generated from a heterogeneous mixture of primary cells derived from gastric tissue or colon cancer tissue. The gene expression profiles of such iPSC lines demonstrate differences derived from their distinct starting tissues and similarity and heterogeneity derived from their common starting heterogeneous tissue.

Innovations and breakthroughs

This is the first study to analyze human iPSC lines clonally generated from a heterogeneous mixture of primary cells derived from cancer tissues using next-generation sequencing. Eight of the ten iPSC lines had single-nucleotide variants representing de novo or pre-existing mutations originating from a minor population within the cancer tissues. Meanwhile, the iPSC lines did not carry the mutated genotypes found in the original cancer tissues. Two of the ten iPSC lines did not possess any confirmed mutated genotypes despite having been derived from cancer tissue. These results suggest that the majority of iPSC lines originated from starting cells other than major cancer cells. Thus, the genotypes of iPSC lines can be used to trace the genotypes of the starting single cells.

Applications

It is conceivable that cancer tissues are made up of not only pre-cancer (stem) cells and pre-metastatic cancer cells but also stromal cells (such as mesenchymal stem cells, cancer-associated fibroblasts and tumor endothelial cells) and immune cells (such as tumor-associated macrophages, dendritic cells and tumor-infiltrating T cells). These other cell types might serve as targets for drug discovery and immune-cell therapy against cancer. Therefore, an iPSC line derived from such cells might be useful for immune-cell therapies such as cancer vaccines, dendritic cells and tumor antigen-specific cytotoxic T cells, in addition to the development of models of carcinogenesis and drug discovery tools.

Terminology

Most single-nucleotide variants are heteroallelic genotypes that are validated with allelic depth and genotype quality and manually curated using the Integrative Genomics Viewer. Deeper allelic depth of next-generation sequencing further resolves genotype variations among the starting single cells present within heterogeneous cancer tissues. In this way, the genotypes of the iPSC lines may be used to trace the genomic identity of their starting single cells derived from a heterogeneous cancer tissue.
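To illustrate why deeper allelic depth helps resolve minor populations, the sketch below computes the probability of observing at least `k` variant reads at a given depth when the variant is present at a given read fraction. It is a simplified binomial model that assumes independently sampled reads and ignores sequencing error; the function name and the example values (a 1% variant fraction, a five-read threshold, depths of 250 and 5000) are hypothetical and not taken from the study.

```python
from math import comb


def prob_at_least(k: int, depth: int, fraction: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, fraction).

    Simplified model: reads are independent and sequencing error is ignored.
    """
    p_fewer = sum(
        comb(depth, i) * fraction**i * (1.0 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical scenario: a variant present in 1% of reads.
    for depth in (250, 5000):
        print(depth, prob_at_least(5, depth, 0.01))
```

Under these assumptions, the chance of collecting several supporting reads for a rare variant rises sharply with depth, which is the intuition behind confirming minor read sequences by ultra-deep sequencing.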

Peer-review

The manuscript is well written and easy to follow.
  53 in total

1.  Core transcriptional regulatory circuitry in human embryonic stem cells.

Authors:  Laurie A Boyer; Tong Ihn Lee; Megan F Cole; Sarah E Johnstone; Stuart S Levine; Jacob P Zucker; Matthew G Guenther; Roshan M Kumar; Heather L Murray; Richard G Jenner; David K Gifford; Douglas A Melton; Rudolf Jaenisch; Richard A Young
Journal:  Cell       Date:  2005-09-23       Impact factor: 41.582

2.  Patient-derived induced pluripotent stem cells in cancer research and precision oncology.

Authors:  Eirini P Papapetrou
Journal:  Nat Med       Date:  2016-12-06       Impact factor: 53.440

3.  Low incidence of DNA sequence variation in human induced pluripotent stem cells generated by nonintegrating plasmid expression.

Authors:  Linzhao Cheng; Nancy F Hansen; Ling Zhao; Yutao Du; Chunlin Zou; Frank X Donovan; Bin-Kuan Chou; Guangyu Zhou; Shijie Li; Sarah N Dowey; Zhaohui Ye; Settara C Chandrasekharappa; Huanming Yang; James C Mullikin; P Paul Liu
Journal:  Cell Stem Cell       Date:  2012-03-02       Impact factor: 24.633

Review 4.  Mesenchymal Cells in Colon Cancer.

Authors:  Vasiliki Koliaraki; Charles K Pallangyo; Florian R Greten; George Kollias
Journal:  Gastroenterology       Date:  2017-01-20       Impact factor: 22.682

5.  Influence of donor age on induced pluripotent stem cells.

Authors:  Valentina Lo Sardo; William Ferguson; Galina A Erikson; Eric J Topol; Kristin K Baldwin; Ali Torkamani
Journal:  Nat Biotechnol       Date:  2016-12-12       Impact factor: 54.908

6.  Generation of induced pluripotent stem cells without Myc from mouse and human fibroblasts.

Authors:  Masato Nakagawa; Michiyo Koyanagi; Koji Tanabe; Kazutoshi Takahashi; Tomoko Ichisaka; Takashi Aoi; Keisuke Okita; Yuji Mochiduki; Nanako Takizawa; Shinya Yamanaka
Journal:  Nat Biotechnol       Date:  2007-11-30       Impact factor: 54.908

7.  Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin.

Authors:  Iñigo Martincorena; Amit Roshan; Moritz Gerstung; Peter Ellis; Peter Van Loo; Stuart McLaren; David C Wedge; Anthony Fullam; Ludmil B Alexandrov; Jose M Tubio; Lucy Stebbings; Andrew Menzies; Sara Widaa; Michael R Stratton; Philip H Jones; Peter J Campbell
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

8.  Somatic coding mutations in human induced pluripotent stem cells.

Authors:  Athurva Gore; Zhe Li; Ho-Lim Fung; Jessica E Young; Suneet Agarwal; Jessica Antosiewicz-Bourget; Isabel Canto; Alessandra Giorgetti; Mason A Israel; Evangelos Kiskinis; Je-Hyuk Lee; Yuin-Han Loh; Philip D Manos; Nuria Montserrat; Athanasia D Panopoulos; Sergio Ruiz; Melissa L Wilbert; Junying Yu; Ewen F Kirkness; Juan Carlos Izpisua Belmonte; Derrick J Rossi; James A Thomson; Kevin Eggan; George Q Daley; Lawrence S B Goldstein; Kun Zhang
Journal:  Nature       Date:  2011-03-03       Impact factor: 49.962

Review 9.  Induced Pluripotent Stem Cell as a New Source for Cancer Immunotherapy.

Authors:  Farzaneh Rami; Halimeh Mollainezhad; Mansoor Salehi
Journal:  Genet Res Int       Date:  2016-02-25

10.  Tissue-specific mutation accumulation in human adult stem cells during life.

Authors:  Francis Blokzijl; Joep de Ligt; Myrthe Jager; Valentina Sasselli; Sophie Roerink; Nobuo Sasaki; Meritxell Huch; Sander Boymans; Ewart Kuijk; Pjotr Prins; Isaac J Nijman; Inigo Martincorena; Michal Mokry; Caroline L Wiegerinck; Sabine Middendorp; Toshiro Sato; Gerald Schwank; Edward E S Nieuwenhuis; Monique M A Verstegen; Luc J W van der Laan; Jeroen de Jonge; Jan N M IJzermans; Robert G Vries; Marc van de Wetering; Michael R Stratton; Hans Clevers; Edwin Cuppen; Ruben van Boxtel
Journal:  Nature       Date:  2016-10-03       Impact factor: 49.962

