Whole-Genome Sequence of the Metastatic PC3 and LNCaP Human Prostate Cancer Cell Lines.

Inge Seim1,2,3, Penny L Jeffery4,2,3, Patrick B Thomas4,2,3, Colleen C Nelson2, Lisa K Chopin4,2,3.   

Abstract

The bone metastasis-derived PC3 and the lymph node metastasis-derived LNCaP prostate cancer cell lines are widely studied, having been described in thousands of publications over the last four decades. Here, we report short-read whole-genome sequencing (WGS) and de novo assembly of PC3 (ATCC CRL-1435) and LNCaP (clone FGC; ATCC CRL-1740) at ∼70 × coverage. A known homozygous mutation in TP53 and homozygous loss of PTEN were robustly identified in the PC3 cell line, whereas the LNCaP cell line exhibited a larger number of putative inactivating somatic point and indel mutations (and in particular a loss of stop codon events). This study also provides preliminary evidence that loss of one or both copies of the tumor suppressor Capicua (CIC) contributes to primary tumor relapse and metastatic progression, potentially offering a treatment target for castration-resistant prostate cancer (CRPC). Our work provides a resource for genetic, genomic, and biological studies employing two commonly-used prostate cancer cell lines.
Copyright © 2017 Seim et al.

Keywords:  Genome Report; WGS; cell line; genomics; human; prostate cancer; sequencing

Year:  2017        PMID: 28413162      PMCID: PMC5473753          DOI: 10.1534/g3.117.039909

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Cultured cancer cell lines, such as the human-derived PC3 and LNCaP, are critical for prostate cancer research. The androgen-dependent LNCaP cell line (clone FGC) is derived from a lymph node metastasis (Horoszewicz 1980; Horoszewicz ), and the androgen-independent PC3 cell line is derived from a bone metastasis (Kaighn ). Since their development, almost 40 yr ago, they have emerged as major tools in prostate cancer research (with PubMed searches 15.01.17 for “PC3 AND prostate” and “LNCaP AND prostate” returning 3266 and 7080 hits, respectively). While these cell lines have been interrogated using array- and sequencing-based technologies (Liu ; Barretina ; Spans ; Klijn ), whole-genome sequences for the PC3 and LNCaP cell lines have not been published. Albeit currently relatively costly, WGS offers better coverage than exome sequencing, and improved detection of single nucleotide and small indel mutations and structural variants such as copy number alterations (Meynert ; Belkadi ; Warr ).

Materials and Methods

Cell lines

PC3 (ATCC CRL-1435) and LNCaP clone FGC (ATCC CRL-1740; hereafter termed LNCaP) prostate cancer cell lines were obtained from the American Type Culture Collection (ATCC, Rockville, MD), and maintained in Roswell Park Memorial Institute RPMI 1640 medium (Invitrogen, Carlsbad, CA) with 10% Fetal Calf Serum (Thermo Fisher Scientific, Waltham, MA), supplemented with 100 U/ml penicillin G and 100 ng/ml streptomycin (Invitrogen). All cell lines were passaged at 2–3-d intervals on reaching 70% confluency using TrypLE Select (Invitrogen). Cell morphology and viability were monitored by microscopic observation and regular Mycoplasma testing was performed (Universal Mycoplasma Detection Kit; ATCC).

Sequencing

DNA was extracted using a QIAamp DNA mini kit (QIAGEN, Hilden, Germany) from low passage (passage four) PC3 and LNCaP cell lines, cultivated from frozen stocks obtained directly from the ATCC. Sequencing was performed by Macrogen (Seoul, South Korea). Briefly, library preparation was performed using a TruSeq Nano DNA kit (Illumina, San Diego, CA) with a target insert size of 350 bp. Paired-end libraries (150 bp) were sequenced using a HiSeqX sequencer (Illumina). Base calls were converted into FASTQ files using bcl2fastq v2.15.0 and provided to our laboratory.

Normal prostate data

Raw data from normal human prostate samples were obtained from the National Institutes of Health’s Cancer Genome Atlas (The Cancer Genome Atlas Research Network 2015) (NCBI dbGaP: phs000178.v9.p8). These included a WGS sample (PCAWG.e22e63de-c436-43c0-a595-022622c1fe06) and three RNA-seq samples (120215-UNC10-SN254-0327-AC0CMCACXX-ACTTGA-L005, 120502-UNC14-SN744-0235-BD0YUTACXX-ACTTGA-L005, and 130221-UNC9-SN296-0338-BC1PYCACXX-TGACCA-L008). The WGS file (101 bp paired-end library; 950 M reads) was provided as an unaligned BAM (uBAM) file and converted to FASTQ files using bamUtil v1.0.14 (genome.sph.umich.edu/wiki/BamUtil).

Data processing

Raw reads (FASTQ) were trimmed using scythe v0.994 (github.com/vsbuffalo/scythe), with default settings, to remove low-quality bases and read pairs, and contaminating adapter sequences.

Mapping of genome reads to reference genome:

FASTQ files were mapped to human reference genome GRCh38 build 82 (the reference genome in all subsequent analyses) using BWA-MEM (Li 2013), as implemented in BWA v0.7.12-r1039, and a sorted BAM file was generated by SAMtools v1.3.1 (Li ). Genome coverage was estimated using QualiMap v2.2.1 (García-Alcalde ; Okonechnikov ).

de novo genome assembly:

PC3 and LNCaP genomes were assembled de novo using SGA v0.10.15 (Simpson and Durbin 2012) (available at github.com/jts/sga), as outlined in the user manual, except that reads were aligned to the assembled contigs using BWA-MEM (Li 2013) instead of the bundled Python script sga-align, which calls BWA sampe (command used: bwa mem -t $CPU final-contigs.fa $READ1 $READ2 | samtools view -F2304 -b -o reads.bam -). Resulting scaffolds were gap-filled using “sga gapfill” and error-corrected FASTQ reads. The genome assemblies (gap-filled scaffolds) were evaluated using QUAST v4.3 (Gurevich ) and the human reference genome. Genes of interest were interrogated in the assembled genomes using BLAST, via a local instance of SequenceServer v1.0.9 (Priyam ), and GMAP v2016-11-07 (a genomic mapping and alignment program for mRNA and EST sequences) (Wu and Watanabe 2005).
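The `-F2304` option in the command above asks samtools to drop any alignment that has one of the bits in 2304 set. As a minimal sketch (the bit values come from the SAM format specification; the function names are illustrative and not part of the authors' pipeline), the mask can be decoded as follows:

```python
# Bit values defined in the SAM format specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of all SAM flag bits set in `value`."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def passes_exclude_mask(read_flag, exclude_mask=2304):
    """Mimic `samtools view -F <mask>`: keep a read only if no mask bit is set."""
    return (read_flag & exclude_mask) == 0


print(decode_flag(2304))          # ['secondary', 'supplementary']
print(passes_exclude_mask(99))    # primary, properly paired alignment: True
print(passes_exclude_mask(355))   # same read flagged as secondary: False
```

In other words, only primary alignments are written to reads.bam, so each read contributes at most one placement on the contigs.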

Single nucleotide variant (SNV) and short indel calling:

SAMtools v1.3.1 mpileup and BCFtools (Li ) were used to interrogate indexed BAM files of reads aligned to the reference genome and generate a VCF (Variant Call Format) file of SNVs and short indel variants. Variants (likely to be common germline variants) present in HapMap (Gibbs ), 1000 Genomes phase 3 (2504 human genomes) (Sudmant ), and the National Heart, Lung, and Blood Institute’s Exome Sequencing Project (Tennessen ) (bundled variant data file available at https://goo.gl/mEogvD) were excluded. Next, variant files (VCF) were filtered using SnpSift (Cingolani ) with the following parameters: QUAL ≥ 200 && DP ≥ 30, where QUAL denotes the minimum variant confidence and DP the total depth threshold. Filtered variants were annotated using SnpEff v4.3g (Cingolani ).
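The filtering rule (QUAL ≥ 200 and DP ≥ 30) can be illustrated with a small, dependency-free sketch. This is not SnpSift itself: the parsing is deliberately simplified, the function names are hypothetical, and the three VCF records are made-up examples rather than calls from PC3 or LNCaP.

```python
def parse_info(info):
    """Split a VCF INFO column such as "DP=45;MQ=60;INDEL" into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-only entries such as INDEL map to ""
    return fields


def passes_filter(vcf_line, min_qual=200.0, min_dp=30):
    """Keep a record only if QUAL >= min_qual and INFO/DP >= min_dp."""
    cols = vcf_line.rstrip("\n").split("\t")
    qual, info = cols[5], parse_info(cols[7])
    if qual == "." or not info.get("DP"):
        return False  # missing values cannot satisfy the thresholds
    return float(qual) >= min_qual and int(info["DP"]) >= min_dp


# Hypothetical records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
records = [
    "chr17\t100\t.\tA\tG\t228\t.\tDP=45;MQ=60",
    "chr17\t200\t.\tC\tT\t150\t.\tDP=80;MQ=60",
    "chr10\t300\t.\tG\tGA\t222\t.\tINDEL;DP=12;MQ=60",
]
kept = [r for r in records if passes_filter(r)]
print(len(kept))  # only the first record meets both thresholds
```

Both conditions must hold, so a confidently called variant at low depth (the third record) is discarded just like a deep but low-confidence call (the second).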

Copy number variation (CNV) calling:

To screen PC3 and LNCaP genomes for CNV, we employed the R package “cn.mops” (Copy Number estimation by a Mixture Of PoissonS) (Klambauer ). Briefly, paired-end genome reads from PC3 and LNCaP were aligned to the reference genome and compared with normal prostate reads to obtain genome-wide read-depth profiles. Custom R scripts were used to parse the output.

Gene expression potential analysis:

We interrogated publicly available transcriptome data from PC3 (Wang ) (NCBI GEO: GSE65112) and LNCaP cells (Metzger ) (NCBI GEO: GSE64529). In addition, transcriptome data from normal prostate samples were obtained from TCGA (see above). Briefly, paired-end reads were trimmed using scythe, and aligned to human reference genome GRCh38 build 82 using the spliced-read mapper TopHat (v2.0.9) (Kim ) and reference gene annotations to guide the alignment. Raw gene counts were computed from the generated BAM files by featureCounts v1.4.5-p1 (Liao ), counting exon features of the gene annotation file (gtf) in order to include noncoding RNA genes. FeatureCounts output files were analyzed using the R programming language (v.3.2.2) (R Core Team 2013). Raw counts were normalized by Trimmed Mean of M-values (TMM) correction (Robinson and Oshlack 2010; Robinson ). The expression of genes in normal prostate, LNCaP, and PC3 was assessed using the Universal exPression Code (UPC) method (Piccolo ), available in the R package “SCAN.UPC”. This method estimates the active/inactive state of genes in a sample, where a UPC value > 0.5 indicates that a gene is actively transcribed.
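The UPC cut-off described above amounts to a simple threshold rule. The sketch below is only an illustration of that rule, not the SCAN.UPC implementation; the function names are hypothetical, and the example values are the UPC scores listed for these genes in Table 3.

```python
UPC_THRESHOLD = 0.5


def is_active(upc, threshold=UPC_THRESHOLD):
    """A gene is called actively transcribed when its UPC value exceeds 0.5."""
    return upc > threshold


def lost_expression(upc_by_sample, sample, reference="NP"):
    """True if the gene is active in the reference tissue but inactive in `sample`."""
    return is_active(upc_by_sample[reference]) and not is_active(upc_by_sample[sample])


# UPC values as listed in Table 3 (NP = normal prostate).
table3_upc = {
    "CIC": {"NP": 1.0, "LNCaP": 1.0, "PC3": 0.0},
    "PTEN": {"NP": 1.0, "LNCaP": 1.0, "PC3": 0.0},
    "PAFAH1B3": {"NP": 1.0, "LNCaP": 0.7, "PC3": 0.0},
    "RNLS": {"NP": 0.6, "LNCaP": 0.0, "PC3": 0.0},
}

lost_in_pc3 = sorted(g for g, v in table3_upc.items() if lost_expression(v, "PC3"))
lost_in_lncap = sorted(g for g, v in table3_upc.items() if lost_expression(v, "LNCaP"))
print(lost_in_pc3)
print(lost_in_lncap)
```

Combining the threshold with a normal-tissue reference separates genes that are silent everywhere from genes whose expression is specifically absent in a cell line.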

cBioPortal analysis:

Data on copy number alterations in prostate cancer tumor tissue were obtained using the cBioPortal tool (www.cbioportal.org) (Cerami ; Gao ) with the following parameters: “GENE: HOMDEL HETLOSS;”, where “GENE” denotes a gene symbol. Clinical information was also downloaded and the data further analyzed using custom R scripts. The following cBioPortal prostate cancer data sets were interrogated: ‘NEPC (Trento/Cornell/Broad 2016)’ (34 primary and 73 metastatic tumors) (Beltran ), ‘Prostate (Broad/Cornell 2013)’ (55 primary tumors and 1 metastatic tumor) (Baca ), ‘Prostate (FHCRC, 2016)’ (19 primary and 130 metastatic tumors) (Kumar ), ‘Prostate (MICH)’ (11 primary and 47 metastatic tumors) (Grasso ), ‘Prostate (MSKCC 2010)’ (157 primary and 37 metastatic tumors) (Taylor ), ‘Prostate (MSKCC 2014)’ (101 primary and 3 metastatic tumors) (Hieronymus ), ‘Prostate (SU2C)’ (150 metastatic tumors) (Robinson ), and ‘Prostate (TCGA)’ (492 primary tumors and 1 metastatic tumor) (The Cancer Genome Atlas Research Network 2015).
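As a quick consistency check, the cohort sizes listed above can be tallied, and the query string for each gene of interest assembled, with a few lines of Python. This is an illustrative sketch only; `build_query` is a hypothetical helper that simply reproduces the “GENE: HOMDEL HETLOSS;” pattern quoted above, and no cBioPortal API is called.

```python
# (primary, metastatic) tumor counts per cBioPortal data set, as listed in the text.
datasets = {
    "NEPC (Trento/Cornell/Broad 2016)": (34, 73),
    "Prostate (Broad/Cornell 2013)": (55, 1),
    "Prostate (FHCRC, 2016)": (19, 130),
    "Prostate (MICH)": (11, 47),
    "Prostate (MSKCC 2010)": (157, 37),
    "Prostate (MSKCC 2014)": (101, 3),
    "Prostate (SU2C)": (0, 150),
    "Prostate (TCGA)": (492, 1),
}


def build_query(gene):
    """Format a gene symbol using the query pattern quoted in the text."""
    return f"{gene}: HOMDEL HETLOSS;"


n_primary = sum(primary for primary, _ in datasets.values())
n_metastatic = sum(metastatic for _, metastatic in datasets.values())

print(n_primary, n_metastatic, n_primary + n_metastatic)  # 869 442 1311
print(build_query("CIC"))  # CIC: HOMDEL HETLOSS;
```

The total of 1311 tumors matches the number of samples examined across the eight data sets in the CIC analysis.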

Kaplan–Meier survival analysis:

Kaplan–Meier survival analysis was performed to compare disease-free survival (DFS) in patient groups stratified by CNVs. DFS is defined as the time to either recurrence or relapse, second cancer, or death (Gill and Sargent 2006). In the context of prostate cancer, DFS is a suitable surrogate for overall survival (OS), given that metastatic disease is not curable and recurrence of disease would be expected to contribute significantly to mortality. The ‘Prostate (MSKCC 2010)’, ‘Prostate (MSKCC 2014)’, and ‘Prostate (TCGA)’ cBioPortal data sets were interrogated. Kaplan–Meier survival analysis (Rich ) was performed using the R package “survival” (Therneau 2013), fitting survival curves (survfit) and computing log-rank P-values using the survdiff function, with ρ = 0 (equivalent to the method employed by UCSC Xena; see goo.gl/4knf62). Survival curves were plotted where survival was significantly different between two groups (log-rank P ≤ 0.05). Groups with <10 samples with recorded events were considered unreliable (Mallett ).
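For readers unfamiliar with the underlying calculations, the following is a minimal pure-Python sketch of a Kaplan–Meier estimator and a two-group log-rank test with equal weights (the ρ = 0 case). It is not the R “survival” code used in the study; the function names and the toy follow-up times are hypothetical, and an event indicator of 1 marks relapse while 0 marks censoring.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored samples both leave the risk set
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test with equal weights; returns (chi2, p) for 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p_value


# Toy data: one group relapses early, the other late.
early = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
late = ([6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
chi2, p = logrank(early[0], early[1], late[0], late[1])
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
print(p <= 0.05)
```

With real cohorts, the same comparison would be made between, for example, patients with and without a deletion, and a curve would be plotted only when the log-rank P-value is at or below 0.05.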

Gene ontology term enrichment analysis

Gene enrichment analyses were performed using DAVID (Database for Annotation, Visualization and Integrated Discovery) (Huang ). All gene groups are potentially informative, despite lower rankings, and serve to guide biological interpretation (Huang ).

Data availability

The genome reads reported in this paper have been deposited in the BioProject database as PRJNA361315 (PC3) and PRJNA361316 (LNCaP). Code used to generate the data and CNV analysis output files (tabulated text files) are available at github.com/sciseim/PCaWGS. Genome assemblies (FASTA format) (Seim 2017a,b), and filtered and annotated single-nucleotide and indel variation data files (VCF format) (Seim 2017c), have been deposited at Zenodo. A BLAST server is available at http://ghrelinlab.org.

Results and Discussion

WGS

PC3 and LNCaP prostate cancer cells were obtained directly from ATCC, cultured for four passages, and 150 bp paired-end reads obtained using an Illumina HiSeqX sequencer. Following read trimming, 1.53 billion reads from PC3 were retained, of which 99.9% could be aligned to the Ensembl GRCh38.82 human reference genome at ∼71 × mean coverage (Figure 1A). Similarly, we obtained 1.49 billion trimmed reads from LNCaP with a 99.9% alignment rate and mean coverage of ∼69 × (Figure 1B).
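As a rough sanity check on these figures, expected coverage can be approximated as the number of reads multiplied by the read length and divided by the genome size. The sketch below assumes a genome size of about 3.1 Gb and untrimmed 150 bp reads (both are assumptions made here for illustration), so it gives an upper bound rather than the QualiMap estimate reported above.

```python
def expected_coverage(n_reads, read_length_bp, genome_size_bp):
    """Back-of-the-envelope mean coverage: total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 3.1e9  # assumed approximate size of the human reference genome

pc3 = expected_coverage(1.53e9, 150, GENOME_SIZE_BP)
lncap = expected_coverage(1.49e9, 150, GENOME_SIZE_BP)
print(round(pc3, 1), round(lncap, 1))  # 74.0 72.1
```

The slightly lower reported values (∼71 × and ∼69 ×) are what one would expect once trimmed bases and unaligned or filtered reads are taken into account.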
Figure 1

Read-depth across chromosomes in the (A) PC3 and (B) LNCaP prostate cancer cell lines. The red dotted line indicates mean genome-wide sequencing coverage (X).

We also performed de novo genome assembly to allow characterization of whole-gene loci by BLAST and other mappers. The final, gap-filled PC3 genome assembly consisted of 1.66 M scaffolds (largest scaffold 692.4 kb) with an N50 of 23.3 kb and an NG50 (the N50 computed against the length of the reference genome rather than the total assembly length) of 22.4 kb. The LNCaP assembly consisted of 1.70 M scaffolds (largest scaffold 536.0 kb) with an N50 of 44.4 kb and an NG50 of 45.0 kb.
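The two contiguity statistics differ only in the length against which the running total is compared: N50 uses half of the assembly length, whereas NG50, as usually defined (for example, in QUAST), uses half of the reference genome length. A minimal sketch with hypothetical scaffold lengths:

```python
def n50_like(lengths, genome_size=None):
    """Return N50, or NG50 when `genome_size` is given.

    Scaffolds are sorted from longest to shortest and summed until the running
    total reaches half of the assembly length (N50) or half of the supplied
    genome size (NG50). Returns 0 if that point is never reached.
    """
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0


scaffolds = [80, 70, 50, 40, 30, 20]  # hypothetical lengths (kb), total 290
print(n50_like(scaffolds))                   # 70
print(n50_like(scaffolds, genome_size=400))  # 50
```

Because the reference length is fixed, NG50 values are directly comparable between assemblies of different total size, which is not true of N50.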

Single-nucleotide and indel variation

After excluding common germline sequence variants (SNVs and short indels), filtering by SnpSift (Cingolani ), and annotation by SnpEff (Cingolani ), we identified 0.94 M sequence variants (SNVs and short indels) that were private to PC3 and 0.56 M that were private to LNCaP (Table 1). As expected, the majority of variants were found in noncoding regions.
Table 1

SNV and indel variant-calling statistics of the prostate cancer cell lines PC3 and LNCaP

Variant type | PC3 Private | LNCaP Private | Shared
Number and percentage of variants by type
 SNVs | 318,380 (34.0%) | 404,282 (72.1%) | 166,912 (65.0%)
 Indels | 618,149 (66.0%) | 156,182 (27.9%) | 89,919 (35.0%)
Number of events by type
 3′-UTR15,57210,5003938
 5′-UTR premature start codon21128949
 5′-UTR26131868692
 Conservative_inframe_deletion39227
 Conservative_inframe_insertion46811972
 Disruptive_inframe_deletion62197
 Disruptive_inframe_insertion1726444
 Downstream_gene107,76156,72829,770
 Frameshift27616744
 Intergenic_region563,630326,261175,014
 Intron916,272576,268191,182
 Missense352057171667
 Non_coding_transcript_exon584838462091
 Non_coding_transcript18101
 Protein_protein_contact120175
 Sequence_feature797859301457
 Splice_acceptor8013121
 Splice_donor5613817
 Splice_region13131174437
 Start_lost16143
 Stop_gained5837829
 Stop_lost2549
 Structural_interaction11608081
 Synonymous2402272781
 Upstream_gene107,28157,4471301

Common germline variants were excluded and variants were further filtered using SnpSift, with a total depth threshold of 30 (DP ≥ 30) and a minimum variant confidence of 200 (QUAL ≥ 200), and annotated by SnpEff. SNV, single nucleotide variant; UTR, untranslated region.
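The totals and percentages in the first block of Table 1 follow directly from the SNV and indel counts. The short sketch below recomputes them; the counts are those given in Table 1, and the variable names are illustrative.

```python
# SNV and indel counts from Table 1.
counts = {
    "PC3 private": {"SNVs": 318380, "Indels": 618149},
    "LNCaP private": {"SNVs": 404282, "Indels": 156182},
    "Shared": {"SNVs": 166912, "Indels": 89919},
}

summary = {}
for group, by_type in counts.items():
    total = sum(by_type.values())
    snv_pct = round(100.0 * by_type["SNVs"] / total, 1)
    summary[group] = (total, snv_pct)
    print(f"{group}: {total:,} variants, {snv_pct}% SNVs")
```

This gives 936,529 (about 0.94 M) PC3-private, 560,464 (about 0.56 M) LNCaP-private, and 256,831 (about 0.26 M) shared variants, with SNV fractions of 34.0%, 72.1%, and 65.0%, respectively.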

In particular, we noted that LNCaP had a larger number of stop_gained events, which are changes predicted to confer nonsense mutations and result in nonfunctional proteins or proteins with reduced function (Table 1). In LNCaP, SNVs and indel variants contributed 378 stop_gained events in 209 genes. We next identified biological processes overrepresented in this gene set (Table 2). This included a C→T transition at amino acid position 318 of menin (MEN1) (c.T954A in NCBI RefSeq NM_000244). Somatic inactivating mutations of menin are found in endocrine cancers (Falchetti ), suggesting that MEN1 is a tumor suppressor gene. However, it has recently been reported that MEN1 is an oncogene in prostate cancer. Menin interacts with the androgen receptor and patients with overexpression of MEN1 show poor OS (Malik ). The MEN1 SNV is present in an LNCaP sample interrogated by whole-exome sequencing (Taylor ). Therefore, it is not likely to be a sequencing or data processing artifact. The functional regions of menin are currently not known; thus, the effect of the LNCaP premature stop codon event cannot be predicted. Interrogation of eight cBioPortal data sets suggests that inactivating mutations in the coding sequence of MEN1 in prostate cancer are unique to LNCaP (data not shown); however, it is possible that distinct patient populations possess this variant (e.g., see Lindquist ).
Table 2

Significantly overrepresented biological processes associated with sequence variants contributing stop_gained events in the PC3 and LNCaP prostate cancer cell lines

GO Term | Genes | Fisher’s Exact P
PC3 private sequence variants
 Detection of bacterium | HLA-DRB1, HLA-A, HLA-DRB5, HLA-B | 5.1E−10
 Antigen processing and presentation | HLA-DRB1, HLA-A, HLA-DRB5, HLA-B | 2.4E−07
 Interferon-γ-mediated signaling pathway | HLA-DRB1, HLA-A, HLA-DRB5, HLA-B | 6.7E−07
 Immune response | HLA-DRB1, HLA-A, HLA-DRB5, HLA-B, IL1A | 4.7E−05
 Antigen processing and presentation of endogenous peptide antigen via MHC class I via ER pathway, TAP-independent | HLA-A, HLA-B | 2.9E−06
 Regulation of interleukin-10 secretion | HLA-DRB1, HLA-DRB5 | 2.9E−06
 Regulation of interleukin-4 production | HLA-DRB1, HLA-DRB5 | 5.8E−06
 Protection from natural killer cell mediated cytotoxicity | HLA-A, HLA-B | 9.6E−06
 Immunoglobulin production involved in immunoglobulin mediated immune response | HLA-DRB1, HLA-DRB5 | 9.6E−06
 Humoral immune response mediated by circulating immunoglobulin | HLA-DRB1, HLA-DRB5 | 2E−05
 Antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-independent | HLA-A, HLA-B | 3.5E−05
 T-helper 1-type immune response | HLA-DRB1, HLA-DRB5 | 6.3E−05
 Positive regulation of T cell mediated cytotoxicity | HLA-A, HLA-B | 7.5E−05
 Inflammatory response to antigenic stimulus | HLA-DRB1, HLA-DRB5 | 0.0001
 Antigen processing and presentation of peptide or polysaccharide antigen via MHC class II | HLA-DRB1, HLA-DRB5 | 0.00013
 Negative regulation of interferon-γ production | HLA-DRB1, HLA-DRB5 | 0.00036
 Positive regulation of insulin secretion involved in cellular response to glucose stimulus | HLA-DRB1, HLA-DRB5 | 0.00039
 Antigen processing and presentation of peptide antigen via MHC class I | HLA-A, HLA-B | 0.00041
 Negative regulation of T cell proliferation | HLA-DRB1, HLA-DRB5 | 0.00063
 Protein tetramerization | HLA-DRB1, HLA-DRB5 | 0.00074
 Antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent | HLA-A, HLA-B | 0.0018
 Type I interferon signaling pathway | HLA-A, HLA-B | 0.0019
 T cell costimulation | HLA-DRB1, HLA-DRB5 | 0.0028
 Antigen processing and presentation of exogenous peptide antigen via MHC class II | HLA-DRB1, HLA-DRB5 | 0.0038
 Protein deubiquitination | USP17L18, USP17L11 | 0.0044
LNCaP private sequence variants
 Bundle of His cell to Purkinje myocyte communication | GPR155, GNAS, CBR3, CHL1 | 0.00012
 Response to nitrosative stress | PRKCQ, CD3E, NLRP3 | 0.00068
 Cognition | FPGT-TNNI3K, TNNI3K | 0.0008
 Interleukin-1 β production | MEN1, NTRK3, PAX6, PRKDC | 0.0011
 Positive regulation of interleukin-4 production | LAMA2, PRKCQ, ROBO1, PAX6, SPTBN1, CHL1 | 0.0012
 Humoral immune response mediated by circulating immunoglobulin | GCLC, DUSP6 | 0.0023
 Type B pancreatic cell differentiation | NTRK3, AP1B1, FREM2, ROBO1, PRKDC, TBC1D32 | 0.004
 Negative regulation of protein phosphorylation | LRP1, STAB1, VTN, SSC4D, AMN, DMBT1 | 0.0042
 Axon guidance | NLRP3, CASP1 | 0.0076
 Heart development | LAMA2, FREM2, ROBO1, STAB1, ITGB4, VTN, CERCAM, COL16A1, CHL1, AOC3 | 0.014
 Receptor-mediated endocytosis | EXO1, HLA-DQB1 | 0.016
 Cell adhesion | MEN1, PAX6 | 0.027
Shared sequence variants (PC3 and LNCaP)
 Sensory perception of taste | TAS2R43, TAS2R31 | 0.00048
 Detection of chemical stimulus involved in sensory perception of bitter taste | TAS2R43, TAS2R31 | 0.00092
 O-glycan processing | MUC3A, MUC6 | 0.0021
 Digestion | PRSS2, PRSS1 | 0.0023
 Extracellular matrix disassembly | PRSS2, PRSS1 | 0.0033

Stop_gained events denote changes predicted to confer nonsense mutations and result in nonfunctional proteins or proteins with reduced function. Gene enrichment analysis was performed using DAVID (Database for Annotation, Visualization and Integrated Discovery). MHC, major histocompatibility complex; ER, endoplasmic reticulum; TAP, transporter associated with antigen processing.

In PC3, 58 stop_gained events (Table 1) in 20 genes (AHNAK2, DNAH6, FAT3, GOLGA6L3, GOLGA6L9, HLA-A, HLA-B, HLA-DRB1, HLA-DRB5, HOXA9, IL1A, ITPR2, MEGF6, MUC19, OR8K3, PRPF3, PRSS1, PTPRD, USP17L11, and USP17L18) were observed. There was a significant enrichment for HLA class II antigen presenting genes associated with the immune response (Table 2; Fisher’s exact P ≤ 0.05). It has recently been shown that the PC3, LNCaP, and DuPro (but not the DU145) prostate cancer cell lines and prostate cancer tissues express HLA class II molecules (Younger ; Doonan and Haque 2015). However, we could not identify any prostate cancer patients with stop_gained events in these genes using the cBioPortal tool (Cerami ; Gao ) (data not shown). While evasion of the antitumor immune response is an emerging research area (Drake 2010; Corrales ), caution should be exercised when considering the use of PC3 cells in these studies. Sequence variant analysis and interrogation of the PC3 de novo genome assembly by BLAST and GMAP confirmed that the tumor suppressor p53 (TP53) is inactivated by a single frameshift event (p.A138fs; indel; c.*4955A in NCBI RefSeq NM_000546) (Carroll ) (Figure 2A).
Table 3

Putative deleted genes and their expression in the LNCaP and PC3 prostate cancer cell lines

Symbol | Description | CNV Region | Gene Start | Gene End | NP UPC | LNCaP UPC | PC3 UPC
LNCaP
 PWRN1 | Prader-Willi region nonprotein coding RNA 1 | 15:24430001–24520000 | 24493137 | 24652130 | 0.0 | 0.0 | 0.0
PC3
 ATP6V0A1 | ATPase H+ transporting V0 subunit a1 | 17:42110001–42520000 | 42458844 | 42522611 | 1 | 1 | 0
 CDH18 | Cadherin 18 | 5:19850001–19960000 | 19472951 | 20575873 | 0 | 0 | 0
 CIC | Capicua transcriptional repressor | 19:42280001–42320000 | 42268537 | 42295797 | 1 | 1 | 0
 CTNNA1 | Catenin α 1 | 5:138770001–138980000 | 138610967 | 138935034 | 1 | 1 | 0
 DDX3Y | DEAD-box helicase 3, Y-linked | Y:11530001–16450000 | 12904108 | 12920478 | 1 | 1 | 0
 DHX58 | DExH-box helicase 58 | 17:42110001–42520000 | 42101404 | 42112733 | 1 | 0 | 0
 DYDC1 | DPY30 domain containing 1 | 10:80250001–80560000 | 80336105 | 80356755 | 0 | 0 | 0
 DYDC2 | DPY30 domain containing 2 | 10:80250001–80560000 | 80344745 | 80368073 | 0 | 0 | 0
 FAM213A | Family with sequence similarity 213 member A | 10:80250001–80560000 | 80407829 | 80437115 | 1 | 0 | 0
 GHDC | GH3 domain containing | 17:42110001–42520000 | 42188799 | 42194532 | 1 | 1 | 0
 HCRT | Hypocretin neuropeptide precursor | 17:42110001–42520000 | 42184060 | 42185452 | 0 | 0 | 0
 HSPB9 | Heat shock protein family B (small) member 9 | 17:42110001–42520000 | 42121431 | 42123352 | 0 | 0 | 0
 KAT2A | Lysine acetyltransferase 2A | 17:42110001–42520000 | 42113108 | 42121358 | 1 | 1 | 0
 KCNH4 | Potassium voltage-gated channel subfamily H member 4 | 17:42110001–42520000 | 42156891 | 42181278 | 0 | 0 | 0
 LIPJ | Lipase family member J | 10:87910001–88750000 | 88586753 | 88606976 | 0 | 0 | 0
 LRRTM2 | Leucine rich repeat transmembrane neuronal 2 | 5:138770001–138980000 | 138868923 | 138875368 | 0 | 0 | 0
 MAT1A | Methionine adenosyltransferase 1A | 10:80250001–80560000 | 80271820 | 80289684 | 0 | 0 | 0
 MIR548AT | MicroRNA 548at | 17:42110001–42520000 | 42494773 | 42494830 | 0 | 0 | 0
 NLGN4Y | Neuroligin 4, Y-linked | Y:11530001–16450000 | 14522638 | 14845650 | 1 | 0.2 | 0
 PAFAH1B3 | Platelet activating factor acetylhydrolase 1b catalytic subunit 3 | 19:42280001–42320000 | 42297033 | 42303546 | 1 | 0.7 | 0
 PRR19 | Proline rich 19 | 19:42280001–42320000 | 42302098 | 42310821 | 0 | 0 | 0
 PTEN | Phosphatase and tensin homolog | 10:87910001–88750000 | 87863113 | 87971930 | 1 | 1 | 0
 PTRF | Polymerase I and transcript release factor | 17:42110001–42520000 | 42402452 | 42423517 | 1 | 0 | 0
 RAB5C | RAB5C, member RAS oncogene family | 17:42110001–42520000 | 42124976 | 42155044 | 1 | 1 | 0
 RNLS | Renalase, FAD dependent amine oxidase | 10:87910001–88750000 | 88273864 | 88584530 | 0.6 | 0 | 0
 SH2D4B | SH2 domain containing 4B | 10:80250001–80560000 | 80537902 | 80646560 | 0 | 0 | 0
 SIL1 | SIL1 nucleotide exchange factor | 5:138770001–138980000 | 138946720 | 139293557 | 1 | 1 | 0
 SIRPB1 | Signal regulatory protein β 1 | 20:1580001–1620000 | 1563521 | 1620061 | 0 | 0 | 0
 STAT3 | Signal transducer and activator of transcription 3 | 17:42110001–42520000 | 42313324 | 42388568 | 1 | 1 | 0
 STAT5A | Signal transducer and activator of transcription 5A | 17:42110001–42520000 | 42287547 | 42311943 | 1 | 0 | 0
 STAT5B | Signal transducer and activator of transcription 5B | 17:42110001–42520000 | 42199168 | 42276707 | 1 | 1 | 0
 TMEM145 | Transmembrane protein 145 | 19:42280001–42320000 | 42313325 | 42325062 | 0 | 0 | 0
 TMSB4Y | Thymosin β 4, Y-linked | Y:11530001–16450000 | 13703567 | 13706024 | 0 | 0 | 0
 TSPAN14 | Tetraspanin 14 | 10:80250001–80560000 | 80454166 | 80533123 | 1 | 1 | 0
 TSPY1 | Testis-specific protein, Y-linked 1 | Y:9450001–10200000 | 9466955 | 9490081 | 0 | 0 | 0
 TTTY13 | Testis-specific transcript, Y-linked 13 (nonprotein coding) | Y:21420001–21630000 | 21583600 | 21594666 | 0 | 0 | 0
 TTTY15 | Testis-specific transcript, Y-linked 15 (nonprotein coding) | Y:11530001–16450000 | 12662334 | 12692233 | 1 | 1 | 0
 USP9Y | Ubiquitin specific peptidase 9, Y-linked | Y:11530001–16450000 | 12701231 | 12860839 | 1 | 1 | 0
 UTY | Ubiquitously transcribed tetratricopeptide repeat containing, Y-linked | Y:11530001–16450000 | 13248379 | 13480673 | 1 | 1 | 0

CNV regions are listed for putative homozygous deletion events (CNV = 0). UPC refers to Universal exPression Code score, where UPC value > 0.5 indicates that a gene is actively transcribed. CNV, copy number variation; NP, normal prostate tissue; RAB5C, Ras-related protein Rab-5C; FAD, flavin adenine dinucleotide.
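Whether a CNV0 region removes a gene completely or only in part can be read off the coordinates in Table 3. The following sketch computes the overlap between a gene and its CNV region; the function names are hypothetical, and the coordinates are those listed in Table 3 for two of the chromosome 19 genes.

```python
def overlap_length(gene_start, gene_end, region_start, region_end):
    """Number of bases shared by two closed (inclusive) intervals."""
    return max(0, min(gene_end, region_end) - max(gene_start, region_start) + 1)


def deleted_fraction(gene_start, gene_end, region_start, region_end):
    """Fraction of the gene body that falls inside the CNV region."""
    gene_length = gene_end - gene_start + 1
    return overlap_length(gene_start, gene_end, region_start, region_end) / gene_length


# Chromosome 19 CNV0 region in PC3 and two of the genes it spans (Table 3).
region = (42280001, 42320000)
genes = {
    "CIC": (42268537, 42295797),
    "PRR19": (42302098, 42310821),
}

for symbol, (start, end) in genes.items():
    print(symbol, round(deleted_fraction(start, end, *region), 2))
# CIC 0.58
# PRR19 1.0
```

On these coordinates, PRR19 lies entirely within the deleted region, whereas a little over half of the CIC gene body does, which illustrates the distinction between complete and partial gene deletion.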

Figure 2

Overview of a p53 (TP53) sequence variant in the PC3 prostate cancer cell line. (A) Genome browser display showing an indel event in the PC3 p53 gene (TP53). (B) Sequence alignment of TP53 in the PC3 genome and the reference genome assembly (Ensembl GRCh38 build 82). An indel is indicated in red. (C) Sequence alignment of TP53 protein products encoded by PC3 and the reference transcript NM_000546. An indel results in a frameshift (p.A138fs) and a truncated protein in PC3. chr, chromosome; SNV, single nucleotide variation.

PC3 shared 0.26 M sequence variants (166,912 SNVs and 89,919 indels) with LNCaP, and 29 of these constituted stop_gained events (Table 1). Overrepresented biological processes in PC3 and LNCaP included “O-glycan processing” (the mucins MUC3A and MUC6) and “extracellular matrix disassembly” (the trypsinogens PRSS1 and PRSS2) (Table 2). Interestingly, while we have identified MUC3A stop_gained events in PC3 and LNCaP, cell lines generated from Caucasian patients, a recent study suggests that MUC3A protein-changing variants are rare in Caucasians and predominant in African Americans, the subpopulation with the highest prevalence of prostate cancer, where MUC3A changes are observed in 88% of patients (Lindquist ). Taken together, these data indicate that protein-coding genes in LNCaP are perturbed extensively by point and indel mutations. Even after filtering steps, our LNCaP data (at passage four from the ATCC stock) reveal a clear difference in the number of particular variant events compared to PC3.
However, previous exome sequencing work suggests that the genome of the parental LNCaP strain sequenced here (clone FGC) and its derived strains are inherently unstable (Spans , 2014), and this could give rise to the apparently high mutation rate in protein-coding sequences. As with studies of the HeLa genome (Adey ; Landry ), further genome sequencing efforts are warranted to investigate whether the variants reported here are somatic mutations found in particular LNCaP strains, or if they represent preexisting subpopulations within the parental LNCaP strain. In the future, single-cell WGS is likely to resolve this issue. Nevertheless, LNCaP and PC3 appear to have distinct SNV and indel profiles.

Putative gene loss

Most human cancers have CNVs, which impact upon gene dosage through loss or gain of whole chromosomes or chromosome segments (Hanahan and Weinberg 2011). Previous studies have described CNVs in PC3 and LNCaP using targeted techniques, such as exome sequencing. However, WGS, together with continuously updated gene annotations, offers improved detection of copy number changes (Meynert ; Belkadi ; Warr ). CNVs were identified using the R package cn.mops (Klambauer ). In particular, we wished to identify genes that are lost in PC3 and LNCaP. The absence of this information can misinform even the most well-designed in vitro or cell line xenograft experiment (e.g., where a gene in an important pathway is lost). In the context of CNV analysis, we were interested in identifying putative homozygous deletions (CNV = 0; CNV0 events), i.e., genes that are inactivated by partial or complete gene deletion. To inform this analysis, we also considered the transcriptional potential of each gene by analyzing publicly available transcriptome (RNA-seq) data from normal prostate, LNCaP, and PC3. Genes with a UPC value of < 0.5 were considered inactive (Piccolo ). Although a large number of SNVs and indel variations were observed in LNCaP, only a single homozygous deletion event (CNV0) was observed in this cell line. In the complex Prader-Willi gene locus there was a putative loss of PWRN1, a gene associated with epigenetic reprogramming during spermatogenesis (Wawrzik ) (Table 3). In contrast to LNCaP, 39 CNV0 events were found in PC3 (Table 3). CNV of the Y chromosome was evident from the PC3 sequence coverage (Figure 1A). In agreement with previous studies employing cDNA microarrays (Clark ) and multicolor fluorescence in situ hybridization (Aurich-Costa ), our CNV analysis revealed that large regions of the Y chromosome (including eight genes) were deleted in PC3 (Table 3).
Several genes on chromosome 5 (CDH18, CTNNA1, LRRTM2, and SIL1), chromosome 10 (DYDC1, DYDC2, FAM213A, LIPJ, MAT1A, PTEN, RNLS, SH2D4B, and TSPAN14), and chromosome 17 (ATP6V0A1, DHX58, GHDC, HCRT, HSPB9, KAT2A, KCNH4, MIR548AT, PTRF, RAB5C, STAT3, STAT5A, and STAT5B) have also previously been reported to be deleted in PC3 (Liu ; Krohn ; The Cancer Genome Atlas Research Network 2015; Ibeawuchi ). Clinical observations and experimental studies indicate that the growth hormone receptor (GHR) mediates the development and progression of cancer (Brooks and Waters 2010), and GHR expression is elevated in prostate cancer cell lines and tissues (Chopin ; Weiss-Messer ). Interestingly, we noted that the genes encoding the classical growth hormone receptor signaling molecules STAT3 (STAT3) and STAT5 (STAT5A and STAT5B) were lost in PC3 cells. Thus, autocrine GHR actions are likely to be associated with alternative signaling pathways (Barclay ) in PC3. Loss of STAT3 in PC3 has been firmly established experimentally (Yuan ; Pencik ), and there is evidence to suggest that STAT3 suppresses prostate cancer metastasis and confers a good prognosis (Pencik ). We identified a homozygous deletion event spanning four genes (CIC, PAFAH1B3, PRR19, and TMEM145) on chromosome 19 in PC3 (Figure 3A). In LNCaP, a genome coverage plot of reads flanking this region revealed a putative heterozygous event (CNV1; loss of a single copy of the same genes) (Figure 3B). Of these four genes, the mammalian homolog of Drosophila CIC (Jiménez ) is particularly interesting. Capicua is a transcriptional repressor of cancer metastasis in a number of cancers (Choi ; Okimoto ). Recent WGS data also suggests that CIC is lost in PC3 cells (Iorio ). Homozygous deletions of CIC have been reported in neuroblastoma (Nagaishi ; Fransson ), and a homozygous deletion of CIC in a subpopulation of H1975 human nonsmall cell lung cancer cell line xenografts rendered them highly metastatic (Okimoto ). 
We interrogated 75 cBioPortal data sets from diverse tumors, confirming that one or two copies of CIC are lost in many cancer types (see Supplemental Material, Figure S1).
Figure 3

Deletion of the tumor suppressor Capicua (CIC) in prostate cancer. (A) Genome browser display showing a CNV0 event (red dotted line) on chromosome 19 that spans CIC, PAFAH1B3, PRR19, and TMEM145 in the PC3 prostate cancer cell line. (B) Plot of putative chromosomal loss spanning the four-gene region in LNCaP (left panel) and PC3 (right panel). The x-axis represents the genomic position and the y-axis the number of normalized counts. (C) Loss of the chromosome 19 region encompassing CIC, PAFAH1B3, PRR19, and TMEM145 in the ‘Prostate FHCRC 2016’ cBioPortal data set. Individual tumor samples are shown in columns and genes in rows. (D) CIC copy number alterations in primary and metastatic prostate cancer samples from eight clinical data sets interrogated using cBioPortal. (E) Disease-free survival in primary prostate cancer patients with loss of a single CIC gene copy (n = 13 relapse events) is significantly decreased compared to those without any CIC deletion events. The ‘Prostate (MSKCC 2010)’, ‘Prostate (MSKCC 2014)’, and ‘Prostate (TCGA)’ cBioPortal data sets were interrogated. P-values were calculated by Kaplan–Meier analysis (log-rank test). Time is shown in years. CNV0, homozygous deletion; CNV1, heterozygous deletion; WT, wild-type.

CIC is abundantly expressed in normal prostate tissue, whereas its expression is reduced in primary tumors and ablated in metastatic prostate cancer (Choi ). To characterize the potential clinical significance of CIC deletions in prostate cancer, we further examined 1311 tumors from eight data sets using the cBioPortal tool. While homozygous deletion events of the four genes deleted in PC3 cells were rare, a substantial fraction of prostate tumors harbored heterozygous deletions of these genes (Figure 3C). Approximately 6% of primary prostate tumors had heterozygous deletions and 2% had homozygous deletions of CIC, whereas 21% of metastatic tumors had homozygous CIC deletions and 2% heterozygous deletions (Figure 3D). 
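As a rough illustration of how per-sample copy-number calls can be summarized by tumor type, the sketch below tallies deletion classes from discrete calls. It assumes the cBioPortal-style discrete encoding (-2 for a deep deletion, -1 for a shallow deletion); the `deletion_fractions` helper and the toy samples are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

# Assumed mapping from discrete copy-number calls to the labels used in the text.
LABELS = {-2: "CNV0", -1: "CNV1"}


def deletion_fractions(samples):
    """Summarize (sample_type, call) pairs as {sample_type: {label: fraction}}.

    Calls other than -2 and -1 (diploid, gain, amplification) are grouped
    under "no_deletion".
    """
    totals = Counter()
    hits = Counter()
    for sample_type, call in samples:
        totals[sample_type] += 1
        hits[(sample_type, LABELS.get(call, "no_deletion"))] += 1
    return {
        sample_type: {
            label: hits[(sample_type, label)] / n
            for label in ("CNV0", "CNV1", "no_deletion")
        }
        for sample_type, n in totals.items()
    }


# Toy input: 10 primary and 4 metastatic samples (made-up values).
toy = (
    [("primary", -1)] + [("primary", 0)] * 9
    + [("metastatic", -2), ("metastatic", -1), ("metastatic", 0), ("metastatic", 1)]
)

fractions = deletion_fractions(toy)
for sample_type, row in fractions.items():
    print(sample_type, row)
```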
Prostate cancer relapse or recurrence frequently results in incurable metastasis, ultimately causing patient death (Wu ; Weiner ). As CIC deletions were more frequent in metastatic tumors, we reasoned that deletion of one or both copies of CIC is a means by which primary tumors in patients who eventually develop metastatic lesions achieve increased fitness and survival. The association between CIC homozygous deletion events and disease-free survival (DFS) in primary tumors could not be reliably assessed due to the low number (n = 2) of patients with recorded relapses; however, patients with primary prostate tumors with one lost copy of CIC (heterozygous deletion events) had a significantly worse DFS (P = 0.018, log-rank test) (Figure 3E). Similarly, overall survival (OS) is significantly worse in advanced-stage gastric cancer patients with low CIC expression (Okimoto ). A recent study comparing PC3 and LNCaP reported that the long form of the CIC protein (CIC-L) was not expressed and that the short form (CIC-S) was expressed at extremely low levels in PC3 cells (Choi ). Our CNV analysis, employing WGS reads, interrogation of the de novo PC3 assembly using BLAST and GMAP, and analysis of RNA-seq reads mapped to the reference genome, failed to detect an intact CIC gene in PC3. We sequenced low-passage PC3 cells sourced directly from ATCC and speculate that the previous study (Choi ) detected low-level gene expression by PC3 subpopulations with intact CIC resulting from genetic drift during prolonged subculture (passaging; see Festuccia ; Li ). Taken together, these data suggest that, although a rare event in prostate tumors, homozygous deletion of CIC is not an idiosyncrasy of the PC3 cell line. Moreover, loss of a single gene copy of CIC is relatively common in prostate cancer. We speculate that disruption of one or both copies of CIC renders prostate cancer patients susceptible to an adverse disease outcome. 
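For readers unfamiliar with the log-rank test used for the survival comparison above, a minimal two-group version is sketched below using only the Python standard library. It is not the cBioPortal implementation; the `logrank_test` function name and the follow-up times are hypothetical, and the P-value relies on the chi-square approximation with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up times (e.g., years)
    events_* : 1 if relapse was observed at that time, 0 if censored
    Returns (chi-square statistic, P-value).
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up follow-up data: group A relapses early, group B late.
chi2, p_value = logrank_test(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
    [6, 7, 8, 9, 10], [1, 1, 1, 1, 1],
)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

With only two recorded relapses in a group, as for the homozygous deletions discussed above, the chi-square approximation becomes unreliable, which is consistent with the authors' decision not to assess that comparison.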
A previous study employing forced overexpression of CIC in PC3 and LNCaP demonstrated that CIC is repressed by a trio of microRNAs (Choi ). Altered MAPK signaling through the ERK pathway also suppresses endogenous CIC in lung cancer (Okimoto ). Collectively, our data raise the possibility that the combination of microRNA repression, altered ERK signaling, and somatic events in the CIC locus promote tumorigenesis and confer a poor disease outcome.

Relevance of findings

In summary, we provide genome sequence data for PC3 and LNCaP, prostate cancer cell lines commonly employed in cancer research. These data contribute to a catalog of cancer genomes, adding to recent whole-transcriptome sequencing, pharmacological profiling, and whole-exome sequencing efforts (Barretina ; Klijn ; Iorio ) aimed at enhancing our understanding of human disease. For example, the phenomenon of androgen independence in prostate cancer has intrigued scientists for decades. Of the two cell lines interrogated in our study, PC3 is androgen-independent, whereas the LNCaP strain sequenced (LNCaP-FGC) is androgen-dependent. Recent work, including an investigation of 150 patients with metastatic CRPC (Robinson ), suggests that anomalies (mutations, amplifications, and deletions) in a number of genes in the androgen receptor pathway play a role in the transition to androgen independence. We speculate that future work on a large number of cell lines and clinical samples, employing WGS, RNA-sequencing, epigenetic profiling, and similar high-throughput methods, is likely to identify genes critical for androgen independence. For instance, an androgen-independent strain of LNCaP (LNCaP-LNO) has been developed from cultures of an early passage of the LNCaP cells sequenced in our study (LNCaP-FGC) (van Steenbrugge ). LNCaP-LNO and LNCaP-FGC were compared at the gene expression level (Oosterhoff ), hinting that specific gene mutations or copy number events render LNCaP-LNO cells androgen-insensitive. Raw reads (see Data availability in Materials and Methods), sequence variant (SNV and indel) data, and CNV data are made available. We have generated de novo genome assemblies of both cell lines, allowing genes of interest to be investigated further and enabling, for example, the validation of gene loci associated with novel transcripts obtained from Trinity de novo transcriptome analysis (Grabherr ; Haas ). 
In addition, the genomes can be interrogated using a BLAST server, available at http://ghrelinlab.org. We acknowledge the limitations of short-insert (350 bp) genome sequencing, particularly when resolving complex repetitive or heterozygous regions (Rhoads and Au 2015; Merker ). However, we anticipate that as sequencing becomes increasingly affordable, our sequencing efforts will complement future long-read genome assembly work and prove useful when correcting for errors (sequence polishing). Finally, we reveal that one or both copies of CIC, a tumor metastasis suppressor gene, are frequently lost in prostate cancer and could drive metastatic CRPC. We anticipate that further biological insights into the role of Capicua in prostate cancer will shortly be gained by the research community, in line with the ethos of G3: Genes, Genomes, Genetics Genome Reports.

Supplementary Material

Supplemental material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.117.039909/-/DC1.
References (10 of 82 shown)

1.  Multiplatform single-sample estimates of transcriptional activation.

Authors:  Stephen R Piccolo; Michelle R Withers; Owen E Francis; Andrea H Bild; W Evan Johnson
Journal:  Proc Natl Acad Sci U S A       Date:  2013-10-15       Impact factor: 11.205

2.  Mutational Landscape of Aggressive Prostate Tumors in African American Men.

Authors:  Karla J Lindquist; Pamela L Paris; Thomas J Hoffmann; Niall J Cardin; Rémi Kazma; Joel A Mefford; Jeffrey P Simko; Vy Ngo; Yalei Chen; Albert M Levin; Dhananjay Chitale; Brian T Helfand; William J Catalona; Benjamin A Rybicki; John S Witte
Journal:  Cancer Res       Date:  2016-02-26       Impact factor: 12.701

3.  Stat3 dimerization regulated by reversible acetylation of a single lysine residue.

Authors:  Zheng-Long Yuan; Ying-Jie Guan; Devasis Chatterjee; Y Eugene Chin
Journal:  Science       Date:  2005-01-14       Impact factor: 47.728

4.  Genomic deletion of PTEN is associated with tumor progression and early PSA recurrence in ERG fusion-positive and fusion-negative prostate cancer.

Authors:  Antje Krohn; Tobias Diedler; Lia Burkhardt; Pascale-Sophie Mayer; Colin De Silva; Marie Meyer-Kornblum; Darja Kötschau; Pierre Tennstedt; Joseph Huang; Clarissa Gerhäuser; Malte Mader; Stefan Kurtz; Hüseyin Sirma; Fred Saad; Thomas Steuber; Markus Graefen; Christoph Plass; Guido Sauter; Ronald Simon; Sarah Minner; Thorsten Schlomm
Journal:  Am J Pathol       Date:  2012-06-13       Impact factor: 4.307

5.  The LNCaP cell line--a new model for studies on human prostatic carcinoma.

Authors:  J S Horoszewicz; S S Leong; T M Chu; Z L Wajsman; M Friedman; L Papsidero; U Kim; L S Chai; S Kakati; S K Arya; A A Sandberg
Journal:  Prog Clin Biol Res       Date:  1980

Review 6.  The growth hormone receptor: mechanism of activation and clinical implications.

Authors:  Andrew J Brooks; Michael J Waters
Journal:  Nat Rev Endocrinol       Date:  2010-07-27       Impact factor: 43.330

7.  Co-expression of GH and GHR isoforms in prostate cancer cell lines.

Authors:  L K Chopin; T L Veveris-Lowe; A F Philipps; A C Herington
Journal:  Growth Horm IGF Res       Date:  2002-04       Impact factor: 2.372

8.  Growth hormone (GH) receptors in prostate cancer: gene expression in human tissues and cell lines and characterization, GH signaling and androgen receptor regulation in LNCaP cells.

Authors:  Esther Weiss-Messer; Osnat Merom; Ayala Adi; Rachel Karry; Maslama Bidosee; Rosalie Ber; Alexander Kaploun; Avi Stein; Ronnie J Barkey
Journal:  Mol Cell Endocrinol       Date:  2004-05-31       Impact factor: 4.102

9.  Exome Sequencing: Current and Future Perspectives.

Authors:  Amanda Warr; Christelle Robert; David Hume; Alan Archibald; Nader Deeb; Mick Watson
Journal:  G3 (Bethesda)       Date:  2015-07-02       Impact factor: 3.154

10.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937
