
A genome-wide assessment of rare copy number variants in colorectal cancer.

Zhenli Li1,2, Dan Yu1,2, Meifu Gan3, Qiaonan Shan4, Xiaoyang Yin4, Shunli Tang4, Shuai Zhang1,2, Yongyong Shi5, Yimin Zhu6, Maode Lai1,2, Dandan Zhang1,2.   

Abstract

Colorectal cancer (CRC) is a complex disease with an estimated heritability of approximately 35%. However, known CRC-related common single nucleotide polymorphisms (SNPs) explain only ~0.65% of the heritability. This "missing heritability" may be explained partially by rare copy number variants (CNVs). In this study, we performed a genome-wide scan using the Illumina Human-Omni Express BeadChip; 694 sporadic CRC cases and 1641 controls were included in the analysis after quality control. The global burden analysis revealed a 1.53-fold excess of rare CNVs in CRC cases compared with controls (P < 1 × 10^-6), and the difference was more pronounced for genic rare CNVs and for CNVs overlapping coding regions (1.65-fold and 1.84-fold, respectively, both P < 1 × 10^-6). Interestingly, cases in both the lowest and the middle tertile of age carried a higher burden of rare CNVs than those in the highest tertile. Furthermore, 639 CNV-disrupted genes exclusive to CRC cases were significantly enriched in gene ontology (GO) terms concerning nucleosome assembly and olfactory receptor activity. Our study is the first to evaluate the burden of rare CNVs in sporadic CRC and suggests that rare CNVs contribute to the missing heritability of CRC.

Keywords:  colorectal cancer; genome-wide scan; nucleosome assembly; rare CNVs

Year:  2015        PMID: 26315111      PMCID: PMC4694911          DOI: 10.18632/oncotarget.4621

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


INTRODUCTION

Colorectal cancer (CRC) is the fourth most commonly diagnosed cancer in males and the third in females worldwide [1]. In the past few decades, the incidence of CRC has increased rapidly in most Asian countries, including China [2-4]. Genetic factors are known to have crucial impacts on the incidence and development of CRC. The heritability of CRC was estimated to be approximately 35% [5]. Recent genome-wide association studies (GWAS) have identified multiple single nucleotide polymorphisms (SNPs) associated with CRC [6-10]. Surprisingly, known CRC-associated SNPs explain only ~0.65% of the heritability [11].

Copy number variants (CNVs), large deletions or duplications of DNA segments (>1 kb), are another important form of genetic variation. Changes in copy number, which lead to aberrations of gene dosage, can influence susceptibility to complex disease by altering gene expression [12, 13]. Many studies have highlighted the importance of CNVs in the pathogenesis of cancers including breast cancer, prostate cancer and neuroblastoma [14-16]. Several CNV regions have also been found to be associated with CRC. Fernandez-Rozadilla C et al. discovered that deletions at 11q11 were associated with increased risk of CRC [17]. In addition, several predisposing CNVs were suggested by Venkatachalam R et al. through a GWAS on CRC [18]. However, the majority of common CNVs are in linkage disequilibrium with SNPs in the human genome, making them unlikely to account for much of the “missing heritability” of complex traits [19-21].

Increasingly, recent studies suggest that rare CNVs have substantial effects on the development of complex diseases [22, 23]. Rare CNVs have also been implicated in several cancers, such as breast cancer, testicular cancer and colorectal cancer. Yang R et al. identified a rare deletion at 12p12.3 in two of 384 familial CRC cases but in none of the controls, and the result was validated in an independent sample [24]. Most recently, rare CNVs affecting protein-coding genes were reported in colorectal adenomatous polyposis [25]. Whether rare CNVs play a role in the pathogenesis of sporadic CRC, however, has not been examined to date. We initiated two GWASs, one in CRC and one in individuals with metabolic syndrome (MS), with a shared control data set (not published). 1008 CRC cases, 998 MS and 996 controls from China were genotyped using Illumina Human-Omni Express BeadChip. In this study, we generated CNV calls from the GWAS data and investigated the potential contribution of rare CNVs to sporadic CRC.

RESULTS

Characteristics of the study population

After strict quality control, 694 CRC cases (336 individuals with colon cancer, 340 with rectal cancer and 18 with both colon and rectal cancer) and 1641 controls were included in our analysis (see Table 1; details of the MS and non-MS controls after quality control are shown in Supplementary Table 1). Genotype results from 23 pairs of duplicate samples showed ~99.9% concordance. The first two principal components from the PCA of CRC cases, non-MS controls and MS controls, and the QQ plot for rare CNVs spanning a particular position, are plotted in Supplementary Figure 1 and Supplementary Figure 2, respectively. None of the remaining samples was removed as an outlier according to the PCA. There was no statistically significant difference in gender between cases and controls. The mean age of the CRC group was significantly higher than that of the control group (P < 0.001).
Table 1

Basic characteristics of the study subjects

| Characteristic | CRC cases: All * | CRC cases: Colon | CRC cases: Rectal | Controls | P value a |
|---|---|---|---|---|---|
| N | 694 | 336 | 340 | 1641 | |
| Gender (M/F) | 382/312 | 195/141 | 175/165 | 850/791 | 0.151 |
| Age (years) | 62.4 ± 12.3 | 62.9 ± 12.8 | 62.2 ± 11.5 | 57.4 ± 11.6 | <0.001 |

* As 18 of the CRC cases were diagnosed with both colon and rectal cancer, they were excluded from the subsequent stratified analysis by tumor site.

a The P value for gender was calculated by χ2 test between all cases and controls, while the P value for age was derived from an independent t test.


Global burden analysis

A total of 1471 and 2275 autosomal rare CNVs were detected in the qualified cases and controls, respectively (Figure 1). The number of rare CNVs per person was significantly higher in CRC cases than in controls (2.12 vs. 1.39, P < 1.0 × 10^-6, Table 2). Both deletions and duplications were enriched in CRC cases (P < 1.0 × 10^-6 and P = 2.0 × 10^-6, respectively). The proportion of CRC cases carrying at least one rare CNV was significantly higher than that of controls (0.80 vs. 0.74, P = 0.001), as was the proportion carrying at least one rare deletion (0.58 vs. 0.49, P = 0.00009). No significant difference between cases and controls was found in the total or the average CNV size per individual. When individuals with MS were excluded, a greater burden remained in CRC patients compared with non-MS controls (see Supplementary Table 2). There was no significant difference in the frequency of rare CNVs between males and females, either in all samples or within cases and controls separately (Supplementary Table 3).
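The per-sample rates and the fold change quoted above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable and function names are illustrative and are not taken from the original analysis, which was run in PLINK.

```python
# Counts reported in the text: autosomal rare CNVs and post-QC sample sizes.
rare_cnvs = {"cases": 1471, "controls": 2275}
samples = {"cases": 694, "controls": 1641}


def rate_per_sample(n_cnvs, n_samples):
    """Average number of rare CNVs carried per individual."""
    return n_cnvs / n_samples


case_rate = rate_per_sample(rare_cnvs["cases"], samples["cases"])
control_rate = rate_per_sample(rare_cnvs["controls"], samples["controls"])

# Fold change is the ratio of the two per-sample rates (cases over controls).
fold_change = case_rate / control_rate

print(f"{case_rate:.2f} vs {control_rate:.2f}, fold change {fold_change:.2f}")
```

Rounded to two decimals, these values match the 2.12 vs. 1.39 rates and the 1.53-fold excess reported for all rare CNVs.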
Figure 1

Outline of CNV discovery and CNV analysis

A total of 1004 sporadic CRC cases and 1994 controls were genotyped using Illumina Human-Omni Express BeadChip. (a) QC for SNP array data: individuals with call rate <95% or outliers were removed. (b) QC for samples and CNV calls by PennCNV and QuantiSNP.

Table 2

Global burden of rare CNVs between colorectal cases and controls

| Category | Controls (N = 1641) | CRC (N = 694) | Fold Change# | P value* | Colon (N = 336) | Fold Change# | P value* | Rectal (N = 340) | Fold Change# | P value* |
|---|---|---|---|---|---|---|---|---|---|---|
| Total number of rare CNVs | | | | | | | | | | |
| Total | 2275 | 1471 | | | 751 | | | 682 | | |
| Deletion | 1184 | 845 | | | 431 | | | | | |
| Duplication | 1091 | 626 | | | 320 | | | 293 | | |
| Number of rare CNVs per sample | | | | | | | | | | |
| Total | 1.39 | 2.12 | 1.53 | <1.0 × 10^-6 | 2.24 | 1.61 | <1.0 × 10^-6 | 2.01 | 1.45 | <1.0 × 10^-6 |
| Deletion | 0.72 | 1.22 | 1.69 | <1.0 × 10^-6 | 1.28 | 1.78 | <1.0 × 10^-6 | 1.14 | 1.59 | <1.0 × 10^-6 |
| Duplication | 0.66 | 0.90 | 1.36 | 2.0 × 10^-6 | 0.95 | 1.43 | 0.00003 | 0.86 | 1.30 | 0.0003 |
| Proportion of samples with one or more rare CNVs | | | | | | | | | | |
| Total | 0.74 | 0.80 | 1.08 | 0.001 | 0.80 | 1.09 | 0.007 | 0.79 | 1.07 | 0.02 |
| Deletion | 0.49 | 0.58 | 1.17 | 0.00009 | 0.59 | 1.19 | 0.0009 | 0.56 | 1.14 | 0.01 |
| Duplication | 0.47 | 0.51 | 1.09 | 0.03 | 0.49 | 1.04 | 0.30 | 0.54 | 1.15 | 0.01 |
| Total length of rare CNVs spanned per sample (in kb) | | | | | | | | | | |
| Total | 234.70 | 258.40 | 1.10 | 0.53 | 251.40 | 1.07 | 0.23 | 265.50 | 1.13 | 0.10 |
| Deletion | 139.70 | 157.30 | 1.13 | 0.57 | 171.90 | 1.23 | 0.11 | 141.20 | 1.01 | 0.41 |
| Duplication | 221.90 | 224.50 | 1.01 | 0.58 | 206.40 | 0.93 | 0.74 | 243.10 | 1.10 | 0.19 |

* Empirical p-values between cases and controls were calculated using 1,000,000 permutations by PLINK, and all the P values were shown in bold if reached statistical significance (P < 0.05).

# Fold change of CRC/colon/rectal cases vs controls.

We further stratified CRC cases into colon and rectal cancer and obtained similar results. The number of rare CNVs per person was significantly higher in colon cases and in rectal cases than in controls (1.61-fold and 1.45-fold, respectively). The proportions of samples with one or more rare deletions were significantly higher in both colon cancer patients and rectal cancer patients than in controls (P = 0.0009 and P = 0.01, respectively). A significant increase in the proportion of samples with at least one duplication was observed for rectal cancer but not for colon cancer (P = 0.01). With regard to genic rare CNVs (rare CNVs overlapping one or more genes, as defined in Methods), a remarkably higher rate was noted in CRC cases (P < 1.0 × 10^-6, Table 3). Overall, the difference between CRC cases and controls was more pronounced for rare genic CNVs than for all rare CNVs (1.65-fold vs. 1.53-fold). The frequency difference between rectal/colon cancer cases and controls was also more apparent for genic than for non-genic rare CNVs (Figure 2). Both colon and rectal cancer cases carried more genic rare deletions and duplications than controls did. Interestingly, a significantly higher proportion of colon cancer cases carried at least one genic deletion, but this was not the case for genic duplications (P = 8.0 × 10^-6 and P = 0.20, respectively). In rectal cancer patients, the proportions carrying genic deletions and genic duplications were both significantly higher than in controls (P = 0.005 and P = 0.004, respectively).
Furthermore, we examined the rare CNVs overlapping protein coding sequences (CDSs). Both the colon cancer patients and the rectal cancer patients carried more such rare deletions and duplications than controls did (Figure 3). An even greater fold change between CRC cases and controls was observed (1.84-fold, P < 1.0 × 10^-6, Supplementary Table 4).
Table 3

Global burden of genic rare CNVs between colorectal cases and controls

| Category | Controls (N = 1641) | CRC (N = 694) | Fold Change# | P value* | Colon (N = 336) | Fold Change# | P value* | Rectal (N = 340) | Fold Change# | P value* |
|---|---|---|---|---|---|---|---|---|---|---|
| Total number of genic CNVs | | | | | | | | | | |
| Total | 1271 | 887 | | | 451 | | | 415 | | |
| Deletion | 576 | 449 | | | 231 | | | 207 | | |
| Duplication | 695 | 438 | | | 220 | | | 208 | | |
| Number of genic CNVs per sample | | | | | | | | | | |
| Total | 0.77 | 1.28 | 1.65 | <1.0 × 10^-6 | 1.34 | 1.73 | <1.0 × 10^-6 | 1.22 | 1.58 | <1.0 × 10^-6 |
| Deletion | 0.35 | 0.65 | 1.84 | <1.0 × 10^-6 | 0.69 | 1.96 | <1.0 × 10^-6 | 0.61 | 1.73 | <1.0 × 10^-6 |
| Duplication | 0.42 | 0.63 | 1.49 | <1.0 × 10^-6 | 0.65 | 1.55 | 0.00002 | 0.61 | 1.44 | 0.00002 |
| Proportion of samples with one or more genic CNVs | | | | | | | | | | |
| Total | 0.52 | 0.63 | 1.20 | <1.0 × 10^-6 | 0.63 | 1.21 | 0.0002 | 0.63 | 1.20 | 0.0003 |
| Deletion | 0.28 | 0.38 | 1.35 | <1.0 × 10^-6 | 0.40 | 1.45 | 8.0 × 10^-6 | 0.35 | 1.26 | 0.005 |
| Duplication | 0.33 | 0.39 | 1.15 | 0.01 | 0.36 | 1.08 | 0.20 | 0.41 | 1.23 | 0.004 |
| Total length of genic CNVs spanned per sample (in kb) | | | | | | | | | | |
| Total | 202.30 | 206.60 | 1.02 | 0.40 | 178.30 | 0.88 | 0.83 | 236.70 | 1.17 | 0.11 |
| Deletion | 124.00 | 119.50 | 0.96 | 0.48 | 121.80 | 0.98 | 0.38 | 114.00 | 0.92 | 0.51 |
| Duplication | 213.40 | 220.50 | 1.03 | 0.36 | 175.50 | 0.82 | 0.93 | 263.20 | 1.23 | 0.05 |

* Empirical p-values between cases and controls were calculated using 1,000,000 permutations by PLINK, and all the P values were shown in bold if reached statistical significance (P < 0.05).

# Fold change of CRC/colon/rectal cases vs controls.

Figure 2

Genome-wide burden of rare non-genic CNVs and genic CNVs

Genome-wide frequencies of rare genic CNVs and rare non-genic CNVs were calculated for controls, rectal cancer patients and colon cancer patients, respectively. Rate (Y axis) represents the number of rare genic/non-genic CNVs per individual.

Figure 3

Genome-wide burden of rare CNVs overlapped with coding regions

Frequencies (Y axis) of all rare CNVs, rare deletions and rare duplications overlapping coding regions were calculated separately. Each cluster consists of three bars representing controls, rectal cancer patients and colon cancer patients.

Enrichment analysis of the CNV-disrupted genes

We used DAVID to examine whether CNV-disrupted genes specific to CRC cases were enriched in particular functional annotations. In total, 639 genes were disrupted by CNVs exclusively in CRC cases within our dataset, of which 372 were found in colon cancer patients and 299 in rectal cancer patients. A total of 15 terms were significantly enriched after Bonferroni correction (Table 4). The most significant term in the GO analysis was the cellular component (CC) term nucleosome (15 DAVID genes, Bonferroni corrected P = 2.80 × 10^-6). The remaining CC terms mainly concerned chromatin or DNA. All the biological process (BP) terms were associated with chromatin or DNA assembly, except one term concerning sensory perception of smell. Olfactory receptor activity, a molecular function (MF) term, was also significantly enriched (31 DAVID genes, Bonferroni corrected P = 0.0215).
Table 4

Enriched GO functional terms of exclusively disrupted genes in CRC cases

| Category 1 | Term | Count 2 | % 3 | P Value | Bonferroni |
|---|---|---|---|---|---|
| Rare CNVs exclusive to colorectal cancer | | | | | |
| GOTERM_CC | GO:0000786~nucleosome | 15 | 2.60 | 7.95E-09 | 2.80E-06 |
| GOTERM_CC | GO:0032993~protein-DNA complex | 17 | 2.94 | 1.05E-08 | 3.68E-06 |
| GOTERM_BP | GO:0031497~chromatin assembly | 16 | 2.77 | 6.79E-08 | 1.30E-04 |
| GOTERM_BP | GO:0006333~chromatin assembly or disassembly | 19 | 3.29 | 7.81E-08 | 1.50E-04 |
| GOTERM_BP | GO:0065004~protein-DNA complex assembly | 16 | 2.77 | 1.26E-07 | 2.42E-04 |
| GOTERM_BP | GO:0006334~nucleosome assembly | 15 | 2.60 | 2.89E-07 | 5.53E-04 |
| GOTERM_BP | GO:0006323~DNA packaging | 17 | 2.94 | 6.80E-07 | 1.30E-03 |
| GOTERM_BP | GO:0034728~nucleosome organization | 15 | 2.60 | 1.04E-06 | 2.00E-03 |
| GOTERM_CC | GO:0000785~chromatin | 22 | 3.81 | 1.60E-06 | 5.64E-04 |
| GOTERM_CC | GO:0005694~chromosome | 35 | 6.06 | 4.41E-06 | 1.55E-03 |
| GOTERM_BP | GO:0007608~sensory perception of smell | 32 | 5.54 | 1.36E-05 | 2.57E-02 |
| GOTERM_BP | GO:0034622~cellular macromolecular complex assembly | 26 | 4.50 | 2.17E-05 | 4.07E-02 |
| GOTERM_MF | GO:0004984~olfactory receptor activity | 31 | 5.36 | 3.59E-05 | 2.15E-02 |
| GOTERM_CC | GO:0044427~chromosomal part | 29 | 5.02 | 4.35E-05 | 1.52E-02 |
| GOTERM_CC | GO:0045095~keratin filament | 12 | 2.08 | 1.05E-04 | 3.64E-02 |
| Rare CNVs exclusive to colon cancer | | | | | |
| GOTERM_CC | GO:0032993~protein-DNA complex | 16 | 4.78 | 6.16E-11 | 1.69E-08 |
| GOTERM_CC | GO:0000786~nucleosome | 14 | 4.18 | 1.32E-10 | 3.62E-08 |
| GOTERM_BP | GO:0006333~chromatin assembly or disassembly | 18 | 5.37 | 2.66E-10 | 4.02E-07 |
| GOTERM_BP | GO:0031497~chromatin assembly | 15 | 4.48 | 8.81E-10 | 1.33E-06 |
| GOTERM_CC | GO:0000785~chromatin | 21 | 6.27 | 1.25E-09 | 3.45E-07 |
| GOTERM_BP | GO:0065004~protein-DNA complex assembly | 15 | 4.48 | 1.63E-09 | 2.46E-06 |
| GOTERM_BP | GO:0006323~DNA packaging | 16 | 4.78 | 5.65E-09 | 8.54E-06 |
| GOTERM_BP | GO:0006334~nucleosome assembly | 14 | 4.18 | 5.85E-09 | 8.83E-06 |
| GOTERM_BP | GO:0034728~nucleosome organization | 14 | 4.18 | 2.08E-08 | 3.15E-05 |
| GOTERM_CC | GO:0044427~chromosomal part | 25 | 7.46 | 3.15E-07 | 8.67E-05 |
| GOTERM_CC | GO:0005694~chromosome | 27 | 8.06 | 6.24E-07 | 1.72E-04 |
| GOTERM_BP | GO:0034622~cellular macromolecular complex assembly | 22 | 6.57 | 7.35E-07 | 1.11E-03 |
| GOTERM_CC | GO:0045095~keratin filament | 12 | 3.58 | 7.91E-07 | 2.18E-04 |
| GOTERM_BP | GO:0034621~cellular macromolecular complex subunit organization | 22 | 6.57 | 4.64E-06 | 6.98E-03 |
| GOTERM_BP | GO:0065003~macromolecular complex assembly | 30 | 8.96 | 2.56E-05 | 3.79E-02 |
| GOTERM_CC | GO:0043228~non-membrane-bounded organelle | 74 | 22.09 | 1.58E-04 | 4.24E-02 |
| GOTERM_CC | GO:0043232~intracellular non-membrane-bounded organelle | 74 | 22.09 | 1.58E-04 | 4.24E-02 |
| Rare CNVs exclusive to rectal cancer | | | | | |
| none | | | | | |

1 BP, biological process; CC, cellular component; MF, molecular function.

2 Count, number of DAVID gene IDs identified in specific GO terms. Note that the number may differ from the number of Ensembl gene IDs, as DAVID incorporates some functionally similar Ensembl gene IDs into one DAVID gene ID according to the DAVID Knowledgebase.

3 %, (Count of involved genes / Total number of genes within a particular term) *100.

The 372 and 299 genes disrupted specifically in colon and rectal cancer patients, respectively, but not in controls, were further analyzed separately. Seventeen terms were overrepresented in colon cancer, mainly related to nucleosome or chromatin assembly and non-membrane-bounded organelles. However, no GO term remained significant after Bonferroni correction for rectal cancer.

Greater rare CNV burden among the younger CRC cases

CRC cases were divided into three groups according to age tertile within all CRC cases. All three groups had a higher frequency of rare CNVs than controls did (all P < 0.05, data not shown). Both the lowest tertile (age ≤ 57) and the middle tertile (age between 57 and 69) carried greater burdens of rare CNVs than the highest tertile did (Figure 4, P = 0.01 and P = 0.06, respectively). However, the frequency of rare CNVs was similar across age groups in control samples (Supplementary Figure 3). Additionally, a burden comparison within age groups by decade showed that CRC cases carried a higher burden of rare CNVs than controls in each age subgroup except the oldest (age > 80) (Supplementary Figure 5). It should be noted that the number of samples aged over 80 was small (N CRC cases = 42 and N controls = 50). Furthermore, we speculated that the rare CNVs enriched in younger CRC cases may contribute more to CRC. Gene enrichment analysis showed that genes disrupted by rare CNVs were also associated with terms of “nucleosome or chromatin assembly” (Supplementary Table 5). Interestingly, the majority of genes associated with “chromatin assembly or disassembly” were disrupted in younger cases (15 of 17 DAVID genes), indicating the importance of these genes in the pathogenesis of CRC.
Figure 4

Rate differences of rare CNVs among colorectal cases across different age groups

CRC cases were divided into three groups according to case age tertile (T1, T2, T3) (X axis). Rate (Y axis) represents the number of rare CNVs per sample. The P values between different age groups were calculated by PLINK.

Expression profile analysis

We compared the expression of the 38 genes (number of Ensembl gene IDs) enriched in the “chromatin assembly or disassembly” term in our study by analyzing published microarray data sets from the GEO website. We identified 58 probes corresponding to the 38 genes in both GDS2947 and GDS4382. Approximately 43.1% of the probes were differentially expressed between colorectal adenoma and adjacent normal tissue (GDS2947), and a similar proportion (~41.3%) was observed in the dataset comparing CRC tumors and paired normal tissues (GDS4382) (Supplementary Table 6).

CNV validation by qPCR

For each CNV, the copy number was determined as the average of 2^−ΔΔCt from two pairs of primers. qPCR confirmed all ten randomly selected rare CNVs, and the results are displayed graphically in Supplementary Figure 4.
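As an illustration of the calculation, the sketch below applies the standard 2^−ΔΔCt formula with a reference amplicon (RNase P in this study) and averages the result over two primer pairs. The Ct values are invented for the example, and scaling by two copies for a diploid calibrator sample is an assumption that is not stated in the text.

```python
def relative_quantity(ct_target, ct_ref, cal_ct_target, cal_ct_ref):
    """2^-ddCt for one primer pair.

    ct_target / ct_ref: Ct of the CNV amplicon and the reference amplicon in
    the test sample; cal_*: the same two values in a calibrator sample.
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = cal_ct_target - cal_ct_ref
    return 2.0 ** -(delta_sample - delta_calibrator)


def copy_number(primer_pairs, calibrator_copies=2):
    """Average 2^-ddCt over primer pairs, scaled by the calibrator copy number.

    calibrator_copies=2 is an assumption (diploid calibrator).
    """
    ratios = [relative_quantity(*pair) for pair in primer_pairs]
    return calibrator_copies * sum(ratios) / len(ratios)


# Hypothetical Ct values for two primer pairs. A heterozygous deletion delays
# the target amplicon by about one cycle relative to the calibrator.
deletion_sample = [(26.0, 24.0, 25.0, 24.0), (27.0, 24.5, 26.0, 24.5)]
normal_sample = [(25.0, 24.0, 25.0, 24.0), (26.0, 24.5, 26.0, 24.5)]

print(copy_number(deletion_sample))
print(copy_number(normal_sample))
```

With these made-up inputs the deletion carrier comes out at one copy and the normal sample at two, which is the pattern a confirmed heterozygous deletion would show.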

DISCUSSION

To our knowledge, this is the first large-scale genome-wide analysis investigating rare CNVs in sporadic CRC, examining 694 sporadic CRC cases and 1641 controls after strict quality control. The results indicated that rare CNVs increased the risk of CRC. Enrichment analysis suggested that disruption of genes related to chromatin or nucleosome assembly and to olfaction, specific to CRC cases, may contribute to the increased risk of CRC. The burden analysis revealed highly significant associations between rare CNVs and the risk of CRC. When the analysis was limited to rare genic CNVs, an even greater fold change in overall burden was observed (1.65 vs. 1.53). This result would be expected if genic CNVs are a proxy for putatively functional CNVs. One study that investigated CNVs in Parkinson's disease also observed a significantly increased rate of rare genic CNVs in cases compared with controls [26]. Soemedi et al. identified an association of rare genic deletions with an increased risk of congenital heart disease [27]. Similar findings were also observed in psychiatric diseases such as autism [28]. Such evidence indicates that rare genic CNVs could be pathogenic in nature and contribute to the pathogenesis of complex disease by affecting the expression of genes. Rare CNVs overlapping gene coding sequences, which may disrupt protein structure, showed an even greater fold change (1.84-fold vs. 1.53-fold). This suggests that rare CNVs overlapping coding regions are more likely to be causal. Interestingly, younger cases tended to carry a significantly higher burden of rare CNVs than older ones. This finding suggests that rare CNVs may have greater effects in younger affected individuals than in older ones. It has been suggested that the genetic effects of cancer-related variants differ by age, and younger CRC patients are expected to have a more pronounced genetic predisposition [29, 30].
The genes disrupted exclusively in CRC cases were mainly over-represented in two types of GO terms: assembly of chromatin or nucleosomes, and olfactory receptor activity. A recent review summarized that chromosomal instability is an important factor in the development of CRC [31]. The nucleosome, consisting of DNA and histones, is the fundamental unit of chromatin, and nucleosome assembly is crucial for the maintenance of genome stability. Chromatin structure can be regulated by nucleosome assembly, and variations in the factors involved in nucleosome assembly have been implicated in the pathogenesis of human cancer [32]. Frequent mutations in the chromatin remodeling pathway were observed in glioblastoma multiforme by Jeremy Schwartzentruber et al., who additionally found that cases with such mutations carried more CNVs per genome [33]. Importantly, abnormal expression of nucleosome assembly-related genes such as the histone chaperone DEK proto-oncogene (DEK), chromatin assembly factor 1 (CAF-1), and chaperone anti-silencing function 1 (Asf1) has been implicated in the development of cancers [34-36]. Numerous studies have shown that DNA copy number changes alter the expression level of the corresponding genes [37, 38]. The expression profiles from GEO showed that approximately 43.1% and 41.3% of the probes corresponding to the “chromatin assembly or disassembly” term were differentially expressed between colorectal adenoma or CRC and adjacent normal tissue. Although rare CNVs may account for only a small fraction of the variation in gene expression, these results further support the involvement of DNA assembly-related genes in the development of CRC. Olfactory receptors (ORs) are G protein-coupled receptors that can detect and discriminate a large variety of aromatic molecules present in the environment. A multigene family encoding ORs, expressed mainly in the olfactory epithelium, was first discovered by Linda Buck and Richard Axel in 1991 [39].
Later studies revealed that ORs are also expressed in a variety of non-olfactory tissues, including colonic tissue, and have many additional functions [40, 41]. OR genes have been associated with several cancers, including breast cancer, prostate cancer and salivary gland carcinoma [42-44]. Of note, a recent study found that ORs could promote the invasiveness and metastasis of cancer cells [45]. Previously, Sturzu A et al. identified OR1D2 (olfactory receptor, family 1, subfamily D, member 2) as a promising target for prostate cancer [43]. OR4F15 (olfactory receptor, family 4, subfamily F, member 15) has also been found to be associated with salivary gland carcinoma in a GWAS of 309 cases and 535 cancer-free controls [44]. Activation of OR1A2 (olfactory receptor, family 1, subfamily A, member 2) was implicated in hepatocellular carcinoma progression, with significant phosphorylation of p38 MAPK and reduced cell proliferation [46]. The aforementioned three OR genes, OR1D2, OR4F15 and OR1A1, were also disrupted in CRC cases but not in controls in our study, which underscores the importance of OR activity-associated genes in colorectal cancer. Seventeen terms, mainly related to nucleosome or chromatin assembly, were enriched in colon cancer, whereas no significant term was found in rectal cancer. This result is compatible with previous studies in which the dysfunction of various signaling pathways differed between colon and rectal cancers, suggesting that the mechanisms of colon and rectal cancer development may not be identical [47, 48]. Burden analysis additionally showed that colon cancer patients displayed a more pronounced excess than rectal cancer patients (Figure 2 and Figure 3). Our results complement the idea suggested by Burgess RJ et al. that colon cancer has a stronger genetic component than rectal cancer [49]. The current study has several limitations.
Firstly, no replication in an independent population was conducted, which may result in some bias or chance findings. Although replication of the global burden of rare variants is difficult, further studies involving larger sample sizes will be of value. Secondly, data on MS components were not available for CRC cases. The 1641 cancer-free controls consisted of 815 MS controls and 826 non-MS controls. MS is a clustering of metabolic abnormalities with a high prevalence of about 30%, varying among populations [50, 51]. Therefore, we consider that a control group including MS controls may better represent the population, although we obtained similar results whether MS controls were included or excluded (Table 2 and Supplementary Table 2). Thirdly, small CNVs, which may also contribute to CRC, were not evaluated in our study because of the detection limits of SNP arrays. In conclusion, a greater burden of rare CNVs was observed in sporadic CRC cases than in controls, and the burden was significantly lower in older patients. Genes specifically disrupted in colon cancer cases, but not in rectal cancer cases, were significantly enriched in DNA assembly and olfactory receptor associated functional categories. These findings suggest that rare CNVs contribute to CRC predisposition and that disruption of the OR pathway and DNA assembly plays an underlying role in the pathogenesis of CRC.

MATERIALS AND METHODS

Study subjects

CRC patients diagnosed between 2006 and 2011 were recruited from The First Affiliated Hospital of Zhejiang University, Sir Run Run Shaw Hospital of Zhejiang University and Taizhou Hospital of Zhejiang Province. Pathologic diagnoses were evaluated by pathologists via biopsy reports, and patients with familial adenomatous polyposis, hereditary non-polyposis CRC or inflammatory bowel disease were excluded. A comprehensive demographic and health survey was carried out among individuals who participated in a large-scale physical examination in the medical center of the Third People's Hospital of Xiaoshan, Zhejiang, from July 2010 to July 2011. A total of 1994 cancer-free controls without a family history of cancer (998 controls with MS and 996 non-MS controls) were included in our study. Subjects with MS were defined according to the Chinese Diabetes Society (CDS) definition [52]. All participants provided written informed consent for this study, and the ethics committee of Zhejiang University's School of Medicine approved the protocol.

Genotyping and CNV calling

Genomic DNA was extracted with a TACO automatic nucleic acid extraction apparatus (GeneReach Biotechnology Corp., Taiwan, China). A NanoDrop 2000 (Thermo Scientific) was used to measure concentration and purity. Qualified DNA from all samples was genotyped using the Illumina Human-Omni Express BeadChip (Illumina Inc., San Diego, CA, USA). To ensure genotyping quality, CRC cases, controls with MS and non-MS controls were mixed on each BeadChip. Forty-six duplicate samples (23 pairs) were genotyped. All BeadChips were processed at the Bio-X Institute, Shanghai Jiao Tong University. Genotyping procedures were carried out according to the manufacturer's standard protocol. Different CNV calling algorithms can often produce discrepant results for the same data set. Recent CNV studies have supported a stringent discovery criterion of focusing solely on CNVs that are identified by at least two different programs [53-55]. Therefore, CNV segments were identified by both PennCNV and QuantiSNP in our study [56, 57]. Both algorithms are based on a Hidden Markov Model (HMM) and use intensity files generated by GenomeStudio software from Illumina. QuantiSNP 2.0 is based on an objective Bayes HMM and takes into consideration the log R Ratio (LRR) as well as the B allele frequency (BAF) of each SNP. The PennCNV algorithm incorporates additional information, including the population frequency of the B allele (PFB) and the distance between adjacent SNPs. To reduce false positive calls due to genomic waves, GC-content adjustment was performed to correct the bias in both analyses [58]. Default settings were applied for both algorithms. In addition, adjacent CNV segments with the same copy number were merged into a single call if the gap between them was shorter than half of the total length of the two consecutive CNVs.
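The merging rule for adjacent segments can be sketched as follows. This is a minimal illustration rather than the code used in the study: the `Segment` type and `merge_adjacent` function are hypothetical, and "total length of the two consecutive CNVs" is read here as the sum of the two segment lengths, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chrom: str
    start: int
    end: int
    copy_number: int


def merge_adjacent(segments):
    """Merge neighbouring calls with the same copy number when the gap between
    them is shorter than half of their combined length (assumed reading)."""
    merged = []
    for seg in sorted(segments, key=lambda s: (s.chrom, s.start)):
        if merged:
            prev = merged[-1]
            gap = seg.start - prev.end
            combined = (prev.end - prev.start) + (seg.end - seg.start)
            same_state = (seg.chrom == prev.chrom
                          and seg.copy_number == prev.copy_number)
            if same_state and gap < combined / 2:
                # Extend the previous call; the merged call is then compared
                # with the next segment, so chains of calls can fuse.
                merged[-1] = Segment(prev.chrom, prev.start,
                                     max(prev.end, seg.end), prev.copy_number)
                continue
        merged.append(seg)
    return merged


# Hypothetical calls for one sample.
calls = [
    Segment("chr1", 100_000, 150_000, 1),
    Segment("chr1", 160_000, 200_000, 1),   # 10 kb gap, merged with the first
    Segment("chr1", 400_000, 420_000, 1),   # 200 kb gap, kept separate
    Segment("chr2", 100_000, 150_000, 3),   # different chromosome
]
merged_calls = merge_adjacent(calls)
for call in merged_calls:
    print(call)
```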

Sample quality control

To provide reliable results, samples with genotype call rates of less than 95% or identified as outliers were removed. Further criteria for the exclusion of noisy data were applied for each algorithm separately. Samples were excluded if they met any of the following criteria: more than 200 CNVs; an absolute value of the GC wave factor (GCWF) larger than 0.05 or a standard deviation of LRR > 0.3, as recommended by PennCNV; or a genome-wide LRR SD obtained from QuantiSNP greater than 3.5. Principal component analysis (PCA) was performed with Eigenstrat to examine ancestry in our study, and outliers were excluded [59].
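The sample-level thresholds listed above can be expressed as a single filter. The sketch below is illustrative only: the dictionary keys are hypothetical field names, the thresholds are copied from the text as written, and the PCA-based outlier removal is not included.

```python
def passes_sample_qc(sample):
    """Return True if a sample survives every threshold given in the text.

    `sample` is a dict with hypothetical keys; a sample fails as soon as any
    single criterion is violated.
    """
    return (
        sample["call_rate"] >= 0.95            # genotype call rate of at least 95%
        and sample["n_cnvs"] <= 200            # no more than 200 CNV calls
        and abs(sample["gcwf"]) <= 0.05        # PennCNV GC wave factor
        and sample["lrr_sd_penncnv"] <= 0.3    # PennCNV LRR standard deviation
        and sample["lrr_sd_quantisnp"] <= 3.5  # QuantiSNP LRR SD, value as stated
    )


# Hypothetical samples: the second fails on the GC wave factor alone.
good = {"call_rate": 0.998, "n_cnvs": 12, "gcwf": 0.01,
        "lrr_sd_penncnv": 0.18, "lrr_sd_quantisnp": 0.2}
wavy = dict(good, gcwf=-0.08)

print(passes_sample_qc(good), passes_sample_qc(wavy))
```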

CNV quality control

To obtain high-confidence CNV calls, we retained only CNVs that satisfied all of the following criteria: a maximum Bayes factor >10 as predicted by QuantiSNP; identical breakpoints identified by both QuantiSNP and PennCNV; and a length greater than 10 kb spanning ten or more contiguous probes.
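A minimal sketch of this call-level filter is given below, assuming the calls from both programs have already been parsed into simple records. The `Call` class and `high_confidence_calls` function are hypothetical names, and matching is done on sample, chromosome, start and end only, since the text requires identical breakpoints.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Call:
    sample: str
    chrom: str
    start: int
    end: int
    copy_number: int
    n_probes: int
    max_bayes_factor: Optional[float] = None  # only set for QuantiSNP calls


def high_confidence_calls(quantisnp: List[Call], penncnv: List[Call],
                          min_bf: float = 10.0, min_length: int = 10_000,
                          min_probes: int = 10) -> List[Call]:
    """Keep QuantiSNP calls that PennCNV reproduces with identical breakpoints
    and that pass the Bayes factor, length and probe-count thresholds."""
    penn_keys = {(c.sample, c.chrom, c.start, c.end) for c in penncnv}
    kept: List[Call] = []
    for c in quantisnp:
        same_breakpoints = (c.sample, c.chrom, c.start, c.end) in penn_keys
        strong = c.max_bayes_factor is not None and c.max_bayes_factor > min_bf
        long_enough = (c.end - c.start + 1) > min_length
        enough_probes = c.n_probes >= min_probes
        if same_breakpoints and strong and long_enough and enough_probes:
            kept.append(c)
    return kept
```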

Burden analysis

Considering our moderate sample size, rare CNVs were defined as those with a frequency of <0.5% in our dataset [23]. To evaluate the overall differences in CNV distribution between cases and controls, CNV burden analyses were conducted in PLINK [60] using 1,000,000 permutations.
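The idea behind a permutation-based burden test can be sketched as follows. This is a simplified illustration and not PLINK's implementation: the function name, the test statistic (difference in mean CNV count per individual) and the one-sided empirical P value with the +1 correction are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def burden_permutation_test(cnv_counts: Sequence[int], is_case: Sequence[bool],
                            n_perm: int = 10_000, seed: int = 0) -> Tuple[float, float]:
    """One-sided label-permutation test on the case minus control difference
    in the mean number of rare CNVs per individual."""

    def mean_diff(labels: Sequence[bool]) -> float:
        case = [n for n, lab in zip(cnv_counts, labels) if lab]
        ctrl = [n for n, lab in zip(cnv_counts, labels) if not lab]
        return sum(case) / len(case) - sum(ctrl) / len(ctrl)

    observed = mean_diff(is_case)
    rng = random.Random(seed)
    labels: List[bool] = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign case/control status at random
        if mean_diff(labels) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator avoids an empirical P value of zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

With this estimator the smallest attainable P value is roughly one over the number of permutations, so the permutation count bounds the reported significance.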

Functional annotation of CRC-specific CNVs

Functional annotation was explored for genes specifically disrupted by CNVs in CRC patients (CRC-specific) using the online Database for Annotation, Visualization and Integrated Discovery (DAVID) [61]. Genes were defined by RefSeq annotations (UCSC, v. July 2008, NCBI v36, hg18), and gene boundaries were extended by a 10 kb flanking region on either side, as described by Pinto D et al. [28]. The gene ontology (GO) functional annotation was run with default settings, and GO terms with a P value < 0.05 after Bonferroni correction are presented in the results.

Datasets with gene expression profiles comparing CRC or colorectal adenoma to paired adjacent normal tissue were obtained from the Gene Expression Omnibus (GEO) database. Dataset GDS4382 was used to compare 17 paired CRC and adjacent normal tissue samples [62]. Dataset GDS2947 was used to compare 32 paired colorectal adenoma and adjacent normal tissue samples [63]. Both datasets were based on the Affymetrix Human Genome U133 Plus 2.0 Array. The expression data of the probes corresponding to genes in significant GO terms from the functional annotation analysis were extracted, and a paired Wilcoxon test was performed for each probe.

Quantitative real-time PCR (qPCR) was performed to measure the copy number of rare CNVs. RNase P was used as an endogenous reference. Ten rare CNVs were randomly selected, and two pairs of primers were designed for each CNV segment (primers are shown in Supplementary Table 7). Five samples were examined for each CNV (one with a putative deletion/duplication, the remaining four with two putative copies). qPCR was performed in triplicate on a LightCycler® 480 Instrument (Roche, Mannheim, Germany) using SYBR-green dye. Finally, the “delta delta Ct” method was used to calculate the relative copy numbers at target regions [64].
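As an illustration of the “delta delta Ct” calculation, the sketch below converts triplicate Ct values into an estimated copy number. The function name and arguments are hypothetical. It assumes roughly 100% amplification efficiency for both the target and the reference (RNase P) assays and a calibrator sample carrying two copies of the target region.

```python
from statistics import mean
from typing import Sequence


def relative_copy_number(ct_target_test: Sequence[float],
                         ct_ref_test: Sequence[float],
                         ct_target_cal: Sequence[float],
                         ct_ref_cal: Sequence[float],
                         calibrator_copies: float = 2.0) -> float:
    """Estimate copy number as calibrator_copies * 2 ** (-ddCt)."""
    # dCt: target Ct minus reference Ct, using the mean of the replicates.
    dct_test = mean(ct_target_test) - mean(ct_ref_test)
    dct_cal = mean(ct_target_cal) - mean(ct_ref_cal)
    # ddCt: test sample relative to the two-copy calibrator sample.
    ddct = dct_test - dct_cal
    return calibrator_copies * 2.0 ** (-ddct)
```

Under these assumptions, a test sample whose target amplifies one cycle later than in the calibrator (ddCt = 1) yields an estimate of one copy, consistent with a heterozygous deletion, while ddCt of about -0.58 corresponds to three copies.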
  63 in total

1.  Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.

Authors:  K J Livak; T D Schmittgen
Journal:  Methods       Date:  2001-12       Impact factor: 3.608

2.  Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma.

Authors:  Jeremy Schwartzentruber; Andrey Korshunov; Xiao-Yang Liu; David T W Jones; Elke Pfaff; Karine Jacob; Dominik Sturm; Adam M Fontebasso; Dong-Anh Khuong Quang; Martje Tönjes; Volker Hovestadt; Steffen Albrecht; Marcel Kool; Andre Nantel; Carolin Konermann; Anders Lindroth; Natalie Jäger; Tobias Rausch; Marina Ryzhova; Jan O Korbel; Thomas Hielscher; Peter Hauser; Miklos Garami; Almos Klekner; Laszlo Bognar; Martin Ebinger; Martin U Schuhmann; Wolfram Scheurlen; Arnulf Pekrun; Michael C Frühwald; Wolfgang Roggendorf; Christoph Kramm; Matthias Dürken; Jeffrey Atkinson; Pierre Lepage; Alexandre Montpetit; Magdalena Zakrzewska; Krzystof Zakrzewski; Pawel P Liberski; Zhifeng Dong; Peter Siegel; Andreas E Kulozik; Marc Zapatka; Abhijit Guha; David Malkin; Jörg Felsberg; Guido Reifenberger; Andreas von Deimling; Koichi Ichimura; V Peter Collins; Hendrik Witt; Till Milde; Olaf Witt; Cindy Zhang; Pedro Castelo-Branco; Peter Lichter; Damien Faury; Uri Tabori; Christoph Plass; Jacek Majewski; Stefan M Pfister; Nada Jabado
Journal:  Nature       Date:  2012-01-29       Impact factor: 49.962

3.  Principal components analysis corrects for stratification in genome-wide association studies.

Authors:  Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal:  Nat Genet       Date:  2006-07-23       Impact factor: 38.330

4.  PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data.

Authors:  Kai Wang; Mingyao Li; Dexter Hadley; Rui Liu; Joseph Glessner; Struan F A Grant; Hakon Hakonarson; Maja Bucan
Journal:  Genome Res       Date:  2007-10-05       Impact factor: 9.043

5.  Report of incidence and mortality in china cancer registries, 2008.

Authors:  Wan-Qing Chen; Rong-Shou Zheng; Si-Wei Zhang; Ni Li; Ping Zhao; Guang-Lin Li; Liang-You Wu; Jie He
Journal:  Chin J Cancer Res       Date:  2012-09       Impact factor: 5.087

6.  Enrichment of low penetrance susceptibility loci in a Dutch familial colorectal cancer cohort.

Authors:  Anneke Middeldorp; Shantie Jagmohan-Changur; Ronald van Eijk; Carli Tops; Peter Devilee; Hans F A Vasen; Frederik J Hes; Richard Houlston; Ian Tomlinson; Jeanine J Houwing-Duistermaat; Juul T Wijnen; Hans Morreau; Tom van Wezel
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2009-10-20       Impact factor: 4.254

7.  Quercetin Protects against Cadmium-Induced Renal Uric Acid Transport System Alteration and Lipid Metabolism Disorder in Rats.

Authors:  Ju Wang; Ying Pan; Ye Hong; Qing-Yu Zhang; Xiao-Ning Wang; Ling-Dong Kong
Journal:  Evid Based Complement Alternat Med       Date:  2012-05-29       Impact factor: 2.629

8.  Promotion of cancer cell invasiveness and metastasis emergence caused by olfactory receptor stimulation.

Authors:  Guenhaël Sanz; Isabelle Leray; Aurélie Dewaele; Julien Sobilo; Stéphanie Lerondel; Stéphan Bouet; Denise Grébert; Régine Monnerie; Edith Pajot-Augy; Lluis M Mir
Journal:  PLoS One       Date:  2014-01-08       Impact factor: 3.240

9.  Human spermatogenic failure purges deleterious mutation load from the autosomes and both sex chromosomes, including the gene DMRT1.

Authors:  Alexandra M Lopes; Kenneth I Aston; Emma Thompson; Filipa Carvalho; João Gonçalves; Ni Huang; Rune Matthiesen; Michiel J Noordam; Inés Quintela; Avinash Ramu; Catarina Seabra; Amy B Wilfert; Juncheng Dai; Jonathan M Downie; Susana Fernandes; Xuejiang Guo; Jiahao Sha; António Amorim; Alberto Barros; Angel Carracedo; Zhibin Hu; Matthew E Hurles; Sergey Moskovtsev; Carole Ober; Darius A Paduch; Joshua D Schiffman; Peter N Schlegel; Mário Sousa; Douglas T Carrell; Donald F Conrad
Journal:  PLoS Genet       Date:  2013-03-21       Impact factor: 5.917

10.  QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data.

Authors:  Stefano Colella; Christopher Yau; Jennifer M Taylor; Ghazala Mirza; Helen Butler; Penny Clouston; Anne S Bassett; Anneke Seller; Christopher C Holmes; Jiannis Ragoussis
Journal:  Nucleic Acids Res       Date:  2007-03-06       Impact factor: 16.971

