
Identification of Copy Number Variants in a Southern Chinese Cohort of Patients with Congenital Scoliosis.

Wenjing Lai1, Xin Feng1, Ming Yue1, Prudence W H Cheung2, Vanessa N T Choi1, You-Qiang Song1, Keith D K Luk2, Jason Pui Yin Cheung2, Bo Gao1.   

Abstract

Congenital scoliosis (CS) is a lateral curvature of the spine resulting from congenital vertebral malformations (CVMs) and affects 0.5-1/1000 live births. The copy number variant (CNV) at chromosome 16p11.2 has been implicated in CVMs and recent studies identified a compound heterozygosity of 16p11.2 microdeletion and TBX6 variant/haplotype causing CS in multiple cohorts, which explains about 5-10% of the affected cases. Here, we studied the genetic etiology of CS by analyzing CNVs in a cohort of 67 patients with congenital hemivertebrae and 125 family controls. We employed both candidate gene and family-based approaches to filter CNVs called from whole exome sequencing data. This identified 12 CNVs in four scoliosis-associated genes (TBX6, NOTCH2, DSCAM, and SNTG1) as well as eight recessive and 64 novel rare CNVs in 15 additional genes. Some candidates, such as DHX40, NBPF20, RASA2, and MYSM1, have been found to be associated with syndromes with scoliosis or implicated in bone/spine development. In particular, the MYSM1 mutant mouse showed spinal deformities. Our findings suggest that, in addition to the 16p11.2 microdeletion, other CNVs are potentially important in predisposing to CS.

Keywords:  CNV; congenital scoliosis; congenital vertebral malformation; copy number variant

Year:  2021        PMID: 34440387      PMCID: PMC8391542          DOI: 10.3390/genes12081213

Source DB:  PubMed          Journal:  Genes (Basel)        ISSN: 2073-4425            Impact factor:   4.096


1. Introduction

Among all musculoskeletal disorders, scoliosis is one of the most common diseases, affecting around 3% of the world population, which can occur as an isolated defect or as a concomitant symptom in other diseases or syndromes [1]. Scoliosis is categorized into several main groups, including congenital scoliosis (CS), idiopathic scoliosis (IS), neuromuscular scoliosis, and degenerative scoliosis. CS, which usually has first onset at birth or shortly after birth, affects approximately 0.5–1 in 1000 live births [2,3,4,5]. Compared with IS, CS is generally more severe due to the high risk of progressive deformity and associated problems such as pulmonary compromise [6]. One of the most significant differences between CS and IS is that IS does not have an association with congenital vertebral malformation (CVM), whereas CVM is the major cause leading to CS. CVM can be classified into several subclasses, including failure of vertebral formation (e.g., hemivertebrae, wedged vertebrae), failure of vertebral segmentation (e.g., unilateral bar, block vertebrae), and mixed type. Of all CVMs, congenital hemivertebrae is the most common anomaly that causes CS [4,5]. During vertebral development, the paraxial mesoderm forms bilaterally paired blocks, named somites, along the anterior–posterior axis. The vertebral bodies are derived from somites formed in the presomitic mesoderm. This fundamental process is called somitogenesis. Once somitogenesis is disturbed, the resulting CVM may lead to spinal deformities. The most commonly accepted mechanism governing somitogenesis is the clock and wavefront model, which is controlled and coordinated by several key signaling pathways, such as Notch, Wnt, Fgf and retinoic acid signaling pathways [7,8]. 
Genetic studies of human patients with CVM have identified a variety of mutations in components of Notch signaling pathway (e.g., NOTCH2, DLL3, MESP2, LFNG, HES7, and RIPPLY2) and also in several key transcription factors essential for somitogenesis (e.g., TBX6, TBXT, and SOX9). Nevertheless, the genetic basis for majority of patients with CS still remains unclear [1,9]. Copy number variation (CNV) is a type of structural variation of genome. With the advancement of genome-wide analysis tools, it has been revealed that CNVs are widespread in the human genome and account for a large fraction of human genetic diversity [10]. CNVs have been, so far, implicated in many disease states including scoliosis. Although a number of CNVs were found to be associated with adolescent idiopathic scoliosis (AIS) [11,12], there have not been many reports about CS-associated CNVs. The 16p11.2 microdeletion was found to be associated with CS [13], and recent studies demonstrated that a compound inheritance of a TBX6-containing 16p11.2 microdeletion and a TBX6 mutation or hypomorphic haplotype accounted for 5–10% of patients with CS in different populations [14,15,16,17]. Additional CNVs, including 10q24.31, 17p11.2, 20p11, 22q11.2, and a few other regions, were respectively reported in individual patients with CVMs [18,19]. Besides 16p11.2 microdeletion, it is unknown whether other CNVs are prevalent in CS. Here, we analyzed CNVs in a Southern Chinese cohort of patients with congenital hemivertebrae. CNVs were called from whole-exome sequencing (WES) data of 67 cases and 125 family members (controls). We identified 12 rare CNVs in 4 known scoliosis-associated genes and eight recessive CNVs in three genes. We also found 64 novel, rare CNVs in 14 genes that occurred in multiple patients but are very rare in our control group and the general population, suggesting a potential role for genetic susceptibility in the development of CS.

2. Materials and Methods

2.1. Patient Recruitment

The patients studied in this project were recruited from the Duchess of Kent Children’s Hospital (DKCH), a tertiary scoliosis referral center in Hong Kong. The patients with CS were diagnosed by imaging such as plain standing whole-spine radiographs and computed tomography. A total of 67 patients with hemivertebrae were chosen for this study, of which 31 had single congenital hemivertebrae while 36 had multiple congenital hemivertebrae. Patients’ personal data and medical records were collected under ethical privacy guidelines and approval. Ethics was approved by the Institutional Review Board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster (HKU/HA HKW IRB Ref # UW 15-216), and written informed consent was obtained from all participants and/or parents/siblings.

2.2. Control Cohort

The control cohort studied in this project consisted of 125 participating family members of the recruited patients with CS. Only unaffected parents and siblings (without CS) were included. Accordingly, 58 out of 67 patients had family member(s) participating in this study, including 2 quintets, 14 quartets, 33 trios, and 9 duos.
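The family counts above can be checked with a few lines of arithmetic. This is only an illustrative sketch: it assumes each participating family contains exactly one patient, so a trio contributes two controls, a quartet three, and so on. The names `FAMILIES` and `cohort_summary` are hypothetical.

```python
# (label, people per family including the patient, number of families),
# taken from the family structures reported in Section 2.2.
FAMILIES = [("quintet", 5, 2), ("quartet", 4, 14), ("trio", 3, 33), ("duo", 2, 9)]

def cohort_summary(families):
    """Return (number of families, number of unaffected family members).

    Assumes exactly one patient per family (an assumption of this sketch).
    """
    n_families = sum(count for _, _, count in families)
    n_controls = sum((size - 1) * count for _, size, count in families)
    return n_families, n_controls

n_families, n_controls = cohort_summary(FAMILIES)
print(n_families, n_controls)  # 58 125
```

Under that assumption the totals match the 58 families and 125 controls reported in the text.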

2.3. Genomic DNA Extraction

Genomic DNA was extracted from peripheral blood samples of 67 patients and 125 of their family members using the Invitrogen™ ChargeSwitch gDNA Serum Kit. The purified genomic DNA was quantified by NanoDrop.

2.4. Whole-Exome Sequencing (WES) and Copy Number Variations (CNVs) Calling

WES was performed for all recruited patients with congenital hemivertebrae and participating family members by Novogene Co., Ltd. (Hong Kong, China), using the Agilent SureSelect Human All Exon Kit on the Illumina sequencing platform. The WES data were processed as described previously [20]. The raw sequence data were first analyzed by fastp for quality control and filtering [21]. After filtering, more than 95% of bases reached Q20 and more than 90% reached Q30 in most samples. The sequence reads were mapped to the reference genome (GRCh37/hg19) with the Burrows-Wheeler Aligner v0.7.17 (BWA-MEM) [22]. The SAM files generated by BWA were converted to BAM format, sorted, and indexed with SAMtools v1.10 [23]. CNVs were called from the BAM files with ExomeDepth v1.1.15, an R package based on a read-depth algorithm [24]. ExomeDepth uses a robust statistical model to build an optimized reference set that maximizes CNV detection power. In this study, four healthy control family members (CS59A, CS71A, CS71B, and CS81A) were selected to generate the reference set.
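The read-processing steps described above (fastp, BWA-MEM, SAMtools) can be sketched as a sequence of command lines. The exact commands and options used in the study are not reported, so everything below is an assumption for illustration: the file names, the thread count, and the function `build_pipeline` are hypothetical, and the sketch only builds the commands without running them. The ExomeDepth step is an R workflow and is not covered here.

```python
import shlex

def build_pipeline(sample, ref="hg19.fa", threads=4):
    """Build (not run) the per-sample commands for QC, alignment, sorting, indexing.

    File naming and options are assumptions of this sketch, not the study's settings.
    """
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    c1, c2 = f"{sample}_R1.clean.fastq.gz", f"{sample}_R2.clean.fastq.gz"
    sam, bam = f"{sample}.sam", f"{sample}.sorted.bam"
    return [
        # Quality control and read filtering on paired-end input
        ["fastp", "-i", r1, "-I", r2, "-o", c1, "-O", c2],
        # Alignment to the reference genome, writing SAM
        ["bwa", "mem", "-t", str(threads), "-o", sam, ref, c1, c2],
        # SAM -> coordinate-sorted BAM
        ["samtools", "sort", "-O", "bam", "-o", bam, sam],
        # BAM index for downstream read-depth analysis
        ["samtools", "index", bam],
    ]

for cmd in build_pipeline("CS001"):
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where the tools are installed.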

2.5. CNVs Filtering

Several criteria were used to filter CNVs: (i) A Bayes factor (BF) was calculated for each variant as the log10 likelihood ratio of the alternative hypothesis (i.e., there is a CNV) over the null hypothesis (i.e., there is no CNV): BF = log10 (likelihood under the alternative hypothesis / likelihood under the null hypothesis). A BF value greater than 1 was regarded as strong supporting evidence for a CNV, and CNVs with BF values smaller than 1 were excluded. (ii) As ExomeDepth cannot detect small CNVs accurately, CNVs smaller than 100 bp were excluded. (iii) Because CNVs with a high allele frequency in the general population are likely benign and unlikely to confer disease susceptibility, CNVs with an allele frequency greater than 0.01 were excluded (a minimum sample size of 100 was required). The Database of Genomic Variants (DGV) and the Genome Aggregation Database (gnomAD) were used. If available, the allele frequency in the East Asian population was also checked. As different CNVs often overlap and have no clear boundaries, this filtration was conducted in a gene-based manner: if multiple CNVs covered the same gene, the maximum allele frequency was used for filtering. (iv) CNV recurrence was counted per gene in patients and in controls.
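Criteria (i) to (iii) amount to three threshold tests per call. The following is a minimal sketch of that logic, not the authors' code: the `CNVCall` record, its field names, and the example genes are hypothetical, and `db_freqs` stands for the allele frequencies of database CNVs (from studies with enough samples) that cover the same gene.

```python
from dataclasses import dataclass

MIN_BAYES_FACTOR = 1.0   # criterion (i)
MIN_SIZE_BP = 100        # criterion (ii)
MAX_ALLELE_FREQ = 0.01   # criterion (iii)

@dataclass
class CNVCall:
    gene: str
    start: int
    end: int
    bayes_factor: float
    db_freqs: tuple = ()  # population frequencies of database CNVs covering the gene

def passes_filters(call):
    """Apply the BF, size, and gene-based maximum-frequency filters."""
    if call.bayes_factor < MIN_BAYES_FACTOR:
        return False
    if call.end - call.start < MIN_SIZE_BP:
        return False
    # Gene-based filtering: use the highest frequency among overlapping CNVs
    return max(call.db_freqs, default=0.0) <= MAX_ALLELE_FREQ

calls = [
    CNVCall("GENE_A", 1_000, 1_500, 5.2, (0.0005, 0.002)),  # passes all filters
    CNVCall("GENE_B", 1_000, 1_500, 0.4),                   # BF too low
    CNVCall("GENE_C", 1_000, 1_050, 6.0),                   # smaller than 100 bp
    CNVCall("GENE_D", 1_000, 1_500, 6.0, (0.0005, 0.03)),   # too common in the population
]
kept = [c.gene for c in calls if passes_filters(c)]
print(kept)  # ['GENE_A']
```

Taking the maximum frequency per gene is the conservative choice: a gene is dropped as soon as any overlapping database CNV is common.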

2.6. Real-Time Quantitative PCR (qPCR)

Real-time qPCR was performed to validate some of the detected CNVs. Briefly, ROX Reference Dye (0.4 μL, 50X), forward and reverse primers (0.4 μL each, 10 μM), TB Green Premix Ex Taq (10 μL, 2X, Tli RNaseH Plus, Takara), patients’ genomic DNA (0.5 μL, 10 ng/μL), and sterile ddH2O (8.3 μL) were mixed for qPCR, which was performed using the Applied Biosystems™ StepOnePlus™ Real-Time PCR System. A locus outside the detected CNV regions of NOTCH2, DSCAM and SNTG1 was used as the reference locus (P1). P1 lies near the 16p11.2 region and was previously used as a reference site to detect the 16p11.2/TBX6 deletion [14,17]. Each sample was analyzed in triplicate. Copy numbers at each locus were quantified by the ΔCt method, and relative changes were analyzed with the 2^(−ΔΔCt) method. The qPCR primer sequences: NOTCH2-F: 5′-AGGAGGCGACCGAGAAGATG-3′; NOTCH2-R: 5′-CGATACTCACCATGCGCGGG-3′; DSCAM-F: 5′-AGCGAACGTTCCTATCGCTT-3′; DSCAM-R: 5′-TTTCACTTATGCGCCCTGGG-3′; SNTG1-F: 5′-GTCTACATGGGCTGGTGTGA-3′; SNTG1-R: 5′-CTGGAGGTGCCAGAAACTTG-3′; P1-F: 5′-GGGGAAGGAACTTACATGAC-3′; P1-R: 5′-TCGTGTTTCCCTGTTGTACC-3′.
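As a worked illustration of the 2^(−ΔΔCt) calculation, the sketch below computes the copy number of a target locus relative to a control sample, normalized to the reference locus. The Ct values are invented for illustration and the function name `relative_copy_number` is hypothetical. A heterozygous deletion is expected to give a value near 0.5.

```python
from statistics import mean

def relative_copy_number(ct_target_patient, ct_ref_patient,
                         ct_target_control, ct_ref_control):
    """Relative copy number by the 2^(-ddCt) method from replicate Ct values."""
    d_ct_patient = mean(ct_target_patient) - mean(ct_ref_patient)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    dd_ct = d_ct_patient - d_ct_control
    return 2 ** (-dd_ct)

# Hypothetical triplicate Ct values: the target amplifies one cycle later in the patient
ratio = relative_copy_number(
    ct_target_patient=[26.0, 26.5, 25.5],
    ct_ref_patient=[24.0, 24.0, 24.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[24.0, 24.0, 24.0],
)
print(ratio)  # 0.5
```

A one-cycle delay at the target locus, with an unchanged reference locus, corresponds to half the template amount.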

3. Results

3.1. CS Cohort and WES

In our cohort, we recruited a total of 92 patients with CS, in which vertebral malformations, such as hemivertebrae, unilateral bar, or block vertebrae, were identified. This operational definition thus excluded other types of scoliosis such as AIS. Because hemivertebrae is the most common type of vertebral malformation in CS and has the greatest potential for rapid progression (5–10 degrees/year) [25], 67 patients with congenital hemivertebrae were first selected. Further, 125 healthy family members of 58 patients were enrolled for this study, including parents and siblings from two quintets, 14 quartets, 33 trios, and nine duos. WES was performed for all 67 patients and 125 participating family members (controls). The contaminating sequencing adaptors and low-quality reads were first removed and the filtered reads were then aligned to the reference human genome (GRCh37/hg19), sorted and indexed.

3.2. CNV Calling

CNVs were called from the sequence reads with the read-depth analysis tool ExomeDepth, which has high sensitivity and specificity at the exon level [24,26]. Four healthy parents who were not carriers of the 16p11.2 microdeletion but whose children had previously been diagnosed with TBX6 compound heterozygosity [17] were selected to generate the reference set for ExomeDepth analysis. CNV calling in the 67 patients with CS detected a total of 15,671 CNVs, or around 234 CNVs per patient on average. After collapsing CNVs that recurred across cases, 6084 distinct CNVs remained. This strategy successfully identified the TBX6-containing 16p11.2 microdeletion in four patients, as previously reported [17]. In the control group, a total of 27,116 CNVs were detected in the 125 family members, or around 217 CNVs per control on average. After collapsing CNVs that recurred across controls, 7171 distinct CNVs remained. Although more CNVs were detected in a few individuals (six patients and four controls), there was no significant difference between the patient group and the control group (Supplementary Figure S1). The average CNV numbers in patients and controls were similar to those in the previous report [24]. Afterwards, we analyzed all CNVs by employing both a candidate gene approach and family-based filtering and prioritization strategies. A workflow is shown in Figure 1.
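The per-sample averages quoted above follow directly from the totals, and counting distinct CNVs can be illustrated by collapsing identical calls. The sketch below assumes, for illustration only, that two calls are the same CNV when chromosome, start, end, and type all match; the study does not state its exact matching rule. The example calls reuse coordinates from Table 1.

```python
def mean_cnvs_per_sample(total_calls, n_samples):
    """Average number of CNV calls per individual."""
    return total_calls / n_samples

def count_distinct(calls):
    """Count distinct CNVs, treating identical (chr, start, end, type) tuples as one."""
    return len(set(calls))

print(round(mean_cnvs_per_sample(15671, 67)))   # 234 (patients)
print(round(mean_cnvs_per_sample(27116, 125)))  # 217 (controls)

# Five patients share the same DSCAM deletion; the two NOTCH2 deletions differ
calls = [("21", 41452080, 41452267, "deletion")] * 5 + [
    ("1", 120611949, 120612020, "deletion"),
    ("1", 120539621, 120612020, "deletion"),
]
print(count_distinct(calls))  # 3
```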
Figure 1

The workflow of CNV analysis. This strategy detected CNVs in several candidate genes and identified recessive and novel rare CNVs enriched in patients with CS.

3.3. CNVs in Candidate Genes

To identify CNVs associated with CS, we first used the candidate gene approach and focused on CNVs that contained genes known to be involved in scoliosis or somitogenesis. After checking allele frequencies of CNVs in the Database of Genomic Variants (DGV) and the Genome Aggregation Database (gnomAD), a total of 12 rare CNVs affecting four candidate genes were found in 12 patients, including the known TBX6-containing 16p11.2 heterozygous deletion in four cases [17]. We also identified two rare CNVs that contained NOTCH2, a key component of the Notch signaling pathway, in two patients, and six rare CNVs in the AIS-associated genes DSCAM [27] and SNTG1 [28,29] in six patients (Table 1). We then checked these CNVs in the available family members and found that they were either novel mutations (NOTCH2 in CS043, DSCAM in CS018 and CS036, and SNTG1 in CS048) or paternally inherited (DSCAM in CS050) (Table 1). We were unable to determine the inheritance patterns in the other patients (NOTCH2 in CS033, DSCAM in CS053 and CS064) due to the lack of family members.
Table 1

CNVs found with candidate gene approach (N.D., not determined; N.A., not applied).

| Gene | Patient | Type | Chr | Start | End | Size (bp) | Bayes Factor | Reads Ratio (Observed/Expected) | Exons Annotation (hg19) (Gene_exon) | Inheritance Pattern | Highest Frequency in DGV (Sample Size > 100) | gnomAD Structural Variants Frequency (Heterozygous Loss) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NOTCH2 | CS033 | deletion | 1 | 120,611,949 | 120,612,020 | 71 | 4.62 | 0.657 | NOTCH2_1 | N.D. | 0.0037 | 0.00037 (0 in East Asia) |
| NOTCH2 | CS043 | deletion | 1 | 120,539,621 | 120,612,020 | 72,399 | 8.46 | 0.722 | NOTCH2_1-4 | De novo | 0.0037 | 0.00037 (0 in East Asia) |
| DSCAM | CS018 | deletion | 21 | 41,452,080 | 41,452,267 | 187 | 4.65 | 0.429 | DSCAM_25 | De novo | 0.00049 | N.A. |
| DSCAM | CS036 | deletion | 21 | 41,452,080 | 41,452,267 | 187 | 6.02 | 0.415 | DSCAM_25 | De novo | 0.00049 | N.A. |
| DSCAM | CS050 | deletion | 21 | 41,452,080 | 41,452,267 | 187 | 5.8 | 0.468 | DSCAM_25 | Paternal | 0.00049 | N.A. |
| DSCAM | CS053 | deletion | 21 | 41,452,080 | 41,452,267 | 187 | 5.45 | 0.494 | DSCAM_25 | N.D. | 0.00049 | N.A. |
| DSCAM | CS064 | deletion | 21 | 41,452,080 | 41,452,267 | 187 | 6.1 | 0.463 | DSCAM_25 | N.D. | 0.00049 | N.A. |
| SNTG1 | CS048 | deletion | 8 | 51,503,440 | 51,571,223 | 67,783 | 6.37 | 0.43 | SNTG1_13-15 | De novo | 0.0002 | 0.000046 (0 in East Asia) |
| TBX6 | CS059 | deletion | 16 | 29,674,601 | 30,199,897 | 525,296 | 644 | 0.555 | SPN_2, AC009133.19_2-3, QPRT_1-4, C16orf54_2, ZG16_2-4, KIF22_1-13, MAZ_1-5, PRRT2_2-3, PAGR1_1-3, CTD-2574D22.6_1-2, MVP_2-15, CDIPT_6-2, SEZ6L2_16-1, ASPHD1_1-3, KCTD13_6-1, TMEM219_1-4, TAOK2_2-16, HIRIP3_7-1, INO80E_1-7, DOC2A_11-2, C16orf92_2-3, FAM57B_5-1, ALDOA_8-16, PPP4C_2-9, TBX6_9-2, YPEL3_4-1, GDPD3_10-1, MAPK3_8-1, CORO1A_2-3, CORO1A_4-10 | N.D. | 0.0005 | 0.0001462 (0 in East Asia) |
| TBX6 | CS071 | deletion | 16 | 29,495,011 | 30,218,221 | 723,210 | 754 | 0.572 | NPIPL3_3-1, SPN_2, AC009133.19_2-3, QPRT_1-4, C16orf54_2, ZG16_3-4, KIF22_2-12, MAZ_1-5, PRRT2_2-3, PAGR1_1-3, CTD-2574D22.6_1-2, MVP_2-15, CDIPT_6-2, SEZ6L2_16-1, ASPHD1_1-3, KCTD13_6-1, TMEM219_1-4, TAOK2_2-16, HIRIP3_7-2, INO80E_1-7, DOC2A_11-2, C16orf92_2-3, FAM57B_5-1, ALDOA_8-16, PPP4C_2-9, TBX6_9-2, YPEL3_4-1, GDPD3_10-1, MAPK3_8-1, CORO1A_2-11, BOLA2B_3-1, SLX1A_1-5, SULT1A3_3-9, RP11-347C12.3_5-2 | De novo | 0.0005 | 0.0001462 (0 in East Asia) |
| TBX6 | CS078 | deletion | 16 | 29,498,516 | 30,199,897 | 701,381 | 690 | 0.578 | NPIPL3_1, SPN_2, AC009133.19_2-3, QPRT_1-4, C16orf54_2, ZG16_3-4, KIF22_2-12, MAZ_1-5, PRRT2_2-3, PAGR1_1-3, CTD-2574D22.6_1-2, MVP_2-15, CDIPT_6-2, SEZ6L2_16-1, ASPHD1_1-3, KCTD13_6-1, TMEM219_1-4, TAOK2_2-16, HIRIP3_7-2, INO80E_1-7, DOC2A_11-2, C16orf92_2-3, FAM57B_5-1, ALDOA_8-16, PPP4C_2-9, TBX6_9-2, YPEL3_4-1, GDPD3_10-1, MAPK3_8-1, CORO1A_2-10 | N.D. | 0.0005 | 0.0001462 (0 in East Asia) |
| TBX6 | CS081 | deletion | 16 | 29,498,516 | 30,199,897 | 701,381 | 645 | 0.558 | NPIPL3_1, SPN_2, AC009133.19_2-3, QPRT_1-4, C16orf54_2, ZG16_3-4, KIF22_2-12, MAZ_1-5, PRRT2_2-3, PAGR1_1-3, CTD-2574D22.6_1-2, MVP_2-15, CDIPT_6-2, SEZ6L2_16-1, ASPHD1_1-3, KCTD13_6-1, TMEM219_1-4, TAOK2_2-16, HIRIP3_7-2, INO80E_1-7, DOC2A_11-2, C16orf92_2-3, FAM57B_5-1, ALDOA_8-16, PPP4C_2-9, TBX6_9-2, YPEL3_4-1, GDPD3_10-1, MAPK3_8-1, CORO1A_2-10 | N.D. | 0.0005 | 0.0001462 (0 in East Asia) |

Among the identified rare CNVs, the TBX6-containing chromosome 16p11.2 microdeletion had been previously validated [17]. Here, we further examined CNVs that contained NOTCH2, DSCAM or SNTG1 genes. Indeed, qPCR analysis detected heterozygous deletions within the NOTCH2, DSCAM and SNTG1 loci (Supplementary Figure S2), indicating the reliability of CNVs called from WES data by ExomeDepth.
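The inheritance labels used in Table 1 (De novo, Paternal, N.D.) can be derived from the carrier status of each parent. The following sketch shows one plausible rule set and is not the authors' procedure: the function name is hypothetical, `None` marks a parent who was not sequenced, and the "Maternal" and "Biparental" labels are assumptions added for completeness.

```python
def inheritance_pattern(father_carrier, mother_carrier):
    """Classify a patient's CNV from parental carrier status.

    Each argument is True, False, or None (parent not available).
    """
    if father_carrier and mother_carrier:
        return "Biparental"
    if father_carrier:
        return "Paternal"
    if mother_carrier:
        return "Maternal"
    if father_carrier is None or mother_carrier is None:
        # A missing parent could still be a carrier, so the pattern is undetermined
        return "N.D."
    return "De novo"

print(inheritance_pattern(False, False))  # De novo
print(inheritance_pattern(True, False))   # Paternal
print(inheritance_pattern(None, None))    # N.D.
```

A CNV is called de novo only when both parents were tested and neither carries it, which is why patients without family members remain N.D.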

3.4. Recessive CNVs in Patients with CS

We then searched for homozygous CNVs (observed/expected reads ratio < 0.1) in 67 patients and 125 controls. After excluding the homozygous CNVs that existed in both patients and controls, we identified unique homozygous CNVs in eight patients with CS. Heterozygous deletions of these loci are rare in the DGV and gnomAD databases (Table 2). Considering that homozygous CNVs might be inherited from parents, we further checked their inheritance pattern and found that they were either novel mutations or of unknown origin due to the lack of parents’ data. These recessive CNVs contained three genes, NBPF20 (Neuroblastoma Breakpoint Family Member 20), FAM138C (Family with Sequence Similarity 138 Member C), and DHX40 (DEAH-Box Helicase 40). Interestingly, the DHX40-containing homozygous CNVs were detected in six patients but were not reported in DGV or gnomAD. DHX40-containing heterozygous CNVs are also very rare (Table 2). FAM138C is an RNA gene and NBPF20 is a member of the NBPF family characterized by tandem repeats of the DUF1220 domain, but their functions are unclear. DHX40 encodes a member of the DExD/H-box RNA helicase superfamily that catalyzes the unwinding of double-stranded RNA and has an essential role in RNA metabolism [30].
Table 2

Recessive CNVs unique in patients with CS (N.D., not determined; N.A., not applied).

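| Gene | Patient | Type | Chr | Start | End | Size (bp) | Bayes Factor | Reads Ratio (Observed/Expected) | Exons Annotation (hg19) (Gene_exon) | Inheritance Pattern | Highest Frequency in DGV (Sample Size > 100) | gnomAD Structural Variants Frequency (Heterozygous Loss) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NBPF20 | CS047 | deletion | 1 | 148,261,458 | 148,262,366 | 908 | 5.27 | 0.04 | NBPF20_98-99 | De novo | 0 | 0 (0 in East Asia) |
| FAM138C | CS048 | deletion | 9 | 35,061 | 35,519 | 458 | 6.51 | 0 | FAM138C_1-2 | * De novo | 0.0074 | N.A. |
| DHX40 | CS004 | deletion | 17 | 57,656,834 | 57,657,240 | 406 | 5.38 | 0 | DHX40_9-10 | N.D. | 0.0000922 | 0.0025 (0.008152 in East Asia) |
| DHX40 | CS035 | | | | | | 4.91 | | | N.D. | | |
| DHX40 | CS043 | | | | | | 6 | | | * De novo | | |
| DHX40 | CS050 | | | | | | 7.23 | | | * De novo | | |
| DHX40 | CS053 | | | | | | 7.02 | | | N.D. | | |
| DHX40 | CS057 | | | | | | 6.29 | | | De novo | | |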

* The CNV is also not present in the healthy siblings.
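The homozygous-CNV screen described in this section can be sketched as a threshold on the observed/expected reads ratio followed by removal of anything also seen in controls. This is a simplified illustration: it compares at the gene level rather than by exact CNV, and the function name and the `GENE_X` entries are hypothetical. The other example values are taken from Tables 1 and 2.

```python
HOMOZYGOUS_MAX_RATIO = 0.1  # observed/expected reads ratio below this is called homozygous

def patient_only_homozygous(patient_calls, control_calls):
    """Return {gene: [patients]} for homozygous deletions absent from controls.

    Each call is a (sample, gene, reads_ratio) tuple.
    """
    control_genes = {gene for _, gene, ratio in control_calls
                     if ratio < HOMOZYGOUS_MAX_RATIO}
    hits = {}
    for sample, gene, ratio in patient_calls:
        if ratio < HOMOZYGOUS_MAX_RATIO and gene not in control_genes:
            hits.setdefault(gene, []).append(sample)
    return hits

patient_calls = [
    ("CS047", "NBPF20", 0.04),
    ("CS048", "FAM138C", 0.0),
    ("CS004", "DHX40", 0.0),
    ("CS018", "DSCAM", 0.429),   # heterozygous, so not selected
    ("CS001", "GENE_X", 0.02),   # hypothetical: also homozygous in a control
]
control_calls = [("CS001A", "GENE_X", 0.03)]  # hypothetical control call

print(patient_only_homozygous(patient_calls, control_calls))
# {'NBPF20': ['CS047'], 'FAM138C': ['CS048'], 'DHX40': ['CS004']}
```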

3.5. Novel CNVs in Patients with CS

We also sought to identify CS-associated novel CNVs and first analyzed the data from 49 complete families (two quintets, 14 quartets, and 33 trios). The detected novel CNVs were then checked in the other 18 patients (nine singlets and nine duos). Eventually, we identified 64 CNVs in 14 genes that occurred in more than three patients but were absent or very rare (<1%) in the family control group. Those with a high CNV allele frequency (>1%) in the general population were also filtered out. This strategy successfully identified the known TBX6-containing CNVs in four patients [17] and the DHX40-containing homozygous CNVs in six patients. Interestingly, we also found four additional heterozygous DHX40 CNVs (Table 3 and Supplementary Table S1). Most of the identified novel CNVs were heterozygous losses, and one was a gain of one copy. Our CNV shortlist includes genes involved in ubiquitination (NAE1, MYSM1), enzymatic activities (MME, PHKB), ion/small molecule transportation (SCN7A, ABCA6), meiosis (MNS1, SPO11), spermatogenesis (GMCL1), GTPase activity (RASA2), TNF signaling (NSMAF), or with unknown function (LRRC40).
Table 3

Novel CNVs enriched in patients with CS (N.A., not applied).

| Gene | Type | Chr | Size (bp) | Count in 67 Patients | Count in 125 Controls | Highest Frequency in DGV (Sample Size > 100) | gnomAD Structural Variants Frequency (Heterozygous Loss) | gnomAD East Asia Structural Variants Frequency (Heterozygous Loss) |
|---|---|---|---|---|---|---|---|---|
| LRRC40 | deletion | 1 | 30,080–383,938 | 4 | 0 | 0.009556907 | 0.0000461 | 0.000 |
| SCN7A | deletion | 2 | 544–11,914 | 4 | 0 | 0.002257336 | 0.0000461 | 0.000 |
| MME | deletion | 3 | 279–55,232 | 4 | 0 | 0.000798722 | 0.0000462 | 0.000 |
| NAE1 | deletion | 16 | 6724–71,803 | 4 | 0 | N.A. b | 0 | 0.000 |
| TBX6 | deletion | 16 | 525,296–723,210 | 4 | 0 | 0.0005 | 0.0001462 | 0.000 |
| DHX40 | deletion | 17 | 107–2354 | 10 a | 1 | 0.0000922 | 0.0025 | 0.008152 |
| GMCL1 | deletion | 2 | 11,694–24,032 | 5 | 1 | 0.001303781 | 0.0000479 | 0.000 |
| MYSM1 | deletion | 1 | 190–10,053 | 4 | 1 | 0.0000922 | 0.0000461 | 0.0004139 |
| RASA2 | deletion | 3 | 2892–100,149 | 4 | 1 | 0.00086881 | 0 | 0.000 |
| NSMAF | deletion | 8 | 203–12,325 | 4 | 1 | 0.001145475 | 0 | 0.000 |
| MNS1 | deletion | 15 | 52,610–323,156 | 4 | 1 | 0.00518807 | 0.0000922 | 0.000 |
| PHKB | deletion | 16 | 7697–99,254 | 4 | 1 | 0.001198083 | 0.0000481 | 0.000 |
| SPO11 | deletion | 20 | 730–33,608 | 4 | 1 | 0.00064226 | 0 | 0.000 |
| ABCA6 | duplication | 17 | 560–13,580 | 4 | 1 | 0.0009219 | 0 | 0.000 |

a DHX40 has 6 homozygous (listed in Table 2) and 4 heterozygous CNVs.
b No CNV with a sample size of more than 100 is found within the NAE1 locus.
Note: Detailed information on these novel CNVs is shown in Table S1.
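The gene-based recurrence filter behind Table 3 can be sketched as follows: count the distinct carriers of each gene in patients and in controls, then keep genes seen in more than three patients and in fewer than 1% of controls. This is an illustrative reading of the criteria, not the authors' code. The function name, sample IDs, and the genes `GENE_Y` and `GENE_Z` are hypothetical, while the DHX40 counts follow Table 3.

```python
from collections import defaultdict

def enriched_genes(patient_calls, control_calls, n_controls,
                   min_patients=4, max_control_freq=0.01):
    """Return {gene: (patient carriers, control carriers)} for enriched genes.

    Each call is a (sample, gene) pair; carriers are counted once per gene.
    """
    patients, controls = defaultdict(set), defaultdict(set)
    for sample, gene in patient_calls:
        patients[gene].add(sample)
    for sample, gene in control_calls:
        controls[gene].add(sample)

    result = {}
    for gene, carriers in patients.items():
        n_ctrl = len(controls.get(gene, ()))
        if len(carriers) >= min_patients and n_ctrl / n_controls < max_control_freq:
            result[gene] = (len(carriers), n_ctrl)
    return result

patient_calls = (
    [(f"P{i}", "DHX40") for i in range(10)]     # 10 patient carriers, as in Table 3
    + [(f"P{i}", "GENE_Y") for i in range(4)]   # recurrent, but too common in controls
    + [(f"P{i}", "GENE_Z") for i in range(3)]   # too few patients
)
control_calls = [("C1", "DHX40"), ("C1", "GENE_Y"), ("C2", "GENE_Y")]

print(enriched_genes(patient_calls, control_calls, n_controls=125))
# {'DHX40': (10, 1)}
```

With 125 controls, a single control carrier corresponds to 0.8% and passes the <1% threshold, whereas two carriers (1.6%) do not, which is consistent with the control counts of 0 or 1 in Table 3.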

4. Discussion

CS is a genetically heterogeneous disorder with evidence for multiple causative genes. However, the genetic causes in the majority of patients remain unknown. As most cases of CS are of sporadic etiology, CNVs may have greater influence than single nucleotide variations (SNVs) [31]. This was well exemplified by the TBX6-containing 16p11.2 microdeletion in previous CS studies [14,15,16,17]. Here, we systematically investigated CNVs in a cohort of patients with congenital hemivertebrae and their family controls. We identified the well-known CNVs at chromosome 16p11.2, as well as a number of new CNVs that are potentially associated with CS. Haploinsufficiency of the Notch signaling pathway has been demonstrated to cause CS [32], and mutations in NOTCH2 cause Alagille syndrome and Hajdu–Cheney syndrome, both of which show abnormal curvature of the spine [33,34]. In our study, we found one short and one long CNV at the NOTCH2 locus in two patients, spanning one and four exons of NOTCH2, respectively. As no significant coding SNV could be detected in NOTCH2 in these two patients, it is unclear whether heterozygous loss of NOTCH2 is sufficient to cause CS or whether non-coding NOTCH2 SNVs or environmental factors [32] also contribute. Interestingly, CNVs in two AIS-associated genes, DSCAM [27] and SNTG1 [28,29], were found in six patients, suggesting that CS and AIS may be genetically related. An intriguing finding in our analysis is the identification of CNVs spanning various exons of DHX40 in ten patients, including six homozygous and four heterozygous CNVs. Most of them are novel mutations. DHX40 belongs to the conserved DExD/H-box RNA helicase family, which facilitates the ATP-dependent unwinding of RNA secondary structures [30]. However, the biological functions of the individual family members remain poorly understood.
Interestingly, DHX40 mutant mice were described to exhibit abnormal bone structure and bone mineralization (Mouse Genome Informatics, MGI: 1914737), indicating a role of DHX40 in bone development. A mutant of its family member DHX35 was described to have abnormal vertebrae morphology and scoliosis in mice (MGI: 1918965). Patients carrying DHX37 mutations showed developmental delay and intellectual disability as well as vertebral anomalies [30]. These observations in the mouse and human might suggest a potential link between DHX family members and CVM. Although the function of NBPF20 is unknown, it is located at chromosome 1q21.1, whose microdeletion is associated with a variety of phenotypes including skeletal malformations such as scoliosis [35,36]. This region also contains other NBPF family members, such as NBPF10, whose genetic variants were implicated in Mayer-Rokitansky-Küster-Hauser (MRKH) syndrome (OMIM # 277000) [37], a disease associated with CS [38]. Among the candidate genes of the identified novel CNVs, RASA2 (RAS P21 Protein Activator 2) and MYSM1 (Myb-Like, SWIRM, and MPN domains 1) are of particular interest. RASA2 encodes a GTPase-activating protein (GAP) and functions as a suppressor of RAS by promoting its intrinsic GTPase activity. Rare variants in RASA2 have been found to be associated with Noonan syndrome [39]. As scoliosis occurs frequently in Noonan syndrome [40], RASA2 is a potential candidate gene for CS. It would be interesting to investigate the RASA2 mutant mouse phenotype. MYSM1 is a deubiquitinase reported to be essential for bone formation [41], and its mutant mice have truncated and kinky tails [42,43,44], which are often associated with vertebral malformations [45]. Indeed, an X-ray from the International Mouse Phenotyping Consortium (MGI: 2444584) showed a spinal deformity in the MYSM1 mutant mouse (Supplementary Figure S3), indicating a potential role of MYSM1 in spinal development and predisposition to CS.
Further detailed phenotypic analysis of mutant animals is needed to validate its pathogenicity in CS. Although there are a few reports of heterozygous point mutations in CS patients [17,46,47,48], the dominant negative effect was only demonstrated by a novel TBXT mutation [17]. Considering the low familial recurrence rate in CS, recessive or compound heterozygous mutations are more likely to be the major cause of CS. In this regard, heterozygous CNVs are not sufficient to induce CS. Their pathogenicity may be explained by the following genetic models. First, in our cases, the patients carrying heterozygous CNVs may have additional risk variant or haplotype on the other allele. This possibility has been well exemplified by the 16p11.2/TBX6 mutations and haplotype. However, further analysis of risk variant/haplotype in our study is severely limited by our dataset from WES as they may reside in non-coding regions that regulate gene transcription. We did not detect significant deleterious mutations in the coding regions of these genes. Second, additional mutations in other relevant genes may increase the risk of CS (polygenic model). Other possibilities include environmental contributions and novel mutations in somatic tissues. Environmental factors, such as short-term gestational hypoxia, have been found to cause CS in combination with haploinsufficiency of Notch signaling pathway genes [32]. On the other hand, somatic mutations may serve as the “second hit” in addition to the heterozygous germline CNV mutations (first hit). This genetic model has been well demonstrated in other diseases as well as dystrophic scoliosis caused by NF1 [49,50,51,52]. Testing the above models will require whole genome sequencing, more comprehensive data analysis, and the isolation of malformed vertebral tissues in future studies.

5. Conclusions

In this study, we investigated the genetic basis of CS by analyzing CNVs in a cohort of CS families. Based on the candidate gene approach and family-based filtering of CNVs, we identified both known CS-associated genes and a set of new susceptibility genes, some of which (e.g., DHX40, RASA2, and MYSM1) warrant further investigations in larger cohorts as well as functional characterization. Given the well-defined example of the TBX6 compound inheritance and the complex genetic nature of CS, future studies examining the combined effects of SNVs and CNVs and somatic tissues may help better decipher the genetic etiology and heterogeneity of CS.
References (10 of 51 shown)

Review 1.  Scoliosis associated with typical Mayer-Rokitansky-Küster-Hauser syndrome.

Authors:  K Fisher; R H Esham; I Thorneycroft
Journal:  South Med J       Date:  2000-02       Impact factor: 0.954

2.  Clinical, genetic and environmental factors associated with congenital vertebral malformations.

Authors:  P F Giampietro; C L Raggio; R D Blank; C McCarty; U Broeckel; M A Pickart
Journal:  Mol Syndromol       Date:  2013-02

3.  Interstitial 1q21.1 Microdeletion Is Associated with Severe Skeletal Anomalies, Dysmorphic Face and Moderate Intellectual Disability.

Authors:  Bruno F Gamba; Roseli M Zechi-Ceide; Nancy M Kokitsu-Nakata; Siulan Vendramini-Pittoli; Carla Rosenberg; Ana C V Krepischi Santos; Lucilene Ribeiro-Bicudo; Antonio Richieri-Costa
Journal:  Mol Syndromol       Date:  2016-10-26

4.  Congenital vertebral anomalies: aetiology and relationship to spina bifida cystica.

Authors:  R Wynne-Davies
Journal:  J Med Genet       Date:  1975-09       Impact factor: 6.318

5.  Genome-wide association studies of adolescent idiopathic scoliosis suggest candidate susceptibility genes.

Authors:  Swarkar Sharma; Xiaochong Gao; Douglas Londono; Shonn E Devroy; Kristen N Mauldin; Jessica T Frankel; January M Brandon; Dongping Zhang; Quan-Zhen Li; Matthew B Dobbs; Christina A Gurnett; Struan F A Grant; Hakon Hakonarson; John P Dormans; John A Herring; Derek Gordon; Carol A Wise
Journal:  Hum Mol Genet       Date:  2011-01-07       Impact factor: 6.150

6.  Scoliosis with cognitive impairment in a girl with 8q11.21q11.23 microdeletion and SNTG1 disruption.

Authors:  E Tassano; P Ronchetto; M Severino; M T Divizia; M Lerone; S Uccella; L Nobili; E Tavella; C Morerio; D Coviello; M Malacarne
Journal:  Bone       Date:  2021-05-26       Impact factor: 4.398

7.  Large-scale copy number polymorphism in the human genome.

Authors:  Jonathan Sebat; B Lakshmi; Jennifer Troge; Joan Alexander; Janet Young; Pär Lundin; Susanne Månér; Hillary Massa; Megan Walker; Maoyen Chi; Nicholas Navin; Robert Lucito; John Healy; James Hicks; Kenny Ye; Andrew Reiner; T Conrad Gilliam; Barbara Trask; Nick Patterson; Anders Zetterberg; Michael Wigler
Journal:  Science       Date:  2004-07-23       Impact factor: 47.728

8.  Mutations in COMP cause familial carpal tunnel syndrome.

Authors:  Chunyu Li; Ni Wang; Alejandro A Schäffer; Xilin Liu; Zhuo Zhao; Gene Elliott; Lisa Garrett; Nga Ting Choi; Yueshu Wang; Yufa Wang; Cheng Wang; Jin Wang; Danny Chan; Peiqiang Su; Shusen Cui; Yingzi Yang; Bo Gao
Journal:  Nat Commun       Date:  2020-07-20       Impact factor: 14.919

9.  fastp: an ultra-fast all-in-one FASTQ preprocessor.

Authors:  Shifu Chen; Yanqing Zhou; Yaru Chen; Jia Gu
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

10.  A robust model for read count data in exome sequencing experiments and implications for copy number variant calling.

Authors:  Vincent Plagnol; James Curtis; Michael Epstein; Kin Y Mok; Emma Stebbings; Sofia Grigoriadou; Nicholas W Wood; Sophie Hambleton; Siobhan O Burns; Adrian J Thrasher; Dinakantha Kumararatne; Rainer Doffinger; Sergey Nejentsev
Journal:  Bioinformatics       Date:  2012-08-31       Impact factor: 6.937
