Investigation of mutations in the HBB gene using the 1,000 genomes database.

Tânia Carlice-Dos-Reis1, Jaime Viana1,2, Fabiano Cordeiro Moreira1,3, Greice de Lemos Cardoso1, João Guerreiro1, Sidney Santos1,3, Ândrea Ribeiro-Dos-Santos1,3.   

Abstract

Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Because of its prevalence, diverse strategies have been developed to better understand its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent resource for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 reference genome. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest. Thus, approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of whom 186 (7.4% of all samples) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.

Year:  2017        PMID: 28379995      PMCID: PMC5381778          DOI: 10.1371/journal.pone.0174637

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


1. Introduction

Understanding the relationship between phenotype and genotype in the clinical setting is one of the main objectives of traditional research [1]. However, studies on a large number of mutations are problematic, primarily because of the experimental analyses they require. In contrast, in silico analysis is faster and easier to execute, yields more results, and costs less, thus making it more efficient. This type of analysis is based on alterations in the sequences of nucleotides and/or amino acids and their comparison with the native sequence to correlate the effect of these alterations with the phenotype of the individual [1,2,3,4]. Mutations in the HBB gene, which is located on chromosome 11p15.5 [5], are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Hemoglobinopathies are a set of hereditary diseases caused by the abnormal structure or insufficient production of hemoglobin. Sickle cell anemia and β-thalassemia can lead to serious anemia and other life-threatening conditions [6]. Sickle cell anemia is one of the most common monogenic diseases worldwide. It is estimated that 312,000 people are born with sickle cell anemia every year, and the majority of these individuals are native to Sub-Saharan Africa [7]. Thus, it is important for the public healthcare system to detect heterozygous carriers of hemoglobinopathies, as they can produce homozygous and double heterozygous individuals with serious clinical conditions [8]. The 1,000 Genomes Project is an international consortium organized with the objective of sequencing a large number of individual genomes representative of the world’s population. The consortium aims to better characterize the sequence variation of the human genome and to enable the investigation of the relationship between genotype and phenotype.
Thus, the 1,000 Genomes Project enables a more precise study of variants in genome-wide association studies (GWAS) and better localization of variants associated with diseases in different population groups [9]. The objective of this study is to track variations in the β-globin gene (HBB); to describe the frequencies of mutations in different population groups using the 1,000 Genomes databank, which provides a comprehensive resource of human genetic variation [9] relative to the HG19 reference genome [10]; and to investigate the pattern of resulting pathogenicity.

2. Methodology

To perform this study, data from 2,504 samples deposited in the 1,000 Genomes database were used; these open-access sequences were aligned with the HG19 reference genome using the SNPEFF tool [11]. This program reports the effects of genetic variations, including the resulting amino acid alterations. The resulting data were visualized in the Integrative Genomics Viewer (IGV) [12], a high-performance visualization tool for the interactive exploration of genomic datasets. The mutations were tracked at the nucleotide and amino acid levels, and the population frequencies with which these mutations occur, the type of mutation, and the respective positions were recorded. To investigate the pathogenicity of these mutations, five different prediction tools (POLYPHEN [13], SIFT [14], PROVEAN [15], PANTHER [16], and MUTPRED [17]) and three databanks (CLINVAR [18], dbSNP [19] and HbVar [20]) were used, as shown in Fig 1.
Fig 1

Alignment of the 1,000 Genomes and HG19 sequences of HBB using the SNPEFF tool; predictors and databases used for the investigation of pathogenic mutations.
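The annotation step can be illustrated with a minimal sketch that pulls the gene-level effect out of an annotated VCF record. It assumes the pipe-separated `ANN` layout written by recent SnpEff versions; the paper does not state which SnpEff version or output format was used, and the example record, `parse_ann` and `gene_effects` are hypothetical names and data introduced only for illustration.

```python
# Leading fields of a SnpEff "ANN" entry, in order (pipe-separated in the VCF).
# Assumption: the ANN layout of recent SnpEff versions.
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p",
]

# Illustrative record only; it is not copied from the study's files.
EXAMPLE_LINE = (
    "11\t5248232\trs334\tT\tA\t100\tPASS\t"
    "AC=137;ANN=A|missense_variant|MODERATE|HBB|HBB|transcript|"
    "NM_000518.4|protein_coding|1/3|c.20A>T|p.Glu7Val"
)


def parse_ann(info):
    """Return one dict per SnpEff annotation found in a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            annotations = entry[len("ANN="):].split(",")
            return [dict(zip(ANN_FIELDS, a.split("|"))) for a in annotations]
    return []


def gene_effects(vcf_line, gene="HBB"):
    """List (SNP ID, position, REF/ALT, effect, protein change) for one gene."""
    cols = vcf_line.rstrip("\n").split("\t")
    pos, snp_id, ref, alt, info = int(cols[1]), cols[2], cols[3], cols[4], cols[7]
    return [
        (snp_id, pos, f"{ref}/{alt}", ann.get("annotation"), ann.get("hgvs_p"))
        for ann in parse_ann(info)
        if ann.get("gene_name") == gene
    ]


print(gene_effects(EXAMPLE_LINE))
```

Each tuple carries the same kind of information as one row of Table 1: position, SNP ID, nucleotide change, type of mutation and amino acid alteration.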

Each predictor uses distinct characteristics to determine the effect of the mutations in relation to the information obtained regarding the structure and function of the protein. It is important to highlight that the results of all predictors provide additional evidence of pathogenicity; thus, five predictors were analyzed to improve accuracy. The determination of the pathogenicity of each mutation is based on four pieces of evidence: (i) CLINVAR, (ii) dbSNP, (iii) HbVar, and (iv) predictors. Tables 1, 2 and 3 present the following results of the alignment of sequences from 2,504 samples: (1) the positions in the genome; (2) the identification of the single nucleotide polymorphism (SNP) of each mutation; (3) the types of mutations; (4) the mutations observed at the nucleotide level; (5) the respective consequences at the amino acid level; (6) the population frequency of each mutation; and (7) the pathogenicity investigated for each mutation.
Table 1

Position and SNP ID of the mutations observed at the nucleotide level, the respective consequences at the amino acid level, the types of mutations, and the number of individuals.

| Position | SNP ID | Nucleotide change | AA alteration | Type of mutation | N° individuals | Ref. |
|---|---|---|---|---|---|---|
| 5246840 | rs36020563 | G/A | His144His | Synonymous | 1 | [21] |
| 5246870 | rs113082294 | C/G | Val134Val | Synonymous | 9 | [22] |
| 5246883 | rs111645889 | G/A | Ala130Val | Missense | 1 | [23] |
| 5246890 | rs33971634 | G/A | Gln128 | Stop gained | 1 | [24] |
| 5246908 | rs33946267 | C/G | Glu122Gln | Missense | 3 | [25] |
| 5246947 | rs33958637 | T/G | Asn109His | Missense | 1 | [26] |
| 5246948 | rs193922562 | G/A | Gly108Gly | Synonymous | 1 | [27] |
| 5247876 | rs145669504 | G/T | Leu82Leu | Synonymous | 5 | [28] |
| 5247992–5247996 | rs281864900 | CAAAG/C | Phe42fs | Frameshift | 5 | [29] |
| 5248004 | rs11549407 | G/A | Gln40 | Stop gained | 1 | [30] |
| 5248029 | rs1135071 | C/A | Arg31Ser | Splice region and missense | 1 | [31] |
| 5248030 | rs33943001 | C/G | # | Splice acceptor and intron variant | 1 | [32] |
| 5248159 | rs33971440 | C/T | # | Splice donor and intron variant | 1 | [33] |
| 5248162 | rs35578002 | G/T | Glu30Gly | Splice region and synonymous variant | 1 | [34] |
| 5248173 | rs33950507 | C/T | Glu27Lys | Missense | 14 | [35] |
| 5248200 | rs33986703 | T/A | Lys18 | Stop gained | 6 | [36] |
| 5248205 | rs63750783 | C/T | Trp16 | Stop gained | 2 | [37] |
| 5248232 | rs334 | T/A | Glu7Val | Missense | 137 | [38] |
| 5248233 | rs33930165 | C/T | Glu7Lys | Missense | 17 | [39] |
| 5248236 | rs33912272 | G/A | Pro6Ser | Missense | 1 | [40] |

#—Intronic variant mutations

Table 2

SNP ID, nucleotide and Amino Acid changes, number of individuals and population frequency of each mutation.

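| SNP ID | Nucleotide change | Amino acid change | Total individuals | N° (Freq) AFR | N° (Freq) AMR | N° (Freq) EAS | N° (Freq) EUR | N° (Freq) SAS | Total allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| rs36020563 | G/A | His144His | 1 | 1 (0.0008) | 0 | 0 | 0 | 0 | 0.00019 |
| rs113082294 | C/G | Val134Val | 9 | 0 | 2 (0.0029) | 0 | 7 (0.007) | 0 | 0.00179 |
| rs111645889 | G/A | Ala130Val | 1 | 1 (0.0008) | 0 | 0 | 0 | 0 | 0.00019 |
| rs33971634 | G/A | Gln128 | 1 | 0 | 1 (0.0014) | 0 | 0 | 0 | 0.00019 |
| rs33946267 | C/G | Glu122Gln | 3 | 0 | 0 | 0 | 0 | 3 (0.0031) | 0.00059 |
| rs33958637 | T/G | Asn109His | 1 | 0 | 0 | 1 (0.001) | 0 | 0 | 0.00019 |
| rs193922562 | G/A | Gly108Gly | 1 | 1 (0.0008) | 0 | 0 | 0 | 0 | 0.00019 |
| rs145669504 | G/T | Leu82Leu | 5 | 0 | 0 | 5 (0.005) | 0 | 0 | 0.00099 |
| rs281864900 | CAAAG/C | Phe42fs | 5 | 0 | 0 | 5 (0.005) | 0 | 0 | 0.00099 |
| rs11549407 | G/A | Gln40 | 1 | 0 | 1 (0.0014) | 0 | 0 | 0 | 0.00019 |
| rs1135071 | C/A | Arg31Ser | 1 | 0 | 0 | 0 | 1 (0.001) | 0 | 0.00019 |
| rs33943001 | C/G | # | 1 | 0 | 0 | 0 | 0 | 1 (0.001) | 0.00019 |
| rs33971440 | C/T | # | 1 | 0 | 1 (0.0014) | 0 | 0 | 0 | 0.00019 |
| rs35578002 | G/T | Glu30Gly | 1 | 1 (0.0008) | 0 | 0 | 0 | 0 | 0.00019 |
| rs33950507 | C/T | Glu27Lys | 14 | 0 | 0 | 8 (0.0079) | 0 | 6 (0.0061) | 0.00279 |
| rs33986703 | T/A | Lys18 | 6 | 0 | 0 | 6 (0.006) | 0 | 0 | 0.00119 |
| rs63750783 | C/T | Trp16 | 2 | 0 | 0 | 0 | 0 | 2 (0.002) | 0.00039 |
| rs334 | T/A | Glu7Val | 137 | 132 (0.0998) | 5 (0.0072) | 0 | 0 | 0 | 0.02735 |
| rs33930165 | C/T | Glu7Lys | 17 | 17 (0.0129) | 0 | 0 | 0 | 0 | 0.00339 |
| rs33912272 | G/A | Pro6Ser | 1 | 0 | 0 | 0 | 1 (0.001) | 0 | 0.00019 |

AFR: African; AMR: American; EAS: Eastern Asian; EUR: European; SAS: Southern Asian.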
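The frequencies in Table 2 can be reproduced from the carrier counts with a short sketch. Because all observed mutations were heterozygous, each carrier contributes one mutant allele out of two per sample. The per-population sample sizes below are an assumption taken from the 1,000 Genomes phase 3 panel (they sum to the 2,504 samples used here) and are not stated in the text; `allele_frequency` and `frequencies` are illustrative names.

```python
# Assumed 1,000 Genomes phase 3 sample sizes per population group (sum = 2,504).
SAMPLES = {"AFR": 661, "AMR": 347, "EAS": 504, "EUR": 503, "SAS": 489}

# Heterozygous carrier counts copied from three rows of Table 2.
CARRIERS = {
    "rs33930165": {"AFR": 17},
    "rs33950507": {"EAS": 8, "SAS": 6},
    "rs113082294": {"AMR": 2, "EUR": 7},
}


def allele_frequency(carriers, samples):
    """Each heterozygous carrier holds one of the 2 * samples alleles."""
    return carriers / (2 * samples)


def frequencies(snp_id):
    """Return (per-population frequencies, total frequency) for one SNP."""
    counts = CARRIERS[snp_id]
    per_pop = {pop: allele_frequency(n, SAMPLES[pop]) for pop, n in counts.items()}
    total = allele_frequency(sum(counts.values()), sum(SAMPLES.values()))
    return per_pop, total


for snp in CARRIERS:
    per_pop, total = frequencies(snp)
    by_group = ", ".join(f"{pop} {freq:.4f}" for pop, freq in per_pop.items())
    print(f"{snp}: {by_group}; total {total:.5f}")
```

Rounded to four decimals, the per-population values agree with the table entries for these rows; the total column of the table appears to be truncated rather than rounded at the fifth decimal.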

Table 3

SNP ID; nucleotide alteration; amino acid alteration; total number of individuals; list of the results from CLINVAR, dbSNP, HbVar, POLYPHEN, PROVEAN, SIFT, PANTHER, and MUTPRED; and final analysis of pathogenicity.

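| SNP ID | Nucleotide change | Amino acid change | Total individuals | CLINVAR | dbSNP (NCBI) | HbVar | POLYPHEN | PROVEAN | SIFT | PANTHER | MUTPRED | Conclusion pathogenicity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs111645889 | G/A | Ala130Val | 1 | Other | Other | Benign | Benign | Damaging | Damaging | Damaging | Damaging | Conflict |
| rs33971634 | G/A | Gln128 | 1 | Damaging | Other | Damaging | * | Damaging | * | * | * | Damaging |
| rs33946267 | C/G | Glu122Gln | 3 | Damaging | Damaging | Benign | Benign | Benign | Damaging | Benign | Damaging | Conflict |
| rs33958637 | T/G | Asn109His | 1 | Other | * | Benign | Probably damaging | Damaging | Damaging | Benign | Damaging | Conflict |
| rs281864900 | CAAAG/C | Phe42fs | 5 | Damaging | Damaging | Damaging | * | Damaging | * | * | * | Damaging |
| rs11549407 | G/A | Gln40 | 1 | Damaging | Damaging | Damaging | * | * | * | * | * | Damaging |
| rs1135071 | C/A | Arg31Ser | 1 | Damaging | Damaging | Benign | Probably damaging | Damaging | Damaging | Damaging | Damaging | Damaging |
| rs33943001 | C/G | # | 1 | Damaging | Damaging | Damaging | * | * | * | * | * | Damaging |
| rs33971440 | C/T | # | 1 | Damaging | Damaging | Damaging | * | * | * | * | * | Damaging |
| rs35578002 | G/T | Glu30Gly | 1 | * | * | Damaging | Benign | Benign | Benign | Benign | Benign | Conflict |
| rs33950507 | C/T | Glu27Lys | 14 | Damaging | Damaging | Damaging | Benign | Damaging | Damaging | Damaging | Damaging | Damaging |
| rs33986703 | T/A | Lys18 | 6 | Damaging | Damaging | Damaging | * | Damaging | * | * | * | Damaging |
| rs63750783 | C/T | Trp16 | 2 | Damaging | Damaging | Damaging | * | Damaging | * | * | * | Damaging |
| rs334 | T/A | Glu7Val | 137 | Damaging | Damaging | Damaging | Benign | Damaging | Damaging | * | Damaging | Damaging |
| rs33930165 | C/T | Glu7Lys | 17 | Damaging | Damaging | Damaging | Benign | Damaging | Damaging | * | Damaging | Damaging |
| rs33912272 | G/A | Pro6Ser | 1 | Other | Other | Benign | Benign | Benign | Benign | * | Damaging | Conflict |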

* Could not be evaluated

# Intronic variant mutations
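To make the evidence in Table 3 easier to compare across variants, the calls can be tallied separately for the three databases and the five predictors. The sketch below is purely descriptive: it does not reproduce the decision rule behind the "Conclusion pathogenicity" column, which the text does not spell out, and `EVIDENCE`, `tally` and `split_by_source` are names introduced here for illustration.

```python
from collections import Counter

# Calls copied from three rows of Table 3, in column order:
# CLINVAR, dbSNP, HbVar, POLYPHEN, PROVEAN, SIFT, PANTHER, MUTPRED.
# "*" marks a source that could not be evaluated.
EVIDENCE = {
    "rs334": ["Damaging", "Damaging", "Damaging",
              "Benign", "Damaging", "Damaging", "*", "Damaging"],
    "rs33946267": ["Damaging", "Damaging", "Benign",
                   "Benign", "Benign", "Damaging", "Benign", "Damaging"],
    "rs35578002": ["*", "*", "Damaging",
                   "Benign", "Benign", "Benign", "Benign", "Benign"],
}

N_DATABASES = 3  # the first three columns are databases, the rest predictors


def tally(calls):
    """Count the calls of each kind, ignoring sources without a result."""
    return Counter(call for call in calls if call != "*")


def split_by_source(calls):
    """Return (database tally, predictor tally) for one variant."""
    return tally(calls[:N_DATABASES]), tally(calls[N_DATABASES:])


for snp, calls in EVIDENCE.items():
    databases, predictors = split_by_source(calls)
    print(snp, "databases:", dict(databases), "predictors:", dict(predictors))
```

For rs35578002 the tally shows the pattern discussed in the text: the only database with an entry calls the variant damaging, while all five predictors call it benign.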


3. Results

A total of 20 different mutations were identified in 209 individuals (8.34%) in the coding region of the HBB gene. The variants observed were classified as follows: (i) four synonymous mutations; (ii) seven missense mutations; (iii) four stop-gain mutations; (iv) one frameshift mutation; (v) one splice region and missense variant; (vi) one splice region and synonymous variant; (vii) one splice acceptor and intron variant; and (viii) one splice donor and intron variant. Missense mutations were the most frequently encountered, affecting 174 (83.2%) of the 209 individuals, as shown in Table 1. All observed mutations were heterozygous and already had SNP IDs. The mutations with the highest allelic frequencies were as follows: (i) rs334, with a total frequency of 0.0274 (African and American populations); (ii) rs33930165, with a frequency of 0.0034 (only in the African population); and (iii) rs33950507, with a frequency of 0.0028 (Eastern and Southern Asian populations), as shown in Table 2. Synonymous mutations were encountered in 16 (7.6%) of the 209 individuals and were excluded from the investigation of pathogenicity because they do not alter the amino acid sequence. Thus, the pathogenicity of the missense, stop-gain, frameshift, splice region (including the splice region and synonymous variant), splice acceptor and splice donor mutations was tracked using the dbSNP, CLINVAR and HbVar databases, as well as five in silico predictors (POLYPHEN, SIFT, PROVEAN, PANTHER and MUTPRED). The results showed 11 pathogenic mutations of HBB (Table 3). In addition, five mutations, (1) rs111645889, (2) rs33946267, (3) rs33958637, (4) rs35578002 and (5) rs33912272, presented conflicting results between predictors and databases.
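The counts in this section follow directly from Table 1, as the sketch below shows by grouping the 20 rows by type of mutation. It assumes that every individual carries exactly one of the listed variants, which is consistent with the per-row counts summing to 209; `TABLE_1` and `summarize` are illustrative names.

```python
from collections import defaultdict

TOTAL_SAMPLES = 2504

# (type of mutation, number of individuals) for the 20 rows of Table 1.
TABLE_1 = [
    ("Synonymous", 1), ("Synonymous", 9), ("Missense", 1), ("Stop gained", 1),
    ("Missense", 3), ("Missense", 1), ("Synonymous", 1), ("Synonymous", 5),
    ("Frameshift", 5), ("Stop gained", 1),
    ("Splice region and missense", 1),
    ("Splice acceptor and intron variant", 1),
    ("Splice donor and intron variant", 1),
    ("Splice region and synonymous variant", 1),
    ("Missense", 14), ("Stop gained", 6), ("Stop gained", 2),
    ("Missense", 137), ("Missense", 17), ("Missense", 1),
]


def summarize(rows):
    """Return {type: (number of variants, number of carriers)}."""
    variants = defaultdict(int)
    carriers = defaultdict(int)
    for mutation_type, individuals in rows:
        variants[mutation_type] += 1
        carriers[mutation_type] += individuals
    return {t: (variants[t], carriers[t]) for t in variants}


summary = summarize(TABLE_1)
total_carriers = sum(n for _, n in TABLE_1)

print(f"carriers: {total_carriers} of {TOTAL_SAMPLES} "
      f"({100 * total_carriers / TOTAL_SAMPLES:.2f}%)")
for mutation_type, (n_variants, n_carriers) in summary.items():
    share = 100 * n_carriers / total_carriers
    print(f"{mutation_type}: {n_variants} variants, "
          f"{n_carriers} carriers ({share:.1f}% of carriers)")
```

The percentages printed here are rounded, so they can differ from the figures in the text in the last digit.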

4. Discussion

Mutations in the HBB gene are distributed unevenly among the different population groups. The African population was the most affected, accounting for 73.2% of the individuals with mutations in this gene, while the European population was the least affected, accounting for 4.3% of these individuals. The three mutations with the greatest frequency were (1) rs334 (AFR and AMR); (2) rs33930165 (AFR); and (3) rs33950507 (EAS and SAS). The rs334 mutation is responsible for hemoglobin S, known as HbS, which causes sickle cell anemia. The rs33930165 mutation is responsible for hemoglobin C, or HbC [41], which is more frequent in the African population [42,43]. In addition, the rs33950507 mutation is responsible for hemoglobin E, or HbE [41], which is involved in β-thalassemia described in Asian populations [44]. The available data show that variants rs33986703, rs63750783, and rs281864900 are responsible for β-thalassemia and are described in Asian populations [45,46,39]. Variants rs11549407 and rs33971634 are also β-thalassemia mutations but are common in European populations [47,24]; rs33971440 and rs35578002 are commonly found in populations of the Mediterranean region [48,49,34]. Although the HBB gene is well studied, some mutations in this gene are not well known and are poorly described in the literature. This is the case for the variants rs111645889, rs33958637, rs1135071, rs33943001 and rs33912272, for which no scientific papers discussing their epidemiology were found. CLINVAR [18] is one of the most widely used databases in clinical and pathological analyses related to mutations. However, not all mutations of the HBB gene are registered in this database (rs35578002, for example, is not), and conflicting results have been observed when comparing the predictors with the CLINVAR, dbSNP and HbVar databases to estimate the pathogenicity, or more specifically the clinical significance, of mutations rs111645889, rs33946267, rs33958637, rs35578002 and rs33912272.
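The population percentages quoted above can be recovered from Table 2 by summing the carriers in each group and dividing by the 209 carriers overall. The sketch below does this and, for contrast, also reports the fraction of each group's samples that carry a mutation. The sample sizes are an assumption taken from the 1,000 Genomes phase 3 panel and do not appear in the text; the sketch also assumes that each individual carries a single variant.

```python
# Carrier counts per population group, one entry per non-zero cell of Table 2.
CARRIERS_BY_GROUP = {
    "AFR": [1, 1, 1, 1, 132, 17],
    "AMR": [2, 1, 1, 1, 5],
    "EAS": [1, 5, 5, 8, 6],
    "EUR": [7, 1, 1],
    "SAS": [3, 1, 6, 2],
}

# Assumed 1,000 Genomes phase 3 sample sizes per group (sum = 2,504).
SAMPLES = {"AFR": 661, "AMR": 347, "EAS": 504, "EUR": 503, "SAS": 489}

group_totals = {group: sum(counts) for group, counts in CARRIERS_BY_GROUP.items()}
all_carriers = sum(group_totals.values())

# Share of all carriers that belong to each group (the figure quoted in the text).
share_of_carriers = {g: n / all_carriers for g, n in group_totals.items()}

# Fraction of each group's samples that carry a mutation.
carrier_rate = {g: n / SAMPLES[g] for g, n in group_totals.items()}

for group in CARRIERS_BY_GROUP:
    print(f"{group}: {group_totals[group]} carriers, "
          f"{100 * share_of_carriers[group]:.1f}% of all carriers, "
          f"{100 * carrier_rate[group]:.1f}% of the group's samples")
```

Under these assumptions the African group holds 153 of the 209 carriers and the European group 9, which matches the 73.2% and 4.3% given in the text.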
It is important to emphasize that all samples deposited in the 1,000 Genomes Project, an international consortium aimed at producing a public catalog of human genetic variability, belong to individuals without clinical manifestations of any disease. The SNP rs35578002 is not available in CLINVAR and has no information on clinical significance in the dbSNP database. The predictors classify this variant as benign, but the HbVar database classifies it as a damaging mutation. This variant is the β-thalassemia mutation Cd29 (C>T), which in the homozygous state causes hemolytic anemia and ineffective erythropoiesis [34]. This mutation was described in Mediterranean populations. One possible explanation for the inconsistent information about the clinical significance of this variant is that it is a synonymous mutation in a splice region that is critical for RNA processing, causing thalassemia as described in HbVar. Also noteworthy is the mutation rs33946267. According to the literature, this mutation leads to the formation of Hb D-Punjab. This mutation is generally asymptomatic but may occasionally cause moderate hemolytic anemia, similar to the manifestations of sickle cell anemia, when associated with other hemoglobin variants, such as HbS or β-thalassemia mutations. Its initial distribution suggests that it is more prevalent in the central region of Asia, but due to migration, it can be found in several other regions [50]. According to the results, 8.3% of the phenotypically healthy individuals of the 1,000 Genomes database have a heterozygous mutation in the HBB gene. This means that roughly 83 out of every 1,000 individuals carry a mutant allele of the gene. The frequency of mutated genes was estimated at 0.042, so the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002.
In total, 193 subjects had a non-synonymous mutation, meaning that approximately 7.7% had a change that affects the amino acid sequence. Of these, 186 (7.4% of all samples) have a deleterious mutation based on available data on the clinical significance of these mutations (Table 3). Considering that the 1,000 Genomes database is representative of the world’s population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation. Regardless, further studies are needed to validate the clinical consequences of the mutations with undefined pathogenicity. Given the lack of pathophysiological knowledge about the newly identified mutations, the use of in silico predictors (in an orderly and criteria-based manner) emerges as a possible tool to aid in decision-making with respect to diagnostic, preventative, and treatment measures.
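The next-generation estimates in this section can be checked with a short calculation. The sketch below assumes that the figures were obtained by pooling all mutant alleles into a single class with frequency q and applying the Hardy-Weinberg expectation q² under random mating; the text does not state the method explicitly, and the function names are illustrative.

```python
TOTAL_SAMPLES = 2504


def mutant_allele_frequency(carriers, samples=TOTAL_SAMPLES):
    """Pooled mutant-allele frequency when every carrier is heterozygous."""
    return carriers / (2 * samples)


def expected_two_mutant_alleles(q):
    """Hardy-Weinberg expectation of carrying two mutant alleles
    (homozygous or compound heterozygous) under random mating."""
    return q ** 2


q_all = mutant_allele_frequency(209)          # all carriers
q_deleterious = mutant_allele_frequency(186)  # carriers of a deleterious mutation

print(f"q (all variants)      = {q_all:.4f}")
print(f"q^2 (all variants)    = {expected_two_mutant_alleles(q_all):.4f}")
print(f"q (deleterious)       = {q_deleterious:.4f}")
print(f"per 10,000 (deleter.) = {10000 * expected_two_mutant_alleles(q_deleterious):.1f}")
```

Under these assumptions q is about 0.042 and q² about 0.002 for all variants, and the deleterious subset gives roughly fourteen individuals per 10,000, in line with the values reported above.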