Codon Usage Bias and Cluster Analysis of the MMP-2 and MMP-9 Genes in Seven Mammals.

Tanliang Ouyang1, Jincheng Zhong1, Zhixin Chai1, Jiabo Wang1, Ming Zhang1, Zhijuan Wu1,2, Jinwei Xin2.   

Abstract

Matrix metalloproteinase (MMP)-2 and MMP-9 are Zn2+- and Ca2+-dependent gelatinases of the MMP family that regulate muscle development and are relevant to disease treatment, and they are highly conserved during biological evolution. Despite increasing knowledge of MMP genes, the evolutionary mechanism underlying their functional adaptation remains unclear. Analysis of codon usage bias (CUB) is a reliable way to understand evolutionary associations; however, the distribution of CUB of the MMP-2 and MMP-9 genes in mammals has not been clearly described. Several analytical software tools were used to study the genetic evolution, phylogeny, and codon usage pattern of these two genes in seven species of mammals. Results showed that the MMP-2 and MMP-9 genes have CUB. By comparing the content of synonymous codon bases amongst the seven mammals, we found that MMP-2 and MMP-9 were low-expression genes in mammals with high codon conservation, and that the third codon position preferred G/C. RSCU analysis revealed that these two genes preferred codons encoding delicious amino acids. Analysis of the factors influencing CUB showed that the third base distributions of these two genes were mainly C/A and C/T, respectively, and that GC3S had a wide distribution range relative to the ENC plot reference curve, which describes codon usage in the absence of selection. Thus, mutational pressure is an important factor in CUB. This study revealed the usage characteristics of the MMP-2 and MMP-9 gene codons in different mammals and provided basic data for further study towards enhancing meat flavour, treating muscle disease, and optimizing codons.
Copyright © 2022 Tanliang Ouyang et al.

Year:  2022        PMID: 36118275      PMCID: PMC9467794          DOI: 10.1155/2022/2823356

Source DB:  PubMed          Journal:  Genet Res (Camb)        ISSN: 0016-6723            Impact factor:   1.375


1. Introduction

Codon usage bias (CUB) is defined as the unequal frequency with which synonymous codons are used to encode amino acids (AAs), and it has been used extensively for investigating gene phylogeny [1]. The characteristics of synonymous codons include universality, degeneracy, and wobble, and in the absence of any selective or mutational pressure they would be used randomly to encode the corresponding AAs. However, CUB can be affected by nucleotide composition, translation, hydrophobicity, tRNA abundance, and protein structure [2-6]. Notably, natural selection and mutational pressure, which drive the correct translation process, are the major factors associated with CUB [7, 8]. Natural selection affects the pattern of codon usage in organisms, and mutational pressure may arise whilst the proportion of codon bases changes. CUB greatly increases the variability of genetic information and reflects the genetic drift of codons to a certain extent [9]. Therefore, CUB can reveal the evolution of genes or organisms and their environmental adaptation [10]. CUB is assessed by using the effective number of codons (ENC), codon adaptation index (CAI), frequency of optimal codons (FOP), codon bias index (CBI), and relative synonymous codon usage (RSCU). ENC quantifies how far the usage of synonymous codons departs from equal usage and is commonly evaluated against the GC content at synonymous third codon positions [11]. CAI ranges from 0 to 1; the closer the value is to 1, the stronger the preference for particular synonymous codons [12]. FOP and CBI also range from 0 to 1. When these two indicators are close to 1, the optimal codons for encoding amino acids are preferred. However, a negative CBI indicates that optimal codons are used less often than expected under random usage [3, 13]. RSCU is the ratio between the observed frequency of a codon and the frequency expected if all synonymous codons for that AA were used equally, that is, if there were no codon bias. If RSCU = 1, there is no CUB. If RSCU > 1, the codon appears more frequently than the other synonymous codons. By contrast, RSCU < 1 indicates that the codon is used less frequently. If RSCU > 2, the CUB is extremely strong [14].
Matrix metalloproteinases (MMPs) are a family of Zn2+- and Ca2+-dependent proteolytic enzymes that are widely expressed in animal tissues and highly conserved during biological evolution [15]. MMP-2 and MMP-9 can regulate muscle growth, repair, and related processes that affect the biochemical reactions involved in muscle regulation [16]. Although recent research has mainly focused on the function of MMP-2 and MMP-9 in animal skeletal muscle development, the healing of diseased muscle, and even meat quality [17-23], studies on MMP codon usage are rare. Therefore, there is an urgent need to explore the genetic evolution and codon usage pattern of the mammalian MMP-2 and MMP-9 genes that regulate muscle growth. In this study, seven mammals (Bos grunniens, Bos taurus, and Sus scrofa among Artiodactyla; Macaca mulatta in Primates; Canis lupus familiaris in Carnivora; Oryctolagus cuniculus in Lagomorpha; and Mus musculus in Rodentia) were chosen to analyse CUB and base composition dynamics. This study gives insight into the factors affecting CUB for the MMP-2 and MMP-9 genes and provides basic data for enhancing meat flavour and finding a promising gene treatment for muscle disease.

2. Materials and Methods

2.1. Software

MEGA 7.0, CodonW 1.4.2, and the pheatmap and ggplot2 packages based on R 4.4.3 software were used to complete the relevant analysis.

2.2. Base Composition of MMP Genes' CDS in Different Mammals

The coding sequences (CDSs) of the yak MMP-2 and MMP-9 genes were obtained in our laboratory, and their NCBI accession numbers are MZ476247 and MZ476248, respectively. The CDSs of the other animals' genes were taken from NCBI GenBank, and their accession numbers are shown in Figure 1.
Figure 1

NJ trees of MMP-2 (a) and MMP-9 (b) genes of seven mammal species.

CodonW 1.4.2 software developed by J. Peden was used to analyse the MMP-2 and MMP-9 CDSs of the seven mammals and to calculate A/T (the A/T base content; the same convention applies below), G/C, T3S (the frequency of T at the third position of synonymous codons; the same convention applies below), C3S, A3S, G3S, GC3S, AT3S, ENC, CAI, CBI, FOP, and RSCU [24]. The R packages pheatmap and ggplot2 were used to analyse and visualise the data.
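As a rough illustration of the RSCU index computed in this step, the following minimal Python sketch counts codons in a CDS and divides each count by the count expected under equal usage within its synonymous family. It is not CodonW's implementation: the function name `rscu`, the toy sequence, and the choice to treat sixfold-degenerate amino acids and stop codons as single families under the standard genetic code are assumptions made for this example.

```python
from collections import Counter

# Standard genetic code, built in TCAG order (DNA alphabet; U is mapped to T).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU per codon for every amino acid observed in `cds`.

    RSCU = observed count / (family total / family size), so a value of 1
    means the codon is used exactly as often as expected without bias.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result: dict[str, float] = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                           # amino acid absent from the CDS
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


# Hypothetical toy CDS: three CTG codons and one CTT codon (all leucine).
toy_values = rscu("CTGCTGCTGCTT")
print(toy_values["CTG"], toy_values["CTT"])   # 4.5 1.5
```

In the toy sequence leucine has six synonymous codons and four occurrences, so CTG (three occurrences) reaches 3 × 6 / 4 = 4.5, well above the RSCU > 2 threshold described in the Introduction.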

2.3. PR2 Plot

The parity rule 2 (PR2) plot is used to analyse the bias amongst A, T, C, and G at the third codon position under gene mutation [25]. If the frequency of A at the third base is higher than that of T (A > T), the dots are scattered in the upper part of the PR2 plot. If C > G, the dots lie on the left. When the codon does not show usage bias, the dots are in the centre of the graph [26].
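The placement rules above can be sketched in a few lines of Python, assuming the usual PR2 axes of G3/(G3 + C3) on the x-axis and A3/(A3 + T3) on the y-axis. The function names are hypothetical, and the inputs are the A3S, T3S, G3S, and C3S values reported for MMP-2 in Table 1; a formal PR2 analysis is normally restricted to fourfold-degenerate codon families, so this is only an approximation of the published plot.

```python
def pr2_point(a3: float, t3: float, g3: float, c3: float) -> tuple[float, float]:
    """Return (x, y) = (G3/(G3+C3), A3/(A3+T3)); (0.5, 0.5) means no bias."""
    return g3 / (g3 + c3), a3 / (a3 + t3)


def describe(x: float, y: float) -> str:
    """Translate a PR2 coordinate into the verbal rule used in the text."""
    horizontal = "left (C > G)" if x < 0.5 else "right (G >= C)"
    vertical = "top (A > T)" if y > 0.5 else "bottom (T >= A)"
    return f"{horizontal}, {vertical}"


# (A3S, T3S, G3S, C3S) for the MMP-2 gene, copied from Table 1.
mmp2_third_base = {
    "Bos grunniens": (0.2081, 0.2093, 0.3703, 0.372),
    "Mus musculus": (0.3212, 0.2974, 0.2566, 0.3234),
}

pr2_coords = {name: pr2_point(*values) for name, values in mmp2_third_base.items()}
for name, (x, y) in pr2_coords.items():
    print(f"{name}: x={x:.3f}, y={y:.3f} -> {describe(x, y)}")
```

With these inputs the yak point sits almost exactly at the centre, while the mouse point falls in the upper left, that is, A > T and C > G.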

2.4. Codon Neutral Analysis

Codon neutral analysis was carried out by the correlation analysis of GC12 (the average of the GC content of the first and second bases) and GC3S to compare the influence of natural selection pressure and mutational pressure on CUB [27]. A significant correlation between GC12 and GC3S indicated that mutational pressure had a strong influence on codon preference; otherwise, natural selection influenced CUB [28].
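The correlation step of the neutrality analysis can be reproduced with a short Python sketch. It computes the Pearson coefficient between GC3S (Tables 1 and 2) and GC12 (Tables 3 and 4) for the seven species; the helper `pearson_r` is written out by hand here for illustration and is not the software used in the study. Significance testing (the p value) is omitted.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Species order: B. grunniens, B. taurus, M. mulatta, M. musculus,
# O. cuniculus, S. scrofa, C. lupus familiaris.
gc3s_mmp2 = [0.637, 0.488, 0.551, 0.482, 0.73, 0.731, 0.482]           # Table 1
gc12_mmp2 = [0.54465, 0.66155, 0.58665, 0.58385, 0.528, 0.53895, 0.61375]  # Table 3
gc3s_mmp9 = [0.685, 0.759, 0.747, 0.503, 0.606, 0.752, 0.779]          # Table 2
gc12_mmp9 = [0.6004, 0.5485, 0.5521, 0.57765, 0.6628, 0.54105, 0.54705]    # Table 4

r_mmp2 = pearson_r(gc3s_mmp2, gc12_mmp2)
r_mmp9 = pearson_r(gc3s_mmp9, gc12_mmp9)
print(f"MMP-2: r = {r_mmp2:.3f}")
print(f"MMP-9: r = {r_mmp9:.3f}")
```

Both coefficients come out negative and agree with the values listed in Table 7 to within rounding.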

2.5. ENC Plot

The relationship between ENC and GC3S in the absence of selection pressure can be described by formula (1), where s denotes GC3S:

$$\mathrm{ENC} = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}} \tag{1}$$

The ENC/GC3S reference curve shows the main characteristics of codon usage patterns [24]. If CUB is mainly affected by natural selection, the points fall below the standard curve. By contrast, they fall above the standard curve if CUB is more affected by other factors such as gene mutation. In general, ENC ranges from 20 to 61. If ENC > 35, CUB is weak [11].
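A minimal Python sketch of the ENC plot comparison follows, assuming the standard Wright (1990) form of the reference curve (stated in the docstring). The function names are hypothetical; the (GC3S, ENC) pairs are copied from Tables 1 to 4, and the script only reports whether each observed ENC lies below the expected value.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC when only GC3S composition shapes codon usage.

    Assumed reference curve (Wright, 1990):
        ENC = 2 + s + 29 / (s^2 + (1 - s)^2), with s = GC3S.
    """
    s = gc3s
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# species -> (GC3S, observed ENC); values from Tables 1 and 3 (MMP-2)
# and Tables 2 and 4 (MMP-9).
MMP2 = {
    "Bos grunniens": (0.637, 48.04),
    "Bos taurus": (0.488, 53.82),
    "Macaca mulatta": (0.551, 52.31),
    "Mus musculus": (0.482, 55.09),
    "Oryctolagus cuniculus": (0.73, 43.1),
    "Sus scrofa": (0.731, 44.62),
    "Canis lupus familiaris": (0.482, 55.29),
}
MMP9 = {
    "Bos grunniens": (0.685, 44.44),
    "Bos taurus": (0.759, 42.48),
    "Macaca mulatta": (0.747, 44.99),
    "Mus musculus": (0.503, 56.96),
    "Oryctolagus cuniculus": (0.606, 54.05),
    "Sus scrofa": (0.752, 43.79),
    "Canis lupus familiaris": (0.779, 40.95),
}


def below_curve(points: dict[str, tuple[float, float]]) -> dict[str, bool]:
    """Flag species whose observed ENC lies below the reference curve."""
    return {name: enc < expected_enc(gc3s) for name, (gc3s, enc) in points.items()}


for gene, points in (("MMP-2", MMP2), ("MMP-9", MMP9)):
    for name, (gc3s, enc) in points.items():
        print(f"{gene} {name}: observed {enc:.2f}, expected {expected_enc(gc3s):.2f}")
```

At GC3S = 0.5 the curve reaches its maximum of 60.5, and every tabulated point for both genes falls below the curve.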

3. Results

3.1. Phylogenetic Analysis

Neighbour joining (NJ) trees were constructed based on the MMP-2 and MMP-9 CDSs of the seven mammals. The results (Figure 1) showed that the MMP-2 and MMP-9 genes of B. grunniens were similar to those of B. taurus, and that these two genes of S. scrofa were similar to those of B. grunniens and B. taurus. Interestingly, the MMP-9 gene of C. lupus familiaris was closer to that of S. scrofa, whereas its MMP-2 gene was more distant.

3.2. Nucleotide Composition of MMP-2 and MMP-9 Genes

Comparison of the codon base content of the MMP-2 and MMP-9 genes in the seven mammals (Tables 1 and 2) showed that the G/C content was higher than the A/T content. In most mammals, the GC3S of MMP-2 and MMP-9 was larger than AT3S, except for the MMP-2 gene of B. taurus, M. musculus, and C. lupus familiaris. These findings indicated that the MMP-2 and MMP-9 gene codons preferred G/C at the third position.
Table 1

Nucleotide composition in the sequence of MMP-2 gene.

Species                  A/T     G/C     T3S     C3S     A3S     G3S     GC3S    AT3S
Bos grunniens            0.415   0.585   0.2093  0.372   0.2081  0.3703  0.637   0.363
Bos taurus               0.384   0.616   0.2638  0.3049  0.34    0.2724  0.488   0.512
Macaca mulatta           0.409   0.591   0.2247  0.3128  0.2741  0.3031  0.551   0.449
Mus musculus             0.433   0.567   0.2974  0.3234  0.3212  0.2566  0.482   0.518
Oryctolagus cuniculus    0.4     0.6     0.1997  0.5452  0.1304  0.3621  0.73    0.27
Sus scrofa               0.393   0.607   0.1871  0.5485  0.1404  0.3488  0.731   0.269
Canis lupus familiaris   0.417   0.583   0.284   0.3302  0.3285  0.2415  0.482   0.518
Table 2

Nucleotide composition in the sequence of MMP-9 gene.

Species                  A/T     G/C     T3S     C3S     A3S     G3S     GC3S    AT3S
Bos grunniens            0.366   0.634   0.1842  0.5263  0.1981  0.2891  0.685   0.315
Bos taurus               0.378   0.622   0.173   0.5554  0.1202  0.3781  0.759   0.241
Macaca mulatta           0.379   0.621   0.1766  0.5135  0.1277  0.3978  0.747   0.253
Mus musculus             0.433   0.567   0.2776  0.3436  0.3056  0.2458  0.503   0.497
Oryctolagus cuniculus    0.345   0.655   0.2237  0.3099  0.1976  0.3431  0.606   0.394
Sus scrofa               0.385   0.615   0.1814  0.5411  0.1207  0.3852  0.752   0.248
Canis lupus familiaris   0.373   0.627   0.1554  0.5702  0.1143  0.39    0.779   0.221
The codon usage results (Tables 3 and 4) showed that ENCs of the MMP-2 and MMP-9 genes in seven mammals were 40–56, indicating that these two genes had low expression and their codon conservation was high.
Table 3

Codon analysis of MMP-2 gene.

Species                  ENC     CAI     CBI      Fop     GC1S    GC2S    GC12
Bos grunniens            48.04   0.193   −0.004   0.416   0.4932  0.5961  0.54465
Bos taurus               53.82   0.129   −0.006   0.386   0.7471  0.576   0.66155
Macaca mulatta           52.31   0.157   0.006    0.413   0.4982  0.6751  0.58665
Mus musculus             55.09   0.163   0.034    0.416   0.6238  0.5439  0.58385
Oryctolagus cuniculus    43.1    0.29    0.193    0.539   0.565   0.491   0.528
Sus scrofa               44.62   0.261   0.17     0.524   0.5579  0.52    0.53895
Canis lupus familiaris   55.29   0.146   0.014    0.4     0.6706  0.5569  0.61375
Table 4

Codon analysis of MMP-9 gene.

Species                  ENC     CAI     CBI      Fop     GC1S    GC2S    GC12
Bos grunniens            44.44   0.256   0.24     0.558   0.6713  0.5295  0.6004
Bos taurus               42.48   0.291   0.249    0.566   0.6015  0.4955  0.5485
Macaca mulatta           44.99   0.252   0.191    0.53    0.5997  0.5045  0.5521
Mus musculus             56.96   0.16    0.042    0.419   0.6105  0.5448  0.57765
Oryctolagus cuniculus    54.05   0.154   0.011    0.419   0.5065  0.8191  0.6628
Sus scrofa               43.79   0.282   0.222    0.549   0.5958  0.4863  0.54105
Canis lupus familiaris   40.95   0.287   0.243    0.563   0.6025  0.4916  0.54705
CAI showed that the preference for synonymous codons of the MMP-9 gene in the seven mammals was generally stronger than that of MMP-2, but all values were lower than 0.3, so CAI alone did not clearly reflect the preference for synonymous codons. The FOP and CBI results of the MMP-2 and MMP-9 genes showed that the optimal codon usage of MMP-2 in B. grunniens and B. taurus was lower than in the five other animals, whilst the optimal codon usage of MMP-9 was higher than that of MMP-2.

3.3. RSCU Analysis

The RSCU results of the MMP-2 and MMP-9 genes showed that these two genes had a preference for 27 and 20 codons, respectively (Tables 5 and 6). Amongst them, CUG (encoding leucine, Leu) of MMP-2 and CUG (Leu), ACC (encoding threonine, Thr), and CGC (encoding arginine, Arg) of MMP-9 showed strong bias (RSCU > 2).
Table 5

RSCU for MMP-2 gene among seven species.

AA    Codon    Frequency   RSCU
Phe   UUU      66          0.675714
      UUC      142         1.324286
Leu   UUA      28          0.268571
      UUG      71          0.732857
      CUU      127         1.09
      CUC      101         1.011429
      CUA      53          0.468571
      CUG∗∗    239         2.428571
Ile   AUU      34          0.801429
      AUC      67          1.28
      AUA      34          0.918571
Met   AUG      110         1
Val   GUU      65          0.861429
      GUC      47          0.715714
      GUA      39          0.522857
      GUG      122         1.902857
Ser   UCU      89          1.078571
      UCC      104         1.27
      UCA      70          0.904286
      UCG      23          0.317143
Pro   CCU      157         0.974286
      CCC      256         1.641429
      CCA      124         0.75
      CCG      102         0.635714
Thr   ACU      74          0.795714
      ACC      134         1.365714
      ACA      108         1.142857
      ACG      64          0.697143
Ala   GCU      128         1.047143
      GCC      199         1.585714
      GCA      99          0.735714
      GCG      79          0.631429
Tyr   UAU      30          0.678571
      UAC      75          1.321429
His   CAU      92          0.74
      CAC      148         1.26
Gln   CAA      144         0.864286
      CAG      132         1.135714
Asn   AAU      40          0.787143
      AAC      75          1.212857
Lys   AAA      58          0.822857
      AAG      138         1.178571
Asp   GAU      94          0.835714
      GAC      131         1.164286
Glu   GAA      116         0.944286
      GAG      121         1.055714
Cys   UGU      73          0.69
      UGC      151         1.31
Trp   UGG      206         1
Arg   CGU      34          0.428571
      CGC      104         1.327143
      CGA      68          0.772857
      CGG      92          1.177143
Ser   AGU      66          0.897143
      AGC      112         1.534286
Arg   AGA      109         1.212857
      AGG      93          1.08
Gly   GGU      71          0.527143
      GGC      207         1.371429
      GGA      156         1.074286
      GGG      149         1.03
TER   UAA      21          0.374286
      UAG      21          0.425714
      UGA∗∗    132         2.201429

Note. ∗RSCU > 1; ∗∗RSCU > 2; AA, amino acid; TER, termination codon; the same below.

Table 6

RSCU for MMP-9 gene among seven species.

AA    Codon    Frequency   RSCU
Phe   UUU      72          0.584285714
      UUC      197         1.415714286
Leu   UUA      17          0.17
      UUG      44          0.621428571
      CUU      76          0.848571429
      CUC      107         1.362857143
      CUA      33          0.357142857
      CUG∗∗    195         2.641428571
Ile   AUU      30          0.811428571
      AUC      61          1.881428571
      AUA      11          0.308571429
Met   AUG      49          1
Val   GUU      48          0.684285714
      GUC      66          0.957142857
      GUA      29          0.375714286
      GUG      147         1.978571429
Ser   UCU      75          0.994285714
      UCC      118         1.862857143
      UCA      41          0.494285714
      UCG      55          0.832857143
Pro   CCU      117         0.85
      CCC      209         1.644285714
      CCA      108         0.735714286
      CCG      102         0.774285714
Thr   ACU      74          0.688571429
      ACC∗∗    208         2.145714286
      ACA      54          0.484285714
      ACG      82          0.684285714
Ala   GCU      84          0.792857143
      GCC      175         1.731428571
      GCA      71          0.655714286
      GCG      87          0.815714286
Tyr   UAU      35          0.54
      UAC      119         1.46
His   CAU      39          0.441428571
      CAC      116         1.558571429
Gln   CAA      62          0.537142857
      CAG      136         1.462857143
Asn   AAU      33          0.664285714
      AAC      72          1.335714286
Lys   AAA      41          0.598571429
      AAG      105         1.401428571
Asp   GAU      70          0.571428571
      GAC      210         1.428571429
Glu   GAA      72          0.592857143
      GAG      159         1.407142857
Cys   UGU      46          0.417142857
      UGC      135         1.582857143
Trp   UGG      128         1
Arg   CGU      45          0.532857143
      CGC∗∗    134         2.072857143
      CGA      54          0.667142857
      CGG      77          0.965714286
Ser   AGU      31          0.418571429
      AGC      91          1.395714286
Arg   AGA      63          0.752857143
      AGG      79          1.011428571
Gly   GGU      77          0.615714286
      GGC      219         1.738571429
      GGA      86          0.654285714
      GGG      124         0.991428571
TER   UAA      18          0.571428571
      UAG      20          1.538571429
      UGA      37          0.89

Heat map analysis of the correlation between codon base composition and GC3S (Figure 2) showed that most of the codons of the MMP-2 and MMP-9 genes in the different mammals were positively correlated with GC3S, in particular the AC-, CG-, AT-, TC-, GG-, CC-, GC-, and other codons whose third base was C.
Figure 2

Heat maps of MMP-2 (a) and MMP-9 (b). Correlation coefficient of codons with GC3S.

Cluster analysis of the RSCU of the MMP-2 and MMP-9 genes showed that the MMP-2 gene preferred the CUG, GUG, UCC, GAG, AUC, AAC, UAC, GCC, AGA, UUG, and AGG codons, which mainly encode Leu (leucine), Val (valine), Ser (serine), Glu (glutamic acid), Ile (isoleucine), Asn (asparagine), Tyr (tyrosine), Ala (alanine), and Arg (arginine) (Figures 3 and 4). Except for B. taurus and C. lupus familiaris, the five other species had a strong preference for CUG and GUG (RSCU > 2), and in O. cuniculus and B. grunniens the RSCU was > 3. The MMP-9 gene preferred the UCC, ACC, CGC, CUG, and AUC codons, which encode Ser, Thr, Arg, Leu, and Ile, respectively. Except for M. musculus and O. cuniculus, the other five species had strong preferences similar to one another, indicating that the MMP-9 gene was more conserved than MMP-2.
Figure 3

Clustering of RSCU values of each codon in MMP-2 gene.

Figure 4

Clustering of RSCU values of each codon in MMP-9 gene.

3.4. Factors Influencing CUB

The PR2 plot result (Figure 5) showed that the ATCG base distribution of the MMP-2 and MMP-9 genes amongst the seven mammals was above 0.5 on the x-axis. The base distribution of the MMP-2 genes lay mainly along the x-axis and to the upper right of the y-axis, and that of the MMP-9 genes lay towards the x-axis and to the upper right of the y-axis. These results indicated that the contents of A3S and C3S for the MMP-2 gene and the contents of T3S and C3S for the MMP-9 gene were high.
Figure 5

PR2 plot among seven species.

Neutral analysis (Figure 6 and Table 7) showed that GC3S of these two genes was in the range of 0.48–0.78, whereas GC12 ranged from 0.52 to 0.67. GC12 and GC3S of the MMP-2 gene were strongly negatively correlated (Pearson r = −0.851, p value < 0.05), whilst GC12 and GC3S of the MMP-9 gene were not significantly correlated, indicating that the base composition of the MMP-2 gene codons was susceptible to mutational pressure, whereas the factor influencing the MMP-9 gene was natural selection.
Figure 6

Codon neutral analysis.

Table 7

Pearson relative analysis with GC3S.

Variable   Pearson r (MMP-2)   P value (MMP-2)   Pearson r (MMP-9)   P value (MMP-9)
ENC        −0.993              0.000∗∗           −0.963              0.000∗∗
GC12       −0.851              0.015             −0.589              0.164

Note. ∗P value < 0.05; ∗∗P value < 0.01; red represents strong correlation, blue represents moderate correlation, and black represents irrelevance.

The ENC plot showed (Figure 7 and Table 7) that all ENC/GC3S dots of the MMP-2 and MMP-9 genes were distributed below the reference line. ENC and GC3S had a strongly negative correlation (MMP-2: Pearson r = −0.993, p value < 0.01; MMP-9: Pearson r = −0.963, p value < 0.01), and the distribution range of GC3S was large, indicating that the CUB of these two genes was affected by mutational pressure.
Figure 7

ENC plot among seven species.

4. Discussion

This study found that the gelatinase MMP genes had CUB for codons encoding amino acids such as Ile, Arg, Glu, and Ser, which are related to muscle development and meat quality. Gly, Arg, and Leu can promote collagen synthesis, and animal muscle is the main source of natural collagen for humans [29, 30]. Delicious amino acids (DAAs), including Glu, Gly, Ser, Asp, Arg, and Ile, are known as precursor substances that determine the flavour of meat and can improve the taste of chicken and keep the meat soft [31]. Recent research found that the quality of chicken improves as the content of DAAs increases [32]. In addition, Strecker amino acids (SAAs), including Phe (phenylalanine), Cys (cysteine), Ile (isoleucine), and Leu (leucine), are highly related to the production of flavour: the higher their content, the stronger the fragrance [33]. For the MMP-2 and MMP-9 genes, the RSCUs of AUC encoding Ile; UCC and AGC encoding Ser; CGC encoding Arg; GAC encoding Asp; GAG encoding Glu; UUC encoding Phe; and GGA, GGC, and GGG encoding Gly were > 1. In particular, the RSCUs of CUG encoding Leu and CGC encoding Arg were > 2, indicating that MMP-2 and MMP-9 showed CUB for DAAs and SAAs. Besides, Leu, Ile, and Val belong to the branched-chain amino acids (BCAAs), which are essential AAs in humans and animals and account for about 35% of muscle protein. Previous studies have found that skeletal muscle, as the initial site of BCAA catabolism, can be activated by branched-chain keto acids (BCKAs) to increase BCAA synthesis and relieve muscle wasting disorders [34]. Leu supplementation could also aid the prevention and treatment of sarcopenia with aging [35]. Thus, BCAAs are important regulators of metabolism and metabolic health in vivo [36]. The CUB of the gelatinase MMPs for the corresponding AAs can provide basic data for molecular modification of MMPs aimed at improving meat quality and treating muscle disease. Mutational pressure may be the main factor influencing the CUB of MMPs.
This study found that the clustering results of the RSCU were different from the NJ trees of the genes, indicating that the MMP genes were highly conserved but may have been subjected to mutations during the evolution of different species. This influence reduced the accuracy of single-gene species classification. The nucleotide A, T (U), C, and G base composition is an important feature of genes, and the GC content can reflect the overall trend of gene mutation, which is a decisive factor affecting the frequency of nucleotide use. Changes in the third base of the codon often do not affect the encoded AA, so GC3S is an important reference for analysing the codon usage pattern. In the absence of natural selection, gene mutation affects the composition of the third bases of synonymous codons, and the stronger the CUB, the more the codons are inclined towards G/C at the third position [37, 38]. We also found that the third base distribution of the MMP-2 and MMP-9 genes was mainly AC3S and CT3S, respectively, and that the ENC/GC3S dots were distributed over a wide range relative to the reference curve under mutational pressure. Thus, mutational pressure may play an important role in affecting the CUB of the MMP-2 and MMP-9 genes, which also explains the difference in RSCU clustering amongst the seven mammals. Interestingly, we also found that the clustering results based on the RSCU of the MMP-2 gene were not completely consistent with the phylogenetic results based on the CDS of the MMP-2 gene. For example, because wild yak and Tibetan antelope live in harsh environments with high altitude and low oxygen, their EGLN1 gene has mutated, changing nucleotide bases and leading to CUB changes [39, 40]. Therefore, we believe that phylogenetic analysis of the MMP-2 gene should refer not only to the gene sequence but also to CUB, which could supplement species classification.

5. Conclusion

MMP-2 and MMP-9 are low-expression genes in mammals, and their codons are highly conserved. Both show CUB towards G/C at the third codon position and prefer codons encoding DAAs and SAAs, which provides a basis for improving meat tenderness and for treating muscle disease.
  37 in total

1.  Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position.

Authors:  N Sueoka
Journal:  Gene       Date:  1999-09-30       Impact factor: 3.688

2.  The relationship between third-codon position nucleotide content, codon bias, mRNA secondary structure and gene expression in the drosophilid alcohol dehydrogenase genes Adh and Adhr.

Authors:  D B Carlini; Y Chen; W Stephan
Journal:  Genetics       Date:  2001-10       Impact factor: 4.562

3.  Accounting for background nucleotide composition when measuring codon usage bias.

Authors:  John A Novembre
Journal:  Mol Biol Evol       Date:  2002-08       Impact factor: 16.240

Review 4.  Amino acids and muscle loss with aging.

Authors:  Satoshi Fujita; Elena Volpi
Journal:  J Nutr       Date:  2006-01       Impact factor: 4.798

Review 5.  Metalloproteinases and their inhibitors: regulators of wound healing.

Authors:  Sean E Gill; William C Parks
Journal:  Int J Biochem Cell Biol       Date:  2007-10-26       Impact factor: 5.085

6.  Evolution of amino acids and biogenic amines throughout storage in sausages made of horse, beef and turkey meats.

Authors:  Mohamed A Rabie; Cidalia Peres; F Xavier Malcata
Journal:  Meat Sci       Date:  2013-06-12       Impact factor: 5.209

7.  The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications.

Authors:  P M Sharp; W H Li
Journal:  Nucleic Acids Res       Date:  1987-02-11       Impact factor: 16.971

8.  An evolutionary perspective on synonymous codon usage in unicellular organisms.

Authors:  P M Sharp; W H Li
Journal:  J Mol Evol       Date:  1986       Impact factor: 2.395

9.  Codon selection in yeast.

Authors:  J L Bennetzen; B D Hall
Journal:  J Biol Chem       Date:  1982-03-25       Impact factor: 5.157

10.  On the origin of Tibetans and their genetic basis in adapting high-altitude environments.

Authors:  Binbin Wang; Yong-Biao Zhang; Feng Zhang; Hongbin Lin; Xumin Wang; Ning Wan; Zhenqing Ye; Haiyu Weng; Lili Zhang; Xin Li; Jiangwei Yan; Panpan Wang; Tingting Wu; Longfei Cheng; Jing Wang; Duen-Mei Wang; Xu Ma; Jun Yu
Journal:  PLoS One       Date:  2011-02-28       Impact factor: 3.240
