Ha T T Duong1,2, Hirofumi Suzuki3, Saki Katagiri1, Mayu Shibata1, Misae Arai1, Kei Yura1,3,4.
Abstract
Sequencing of individual human genomes enables the study of relationships among nucleotide variations, amino acid substitutions, effects on protein structures, and diseases. Many studies have found general tendencies: for instance, pathogenic variations tend to be found in the buried regions of protein structures, benign variations tend to be found on the protein surface, and variations at evolutionarily conserved residues tend to be pathogenic. These tendencies were deduced from globular proteins with standard evolutionary changes in amino acid sequences. In this study, we investigated the distribution of variations in actin, a highly conserved protein. Many nucleotide variations and three-dimensional structures of actin have been registered in databases. By combining those data, we found that variations buried inside the protein were rather benign and variations on the surface of the protein were pathogenic. This idiosyncratic distribution of variation impact is likely attributable to the extensive use of the actin surface for protein-protein interactions.
2022 THE BIOPHYSICAL SOCIETY OF JAPAN.
Keywords: VUS; conservation; pathogenic variation; protein three-dimensional structure; protein-protein interaction
Year: 2022 PMID: 36160324 PMCID: PMC9465404 DOI: 10.2142/biophysico.bppb-v19.0025
Source DB: PubMed Journal: Biophys Physicobiol ISSN: 2189-4779
Actin proteins identified in the human genome
| Protein | Gene | Chromosome | Strand (F: forward, R: reverse) | Start | End | Protein length (residues) | UniProt |
|---|---|---|---|---|---|---|---|
| Actin alpha 1, skeletal muscle | ACTA1 | 1 | R | 229,431,499 | 229,433,115 | 377 | P68133 |
| Actin alpha 2, smooth muscle | ACTA2 | 10 | R | 88,935,223 | 88,948,930 | 377 | P62736 |
| Actin alpha cardiac muscle 1 | ACTC1 | 15 | R | 34,790,412 | 34,794,808 | 377 | P68032 |
| Actin beta | ACTB | 7 | R | 5,527,748 | 5,529,657 | 375 | P60709 |
| Actin beta like 2 | ACTBL2 | 5 | R | 57,481,577 | 57,482,707 | 376 | Q562R1 |
| Actin gamma 1 | ACTG1 | 17 | R | 81,510,690 | 81,512,354 | 375 | P63261 |
| Actin gamma 2, smooth muscle | ACTG2 | 2 | F | 73,901,312 | 73,919,575 | 376 | P63267 |
| Actin like 6A | ACTL6A | 3 | F | 179,563,093 | 179,588,010 | 429 | O96019 |
| Actin like 6B | ACTL6B | 7 | R | 100,643,246 | 100,656,354 | 426 | O94805 |
| Actin like 7A | ACTL7A | 9 | F | 108,862,323 | 108,863,630 | 435 | Q9Y615 |
| Actin like 7B | ACTL7B | 9 | R | 108,854,683 | 108,855,930 | 415 | Q9Y614 |
| Actin like 8 | ACTL8 | 1 | F | 17,823,009 | 17,826,519 | 366 | Q9H568 |
| Actin like 9 | ACTL9 | 19 | R | 8,697,451 | 8,698,701 | 416 | Q8TC94 |
| Actin like 10 | ACTL10 | 20 | F | 33,667,498 | 33,668,235 | 245 | Q5JWF8 |
Figure 1 The number of amino acid substitutions in actin and actin-like proteins based on the variation data in ClinVar. The vertical axis shows the original amino acid types and the horizontal axis shows the amino acid types resulting from the variations. The variations are classified into three categories, as stated in the Methods section. Each number in a box is the number of reported cases of that amino acid substitution caused by nucleotide changes in the actin genes. Boxes on the diagonal of each chart are in grey to emphasize synonymous variations. Boxes in yellow indicate the variations with high counts in each category.
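The tabulation described in the Figure 1 caption (original amino acid on one axis, resulting amino acid on the other) can be sketched as a 20 x 20 count matrix. This is a minimal illustration, not the authors' pipeline: the function name `substitution_matrix`, the one-letter residue ordering, and the toy input pairs are all assumptions made here, and the example data are invented rather than taken from ClinVar.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_matrix(pairs: Iterable[Tuple[str, str]]) -> List[List[int]]:
    """Count (original, variant) pairs into a 20 x 20 matrix.

    Rows are original residues, columns are variant residues.
    Diagonal cells hold synonymous variations (residue unchanged).
    """
    counts = Counter(pairs)
    return [[counts[(orig, var)] for var in AMINO_ACIDS] for orig in AMINO_ACIDS]


# Toy input, invented for illustration only (not ClinVar data).
example_pairs = [("R", "W"), ("R", "W"), ("R", "H"), ("G", "G")]
matrix = substitution_matrix(example_pairs)

row_r = AMINO_ACIDS.index("R")
col_w = AMINO_ACIDS.index("W")
print(matrix[row_r][col_w])  # number of R-to-W substitutions in the toy input
```

One such matrix would be built per variation category, with the diagonal cells corresponding to the grey boxes in the figure.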
Figure 2 Relationship between variation type and amino acid residue accessibility, shown as a violin plot. The numbers of variations differ from those in Figure 1, because some of the variations could not be mapped onto the 3D structure of actin (PDB ID: 6NBW, chain A). Red, green and blue shapes represent the density of the data, obtained by kernel density estimation with a normal distribution. A black box represents the range between the first and third quartiles, a bar in the black box is the median, and a white dot is the mean value. The number in parentheses is the count for each significance category. Note that the number of benign variations is small.
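The summary statistics overlaid on each violin in Figure 2 (first and third quartiles, median, mean, and count) can be computed per significance category along the following lines. This is a hedged sketch: the function name `summarize`, the dictionary layout, and the accessibility values are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption, since the caption does not state which one was used.

```python
import statistics
from typing import Dict, List


def summarize(accessibility: List[float]) -> Dict[str, float]:
    """Return count, quartiles, median and mean for one category."""
    # Quartile convention is an assumption; the source does not specify it.
    q1, q2, q3 = statistics.quantiles(accessibility, n=4, method="inclusive")
    return {
        "count": len(accessibility),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "mean": statistics.fmean(accessibility),
    }


# Invented accessibility values (percent), for illustration only.
toy_data = {
    "pathogenic": [0.0, 10.0, 20.0, 30.0, 40.0],
    "benign": [5.0, 15.0],
}
summary = {category: summarize(values) for category, values in toy_data.items()}
print(summary["pathogenic"])
```

With very few points in a category, as the caption notes for benign variations, the quartiles and the density shape are weakly determined and should be read with caution.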
Figure 3 Locations of benign variations in the actin 3D structure. The protein is shown as a ribbon model with the side chains drawn as lines. Three benign variation sites are shown as space-filling models in black. Two benign variation sites are located on the disordered loop depicted by a dotted line. Each variation is labeled with the name of the gene in which the variant was found, the residue number flanked by the original and variant amino acid types, and the accessibility of the original residue. The colored space-filling model in the center is Mg2+-ATP. The orientation of actin in this figure is referred to as Front throughout this manuscript.
Figure 4 Protein-binding sites and variation sites on the surface of actin. The orientation of the molecule on the left (Front) is the same as in Figure 3. The orientation on the right (Back) is a 180° rotation of the left. The molecule shown as a space-filling model is ATP. (A) Actin-binding sites are colored in red on the actin surface. The binding sites were derived from the following PDB entries: 3J82, 3JBK, 3LUE, 5JLH, 6ANU, 6CXJ, 6G2T, 6LTJ, 6UK4, 6VAO, 6VEC, and 7CCC. (B) Myosin- and tropomyosin-binding sites, derived from 5JLH, 6CXJ, and 6G2T. (C) Binding sites of other proteins, derived from 1LOT, 3BYH, 3JBK, 3LUE, 5UBO, 6ANU, 6LTJ, 6NBW, 6UC4, 6VAO, 6VEC, 7CCC, and 7NZM. (D) Pathogenic variation sites. (E) VUS sites. White arrows in (A) and (B) indicate the surfaces where no binding sites were assigned. The black dotted circle in (D) marks the residue cluster discussed in the text.
Figure 5 Relationship among amino acid variation, accessibility and binding sites on actin. Horizontal axes of the graphs are residue numbers. Vertical axes on the left are the smoothed accessibility/correlation coefficient, plotted in black. Smoothing was carried out without weights, using a window size of three for accessibility and a window size of seven for the correlation coefficient. The vertical axis on the right is the smoothed variability, plotted in red and green. Variability is defined as the mean number of variations within a window of five residues. The red box indicates the ATP-binding site calculated on 6NBW, and blue boxes are the protein-binding sites given in Figures 4A-C. Yellow shaded regions with 3D structures are discussed in the text.
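The unweighted window smoothing named in the Figure 5 caption can be sketched as a centered moving average. The function names `smooth` and `variability` are hypothetical, and the handling of the sequence ends (shrinking the window so that only existing residues are averaged) is an assumption, since the caption does not say how the termini were treated. The input values below are invented.

```python
from typing import List, Sequence


def smooth(values: Sequence[float], window: int) -> List[float]:
    """Centered, unweighted moving average over `window` residues.

    At the sequence ends the window is truncated to the residues that
    exist (an assumption; the source does not describe edge handling).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(sum(values[start:stop]) / (stop - start))
    return smoothed


def variability(variation_counts: Sequence[int], window: int = 5) -> List[float]:
    """Mean number of variations per residue within the window."""
    return smooth(variation_counts, window)


# Invented per-residue values, for illustration only.
accessibility = [0.0, 3.0, 6.0, 3.0, 0.0]
print(smooth(accessibility, 3))        # window of three, as for accessibility
print(variability([0, 0, 5, 0, 0]))    # window of five, as for variability
```

The same `smooth` function with `window=7` would correspond to the treatment described for the correlation coefficient.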