Abstract
The human short-chain dehydrogenase/reductase (SDR) protein family has been the subject of recent studies because of its critical role in human metabolism. Studies have also found that single nucleotide polymorphisms in the SDR protein family are responsible for a variety of genetic diseases, including type II diabetes. This study reports the effect of sequence variation on the structural and functional integrity of the human SDR protein family using phylogenetics and correlated mutation analysis tools. Our results indicate that (i) tyrosine, serine, and lysine are signature residues that contribute directly to the structural and functional stability of the SDR protein family, (ii) subgroups of the SDR protein family have their own signature residue combinations that reflect their unique functionality, and (iii) mutations in the human SDR protein family are highly correlated over evolutionary history. Taken together, the results suggest that, over evolutionary history, the SDR protein family diverged in order to adapt to changes in human nutritional demands. Our study provides insight into the structural and functional scaffolds of specific SDR subgroups, which may facilitate the design of specific inhibitors.
Keywords: consensus sequence; correlated mutation; human short-chain dehydrogenases/reductases (SDRs); multiple sequence alignment; mutational variability; phylogeny
Year: 2014 PMID: 25374450 PMCID: PMC4213187 DOI: 10.4137/EBO.S17807
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Complete consensus sequence of 71 human SDR sequences. The consensus sequence for the human SDR family was constructed using the consensus sequence constructor. Highly conserved positions (>70% identity) are marked with bold black letters: M, G, G, V, and L. Positions of intermediate conservation (>30% identity) are shown in black as the most commonly occurring residue. Positions marked X are variable positions that are not occupied by any single residue in more than 30% of sequences. As a whole, this figure displays the highly variable character of the human SDR family.
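The thresholds in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the consensus sequence constructor used in the study; the function name `consensus`, the toy alignment, and the handling of gaps (a gap counts against identity because the denominator is the total number of sequences) are assumptions made for illustration only.

```python
from collections import Counter


def consensus(aligned, high=0.70, mid=0.30):
    """Return (consensus string, indices of highly conserved columns).

    `aligned` is a list of equal-length aligned sequences; '-' marks a gap.
    Thresholds follow the caption: >70% identity is highly conserved,
    >30% is intermediate, anything else is written as 'X'.
    """
    if not aligned:
        return "", []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have equal length")

    letters, strong = [], []
    for col in range(length):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(seq[col] for seq in aligned if seq[col] != "-")
        if not counts:
            letters.append("X")
            continue
        residue, n = counts.most_common(1)[0]
        fraction = n / len(aligned)  # assumption: gaps lower the identity
        if fraction > high:
            letters.append(residue)
            strong.append(col)  # would be printed in bold in the figure
        elif fraction > mid:
            letters.append(residue)
        else:
            letters.append("X")
    return "".join(letters), strong


# Hypothetical toy alignment, not data from the study.
toy = ["MGAV", "MGCV", "MADV", "MGEL", "MGFI"]
print(consensus(toy))
```

For the toy alignment, columns 0 and 1 exceed 70% identity, column 2 has no residue above 30% and becomes X, and column 3 falls in the intermediate range.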
Figure 2. Phylogenetic tree: a phylogram constructed in SSSSg. The SDR family can be phylogenetically grouped into five distinct clades. The members of each group are considered to share similar physicochemical properties for responding to a changing environment. Thus, during evolution, the members of each group adopted features that are unique compared with those of other groups.
Figure 3. Comparison of the five consensus human SDR sequences. (A) The binding sites (K, S) and the active site (Y) of the enzymes are among the characters marked in blue. These positions are conserved and outline the common sequence features of the human SDR family. In contrast, the three-residue clusters marked in red (such as SAS, FGV, CSS, CHS, and AAA) indicate variable residues directly adjacent to the conserved residues. These positions determine the narrow specificity within each group. (B) The locations of the conserved and variable residues in the template structure of group 1 of human SDRs, identified by Talana. For example, the conserved residues include the active site and binding sites (Y-156, S-143, and K-160), all of which are located in a conserved region (grade 1 in the Talana color scheme). In contrast, the three-residue cluster (S-157, A-158, and S-159) is clearly located in a more variable region. (C) Identification of functional regions within group 1 using ConSurf and Talana. Group 1 spans the full range of the ConSurf coloring scheme: the continuous conservation scores are partitioned into a discrete scale of nine bins for visualization, such that bin 9 contains the most conserved positions and bin 1 contains the most variable positions. The color grades (1–9) are assigned so that the most conserved regions appear in the darkest maroon and the most variable regions in the lightest turquoise. Similarly, group 1 spans the full range of the Talana coloring scheme, in which grades 1 to 12 run from the most conserved regions, in the darkest blue, to the most variable regions, in the lightest rose. Both tools therefore gave similar results for the identification of functional regions in the protein structure.
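The idea of partitioning continuous conservation scores into nine discrete bins, as described in panel (C), can be sketched as follows. This is only an illustration using equal-width bins; it is not ConSurf's actual binning procedure, and the function name `to_grades`, the score convention (higher means more conserved), and the example scores are assumptions.

```python
def to_grades(scores, n_bins=9):
    """Map continuous scores to integer grades 1..n_bins (equal-width bins).

    Assumption: a higher score means a more conserved position, so the
    highest score receives grade n_bins and the lowest receives grade 1.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # No spread to rank; the middle grade is an arbitrary choice here.
        return [(n_bins + 1) // 2] * len(scores)
    width = (hi - lo) / n_bins
    grades = []
    for score in scores:
        grade = int((score - lo) / width) + 1
        grades.append(min(grade, n_bins))  # the maximum falls in the top bin
    return grades


# Hypothetical scores, not values from the study.
print(to_grades([0.0, 0.5, 1.0]))
```

A 12-grade scale such as the one described for Talana would correspond to calling the same function with `n_bins=12`, again only as an approximation of the idea.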
Figure 4. Variability profiles for each of the five groups of human SDRs. Total, core, and surface variability profiles are displayed for each group based on the distribution of residues in the protein structure. Group 3 was the most conserved (grade 1 of the color scheme dominates the entire structure) compared with groups 4 and 5. Group 1 was the most variable (the full range of the color scheme, from grade 1 to grade 12, appears in the structure), and group 2 showed an intermediate level of variability.
Core and surface residues in five groups of human SDRs identified by Corm.
| GROUP | CORE RESIDUES |
|---|---|
| Group 1 | V11, I85, I87, M95, T119, H137, I141, V163, V181, T182, S183, I184, G187, A214 |
| Group 2 | I115, V117, G120, V133, V141, M166, G208, V213, L219, S245, M247 |
| Group 3 | V73, V89, V104, V117, A123, G147, I151, V173, C174, I211 |
| Group 4 | I42, V43, L67, G75, M129, K160, A195, T224, V227, T231, S234, Q248, V252 |
| Group 5 | V7, A57, V69, G79, V80, V158, V174, T178, A220 |
Surface residues in five human SDR groups are identified by Corm.
| GROUP | SURFACE RESIDUES |
|---|---|
| Group 1 | E3, Q5, V8, A20, S21, I22, T25, Q29, D39, S41, R42, E45, V46, K48, I50, Q51, N53, Q55, V57, E59, S61, I62, D64, H67, E69, T72, E73, E80, Q84, I87, M95, S98, A99, I100, E102, E109, A110, M111, D113, I116, K117, G118, T119, Y121, S129, N132, H137, I144, E148, V149, T150, L155, S157, A161, V163, I166, Q168, E171, R180, V181, T182, S183, G187, M188, S194, G195, T197, W199, K204, L205, K208, I210, E212, A213, A214, I215, Y216, Q219, Q220, H223, V224, N225, E228, T230, V231, R232, P233 |
| Group 2 | K64, R70, S71, D75, E78, I81, V91, E99, R100, N103, I115, V117, M119, N122, R126, F130, A131, S132, L134, D135, L139, N147, R153, M166, T195, Y196, G208, V213, T214, M216, S220, D221, L223, A230, V234, I237, K241, F242, D244, S245, M247, A249, E251, N255, C257, G259, D266, C275, H276, S282, W285 |
| Group 3 | T35, Q59, R62, V86, V89, N102, D105, Q106, R109, E115, A123, P126, L130, S131, K133, E135, E136, T138, I145, L155, S158, R161, R162, G177, I179, Y181, I183, P184, A201, D204, K208, V219, T226, R232, P235, L237, R244, S245, I247, N248, N253, Y262, N264, I268, K271 |
| Group 4 | Q35, L36, V43, E53, K56, L67, V72, D73, G75, L77, R80, Q83, A84, V85, G87, Q90, F92, K95, A99, D100, T101, K109, D110, H117, M129, S133, A136, H142, H155, K160, E163, L175, H178, L179, R181, I182, H183, H185, E190, F192, A195, L197, H201, K211, K218, S220, T224, Y225, V227, S234, S241, I242, M243, W245, W247, Q248, F251, V252, Q258, Y266, C267, L269 |
| Group 5 | E4, L24, R28, K31, E39, Q43, S46, D47, Y48, G50, A57, T61, N62, P63, K71, A72, T74, G79, M96, S104, I106, E108, M126, K128, Q130, A149, V174, V179, K190, A191, N193, D194, E195, A202, A206, D211, P212, R213, E226, I244 |
Selected correlated mutations in human SDRs identified by Corm. Correlated mutations in group 5 were analyzed with the Talana program, which indicated that a mutation at one position is accompanied by mutations at other positions. For example, when position 6 carries I, the correlated positions carry the following residues: 61 (D), 73 (EKNR), 78 (ADEP), and 129 (ACFGH).
| REFERENCE POSITION | AA AT REFERENCE POSITION | SEQUENCE COUNT | CORRELATED MUTATIONS AND AMINO ACIDS | | | | | | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | I | 5 | 61: D | 73: EKNR | 78: ADEP | 129: ACFGH |
| | V | 4 | 61: NS | 73: LQTY | 78: GQR | QSW |
| 70 | E | 5 | 23: EKT | 56: EMV | 71: EHKNQ | 73: KQRY | 129: CFHW | 192: PST |
| | K | 4 | 23: AL | 56: A | 71: -AT | 73: ELNT | 129: AQQS | 192: NR |
| 79 | I | 4 | 30: FTV | 46: AE | 62: EKV | 68: ADLT | 71: EQT | 73: KLNY | 107: DKNQ | 129: AFSW | 201: EQS |
| | V | 4 | 30: IK | 46: DG | 62: FP | 68: V | 71: -AKN | 73: ERT | 107: AE | 129: CGHQ | 201: AD |
| 105 | I | 4 | 3: -E | 27: KR | 38: E | 45: S | 47: Y | 49: G | 60: T | 73: KNRT | 95: M | 125: M | 127: K | 177: TV | 193: D | 197: AT | 201: AS | 205: A | 212: KQR | 219: AV | 225: DE |
| | V | 4 | 3: QT | 27: ADLT | 38: AFGS | 45: ALV | 47: EQT | 49: -EIT | 60: LQS | 73: ELQY | ALV | 125: AILV | 127: -AQ | 177: A | 193: AE | 197: -EQ | 201: DEGQ | 205: LM | 212: DE | 219: GLRS | 225: GILP |
| | I | 4 | 129: ACFHQ | 177: TV | 193: D | 197: AT | 201: AS | 205: A | 210: DEG | 212: KQR | 219: AV | 225: DE |
| | V | 4 | 129: GSW | 177: A | 193: AE | 197: -EQ | 201: DEGQ | 205: LM | 210: QRS | 212: DE | 219: GLRS | 225: GILP |
| 148 | A | 4 | 42: DKQ | 73: LRT | 78: EGR | 107: DE | 129: CHQS | 201: AQ |
| | T | 4 | 42: AE | 73: EKNQ | 78: ADPQ | 107: ANQ | 129: AFGW | 201: DGS |
| 157 | L | 4 | 71: EHKN | 73: QRY | 129: CHW | 189: AR | 192: PS |
| | V | 5 | 71: -AQT | 73: EKLNT | 129: AFGQS | 189: EKM | 192: NRT |
| 173 | I | 4 | 23: ET | 56: MV | 71: HKNQ | 73: KQR | 78: DEQ | 211: A |
| | V | 5 | 23: AKL | 56: AE | 71: -AET | 73: ELNTY | 78: AGPR | 211: P |
| 194 | D | 4 | 73: ENR | 78: AEP | 103: DN | 129: ACGH | 243: QV |
| | E | 4 | 73: KLTY | 78: DGR | 103: EFQS | 129: FQSW | 243: FIS |
| 211 | A | 4 | 23: ET | 56: MV | 71: HKNQ | 73: KQR | 78: DEQ | 173: I |
| | P | 5 | 23: AKL | 56: AE | 71: -AET | 73: ELNTY | 78: AGPR | 173: V |
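The pattern in the table above (sequences grouped by the residue at a reference position, with other positions carrying non-overlapping residue sets in each group) can be sketched in Python. This is a simplified illustration under stated assumptions, not the algorithm implemented in Corm or Talana; the function name `correlated_positions`, the `min_count` cutoff, the 0-based column indices, and the toy alignment are all hypothetical.

```python
from collections import defaultdict


def correlated_positions(aligned, ref, min_count=2):
    """Find columns whose residues split cleanly with the residue at `ref`.

    Sequences are grouped by the residue at column `ref` (0-based). The two
    largest groups are compared, and a column is reported when the residues
    seen in one group never occur in the other group.
    """
    groups = defaultdict(list)
    for seq in aligned:
        groups[seq[ref]].append(seq)

    # Keep residues observed often enough, largest group first (ties by letter).
    kept = sorted(
        (aa for aa, seqs in groups.items() if len(seqs) >= min_count),
        key=lambda aa: (-len(groups[aa]), aa),
    )
    if len(kept) < 2:
        return {}
    first, second = kept[:2]

    result = {}
    for pos in range(len(aligned[0])):
        if pos == ref:
            continue
        seen_first = {seq[pos] for seq in groups[first]}
        seen_second = {seq[pos] for seq in groups[second]}
        if seen_first.isdisjoint(seen_second):
            result[pos] = {
                first: "".join(sorted(seen_first)),
                second: "".join(sorted(seen_second)),
            }
    return result


# Hypothetical toy alignment, not data from the study.
toy = ["IDKA", "IDEA", "IDRC", "VNLA", "VSQG"]
print(correlated_positions(toy, ref=0))
```

In the toy alignment, columns 1 and 2 are reported because the I group and the V group share no residues there, while column 3 is skipped because both groups contain A.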
Correlated mutation sets include the core and surface residues in group 5.
| GROUP 5 POSITION | CORE AND SURFACE RESIDUES | | | |
|---|---|---|---|---|
| 6 | Val-7 | Gln-130 | Gly-179 | |
| 70 | Asn-193 | Gln-130 | ||
| 79 | Val-80 | Val-69 | Ala-202 | Gln-130 |
| 105 | Thr-178 | Ala-220 | Glu-226 | Arg-213 |
| 148 | Gly-79 | Ala-202 | Ala-149 | |
| 157 | Val-158 | Asn-193 | Lys-190 | |
| 173 | Val-174 | Ala-57 | Pro-212 | Gly-79 |
| 194 | Thr-74 | Gly-79 | ||
Notes: (A) Core residues identified by the Talana program. The residues in each group are located in the core of the protein structure. Valine and isoleucine occur more frequently than other amino acids, suggesting that these hydrophobic amino acids play a more important role in stabilizing the protein structure. (B) Surface residues identified using Talana. These residues are located on the surface of the protein structures and are distant from each other. (C) Correlated mutation sets and their core and surface characteristics for group 5.
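The observation in note (A) about valine and isoleucine can be checked directly against the core-residue table. The short Python sketch below tallies residue types from the listed labels; the dictionary `CORE` is transcribed from that table, and the helper name `residue_counts` is hypothetical.

```python
from collections import Counter

# Core residues per group, transcribed from the core-residue table.
CORE = {
    1: "V11 I85 I87 M95 T119 H137 I141 V163 V181 T182 S183 I184 G187 A214",
    2: "I115 V117 G120 V133 V141 M166 G208 V213 L219 S245 M247",
    3: "V73 V89 V104 V117 A123 G147 I151 V173 C174 I211",
    4: "I42 V43 L67 G75 M129 K160 A195 T224 V227 T231 S234 Q248 V252",
    5: "V7 A57 V69 G79 V80 V158 V174 T178 A220",
}


def residue_counts(core):
    """Count one-letter residue types across all groups."""
    return Counter(
        label[0]  # the first character of a label such as 'V11' is the residue
        for residues in core.values()
        for label in residues.split()
    )


counts = residue_counts(CORE)
print(counts.most_common(2), sum(counts.values()))
```

Of the 57 listed core residues, 20 are valine and 8 are isoleucine, which makes them the two most frequent residue types in the table.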
| GROUPS | PDB CODE AND NAME OF REPRESENTATIVE STRUCTURES |
|---|---|
| 1 | 3edm chain A, Short-chain dehydrogenase from Agrobacterium tumefaciens |
| 2 | 1hdc chain A, 3-alpha-(20-beta)-hydroxysteroid dehydrogenase |
| 3 | 1yb1 chain A, 17-beta hydroxysteroid dehydrogenase 11 |
| 4 | 3rd5 chain A, a putative uncharacterized oxidoreductase protein from Mycobacterium paratuberculosis |
| 5 | 1q7b chain A, beta-ketoacyl-[ACP] reductase from |