Hafiz Ishfaq Ahmad1, Jiabin Zhou1, Muhammad Jamil Ahmad2, Gulnaz Afzal3, Haiying Jiang1, Xiujuan Zhang1, Abdelmotaleb A Elokil3,4, Musarrat Abbas Khan5, Linmiao Li1, Huiming Li1, Liu Ping1, Jinping Chen1.
Abstract
Programmed cell death-1 (PD-1) and its ligands, particularly PD-L1 and PD-L2, are the most important proteins responsible for signaling T-cell inhibition and arbitrating immune homeostasis and tolerance mechanisms. However, the adaptive evolution of these genes is poorly understood. In this study, we aligned protein-coding genes from vertebrate species to evaluate positive selection constraints and evolution in the PD1, PD-L1 and PD-L2 genes, which are conserved across up to 166 vertebrate species, with an average of 55 species per gene. Positive selection was evident: an average of 5.3% of codons underwent positive selection in the three genes across vertebrate lineages, and increased positive selection pressure was detected in both the Ig-like domains and the transmembrane domains of the proteins. Moreover, the PD1, PD-L1 and PD-L2 genes were highly expressed in almost all tissues of the selected species, indicating a distinct expression pattern in different tissues among most species. Our study reveals that adaptive selection plays a key role in the evolution of PD1 and its ligands in the majority of vertebrate species, which is in agreement with the contribution of these residues to the mechanisms of pathogen identification and coevolution in the complexity and novelties of vertebrate immune systems.
Keywords: PD-L1; PD1; adaptive selection; evolution; vertebrates
Year: 2020 PMID: 32045365 PMCID: PMC7066927 DOI: 10.18632/aging.102827
Source DB: PubMed Journal: Aging (Albany NY) ISSN: 1945-4589 Impact factor: 5.682
Figure 1. (A) Molecular structure and conserved domain analysis of the PD1 protein. (B) Multiple sequence alignment (MSA) of the 20 proteins most homologous to PD1 (obtained with a BLAST+ search against the PDBAA database). Known secondary structure elements are displayed for all aligned sequences. Alternate residues are highlighted in gray. Identical and similar residues are boxed in red and yellow, respectively. (C) Location of positively selected amino acid sites identified in the conserved Ig domain of PD1. The crystal structure of human PD1 was used as a reference and positively selected sites were drawn onto the crystal structure using the Phyre tool (http://www.sbg.bio.ic.ac.uk/phyre2/html). Two residues identified under selection fall in the immunoglobulin-like domain containing the ligand-binding site. These sites fall in the region identified as the ligand-binding site, with another cluster in a region immediately following the signal sequence.
Figure 2. (A) Molecular structure and conserved domain analysis of the PD-L1 protein. (B) Multiple sequence alignment (MSA) of the 20 proteins most homologous to PD-L1 (obtained with a BLAST+ search against the PDBAA database). Known secondary structure elements are displayed for all aligned sequences. Alternate residues are highlighted in gray. Identical and similar residues are boxed in red and yellow, respectively. (C) Location of positively selected amino acid sites identified in the conserved Ig domain of PD-L1. The crystal structure of human PD-L1 was used as a reference and positively selected sites were drawn onto the crystal structure using the Phyre tool (http://www.sbg.bio.ic.ac.uk/phyre2/html). Four residues identified under selection fall in the immunoglobulin-like domain containing the ligand-binding site. These sites fall in the region identified as the ligand-binding site, with another cluster in a region immediately following the signal sequence.
Figure 3. (A) Molecular structure and conserved domain analysis of the PD-L2 protein. (B) Multiple sequence alignment (MSA) of the 20 proteins most homologous to PD-L2 (obtained with a BLAST+ search against the PDBAA database). Known secondary structure elements are displayed for all aligned sequences. Alternate residues are highlighted in gray. Identical and similar residues are boxed in red and yellow, respectively. (C) Location of positively selected amino acid sites identified in the conserved Ig domain of PD-L2. The crystal structure of human PD-L2 was used as a reference and positively selected sites were drawn onto the crystal structure using the Phyre tool (http://www.sbg.bio.ic.ac.uk/phyre2/html). Five residues identified under selection fall in the immunoglobulin-like domain containing the ligand-binding site. These sites fall in the region identified as the ligand-binding site, with another cluster in a region immediately following the signal sequence.
Log-Likelihood Values and Test Statistics for PAML Site Models of positive selection.
| Gene | Model | Parameter estimates | lnL | 2Δl | | | | |
| PD1 | M1a | p1: 0.51480, p2: 0.48520; ω1: 0.21574, ω2: 1.00000 | -23577.93 | 0 | 0 | 81, 471, 594 | 79, 81, 82, 90, 107, 141, 23 | 81, 108, 109, 110, 24 |
| | M2a | p0: 0.51480, p1: 0.39898, p2: 0.08622; ω0: 0.21574, ω1: 1.00000, ω2: 1.00000 | -23577.93 | | | | | |
| | M7 | p: 0.98018, q: 1.49822 | -23414.34 | 5.103112* | | | | |
| | M8 | p0: 0.96818, p: 1.00330, q: 1.60757; (p1: 0.03182) ω: 8.05118 | -23411.79 | | | | | |
| PD-L1 | M1a | p1: 0.44199, p2: 0.55801; ω1: 0.18081, ω2: 1.00000 | -21407.94 | 137.60*** | 96**, 143, 292*, | 293, 294, 332, 553, 614 | 45, 53, 55, 62, 64, 65, 66, 67 | 96, 110, 111, 292, 33 |
| | M2a | p1: 0.39702, p2: 0.47651, p3: 0.12647; ω0: 0.18153, ω1: 1.00000, ω2: 3.26240 | -21339.13 | | | | | |
| | M7 | p: 0.66997, q: 0.64572 | -21305.29 | 131.62*** | | | | |
| | M8 | p0: 0.86159, p: 0.77177, q: 0.94085; (p1: 0.13841) ω: 2.57746 | -21239.48 | | | | | |
| PD-L2 | M1a | p1: 0.48081, p2: 0.51919; ω1: 0.20239, ω2: 1.00000 | -22727.73 | 137.60*** | 132, 134, 137, 138, 139, | 97, 103, 122, 132, 134, 135 | 89, 122, 278, 305, 35 | |
| | | | | | 54**, 55**, 59**, 62**, 80**, 81**, 82**, 83**, 84**, 85**, 597**, 598**, 599**, 600**, 608** | | | |
| | M2a | p0: 0.35589, p1: 0.38002, p2: 0.26409; ω0: 0.20533, ω1: 1.00000, ω2: 9.06683 | -22297.1 | | | | | |
| | M7 | p: 0.56058, q: 0.42402 | -22649.48 | 827.14*** | | | | |
| | M8 | p0: 0.73476, p: 0.70098, q: 0.70029; (p1: 0.26524) ω: 7.39740 | -22235.91 | | | | | |
The proportion of sites under positive selection (p1) or under selective constraint (p0), and parameters p and q of the beta distribution, are given for each model. Parameters indicating positive selection are in bold. Sites potentially under positive selection identified under model M8 are listed according to the human sequence numbering. Positively selected sites with posterior probability > 0.9 are underlined, 0.8–0.9 in bold, and 0.5–0.7 in plain text. The test statistic 2Δl is compared to a χ² distribution with 2 degrees of freedom, with critical values of 5.99, 9.21, and 13.82 at the 5%, 1%, and 0.1% significance levels, respectively. *: significant at 5% level; **: significant at 1% level.
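The likelihood ratio test described in the footnote can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the critical values are the ones quoted in the footnote, and the log-likelihoods are the M7 and M8 values listed in the table for PD-L1 and PD-L2.

```python
# Critical values of the chi-square distribution with 2 degrees of freedom,
# as quoted in the table footnote, ordered from strictest to most lenient.
CRITICAL_VALUES = [(13.82, "0.1%"), (9.21, "1%"), (5.99, "5%")]


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2Δl, twice the log-likelihood gain of the alternative model."""
    return 2.0 * (lnl_alt - lnl_null)


def significance_level(stat):
    """Return the strictest level at which stat exceeds its critical value."""
    for critical, level in CRITICAL_VALUES:
        if stat > critical:
            return level
    return None  # not significant at the 5% level


# M7 (null) versus M8 (alternative) log-likelihoods taken from the table.
comparisons = {
    "PD-L1": (-21305.29, -21239.48),
    "PD-L2": (-22649.48, -22235.91),
}

for gene, (lnl_m7, lnl_m8) in comparisons.items():
    stat = lrt_statistic(lnl_m7, lnl_m8)
    print(f"{gene}: 2Δl = {stat:.2f}, significant at {significance_level(stat)}")
```

The same two functions apply to the M1a versus M2a comparison, since that pair of nested models is also tested with 2 degrees of freedom.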
Figure 4. Motif distribution of PD1, PD-L1 and PD-L2 genes in representative vertebrate species. Motifs of these genes in representative species from each group were predicted from amino acid sequences using the MEME suite (http://meme-suite.org/). All sequences are divided into five conserved motifs, shown in color: motif 1 (red), motif 2 (cyan), motif 3 (green), motif 4 (purple) and motif 5 (brown).
Figure 9. Adaptive branch-site REL test for episodic diversifying selection in PD1, PD-L1 and PD-L2 genes. The phylogenetic tree is scaled on the expected number of substitutions per nucleotide. The hue of each color indicates the strength of selection, with primary red corresponding to ω > 5, primary blue to ω = 0 and grey to ω = 1. The width of each color component represents the proportion of sites in the corresponding class. Thicker branches have been classified as undergoing episodic diversifying selection by the sequential likelihood ratio test at corrected p ≤ 0.05.
Figure 5. (A) Chromosomal locations and positively selected sites of PD1, PDL1, and PDL2 genes. The chromosome number is indicated above each bar. The chromosome size is indicated by its relative length using the information from NCBI. The scale of the chromosome is millions of base pairs (Mb). (B) Functional interaction network of PD1, PDL1 and PDL2 genes generated by the visualization environment of the ConsensusPathDB meta-database, after conserved synteny and functional enrichment analysis. The network of the PD1 gene contains 107 interactions and 62 physical entity nodes. The network of PDL1 contains 125 interactions and 55 physical entity nodes. The network of PDL2 contains 56 interactions and 22 physical entity nodes. Each node represents a physical entity (gene, protein or compound). Each edge represents an interaction.
Figure 6. (A) Protein analysis showing the results of the binding site, solvent accessibility and protein disorder predictions for the human PD1, PD-L1 and PD-L2 sequences. (B) Hydrophobic cluster analysis (HCA) plots of the human PD proteins. HCA plots were constructed with the HCA 1.0.2 program. HCA uses the standard one-letter amino acid abbreviations except for four amino acids, as shown in the key. Hydrophobic residues are outlined. Clusters of hydrophobic residues are usually associated with regular secondary structures (α helices or β sheets). Zigzagging vertical lines of hydrophobic residues indicate alternating hydrophobic and non-hydrophobic residues, typical of exposed β sheets (for example, β2, β3, β5, and β6). Continuous hydrophobic clusters are more common in internal β sheets.
Figure 7. Coevolution analysis of positively selected conserved domain residues. Circular relation diagrams centered on the residues, with their top co-varying residues at cutoffs, for (A) PD1, (B) PD-L1 and (C) PD-L2. Labels on the diagram represent amino acid residues and their positions in the protein sequence. Colors of the arcs represent covariance scores between two given positions.
Figure 8. qRT-PCR analysis of PD1, PD-L1 and PD-L2 genes in different animal tissues. Expression patterns of the genes were examined in different tissues. Heart, liver, spleen, lung, kidney, pancreas and brain tissues were used for quantitative reverse transcription polymerase chain reaction (qRT-PCR). Transcript levels are expressed relative to that of beta-actin. NTC: negative control.
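The caption says that transcript levels are expressed relative to beta-actin but does not say how the ratio was computed. One common convention, assumed here and not confirmed by the source, is the 2^-ΔCt method, which also assumes that each PCR cycle doubles the product. The function name and the Ct values below are hypothetical and serve only to show the arithmetic.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-ΔCt).

    Assumes perfect doubling per PCR cycle; ΔCt = Ct(target) - Ct(reference).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one tissue sample (not taken from the paper).
ct_target_gene = 24.0  # assumed Ct of a target gene such as PD1
ct_beta_actin = 20.0   # assumed Ct of the beta-actin reference

ratio = relative_expression(ct_target_gene, ct_beta_actin)
print(f"Target expression relative to beta-actin: {ratio}")
```

A target that crosses the threshold four cycles later than the reference is therefore reported at one sixteenth of the reference level under these assumptions.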