Nashaiman Pervaiz1, Nazia Shakeel1, Ayesha Qasim1, Rabail Zehra1, Saneela Anwar1, Neenish Rana1, Yongbiao Xue2, Zhang Zhang2, Yiming Bao3, Amir Ali Abbasi4.
Abstract
BACKGROUND: The hypothesis that vertebrates experienced two ancient whole genome duplications (WGDs) is of central interest to evolutionary biology and has been implicated in the evolution of developmental complexity. Three-way and four-way paralogy regions in human and other vertebrate genomes are considered vital evidence for this hypothesis. Alternatively, it has been proposed that such paralogy regions were created by small-scale duplications that occurred at different intervals over the evolution of life.
Keywords: Human; Multigene families; Paralogons; Paralogy regions; Phylogenetic analysis; Segmental duplications; Vertebrate; Whole genome duplications
Year: 2019 PMID: 31221090 PMCID: PMC6585022 DOI: 10.1186/s12862-019-1441-0
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Fig. 1 Evolutionary history of human tetra-paralogon Hsa 1/2/8/20. A circular view of human chromosomes shows the paralogons detected among human chromosomes 1/2/8/20, including the synteny relationships among 36 distinct multigene families: 11 families from previously published data are labeled in black [14], whereas the 25 families analyzed in the present study are labeled in green. Blue lines connect positions on ideograms for gene families with three-fold representation, while yellow lines connect families with four-fold representation on these chromosomes. Detailed information about each family is given in Table 1
Table 1 List of human gene families used in the phylogenetic analysis
| Gene family | Members | Chr location | Human protein accession No. | Number of included taxa | Number of sequences included |
|---|---|---|---|---|---|
| Antizyme Inhibitor | AZIN2 | 1p35.1 | Q96A70 | 25 | 54 |
| | ODC1 | 2p25 | P11926 | | |
| | AZIN1 | 8q22.3 | O14977 | | |
| Cholinergic Receptors Nicotinic subunits | CHRNB2 | 1q21.3 | P17787 | 32 | 123 |
| | CHRNG | 2q37.1 | P07510 | | |
| | CHRND | 2q37.1 | Q07001 | | |
| | CHRNA1 | 2q31.1 | P02708 | | |
| | CHRNA2 | 8p21 | Q15822 | | |
| | CHRNA6 | 8p11.21 | Q15825 | | |
| | CHRNB3 | 8p11.2 | Q05901 | | |
| | CHRNA4 | 20q13.33 | P43681 | | |
| | CHRNA3 | 15q24 | P32297 | | |
| | CHRNB4 | 15q24 | P30926 | | |
| | CHRNB1 | 17p13.1 | P11230 | | |
| | CHRNE | 17p13.2 | Q04844 | | |
| | CHRNA5 | 15q24 | P30532 | | |
| Ciliary Rootlet Coiled-Coil Protein | CROCC | 1p36.13 | Q5TZA2 | 28 | 42 |
| | CROCC2 | 2q37.3 | H7BZ55 | | |
| | CEP250 | 20q11.22 | Q9BV73 | | |
| Discs, large (Drosophila) Homolog-associated Protein | DLGAP3 | 1p35.3-p34.1 | O95886 | 25 | 85 |
| | DLGAP1 | 18p11.31 | O14490 | | |
| | DLGAP5 | 14q22.3 | Q15398 | | |
| | DLGAP2 | 8p23 | Q9P1A6 | | |
| | DLGAP4 | 20q11.23 | Q9Y2H0 | | |
| E2F Transcription Factor | E2F2 | 1p36 | Q14209 | 31 | 84 |
| | E2F6 | 2p25.1 | O75461 | | |
| | E2F5 | 8q21.2 | Q15329 | | |
| | E2F1 | 20q11.2 | Q01094 | | |
| | E2F3 | 6p22 | O00716 | | |
| | E2F4 | 16q22.1 | Q16254 | | |
| Family with Sequence Similarity 110 | FAM110D | 1p36.11 | Q8TAY7 | 25 | 56 |
| | FAM110C | 2p25.3 | Q1W6H9 | | |
| | FAM110B | 8q12.1 | Q8TC76 | | |
| | FAM110A | 20p13 | Q9BQ89 | | |
| Grainyhead like Transcription factor | GRHL3 | 1p36.11 | Q8TE85 | 26 | 57 |
| | TFCP2L1 | 2q14 | Q9NZI6 | | |
| | GRHL1 | 2p25.1 | Q9NZI5 | | |
| | GRHL2 | 8q22.3 | Q6ISB3 | | |
| | TFCP2 | 12q13 | Q12800 | | |
| | UBP1 | 3p22.3 | Q9NZI7 | | |
| Inhibitor of DNA Binding protein | ID3 | 1p36.13-p36.12 | Q02535 | 35 | 65 |
| | ID2 | 2p25 | Q02363 | | |
| | ID1 | 20q11 | P41134 | | |
| | ID4 | 6p22.3 | P47928 | | |
| Maestro Heat-like Repeat-containing Protein Family | MROH9 | 1q24.3 | Q5TGP6 | 22 | 46 |
| | MROH7 | 1p32.3 | Q68CQ1 | | |
| | MROH6 | 8q24.3 | A6NGR9 | | |
| | MROH5 | 8q24.3 | Q6ZUA9 | | |
| | MROH8 | 20q11.22 | Q9H579 | | |
| Myelin Transcription Factor | MYT1L | 2p25.3 | Q9UL68 | 22 | 48 |
| | ST18 | 8q11.23 | O60284 | | |
| | MYT1 | 20q13.33 | Q01538 | | |
| Nuclear Receptor Coactivator | NCOA1 | 2p23 | Q15788 | 22 | 54 |
| | NCOA2 | 8q13.3 | Q15596 | | |
| | NCOA3 | 20q12 | Q9Y6Q9 | | |
| Na+/K+ Transporting ATPase Interacting Protein | NKAIN1 | 1p35.2 | Q4KMZ8 | 24 | 46 |
| | NKAIN3 | 8q12.3 | Q8N8D7 | | |
| | NKAIN4 | 20q13.33 | Q8IVV8 | | |
| | NKAIN2 | 6q21 | Q5VXU1 | | |
| Potassium Voltage-Gated Channel subfamily Q | KCNQ4 | 1p34 | P56696 | 28 | 67 |
| | KCNQ3 | 8q24 | O43525 | | |
| | KCNQ2 | 20q13.3 | O43526 | | |
| | KCNQ5 | 6q14 | Q9NR82 | | |
| | KCNQ1 | 11p15.5 | P51787 | | |
| Regulator of G-protein Signalling | RGS13 | 1q31.2 | O14921 | 31 | 101 |
| | RGS8 | 1q25 | P57771 | | |
| | RGS1 | 1q31 | Q08116 | | |
| | RGS18 | 1q31.2 | Q9NS28 | | |
| | RGS16 | 1q25-q31 | O15492 | | |
| | RGS21 | 1q31.2 | Q2M5E4 | | |
| | RGS4 | 1q23.3 | P49798 | | |
| | RGS2 | 1q31 | P41220 | | |
| | RGS20 | 8q11.23 | O76081 | | |
| | RGS19 | 20q13.33 | P49795 | | |
| | RGS17 | 6q25.3 | Q9UGC6 | | |
| | RGS3 | 9q32 | P49796 | | |
| | RGS5 | 1q23.1 | O15539 | | |
| Regulating Synaptic Membrane Exocytosis Protein | RIMS3 | 1p34.2 | Q9UJD0 | 27 | 49 |
| | RIMS2 | 8q22.3 | Q9UQ26 | | |
| | RIMS4 | 20q13.12 | Q9H426 | | |
| | RIMS1 | 6q12-q13 | Q86UR5 | | |
| R-Spondin Homolog | RSPO1 | 1p34.3 | Q2MKA7 | 31 | 60 |
| | RSPO2 | 8q23.1 | Q6UXX9 | | |
| | RSPO4 | 20p13 | Q2I0M5 | | |
| | RSPO3 | 6q22.33 | Q9BXY4 | | |
| Solute Carrier Family | SLC30A2 | 1p35.3 | Q9BRI3 | 23 | 74 |
| | SLC30A10 | 1q41 | Q6XR72 | | |
| | SLC30A1 | 1q32.3 | Q9Y6M5 | | |
| | SLC30A3 | 2p23.3 | Q99726 | | |
| | SLC30A8 | 8q24.11 | Q8IWU4 | | |
| | SLC30A4 | 15q21.1 | O14863 | | |
| Syntrophin, Gamma | SNTG2 | 2p25.3 | Q9NY99 | 28 | 81 |
| | SNTG1 | 8q11.21 | Q9NSN8 | | |
| | SNTB1 | 8q23-q24 | Q13884 | | |
| | SNTA1 | 20q11.2 | Q13424 | | |
| | SNTB2 | 16q22.1 | P49711 | | |
| | GOPC | 6q21 | Q9HD26 | | |
| Sorting Nexin Family | SNX27 | 1q21.3 | Q96L92 | 29 | 43 |
| | SNX17 | 2p23.3 | Q15036 | | |
| | SNX31 | 8q22.3 | Q8N9S9 | | |
| Stathmin | STMN1 | 1p36.11 | P16949 | 22 | 63 |
| | STMN2 | 8q21.13 | Q93045 | | |
| | STMN4 | 8p21.2 | Q9H169 | | |
| | STMN3 | 20q13.3 | Q9NZ72 | | |
| Serine/Threonine-Protein Kinase | STK25 | 2q37.3 | O00506 | 25 | 72 |
| | STK3 | 8q22.2 | Q13188 | | |
| | STK4 | 20q11.2-q13.2 | Q13043 | | |
| | STK24 | 13q31.2-q32.3 | Q9Y6E0 | | |
| | STK26 | Xq26.2 | Q9P289 | | |
| Transcription Elongation factor A (SII) Protein | TCEA3 | 1p36.12 | O75764 | 22 | 51 |
| | TCEA1 | 8q11.2 | P23193 | | |
| | TCEA2 | 20q13.33 | Q15560 | | |
| | TCEANC | Xp22.2 | Q8N8B7 | | |
| UBX Domain-Containing Protein | UBXN2A | 2p23.3 | P68543 | 22 | 32 |
| | UBXN2B | 8q12.1 | Q14CS0 | | |
| | NSFL1C | 20p13 | Q9UNZ2 | | |
| X Kell Blood Group Precursor-related Family | XKR8 | 1p35.3 | Q9H6D3 | 24 | 101 |
| | XKR9 | 8q13.3 | Q5GH70 | | |
| | XKR6 | 8p23.1 | Q5GH73 | | |
| | XKR4 | 8q12.1 | Q5GH76 | | |
| | XKR5 | 8p23.1 | Q6UX68 | | |
| | XKR7 | 20q11.21 | Q5GH72 | | |
| YTH Domain-Containing Family Protein | YTHDF2 | 1p35 | Q9Y5A9 | 24 | 50 |
| | YTHDF3 | 8q12.3 | Q7Z739 | | |
| | YTHDF1 | 20q13.33 | Q9BYJ9 | | |
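
As an illustration of how three-fold and four-fold representation on Hsa 1/2/8/20 can be read off the cytogenetic locations in the table, the following minimal sketch counts how many of the four paralogon chromosomes carry a member of a family. The three families and their locations are copied from the table; the names `FAMILIES`, `PARALOGON`, `chromosome` and `representation` are hypothetical helpers introduced here, not part of the authors' pipeline.

```python
import re

# Cytogenetic locations copied from the table for three example families.
FAMILIES = {
    "FAM110": {"FAM110D": "1p36.11", "FAM110C": "2p25.3",
               "FAM110B": "8q12.1", "FAM110A": "20p13"},
    "NCOA": {"NCOA1": "2p23", "NCOA2": "8q13.3", "NCOA3": "20q12"},
    "ID": {"ID3": "1p36.13-p36.12", "ID2": "2p25",
           "ID1": "20q11", "ID4": "6p22.3"},
}

# Chromosomes that make up the tetra-paralogon discussed in the text.
PARALOGON = {"1", "2", "8", "20"}


def chromosome(band: str) -> str:
    """Return the chromosome name from a cytoband such as '8q12.1' or 'Xq26.2'."""
    match = re.match(r"(\d+|X|Y)[pq]", band)
    if match is None:
        raise ValueError(f"unrecognised cytoband: {band!r}")
    return match.group(1)


def representation(members: dict) -> int:
    """Count how many paralogon chromosomes carry at least one family member."""
    chromosomes = {chromosome(band) for band in members.values()}
    return len(chromosomes & PARALOGON)


for name, members in FAMILIES.items():
    print(name, representation(members))
```

Under this simple counting, FAM110 has members on all four chromosomes, while NCOA and ID have members on three of them. Members located elsewhere (for example ID4 on chromosome 6) do not affect the count.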
Fig. 2 The human genes duplicated in parallel lie in respective co-duplicated groups. Consistencies in phylogenetic tree topologies of families (analyzed in this and our previous study) with at least three-fold representation on human tetra-paralogon Hsa 1/2/8/20. (a) Schematic topology of MROH and STK families; (b) schematic topology of E2F, EYA and STMN families; (c) schematic topology of HCK, DLGAP, NKAIN, KCNQ and MATN gene families; (d) schematic topology of FAM110, NCO, KCNS, YTHDF, XKR and MYT gene families. For each case, the percentage bootstrap values of internal branches are given in parentheses, except for gene families exhibiting slightly lower bootstrap values (≤50%). The connecting bars on the left portray the close physical associations of the relevant genes. The asterisk (*) designates the relevant chromosomes
Fig. 3 The relative timings of gene duplication events. For the 36 multigene families analyzed in this study, 52 gene duplications are detected before the invertebrate-vertebrate divide and 74 duplications are detected after the invertebrate-vertebrate divide and before the tetrapod-bony fish divergence. Only four tetrapod-specific duplication events are detected. The numbers in parentheses following gene family names give the count of duplications experienced by each family. Gene families are ordered alphabetically
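
The three counts in the Fig. 3 caption can be put on a common scale with a short tally. The sketch below uses only the numbers stated in the caption; the variable names and the rounding to one decimal place are choices made here for illustration.

```python
# Duplication counts as stated in the Fig. 3 caption.
counts = {
    "before invertebrate-vertebrate divide": 52,
    "after that divide, before tetrapod-bony fish divergence": 74,
    "tetrapod-specific": 4,
}

total = sum(counts.values())

# Share of all detected duplications falling in each interval, in percent.
shares = {interval: round(100 * n / total, 1) for interval, n in counts.items()}

print(total)
for interval, share in shares.items():
    print(f"{interval}: {share}%")
```

Taken together, the caption reports 130 duplications, of which roughly 40% predate the invertebrate-vertebrate divide, about 57% fall between that divide and the tetrapod-bony fish divergence, and about 3% are tetrapod-specific.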