Zijie Wang, Zili Lyu, Ling Pan, Gang Zeng, Parmjeet Randhawa.
Abstract
BACKGROUND: RNA-seq is poised to play a major role in the management of kidney transplant patients. Rigorous definition of housekeeping genes (HKG) is essential for further progress in this field. Using single genes or a limited set of HKG is inherently problematic, since their expression might be altered by specific diseases in the patients being studied.
Keywords: Genes with housekeeping functions; Kidney transplantation; RNA-sequencing
Year: 2019 PMID: 31208411 PMCID: PMC6580566 DOI: 10.1186/s12920-019-0538-z
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Flow diagram of the steps used to identify and validate HKG in this study
Summary of HKG Datasets Defined in This Study Using 9 Different Normalization Methods
| Normalization methods | Expression ratio* | | | | | | | Bias** | Variance** |
|---|---|---|---|---|---|---|---|---|---|
| | 0–0.01 (%) | 0.01–0.05 (%) | 0.05–0.20 (%) | 0.20–0.40 (%) | 0.40–0.60 (%) | 0.60–0.80 (%) | 0.80–1.0 (%) | | |
| TC | 396 (41.60) | 473 (49.68) | 78 (8.19) | 1 (0.11) | 1 (0.11) | 0 (0) | 3 (0.32) | 0.74 | 0.55 |
| UQ | 216 (22.69) | 612 (64.29) | 115 (12.08) | 3 (0.32) | 4 (0.42) | 1 (0.11) | 1 (0.11) | 0.45 | 0.21 |
| Median | 157 (16.49) | 643 (67.54) | 142 (14.92) | 7 (0.74) | 2 (0.21) | 0 (0) | 1 (0.11) | 0.45 | 0.22 |
| Quantile | 125 (13.13) | 655 (68.80) | 161 (16.91) | 6 (0.63) | 3 (0.32) | 1 (0.11) | 1 (0.11) | 0.42 | 0.18 |
| TMM | 236 (24.79) | 599 (62.92) | 108 (11.34) | 4 (0.42) | 4 (0.42) | 0 (0) | 1 (0.11) | 0.47 | 0.23 |
| DESeq | 231 (24.26) | 610 (64.08) | 104 (10.89) | 4 (0.42) | 2 (0.21) | 0 (0) | 1 (0.11) | 0.43 | 0.19 |
| TPM | 157 (16.49) | 643 (67.54) | 142 (14.92) | 7 (0.74) | 2 (0.21) | 0 (0) | 1 (0.11) | 0.45 | 0.22 |
| RPKM | 603 (63.34) | 319 (33.51) | 26 (2.73) | 2 (0.21) | 0 (0) | 0 (0) | 2 (0.21) | 1.04 | 1.03 |
| Lib_size | 202 (21.22) | 617 (64.81) | 123 (12.92) | 7 (0.74) | 2 (0.21) | 0 (0) | 1 (0.11) | 0.43 | 0.20 |
Abbreviations: TC total counts, UQ upper quartile, TMM trimmed mean of M-values, DESeq a differential expression package implemented in R, TPM transcripts per kilobase million, RPKM reads per kilobase per million mapped reads, Lib_size library size
*The expression ratio of each housekeeping gene was calculated as its mean normalized read count divided by the maximum read count in its corresponding HKG set
**The bias and variance of each normalization method was calculated by the formulae
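The expression-ratio footnote above can be illustrated with a minimal Python sketch. It assumes one mean normalized read count per gene and treats each interval as closed at its upper edge; the gene names, read counts, and boundary handling are hypothetical and are not taken from the study.

```python
from bisect import bisect_left

# Upper edges of the seven expression-ratio intervals used in the table.
BIN_EDGES = [0.01, 0.05, 0.20, 0.40, 0.60, 0.80, 1.0]
BIN_LABELS = ["0-0.01", "0.01-0.05", "0.05-0.20", "0.20-0.40",
              "0.40-0.60", "0.60-0.80", "0.80-1.0"]


def expression_ratios(mean_reads):
    """Divide each gene's mean normalized read count by the set maximum."""
    top = max(mean_reads.values())
    return {gene: value / top for gene, value in mean_reads.items()}


def bin_counts(ratios):
    """Count genes per interval (assumption: upper edge is inclusive)."""
    counts = {label: 0 for label in BIN_LABELS}
    for ratio in ratios.values():
        idx = min(bisect_left(BIN_EDGES, ratio), len(BIN_LABELS) - 1)
        counts[BIN_LABELS[idx]] += 1
    return counts


# Hypothetical mean normalized read counts for four genes.
example = {"GENE_A": 5000.0, "GENE_B": 150.0, "GENE_C": 20.0, "GENE_D": 900.0}
ratios = expression_ratios(example)
counts = bin_counts(ratios)

# Percentage of a bin relative to the HKG set size, as in the table cells.
tc_first_bin_pct = round(100 * 396 / 952, 2)
```

The last line reproduces the form of a table cell: 396 of the 952 TC-defined HKG fall in the lowest interval, and the seven TC counts in the table sum to 952.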
Fig. 2 Box plots showing the median, first quartile, third quartile, and range of CV (coefficient of variation) for all 952 HKG defined by nine different normalization algorithms (a) and for the subset of 42 HKG common to all nine normalization methods (b). a The median values (range) of CV in 952 HKGs defined by RPKM and TC are 0.67 (0.65–0.69) and 0.44 (0.41–0.45), respectively; whereas the mean values of CV defined by UQ, Median, Quantile, TMM, DESeq, TPM and Library size are 0.31 (0.28–0.33), 0.29 (0.27–0.31), 0.29 (0.26–0.31), 0.31 (0.29–0.33), 0.30 (0.27–0.32), 0.29 (0.27–0.31), 0.29 (0.26–0.31), respectively. b The median values (range) of CV in 42 common HKGs defined by RPKM and TC are 0.67 (0.65–0.69) and 0.43 (0.42–0.45), respectively; whereas the mean values of CV defined by UQ, Median, Quantile, TMM, DESeq, TPM and Library size are 0.28 (0.26–0.31), 0.25 (0.23–0.28), 0.25 (0.22–0.29), 0.26 (0.24–0.30), 0.25 (0.22–0.29), 0.25 (0.23–0.28), 0.25 (0.22–0.28), respectively. TC: total counts; UQ: upper quartile; TMM: trimmed mean of M-values; TPM: transcripts per kilobase million; RPKM: reads per kilobase per million mapped reads (see Materials and Methods section for details)
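The CV summarized in the caption above is the standard deviation divided by the mean. A minimal sketch of the per-gene calculation follows; it assumes the sample standard deviation (the source does not say whether sample or population SD was used), and the gene names and expression values are hypothetical.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean of one gene across samples."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def median_cv(expression_by_gene):
    """Median of the per-gene CVs for one normalization method."""
    return median(coefficient_of_variation(v) for v in expression_by_gene.values())


# Hypothetical normalized expression values for two genes across four biopsies.
example = {
    "GENE_A": [10.0, 12.0, 8.0, 10.0],
    "GENE_B": [100.0, 100.0, 100.0, 100.0],
}
cv_a = coefficient_of_variation(example["GENE_A"])
cv_b = coefficient_of_variation(example["GENE_B"])
summary = median_cv(example)
```

A lower CV indicates more stable expression across samples, which is the property the box plots compare between normalization methods.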
Overlaps^a Between Gene Expression Datasets Derived from Diseased Allograft Kidney & HKG Defined in This Study
| Reference | #Biopsies | #of DE transcripts | Biopsy Diagnosis | Normalization method Used to Define HKG | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | TC (%) | UQ (%) | Median (%) | Quantile (%) | TMM (%) | DESeq (%) | TPM (%) | RPKM (%) | Lib_size (%) |
| [ | 703 | 453 | ABMR | 2(0.40) | 2(0.44) | 2(0.44) | 2(0.44) | 1(0.22) | 2(0.44) | 2(0.44) | 8(1.77) | 1(0.22) |
| [ | 708 | 82 | TCMR | 0(0) | 0(0) | 0(0) | 0(0) | 0(0) | 0(0) | 0(0) | 0(0) | 0(0) |
| [ | 168 | 206 | BKVN | 3(1.46) | 5(2.43) | 3(1.46) | 3(1.46) | 3(1.46) | 3(1.46) | 3(1.46) | 3(1.46) | 3(1.46) |
| [ | 204 | 82 | Chronic allograft damage | 1(1.22) | 1(1.22) | 2(2.44) | 2(2.44) | 1(1.22) | 1(1.22) | 2(2.44) | 1(1.22) | 2(2.44) |
Abbreviations: ABMR antibody mediated rejection, DE differentially expressed, TCMR T-cell mediated rejection, BKVN polyomavirus nephropathy, For other abbreviations, see legend to Table 1
^a The total number of overlapping genes with the specified datasets is enumerated
Comparison of Published housekeeping genes with HKG Datasets Defined in This Study
| Study | #samples | #HKG | #Tissues/cells studied | Technique | Normalization method | Housekeeping Gene Set Stratified by Normalization Method^a | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | TC (%) | UQ (%) | Median (%) | Quantile (%) | TMM (%) | DESeq (%) | TPM (%) | RPKM (%) | Lib_size (%) |
| (4) | 142 | 1789 | 79 | Microarray | NA^b | 94 (5.25) | 91 (5.09) | 115 (6.43) | 111 (6.20) | 92 (5.14) | 89 (4.97) | 115 (6.43) | 76 (4.25) | 96 (5.37) |
| (5) | 18 | 2403 | 18 | Microarray | NA | 110 (4.58) | 145 (6.03) | 158 (6.58) | 161 (6.70) | 132 (5.49) | 124 (5.16) | 158 (6.58) | 103 (4.29) | 133 (5.53) |
| (6) | 42 | 1522 | 42 | Microarray | NA | 89 (5.84) | 112 (7.36) | 117 (7.69) | 115 (7.56) | 87 (5.72) | 88 (5.78) | 117 (7.69) | 80 (5.26) | 92 (6.04) |
| (6) | NA | 15,050 | 32 | sequencing_MPSS | TPM | 516 (3.43) | 578 (3.84) | 566 (3.76) | 581 (3.86) | 550 (3.65) | 559 (3.71) | 566 (3.76) | 489 (3.25) | 559 (3.71) |
| (7) | 2502 | 6909 | 18 | Sequencing_EST | NA | 398 (5.76) | 542 (7.84) | 533 (7.71) | 551 (7.98) | 463 (6.70) | 458 (6.63) | 533 (7.71) | 369 (5.34) | 471 (6.82) |
| (8) | NA | 12,714 | 19 | sequencing_EST | NA | 583 (4.59) | 627 (4.93) | 642 (5.05) | 656 (5.16) | 610 (4.80) | 620 (4.88) | 642 (5.05) | 546 (4.29) | 628 (4.94) |
| (9) | NA | 7896 | 32 | RNA-Seq | RPKM | 514 (6.51) | 628 (7.95) | 654 (8.28) | 656 (8.31) | 601 (7.61) | 594 (7.52) | 654 (8.28) | 441 (5.59) | 615 (7.79) |
| (10) | 16 | 3804 | 16 | RNA-Seq | RPKM | 279 (7.33) | 361 (9.49) | 372 (9.78) | 379 (9.96) | 317 (8.33) | 315 (8.28) | 372 (9.78) | 212 (5.57) | 329 (8.64) |
Abbreviations: EST expressed sequence tags, HKG housekeeping genes, MPSS Massively parallel signature sequencing, ABMR antibody mediated rejection, DE differentially expressed, TCMR T-cell mediated rejection, PVAN polyomavirus nephropathy, NA not available; for other abbreviations, see legend to Table 1
^a The total number (%) of overlapping genes with the specified datasets is enumerated. Percentage calculations are based on the total number of HKG in column 3
^b The normalization methods in these references were not mentioned, but the most common method used for microarray data is Quantile normalization
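The overlap counts and percentages described in the footnotes above can be sketched as a set intersection, with the published HKG list size as the denominator. The function name and gene identifiers below are hypothetical; only the 94 and 1789 figures come from the table.

```python
def overlap_summary(published_hkg, study_hkg):
    """Return (number of shared genes, percent of the published list)."""
    published = set(published_hkg)
    shared = published & set(study_hkg)
    percent = round(100 * len(shared) / len(published), 2)
    return len(shared), percent


# Hypothetical gene identifiers, for illustration only.
published_list = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
study_list = {"GENE_B", "GENE_D", "GENE_E"}
n_shared, pct_shared = overlap_summary(published_list, study_list)

# Arithmetic check against one table cell: 94 shared genes out of 1789
# published HKG for reference (4) under TC normalization.
table_cell_pct = round(100 * 94 / 1789, 2)
```

Because the denominator is the published list rather than the 952 HKG defined in this study, the same overlap count yields different percentages across rows.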
Housekeeping Genes (n = 42) Common to All Normalization Methods
| Entrez Gene ID | Transcripts | Entrez Gene Name | Molecular function |
|---|---|---|---|
| 51433 | ANAPC5 | anaphase promoting complex subunit 5 | protein phosphatase binding |
| 25906 | ANAPC15 | anaphase promoting complex subunit 15 | anaphase-promoting complex |
| 10620 | ARID3B | AT-rich interaction domain 3B | transcription regulator |
| 285598 | ARL10 | ADP ribosylation factor like GTPase 10 | small GTPase mediated signal transduction |
| 6311 | ATXN2 | ataxin 2 | epidermal growth factor receptor binding |
| 57020 | C16orf62 | chromosome 16 open reading frame 62 | protein binding |
| 132200 | C3orf49 | chromosome 3 open reading frame 49 | unknown |
| 55749 | CCAR1 | cell division cycle and apoptosis regulator 1 | core promoter binding |
| 202243 | CCDC125 | coiled-coil domain containing 125 | regulation of cell motility |
| 60492 | CCDC90B | coiled-coil domain containing 90B | protein binding |
| 55743 | CHFR | checkpoint with forkhead and ring finger domains | E3 ubiquitin-protein ligase |
| 207063 | DHRSX | dehydrogenase/reductase X-linked | oxidoreductase activity |
| 83786 | FRMD8 | FERM domain containing 8 | protein binding |
| 26088 | GGA1 | golgi associated, gamma adaptin ear containing, ARF binding protein 1 | cellular protein metabolic process |
| 26091 | HERC4 | HECT and RLD domain containing E3 ubiquitin protein ligase 4 | transferase activity; ubiquitin-protein ligase activity |
| 8569 | MKNK1 | MAP kinase interacting serine/threonine kinase 1 | ATP binding; calcium-dependent protein serine/threonine kinase activity |
| 4678 | NASP | nuclear autoantigenic sperm protein | histone binding; Hsp90 protein binding |
| 4833 | NME4 | NME/NM23 nucleoside diphosphate kinase 4 | ubiquitous enzymes |
| 55611 | OTUB1 | OTU deubiquitinase, ubiquitin aldehyde binding 1 | NEDD8-specific protease activity |
| 11243 / 100527963 | PMF1/PMF1-BGLAP | polyamine modulated factor 1 | leucine zipper domain binding |
| 5431 | POLR2B | RNA polymerase II subunit B | chromatin binding |
| 11128 | POLR3A | RNA polymerase III subunit A | chromatin binding |
| 84197 | POMK | protein-O-mannose kinase | ATP binding; carbohydrate kinase activity |
| 379025 | PSMA3-AS1 | PSMA3 antisense RNA 1 | unknown |
| 5784 | PTPN14 | protein tyrosine phosphatase, non-receptor type 14 | hydrolase activity; phosphatase activity |
| 51735 | RAPGEF6 | Rap guanine nucleotide exchange factor 6 | GTP-dependent protein binding |
| 5966 | REL | REL proto-oncogene, NF-kB subunit | chromatin binding; DNA binding |
| 8568 | RRP1 | ribosomal RNA processing 1 | RNA binding |
| 146923 | RUNDC1 | RUN domain containing 1 | GTPase activator activity; Rab GTPase binding |
| 55095 | SAMD4B | sterile alpha motif domain containing 4B | mRNA binding |
| 22950 | SLC4A1AP | solute carrier family 4 member 1 adaptor protein | mRNA binding; protein binding |
| 7871 | SLMAP | sarcolemma associated protein | protein binding |
| 50485 | SMARCAL1 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a like 1 | ATP binding; DNA-dependent ATPase activity |
| 9342 | SNAP29 | synaptosome associated protein 29 | protein binding; SNAP receptor activity |
| 23020 | SNRNP200 | small nuclear ribonucleoprotein U5 subunit 200 | ATP binding; ATP-dependent helicase activity |
| 6827 | SUPT4H1 | SPT4 homolog, DSIF elongation factor subunit | metal ion binding; protein binding |
| 25771 | TBC1D22A | TBC1 domain family member 22A | 14-3-3 protein binding; GTPase activator activity |
| 440944 | THUMPD3-AS1 | THUMPD3 antisense RNA 1 | unknown |
| 100506779 | TSPOAP1-AS1 | TSPOAP1 antisense RNA 1 | unknown |
| 10844 | TUBGCP2 | tubulin gamma complex associated protein 2 | gamma-tubulin binding |
| 23038 | WDTC1 | WD and tetratricopeptide repeats 1 | enzyme inhibitor activity |
| 27300 | ZNF544 | zinc finger protein 544 | DNA binding; metal ion binding |
Fig. 3 Canonical pathways identified by IPA core analysis as over-represented amongst 42 HKG common to 9 different data normalization methods. Pathways meeting statistical confidence thresholds preset in IPA are identified on the Y-axis (−log10 p = 1.3, right-tailed Fisher’s exact test). The lower X-axis and the line diagram display the proportion of total genes in the specified pathway that meet the cutoff criteria for identification
Fig. 4 Top 20 physiologic functions associated with 42 HKG common to all biopsies and normalization methods. Physiological functions meeting statistical confidence thresholds (−log10 p = 1.3, right-tailed Fisher’s exact test) are shown
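The threshold quoted in the captions above, −log10 p = 1.3, corresponds to p of about 0.05. A right-tailed Fisher's exact test for over-representation can be sketched as a hypergeometric tail sum. This is a generic illustration rather than IPA's implementation; the background size of 20,000 genes is a hypothetical assumption, and the pathway size of 45 is only inferred from a ratio of 0.0444 with two molecules.

```python
from math import comb, log10


def right_tailed_fisher_p(hits, set_size, pathway_size, background_size):
    """P(X >= hits) when set_size genes are drawn from background_size genes,
    of which pathway_size belong to the pathway (hypergeometric tail)."""
    upper = min(set_size, pathway_size)
    total = comb(background_size, set_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, set_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# The significance threshold expressed as a p-value.
p_threshold = 10 ** -1.3

# Hypothetical inputs: 2 of 42 HKG in a 45-gene pathway, with an assumed
# background of 20,000 genes (the real IPA reference set is not given here).
p_value = right_tailed_fisher_p(hits=2, set_size=42, pathway_size=45,
                                background_size=20000)
neg_log10_p = -log10(p_value)
is_significant = neg_log10_p >= 1.3
```

The resulting value depends heavily on the assumed background, so it should not be expected to match the −log p figures reported for the pathways.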
Canonical Pathways identified by IPA software for 42 Housekeeping Genes Common to All Normalization Methods
| Ingenuity Canonical Pathways | -log(p-value) | Ratio | Molecules |
|---|---|---|---|
| Pyrimidine Ribonucleotides Interconversion | 2.58 | 0.0444 | NME4,SMARCAL1 |
| Pyrimidine Ribonucleotides De Novo Biosynthesis | 2.55 | 0.0426 | NME4,SMARCAL1 |
| Salvage Pathways of Pyrimidine Ribonucleotides | 1.95 | 0.0211 | NME4,POMK |
| Pyrimidine Deoxyribonucleotides De Novo Biosynthesis I | 1.42 | 0.0435 | NME4 |
| Nucleotide Excision Repair Pathway | 1.24 | 0.0286 | POLR2B |
| Assembly of RNA Polymerase II Complex | 1.09 | 0.02 | POLR2B |
| Pyridoxal 5′-phosphate Salvage Pathway | 0.983 | 0.0154 | POMK |
| Mitotic Roles of Polo-Like Kinase | 0.977 | 0.0152 | ANAPC5 |
| Protein Kinase A Signaling | 0.836 | 0.005 | PTPN14,ANAPC5 |
| Androgen Signaling | 0.767 | 0.00901 | POLR2B |
| p38 MAPK Signaling | 0.736 | 0.00833 | MKNK1 |
| RhoA Signaling | 0.723 | 0.00806 | RAPGEF6 |
| Estrogen Receptor Signaling | 0.711 | 0.00781 | POLR2B |
| Hereditary Breast Cancer Signaling | 0.665 | 0.00694 | POLR2B |
| IL-12 Signaling and Production in Macrophages | 0.66 | 0.00685 | REL |
| Regulation of eIF4 and p70S6K Signaling | 0.632 | 0.00637 | MKNK1 |
| CREB Signaling in Neurons | 0.568 | 0.00538 | POLR2B |
| RAR Activation | 0.56 | 0.00526 | REL |
| ERK/MAPK Signaling | 0.541 | 0.005 | MKNK1 |
| Systemic Lupus Erythematosus Signaling | 0.499 | 0.00444 | SNRNP200 |
| Huntington’s Disease Signaling | 0.461 | 0.004 | POLR2B |
| Protein Ubiquitination Pathway | 0.441 | 0.00377 | ANAPC5 |
| Glucocorticoid Receptor Signaling | 0.358 | 0.00295 | POLR2B |
| Axonal Guidance Signaling | 0.27 | 0.00221 | MKNK1 |