Rosario Carmona1, Macarena Arroyo2, María José Jiménez-Quesada1, Pedro Seoane3, Adoración Zafra1, Rafael Larrosa4, Juan de Dios Alché1, M Gonzalo Claros5.
Abstract
BACKGROUND: Gene expression analyses require appropriate reference genes (RGs) for normalization in order to obtain reliable assessments. Ideally, RG expression levels should remain constant in all cells, tissues or experimental conditions under study. Housekeeping genes traditionally fulfilled this requirement, but they have been reported to be less invariant than expected; therefore, RGs should be tested and validated for every particular situation. Microarray data have been used to propose new RGs, but they are available for only a limited set of model species and conditions; in contrast, RNA-seq experiments are increasingly frequent and constitute a new source of candidate RGs.
Keywords: Cancer; Normalization; Olive (Olea europaea L.); Quantitative PCR; Real-time PCR; Reference genes
Year: 2017 PMID: 28830520 PMCID: PMC5568602 DOI: 10.1186/s12938-017-0356-5
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Fig. 1 Flow diagram as provided by AutoFlow for the detection of RGs using the SRA datasets of PRJEB9470 from Arabidopsis. a The first workflow, which prepares the reads, maps them onto the transcriptome and provides the read count table. b The findRGs workflow for detecting candidate RGs. In this example, several filtering parameters were tested: 10 and 20% for the maximum CV, and 10,000, 30,000 and 100,000 reads for the minimum counted reads per transcript and condition. One Venn diagram is obtained per CV cut-off, as shown in Figs. 2, 4, 5 and 6
Fig. 2 Venn diagrams summarizing the number of RGs obtained for reproductive tissues of olive tree. Two cut-off values were used for CV and three for counted reads. Reproductive RGs were obtained after combining both pollen and pistil reads
Fig. 4 Venn diagrams summarizing the number of RGs obtained for Arabidopsis thaliana. Two cut-off values were used for CV and three for counted reads
Fig. 5 Venn diagrams summarizing the number of RGs obtained for matched samples of normal and malignant tissues of three different human cancers: prostate, small-cell lung cancer and lung adenocarcinoma. Two cut-off values per cancer were used for CV, and different counted-read cut-offs were used depending on the tissue
Fig. 6 Venn diagrams summarizing the number of RGs obtained for different combinations of lung samples: samples from only normal lung (a) or normal and malignant lung (b) were analyzed with two CV cut-off values per combination and two different counted-read cut-offs
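The filtering step described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not the findRGs implementation: the function names, the toy count table and the choice to apply both cut-offs to the same numbers are assumptions made for illustration only. A transcript is kept when every condition reaches the minimum counted reads and its CV across conditions stays below the maximum CV.

```python
from statistics import mean, pstdev


def cv_percent(values):
    """Coefficient of variation in percent (population standard deviation)."""
    return 100.0 * pstdev(values) / mean(values)


def find_rgs(table, max_cv, min_reads):
    """Return the transcript ids that pass both cut-offs (hypothetical helper)."""
    return {
        tid
        for tid, counts in table.items()
        if min(counts) >= min_reads and cv_percent(counts) < max_cv
    }


# Hypothetical toy table: counted reads per transcript in three conditions.
toy_counts = {
    "tx_stable_high": [120_000, 125_000, 118_000],
    "tx_stable_low": [12_000, 12_500, 11_800],
    "tx_variable": [150_000, 40_000, 90_000],
    "tx_moderate": [35_000, 45_000, 40_000],
}

# Same parameter grid as in the caption: two CV cut-offs, three read cut-offs.
results = {
    (max_cv, min_reads): find_rgs(toy_counts, max_cv, min_reads)
    for max_cv in (10, 20)
    for min_reads in (10_000, 30_000, 100_000)
}

for key in sorted(results):
    print(key, sorted(results[key]))
```

For each CV cut-off, the three sets obtained with the different read cut-offs are the groups that a Venn diagram would compare.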
Primers used for PCR amplification
| Gene | Direction | Sequence |
|---|---|---|
| 18S | Forward | 5′-TTT GAT GGT ACC TGC TAC TCG GAT AAC C |
| | Reverse | 5′-CTC TCC GGA ATC GAA CCC TAA TTC TCC |
| Ubiquitin monomer to pentamer | Forward | 5′-ATGCAGAT(C/T)TTTGTGAAGAC |
| | Reverse | 5′-ACCACCACG(G/A)AGACGGAG |
| Actin | Forward | 5′-TTG CTC TCG ACT ATG AAC AGG |
| | Reverse | 5′-CTC TCG GCC CCA ATA GTA ATA |
| Mitogen-activated protein kinase | Forward | 5′-CCAGGCGAGATTTCAGAGAC |
| | Reverse | 5′-TCGGTTTAAGGTCTCGATGG |
| Proline transporter | Forward | 5′-TTGTAGTGAGGGGCGGTTAC |
| | Reverse | 5′-CATGCAACCAAAGAAGCAGA |
| | Forward | 5′-ACAAAAGGCATTGCTTGGTC |
| | Reverse | 5′-GGCCAAAACGAAGTTTACCA |
| Glyceraldehyde-3-phosphate dehydrogenase | Forward | 5′-GGGCAAGATCAAGATTGGAA |
| | Reverse | 5′-GTCTTCTCGCCGAACAAAAG |
| Salicylic acid-binding protein | Forward | 5′-GCATTGACCCGAAAATCCTA |
| | Reverse | 5′-AGGATGGCGGATTTGTAGTG |
| | Forward | 5′-AGCTTCTGGCATCAGGAAAA |
| | Reverse | 5′-AGCCAGTACCCTCTCAAGCA |
Workflow execution times estimated for three datasets
| Species/tissue | No. raw reads | Mean length (nt) | No. transcripts | Pre-processing | Mapping | FindRGs | Total |
|---|---|---|---|---|---|---|---|
| Olive tree pistil | 767,963 | 525 | 9157 | 24 min 45 s | 26 s | 5 s | 25 min 16 s |
| | | | | 8 nodes, 192 CPUs | 3 nodes, 72 CPUs | 1 node, 9 CPUs | |
| | 23,821,198 (x2) | 100 (x2) | 35,386 | 43 s | 19 s | 0.2 s | 1 min 2 s |
| | | | | 8 nodes, 192 CPUs | 2 nodes, 48 CPUs | 1 node, 9 CPUs | |
| Human prostate | 969,884,666 (x2) | 90 (x2) | 176,241 | 28 s | 2 min 37 s | 0.03 s | 3 min 5 s |
| | | | | 96 nodes, 2304 CPUs | 24 nodes, 576 CPUs | 1 node, 9 CPUs | |
All time values are expressed per 100,000 reads and were measured on SUSE® Linux Enterprise Server v12 using Opteron processors with 4 GB of RAM per core
Best RGs in reproductive tissues (combination of pollen and pistil) of olive tree according to Fig. 2c and ranked by CV
| Transcript_id | RPMM (PM) | RPMM (PG1) | RPMM (PG5) | RPMM (S2) | RPMM (S3) | RPMM (S4) | CV (%) | Mean RPMM | Best hit | Description |
|---|---|---|---|---|---|---|---|---|---|---|
| rp11_olive_006695 | 205 | 209 | 206 | 208 | 177 | 154 | 10.72 | 193.2 | Q39011 | Shaggy-related protein kinase eta |
| rp11_olive_000781 | 105 | 140 | 112 | 104 | 108 | 127 | 11.36 | 116 | Q94A41 | Alpha-amylase 3, chloroplastic |
| rp11_olive_006061 | 327 | 272 | 238 | 343 | 285 | 253 | 13.16 | 286.3 | Q8VZ80 | Polyol transporter 5 |
| rp11_olive_006091 | 283 | 221 | 211 | 208 | 177 | 190 | 15.65 | 215 | A0A022R151 | Uncharacterized protein |
| rp11_olive_010107 | 228 | 213 | 184 | 250 | 295 | 199 | 15.98 | 228.2 | O23254 | Serine hydroxymethyltransferase 4 |
| rp11_olive_005197_split_1 | 366 | 430 | 381 | 343 | 423 | 552 | 16.37 | 415.8 | Q42679 | S-adenosylmethionine decarboxylase proenzyme |
| rp11_olive_003279 | 122 | 179 | 153 | 114 | 118 | 145 | 16.64 | 138.5 | Q9LV37 | Mitogen-activated protein kinase 9 |
| rp11_olive_000623 | 94 | 94 | 58 | 104 | 108 | 100 | 17.68 | 93 | A0A068V6W8 | Coffea canephora DH200=94 genomic scaffold, scaffold_132 |
| rp11_olive_007981 | 144 | 149 | 108 | 156 | 147 | 91 | 18.2 | 132.5 | Q93Y40 | Oxysterol-binding protein-related protein 3C |
| rp11_olive_005099 | 888 | 728 | 678 | 530 | 550 | 579 | 18.84 | 658.8 | P53492 | Actin-7 |
| rp11_olive_005815 | 311 | 272 | 256 | 322 | 364 | 444 | 19.03 | 328.2 | P17598 | Catalase isozyme 1 |
| rp11_olive_000209_split_1 | 161 | 115 | 117 | 166 | 118 | 100 | 19.16 | 129.5 | Q67YI9-2 | Isoform 2 of Clathrin interactor EPSIN 2 |
| rp11_olive_001245 | 161 | 175 | 108 | 166 | 187 | 118 | 19.16 | 152.5 | A5A7I7 | Calcium-dependent protein kinase 4 |
| rp11_olive_008079 | 239 | 204 | 197 | 343 | 275 | 263 | 19.34 | 253.5 | M1AVD3 | Uncharacterized protein |
| rp11_olive_008883 | 128 | 119 | 144 | 187 | 108 | 118 | 19.51 | 134 | Q9LZI2 | UDP-glucuronic acid decarboxylase 2 |
| rp11_olive_035033 | 178 | 166 | 224 | 177 | 285 | 199 | 19.76 | 204.8 | P62201 | Calmodulin |
| rp11_olive_029725 | 211 | 128 | 184 | 177 | 157 | 118 | 19.82 | 162.5 | O04834 | GTP-binding protein SAR1A |
They were obtained for different stages of pollen and pistil with CV < 20% and minimum counted reads of 10. Transcript_id: transcript identifiers in the ReprOlive transcriptome
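As a worked check of the CV column, the short Python sketch below recomputes the mean RPMM and CV for the first two rows of the table. The RPMM values are copied from the table; the helper name `cv_percent` is hypothetical. The tabulated values are consistent with a CV based on the population standard deviation (dividing by n rather than n − 1), which is an inference from the numbers and not a statement taken from the source.

```python
from statistics import mean, pstdev


def cv_percent(values):
    """Coefficient of variation in percent, using the population standard deviation."""
    return 100.0 * pstdev(values) / mean(values)


# RPMM values (PM, PG1, PG5, S2, S3, S4) copied from the first two table rows.
rpmm = {
    "rp11_olive_006695": [205, 209, 206, 208, 177, 154],
    "rp11_olive_000781": [105, 140, 112, 104, 108, 127],
}

for tid, values in rpmm.items():
    print(tid, round(mean(values), 1), round(cv_percent(values), 2))
```

The computed CVs round to 10.72 and 11.36, matching the table; with the sample standard deviation the first value would instead be about 11.74.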
Fig. 3 Preliminary RT-PCR validation of RGs predicted in this work in olive tissues in comparison to 18S
Best RGs in Arabidopsis thaliana according to Fig. 4 and ranked by CV
| Replicate | Transcript_id | RPMM (Col_0) | RPMM (Kil_0) | CV (%) | Mean RPMM | Description |
|---|---|---|---|---|---|---|
| Replicate 1 | AT1G67090.1 | 9476 | 9191 | 1.53 | 9333.5 | Ribulose bisphosphate carboxylase small chain 1A |
| | AT5G38410.1 | 7760 | 8135 | 2.36 | 7947.5 | Ribulose bisphosphate carboxylase (small chain) family protein |
| | AT5G38430.1 | 7054 | 7548 | 3.38 | 7301 | Ribulose bisphosphate carboxylase (small chain) family protein |
| | AT2G39730.1 | 4051 | 4343 | 3.48 | 4197 | Rubisco activase |
| | AT5G38420.1 | 7329 | 7889 | 3.68 | 7609 | Ribulose bisphosphate carboxylase (small chain) family protein |
| Replicate 2 | AT2G39730.1 | 3906 | 4323 | 5.07 | 4114.5 | Rubisco activase |
| | AT1G67090.1 | 8523 | 9636 | 6.13 | 9079.5 | Ribulose bisphosphate carboxylase small chain 1A |
| | AT1G21310.1 | 7013 | 7976 | 6.42 | 7494.5 | Extensin 3 |
| | AT5G38410.1 | 7047 | 8438 | 8.98 | 7742.5 | Ribulose bisphosphate carboxylase (small chain) family protein |
| Replicate 3 | AT5G38420.1 | 8708 | 8526 | 1.06 | 8617 | Ribulose bisphosphate carboxylase (small chain) family protein |
| | AT5G38430.1 | 8424 | 8169 | 1.54 | 8296.5 | Ribulose bisphosphate carboxylase (small chain) family protein |
| | AT2G39730.1 | 4365 | 4524 | 1.79 | 4444.5 | Rubisco activase |
| | AT5G38410.1 | 9172 | 8822 | 1.95 | 8997 | Ribulose bisphosphate carboxylase (small chain) family protein |
| | AT1G67090.1 | 11,051 | 9694 | 6.54 | 10,372.5 | Ribulose bisphosphate carboxylase small chain 1A |
They were obtained for the three replicates with CV < 10% and minimum counted reads of 100,000. Transcript_id: transcript identifiers in TAIR database
Best candidate RGs for normal and malignant prostate tissues according to Fig. 5a and ranked by CV
| Transcript_id | CV (%) | Mean RPMM | Gene | Description |
|---|---|---|---|---|
| ENST00000510199.5 | 8.95 | 99.8 | GNB2L1 | Guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1 |
| ENST00000425566.1 | 9.91 | 127.3 | RPL23AP87 | Ribosomal protein L23a pseudogene 87 |
| ENST00000314138.10 | 10.52 | 134.9 | RPL27A | Ribosomal protein L27a |
| ENST00000412331.6 | 11.18 | 108.1 | EIF3L | Eukaryotic translation initiation factor 3 subunit L |
| ENST00000494591.1 | 11.49 | 78.4 | RPSAP36 | Ribosomal protein SA pseudogene 36 |
| ENST00000519807.5 | 11.5 | 168.1 | RPS20 | Ribosomal protein S20 |
| ENST00000356769.7 | 11.58 | 92.8 | NACA | Nascent polypeptide-associated complex alpha subunit |
| ENST00000496593.5 | 12.28 | 253.5 | RPLP0P2 | Ribosomal protein, large, P0 pseudogene 2 |
| ENST00000338970.10 | 12.63 | 176.9 | RPL14 | Ribosomal protein L14 |
| ENST00000610672.4 | 12.74 | 244 | MED22 | Mediator complex subunit 22 |
| ENST00000395957.6 | 12.97 | 95.5 | YWHAZ | Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta |
| ENST00000234831.9 | 13.69 | 108.1 | TMEM59 | Transmembrane protein 59 |
| ENST00000353047.10 | 14.01 | 156.1 | CTSB | Cathepsin B |
| ENST00000556083.1 | 14.36 | 137.5 | ACTN1 | Actinin, alpha 1 |
| ENST00000558264.5 | 14.59 | 129.8 | TPM1 | Tropomyosin 1 (alpha) |
| ENST00000394621.6 | 14.87 | 189.2 | STEAP2 | STEAP2 metalloreductase |
| ENST00000335508.10 | 15.15 | 116 | SF3B1 | Splicing factor 3b subunit 1 |
| ENST00000341423.9 | 15.26 | 134.4 | HMGB1 | High mobility group box 1 |
| ENST00000564521.6 | 15.84 | 167 | ALDOA | Aldolase, fructose-bisphosphate A |
| ENST00000398752.10 | 16.45 | 200.6 | ATP5A1 | ATP synthase, H+ transporting, mitochondrial F1 complex, alpha subunit 1, cardiac muscle |
| ENST00000264657.9 | 16.84 | 137.7 | STAT3 | Signal transducer and activator of transcription 3 (acute-phase response factor) |
| ENST00000357214.5 | 17.4 | 105.5 | SFPQ | Splicing factor proline/glutamine-rich |
| ENST00000456530.6 | 17.44 | 118.5 | RPL15 | Ribosomal protein L15 |
| ENST00000495596.5 | 17.64 | 164.2 | ATP5G2 | ATP synthase, H+ transporting, mitochondrial Fo complex subunit C2 (subunit 9) |
| ENST00000391959.5 | 17.67 | 125.6 | PPP1R12B | Protein phosphatase 1 regulatory subunit 12B |
| ENST00000369936.2 | 17.97 | 235 | KIAA1324 | KIAA1324 |
| ENST00000300619.11 | 19.01 | 104.5 | ZNF91 | Zinc finger protein 91 |
| ENST00000401722.7 | 19.21 | 156.4 | SLC25A3 | Solute carrier family 25 (mitochondrial carrier; phosphate carrier), member 3 |
| ENST00000618621.4 | 19.43 | 405.9 | LPP | LIM domain containing preferred translocation partner in lipoma |
| ENST00000249822.8 | 19.53 | 107.3 | ARPP19 | cAMP regulated phosphoprotein 19 kDa |
| ENST00000353411.10 | 19.68 | 122.8 | SKP1 | S-phase kinase-associated protein 1 |
| ENST00000375856.4 | 19.76 | 151.8 | IRS2 | Insulin receptor substrate 2 |
| ENST00000373316.4 | 19.79 | 118.1 | PGK1 | Phosphoglycerate kinase 1 |
| ENST00000306085.10 | 19.9 | 159.2 | TRIM56 | Tripartite motif containing 56 |
| ENST00000357308.8 | 20 | 105 | GFPT1 | Glutamine–fructose-6-phosphate transaminase 1 |
They were obtained with CV < 20% and minimum counted reads of 30,000. Transcript_id: human transcript identifiers in ENSEMBL database
Best candidate RGs for normal lung and small-cell lung cancer according to Fig. 5b and ranked by CV
| Transcript_id | CV (%) | Mean RPMM | Gene | Description |
|---|---|---|---|---|
| ENST00000425566.1 | 12.68 | 76.2 | RPL23AP87 | Ribosomal protein L23a pseudogene 87 |
| ENST00000338970.10 | 12.96 | 103.3 | RPL14 | Ribosomal protein L14 |
| ENST00000442744.6 | 13.28 | 69.4 | UBA52 | Ubiquitin A-52 residue ribosomal protein fusion product 1 |
| ENST00000456530.6 | 16.02 | 76.7 | RPL15 | Ribosomal protein L15 |
| ENST00000553521.5 | 16.21 | 50.2 | SRSF5 | Serine/arginine-rich splicing factor 5 |
| ENST00000373242.6 | 16.8 | 73 | SAR1A | Secretion associated, Ras related GTPase 1A |
| ENST00000261890.6 | 16.88 | 55.3 | RAB11A | RAB11A, member RAS oncogene family |
| ENST00000510199.5 | 17.11 | 66 | GNB2L1 | Guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1 |
| ENST00000234115.10 | 17.59 | 63.6 | PLEKHB2 | Pleckstrin homology domain containing B2 |
| ENST00000401722.7 | 17.69 | 83.6 | SLC25A3 | Solute carrier family 25 (mitochondrial carrier; phosphate carrier), member 3 |
| ENST00000412331.6 | 17.76 | 54.6 | EIF3L | Eukaryotic translation initiation factor 3 subunit L |
| ENST00000422514.6 | 18.83 | 80.3 | RPL23A | Ribosomal protein L23a |
| ENST00000342374.4 | 19.13 | 45.2 | SERINC3 | Serine incorporator 3 |
| ENST00000483316.1 | 19.26 | 77.6 | BAZ2B | Bromodomain adjacent to zinc finger domain 2B |
| ENST00000335508.10 | 19.41 | 72.4 | SF3B1 | Splicing factor 3b subunit 1 |
| ENST00000471227.3 | 19.62 | 66.4 | RPL23AP2 | Ribosomal protein L23a pseudogene 2 |
| ENST00000334256.8 | 19.77 | 46.9 | KPNA4 | Karyopherin alpha 4 (importin alpha 3) |
| ENST00000332361.5 | 19.79 | 64.5 | RPL23AP57 | Ribosomal protein L23a pseudogene 57 |
| ENST00000416139.1 | 19.81 | 64.5 | RPL23AP18 | Ribosomal protein L23a pseudogene 18 |
| ENST00000495596.5 | 19.84 | 71.5 | ATP5G2 | ATP synthase, H+ transporting, mitochondrial Fo complex subunit C2 (subunit 9) |
| ENST00000446445.1 | 19.87 | 64.1 | RPL23AP43 | Ribosomal protein L23a pseudogene 43 |
They were obtained with CV < 20% and minimum counted reads of 10,000. Transcript_id: human transcript identifiers in ENSEMBL database
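Since the same transcripts can appear in more than one candidate list, a simple set intersection shows which ones are shared. The Python sketch below uses the gene symbols copied from the prostate table and from the small-cell lung cancer table above. It is only an illustration and not an analysis reported in the source; the two lists were obtained with different minimum counted reads (30,000 and 10,000), so the overlap should be read with that in mind.

```python
# Gene symbols copied from the prostate candidate RG table.
prostate = {
    "GNB2L1", "RPL23AP87", "RPL27A", "EIF3L", "RPSAP36", "RPS20", "NACA",
    "RPLP0P2", "RPL14", "MED22", "YWHAZ", "TMEM59", "CTSB", "ACTN1", "TPM1",
    "STEAP2", "SF3B1", "HMGB1", "ALDOA", "ATP5A1", "STAT3", "SFPQ", "RPL15",
    "ATP5G2", "PPP1R12B", "KIAA1324", "ZNF91", "SLC25A3", "LPP", "ARPP19",
    "SKP1", "IRS2", "PGK1", "TRIM56", "GFPT1",
}

# Gene symbols copied from the small-cell lung cancer candidate RG table.
sclc = {
    "RPL23AP87", "RPL14", "UBA52", "RPL15", "SRSF5", "SAR1A", "RAB11A",
    "GNB2L1", "PLEKHB2", "SLC25A3", "EIF3L", "RPL23A", "SERINC3", "BAZ2B",
    "SF3B1", "RPL23AP2", "KPNA4", "RPL23AP57", "RPL23AP18", "ATP5G2",
    "RPL23AP43",
}

# Candidates present in both lists.
shared = prostate & sclc
print(sorted(shared))
```

Eight symbols are shared (ATP5G2, EIF3L, GNB2L1, RPL14, RPL15, RPL23AP87, SF3B1 and SLC25A3), and in each case the transcript identifier is the same in both tables.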
Best candidate RGs for normal lung and lung adenocarcinoma according to Fig. 5c and ranked by CV
| Transcript_id | CV (%) | Mean RPMM | Gene | Description |
|---|---|---|---|---|
| ENST00000411857.2 | 16.34 | 224.7 | HNRNPA1P54 | Heterogeneous nuclear ribonucleoprotein A1 pseudogene 54 |
| ENST00000270460.10 | 18.06 | 204.1 | EPN1 | Epsin 1 |
| ENST00000373191.8 | 18.17 | 195.4 | AGO3 | Argonaute 3, RISC catalytic component |
| ENST00000323443.6 | 18.2 | 218.4 | LRRC57 | Leucine rich repeat containing 57 |
| ENST00000367975.6 | 18.35 | 204.8 | SDHC | Succinate dehydrogenase complex subunit C |
| ENST00000528973.1 | 18.42 | 211 | PCSK7 | Proprotein convertase subtilisin/kexin type 7 |
| ENST00000262160.10 | 18.7 | 214 | SMAD2 | SMAD family member 2 |
| ENST00000607772.5 | 18.73 | 200.3 | CNKSR3 | CNKSR family member 3 |
| ENST00000261854.9 | 18.85 | 198.2 | SPPL2A | Signal peptide peptidase like 2A |
| ENST00000398004.3 | 19.12 | 316.1 | SLC35E3 | Solute carrier family 35 member E3 |
| ENST00000396444.7 | 19.21 | 294 | USP8 | Ubiquitin specific peptidase 8 |
| ENST00000304177.9 | 19.28 | 212.4 | C15orf40 | Chromosome 15 open reading frame 40 |
| ENST00000328654.9 | 19.31 | 241.8 | ZNF26 | Zinc finger protein 26 |
| ENST00000307635.3 | 19.34 | 218.1 | ZNF556 | Zinc finger protein 556 |
| ENST00000258711.7 | 19.38 | 323.7 | CHST12 | Carbohydrate (chondroitin 4) sulfotransferase 12 |
| ENST00000329627.11 | 19.41 | 318.1 | PEX26 | Peroxisomal biogenesis factor 26 |
| ENST00000322122.7 | 19.49 | 192.7 | TRIM72 | Tripartite motif containing 72, E3 ubiquitin protein ligase |
| ENST00000238831.8 | 19.5 | 291.1 | YIPF4 | Yip1 domain family member 4 |
| ENST00000258149.9 | 19.71 | 222.8 | MDM2 | MDM2 proto-oncogene, E3 ubiquitin protein ligase |
| ENST00000253115.6 | 19.72 | 227.3 | ZNF426 | Zinc finger protein 426 |
| ENST00000614987.4 | 19.74 | 346.8 | RPS6KA5 | Ribosomal protein S6 kinase, 90 kDa, polypeptide 5 |
They were obtained with CV < 20% and minimum counted reads of 30,000. Transcript_id: human transcript identifiers in ENSEMBL database