P K Smitha, K Vishnupriyan, Ananya S Kar, M Anil Kumar, Christopher Bathula, K N Chandrashekara, Sujan K Dhar, Manjula Das.
Abstract
BACKGROUND: Cotton is one of the most important commercial crops, serving as a source of natural fiber, oil and fodder. To protect it from harmful pest populations, a number of newer transgenic lines have been developed. For quick expression checks in agricultural work, qPCR (quantitative polymerase chain reaction) has become extremely popular. The selection of appropriate reference genes plays a critical role in the outcome of such experiments, because the method quantifies expression of the target gene relative to the reference. Traditionally, the most commonly used reference genes are "housekeeping genes", which are involved in basic cellular processes. However, expression levels of such genes often vary in response to experimental conditions, forcing researchers to validate the reference genes for every experimental platform. This study presents a data-science-driven, unbiased genome-wide search for reference genes, assessing the variation of > 50,000 genes in a publicly available RNA-seq dataset of the cotton species Gossypium hirsutum.
RESULT: Five genes (TMN5, TBL6, UTR5B, At1g65240 and CYP76B6) identified by the data-science-driven analysis, along with two reference genes commonly used in the literature (PP2A1 and UBQ14), were taken through qPCR in a set of 33 experimental samples. The samples covered different tissues (leaves, square, stem and root), different stages of leaf development (young and mature) and of square development (small, medium and large), in both transgenic and non-transgenic plants. Expression stability of the genes was evaluated using four algorithms: geNorm, BestKeeper, NormFinder and RefFinder.
Keywords: Cotton; Data science; Gossypium hirsutum; Reference gene; Transgenic; qPCR
Year: 2019 PMID: 31521126 PMCID: PMC6744693 DOI: 10.1186/s12870-019-1988-3
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Fig. 1 Clusters of genes in the three-dimensional space of CV, MAD and 1-p, obtained using the PAM method. Genes marked in red represent cluster #1
Medoid Z-scores of the clusters
| Cluster | Number of genes | Medoid Z-score (CV) | Medoid Z-score (MAD) | Medoid Z-score (1-p) |
|---|---|---|---|---|
| 1 | 5973 | −0.599 | −0.676 | 0.221 |
| 2 | 4061 | 0.426 | 0.933 | 0.223 |
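The clusters above are defined on standardized per-gene variability measures. As a rough illustration of how CV and MAD Z-scores of this kind can be computed, here is a minimal sketch in plain Python. The gene names and expression values are made up, the MAD is left unscaled, and the final ranking by summed Z-scores is only a crude stand-in for the PAM clustering, not the authors' procedure. The third axis (1-p) is not defined in this excerpt and is left out.

```python
import statistics


def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def mad(values):
    """Median absolute deviation from the median (unscaled; an assumption)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def z_scores(xs):
    """Standardize a list of numbers to zero mean and unit (population) variance."""
    mu = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]


# Hypothetical expression values for three made-up genes across five samples.
expression = {
    "geneA": [100.0, 102.0, 98.0, 101.0, 99.0],
    "geneB": [50.0, 150.0, 20.0, 200.0, 80.0],
    "geneC": [10.0, 12.0, 9.0, 11.0, 30.0],
}

genes = list(expression)
cv_z = dict(zip(genes, z_scores([cv(expression[g]) for g in genes])))
mad_z = dict(zip(genes, z_scores([mad(expression[g]) for g in genes])))

# Illustrative shortcut only: lower summed Z-score = less variable gene.
most_stable = min(genes, key=lambda g: cv_z[g] + mad_z[g])
print(most_stable)
```

Negative Z-scores on both axes, as for the cluster 1 medoid in the table, indicate genes that vary less than the average gene in the dataset.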
Fig. 2 Workflow for identifying candidate reference genes with the least variation and validating them experimentally
List of selected candidate reference genes for expression analysis and validation
| Gene name | NCBI RefSeq accession | Description | Function |
|---|---|---|---|
| CYP76B6 | XM_016861559 | Geraniol 8-hydroxylase | Heme binding and oxidoreductase activity |
| RPK2 | XM_016855096.1 | LRR receptor-like serine/threonine-protein kinase RPK2 | Protein Kinase activity |
| At1g65240 | XM_016888563.1 | Aspartic proteinase-like protein 2 | Involved in aspartic-type endopeptidase activity |
| COV1 | XM_016863942.1 | Protein continuous vascular ring 1 | Negatively regulates the differentiation of vascular tissue in the stem. |
| AZG1 | XM_016863550.1 | Adenine/guanine permease AZG1 | Transports natural purines and purine analogs. Confers sensitivity to 8-azaadenine and 8-azaguanine |
| EMB8 | XM_016834287.1 | Embryogenesis-associated protein EMB8 | Role in embryogenesis? |
| TMK3 | XM_016813562.1 | Receptor-like kinase TMK3 | Auxin signal transduction, cell expansion, proliferation and regulation |
| UTR5B | XM_016900481.1 | UDP-galactose/UDP-glucose transporter 5B | Sugar transporter |
| TMN5 | XM_016895405.1 | Transmembrane 9 superfamily member 5 | Protein localization |
| TBL6 | XM_016880182.1 | Protein trichome birefringence-like 6 | O-acetyltransferase activity |
| PP2A1 | XM_016840233.1 | Protein Phosphatase 2A | Phosphatase activity |
| UBQ14 | XM_016867963.1 | Polyubiquitin | Ubiquitination reaction |
Primer sequences and efficiency of the shortlisted primers used in this study
| Gene name | Forward primer (5′ to 3′) | Reverse primer (5′ to 3′) | Efficiency (%) | R² |
|---|---|---|---|---|
| PP2A1 | GATCCTTGTGGAGGAGTGGA | GCGAAACAGTTCGACGAGAT | 93.54 | 0.99 |
| TMN5 | CTCACCATTCCATTACTTGTGTTG | GAGGAATCTCTCTCGGGTATCT | 103.24 | 0.97 |
| UBQ14 | CAACGCTCCATCTTGTCCTT | TGATCGTCTTTCCCGTAAGC | 103.74 | 0.99 |
| TBL6 | AGCAGATCCAGAGACAAGAAAG | CCATTGTAGGTGCAGGTGTAT | 95.14 | 0.99 |
| UTR5B | CGGTCTCTGCTGGTTCTTTAG | TGACATGTTGTGGTTAGGATGT | 94.91 | 0.99 |
| At1g65240 | GCAAACTCTACAGCTCCCATTA | GTCCAAACCCGAAGATTCCA | 104.42 | 0.99 |
| CYP76B6 | TGGCTTGGATGCCTGTTT | TCGCCGTAAGTGTTGGTTAG | 103.71 | 0.99 |
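This excerpt does not say how the primer efficiencies were obtained. A common approach is to fit a standard curve of Cq against log10 template quantity over a dilution series and convert the slope to an efficiency via E = (10^(−1/slope) − 1) × 100. The sketch below assumes that approach; the function name `standard_curve` and the dilution data are hypothetical.

```python
def standard_curve(log10_quantity, cq):
    """Least-squares fit of Cq against log10(template quantity).

    Returns (slope, efficiency in percent, R squared of the fit).
    """
    n = len(cq)
    mean_x = sum(log10_quantity) / n
    mean_y = sum(cq) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantity)
    syy = sum((y - mean_y) ** 2 for y in cq)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantity, cq))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    # A slope near -3.32 corresponds to a doubling per cycle (about 100 %).
    efficiency_pct = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, efficiency_pct, r_squared


# Hypothetical ten-fold dilution series (log10 of relative template amount)
# with made-up Cq values.
dilutions = [0.0, -1.0, -2.0, -3.0, -4.0]
cq_values = [18.0, 21.32, 24.64, 27.96, 31.28]

slope, efficiency, r2 = standard_curve(dilutions, cq_values)
print(f"slope={slope:.2f}, efficiency={efficiency:.1f}%, R^2={r2:.3f}")
```

Under this convention, the efficiencies in the table (roughly 93 % to 105 %) would correspond to standard-curve slopes close to −3.3.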
Fig. 3 Observed expression values of candidate reference genes across normal and transgenic categories, with the median expression value of each gene represented by the middle horizontal line in the box plot
Fig. 4 Observed expression values of candidate reference genes across (a) different ages of the plant and (b) various plant parts, with the median expression value of each gene represented by the middle horizontal line in the box plot
Fig. 5 Observed expression values of candidate reference genes across (a) two maturity levels of the leaves and (b) different sizes of the squares, with the median expression value of each gene represented by the middle horizontal line in the box plot
Fig. 6 Stability ranks of the candidate reference genes using four different algorithms: geNorm, BestKeeper, NormFinder and RefFinder
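To give a sense of what a stability ranking measures, the sketch below computes a geNorm-style M value as that algorithm is usually described: for each gene, the average standard deviation of its log2 expression ratios against every other candidate across samples. Lower M means more stable. This is a simplified illustration, not the geNorm software itself: it omits the stepwise exclusion of the least stable gene, and the gene names and relative quantities are hypothetical.

```python
import math
import statistics


def genorm_m(expression):
    """Return a dict mapping each gene to its geNorm-style M value.

    `expression` maps gene name -> list of relative quantities,
    with the same sample order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in each sample.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical relative quantities for three made-up candidates in four samples.
# refA and refB keep a constant ratio; refC fluctuates independently.
quantities = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],
    "refC": [1.0, 8.0, 2.0, 4.0],
}

m = genorm_m(quantities)
ranking = sorted(m, key=m.get)  # most stable first
print(ranking)
```

In this toy input, refA and refB track each other exactly, so they share the lowest M, while refC receives the highest M and ranks last.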