Literature DB >> 34367198

Genome-Wide Identification and Evolutionary Analysis of Gossypium Tubby-Like Protein (TLP) Gene Family and Expression Analyses During Salt and Drought Stress.

Nasreen Bano1,2, Shafquat Fakhrah1, Chandra Sekhar Mohanty1,2, Sumit Kumar Bag1,2.   

Abstract

Tubby-like proteins (TLPs) possess a highly conserved closed β barrel tubby domain at C-terminal and N-terminal F-box. The role of TLP gene family members has been widely discussed in numerous organisms; however, the detailed genome-wide study of this gene family in Gossypium species has not been reported till date. Here, we systematically identified 105 TLP gene family members in cotton (Gossypium arboreum, Gossypium raimondii, Gossypium hirsutum, and Gossypium barbadense) genomes and classified them into eight phylogenetic groups. Cotton TLP12 gene family members clustered into two groups, 4 and 8. They experienced higher evolutionary pressure in comparison to others, indicating the faster evolution in both diploid as well as in tetraploid cotton. Cotton TLP gene family members expanded mainly due to segmental duplication, while only one pair of tandem duplication was found in cotton TLPs paralogous gene pairs. Subsequent qRT-PCR validation of seven putative key candidate genes of GhTLPs indicated that GhTLP11A and GhTLP12A.1 genes were highly sensitive to salt and drought stress. The co-expression network, pathways, and cis-regulatory elements of GhTLP11A and GhTLP12A.1 genes confirmed their functional importance in salt and drought stress responses. This study proposes the significance of GhTLP11A and GhTLP12A.1 genes in exerting control over salt and drought stress responses in G. hirsutum and also provides a reference for future research, elaborating the biological roles of G. hirsutum TLPs in both stress responses.
Copyright © 2021 Bano, Fakhrah, Mohanty and Bag.

Entities:  

Keywords:  expression; genome-wide analysis; network; phylogenetic analysis; salt and drought stress responses; transcription factor

Year:  2021        PMID: 34367198      PMCID: PMC8335595          DOI: 10.3389/fpls.2021.667929

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Tubby-like proteins are a family of bipartite transcription factors that were first studied in animals (Boggon et al., 1999; Santagata et al., 2001; Carroll et al., 2004) but have subsequently been identified from single-celled to multicellular organisms (Liu, 2008). Tubby-like proteins (TLPs) are characterized by the presence of the conserved C-terminal tubby domain, comprising of 12 anti-parallel closed β-barrel strands with a central hydrophobic α-helix (Boggon et al., 1999). In animals, TLPs are present in fewer numbers (ranging up to five) but have been ascribed to a wide range of cellular functions, including involvement in neuronal functions and development (Kleyn et al., 1996). In contrast to animals, the TLP family in plants is much larger with more than 10 members. Moreover, unlike animal TLPs, which possess a variable N-terminal region, plant TLP proteins possess a conserved N-terminal F-box domain besides the C-terminal tubby domain (Noben-Trauth et al., 1996; Gagne et al., 2002; Lai et al., 2004; Xu et al., 2016). F-box-comprising proteins are involved in the ubiquitin-mediated degradation of proteins (Kile et al., 2002), suggesting a role for TLPs in such processes. In plants, tubby-like proteins have been studied in some dicot and monocots, such as Arabidopsis thaliana, poplar, rice (Yang Z. et al., 2008), apple (Xu et al., 2016), and maize (Chen et al., 2016). In A. thaliana (At), 11 TLP genes have been identified while 14, 15, and 10 genes have been identified in rice, maize, and apple (Yang Z. et al., 2008; Chen et al., 2016; Xu et al., 2016). Studies of the TLP families in these plants reveal expression in different tissues and in response to different hormone treatments or under abiotic stress conditions (Lai et al., 2004; Liu, 2008; Xu et al., 2016). In Arabidopsis, AtTLP3, and AtTLP9 were found to function redundantly in response to abscisic acid (ABA) and osmotic stress treatments (Lai et al., 2004), while AtTLP9 was also demonstrated to respond in drought and salt stress (Lai et al., 2004; Bao et al., 2014). In apple, several TLP genes were upregulated in response to abiotic stress treatments, suggesting an important role for TLP genes in stress responses (Xu et al., 2016). In Cicer arietinum, CaTLP1 was expressed in response to dehydration stress, and its expression in tobacco led to enhanced tolerance to drought, oxidative, and salt stress (Wardhan et al., 2012). Collectively, these studies suggest that TLPs might have a significant function in the stress response of diverse plant species (Wang et al., 2018). However, the role of plants TLPs and their mode of action remain largely unknown (Zhang et al., 2020). Cotton (Gossypium spp.) is the most important natural fiber producing crop worldwide (Yang et al., 2020a). A lot of diversity exists in the Gossypium genus that includes six tetraploid species (2n = 52) and 45 diploids (2n = 26) (Hawkins et al., 2006; Grover et al., 2015). Interspecific hybridization events among the G. herbaceum (A1) or G. arboreum (A2) (A-genome ancestral African species) and G. raimondii or G. gossypioides (D6) (D-genome native species) have resulted in allotetraploid G. hirsutum (upland cotton) and G. barbadense (Senchina et al., 2003; Zhu and Li, 2013), which are possibly the oldest main allopolyploid crops (Paterson et al., 2012; Chalhoub et al., 2014; Marcussen et al., 2014). Diversity within the Gossypium species provides a perfect model for examining the evolution and polyploid domestication (Yang et al., 2020a) and has been facilitated further with the completion of the whole genome sequences of G. raimondii, G. arboreum, G. barbadense, and G. hirsutum in the last few years (Wang K. B. et al., 2012; Li et al., 2014; Liu X. et al., 2015; Zhang et al., 2015). The large evolutionary diversity in cotton allows it to adapt to several different types of regions with differing environmental conditions, although the molecular basis of this adaptation is not yet well-understood. We are interested in the evolution of the cotton TLP gene family and their possible roles in abiotic stress responses. So far, only a single member of the family GhTULP34 has been characterized (Li et al., 2020). In this study, we have carried out comprehensive genomic exploration of TLP protein family in G. raimondii, G. arboreum, G. barbadense, and G. hirsutum and studied the expression profiles of G. hirsutum TLPs (GhTLPs) in salt and drought stress responses. The aim of this study was to provide a comprehensive understanding of cotton TLP genes for future breeding programs for the improvement of plant quality, production, and response to abiotic stresses.

Materials and Methods

Identification of the TLP Gene Family in Gossypium Species

The protein, cDNA, gene annotation, and genome files (gff) of G. arboreum (BGI_A2 v1), G. raimondii (JGI v2), G. barbadense (HAU v2), and G. hirsutum (HAU v1) were retrieved from the CottonGen resources (Yu et al., 2014) while the protein sequence of A. thaliana was procured from the TAIR database (Lamesch et al., 2012). The Hidden Markov Model (HMM) profiles of the TLP domain (PF01167) were taken from the Pfam database (El-Gebali et al., 2019). The four cotton genomes were employed as queries, and the Pfam database was used as a reference to identify the TLP protein in the cotton dataset, using the HMMER search program (http://hmmer.wustl.edu/; Eddy, 2011). The identified TLP genes were further confirmed by BLASTP search and NCBI Batch-CD search (Lu et al., 2020).

Physiochemical Properties and Characterization of Cotton TLP Genes

The physiochemical properties, such as charge, molecular weight (Da), grand average of hydropathy (GRAVY), instability index, isoelectric points (pI) of G. arboreum (Ga), G. raimondii (Gr), G. barbadense (Gb), and G. hirsutum (Gh) TLPs, were determined through the ProtParam tool in the ExPASy web server (Gasteiger et al., 2003). The subcellular localization of cotton TLP proteins was predicted by using the software CELLO v.2.5 (Yu et al., 2006).

Analysis of the Encoded Protein Motif, Gene Structure, and miRNA Target Sites of Cotton TLP Proteins

The conserved protein motifs of cotton TLP genes were identified through the MEME (Multiple Em for Motif Elicitation) version 5.0.1 (Bailey et al., 2009) by employing the full-length proteins encoded by TLP genes in cotton. The exon-intron structure analysis was carried out with Gene Structure Display Server 2.0 (Guo et al., 2007), using the genomic and coding sequences of the identified cotton TLPs. To decipher the miRNA target sites in the cotton TLP transcripts, the complete sequences of all known and reported miRNAs of the four cotton species were fetched from the miRBase database (http://www.mirbase.org/; Kozomara et al., 2019). miRNA target site prediction analysis was performed through a plant small RNA target analysis server (psRNATarget 2017 release) (Dai et al., 2018), using the 376 cotton miRNA sequences.

Multiple Sequence Alignments (MSA) and Phylogenetic Analysis

To identify conserved regions of predicted cotton TLP proteins, multiple sequence alignments (MSA) were performed with the ClustalX2.1 program (Larkin et al., 2007), using default criteria. Phylogenetic tree construction was carried out through MEGA7 software (Kumar et al., 2016), with the maximum likelihood (ML) method, using the 1,000 bootstrap and the Jones-Taylor-Thornton (JTT) model. Visualization of the tree was carried out through Interactive Tree Of Life (iTOL; Letunic and Bork, 2019).

Gene Duplication Event, Chromosomal Distribution, and Synteny Analysis

To know the evolutionary mechanism of TLP gene in Gossypium, the paralogous TLP genes were identified in G. arboreum, G. raimondii, G. barbadense, and G. hirsutum, using a reciprocal blast with e-value 10−5. Paralogous genes are described as similarity of the aligned regions >70% and shared aligned region covering >70% of the gene length (Yang S. et al., 2008). The Ka/Ks ratio of orthologous and paralogous sequences was identified through the PAL2NAL program (Suyama et al., 2006), which was further used to compute the approximate date of duplication and divergence events with the formula T = Ks/2λ, assuming the clocklike rate (λ) of 1.5 synonymous substitutions per 10−8 years for cotton (Blanc and Wolfe, 2004; Tang et al., 2016). Moreover, the Ka/Ks ratio was also employed to show the selection pressure for the duplicated TLP genes. A Ka/Ks = 1, >1, and <1 demonstrate neutral, positive, and negative (purifying selection) evolution, respectively. Orthologous of TLP genes of cotton with A. thaliana and Theobroma cacao were identified via a reciprocal blast with an e-value 10−5. As per the result of the reciprocal blast, duplication events, and syntenic blocks of cotton TLP genes were detected through McScanX, and visualization of orthologous TLP genes between cotton (G. raimondii, G. arboreum, G. barbadense, and G. hirsutum) and two other species (A. thaliana and T. cacao) was performed through CIRCOS (Krzywinski et al., 2009; Wang Y. P. et al., 2012). The chromosomal location of all cotton TLP genes was found through the BLASTN search program on TLPs CDS sequences against the CottonGen (https://www.cottongen.org/) database. Total cotton TLP genes were mapped on the chromosome through Mapinspect software (http://mapinspect.software.informer.com/).

Expression Profile of Cotton TLP Gene Family Members Under Salt and Drought Stress Conditions

To gain insight into the expression profile of cotton TLP gene family members under salt and drought stress conditions, the Illumina RNA-Seq data of G. hirsutum (accession number: PRJNA532694) were retrieved from the NCBI database. The poor-quality reads were filtered by Fastx-toolkit (Schmieder and Edwards, 2011) and mapped to the G. hirsutum genome, using the TopHat2 (Kim et al., 2013). The estimation of transcript abundance was carried out with fragments per kilobase per million (FPKM) through Cufflinks software (Trapnell et al., 2012). The hierarchical clustered heatmap generation and visualization were done in the R program, using the pheatmap package.

RNA Isolation, cDNA Preparation, and Quantitative Real-Time PCR (qRT-PCR) Validation

The selected putative TLP genes were validated in 2-month-old drought and salt-stressed plants of G. hirsutum, grown in the field under normal photoperiodic conditions. Plants were grown in triplicate, and a single treatment of 300-mM NaCl was used to stimulate salt stress (Wei et al., 2017) and 20% PEG8000 solution to decrease the osmotic potential of the root, inducing drought stress (Shafiq et al., 2015). The non-treated plants were taken as control. Leaf tissues were collected at 6, 12, 24, 48, and 72 h after treatment, RNA isolated as per the protocol (Sigma USA), followed by cDNA (1 μg/μl) synthesis with the verso cDNA synthesis kit (Thermo scientific) as per the provided protocol. Expression of seven genes was checked by qRT-PCR fluorescent quantitative detection system (HiMedia Insta Q 48 M4), using fast the SYBERTM green master mix (Applied Biosystem) with primers designed with the help of primer express 3.0. Ubiquitin was taken as the internal control. The reaction conditions of qRT-PCR were 95°C for 10 min, followed by cycling for 40 cycles of denaturation at 95°C for 10 s, annealing at 56°C for 15 s and extension at 72°C for 30 s. Relative expression of the employed genes was calculated with mean ± SD of biological triplicate samples by the 2–ΔΔCt method (Livak and Schmittgen, 2001).

Co-expression Network and Metabolic Pathway Analysis of Negatively and Positively Co-expressed Genes With GhTLP11A and GhTLP12A.1

The co-expression network of the GhTLP11A and GhTLP12A.1 genes in salt and drought stress conditions was constituted by the FPKM values with the “expression correlation networks” module in Cytoscape version 3.8.0 (Smoot et al., 2011). The module calculated positive Pearson correlations (r ≥ 0.95) and negative correlations (r ≤ −0.95), with interacting members of the network. Network visualization and co-expression of genes were shown in Cytoscape by applying the force-directed layout. The important metabolic pathways and functional categories of positively and negatively co-expressed genes (PCoEGs and NCoEGs) with GhTLP11A and GhTLP12A.1 were estimated, using the MapMan software 3.5.1 version (Thimm et al., 2004). The average statistical test accompanied by the Benjamini Hochberg (multiple correction tests) was employed to know the functional categories.

Identification of cis-Regulatory Elements of GhTLP Genes and Homology Modeling of the Highly Expressed GhTLPs

The 2-Kb sequences upstream of GhTLP genes was analyzed for cis-regulatory elements by using the PlantCARE database (Lescot et al., 2002) by the “Signal Scan Search” program. The three-dimensional structure of GhTLPs was obtained through homology modeling, using the Phyre2 (Kelley et al., 2015) server. Structure visualization of GhTLPs was carried out with Chimera 1.11.1 version (Pettersen et al., 2004).

Statistical Analysis

The statistical analysis of qRT-PCR was carried out, using GraphPad Prism version 8.4.3 software, with two-tailed Student's T-tests in triplicate sample repeats.

Results

Genome-Wide Identification of TLP Genes in Gossypium Species

The protein sequences of G. arboreum, G. raimondii, G. hirsutum, and G. barbadense were utilized to identify cotton TLP genes. The identified TLP genes were confirmed through conserved domain searches. A total of 105 TLPs, i.e., 19 GaTLPs (G. arboreum), 18 GrTLPs (G. raimondii), 33 GhTLPs (G. hirsutum), and 35 GbTLPs (G. barabadense) were determined (Table 1). The length of cotton TLP proteins varied from 68 to 425 amino acid residues (aa) in G. arboreum, 320 to 519 aa in G. raimondii, 206 to 514 aa in G. hirsutum, and 206 to 494 aa in G. barabadense. The theoretical isoelectric point (pI) ranged from 5.1 to 9.7, 9 to 9.3, 8.3 to 9.8, and 7.6 to 9.8; the molecular weight ranged approximately from 8 to 48 kDa, 36 to 58 kDa, 23 to 58 kDa, and 23 to 55 kDa, and the number of introns ranged from 0 to 7, 4 to 8, 2 to 8, and 0 to 8 in G. arboreum, G. raimondii, G. barabadense, and G. hirsutum, respectively. Most of the identified cotton TLP proteins were predicted to be nuclear localized, and others were likely localized in extracellular space, mitochondrion, and on plasma membrane. For the annotation of 105 identified cotton TLP genes, the A. thaliana nomenclature system was pursued with numbers representing the highest sequence similarity with the corresponding AtTLP orthologous. Accordingly, the 19 GaTLPs were named as GaTLP2-GaTLP12 (GaTLP2.1, 2.2, 2.3, 2.4, 5.1, 5.2, 5.3, 6.1, 6.2, 6.3, 7.1, 7.2, 8, 11, 12.1, 12.2, 12.3, 12.4, and 12.5), GrTLPs were classified as GrTLP2-GrTLP12 (GrTLP2.1, 2.2, 2.3, 5.1, 5.2, 5.3, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3, 8, 11, 12.1, 12.2, 12.3, and 12.4). Similarly, GhTLPs and GbTLPs were named GhTLP2-GhTLP12A/D and GbTLP2-GbTLP12A/D (A: At subgenome and D: Dt subgenome). The reciprocal blast demonstrated that cotton TLP genes showed greater homology with AtTLP2, AtTLP5, AtTLP6, AtTLP7, and AtTLP8 as compared with AtTLP1, AtTLP3, AtTLP4, AtTLP9, AtTLP10, and AtTLP11, respectively (Supplementary Table 1).
Table 1

Characteristics of TLP genes in cotton.

Gene nameGene idsChromosome locationLengthMolecular weight (Da)pINo. of intronSubcellular localizationNegatively charged residues (Asp + Glu)Positively charged residues (Arg + Lys)Instability indexStabilityAliphatic indexGrand average of hydropathicity (GRAVY)
GaTLP5.1Cotton_A_33944CA_chr10:6955456:6958125:–42447579.49.473Nuclear415961.21Unstable76.13−0.393
GaTLP11Cotton_A_08277CA_chr4:25409214:25412052:–40545396.099.384Nuclear345257.93Unstable75.06−0.32
GaTLP5.2Cotton_A_33887CA_chr2:80696652:80698513:+42147075.129.595Nuclear365754.47Unstable79.64−0.284
GaTLP5.3Cotton_A_00581CA_chr5:8963634:8965537:+42547770.929.665Nuclear386060.61Unstable77.53−0.327
GaTLP2.1Cotton_A_02013CA_chr6:29518659:29520756:+41446389.29.154Nuclear415361.74Unstable78.43−0.35
GaTLP12.2Cotton_A_34069CA_chr1:125845208:125847603:+37641970.329.444Nuclear284555.82Unstable80.96−0.276
GaTLP12.3Cotton_A_36302CA_chr11:88651667:88653976:+39343586.259.555Nuclear305159.31Unstable79.41−0.223
GaTLP6.2Cotton_A_06847CA_chr7:115616747:115618648:–40745540.239.565Nuclear355562.21Unstable76.88−0.335
GaTLP6.1Cotton_A_13768CA_chr8:15732526:15733950:–39343846.499.744Nuclear325654.37Unstable74.2−0.38
GaTLP12.1Cotton_A_29560CA_chr5:44377749:44379416:+34338292.169.484Nuclear264249.07Unstable82.48−0.237
GaTLP7.1Cotton_A_20151CA_chr3:24210746:24214201:+36640348.799.334Nuclear314655.84Unstable70.66−0.37
GaTLP7.2Cotton_A_29419CA_chr7:92995264:92998667:–38943679.89.163Nuclear405360.89Unstable66.97−0.49
GaTLP2.2Cotton_A_17032CA_chr9:56110460:56112674:–40345105.888.814Nuclear414962.52Unstable78.41−0.308
GaTLP6.3Cotton_A_18767CA_chr10:46424178:46425914:+42346938.879.665Nuclear366159.72Unstable74.04−0.418
GaTLP2.3Cotton_A_14780CA_chr6:87318145:87320189:–40345344.189.68Nuclear395868.88Unstable84.69−0.375
GaTLP8Cotton_A_17106CA_chr10:88944897:88947133:+41746630.129.34Nuclear395641.81Unstable71.58−0.56
GaTLP2.4Cotton_A_01603CA_ch1:135480257:135480915:+19722296.628.564Plasma Membrane151858.85Unstable77.26−0.042
GaTLP12.4Cotton_A_22539CA_chr10:13654011:13655817:+27330954.99.715Nuclear183642.07Unstable85.68−0.27
GaTLP12.5Cotton_A_22538CA_chr10:13655845:13656051:+687867.15.15Extracellular5636.17Stable89.090.213
GrTLP7.3Gorai.002G054300.1GR_chr02:4808034:4811930:+38943453.599.164Nuclear395262.86Unstable69.23−0.444
GrTLP5.2Gorai.002G207100.1GR_chr02:55351835:55355046:–42147047.059.640Nuclear365755.11Unstable79.41−0.295
GrTLP12.4Gorai.004G085800.1GR_chr04:10773031:10775214:+39744814.749.435Nuclear325252.32Unstable80.76−0.265
GrTLP7.2Gorai.004G271800.1GR_chr04:60665497:60669265:+38442453.189.34Nuclear344957.19Unstable68.62−0.437
GrTLP12.1Gorai.005G057200.1GR_chr05:5803516:5805608:+38442875.259.273Nuclear344848.73Unstable82.55−0.305
GrTLP5.3Gorai.005G259600.1GR_chr05:63517604:63520935:+42547759.919.665Nuclear386062.8Unstable75.69−0.348
GrTLP12.3Gorai.006G075400.1GR_chr06:29765762:29768708:+41346131.899.633Nuclear335461.19Unstable76.95−0.306
GrTLP2.2Gorai.007G050300.1GR_chr07:3545878:3549332:+41446419.219.194Nuclear415362.06Unstable78.19−0.352
GrTLP11Gorai.007G131500.1GR_chr07:10627794:10632340:–40545454.249.385Nuclear345257.21Unstable75.09−0.313
GrTLP2.3Gorai.008G067400.1GR_chr08:10983416:10986428:+41746564.649.188Nuclear415463.56Unstable83.26−0.294
GrTLP8Gorai.009G121900.1GR_chr09:9059982:9062702:+32036316.039.635Mitochondrial264637.99Unstable76.19−0.452
GrTLP6.1Gorai.009G201100.1GR_chr09:15567676:15569918:+40945707.869.84Nuclear325952.16Unstable77.02−0.336
GrTLP5.1Gorai.009G254100.1GR_chr09:20878526:20882901:+40044857.669.755Nuclear335762.18Unstable78.75−0.352
GrTLP6.3Gorai.009G272900.1GR_chr09:22824350:22826996:+51957823.549.694Nuclear447261.61Unstable78.94−0.287
GrTLP7.1Gorai.009G367200.1GR_ch09:49203689:49207350:+39543456.419.25Nuclear395363.6Unstable68.68−0.414
GrTLP6.2Gorai.010G009400.1GR_chr10:705149:707577:+41346131.899.634Nuclear335461.19Unstable76.95−0.306
GrTLP2.1Gorai.011G101400.1GR_chr11:11429229:11432319:+40945774.6694Nuclear415160.14Unstable78.92−0.325
GrTLP12.2Gorai.011G185600.1GR_chr11:44201423:44204423:+37641963.249323Nuclear294456.39Unstable80.19−0.294
GhTLP7A.2Ghir_A01G004680.1GhA:chr01:6048661:6051724:+38943580.669.164Nuclear405362.52Unstable66.97−0.49
GhTLP5A.1Ghir_A01G016830.1GhA:chr01:107578137:107581303:–42046917.959.594Nuclear355654.56Unstable79.83−0.274
GhTLP5A.2Ghir_A03G022780.1GhA:chr03:112389242:112392899:+42547784.99.665Nuclear386060.77Unstable76.61−0.34
GhTLP8AGhir_A05G012060.1GhA:chr05:11062311:11064976:+41746640.159.33Nuclear395640.21Unstable71.58−0.546
GhTLP6A.1Ghir_A05G019690.1GhA:chr05:18821376:18823547:+40645493.579.833Nuclear325952.23Unstable74.21−0.373
GhTLP6A.3Ghir_A05G026280.1GhA:chr05:27453966:27457122:+41846601.689.734Nuclear356256.1Unstable74−0.417
GhTLP6A.2Ghir_A06G000860.1GhA:chr06:869775:871665:+40745473.139.513Nuclear355462.9Unstable77.13−0.332
GhTLP7A.1Ghir_A08G024660.1GhA:chr08:120995211:120998711:+38142239.99.173Nuclear364956.33Unstable67.87−0.449
GhTLP12A.2Ghir_A09G006930.1GhA:chr09:53138712:53141783:+39343627.339.53Nuclear305057.78Unstable77.68−0.217
GhTLP2A.1Ghir_A10G009400.1GhA:chr10:18951476:18953688:–38343070.528.613Nuclear394463.11Unstable81.23−0.283
GhTLP12A.1Ghir_A10G010240.1GhA:chr10:21759983:21762837:–37642012.429.384Nuclear284555.28Unstable80.96−0.262
GhTLP2A.2Ghir_A11G004920.1GhA:chr11:4252195:4256299:+41446399.229.193Nuclear415363.16Unstable79.37−0.346
GhTLP11AGhir_A11G012590.1GhA:chr11:12921875:12926492:–31134589.589.272Nuclear243754.41Unstable76.5−0.316
GhTLP2A.3Ghir_A12G006600.1GhA:chr12:15242985:15244849:+39344061.579.088Nuclear395066.98Unstable81.37−0.317
GhTLP7D.3Ghir_D01G004710.1GhD:chr01:5448473:5453121:+41646531.159.264Nuclear435857.73Unstable68.73−0.454
GhTLP5D.2Ghir_D01G018390.1GhD:chr01:55366719:55369873:–42147017.029.644Nuclear365755.57Unstable79.64−0.289
GhTLP12D.1Ghir_D02G005060.1GhD:chr02:6472304:6473565:+27730665.329.624Nuclear193345.77Unstable84.19−0.17
GhTLP5D.3Ghir_D02G024220.1GhD:chr02:69191552:69195172:+42547759.919.665Nuclear386062.8Unstable75.69−0.348
GhTLP2D.4Ghir_D03G001340.1GhD:chr03:950448:951272:+20622753.188.353Plasma Membrane182057.67Unstable88.160.034
GhTLP8DGhir_D05G011810.1GhD:chr05:10003116:10005943:+41746623.229.394Nuclear365541.38Unstable72.06−0.514
GhTLP6D.1Ghir_D05G019700.1GhD:chr05:17145578:17150416:+41446373.89.844Extracellular305847.79Unstable81.74−0.246
GhTLP5D.1Ghir_D05G024720.1GhD:chr05:22834369:22839003:+42447545.389.475Nuclear415961.21Unstable77.05−0.391
GhTLP6D.3Ghir_D05G026320.1GhD:chr05:24926304:24929436:+51457545.289.734Nuclear447359.56Unstable77.06−0.301
GhTLP7D.1Ghir_D05G034870.1GhD:chr05:53632702:53635785:+38141802.589.335Nuclear355168.57Unstable70.21−0.393
GhTLP6D.2Ghir_D06G000730.1GhD:chr06:707815:710329:+406453529.594Nuclear345462.54Unstable76.82−0.325
GhTLP12D.4Ghir_D08G007820.1GhD:chr08:11289331:11291206:+30434460.849.685Nuclear224251.33Unstable78.19−0.337
GhTLP7D.2Ghir_D08G025550.1GhD:chr08:67215180:67218700:+38342389.139.394Nuclear345057.46Unstable69.32−0.449
GhTLP12D.3Ghir_D09G006640.1GhD:chr09:30300709:30303722:+39343548.169.64Nuclear295157.01Unstable78.17−0.231
GhTLP2D.1Ghir_D10G009850.1GhD:chr10:12012251:12016560:+41346283.138.734Nuclear455358.13Unstable78.43−0.327
GhTLP12D.2Ghir_D10G017760.1GhD:chr10:48294734:48297960:+37641991.299.324Nuclear294456.57Unstable80.72−0.289
GhTLP2D.2Ghir_D11G004820.1GhD:chr11:3929455:3933050:+41446393.139.194Nuclear415363.59Unstable77.25−0.363
GhTLP11DGhir_D11G012540.1GhD:chr11:11519838:11525175:–40545398.179.374Nuclear345255.54Unstable74.84−0.312
GhTLP2D.3Ghir_D12G006610.1GhD:chr12:11431068:11434161:+41746744.749.288Nuclear415563.43Unstable81.85−0.333
GbTLP7A.3Gbar_A01G004500.1GbA:chr01:5828899:5832744:+38943607.699.164Nuclear405362.02Unstable66.97−0.497
GbTLP5A.2Gbar_A01G017260.1GbA:chr01:105676110:105677971:–42147105.219.595Nuclear365754.47Unstable79.64−0.278
GbTLP12A.1Gbar_A02G004650.1GbA:chr02:6142192:6146051:+38442685.159.373Nuclear324849.99Unstable81.54−0.239
GbTLP5A.3Gbar_A03G022950.1GbA:chr03:104669910:104676138:+42547726.829.614Nuclear385960.89Unstable76.61−0.337
GbTLP7A.1Gbar_A04G003920.1GbA:chr04:9544967:9548909:–38742650.79.263Nuclear375266.36Unstable69.61−0.406
GbTLP8A.2Gbar_A05G011460.1GbA:chr05:10538169:10540756:+41746624.159.315Nuclear395640.21Unstable71.58−0.536
GbTLP6A.1Gbar_A05G019030.1GbA:chr05:18030893:18033198:+40945746.859.834Nuclear325953.05Unstable75.82−0.354
GbTLP6A.3Gbar_A05G025320.1GbA:chr05:26213749:26217047:+49455077.289.564Nuclear426659.29Unstable76.42−0.29
GbTLP8A.1Gbar_A05G043170.1GbA:Scaffold3378:17019:19596:+41746623.229.396Nuclear365541.38Unstable72.06−0.514
GbTLP5A.1Gbar_A05G043600.1GbA:Scaffold91:25302:29797:+39844752.259.494Nuclear385664.05Unstable75.98−0.373
GbTLP6A.2Gbar_A06G000760.1GbA:chr06:761708:763929:+41446249.049.563Nuclear345460.4Unstable77−0.298
GbTLP7A.2Gbar_A08G025570.1GbA:chr08:118191404:118195120:+38442596.329.233Nuclear365057.61Unstable68.36−0.449
GbTLP12A.3Gbar_A09G007080.1GbA:chr09:50278877:50281768:+39343609.39.54Nuclear305057.98Unstable78.68−0.21
GbTLP2A.1Gbar_A10G010240.1GbA:chr10:18997847:19001328:–40945839.818.894Nuclear415063.81Unstable79.17−0.296
GbTLP12A.2Gbar_A10G011070.1GbA:chr10:21723033:21724859:–28731681.299.491Nuclear193349.42Unstable71.78−0.366
GbTLP2A.2Gbar_A11G004480.1GbA:chr11:3813200:3817322:+41446399.229.192Nuclear415363.16Unstable79.37−0.346
GbTLP11AGbar_A11G012270.1GbA:chr11:12405505:12409866:–40545396.099.384Nuclear345257.93Unstable75.06−0.32
GbTLP2A.3Gbar_A12G006580.1GbA:chr12:15078028:15080997:+40645532.29.248Nuclear415469.05Unstable81.18−0.354
GbTLP7D.3Gbar_D01G004700.1GbD:chr01:5641258:5644642:+38943453.599.165Nuclear395262.86Unstable69.23−0.444
GbTLP5D.2Gbar_D01G018480.1GbD:chr01:55332523:55334385:–42147065.079.644Nuclear365755.59Unstable78.95−0.292
GbTLP12D.1Gbar_D02G005180.1GbD:chr02:6638191:6639988:+38543023.529.234Nuclear354950.31Unstable84.36−0.267
GbTLP5D.3Gbar_D02G024820.1GbD:chr02:67069705:67071604:+42547836.029.585Nuclear385961.97Unstable75.69−0.333
GbTLP2D.4Gbar_D03G001400.1GbD:chr03:914036:914860:+20622731.127.624Plasma Membrane192057.95Unstable88.160.033
GbTLP6D.1Gbar_D05G019720.1GbD:chr05:17031049:17033198:+40945823.049.883Nuclear326151.17Unstable77.73−0.341
GbTLP5D.1Gbar_D05G024660.1GbD:chr05:22759705:22764214:+42447545.389.475Nuclear415961.21Unstable77.05−0.391
GbTLP6D.3Gbar_D05G026200.1GbD:chr05:24657207:24660514:+42046692.689.757Nuclear356159.71Unstable74.55−0.393
GbTLP7D.1Gbar_D05G034930.1GbD:chr05:53368297:53371951:+39543414.339.25Nuclear395365.22Unstable68.2−0.42
GbTLP6D.2Gbar_D06G000830.1GbD:chr06:735015:737233:+41346105.859.633Nuclear335459.66Unstable77.19−0.298
GbTLP7D.2Gbar_D08G026180.1GbD:chr08:64380995:64384741:+38442454.179.244Nuclear354956.8Unstable68.62−0.437
GbTLP12D.3Gbar_D09G006810.1GbD:chr09:28809892:28812761:+39343591.229.594Nuclear295156.39Unstable77.91−0.242
GbTLP2D.1Gbar_D10G009570.1GbD:chr10:11447225:11450694:+40945693.548.925Nuclear415061.68Unstable79.41−0.309
GbTLP12D.2Gbar_D10G017400.1GbD:chr10:45988943:45991475:+37641977.279.323Nuclear294456.57Unstable80.45−0.286
GbTLP2D.2Gbar_D11G004820.1GbD:chr11:3795097:3797188:+41446427.159.194Nuclear415363.59Unstable76.3−0.366
GbTLP11DGbar_D11G012860.1GbD:chr11:11271540:11276030:–40545398.179.375Nuclear345255.54Unstable74.84−0.312
GbTLP2D.3Gbar_D12G006630.1GbD:chr12:11251940:11254140:+41746744.749.283Nuclear415563.43Unstable81.85−0.333
Characteristics of TLP genes in cotton.

Domain Structure Analysis of TLP Protein Family Members in Cotton

All the cotton TLP proteins were predicted to contain the tubby domain at the C-terminal end. With the exception of some (GaTLP2.4, GaTLP7.1, GaTLP7.2, GaTLP8, GaTLP12.1, GaTLP12.4, GaTLP12.5, GrTLP7.1, GrTLP7.2, GrTLP7.3, GrTLP8, GhTLP8A, GhTLP2D.1, GhTLP2D.4, GhTLP2A.1, GhTLP7A.1, GhTLP7A.2, GhTLP7D.1, GhTLP7D.2, GhTLP11A, GhTLP12A.1, GhTLP12D.1, GhTLP12D.2, GhTLP12D.4, GbTLP2D.4, GbTLP7A.1, GbTLP7A.2, GbTLP7A.3, GbTLP7D.1, GbTLP7D.2, GbTLP7D.3, GbTLP8A.1, GbTLP8A.2, GhTLP8D, GbTLP12A.1, GbTLP12A.2, and GbTLP12.1), the majority of the proteins encoded by the cotton TLP genes also possessed an F-box domain (Supplementary Figure 1). In A. thaliana, the TLP8 also has the tubby domain at the C-terminal side but lacked F-box at the N-terminal side (Lai et al., 2004). This finding showed that cotton TLP proteins comprised the same domain arrangements as reported earlier (Lai et al., 2004).

Multiple Sequence Alignment (MSA) and Evolutionary Analysis

The multiple sequence alignment (MSA) of all cotton TLP genes showed a highly conserved C-terminal tubby domain and F-box at the C-terminal (Supplementary Figure 2). To determine the evolutionary relationship between TLP proteins of cotton and A. thaliana, MSA of 105 identified cotton TLPs with 11 A. thaliana TLPs was carried out. Furthermore, a phylogenetic tree was constructed, using the maximum likelihood tree (ML) method. On the basis of phylogenetic relationships, cotton TLP genes were clustered into eight major groups (Groups 1–8), each containing 21, 18, 17, 17, 16, 6, 6, and 4 TLPs, respectively (Figure 1).
Figure 1

Phylogenetic relationship of Gossypium tubby-like proteins (TLPs) from A. thaliana. The phylogenetic ML tree was built, using the amino acid sequence with 1,000 bootstrap value with MEGA 7.0 software. At, A. thaliana; Ga, Gossypium arboretum; Gr, Gossypium raimondii; Gh, Gossypium hirsutum; Gb, Gossypium barbadense.

Phylogenetic relationship of Gossypium tubby-like proteins (TLPs) from A. thaliana. The phylogenetic ML tree was built, using the amino acid sequence with 1,000 bootstrap value with MEGA 7.0 software. At, A. thaliana; Ga, Gossypium arboretum; Gr, Gossypium raimondii; Gh, Gossypium hirsutum; Gb, Gossypium barbadense. The majority of the cotton TLPs were found to be clustered with A. thaliana TLPs, the exception being groups 4 and 8 cotton TLPs (GaTLPs12, GrTLPs12, GhTLPs12A, GhTLPs12D, GbTLPs12A, and GbTLPs12D). To gain insight into the groups 4 and 8 TLPs, a phylogenetic tree of these TLPs with other eudicots (Ranunculaceae, Brassicaceae, Caricaceae, Cucurbitaceae, Rutaceae, Myrtaceae, Rosaceae, Fabaceae, Euphorbiaceae, Scrophulariaceae, Salicaceae, Solanaceae, Malvaceae, and Vitaceae) was generated. Also, synteny analysis of cotton TLPs with those from A. thaliana (Brassicaceae) and T. cacao (Malvaceae) was carried out. This revealed that groups 4 and 8 TLPs were greatly conserved among G. arboreum (A genome), G. raimondii (D genome), G. hirsutum (At and Dt sub-genomes), G. barbadense (At and Dt sub-genomes), and T. cacao genomes. The groups 4 and 8 cotton TLP genes were conserved among closely related species with Gossypium (Byng et al., 2016). On the other hand, no conserved homologs were found in Brassicaceae (Figure 2, Supplementary Figure 3, and Supplementary Table 2). The phylogenetic tree of groups 4 and 8 cotton TLP gene family members (GaTLPs12, GrTLPs12, GhTLPs12A, GhTLPs12D, GbTLPs12A, and GbTLPs12D) also revealed that groups 4 and 8 cotton TLP genes were not clustered with Brassicaceae family members (Supplementary Figure 3). This finding showed that, after the divergence from the common ancestor, the groups 4 and 8 cotton TLP homologs might have been evicted from the Brassicaceae family.
Figure 2

Synteny analysis of cotton TLP genes with T. cacao (Malvaceae) and A. thaliana (Brassicaceae) genomes. A syntenic representation of the identified TLP genes between (A) G. arboreum (Chr01 to Chr13) vs. T. cacao (B) G. raimondii (Chr01 to Chr13) vs. T. cacao (C) G. hirsutum (At1 to At13 and Dt1 to Dt13) vs. T. cacao (1–10) (D) G. barbadense (At1 to At13 and Dt1 to Dt13) vs. T. cacao (1–10) (E) G. arboreum (Chr01 to Chr13) vs. A. thaliana (F) G. raimondii (Chr01 to Chr13) vs. A. thaliana (G) G. hirsutum (At1 to At13 and Dt1 to Dt13) vs. A. thaliana (H) G. barbadense (At1 to At13 and Dt1 to Dt13) vs. A. thaliana chromosomes. Red lines indicate duplicated orthologous TLP genes, and cyan and green lines show the paralogous TLPs genes in G. hirsutum, G. raimondii, G. arboreum, T. cacao, and A. thaliana.

Synteny analysis of cotton TLP genes with T. cacao (Malvaceae) and A. thaliana (Brassicaceae) genomes. A syntenic representation of the identified TLP genes between (A) G. arboreum (Chr01 to Chr13) vs. T. cacao (B) G. raimondii (Chr01 to Chr13) vs. T. cacao (C) G. hirsutum (At1 to At13 and Dt1 to Dt13) vs. T. cacao (1–10) (D) G. barbadense (At1 to At13 and Dt1 to Dt13) vs. T. cacao (1–10) (E) G. arboreum (Chr01 to Chr13) vs. A. thaliana (F) G. raimondii (Chr01 to Chr13) vs. A. thaliana (G) G. hirsutum (At1 to At13 and Dt1 to Dt13) vs. A. thaliana (H) G. barbadense (At1 to At13 and Dt1 to Dt13) vs. A. thaliana chromosomes. Red lines indicate duplicated orthologous TLP genes, and cyan and green lines show the paralogous TLPs genes in G. hirsutum, G. raimondii, G. arboreum, T. cacao, and A. thaliana.

Phylogenetic Tree, Encoded Protein Motifs, and Gene Structure Study of Gossypium TLP Genes

The evolutionary associations among the Gossypium TLPs were deduced by building a separate phylogenetic tree, using the ML method with 1,000 bootstraps value. On the basis of the topology of the tree, paralogous nodes, organization of exon–intron, and conservation of motifs, the cotton TLP genes were categorized into seven groups with higher bootstrap value. The proteins in each group had a high identity (>70%) among orthologous members but differed considerably from the members of the other groups, suggesting a divergent evolution from a common ancestor or origin from gene duplication events (Figure 3A). To determine the consistency of the exon-intron pattern in the phylogenetic groups, a gene structure comparison of the cotton TLPs was carried out. Intron number varied from 3 to 8 (GaTLPs), 0 to 8 (GrTLPs), 2 to 8 (GhTLPs), and 1 to 8 (GbTLPs) (Figure 3B and Table 1). The majority of the cotton TLP genes within the same group showed a similar pattern of exon-intron distribution. To study the conserved motif organization in TLP proteins, the MEME tool was employed for the analysis followed by annotation through InterProScan. A total of 15 conserved motifs were identified in the cotton TLP proteins. Only seven of these, motifs 1–7 (with the exception of motif 3), were found to form parts of the tubby domains. Motif 3 was annotated as the F-box domain (Figure 3D and Supplementary Table 3). Motif 1 was found in all cotton TLP proteins, except GaTLP12.4, GhTLP12D.4, and GhTLP7D.1. The majority of the cotton TLPs with close evolutionary relatives had similar motif composition and were assumed to have a similar function (Figure 3C).
Figure 3

Phylogenetic tree, gene structure, and conserved protein motifs analysis of TLPs in G. arboreum, G. raimondii, G. hirsutum, and G. barbadense. (A) Phylogenetic tree of G. arboreum, G. raimondii, G. hirsutum, and G. barbadense TLPs built with the ML (maximum likelihood) method and using 1,000 bootstrap values. Different colors of lines denoted the different species of Gossypium (Blue, G. arboreum; Red, G. raimondii; Black, G. hirsutum; and cyan, G. barbadense). Groups 1–7 distributed in green, yellow, red, royal, violet, orange, and sky blue colors, respectively. (B) Showing the exon-intron organization of TLPs genes in Gossypium (Pink boxes, exon, and black lines, introns). (C) Identified conserved protein motifs in the Gossypium TLPs, and each motif is indicated with a specific color. (D) The logos are given for functionally annotated motifs only, where the heights of logo depict the degree of conservation of amino acid within the motif. The motifs order corresponds to their positions in protein sequence. Motif descriptions are given in Supplementary Table 3.

Phylogenetic tree, gene structure, and conserved protein motifs analysis of TLPs in G. arboreum, G. raimondii, G. hirsutum, and G. barbadense. (A) Phylogenetic tree of G. arboreum, G. raimondii, G. hirsutum, and G. barbadense TLPs built with the ML (maximum likelihood) method and using 1,000 bootstrap values. Different colors of lines denoted the different species of Gossypium (Blue, G. arboreum; Red, G. raimondii; Black, G. hirsutum; and cyan, G. barbadense). Groups 1–7 distributed in green, yellow, red, royal, violet, orange, and sky blue colors, respectively. (B) Showing the exon-intron organization of TLPs genes in Gossypium (Pink boxes, exon, and black lines, introns). (C) Identified conserved protein motifs in the Gossypium TLPs, and each motif is indicated with a specific color. (D) The logos are given for functionally annotated motifs only, where the heights of logo depict the degree of conservation of amino acid within the motif. The motifs order corresponds to their positions in protein sequence. Motif descriptions are given in Supplementary Table 3.

Chromosomal Location and Gene Duplication Events of Gossypium TLP Genes

To identify the chromosomal localization of GaTLP, GrTLP, GhTLP, and GbTLP genes in the cotton genome, the BLASTN search was performed. GaTLP genes were distributed on chromosomes 1–11 (Figure 4A), GrTLP genes were localized across chromosomes 2 and 4–11 (Figure 4B), GhTLP genes were located on At chromosomes 1, 3, 5, 6, 8–12 and on Dt chromosomes 1, 2, 3, 5, 6, 8–12 with 14 and 19 genes, respectively (Figures 4C,D), GbTLP genes were distributed on At chromosomes 1–6, 8–12, and Dt chromosomes 1, 2, 3, 5, 6, 8–12 with 16 and 17 genes, respectively (Figures 4E,F). Given the expansion of the number of cotton TLP genes, gene duplication events were next studied. High amino acid sequence similarities were detected among protein encoded by TLP genes, as five pairs of paralogous TLP genes were identified in diploid cotton (G. arboreum and G. raimondii) (Figures 4A,B), while seven pairs of the paralogous genes were determined in G. hirsutum (At and Dt sub-genomes) and 12 in G. barbadense (At and Dt sub-genomes) (Figures 4C–F). These paralogous TLP gene pairs existed in the same group, and most of them showed >70% sequence similarities between the proteins encoded by these TLP gene pairs. Except for the GaTLP2.1/GaTLP2.3 gene pair, which was tandemly arranged, all other paralogous gene pairs were placed on distinct chromosomes, providing evidence that the expansion of the cotton the TLP family was mainly due to segmental duplication, not tandem duplication. In G. arboreum, four segmental gene duplications (GaTLP5.2/5.3, GaTLP2.1/2.2, GaTLP2.2/2.3, and GaTLP5.1/5.3) and one tandem duplication (GaTLP2.1/2.3) were occurred from 15.17 to 50.85 MYA. While five segmental duplications (GrTLP5.2/5.3, GrTLP2.1/2.3, GrTLP2.2/2.3, GrTLP2.2/2.1, and GrTLP7.3/7.2) were found in G. raimondii from 11.95 to 18.88 MYA (Table 2). In G. hirsutum, seven segmental gene duplications (GhTLP5A.1/5A.2, GhTLP2A.2/2A.3, GhTLP7A.2/7A.1, GhTLP5D.2/5D.3, GhTLP5D.1/5D.3, GhTLP2D.1/2D.2, and GhTLP7D.3/7D.2) occurred in At and Dt subgenomes from 14.77 to 53.60 MYA and 12 segmental duplication (GbTLP8A.1/8A.2, GbTLP5A.2/5A.3, GbTLP2A.1/2A.3, GbTLP2A.1/2A.2, GbTLP2A.2/2A.3, GbTLP7A.2/7A.1, GbTLP5D.2/5D.3, GbTLP2D.1/2D.3, GbTLP2D.1/2D.2, GbTLP2D.2/2D.3, GbTLP7D.3/7D.2, and GbTLP5D.1/5D.3 occurred in At and Dt subgenomes of G. barbadense from 0.85 to 52.6 MYA (Table 2). Most of the paralogous gene pairs showed recent duplication events (13–20 MYA) (Li et al., 2014). The non-synonymous and synonymous substitution ratios (Ka/Ks ratios) for the duplicated Gossypium TLP gene pairs were consistently <1 (Table 2). Therefore, duplicated cotton TLP genes experienced intense purifying selection, which contributes to conserving their functions and reveals that not much diversion had taken place during the course of evolution (Gabaldon and Koonin, 2013). The orthologous gene pair, having <90% sequence identity in cDNA, and amino acid sequence were analyzed further for evolutionary studies (Supplementary Table 4). The selection pressure and the potential function of Gossypium TLPs were examined by computing the Ka, Ks, and Ka/Ks ratios among orthologous (A vs. D, At vs. A, Dt vs. D, At vs. At, and Dt vs. Dt) and within the homeologs (At vs. Dt). Interestingly, the Ka value of cotton orthologous TLP2 (GaTLP2/GrTLP2), TLP5 (GaTLP5/GrTLP5), TLP6 (GaTLP6/GrTLP6), TLP7 (GaTLP7/GrTLP7), and TLP12 (groups 4 and 8 TLPs) (GaTLP12/GrTLP12, GhTLP12A/GhTLP12D, and GbTLP12A/GbTLP12D) genes were greater in inter-genomes (A vs. D and At vs. Dt) in comparison to other orthologous TLP gene pairs, suggesting that these pairs experienced faster evolution. Subsequently, during the course of evolution, orthologous TLP gene pairs often retain their corresponding function in different species (Gabaldon and Koonin, 2013). A total of 153 out of 172 orthologous TLP gene pairs have a Ka/Ks ratio <1, and the rest 16 have Ka/Ks >1 in both diploid and allotetraploid species, indicating a greater number of the TLP orthologous genes pairs experienced purifying selection pressure, and some of them experienced Darwinian selection pressure (Figure 4G and Supplementary Table 5). The Ka/Ks values of TLP2, 5, 6, 7, 8, 11, and 12 were higher in A vs. D, At vs. Dt, and Dt vs. D. Therefore, these TLPs experienced greater evolutionary pressure in diploid as well as in allotetraploid cotton and might have evolved rapidly in D subgenome as compared with A subgenome.
Figure 4

Chromosomal arrangement and paralogous gene duplication of TLP genes on the four genomes of Gossypium. A physical map of chromosomes demonstrates the position of TLP genes on A, D, and AD genomes, respectively. The paralogous TLP genes (segmental gene duplication) linked with blue lines. The locations of each TLP genes are represented through the horizontal gray lines. The numbers of chromosomes are showed at the top. (A,B) Represent the distribution of TLP genes in G. arboreum and G. raimondii genomes, (C,D) showed the arrangement of TLP genes in G. hirsutum genome, and (E,F) represented the TLP genes in G. barbadense genome, respectively. The scale is in mega bases (Mb). (G) The orthologous TLP duplicated genes pairs among four cotton genomes (G. arboreum, G. raimondii, G. hirsutum, and G. barbadense) were also shown with blue lines, and CIRCOS was used to visualize this plot. The gene name and the chromosome of G. arboreum, G. raimondii, G. hirsutum, and G. barbadense were shown with green, violet, pink, and cyan colors, respectively.

Table 2

The date of duplication and Ka/Ks ratios for duplicate TLP genes in G. arboreum, G. raimondii, G. barbadense, and G. hirsutum.

Duplicated TLP gene1Duplicated TLP gene2KaKsKa/KsDate (MYA) T = Ks/2λSelective pressureDuplicate type
GaTLP5.2GaTLP5.30.04630.61750.074920.58Purifying selectionSegmental
GaTLP2.1GaTLP2.30.1260.45510.27715.17Purifying selectionTandem
GaTLP2.1GaTLP2.20.07390.53490.138217.83Purifying selectionSegmental
GaTLP2.2GaTLP2.30.11610.58380.198919.46Purifying selectionSegmental
GaTLP5.1GaTLP5.30.09891.52630.064850.87Purifying selectionSegmental
GrTLP5.2GrTLP5.30.04760.56250.084718.75Purifying selectionSegmental
GrTLP2.1GrTLP2.30.07020.45510.154215.17Purifying selectionSegmental
GrTLP2.2GrTLP2.30.08020.35870.223511.95Purifying selectionSegmental
GrTLP2.2GrTLP2.10.07030.50490.139316.83Purifying selectionSegmental
GrTLP7.3GrTLP7.20.140.56660.247118.88Purifying selectionSegmental
GhTLP5A.1GhTLP5A.20.04890.6020.081320.06Purifying selectionSegmental
GhTLP2A.2GhTLP2A.30.11610.44310.26214.77Purifying selectionSegmental
GhTLP7A.2GhTLP7A.10.15160.66550.227822.18Purifying selectionSegmental
GhTLP5D.2GhTLP5D.30.04760.5560.085618.53Purifying selectionSegmental
GhTLP5D.1GhTLP5D.30.09551.6080.059453.6Purifying selectionSegmental
GhTLP2D.1GhTLP2D.20.14780.61540.240120.51Purifying selectionSegmental
GhTLP7D.3GhTLP7D.20.15520.580.267619.33Purifying selectionSegmental
GbTLP8A.1GbTLP8A.20.00790.02570.30790.85Purifying selectionSegmental
GbTLP5A.2GbTLP5A.30.04890.60160.081220.05Purifying selectionSegmental
GbTLP2A.1GbTLP2A.30.07260.51690.140417.23Purifying selectionSegmental
GbTLP2A.1GbTLP2A.20.07280.51320.141917.1Purifying selectionSegmental
GbTLP2A.2GbTLP2A.30.08540.38750.220412.91Purifying selectionSegmental
GbTLP7A.2GbTLP7A.10.13450.53740.250317.91Purifying selectionSegmental
GbTLP5D.2GbTLP5D.30.04980.56170.088618.72Purifying selectionSegmental
GbTLP2D.1GbTLP2D.30.06480.45430.142615.14Purifying selectionSegmental
GbTLP2D.1GbTLP2D.20.07030.49490.14216.49Purifying selectionSegmental
GbTLP2D.2GbTLP2D.30.07860.36750.21412.25Purifying selectionSegmental
GbTLP7D.3GbTLP7D.20.14450.5730.252219.1Purifying selectionSegmental
GbTLP5D.1GbTLP5D.30.10041.5780.063652.6Purifying selectionSegmental
Chromosomal arrangement and paralogous gene duplication of TLP genes on the four genomes of Gossypium. A physical map of chromosomes demonstrates the position of TLP genes on A, D, and AD genomes, respectively. The paralogous TLP genes (segmental gene duplication) linked with blue lines. The locations of each TLP genes are represented through the horizontal gray lines. The numbers of chromosomes are showed at the top. (A,B) Represent the distribution of TLP genes in G. arboreum and G. raimondii genomes, (C,D) showed the arrangement of TLP genes in G. hirsutum genome, and (E,F) represented the TLP genes in G. barbadense genome, respectively. The scale is in mega bases (Mb). (G) The orthologous TLP duplicated genes pairs among four cotton genomes (G. arboreum, G. raimondii, G. hirsutum, and G. barbadense) were also shown with blue lines, and CIRCOS was used to visualize this plot. The gene name and the chromosome of G. arboreum, G. raimondii, G. hirsutum, and G. barbadense were shown with green, violet, pink, and cyan colors, respectively. The date of duplication and Ka/Ks ratios for duplicate TLP genes in G. arboreum, G. raimondii, G. barbadense, and G. hirsutum.

Effects of Salt and Drought Stress on the Expression Profiles of GhTLP Genes

It has been previously reported that TLP gene family members are expressed and regulated by several abiotic stresses (Lai et al., 2004; Wardhan et al., 2012; Bao et al., 2014). In view of these reports, we studied the involvement of cotton TLP genes in drought and salt stress conditions by analyzing transcriptomic data of leaf transcriptomes in response to drought and salt stress conditions. Thirty GhTLP genes exhibited differential expression, and twelve (GhTLP5A.2, GhTLP5D.2, GhTLP6A.3, GhTLP7D.2, GhTLP7D.3, GhTLP8A, GhTLP11A, GhTLP12A.1, GhTLP12.2, GhTLP12D.1, GhTLP12D.2, and GhTLP12D.3) showed significant higher expression during drought stress (Figure 5P) in transcriptome data. Furthermore, 17 GhTLP genes (GhTLP2D.2, GhTLP6A.1, GhTLP6D.3, GhTLP5A.1, GhTLP5A.2, GhTLP5D.2, GhTLP6A.3, GhTLP7A.2, GhTLP7D.3, GhTLP8A, GhTLP8D, GhTLP11A, GhTLP12A.1, GhTLP12A.2, GhTLP12D.1, GhTLP12D.2, and GhTLP12D.3) demonstrated higher expression in salt stress condition (Figure 5H). The studies show that the GhTLP gene family members respond to different abiotic stresses such as drought and salt and may have a role in regulating stress responses of cotton against salt and drought.
Figure 5

The expression pattern of the TLP gene family members in the G. hirsutum (A–G,I–O) qRT-PCR expression pattern of seven putative genes in normal (0 h), 6, 12, 24, 48, and 72 h of salt (300 MM) and drought (20% PEG solutions PEG8000) stress in cotton. Ubiquitin was used as the loading control. Three replicates were used for each experiment. The statistical analysis was performed, using two-tailed Student's T-tests. The data are plotted as means ± s.d. The error bars represent standard deviations. The asterisks indicate significant differences: *P < 0 0.1; **P < 0 0.01; ***P < 0 0.001; ****P < 0.0001. (H,P) Gene expression profiles of TLP genes under salt and drought condition in RNA-Seq, using Log2 (fold change) values.

The expression pattern of the TLP gene family members in the G. hirsutum (A–G,I–O) qRT-PCR expression pattern of seven putative genes in normal (0 h), 6, 12, 24, 48, and 72 h of salt (300 MM) and drought (20% PEG solutions PEG8000) stress in cotton. Ubiquitin was used as the loading control. Three replicates were used for each experiment. The statistical analysis was performed, using two-tailed Student's T-tests. The data are plotted as means ± s.d. The error bars represent standard deviations. The asterisks indicate significant differences: *P < 0 0.1; **P < 0 0.01; ***P < 0 0.001; ****P < 0.0001. (H,P) Gene expression profiles of TLP genes under salt and drought condition in RNA-Seq, using Log2 (fold change) values. Furthermore, to validate the transcriptome data of GhTLPs in response to drought and salt stresses, the qRT-PCR validation of seven putative genes (GhTLP5A.1, GhTLP5A.2, GhTLP5D.2, GhTLP7A.2, GhTLP7D.3, GhTLP11A, and GhTLP12A.1) was carried out. The corresponding primers are listed in Table 3. The expression of seven GhTLP genes was significantly upregulated in salt-stressed plants, where the majority of the GhTLPs showed responses at 12 and 48 h (Figures 5A–G), while GhTLP11A and GhTLP12A.1 showed highest responses in terms of fold change (FC) (>80 and >90) compared with control (Figures 5F,G). Similarly, six out of seven GhTLP genes were upregulated in drought-stressed plants, where GhTLP5A.2 showed responses at 24 h (Figure 5J), GhTLP5D.2 at 72 h (Figure 5K), GhTLP7A.2 at 12 and 48 h (Figure 5L), GhTLP7D.3 at 12 h (Figure 5M), GhTLP11A at 6, 12, 48, and 72 h (Figure 5N), but no expression was detected in GhTLP5A.1 at any time scale (Figure 5I), while GhTLP12A.1 showed highest response in terms of fold change (>140) in comparison to the control (Figure 5O). Altogether, the differential responses of GhTLP gene family members to salt and drought stresses suggest that GhTLP genes may function to combat abiotic stresses in cotton. Still, further studies are required to clone the significantly higher expressed genes to establish the role of individual TLP genes in cotton for salt and drought stress resistance.
Table 3

A list of primers used for validation in qRT-PCR.

Forward primer (5′-3′)Reverse primer (5′-3′)
GhTLP5A.1TAGGCGGAGTTTTGATGTTAGATTGATTACAAGTGGCTCATCATGCAGAT
GhTLP5A.2TCTCTAAATCTCTTGACCACTCTGTTGATTTTCCCGTCCTCATCATCATAA
GhTLP5D.2TTTCAAGATCAAGCAGCAGTTACATCAGGCTGGGTATCGTATATTATGAATT
GhTLP7A.2TTGCGTATGTAAGAAGTGGAGAGAAAGTGATTTTGCCGCTATTTTGAG
GhTLP7D.3TTGCGTATGTAAGAAGTGGAGAGAAAAGTGATTTTGCCGCTATTTTGA
GhTLP11AATCAAAATCAACCCGTTCAGAGATCCTCAATACTAGCATTCCATCTTTCT
GhTLP12A.1TATCAATCAACCCCAACTAGCTTTCTCCACCATTCTTTTGATCAGATACA
A list of primers used for validation in qRT-PCR.

Co-expression Network and Pathways Analysis of Highly Expressed GhTLP11A and GhTLP12A.1 Genes at Different Time Intervals Under Drought and Salt Stress Condition

The highly expressed genes (GhTLP11A and GhTLP12A.1) were further selected for co-expression network and pathways study. Using FPKM values, the co-expressed genes with GhTLP11A and GhTLP12A.1 were explored. We identified positively co-expressed genes (PCoEGs) (2 and 4) and negatively co-expressed genes (NCoEGs) (5 and 14) in salt and drought stress with GhTLP11A (Supplementary Figure 4 and Supplementary Tables 6, 7). Similarly, PCoEGs (13 and 48) and NCoEGs (9 and 24) were determined with GhTLP12A.1 in salt and drought stress (Supplementary Figure 4 and Supplementary Tables 6, 7). PCoEGs and NCoEGs of GhTLP11A and GhTLP12A.1 were subjected for PageMan pathways analysis to understand the molecular and functional role of these genes. The pathway study revealed that NCoEGs of GhTLP11A showed higher expression of calcium signaling, and PCoEGs of GhTLP11A displayed higher expression of AS2 (lateral organ boundaries DOMAIN family protein) (Ma et al., 2006) at all time points (0, 6, 12, 24, 48, and 72 h) under salt stress condition (Supplementary Figure 5A). Earlier studies demonstrated that the elevated calcium levels help to protect plants from salt stress via SOS (salt overly sensitive) with signal transduction (Seifikalhor et al., 2019). In Medicagotruncatula, lateral organ boundaries domain (LBD1), Sorghum bicolorLBD, and Vitis viniferaLBD genes were upregulated under salt stress condition (Ariel et al., 2010; Wang et al., 2010; Grimplet et al., 2017), suggesting its role in salt stress response. Moreover, PCoEGs of GhTLP12A.1 showed higher expression of β-ketoacyl-CoA synthase (KCS), a key enzyme for the fatty acid elongation (Yang et al., 2020b), and NCoEGs of GhTLP12A.1 also displayed upregulation of a PHD-type transcriptional regulator at all time points (0, 6, 12, 24, 48, and 72 h) in salt stress response (Supplementary Figure 5B). Results revealed that β-ketoacyl-CoA synthase improves salt tolerance in A. thaliana (Yang et al., 2020b), and a PHD-type transcriptional regulator also improves salt tolerance in transgenic A. thaliana (Wei et al., 2009). Additionally, NCoEGs of GhTLP11A demonstrated little higher expression of jasmonate hormone metabolism, abiotic stresses for 12 h while the MYB domain and the G2-like transcriptional regulator highly expressed at all time points (0, 6, 12, 24, 48, and 72 h) in drought stress response (Supplementary Figure 5C). Methyl jasmonate (MeJA) has been reported to get enhanced during drought stress (Wu et al., 2012) and causes stomatal closure to save the water in wheat and enhance the antioxidant ability under the drought stress condition (Ma C. et al., 2014). Moreover, MYB has been reported to play a crucial role in providing tolerance under drought stress in A. thaliana and cotton (Zhang et al., 2012; Chen et al., 2015). These results suggested the important roles in drought stress tolerance. Alteration of its expression might improve tolerance under drought stress in cotton. Furthermore, NCoEGs of GhTLP12A.1 showed higher expression in cellulose synthase enzyme related to the cell wall; PCoEGs of GhTLP12A.1 displayed significantly highly expressed in lipid metabolism-related enzymes, the DOF zinc finger family, and MAP (mitogen-activated protein) kinases, signaling at all time points (0, 6, 12, 24, 48, and 72 h) in drought stress response (Supplementary Figure 5D). An earlier report revealed the importance of cellulose synthase in drought stress via induction of gene expression in A. thaliana (Chen et al., 2005), and lipid metabolism also showed an important role in drought stress response in A. thaliana by decreasing the lipid content progressively (Gigon et al., 2004). Moreover, overexpressed the DOF zinc finger family provides resistance under drought stress in Populustrichocarpa (Wang H. et al., 2017), and MAP kinases signaling also enables to enhance the tolerance under drought stress via the transmission of definite stimuli and regulating the antioxidant defense system (Sinha et al., 2011).

Potential miRNA Target Sites in Gossypium TLP Transcripts

A large number of transcripts are regulated by miRNAs in response to stresses, signal transduction, and in plant development (Witkos et al., 2011). To study whether the cotton TLP genes may be regulated by miRNAs, target sites for miRNA binding were analyzed in the identified cotton TLP genes through the plant small RNA target analysis server (psRNATarget). The miRBase database possesses 378 cotton miRNAs. For the identification of miRNA target sites, the cut-off threshold of 4 was set in the search parameter. We were able to identify target sites for 41 cotton miRNAs in 56 cotton TLP transcripts with an expectation score (e) varied from 0.5 to 4 (Supplementary Table 8). Only the miRNA/target site pairs with cut-off 3.5 were selected to reduce the false-positive identification. Later, 44 miRNAs from the 14 miRNA families, comprising the target sites among 28 cotton TLP genes, were considered reliable in terms of e ≤3.5 (Table 4). Although the majority of the cotton TLP genes contain target sites for a single miRNA, some genes, such as GbTLP2D.1, GbTLP7D.1, GhTLP2D.1, GhTLP7D.1, GrTLP2.1, and GrTLP7.1, have target sites for more than one miRNA. Target site accessibility was evaluated by estimating unpaired energy (UPE), an essential factor in the identification of targets. The UPE of the target sites varies from 7.274 (gra-miR7494b) to 24.749 (gra-miR7486d), where a lesser amount of energy illustrated the greater chance of interaction among a miRNA and target sites (Marin and Vanicek, 2011).
Table 4

The potential miRNA target sites in cotton TLP transcripts.

miRNA Acc.Target Acc.ExpectationTarget accessibility (UPE)miRNA lengthTarget startTarget endAlignmentInhibitionMultiplicity
gra-miR7494bGaTLP6.32.58.07423107129: .:::::: :::::.:.:::Cleavage1
gra-miR1446GaTLP2.3321.22421436457::::: :::::::::::::Cleavage1
gra-miR1446GaTLP2.1320.54621436457::::: :::::::::::::Cleavage1
gra-miR7494bGrTLP6.32.58.00223392414: .:::::: :::::.:.:::Cleavage2
gra-miR1446GrTLP2.3321.1621445466::::: :::::::::::::Cleavage1
ghr-miR399aGrTLP7.13.515.48821496516::. ::::.::.::::::Cleavage1
ghr-miR399bGrTLP7.13.515.48821496516::. ::::.::.::::::Cleavage1
ghr-miR399dGrTLP7.13.515.48821496516:::: ::::.::. ::::::Cleavage1
ghr-miR399eGrTLP7.13.515.48821496516::: ::::.::. ::::::Cleavage1
gra-miR1446GrTLP2.23.523.87521436457::::: ::::::.::::::Cleavage1
gra-miR482cGrTLP12.13.523.05422679700:::::::: ::::::::Translation1
gra-miR7486dGrTLP7.33.524.74921694713::::::::: : .:::::::Translation1
gra-miR7504kGrTLP2.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR7504lGrTLP2.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR7504mGrTLP2.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR7494bGhTLP6D.32.510.39823392414: .:::::: :::::.:.:::Cleavage1
gra-miR1446GhTLP2D.3321.1621445466::::: :::::::::::::Cleavage1
gra-miR1446GhTLP2A.2320.54621436457::::: :::::::::::::Cleavage1
ghr-miR399aGhTLP7D.13.515.77521496516::. ::::.::.::::::Cleavage1
ghr-miR399bGhTLP7D.13.515.77521496516::. ::::.::.::::::Cleavage1
ghr-miR399dGhTLP7D.13.515.77521496516:::: ::::.::. ::::::Cleavage1
ghr-miR399eGhTLP7D.13.515.77521496516::: ::::.::. ::::::Cleavage1
ghr-miR7502GhTLP6D.13.515.52246891::. :::.:::: :::::::Cleavage1
gra-miR482cGhTLP12D.13.522.50722358379:::::::: ::::::::Translation1
gra-miR7504kGhTLP2D.13.517.772241,0931,116::.::.::::::::::.Cleavage1
gra-miR7504lGhTLP2D.13.517.772241,0931,116::.::.::::::::::.Cleavage1
gra-miR7504mGhTLP2D.13.517.772241,0931,116::.::.::::::::::.Cleavage1
gra-miR8700GhTLP7D.13.519.294241,0671,090:::: ::: ::::: ::::Translation1
gra-miR8767cGhTLP5D.13.515.286211,1581,178:.::::.: :::::.:::Translation1
gra-miR7494bGbTLP6D.32.510.51523107129: .:::::: :::::.:.:::Cleavage1
gra-miR7494bGbTLP6A.32.57.27423314336: .:::::: :::::.:.:::Cleavage1
gra-miR1446GbTLP2A.2320.54621436457::::: :::::::::::::Cleavage1
gra-miR1446GbTLP2D.3321.1621445466::::: :::::::::::::Cleavage1
gra-miR1446GbTLP2A.3321.97221412433::::: :::::::::::::Cleavage1
ghr-miR399aGbTLP7D.13.515.77521496516::. ::::.::.::::::Cleavage1
ghr-miR399bGbTLP7D.13.515.77521496516::. ::::.::.::::::Cleavage1
ghr-miR399dGbTLP7D.13.515.77521496516:::: ::::.::. ::::::Cleavage1
ghr-miR399eGbTLP7D.13.515.77521496516::: ::::.::. ::::::Cleavage1
gra-miR482cGbTLP12D.13.522.50722682703:::::::: ::::::::Translation1
gra-miR7486dGbTLP7D.33.524.74921694713::::::::: : .:::::::Translation1
gra-miR7504kGbTLP2D.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR7504lGbTLP2D.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR7504mGbTLP2D.13.517.772241,0811,104::.::.::::::::::.Cleavage1
gra-miR8767cGbTLP5D.13.515.286211,1581,178:.::::.: :::::.:::Translation1
The potential miRNA target sites in cotton TLP transcripts.

cis-Regulatory Element Analysis of GhTLPs

The cis-regulatory elements are crucial to controlling the regulation of transcription in several stress conditions (Nakashima et al., 2009). To identify cis-regulatory elements that may govern TLP expression in cotton, a 2-Kb region upstream from the translational start site of GhTLPs was scanned for various cis elements. A total of 1,182 proximal elements were identified in 33 GhTLPs that included 737 for abiotic stresses, 343 for hormonal responses, 30 for biotic stresses and 72 elements for the other cis-regulatory elements (Figure 6 and Table 5). A higher number of cis-regulatory elements were identified in GhTLP2D.2 (60), whereas the least number of cis-regulatory elements were detected in GhTLP12A.1 (18) (Supplementary Table 9). In abiotic stress-responsive cis-regulatory elements, the majority of the elements were involved in light responses, followed by low temperature, flavonoid biosynthesis, defense, and stress (Supplementary Table 9). The cis-regulatory elements related to hormonal responses comprised auxin, salicylic acid (SA), abscisic acid, gibberellin, and methyl jasmonate-responsive elements (Supplementary Table 9). The SA-responsive TCA element was present in a higher number in GhTLP11A. The auxin-responsive AuxRR-core element, which has an important role in salt, as well as drought stress responses (Guo et al., 2018; Kang et al., 2018), was detected in GhTLP5A.2, GhTLP5D.3, and GhTLP6A.2. Ethylene-responsive elements (EREs), which provide defense against salt and drought stress conditions (Pei et al., 2017; Sharma et al., 2019), were also detected and present in a higher number in GhTLP5D.2. The presence of these cis-regulatory elements in GhTLPs hints at their potential roles in the regulation of gene expression in cotton growth and development as well as under various environmental conditions (Nawaz et al., 2014).
Figure 6

cis-regulatory elements analysis of G. hirsutum TLP genes. The micro-parts in diverse colors are the putative elements sequence.

Table 5

Identified cis-regulatory elements in the GhTLPs genes.

Abiotic stress responsive elementsBiotic stress responsive elementsHormone responsive elementsOthers
GhTLP2A.11562
GhTLP2A.24163
GhTLP2A.3261113
GhTLP2D.122152
GhTLP2D.246194
GhTLP2D.32884
GhTLP2D.4241201
GhTLP5A.128281
GhTLP5A.21473
GhTLP5D.112162
GhTLP5D.2263131
GhTLP5D.31883
GhTLP6A.1271113
GhTLP6A.2253121
GhTLP6A.319170
GhTLP6D.12896
GhTLP6D.220151
GhTLP6D.3251111
GhTLP7A.130290
GhTLP7A.2152163
GhTLP7D.122392
GhTLP7D.22270
GhTLP7D.3211112
GhTLP8A26283
GhTLP8D2084
GhTLP11A8175
GhTLP11D1564
GhTLP12A.11341
GhTLP12A.21873
GhTLP12D.1183141
GhTLP12D.215142
GhTLP12D.326171
GhTLP12D.4242140
Total7373034372
cis-regulatory elements analysis of G. hirsutum TLP genes. The micro-parts in diverse colors are the putative elements sequence. Identified cis-regulatory elements in the GhTLPs genes.

Three-Dimensional (3D) Structural Analysis of the Putative GhTLPs Tubby Domain

The 3D structures of the GhTLPs tubby domain were determined through homology modeling of a central alpha-helix surrounded by a beta-barrel (Figure 7). All the putative GhTLPs have a typical tubby structure with a closed beta-barrel formed by 12 anti-parallel strands and a central alpha helix. While most GhTLPs contain five alpha-helices, GhTLP7A.2 and GhTLP12A.1 comprise six alpha-helices. These three-dimensional structural differences might lead to the functional diversification of different GhTLPs and suggest a slightly altered role for GhTLP7A.2 and GhTLP12A.1 as compared with the other GhTLP proteins. The higher transcript level of GhTLP12A.1 during drought and salt stress conditions further supported this hypothesis.
Figure 7

The homology 3D model of the GhTLPs tubby domain. The GhTLPs were selected for three-dimensional structure prediction and display with confidence level >99%. The alpha helix is shown with red and beta fold is shown with violet color.

The homology 3D model of the GhTLPs tubby domain. The GhTLPs were selected for three-dimensional structure prediction and display with confidence level >99%. The alpha helix is shown with red and beta fold is shown with violet color.

Discussion

The genus Gossypium includes ~45 diploid (2n = 2x = 26) and six tetraploid (2n = 4x = 52) species (Hawkins et al., 2006; Grover et al., 2015). G. hirsutum and G. barbadense are allotetraploids that have been arisen in the new world from interspecific hybridization among A-genome-like ancestral African species and D-genome-like American species (Chen et al., 2007). The closest relatives of the tetraploid progenitors are the A-genome species G. herbaceum (A1) and G. arboreum (A2) and the D-genome species, G. raimondii (D5) or G. gossypioides (D6) (Brubaker et al., 1999; Senchina et al., 2003). Approximately, 1–2 million years ago, polyploidization occurred, giving rise to allotetraploid species (Wendel and Cronn, 2003). G. hirsutum and G. barbadense are possibly the oldest main allopolyploid crops (Paterson et al., 2012; Chalhoub et al., 2014; Marcussen et al., 2014). In our efforts to study the TLP family in cotton, we have identified a total of 105 cotton TLP genes in four Gossypium genomes (G. arboreum, G. ramondii, G. hirsutum, and G. barbadense), 19 GaTLPs, 18 GrTLPs, 33 GhTLPs, and 35 GbTLPs (Table 1 and Supplementary Table 1). The genome sizes of G. arboreum and G. raimondii are 1,746 and 885 Mb (Li et al., 2014), respectively, and, expectedly, G. arboreum had higher numbers of TLP genes as compared with G. raimondii. However, although the genome size of G. hirsutum (~2.30 Gb) (Hu et al., 2019) was about the same as G. barbadense genome size (~2.22 Gb) (Hu et al., 2019), G. barbadense had a greater number of TLP genes as compared with G. hirsutum. The higher number of the TLP gene family members in G. barbadense might be due to the whole genome duplication events (Zhang et al., 2015; Qiao et al., 2019) which facilitate diversification (Clark and Donoghue, 2018). The domain study revealed that all the conserved cotton TLP proteins comprised the tubby domain at the C-terminal and F-box domain at the N- terminal end while some of the cotton TLPs lack the F-box (Supplementary Figure 1). This was also observed in Arabidopsis thaliana, indicating the functional role of TLP proteins lacking the F box (Lai et al., 2004). The phylogenetic analysis of cotton TLPs and A. thaliana protein sequences grouped the cotton TLP proteins into eight major groups (Groups 1–8). However, groups 4 and 8 cotton TLP genes were not clustered with any of A. thaliana genes (Figure 1). Further analysis of groups 4 and 8 cotton TLP genes with other eudicots showed the loss of these genes from the brassicaceae family only. The orthologous gene pair analysis of the cotton TLP12 genes (Groups 4 and 8) with A. thaliana and T. cacao (closest relative of Gossypium). The outcomes of synteny analysis showed that cotton the TLP12 gene family members (Groups 4 and 8) have orthologous duplicated genes in T. cacoa (Figures 2A–D) while no orthologous duplicated genes were detected in A. thaliana (Figures 2E–H). Therefore, it may be hypothesized that groups four and eight cotton TLP genes (TLP12) were the consequence of recent species-specific duplication events that led to independent functional diversification. Groups four and eight TLPs orthologous pairs experienced faster evolution as compared with the other TLP gene family members, indicating their functional divergence in Gossypium, proposing that groups four and eight TLPs might have a specific function in cotton species. The identified orthologous TLP12 gene pairs in G. hirsutum and G. barbadense are approximately double in comparison to G. arboreum and G. raimondii, respectively, showing the effect of polyploidy. This leads to more orthologous gene pairs in GhTLP12 and GbTLP12 genes than GaTLP12 and GrTLP12 genes (Qanmber et al., 2019a). The evolutionary analysis within the Gossypium TLP genes showed most of them were greatly conserved during evolution, showed introns of these genes were not lost during evolution, and, at the early expansion stages of evolution, these genes diverged, whereas, over evolutionary time, other genes lost their introns (Qanmber et al., 2019b), indicating that group specific genes may have similar functions. According to a previous report on gene structure, introns performed essential functions during the course of evolution in several plant species (Roy and Gilbert, 2006). During the early expansion period, there was more intron, which subsequently decreased over the passage of time (Roy and Penny, 2007). GrTLP5.2 comprises no intron in gene structure; lack of intron indicates that the TLP gene is advanced where introns were disappeared over the evolutionary time period (Qanmber et al., 2019a); this gene might have some conserved evolutionary function in cotton. These findings demonstrated more advanced species comprise fewer introns in their genomes (Roy and Gilbert, 2005). Higher number of introns led to new functions (Qanmber et al., 2019a). Moreover, several gene families comprise no intron or with fewer introns in their genes (Zhang et al., 2015; Qanmber et al., 2018). Insertions or deletions events participate in the structural differences of exon/intron that might be useful to calculate the evolutionary mechanisms (Lecharny et al., 2003). Introns are absent in some genes that might be due to a rapid evolution rate, whereas a greater number of introns comprising genes leads to a gain of function in evolution (Qanmber et al., 2019b). The loss or gain of genes through segmental duplication or incomplete sequencing of genomes is the major cause for TLP genes distribution in cotton (Qanmber et al., 2019b). Chromosomal allocation studies demonstrated that cotton TLP genes expansion has arisen due to segmental duplication except for GaTLP2.1/GaTLP2.3 (Figures 4A–F). The purifying selection probably excludes the deleterious loss-of-function mutations, refining functional alleles at both duplicate loci and fixing recent duplicate genes (Tanaka et al., 2009). All the identified paralogous cotton TLP gene pairs indicated the purifying selection (Table 2). The recent duplication events in Gossypium TLPs have had implicit ecological, morphological, and physiological diversification (Wendel and Cronn, 2003). The diploid genomes of G. arboreum and G. raimondii were diverged 2–13 MYA, and allotetraploid cotton (G. hirsutum and G. barbadense) was originated about 1–2 MYA (Li et al., 2014; Wang M. J. et al., 2017; Wang et al., 2019). The duplication time of GaTLPs (15.17–50.87 MYA), GrTLPs (11.95–18.88 MYA), GhTLPs (14.77–53.60 MYA), and GbTLPs (0.85–52.6 MYA) implied that duplication events in Gossypium TLP gene families were more ancient than that of both polyploid formation and divergence of diploid species. This duplication might facilitate the unique role of TLP genes in Gossypium, i.e., cotton stress responses. The average duplication time of GaTLPs and GrTLPs was around 24.78 and 16.31 MYA, which probably took place after their divergence from T. cacoa (33 MYA) (Li et al., 2014) and A. thaliana (93 MYA) (Ma et al., 2016); before the reunification of A and D diploid genomes that lead to allotetraploid cotton (Zhang et al., 2015; Wang et al., 2019) (Table 2). These observations suggested that TLP2.2 and TLP2.3 in G. arboreum, G. raimondii, and G. barbadense might have arisen from the same duplication event of cotton TLP2.1 genes. All paralogous cotton TLP gene pairs except GaTLP2.1/GaTLP2.3 experienced segmental duplication (Figures 4A–F). Here, both segmental and tandem duplication helped in the TLP gene family expansion, but segmental duplication might have some significant role in the expansion of the TLP gene family members (Liu et al., 2018; Qanmber et al., 2019a; Ali et al., 2020). The orthologous gene pairs had the sequence identity >90% in cDNA and also in amino acid compositions (Supplementary Table 4), which were carried out for further evolutionary study. Among orthologous-duplicated pairs, the Ka/Ks values of TLP2, TLP5, TLP6, TLP7, TLP8, TLP11, and TLP12 were higher in A vs. D, At vs. Dt, and Dt vs. D. The divergence analysis showed that cotton TLPs experienced greater evolutionary pressure in diploid as well as in allotetraploid cotton and might have evolved rapidly in D subgenome as compared with A subgenome (Supplementary Table 5). All the identified orthologous cotton TLP gene pairs show purifying selection (Figure 4G and Supplementary Table 5). The TLP genes are known to play important roles in stress responses in various plant species (Lai et al., 2004; Liu, 2008; Xu et al., 2016). The transcriptome analysis data of G. hirsutum showed the high expression of GhTLP genes in salt and drought stresses. The expression analysis showed that GhTLP5A.1 and GhTLP5D.2 genes have a significantly higher relative expression in salt stress response, but not in drought stress; therefore, these genes might have a major role in salt-stress tolerance. Moreover, GhTLP11A and GhTLP12A.1 showed higher expression in both salt and drought-stress responses. Therefore, these two genes might have an important role in salt and drought tolerance and could be appropriate targets for further manipulation to protect the cotton from salt and drought stress. To further characterize the function of GhTLP11A and GhTLP12A.1 genes, the co-expression network of these two genes (Supplementary Figure 4) was studied. This study revealed that PCoEGs of GhTLP11A comprised the lateral organ boundaries (LOB) domain (LBD), which were upregulated via ABA treatment in Vitis vinifera under salt-stress response (Grimplet et al., 2017). NCoEGs of GhTLP11A contained ABC transporter-like protein, actively involved in salt-stress recovery in Populuseuphratica (Gu et al., 2004) and calcium protein, which was considered as one of the important molecules in response to salinity (Seifikalhor et al., 2019), and, in a seedling of rice, Ca2+ induces antioxidant enzyme activity and retains cellular redox potential under salt stress (Rahman et al., 2016; Supplementary Tables 6, 7). In salt stress, PCoEGs of GhTLP12A.1 comprised the SANT/MYB domain and the sugar-phosphate transporter domain. Sugarcane MYB18, containing the SANT/MYB DNA-binding domain, remarkably improved tolerance to salt stress (Shingote et al., 2015) and phosphate transporter PHT4;6 of A. thaliana function in cell wall biosynthesis and protein N-glycosylation, which are crucial to salt tolerance (Cubero et al., 2009). NCoEGs of GhTLP12A.1 contained a FYVE/PHD-type zinc finger and MADS-box in salt stress. A. thaliana RING/FYVE/PHD ZFP (AtRZFP) is found to bind with zinc and provides tolerance to salt stress (Zang et al., 2016), and MADS-box considered a positive regulator of salt-stress response via regulating the maintenance of ABA signaling, primary metabolism, detoxification, and ROS homeostasis through antioxidant enzymatic activities (Castelán-Muñoz et al., 2019; Supplementary Tables 6, 7). Results demonstrated that PCoEGs and NCoEGs of GhTLP11A and GhTLP12A.1 genes might be crucial in salt-stress responses. In drought stress, PCoEGs of GhTLP11A comprised a protein kinase domain and haloacid dehalogenase-like hydrolase (HAD hydrolase). Calcium-dependent protein kinase may function as calcium sensors and have an important role in drought-stress response. In A. thaliana, calcium-dependent protein kinase 10 (CPK10) provides tolerance under drought stress via ABA and Ca2+-mediated stomatal regulation (Zou et al., 2010). A. thaliana trehalose-6-phosphate phosphatases (AtTPPs) encodes a protein in the HAD hydrolase superfamily that is involved in the biosynthesis of trehalose. Overexpressed AtTPPF leads to the accumulation of trehalose in response to drought stress and can increase the tolerance under drought stress (Lin et al., 2019). NCoEGs of GhTLP11A contained a B3 domain, which improves drought-stress tolerance via reducing the stomatal density and changed the shape of stomata in Zea mays (Liu Y. H. et al., 2015) and a late embryogenesis abundant (LEA) gene, whose higher expression provides tolerance under drought stress in upland cotton (Magwanga et al., 2018; Supplementary Table 7). PCoEGs of GhTLP12A.1 comprised expansin and a Dof-type zinc finger in drought-stress response. Transgenic wheat expansin 2 (TaEXPA2) positively regulates tolerance under drought stress (Yang et al., 2020), and Brassica rapa expansin-like B1 (BrEXLB1) also associated with drought stress tolerance (Muthusamy et al., 2020), while the overexpressed DOF zinc finger family provides resistance under drought stress in P. trichocarpa (Wang H. et al., 2017). NCoEGs of GhTLP12A.1 contained UBA-like superfamily and dirigent protein under drought stress response (Supplementary Table 7). In wheat, a UBA protein (TaUBA), a negative regulator of drought stress, might function via downregulating some stress responsive transcription factors (Li et al., 2015), and, in Boeahygrometrica, dirigent proteins provide a protective role under drought-stress response via changing the physical characters of lignin, which further affects the flexibility and mechanical strength of the plant cell wall (Wu et al., 2009). Taken together, our results showed that PCoEGs and NCoEGs of GhTLP11A and GhTLP12A.1 genes might have a crucial role in drought-stress tolerance. Moreover, we determined 41 miRNA target sites in 56 cotton TLP transcripts with an expectation score (E) varied from 0.5 to 4 (Supplementary Table 8). In this study, 15 miRNA families, comprising target sites in 28 cotton TLP genes, were detected (Table 4). An earlier report showed that some of the miRNA families were conserved among the plants, which displayed their function in the adaptation of plants to various stress responses (Jones-Rhoades and Bartel, 2004; Yuan et al., 2013). In Vitis vinifera, miR7494 has a prominent role in plants under abiotic stresses, and, in Zea mays, the expression of miR399 gets induced during abiotic stress response (Zhang et al., 2008; Kumar, 2014; Pagliarani et al., 2017; Snyman et al., 2017; Inal et al., 2020). These miRNAs that have been detected in this study are with lower UPE value (7.27–15.77) (Table 4). These outcomes suggested that cotton miRNAs might also be involved in abiotic stress responses to enhance drought- and salt-stress tolerance. Moreover, cis-regulatory element analysis demonstrated that, among the selected putative genes for validation, only GhTLP12A.1 cis-regulatory elements comprised an MBSI cis-regulatory element related to flavonoid biosynthetic regulatory genes, which are very crucial to provide drought tolerance in wheat (Ma D. Y. et al., 2014). Overexpressing many of the genes of flavonoid pathways also provides tolerance under salt stress (Ashraf, 2009; Yang et al., 2009; Matus et al., 2010; Le Martret et al., 2011; Bharti et al., 2015). GhTLP11A comprised the higher number of salicylic acid (SA)-responsive TCA elements. Salicylic acid was identified as a potential hormone to provide tolerance against salinity (Khan et al., 2012) and improves the drought tolerance in rice (Farooq et al., 2009). GhTLP5A.2, GhTLP5D.2, GhTLP11A, and GhTLP12A.1 also showed significant relative expression in qRT-PCR. From these observations, it could be speculated that the proximal elements of GhTLP11A and GhTLP12A.1 might have an important role in controlling the regulation and improvement of salt- and drought-stress responses in cotton. The results of the metabolic pathway study of PCoEGs and NCoEGs of GhTLP11A and GhTLP12A.1 genes and cis-regulatory elements also provided evidence of the involvement of two of these genes in salt- and drought-stress responses. However, detailed molecular explorations are required to understand the structural-functional relationship of cotton TLP genes and the involvement of GhTLPs to enhance the tolerance against drought and salt stresses.

Conclusion

In this study, a total of 105 cotton TLP proteins with a highly conserved tubby domain at C-terminal and N-terminal F-box were identified in four cotton species (G. arboreum, G. raimondii, G. hirsutum, and G. barbadense). Their protein domains, conserved motifs, and gene structure within the same groups shared a notable similarity, which leads to some similar functions. Furthermore, the cotton TLP12 gene family members clustered into 4 and 8 groups and experienced higher evolutionary pressure in comparison to others, showing their functional divergence in Gossypium species. Several G. hirsutum TLP genes showed significantly high expression in both drought- and salt-stress conditions. Two genes GhTLP11A and GhTLP12A.1 demonstrated comparatively higher expression and provided strong evidence that these genes can play a predominant role during drought and salt stress. Our investigation enhances the understanding of TLP genes in cotton at the level of function, evolution, and structure, which further highlights the intriguing field of TLP genes that have immense prospects for future manipulation.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: NCBI (https://www.ncbi.nlm.nih.gov/geo/) under accession numbers PRJNA532694.

Author Contributions

NB carried out the bioinformatics analysis and design and drafted the manuscript. SF performed quantitative expression analysis. CM and SB participated to supervise the study. All the authors have read and approved the final manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  4 in total

1.  Identification of miRNA and their target genes in Cestrum nocturnum L. and Cestrum diurnum L. in stress responses.

Authors:  Nasreen Bano; Shafquat Fakhrah; Sagar Prasad Nayak; Sumit Kumar Bag; Chandra Sekhar Mohanty
Journal:  Physiol Mol Biol Plants       Date:  2022-02-04

Review 2.  Tubby-like proteins (TLPs) transcription factor in different regulatory mechanism in plants: a review.

Authors:  Nasreen Bano; Shahre Aalam; Sumit Kumar Bag
Journal:  Plant Mol Biol       Date:  2022-10-18       Impact factor: 4.335

3.  Genome-Wide Identification, Characterization, and Expression Analysis of Tubby-like Protein (TLP) Gene Family Members in Woodland Strawberry (Fragaria vesca).

Authors:  Shuangtao Li; Guixia Wang; Linlin Chang; Rui Sun; Ruishuang Wu; Chuanfei Zhong; Yongshun Gao; Hongli Zhang; Lingzhi Wei; Yongqing Wei; Yuntao Zhang; Jing Dong; Jian Sun
Journal:  Int J Mol Sci       Date:  2022-10-08       Impact factor: 6.208

4.  Genome-wide identification, phylogenetic and expression pattern analysis of Dof transcription factors in blueberry (Vaccinium corymbosum L.).

Authors:  Tianjie Li; Xiaoyu Wang; Dinakaran Elango; Weihua Zhang; Min Li; Fan Zhang; Qi Pan; Ying Wu
Journal:  PeerJ       Date:  2022-10-03       Impact factor: 3.061

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.