Michael Groszmann1, Annamaria De Rosa1, Jahed Ahmed2, François Chaumont2, John R Evans1.
Abstract
Aquaporins (AQPs) are membrane-spanning channel proteins with exciting uses in plant engineering and industrial applications. Translational outcomes will be improved by better understanding the extensive diversity of plant AQPs. However, AQP gene families are complex, making exhaustive identification difficult, especially in polyploid species. The allotetraploid species Nicotiana tabacum (Nt; tobacco) plays a significant role in modern biological research and is closely related to several crops of economic interest, making it a valuable platform for AQP research. Recently, De Rosa et al. (2020) and Ahmed et al. (2020) concurrently reported on the AQP gene family in tobacco, establishing family sizes of 76 and 88 members, respectively. The discrepancy highlights the difficulties of characterizing large, complex gene families. Here, we identify and resolve the differences between the two studies, clarify gene models, and produce a consolidated collection of 84 members that more accurately represents the complete NtAQP family. Importantly, this consensus NtAQP collection will reduce the confusion and ambiguity that would inevitably arise from having two different descriptive studies and sets of NtAQP gene names. This report also serves as a case study, highlighting and discussing the variables to be considered and the refinements required to ensure comprehensive gene family characterizations, which become valuable resources for examining the evolution and biological functions of genes.
Keywords: Aquaporins; gene family characterization; gene structure and evolution; major intrinsic proteins; orthologs; phylogenetics
Year: 2021 PMID: 33977216 PMCID: PMC8104905 DOI: 10.1002/pld3.321
Source DB: PubMed Journal: Plant Direct ISSN: 2475-4455
Comparing NtAQP gene annotations from De Rosa et al. (2020) and Ahmed et al. (2020) and deriving a consensus list of 84 members that more accurately represents the complete NtAQP family
| This study | De Rosa et al. ( | Ahmed et al. ( | Description of discrepancy | |||||
|---|---|---|---|---|---|---|---|---|
| Consensus NtAQP family | Gene ID | NCBI Accession (1) | NCBI Accession (2) | Status | Gene ID | NCBI Accession (3) | Status | |
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| AAB04757.1 | Duplicated annotation. Additional copy NtPIP1;1t from Wisconsin 38 cultivar. Polymorphisms exist compared to cv. TN90. | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
| NP_001313131.1 | NP_001313131.1 sequence is from cv. Petit Havana SR1; polymorphisms exist with cv. TN90. | |||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| CAA04750.1 | Duplicated annotation. Identical to NtPIP1;5s (De Rosa, 2020) and NtPIP1;4 (Ahmed, 2020), a lab‐based submission with a likely non‐synonymous sequencing error in a strongly conserved functional residue of the widely studied NtAQP1; detailed in De Rosa et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
| NP_001312334.1 | Unlikely extended N‐terminus. Not supported by orthologs in other Solanaceae species, and the upstream AUG is in an unfavorable Kozak context. | |||
|
|
|
|
|
| ||||
|
|
|
|
| XP_016486700.1 | Unlikely extended N‐terminus. Not supported by orthologs in other Solanaceae species, and the upstream AUG is in an unfavorable Kozak context. | |||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| AAL33586.1 | Duplicated annotation. Additional copy of NtPIP2;8t from Petit Havana SR1 cultivar. | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
| Full length gene not identified in De Rosa et al. ( | ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| XP_016515710.1 | Identified as a pseudogene (NtPIP1;2bspseudo) with C‐terminal truncations in De Rosa et al. ( | ||||||
|
|
|
|
|
| ||||
|
|
|
|
| NP_001312131.1 | NP_001312131.1 sequence is from cv. Bright Yellow 2; polymorphisms exist with cv. TN90. | |||
| — | — | — | — |
| BAF95576.1 | Duplicated annotation. Additional copy of NtTIP1;1t from Bright Yellow 2 cultivar. | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
|
|
|
| — | — | Not identified in Ahmed et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| P24422.2 | Duplicated annotation. Additional copy of NtTIP2;3s from Wisconsin 38 cultivar. | ||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| XP_016491554.1 | Duplicated annotation. Additional copy of NtTIP3;2t from MSK326 cultivar. | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
| — | — | Not identified in Ahmed et al. ( | |||
|
|
|
|
|
| ||||
|
|
|
|
|
| XP_016445609.1 | Extended N‐terminal splice variant of NtNIP1;2t; not supported by RNA‐seq and models of parental and Solanaceae orthologs. Browser link: | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
| — | — | — | — |
| XP_016493176.1 | Duplicated annotation. Misidentified gene copy of an unlikely splice variant of NtNIP5;1s not supported by RNA‐seq distribution patterns. Browser link: | ||
|
|
|
|
|
| XP_016435920.1 | Extended N‐terminal splice variant of NtNIP6;1s; not supported by RNA‐seq and models of parental and Solanaceae orthologs. Browser link: | ||
|
|
|
|
|
| XP_016438237.1 | Extended N‐terminal splice variant of NtNIP6;1t; not supported by RNA‐seq and models of parental and Solanaceae orthologs. Browser link: | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
| — | — | — |
|
| Not identified in De Rosa et al. ( | ||
|
| — | — |
|
| XP_016451938.1 | Not identified in De Rosa et al. ( | ||
|
|
|
|
|
| XP_016439604.1 | C‐terminal splice variant of NtSIP1;1t; not supported by RNA‐seq and models of parental and Solanaceae orthologs. Browser link: | ||
|
|
|
|
| — | — | Not identified in Ahmed et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
| — | — | Not identified in Ahmed et al. ( | ||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| ||||
|
|
|
|
|
| β splice variant version—NCBI accession XP_016446695.1. | |||
|
|
|
|
|
| β splice variant version—NCBI accession HM475295.1. | |||
Status symbols (not reproduced here) indicate, in order: accurate annotation; minor inconsistency with annotation; significant inconsistency with annotation.
The consensus gene names and their correct corresponding NCBI accession identification codes are in bold.
NCBI Accession (1): Third‐party annotation submissions to NCBI representing curated gene/protein models in De Rosa et al.; (2): NCBI RefSeq records supporting De Rosa et al., TPA submissions in instances of incorrect/unlikely models proposed in Ahmed et al. (2020); (3): NCBI RefSeq records reported in Ahmed et al. (2020). Protein and coding sequences for all 84 AQP members are in Data File S1.
Abbreviation: cv, cultivar.
NtAQP gene identifiers as reported in De Rosa et al. (2020).
NtAQP gene identifiers as reported in Ahmed et al. (2020).
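The kind of comparison summarized in the table can be sketched in a few lines of Python. This is only an illustrative first pass, not the procedure used in the study: the function name `compare_annotations`, the gene names, and the short sequences are all hypothetical, and matching is by exact protein sequence identity, which is an assumption made here for simplicity.

```python
from collections import defaultdict


def compare_annotations(set_a, set_b):
    """Classify protein models from two annotation sets by exact sequence identity.

    set_a, set_b: dict mapping a gene name to its protein sequence.
    All names and sequences used with this function here are hypothetical.
    """
    def by_sequence(models):
        # Group gene names by normalized sequence (upper case, no stop symbol).
        groups = defaultdict(list)
        for name, seq in models.items():
            groups[seq.upper().rstrip("*")].append(name)
        return groups

    a, b = by_sequence(set_a), by_sequence(set_b)
    return {
        # Sequences present in both sets, reported as (name in A, name in B).
        "shared": sorted((a[s][0], b[s][0]) for s in a.keys() & b.keys()),
        # Models found in only one of the two sets.
        "only_a": sorted(n for s in a.keys() - b.keys() for n in a[s]),
        "only_b": sorted(n for s in b.keys() - a.keys() for n in b[s]),
        # Several names sharing one sequence within a set: candidate duplicates.
        "duplicates_a": sorted(sorted(v) for v in a.values() if len(v) > 1),
        "duplicates_b": sorted(sorted(v) for v in b.values() if len(v) > 1),
    }


# Toy input with made-up names and sequences.
study_a = {"geneA1": "MAEK", "geneA2": "MGSK"}
study_b = {"geneB1": "MAEK", "geneB2": "MAEK*", "geneB3": "MTTK"}
result = compare_annotations(study_a, study_b)
print(result)
```

Exact matching would miss several of the cases listed in the table, such as copies from other cultivars that carry polymorphisms or models with extended termini; those require sequence alignment and manual inspection of the gene models.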
FIGURE 1. Phylogeny of the 86 protein products produced by the consensus 84 genes of the tobacco aquaporin family. Branches are color coded according to the five sub‐families: PIPs (blue), XIPs (purple), TIPs (green), NIPs (red), and SIPs (orange). The phylogenetic tree was generated using the neighbor‐joining method with pair‐wise deletions (via MEGA10) from MUSCLE‐aligned protein sequences. Confidence levels (%) of branch points were generated through bootstrap analysis (n = 1,000). Suffix identifiers “s” or “t” denote the sub‐genome origins of the sister genes (i.e., “s” = N. sylvestris and “t” = N. tomentosiformis) that reside as discrete pairings across the phylogeny. The six pseudogenes identified in De Rosa et al. (2020) have not been included in this phylogeny. Protein and coding sequences for all NtAQP members are in Data File S1.
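To illustrate what "pair‐wise deletion" means when distances are computed from an alignment, the following minimal Python sketch calculates a p‐distance for each pair of aligned sequences, skipping only the columns where either member of that pair has a gap. The caption does not state which distance model was used in MEGA10, so the p‐distance is an assumption made here; the function names and toy sequences are hypothetical, and neighbor‐joining tree construction and bootstrapping are not shown.

```python
def p_distance(seq1, seq2, gap_chars="-?"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a column is ignored only if one of THESE two
    sequences has a gap or missing-data symbol at that position.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip this site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def distance_matrix(alignment):
    """Return {(name1, name2): distance} for every unordered pair."""
    names = list(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


# Toy alignment with made-up names and residues.
toy = {"s1": "MK-LV", "s2": "MKALV", "s3": "MR-LI"}
for pair, dist in distance_matrix(toy).items():
    print(pair, dist)
```

With complete deletion, the third column would be dropped for every pair because s1 and s3 have a gap there; with pairwise deletion it is dropped only for the pairs that involve a gapped sequence at that site, so more sites contribute to each comparison.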