Johannes A Hofberger, David L Nsibo, Francine Govers, Klaas Bouwmeester, M Eric Schranz.
Abstract
The comparative analysis of plant gene families in a phylogenetic framework has greatly accelerated due to advances in next-generation sequencing. In this study, we provide an evolutionary analysis of the L-type lectin receptor kinases and L-type lectin domain proteins (L-type LecRKs and LLPs), which are considered components of plant immunity, in the plant family Brassicaceae and related outgroups. We combine several lines of evidence (sequence homology, HMM-driven protein domain annotation, phylogenetic analysis, and gene synteny) for large-scale identification of L-type LecRK and LLP genes within nine core-eudicot genomes. We show that both polyploidy and local duplication events (tandem duplication and gene transposition duplication) have played a major role in L-type LecRK and LLP gene family expansion in the Brassicaceae. We also find significant differences in rates of molecular evolution based on the mode of duplication. Additionally, we show that LLPs share a common evolutionary origin with L-type LecRKs and provide a consistent gene family nomenclature. Finally, we demonstrate that the largest and most diverse L-type LecRK clades are lineage-specific. Our evolutionary analyses of these plant immune components provide a framework to support future plant resistance breeding.
Keywords: Brassicaceae; L-type lectin receptor kinases; comparative genomics; gene duplication; plant innate immunity; polyploidy
Year: 2015 PMID: 25635042 PMCID: PMC5322546 DOI: 10.1093/gbe/evv020
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Phylogeny and classification of A. thaliana L-type LecRKs and LLPs. (A) Phylogeny of 43 full-length L-type LecRKs and 11 LLPs in A. thaliana. We identified two LLP clades: LecPs (lacking transmembrane domains) and LecRPs (with transmembrane domains), which are highlighted in dark gray and ochre, respectively. Color-coding was adopted according to Bouwmeester and Govers (2009). TD events are indicated by light blue stars. The tree was rooted using the A. thaliana G-type LecRKs CES101 and ARK1, the C-type LecRK AT1G52310, and the Wall-associated kinases WAK1 and PERK1. Clade-support bootstrap values range from 0.80 to 0.94. (B) Clade assignment of 309 LecRKs identified across nine analyzed genome annotations. Colors represent the nine clades originally described by Bouwmeester and Govers (2009). “A” refers to ambiguous genes (singletons).
Classification of LLP Loci in A. thaliana Including Information on Encoded Proteins
| Proposed LLP Clade Classification | Proposed Gene Name | Locus | Tandem Duplicate? | Gene Length (bp) | Uniprot Accession | Protein Length (AA) | Signal Peptide | No. of TM Motifs | Domain Configuration | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| LecRP | | AT3G09035 | yes | 1017 | Q3EBA4 | 338 | yes | 1 | L-type lectin-TM | |
| LecRP | | AT3G09190 | yes | 1038 | Q9SS71 | 345 | yes | 1 | L-type lectin-TM | This manuscript |
| LecRP | | AT3G54080 | no | 1053 | Q9M395 | 350 | yes | 1 | L-type lectin-TM | |
| LecRP | | AT5G01090 | no | 1062 | Q9LFC7 | 353 | yes | 1 | L-type lectin-TM | |
| LecP | | AT1G07460 | no | 777 | Q4PT39 | 258 | no | 0 | L-type Lectin | |
| LecP | | AT1G53060 | yes | 729 | Q9LNN1 | 242 | no | 0 | L-type Lectin | |
| LecP | | AT1G53070 | yes | 819 | Q9LNN2 | 272 | yes | 0 | L-type Lectin | |
| LecP | | AT1G53080 | yes | 852 | Q9LNN3 | 283 | yes | 0 | L-type Lectin | |
| LecP | | AT3G15356 | no | 816 | Q9LJR2 | 271 | yes | 0 | L-type Lectin | |
| LecP | | AT3G16530 | no | 831 | Q9LK72 | 276 | yes | 0 | L-type Lectin | |
| LecP | | AT5G03350 | no | 825 | Q9LZF5 | 274 | yes | 0 | L-type Lectin | |

(a) Alias SAI-LLP1 (Armijo et al. 2013).
Classification of L-type LecRKs and LLPs identified in nine plant species. (A) Domain composition of 309 L-type LecRKs and 84 LLPs across Brassicaceae, Brassicales, T. cacao, and V. vinifera. L-type LecRKs containing two kinase domains are present in all analyzed species except C. papaya. Note that T. cacao lacks LecPs. (B) Cladogram based on the legume-like lectin domains of Brassicaceae L-type LecRKs from A. thaliana, A. lyrata, B. rapa, Th. halophila, and Ae. arabicum. Further included are 63 legume-like lectin domain sequences from four other families: Ta. hasslerania (Cleomaceae), T. cacao (Malvaceae), C. papaya (Caricaceae), and V. vinifera (Vitaceae), with support values indicated on key nodes. Number-only IDs refer to expressed genes present in the “Araly1” annotation (A. lyrata). The phylogenetic tree was rooted with the extracellular domains of the G-type LecRKs CES101 and ARK1, the C-type LecRK AT1G52310, and the Wall-associated kinases WAK1 and PERK1 as outgroup sequences. Clade-support bootstrap values range from 0.70 to 0.95. For all species, the L-type LecRKs cluster into nine distinct clades (colored) corresponding to the clade assignment of the A. thaliana L-type LecRKs, including those without clear affiliation to a distinct clade (ambiguous). Symbols placed on nodes represent the different duplication modes: At-α WGD event (orange circles), At-α ohnologs subjected to TD (TD-α genes) (orange circle with black square), TD event (light blue stars), gene transposition duplicates (black triangle), and more ancient polyploidy events: At-β (blue square) and At-γ (green circles). Symbols mark last common duplication events. Six of nine clades are specific to Brassicaceae, Cleomaceae, and Caricaceae, whereas the rest of the clades are shared between Brassicales and Vitales. Ambiguous LecRKs are spread across the tree and across the families.
Duplicate LLP Gene Pairs in Arabidopsis thaliana and Mode of Duplication
| Duplicate 1 AGI | Duplicate 1 Name | Duplicate 2 AGI | Duplicate 2 Name | Duplication Mode |
|---|---|---|---|---|
| AT1G53060 | | AT1G53070 | | TD |
| AT1G53070 | | AT1G53080 | | TD |
| AT1G53080 | | AT1G53070 | | TD |
| AT3G09035 | | AT3G09190 | | GTD |
| AT3G09190 | | AT3G09035 | | GTD |
| AT3G15356 | | AT3G16530 | | GTD |
| AT3G16530 | | AT3G15356 | | GTD |
| AT5G03350 | | AT3G15356 | | GTD |
| AT3G54080 | | AT5G01090 | | Ohnolog |
| AT5G01090 | | AT3G54080 | | Ohnolog |
| AT1G07460 | | AT2G29220 | | Tandem and ohnolog (TD-α) |
Ohnolog Duplicate Fractions among Genes Encoding a Protein with an L-type Lectin Domain
| Species | Genome-Wide Number of Genes | Genome-Wide Ohnolog Fraction (%) | L-type LecRKs | LLPs | Sum | Ohnolog Fraction among Genes Encoding an L-type Lectin Domain (%) | Enrichment (a) |
|---|---|---|---|---|---|---|---|
| | 27,416 | 22 | 45 | 11 | 56 | 29 | Yes |
| | 32,670 | 28 | 50 | 16 | 66 | 35 | Yes |
| | 40,367 | 53 | 59 | 21 | 80 | 40 | No |
| | 25,191 | 32 | 35 | 8 | 43 | 40 | Yes |
| | 22,230 | 29 | 20 | 5 | 25 | 56 | Yes |
| | 31,580 | 48 | 22 | 9 | 31 | 48 | No |
| | 27,793 | 7 | 14 | 4 | 18 | 11 | No |
| | 29,452 | 32 | 38 | 2 | 40 | 33 | No |
| | 23,092 | 22 | 26 | 8 | 34 | 38 | Yes |
| Σ/Average | | 30 | 309 | 84 | 393 | 37 | Yes |

(a) According to Fisher’s exact test (P < 0.01).
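The enrichment call in the table above rests on Fisher's exact test. The following is a minimal sketch of a one-sided version of that test, written as a hypergeometric upper tail. It is not the authors' pipeline: the function names are hypothetical, the gene counts are reconstructed from the rounded percentages in two table rows, and the genome-wide gene set is assumed to include the lectin-domain genes. Because of the rounding, recomputed P values need not reproduce every "Yes"/"No" entry.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def enrichment_p_value(total: int, total_hits: int, subset: int, subset_hits: int) -> float:
    """One-sided Fisher's exact test: P(X >= subset_hits) when `subset` genes
    are drawn from `total` genes of which `total_hits` are ohnologs."""
    log_denominator = log_comb(total, subset)
    upper = min(total_hits, subset)
    p = 0.0
    for k in range(subset_hits, upper + 1):
        p += exp(
            log_comb(total_hits, k)
            + log_comb(total - total_hits, subset - k)
            - log_denominator
        )
    return min(p, 1.0)


def counts_from_row(n_genes: int, genome_pct: float, n_lectin: int, lectin_pct: float):
    """Assumption: approximate counts recovered from rounded percentages."""
    return (
        n_genes,
        round(n_genes * genome_pct / 100),
        n_lectin,
        round(n_lectin * lectin_pct / 100),
    )


# Two rows of the table: (genes, genome-wide ohnolog %, lectin genes, lectin ohnolog %)
row_high = counts_from_row(22230, 29, 25, 56)
row_low = counts_from_row(27793, 7, 18, 11)

p_high = enrichment_p_value(*row_high)
p_low = enrichment_p_value(*row_low)

print(f"56% vs 29% genome-wide: P = {p_high:.4f}")
print(f"11% vs 7% genome-wide:  P = {p_low:.4f}")
```

Under these assumptions the first row falls below the P < 0.01 threshold and the second does not, in line with the "Yes" and "No" entries for those two rows.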
Tandem Duplicate Fractions among Genes Encoding a Protein with an L-type Lectin Domain
| Species | Genes Encoding an L-Type Lectin Domain | Number of Tandem Duplicates | Fraction of Tandem Duplicates (%) | Number of Tandem Arrays (a) | Average Size of Arrays | Number of Genes in Largest Array |
|---|---|---|---|---|---|---|
| | 56 | 31 | 55 | 10 | 3.1 | 6 |
| | 66 | 19 | 29 | 8 | 2.4 | 4 |
| | 80 | 34 | 43 | 11 | 3.1 | 10 |
| | 43 | 19 | 44 | 7 | 2.7 | 5 |
| | 25 | 0 | 0 | 0 | 0 | 0 |
| | 31 | 10 | 32 | 4 | 2.5 | 4 |
| | 14 | 3 | 16 | 1 | 3.0 | 3 |
| | 40 | 27 | 68 | 8 | 3.4 | 7 |
| | 34 | 11 | 32 | 5 | 2.2 | 3 |
| Σ/Average | 389 | 154 | 40 | 54 | 2.9 | 4.57 |

(a) Tandem array refers to a locus containing one distinct cluster of tandemly arrayed genes.
(b) Early-build genome annotation.
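The summary row of the tandem duplicate table can be recomputed from the per-species counts. The short sketch below does this for the totals, the overall tandem fraction, and the mean array size. Variable names are illustrative, and the definitions used (fraction = tandem duplicates / genes, mean array size = tandem duplicates / arrays) are assumptions inferred from the column headers.

```python
# Per-species rows: (genes with an L-type lectin domain, tandem duplicates, tandem arrays)
rows = [
    (56, 31, 10),
    (66, 19, 8),
    (80, 34, 11),
    (43, 19, 7),
    (25, 0, 0),
    (31, 10, 4),
    (14, 3, 1),
    (40, 27, 8),
    (34, 11, 5),
]

total_genes = sum(genes for genes, _, _ in rows)
total_tandem = sum(tandem for _, tandem, _ in rows)
total_arrays = sum(arrays for _, _, arrays in rows)

# Assumed definitions, inferred from the column headers
fraction_pct = round(100 * total_tandem / total_genes)
mean_array_size = round(total_tandem / total_arrays, 1)

print(total_genes, total_tandem, total_arrays, fraction_pct, mean_array_size)
```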
Venn diagrams illustrating genome-wide average and L-type LecRK gene duplication fractions. Tandem duplicates (red), ohnolog duplicates (green), and gene transposition duplicates (blue). (A) Duplicates among all protein-coding genes present in the A. thaliana genome. (B) Duplicates among all L-type LecRKs present in the A. thaliana genome.
Molecular Evolution Rates Following Different Modes of LecRK Duplication
| Duplication Mode | Ka | Ks | Ka/Ks |
|---|---|---|---|
| Gene transposition duplicates | 2.6 | 2.78 | 0.94 |
| Ohnolog duplicates | 2.98 | 2.68 | 1.11 |
| Tandem and ohnolog duplicates (TD-α genes) | 2.58 | 2.29 | 1.13 |
| Tandem duplicates | 2.72 | 2.42 | 1.23 |
Note.—Ka, nonsynonymous substitutions per nonsynonymous site; Ks, synonymous substitutions per synonymous site.
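As a reading aid for the rates above, the sketch below recomputes Ka/Ks from the tabulated Ka and Ks values and labels each ratio with the conventional interpretation (below 1 suggests purifying selection, above 1 suggests positive selection). The function names are hypothetical, and the labels are a rule of thumb, not a significance test. Note that the ratio of the tabulated values for tandem duplicates comes out at 1.12 and not the listed 1.23; one possible explanation, which is an assumption here, is that the table reports averages of per-pair ratios and not ratios of averages.

```python
# Tabulated values: duplication mode -> (Ka, Ks, tabulated Ka/Ks)
rows = {
    "Gene transposition duplicates": (2.60, 2.78, 0.94),
    "Ohnolog duplicates": (2.98, 2.68, 1.11),
    "Tandem and ohnolog duplicates (TD-alpha genes)": (2.58, 2.29, 1.13),
    "Tandem duplicates": (2.72, 2.42, 1.23),
}


def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Conventional rule-of-thumb label for a Ka/Ks ratio."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


mismatches = []
for mode, (ka, ks, tabulated) in rows.items():
    ratio = ka_ks_ratio(ka, ks)
    print(f"{mode}: Ka/Ks = {ratio:.2f} ({selection_regime(ratio)})")
    # Flag rows where the recomputed ratio differs from the tabulated one
    if abs(round(ratio, 2) - tabulated) > 0.005:
        mismatches.append(mode)

print("Rows where recomputed and tabulated ratios differ:", mismatches)
```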
Analysis of divergence of L-type LecRKs based on mode of gene duplication in A. thaliana. (A) Molecular evolution rates of L-type LecRK gene pairs based on Ka/Ks values following TD (red), GTD (blue), divergence of ohnologs due to WGD (green), and divergence of ohnologs that have been subjected to TD (TD-α genes) (ochre). (B) Divergence of duplicate gene coding sequence length following the aforementioned duplication modes with identical color-coding.