Martijn F Schenk, Ludovicus JWJ Gilissen, Gerhard D Esselink, Marinus JM Smulders.
Abstract
BACKGROUND: Pollen of the European white birch (Betula pendula, syn. B. verrucosa) is an important cause of hay fever. The main allergen is Bet v 1, a member of the pathogenesis-related class 10 (PR-10) multigene family. To establish the number of PR-10/Bet v 1 genes and the isoform diversity within a single tree, PR-10 genes were PCR-amplified, cloned and sequenced from two diploid B. pendula cultivars and one interspecific tetraploid Betula hybrid. Sequences were attributed to putative genes based on sequence identity and intron length. Information on transcription was derived by comparison with homologous cDNA sequences available in GenBank/EMBL/DDBJ. PCR cloning of multigene families carries a high risk of PCR recombination artifacts. We screened for and excluded these artifacts, and also detected putative artifact sequences among database sequences.
Year: 2006 PMID: 16820045 PMCID: PMC1552068 DOI: 10.1186/1471-2164-7-168
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Phylogenetic profiles for detection of recombination. Phylogenetic profiles of the sequences from B. pendula 'Long Trunk' obtained after a PCR of (a) 30 cycles (n = 72 sequences) and (b) 20 cycles (n = 53). (c) Phylogenetic profile of the GenBank PR-10 sequences from B. pendula (n = 66). The x-axis represents the sequence position (5'-3', including only informative positions). The y-axis indicates the phylogenetic correlation. Low values are indicative of recombination [27]. Low values at the edges are artifacts of the employed method.
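The idea of a per-position correlation profile can be illustrated with a small sketch. It is not the program cited as [27], whose details are not given here. It assumes a simplified scheme: for one target sequence, distances to all other sequences are computed in a window to the left and in a window to the right of each position, and the two distance vectors are compared by Pearson correlation. The function names, the Hamming distance, and the window size are all assumptions made for illustration.

```python
from statistics import mean


def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def pearson(x, y):
    """Pearson correlation; None when either vector has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def correlation_profile(alignment, target, window=10):
    """Return (position, correlation) pairs for one target sequence.

    alignment: list of equal-length aligned sequences (assumed input format).
    target:    index of the sequence being profiled.
    window:    number of columns taken on each side (assumed value).
    """
    others = [i for i in range(len(alignment)) if i != target]
    t = alignment[target]
    profile = []
    for pos in range(window, len(t) - window + 1):
        # Distances from the target to every other sequence, left of pos.
        left = [hamming(t[pos - window:pos], alignment[j][pos - window:pos])
                for j in others]
        # The same distances, computed right of pos.
        right = [hamming(t[pos:pos + window], alignment[j][pos:pos + window])
                 for j in others]
        profile.append((pos, pearson(left, right)))
    return profile
```

In this toy scheme, a chimeric sequence that resembles one parent on the left of a breakpoint and another parent on the right yields a strongly negative correlation at the breakpoint, which matches the caption's statement that low values point to recombination. Positions closer than one window to either end are not evaluated at all.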
Cloned and sequenced PR-10 sequences from B. pendula. Overview of the individual clones of PR-10 sequences from B. pendula 'Schneverdinger Goldbirke', 'Tristis', and 'Long Trunk'. Different primers (BpI, BpII) were used. The number of cycles varied between 22 and 30. Confirmed sequences are found in multiple independent PCRs. Unique sequences differ by at least three base pairs from any other sequence from the same cultivar. The remaining sequences are either recombination artifacts or presumably result from base mis-incorporations. The number of alleles included for further analysis is also indicated.
| Primer combination | BpI | BpII | BpI | No. of different |
| Confirmed sequences | 43 (86%) | 50 (69%) | 24 (45%) | 20 |
| Unique sequences | 2 (4%) | 5 (7%) | 5 (9%) | 8 |
| Recombination artifacts | 1 (2%) | 8 (11%) | 18*1 (34%) | |
| Base mis-incorporation artifacts | 4 (8%) | 9 (13%) | 6 (11%) | |
| Total no. of clones | 50 | 72 | 53 | 28 |
| Primer combination | BpI | BpII | BpI | No. of different |
| Confirmed sequences | 17 (77%) | 20 (53%) | 26 (54%) | 11 |
| Unique sequences | 1 (5%) | 5 (13%) | 0 (-) | 4 |
| Recombination artifacts | 2 (9%) | 2 (5%) | 22*2 (46%) | |
| Base mis-incorporation artifacts | 2 (9%) | 11 (29%) | 0 (-) | |
| Total no. of clones | 22 | 38 | 48 | 15 |
| Primer combination | BpI | BpII | BpI | No. of different |
| Confirmed sequences | 21 (53%) | 13 (43%) | 21 (43%) | 10 |
| Unique sequences | 0 (-) | 3 (10%) | 3 (6%) | 4 |
| Recombination artifacts | 12 (30%) | 8 (27%) | 15 (29%) | |
| Base mis-incorporation artifacts | 7 (18%) | 6 (20%) | 11 (22%) | |
| Total no. of clones | 40 | 30 | 51 | 14 |
*1 one sequence had both a recombination and a base mis-incorporation artifact
*2 two sequences had both a recombination and a base mis-incorporation artifact
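The classification rules stated in the table caption can be sketched in code. This is a minimal illustration, not the authors' procedure: the `Clone` record, the function names, and the assumption that all sequences are already aligned to equal length are hypothetical. The sketch separates confirmed and unique sequences from the remainder only. It does not try to tell recombination artifacts apart from base mis-incorporations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    pcr_id: str    # hypothetical label of the independent PCR
    sequence: str  # aligned sequence; all clones share one length


def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_clones(clones, min_diff=3):
    """Label each distinct sequence from one cultivar.

    confirmed: found in at least two independent PCRs.
    unique:    found in one PCR only, but at least `min_diff` base pairs
               away from every other distinct sequence.
    putative artifact: everything else.
    """
    pcrs_by_seq = defaultdict(set)
    for clone in clones:
        pcrs_by_seq[clone.sequence].add(clone.pcr_id)

    distinct = list(pcrs_by_seq)
    labels = {}
    for seq in distinct:
        if len(pcrs_by_seq[seq]) >= 2:
            labels[seq] = "confirmed"
        elif all(hamming(seq, other) >= min_diff
                 for other in distinct if other != seq):
            labels[seq] = "unique"
        else:
            labels[seq] = "putative artifact"
    return labels
```

A sequence seen once that differs from a confirmed sequence by a single base is labelled a putative artifact here, which corresponds to the caption's "remaining sequences" category.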
Figure 2. Bayesian phylogenetic tree of the PR-10 sequences from B. pendula 'Schneverdinger Goldbirke' (Sv), 'Tristis' (Tr), and the B. pendula alleles from 'Long Trunk' (Lt). The 'Long Trunk' alleles that belong to the unknown parental species are not included in this figure. Numbers on the branches represent posterior probabilities after running a Markov chain Monte Carlo search for 1,000,000 generations. Sequences of PR-10 genes from Malus domestica (apple, X83672, Z72425, Z72427), Prunus armeniaca (apricot, AF020784), P. avium (cherry, U66076), and Pyrus communis (pear, AF057030) were used as the outgroup. Each cluster that is identified as a putative gene has at most two alleles per cultivar. Genes are classified into five major groups. The intron length is indicated on the right. If multiple introns of the same length exist within one group, the different types are shown in brackets. *1 PR-10.03B02.01 from 'Tristis' was an in vivo recombination of the PR-10.03D gene and the original PR-10.03B gene.
Classification and nomenclature of B. pendula PR-10 sequences from our cultivars and GenBank. Indicated are the subfamily (I to V), gene designations (PR-10.01A to PR-10.05), the allergen designation if the genes are known to be pollen-expressed (Bet v 1.01A to Bet v 1.02C), and allele names as defined in Figure 2. Known isoforms [19] are shown, followed in brackets by the GenBank accession number. The tissue of origin is shown for mRNA-derived GenBank sequences (L = leaves, R = roots, P = pollen).
| Sub-family | Gene | Allergen | Allele (GenBank no.)*1 | Known isoforms (GenBank no.)*1 | Location of transcription | Sequence identity to reference sequence*2 |
| I | PR-10.01A | Bet v 1.01A | Bet v 1.01A01.01*3 ( | Bet v 1a = Bet v 1.0101 ( | P, P | 100% |
| | | | Bet v 1.01A02.01 ( | | - | 99.1% |
| | | | Bet v 1.01A03.01 ( | | - | 99.1% |
| | | | | Bet v 1.1501 ( | - | 99.8% |
| | | | | Bet v 1.1502 ( | - | 99.8% |
| | | | | Bet v 1.0102 ( | P | 99.8% |
| | | | | Bet v 1.0103 ( | P | 99.8% |
| | | | | Bet v 1.2501 ( | P | 98.8% |
| | | | | Bet v 1.2801 ( | P | 99.5% |
| | | | | Bet v 1.3001 ( | P | 99.8% |
| | | | | - ( | - | 98.6% |
| | | | | - ( | P | 99.8% |
| | | | | - ( | P | 99.3% |
| | | | | - ( | P | 99.8% *4 |
| | | | | - ( | P | 98.6% |
| | | | | - ( | P | 99.5% *4 |
| | PR-10.01B | Bet v 1.01B | Bet v 1.01B01.01 ( | - ( | P, P | 100% |
| | | | | Bet v 1d = Bet v 1.0401 ( | P | 99.3% |
| | | | | Bet v 1h = Bet v 1.0402 ( | P | 99.8% |
| | | | | - ( | P | 99.8% |
| | PR-10.01C | Bet v 1.01C | Bet v 1.01C01.01 ( | | - | 100% |
| | | | Bet v 1.01C02.01 ( | | - | 99.8% |
| | | | | Bet v 1f = Bet v 1.0601 ( | P | 99.8% |
| | | | | Bet v 1i = Bet v 1.0602 ( | P | 99.8% |
| | | | | - ( | P | 99.5% |
| | PR-10.01D | Bet v 1.01D | Bet v 1.01D01.01 ( | - ( | P | 100% |
| | | | Bet v 1.01D02.01 ( | | - | 99.1% |
| | | | | Bet v 1.1701 ( | - | 98.4% |
| | | | | - ( | - | 98.4% |
| II | PR-10.02A | Bet v 1.02A | Bet v 1.02A01.01 ( | | - | 100% |
| | | | Bet v 1.02A02.01 ( | | - | 99.8% |
| | | | | - ( | P | 99.8% |
| | | | | - ( | P | 99.8% |
| | PR-10.02B | Bet v 1.02B | Bet v 1.02B01.01 ( | | - | 100% |
| | | | Bet v 1.02B02.01a*5 ( | | - | 99.3% |
| | | | Bet v 1.02B02.01b ( | | - | 99.3% |
| | | | | Bet v 1.1801 ( | - | 99.1% |
| | PR-10.02C | Bet v 1.02C | Bet v 1.02C02.01 ( | Bet v 1k = Bet v 1.0901 ( | P | 100% |
| | | | Bet v 1.02C02.02 ( | | - | 99.8% |
| | | | Bet v 1.02C01.01 ( | | - | 99.8% |
| | | | | Bet v 1.20101 ( | - | 99.3% |
| | | | | Bet v 1c = Bet v 1.0301 ( | P | 99.8% |
| | | | | Bet v 1.1901 ( | - | 98.6% |
| | | | | Bet v 1m = Bet v 1.1401 ( | P | 99.3% |
| | | | | Bet v 1n = Bet v 1.1402 ( | P | 99.3% |
| III | PR-10.03A | - | PR-10.03A01.01 ( | | - | 100% |
| | | | PR-10.03A02.01 ( | | - | 99.8% |
| | PR-10.03B | - | PR-10.03B01.01 ( | | - | 100% |
| | | | PR-10.03B-p01*5 ( | | - | 100% *4 |
| | PR-10.03B*6 | | PR-10.03B02.01 ( | | - | 96.5% |
| | PR-10.03C | - | PR-10.03C01.01 ( | - ( | - | 100% |
| | | | PR-10.03C02.01 ( | | - | 99.5% |
| | | | PR-10.03C02.02 ( | | - | 99.3% |
| | | | | Bet v 1.1201 ( | L, R | 99.5% |
| | PR-10.03D | - | PR-10.03D01.01 ( | | - | 100% |
| | | | PR-10.03D02.01 ( | | - | 99.8% |
| | | | | Bet v 1.1101 ( | L, R | 99.5% |
| IV | PR-10.04 | - | PR-10.0401.01 ( | | - | 100% |
| | | | PR-10.0402.01 ( | | - | 98.9% |
| V | PR-10.05 | - | PR-10.0501.03 ( | Bet v 1.1301 ( | L, R | 100% |
| | | | PR-10.0501.02 ( | | - | 99.5% |
| | | | PR-10.0501.01 ( | | - | 99.8% |
| | | | PR-10.0501.04 ( | | - | 99.8% |
| | | | PR-10.0502.01 ( | | - | 99.8% |
*1 The known mRNA-derived GenBank sequences contain no intron, whereas the new gDNA sequences do, which aids gene identification.
*2 The uppermost allele was taken as the reference sequence; identities are calculated for an aligned stretch of 425 bp, from base 28 to 452 of the consensus.
*3 The last two numerals indicate silent mutations (see Results section for further explanation of the nomenclature).
*4 These sequences contain an indel; sequence identity is calculated excluding the indel.
*5 Pseudogene allele.
*6 In vivo recombination.
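The identity calculation described in footnotes *2 and *4 can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the function name, the 1-based inclusive coordinates, the `-` gap character, and the choice to drop gapped columns from both the match count and the denominator are all assumptions.

```python
def percent_identity(reference, query, start=28, end=452, gap="-"):
    """Percent identity over alignment columns start..end (1-based, inclusive).

    Columns where either sequence carries a gap are skipped, which is one
    possible reading of "calculated excluding the indel".
    """
    if len(reference) != len(query):
        raise ValueError("sequences must come from the same alignment")
    if not 1 <= start <= end <= len(reference):
        raise ValueError("window lies outside the alignment")

    matches = 0
    compared = 0
    for r, q in zip(reference[start - 1:end], query[start - 1:end]):
        if r == gap or q == gap:
            continue  # indel column: excluded from the comparison
        compared += 1
        if r == q:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the window")
    return 100.0 * matches / compared
```

With the default window, 425 columns are compared, so a single mismatch gives 424/425, which rounds to 99.8%, and two mismatches give 423/425, which rounds to 99.5%. Both values appear repeatedly in the identity column of the table.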
Figure 3. Amino acid sequences, amino acids that affect IgE-reactivity, and T-cell epitopes of the PR-10 proteins. Amino acid sequences of the PR-10 proteins from B. pendula 'Tristis' (Tr), 'Schneverdinger Goldbirke' (Sv), and the B. pendula alleles from 'Long Trunk' (Lt). Amino acids associated with high allergenicity are marked with grey boxes and those associated with low IgE-reactivity (located within B-cell epitopes) are marked with black boxes [12, 14]. The locations of the two major T-cell activating regions are indicated above the consensus [22].