Laura Näätsaari, Florian W Krainer, Michael Schubert, Anton Glieder, Gerhard G Thallinger.
Abstract
BACKGROUND: Horseradish peroxidases (HRPs) from Armoracia rusticana have long been used as reporters in various diagnostic assays and histochemical stainings. Despite their increasing importance in the life sciences and their suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and their enzymatic properties are poorly characterized. Because sequence information on the natural isoenzymes is lacking and HRP expression levels in heterologous hosts are low, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana.
Year: 2014 PMID: 24666710 PMCID: PMC3987668 DOI: 10.1186/1471-2164-15-227
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Overview of the sequencing and assembly of the transcriptome. (A) Size distribution of the quality-filtered reads. Total number of reads: 556269; average/median length: 342.9/379.0. (B) Length distribution of the 18511 contigs. Average/median length of the contigs: 717.7/667.0. (C) Coverage distribution of the 18511 contigs. Average/median coverage of the contigs: 11.4/7.8. (D) Length distribution of the 14871 isotigs. Average/median length of the isotigs: 1133.3/1113.0.
Summary of the transcriptome sequencing, assembly and enzyme discovery results
| Item | Count | % |
| Total number of reads | 592,507 | - |
| Clipped reads | 556,269 | - |
| Reads after Newbler quality control | 543,439 | 100.0 |
| Reads aligned | 490,285 | 90.2 |
| Reads assembled | 433,179 | 88.4 |
| Reads partially assembled | 57,064 | 10.5 |
| Singletons | 35,950 | 7.3 |
| Contigs | 18,511 | - |
| Isogroups | 10,619 | - |
| Isotigs | 14,871 | - |
| Average isotig length | 1,133 | - |
| Largest isotig length | 3,659 | - |
| Average contig coverage | 11.4 | - |
| # of isotigs with a secretory peroxidase domain | 18 | - |
| # of full length peroxidase genes | 18 | - |
| # of peroxidase genes after manual revision | 20 | - |
| Total # of isoenzymes (including allelic variants) | 28 | 100 |
| Successfully verified from gDNA | 26 | 89.7 |
| Synthetic genes for production | 26 | 89.7 |
| Successful production of an active isoenzyme | 22 | 75.9 |
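The table does not state which read count each percentage refers to. The following minimal sketch reproduces the listed read percentages under one assumption: "aligned" and "partially assembled" are taken relative to the reads after Newbler quality control, while "assembled" and "singletons" are taken relative to the aligned reads. These denominators are inferred from the arithmetic, not confirmed by the source, and the names used are illustrative only.

```python
# Read counts copied from the summary table.
READS_AFTER_QC = 543_439
READS_ALIGNED = 490_285
READS_ASSEMBLED = 433_179
READS_PARTIALLY_ASSEMBLED = 57_064
SINGLETONS = 35_950


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumed denominators (inferred, not stated in the source):
# quality-controlled reads for "aligned" and "partially assembled",
# aligned reads for "assembled" and "singletons".
read_percentages = {
    "aligned": percent(READS_ALIGNED, READS_AFTER_QC),
    "assembled": percent(READS_ASSEMBLED, READS_ALIGNED),
    "partially_assembled": percent(READS_PARTIALLY_ASSEMBLED, READS_AFTER_QC),
    "singletons": percent(SINGLETONS, READS_ALIGNED),
}

for name, value in read_percentages.items():
    print(f"{name}: {value}")
```

Under these assumptions the computed values match the 90.2, 88.4, 10.5 and 7.3 given in the table.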
Comparison of the HRP isoenzyme sequences between GenBank, UniProt, transcriptome and verified genome sequences
| C1A | TA | Y37 | - | - | AT | I37 | Y37 |
| 109-110 | 109-110 | ||||||
| C1159 | intron | - | intron | G1159 | intron | - | |
| C1B | T/C253 | intron | - | intron | T253 | intron | - |
| T/C859 | intron | - | intron | C859 | intron | - | |
| C1C | ss | ss | ss | ss | * | * | * |
| nt1-60 | aa1-20 | nt1-60 | aa1-20 | ||||
| C178 | R60 | C178 | R60 | A118 | S40 | S40 | |
| A/T1335 | intron | - | intron | - | - | - | |
| A/G1888 | T/A165 | A/G493 | T/A165 | G433 | A145 | A145 | |
| C1889 | A165 | C/T1889 | A/V165 | | | | |
| C/G1921 | Q/E176 | C/G526 | Q/E176 | G466 | E156 | E156 | |
| C2 | CT | intron | - | intron | * | intron | - |
| 1250-1251 | |||||||
| A1334 | intron | - | intron | * | intron | - | |
| C3 | G/T1294 | intron | - | intron | G1294 | intron | - |
| A/T1323 | intron | - | intron | A1323 | intron | - | |
| T/C1484 | L231 | - | - | T1484 | L231 | L231 | |
| C/T1541 | F250 | - | - | C1541 | F250 | F250 | |
| A2 | ss | ss | ss | ss | - | - | * |
| nt1-93 | aa1-31 | 1-93 | aa1-31 | ||||
| AAT | N78 | AAT | N47 | - | - | D47 | |
| 231-234 | 231-234 | ||||||
| GGA | G220 | GGA | G220 | - | - | N189 | |
| 996-998 | 661-663 | ||||||
| AAT | N221 | AAT | N221 | - | - | G190 | |
| 999-1001 | 664-666 | ||||||
| ACG | T284 | ACG | T284 | - | - | L253 | |
| 1185-1187 | 850-852 | ||||||
| G/A1203 | A/T290 | G868 | A290 | - | - | A259 | |
| AAT | N334 | AAT | N334 | - | - | D303 | |
| 1335-1337 | 999-1002 | ||||||
| E5 | ss | ss | ss | ss | - | - | * |
| nt1-81 | aa1-27 | nt1-81 | aa1-27 | ||||
| T419 | L82 | C/T246 | L82 | - | - | L55 | |
| C422 | D83 | T/C249 | D83 | - | - | D56 | |
| C545 | C124 | T/C372 | C124 | - | - | C97 | |
| 01805 | None | none | none | none | - | - | - |
| 22684 | G1611 | R337 | A1010 | K337 | - | - | - |
| TGA | D343 | CGG | G343 | - | - | - | |
| 1627-1629 | 1026-1028 | ||||||
| 01350 | None | none | none | none | - | - | - |
| 02021 | None | none | none | none | - | - | - |
| 03523 | None | none | none | none | - | - | - |
| 06117 | T30 | V10 | C/T30 | V10 | - | - | - |
| C1088 | I269 | T807 | I269 | - | - | - | |
| 17517 | T190 | Y64 | C190 | H64 | - | - | - |
| C1157 | G282 | T846 | G282 | - | - | - | |
| A1232 | K307 | G921 | K307 | - | - | - | |
| 08562.1 | None | none | none | none | - | - | - |
| 08562.4 | None | none | none | none | - | - | - |
| 23190 | T1345 | S109 | G1345 | S109 | - | - | - |
| C1423 | G135 | T1423 | G135 | - | - | - | |
| T1842 | S222 | T/C1842 | S/P222 | | | | |
| C1850 | T224 | A/C1850 | T224 | - | - | - | |
| A2221 | E348 | T/A2221 | V/E348 | - | - | - | |
| 04663 | None | none | none | none | - | - | - |
| 06351 | None | none | none | none | - | - | - |
| 05508 | G/A346 | A/T116 | G/A346 | A/T116 | - | - | - |
| 22489 | - | - | G/A597 | T199 | - | - | - |
| . | . | G/T715 | A/S239 | - | - | - | |
All nucleotide (nt) and amino acid (aa) positions were calculated from the start ATG. Differences between the nt positions of the transcriptome sequence and those of the other sequence sources are due to either deletions or intronic sequences in the other sources. “-” indicates that no sequence information is available from the respective source. “ss” indicates a putative signal sequence. Deletions or missing sequences are marked with “*”. “N/NX” indicates a variation at position X. “None” indicates that no polymorphisms or differences between transcriptome and genome sequence were found. The isoenzymes C1A and C3 were detected in the transcriptome raw reads with only partial/low coverage (0-2x); thus, no consensus sequence was formed.
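For intron-free coding coordinates, the relationship between the nt and aa positions in the table can be sketched as follows. The function name `codon_number` is illustrative and not taken from the source. The mapping assumes 1-based positions counted from the A of the start ATG and does not hold for genomic coordinates that include introns.

```python
def codon_number(nt_position: int) -> int:
    """Map a 1-based coding nucleotide position to its 1-based codon index.

    Assumes the position is counted from the A of the start ATG in an
    intron-free coding sequence (for example, a transcript).
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


# Examples taken from intron-free entries of the table.
print(codon_number(178))  # C178 is listed with R60
print(codon_number(433))  # G433 is listed with A145
```

Genomic entries such as C1889/A165 do not follow this mapping because the nt position also counts intronic bases.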
Figure 2. GC content distribution of the isotigs. The GC content of the A. rusticana isotigs ranges from 28% (min) to 62% (max), a range of 35%. The average GC content of all isotigs is 42.72%. Mode (x-axis value): 43; mode value (y-axis value): 2376.
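GC content is the percentage of G and C bases in a sequence. The sketch below shows one way such a value could be computed per isotig. The source does not describe the tool that was actually used, and the handling of ambiguous bases (here excluded from both numerator and denominator) is an assumption. The function name `gc_content` is illustrative.

```python
def gc_content(sequence: str) -> float:
    """Return the GC content of a nucleotide sequence in percent.

    Only unambiguous bases (A, C, G, T) are counted; other symbols
    such as N are ignored. This is an assumption, not the documented
    behaviour of the pipeline used in the study.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc_bases = sum(base in "GC" for base in counted)
    return 100.0 * gc_bases / len(counted)


print(gc_content("ATGGCC"))  # 4 of 6 bases are G or C
```

Applying such a function to every isotig and binning the results by whole percent would give a distribution of the kind shown in the figure.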
Summary of the horseradish peroxidase isoenzymes and associated data produced during this study
| - | C1A | 1062 | 43.69 | HE963800 | 0.812 | 3 | 30 | [11–91] [44-49] [97–301] [177–209] | 5.59 |
| 15901 | C1B | 1056 | 43.94 | HE963801 | 0.808 | 3 | 28 | [11–91] [44-49] [97–301] [177–209] | 5.84 |
| 25148 | C1C* | 1059 | 45.14 | HE963802 | 0.809 | 3 | 29 | [11–91] [44-49] [97–301] [177–209] | 6.49 |
| 25148_2 | C1D* | 1059 | 45.04 | HE963803 | 0.810 | 3 | 29 | [11–91] [44-49] [97–301] [177–209] | 7.04 |
| 04627 | C2 | 1044 | 42.91 | HE963804 | 0.800 | 3 | 24 | [11–91] [44-49] [97–301] [177–209] | 8.56 |
| - | C3 | 1050 | 46.76 | HE963805 | 0.781 | 3 | 29 | [11–91] [44-49] [97–300] [177–209] | 7.71 |
| Manual assembly | A2A* | 1011 | 46.79 | HE963806 | 0.761 | 3 | 31 | [11–91] [44-49] [97–299] [176–208] | 4.93 |
| Manual assembly | A2B* | 1011 | 46.69 | HE963807 | 0.761 | 3 | 31 | [11–91] [44-49] [97–299] [176–208] | 4.93 |
| 04382 | E5 | 1044 | 46.07 | HE963808 | 0.771 | 3 | 27 | [11–91] [44-49] [97–300] [177–209] | 8.84 |
| 01805 | 1805* | 1065 | 44.41 | HE963809 | 0.797 | 3 | 31 | [11–91] [44-49] [97–301] [177–209] | 5.97 |
| 22684 | 22684.1* | 1050 | 46.76 | HE963810 | 0.770 | 3 | 29 | [11–91] [44-49] [97–300] [177–209] | 6.98 |
| 22684_2 | 22684.2* | 1050 | 46.67 | HE963811 | 0.772 | 3 | 29 | [11–91] [44-49] [97–300] [177–209] | 6.37 |
| 01350 | 1350* | 975 | 50.97 | HE963812 | 0.707 | 3 | 28 | [11–91] [44-49] [97–292] [176–201] | 8.67 |
| 02021 | 2021* | 996 | 46.08 | HE963813 | 0.788 | 3 | 29 | [11–89] [44-49] [95–297] | 9.46 |
| 23190 | 23190.1* | 1080 | 49.26 | HE963817 | 0.724 | 2 | 31/42 | [11–92] [44-49] [98–293] [178–205] or [22–103] [55-60] [109–304] [189–216] | 8.40/6.58 |
| 23190_2 | 23190.2* | 1080 | 49.17 | HE963817 | 0.722 | 2 | 31/42 | [11–92] [44-49] [98–293] [178–205] or [22–103] [55-60] [109–304] [189–216] | 8.60/7.09 |
| 04663 | 4663* | 1077 | 47.82 | HE963814 | 0.748 | 3 | 31 | [11–91] [44-49] [97–299] [176–208] | 4.48 |
| 06351 | 6351* | 945 | 43.39 | HE963816 | 0.786 | 3 | 18 | [17–96] [50-55] [102–292] [180–206] | 6.37 |
| 03523 | 3523* | 960 | 44.58 | HE963820 | 0.761 | 0 | 22 | [11–92] [44-49] [98–293] [177–203] | 8.99 |
| 05508 | 5508.1* | 966 | 49.28 | HE963815 | 0.735 | 2 | 24/30 | [11-87] [44-49] [93–287] [171–198] or [17–93] [50-55] [99–293] [177–204] | 8.49/8.47 |
| 05508_2 | 5508.2* | 966 | 49.38 | HE963815 | 0.735 | 2 | 24/30 | [11-87] [44-49] [93–287] [171–198] or [17–93] [50-55] [99–293] [177–204] | 8.49/8.47 |
| 22489_1 | 22489.1* | 978 | 51.02 | HE963818 | 0.726 | 2 | 23/34 | [22–98] [55-60] [104–298] [182–209] or [11-87] [44-49] [93–287] [171–198] | 8.93/8.51 |
| 22489_2 | 22489.2* | 978 | 50.82 | HE963819 | 0.727 | 2 | 23/34 | [22–98] [55-60] [104–298] [182–209] or [11-87] [44-49] [93–287] [171–198] | 8.93/8.51 |
| 06117 | 6117* | 1008 | 47.02 | HE963821 | 0.802 | 3 | 22/32 | [11–91] [44-49] [97–298] [176–208] or [21–101] [54-59] [107–308] [186–218] | 5.52/6.16 |
| 17517_1 | 17517.1* | 972 | 48.46 | HE963822 | 0.737 | 2 | 23/24 | [12–88] [45-50] [94–296] [171–203] or [11-87] [44-49] [93–295] [170–202] | 9.49/9.39 |
| 17517_2 | 17517.2* | 972 | 48.56 | HE963823 | 0.739 | 2 | 23/24 | [12–88] [45-50] [94–296] [171–203] or [11-87] [44-49] [93–295] [170–202] | 9.52/9.41 |
| 08562_1 | 08562.1* | 996 | 47.09 | HE963824 | 0.779 | 3 | 22/28 | [11–91] [44-49] [97–298] [176–208] or [17–97] [50-55] [103–304] [182–214] | 9.01/9.03 |
| 08562_4 | 08562.2* | 996 | 47.79 | HE963825 | 0.788 | 3 | 22/28 | [11–91] [44-49] [97–298] [176–208] or [17–97] [50-55] [103–304] [182–214] | 9.00/9.02 |
The nucleotide sequences of 28 isoenzymes were submitted to EMBL. “*” indicates a novel isoenzyme. “CAI” = codon adaptation index. If signal sequence predictions gave more than one alternative result, both signal sequence lengths are shown, separated by “/”. Disulfide bridges were predicted using both alternatives of the mature protein (EDBCP = Ensemble-based Disulfide Bonding Connectivity Pattern prediction server).
Figure 3. Cladogram of all isoenzymes previously known and discovered during this study. The dendrogram was cut at a branch length of 0.21 and the resulting sub-trees were colored. All previously described isoenzymes are located in the red and light green sub-trees, whereas all novel isoenzymes (except for 01805, 22684.1, 22684.2, and 04663) cluster in distinct sub-trees (black, blue, dark green and orange), indicating a larger sequence diversity.
Summary of isoenzyme expression and characterization
| - | C1A | + | + | + | + |
| 15901 | C1B | - | + | - | - |
| 25148 | C1C* | + | + | (+) | + |
| 25148_2 | C1D* | + | + | (+) | + |
| 04627 | C2 | + | + | + | + |
| - | C3 | + | + | + | + |
| Manual assembly | A2A* | + | + | + | + |
| Manual assembly | A2B* | + | + | (+) | (+) |
| 04382 | E5 | + | + | + | + |
| 01805 | 01805* (B1?) | + | + | - | - |
| 22684 | 22684.1* (B2A?) | + | + | - | - |
| 22684_2 | 22684.2* (B2B?) | + | + | - | - |
| 01350 | 01350* | + | + | - | - |
| 02021 | 02021* | - | - | - | - |
| 23190 | 23190.1* | - | - | - | - |
| 23190_2 | 23190.2* | n.d. | n.d. | n.d. | n.d. |
| 04663 | 04663.1* | + | (+) | - | - |
| 06351 | 06351* | + | + | + | + |
| 03523 | 03523* | - | - | - | - |
| 05508 | 05508.1* | + | + | + | + |
| 05508_2 | 05508.2* | n.d. | n.d. | n.d. | n.d. |
| 22489_1 | 22489.1* | + | + | (+) | + |
| 22489_2 | 22489.2* | + | + | (+) | + |
| 06117 | 06117* | - | - | - | - |
| 17517_1 | 17517.1* | + | - | - | - |
| 17517_2 | 17517.2* | + | + | + | + |
| 08562_1 | 08562.1* | + | + | - | - |
| 08562_4 | 08562.2* | + | + | - | + |
Isoenzymes showing obvious peroxidase activity with the assay used are marked with “+”. Isoenzymes showing very low but detectable peroxidase activity with the assay used are marked with “(+)”. Isoenzymes with no activity detected during an observation period of 2 h are marked with “-”. Allelic variants not produced heterologously are marked with “n.d.” (no data available). Isoenzymes discovered during this study are marked with “*”.