Freddy Boutrot, Nathalie Chantret, Marie-Françoise Gautier.
Abstract
BACKGROUND: Plant non-specific lipid transfer proteins (nsLTPs) are encoded by multigene families and possess physiological functions that remain unclear. Our objective was to characterize the complete nsLtp gene family in rice and arabidopsis and to perform wheat EST database mining for nsLtp gene discovery.
Year: 2008 PMID: 18291034 PMCID: PMC2277411 DOI: 10.1186/1471-2164-9-86
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Table 1. NsLtp genes identified in the Oryza sativa subsp. japonica genome and features of the deduced proteins. Where a deduced protein is identical to that of another family member, the name of that redundant form is given in place of the protein features. A cluster of tandem duplication repeats is indicated by a vertical line before the gene names (see also Figure 1).
| tandem cluster | locus/model | intron (bp) | signal peptide (AA) | mature protein (AA) | mature protein MM | mature protein pI a |
| --- | --- | --- | --- | --- | --- | --- |
| | Os01g12020.1 | 103 | 24 | 99 | 10212 | 4.36 |
| | Os01g60740 b | 86 | 27 | 93 | 9464 | 10.55 |
| | Os03g59380.1 | 94 | 33 | 91 | 9085 | 12.07 |
| | Os05g40010.1 | 372 | 30 | 99 | 9780 | 12.05 |
| | Os06g06340.1 | 100 | 28 | 98 | 10069 | 9.84 |
| | Os06g34840.1 | 2740 | 27 | 120 | 12297 | 3.92 |
| | Os08g03690.1 | 547 | 27 | 93 | 9621 | 9.90 |
| │ | Os11g02330 c | 106 | 27 | 92 | 9336 | 10.25 |
| │ | Os11g02350.1 | 90 | 28 | 93 | 9437 | 10.89 |
| │ | Os11g02379.1 d | 114 | 25 | 91 | 8895 | 11.81 |
| │ | Os11g02379.2 | 89 | 26 | 92 | 8916 | 10.55 |
| │ | Os11g02400.1 | 106 | 26 | 92 | 9031 | 11.50 |
| │ | Os11g02424.2 | 709 | 26 | 92 | 9104 | 11.50 |
| | Os11g24070.1 | 116 | 25 | 92 | 9147 | 12.20 |
| │ | Os12g02290.1 | 133 | 27 | OsLTPI.8 | | |
| │ | Os12g02300.1 | 90 | OsLTPI.9 | | | |
| │ | Os12g02310.1 | 102 | 25 | 92 | 8930 | 10.55 |
| │ | Os12g02320.1 | 138 | 25 | 91 | 8909 | 11.81 |
| │ | Os12g02330.1 | 106 | 26 | OsLTPI.12 | | |
| │ | Os12g02340.1 | 713 | 26 | OsLTPI.13 | | |
| │ | Os01g49640.1 | none | 26 | 77 | 8119 | 11.98 |
| │ | Os01g49650.1 | none | 36 | 76 | 7987 | 11.28 |
| | Os03g02050.1 | none | 20 | 76 | 7549 | 11.90 |
| | Os05g47700.1 | none | 27 | 67 | 7066 | 10.16 |
| | Os05g47730.1 | none | 27 | 69 | 7270 | 10.66 |
| | Os06g49190.1 | none | 27 | 67 | 6967 | 10.64 |
| │ | Os10g36070.1 | none | 26 | 74 | 7613 | 9.84 |
| │ | Os10g36090.1 | none | 26 | 74 | 7659 | 9.84 |
| │ | Os10g36100 e | none | 26 | 75 | 7774 | 9.84 |
| │ | Os10g36110.1 | none | 25 | 75 | 7926 | 9.84 |
| │ | Os10g36160.1 | none | 25 | 69 | 7382 | 7.06 |
| │ | Os10g36170.1 | none | 24 | 67 | 6890 | 11.90 |
| | Os11g40530.1 | none | 36 | 74 | 7665 | 12.14 |
| | Os08g43290.1 | 84 | 26 | 68 | 6744 | 7.84 |
| | Os09g35700.1 | 107 | 26 | 69 | 6839 | 7.84 |
| │ | Os01g68580.1 | none | 29 | 82 | 8908 | 10.65 |
| │ | Os01g68589.1 | none | 25 | 78 | 8291 | 9.90 |
| | Os07g18750.1 | none | 28 | 76 | 8073 | 7.84 |
| | Os07g18990.1 | none | 23 | 81 | 8420 | 9.86 |
| | Os01g62980.1 | 97 | 27 | 91 | 9390 | 12.05 |
| │ | Os04g33920.1 | 290 | 22 | 94 | 9608 | 10.22 |
| │ | Os04g33930.2 | 419 | 26 | 97 | 9940 | 11.28 |
| | Os05g06780.1 | 676 | 24 | 93 | 9497 | 9.69 |
| │ | Os01g58650.1 | 2851 | 20 | 103 | 10909 | 4.48 |
| │ | Os01g58660.1 | 92 | 23 | 89 | 9876 | 9.45 |
| | Os10g05720.2 | 440 | 28 | 81 | 8724 | 9.56 |
| | Os11g29420.1 | 791 | 29 | 96 | 10176 | 6.00 |
| | Os11g37280.1 | 595 | 27 | 105 | 10781 | 5.32 |
| | Os06g49770 f | 221 | 30 | 102 | 9594 | 9.79 |
| | Os03g44000.1 | 1088 | 24 | 109 | 12073 | 9.69 |
| | Os07g27940.1 | 148 | 27 | 107 | 10892 | 11.98 |
| | Os11g34660 g | 825 | 27 | 104 | 11394 | 5.50 |
AA, number of amino acids; MM, molecular mass in Dalton; pI, isoelectric point.
a cysteine residues were not taken into account in the pI calculation.
b using the transcript structure Os01g60740.2.
c annotations curated (strand: +1; exon 1 start: 679124, end: 679473; exon 2 start: 679580, end: 679589).
d annotations curated (strand: +1; exon 1 start: 702105, end: 702445; exon 2 start: 702560, end: 702569).
e annotations curated (strand: +1; exon start: 18974249, end: 18974554).
f annotations curated (strand: +1; exon 1 start: 30113033, end: 30113426; exon 2 start: 30113648, end: 30113652).
g annotations curated (strand: +1; exon 1 start: 19789864, end: 19790209; exon 2 start: 19791035, end: 19791084).
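The curated exon coordinates in footnotes c to g can be checked against the intron column of the table. The sketch below is an illustration only: it assumes the coordinates are 1-based and inclusive (an assumption that is not stated in the source, but that reproduces the tabulated intron lengths), and the function name `intron_lengths` is hypothetical.

```python
def intron_lengths(exons):
    """Return intron lengths (bp) between consecutive exons.

    `exons` is a list of (start, end) pairs, assumed 1-based and inclusive,
    all on the same strand and chromosome.
    """
    ordered = sorted(exons)
    # An intron spans the bases strictly between one exon's end
    # and the next exon's start.
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# Footnote c (Os11g02330): exon 1 679124-679473, exon 2 679580-679589
print(intron_lengths([(679124, 679473), (679580, 679589)]))  # [106]

# Footnote e (Os10g36100): single exon, hence no intron
print(intron_lengths([(18974249, 18974554)]))  # []
```

Under this assumption the computed values agree with the intron lengths listed for the curated rice models (106, 114, 221 and 825 bp, and "none" for the single-exon model).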
Table 2. NsLtp genes identified in the Arabidopsis thaliana genome and features of the deduced proteins. A cluster of tandem duplication repeats is indicated by a vertical line before the gene names (see also Figure 1).
| tandem cluster | locus/model | intron (bp) | signal peptide (AA) | mature protein (AA) | mature protein MM | mature protein pI a |
| --- | --- | --- | --- | --- | --- | --- |
| | At2g15050.2 | 653 | 25 | 90 | 9489 | 12.13 |
| | At2g15325.1 | 127 | 27 | 94 | 10312 | 4.83 |
| | At2g18370.1 | 438 | 24 | 92 | 9092 | 4.36 |
| │ | At2g38530.1 | 111 | 23 | 95 | 9661 | 11.90 |
| │ | At2g38540.1 | none | 25 | 93 | 9281 | 11.50 |
| | At3g08770.1 | 94 | 19 | 94 | 9883 | 9.61 |
| │ | At3g51590.1 | 467 | 24 | 95 | 9945 | 9.61 |
| │ | At3g51600.1 | 107 | 25 | 93 | 9891 | 12.68 |
| | At4g33355.1 | 112 | 28 | 91 | 9514 | 9.86 |
| | At5g01870.1 | 94 | 22 | 94 | 9923 | 10.45 |
| │ | At5g59310.1 | 138 | 23 | 89 | 8854 | 10.76 |
| │ | At5g59320.1 | 94 | 23 | 92 | 9221 | 10.76 |
| │ | At1g43665 b | none | 22 | 75 | 8367 | 9.59 |
| │ | At1g43666.1 | none | 19 | 77 | 8458 | 9.67 |
| │ | At1g43667.1 | none | 21 | 77 | 8488 | 9.59 |
| | At1g48750.1 | none | 26 | 68 | 7258 | 10.74 |
| | At1g66850.1 | none | 24 | 78 | 7970 | 7.12 |
| | At1g73780.1 | none | 29 | 69 | 7674 | 9.67 |
| | At2g14846.1 | none | 21 | 78 | 8386 | 9.69 |
| | At3g12545 c | none | 25 | 64 | 7206 | 10.50 |
| | At3g18280.1 | none | 28 | 68 | 7372 | 12.40 |
| | At3g29105 d | none | 24 | 70 | 7841 | 9.92 |
| | At3g57310.1 | none | 24 | 79 | 8504 | 9.71 |
| │ | At5g38160.1 | none | 24 | 79 | 8309 | 5.43 |
| │ | At5g38170.1 | none | 24 | 79 | 8342 | 7.12 |
| │ | At5g38180.1 | none | 24 | 71 | 8127 | 7.28 |
| │ | At5g38195.1 | none | 24 | 71 | 7718 | 4.40 |
| | At5g07230.1 | 120 | 24 | 67 | 6883 | 4.29 |
| | At5g52160.1 | none | 32 | 64 | 6791 | 4.64 |
| | At5g62080.1 | 315 | 30 | 65 | 6636 | 4.14 |
| │ | At5g48485.1 | none | 26 | 76 | 7974 | 4.25 |
| │ | At5g48490.1 | none | 25 | 76 | 8078 | 4.59 |
| │ | At5g55410.1 | 81 | 30 | 77 | 8544 | 10.35 |
| │ | At5g55450.1 | none | 30 | 74 | 7779 | 9.95 |
| │ | At5g55460.1 | 106 | 32 | 77 | 8303 | 10.50 |
| | At2g37870.1 | 96 | 23 | 92 | 9575 | 12.67 |
| | At3g53980.1 | 99 | 23 | 91 | 9362 | 9.91 |
| | At5g05960.1 | 88 | 25 | 91 | 9530 | 10.85 |
| | At1g32280.1 | 258 | 23 | 89 | 9383 | 9.69 |
| | At4g30880.1 | 192 | 22 | 87 | 9222 | 9.91 |
| | At4g33550 e | 79 | 29 | 86 | 9283 | 10.01 |
| | At5g56480.1 | 150 | 23 | 90 | 9582 | 4.91 |
| | At1g70250 f | none | 19 | 90 | 9865 | 4.64 |
| | At3g07450.1 | none | 29 | 77 | 7980 | 12.16 |
| | At3g52130.1 | none | 26 | 99 | 10484 | 4.40 |
| | At1g52415 g | 170 | 24 | 92 | 10825 | 10.25 |
| | At1g64235 h | 577 | 24 | 94 | 10313 | 10.83 |
| | At4g08530 i | none | 22 | 104 | 11859 | 9.53 |
| | At4g28395 j | 74, 121 k | 20 | 120 | 13430 | 5.28 |
AA, number of amino acids; MM, molecular mass in Dalton; pI, isoelectric point.
a cysteine residues were not taken into account in the pI calculation.
b annotations curated (strand: -1; exon start: 16455949, end: 16456244).
c annotations curated (strand: -1; exon start: 3977557, end: 3977828).
d annotations curated (strand: +1; exon start: 11082271, end: 11082557).
e annotations curated (strand: +1; exon 1 start: 16134443, end: 16134767; exon 2 start: 16134847, end: 16134869).
f annotations curated (strand: +1; exon start: 26456628, end: 26456958).
g annotations curated (strand: +1; exon 1 start: 19529835, end: 19530183; exon 2 start: 19530354, end: 19530355).
h annotations curated (strand: +1; exon 1 start: 23839912, end: 23840250; exon 2 start: 23840828, end: 23840845).
i annotations curated (strand: +1; exon start: 5421971, end: 5422352).
j annotations curated (strand: +1; exon 1 start: 14044281, end: 14044490; exon 2 start: 14044565, end: 14044734; exon 3 start: 14044856, end: 14044898).
k AtLtpY.4 contains two introns.
Figure 1. Organization of nsLtp genes. Positions of nsLtp genes are indicated on chromosomes (scale in Mbp).
Triticum aestivum nsLtp genes and features of the deduced mature proteins. Details are given in Additional file 2.
| type | number of subfamilies | number of members | mature nsLTPs: AA | mature nsLTPs: MM | mature nsLTPs: pI a |
| --- | --- | --- | --- | --- | --- |
| I | 12 | 85 | 86–98 | 8625–9855 | 4.14, 8.15–11.81 |
| II | 8 | 34 | 66–71 | 6841–7437 | 8.00–11.74 |
| III | 2 | 3 | 66–71 | 6727–7107 | 9.84–10.85 |
| IV | 4 | 12 | 74–82 | 7668–8607 | 11.09 |
| V | 3 | 10 | 91–99 | 9240–10514 | 4.06, 9.54–12.13 |
| VI | 2 | 8 | 83–94 | 8608–9793 | 4.01–4.29, 9.59–9.77 |
| VII | 1 | 3 | 148–150 | 15139–15450 | 9.71–10.39 |
| VIII | 1 | 1 | 96 | 9482 | 4.59 |
AA, number of amino acids; MM, molecular mass in Dalton; pI, isoelectric point.
a cysteine residues were not taken into account in the pI calculation.
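Footnote a states that cysteine residues were left out of the pI calculation. A plausible reason, not given in the source, is that the conserved cysteines of nsLTPs are engaged in disulfide bridges and so do not ionize. The sketch below shows one way such a calculation could be set up. The pKa values are illustrative assumptions (the source does not say which pKa set or program was used), and the function names are hypothetical.

```python
# Illustrative pKa values; an assumption, not the set used by the authors.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "Y": 10.1}  # cysteine deliberately absent
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH, ignoring cysteine side chains."""
    positive = 1 / (1 + 10 ** (ph - N_TERM_PKA))
    positive += sum(1 / (1 + 10 ** (ph - POSITIVE_PKA[aa]))
                    for aa in seq if aa in POSITIVE_PKA)
    negative = 1 / (1 + 10 ** (C_TERM_PKA - ph))
    negative += sum(1 / (1 + 10 ** (NEGATIVE_PKA[aa] - ph))
                    for aa in seq if aa in NEGATIVE_PKA)
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """Find the pH of zero net charge by bisection on the interval [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        # Net charge decreases monotonically with pH.
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return round((low + high) / 2, 2)


# Toy peptides (hypothetical, not nsLTP sequences): adding cysteines
# leaves the computed pI unchanged because they are ignored.
print(isoelectric_point("DDDD") == isoelectric_point("DDCCDD"))  # True
```

Values obtained this way will differ from the tabulated ones unless the same pKa set is used, so the sketch is only meant to show what "not taken into account" implies for the computation.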
Figure 2. Multiple sequence alignment of rice nsLTPs. Amino acid sequences were deduced from nsLtp genes identified from the TIGR Rice Pseudomolecules release 4 (Table 1). Sequences were aligned using HMMERalign to maximize the eight-cysteine motif alignment, and manually refined. The conserved cysteine residues are black boxed and additional cysteine residues grey boxed.
Figure 3. Multiple sequence alignment of arabidopsis nsLTPs. Amino acid sequences were deduced from nsLtp genes identified from the TAIR arabidopsis genome database (TAIR release 6.0) (Table 2). Sequences were aligned using HMMERalign to maximize the eight-cysteine motif alignment, and manually refined. The conserved cysteine residues are black boxed and additional cysteine residues grey boxed.
Figure 4. Multiple sequence alignment of wheat nsLTPs. Amino acid sequences were deduced from genes or ESTs indexed in the NCBI database. Amino acid sequences were aligned using HMMERalign to maximize the eight-cysteine motif alignment, and manually refined. For each nsLTP subfamily, one sequence is presented and the number of putative members identified is indicated between parentheses. The conserved cysteine residues are black boxed and additional cysteine residues grey boxed. Accession numbers are given in Additional file 2 and amino acid sequences of mature nsLTPs in Additional file 3.
Figure 5. Diversity of the eight-cysteine motif in rice, arabidopsis and wheat nsLTP types. The consensus motif of each nsLTP type was deduced from the analysis of the mature sequences of the 52 rice nsLTPs, the 49 arabidopsis nsLTPs and the 156 wheat nsLTPs presented in Table 1, Table 2, and Additional file 2, respectively. AtLTPII.8, which appears to be more distantly related to other type II sequences (see the phylogenetic analysis), was excluded. The values allowing direct identification of the nsLTP type are grey boxed. a cysteine residue number 6 is missing in AtLTPII.10. b cysteine residue number 7 is missing in TaLTPVIa.5. c cysteine residue number 8 is missing in AtLTPI.1. d AtLTPII.10, OsLTPVI.1, OsLTPVI.2, OsLTPVI.4, and TaLTPVIa subfamily members harbor an extra cysteine residue. All type VI sequences contain a Val 4 aa before Cys7 and a Met 10 aa before Cys7, allowing a distinction between type IV and type VI. e AtLTPII.6 harbors an extra cysteine residue. f TaLTPIVc.1 and TaLTPIVa subfamily members harbor an extra cysteine residue. g 12 amino acid residues were counted for TaLTPIVd.1, which displays no CXC motif. h OsLTPVII.1 and TaLTPVIIa.1 subfamily members harbor an extra cysteine residue.
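The values compared in Figure 5 are the numbers of residues separating the conserved cysteines. A minimal sketch of how such spacings could be extracted from a mature sequence is given below. It assumes the general motif shape C-Xn-C-Xn-CC-Xn-CXC-Xn-C-Xn-C, which is an assumption here and not a pattern quoted from the source. The function name and the toy sequence are hypothetical, and the spacings in the toy sequence are not values from the figure.

```python
import re

# Assumed general shape of the eight-cysteine motif (not taken from the source).
EIGHT_CYS_MOTIF = re.compile(r"C(.+?)C(.+?)CC(.+?)C(.)C(.+?)C(.+?)C")


def cysteine_spacings(mature_seq):
    """Return residue counts between the conserved cysteines, or None.

    Gaps are reported in order: Cys1-Cys2, Cys2-Cys3, Cys4-Cys5,
    Cys5-Cys6 (the X of CXC), Cys6-Cys7, Cys7-Cys8.
    """
    match = EIGHT_CYS_MOTIF.search(mature_seq)
    if match is None:
        return None
    return [len(gap) for gap in match.groups()]


# Toy sequence built only to exercise the pattern (not a real nsLTP).
toy = ("AAC" + "A" * 9 + "C" + "A" * 14 + "CC" + "A" * 19
       + "CAC" + "A" * 22 + "C" + "A" * 13 + "CAA")
print(cysteine_spacings(toy))  # [9, 14, 19, 1, 22, 13]
```

A plain regular expression like this does not handle the exceptions listed in the caption, such as sequences with a missing or extra cysteine or without a CXC motif. Those cases would need the manual inspection described for the alignments.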
Figure 6. Unrooted phylogenetic tree of the rice, arabidopsis and wheat nsLTP gene families. The mature sequences of the 122 non-redundant wheat nsLTPs, the 49 rice nsLTPs, and the 45 arabidopsis nsLTPs were aligned using HMMalign and then manually refined. The phylogenetic tree was built from the protein alignment (Additional file 3) with the maximum-likelihood method using the PHYML program [75]. When possible, subtrees including sequences of the same type are grouped and represented by a grey triangle, next to which the numbers of arabidopsis, rice and wheat sequences are indicated, in that order, in brackets. Subtrees are detailed in Figure 7. Bootstrap values (% of 100 re-sampled data sets) are indicated for each node.
Figure 7. Rooted phylogenetic subtrees detailed from the unrooted phylogenetic tree of the rice, arabidopsis and wheat nsLTP gene families. Each subtree represented by a grey triangle in Figure 6 is detailed and rooted on the remaining parts of the tree. Wheat nsLTPs are in black, rice nsLTPs in red and arabidopsis nsLTPs in blue. Monophyletic subfamilies are indicated by solid brackets, paraphyletic subfamilies by dotted brackets. Black brackets indicate the wheat subfamilies in which a potential rice ortholog nsLTP gene is present, and green brackets indicate wheat-specific subfamilies. Bootstrap values (% of 100 re-sampled data sets) are indicated for each node.