Nuno A Fonseca, Cristina P Vieira, Peter W H Holland, Jorge Vieira.
Abstract
BACKGROUND: Although homeobox genes have been the subject of many studies, little is known about the main amino acid changes that occurred early in the evolution of genes belonging to different classes.
Year: 2008 PMID: 18620554 PMCID: PMC2491631 DOI: 10.1186/1471-2148-8-200
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Majority rule consensus sequence of HOXL, NKL and PRD genes. The relative location of the described amino acid patterns is shown (see text for details). Red – HOXL1 and HOXL2 patterns; Green – NKL pattern; Blue – PRD pattern; Boxed – ANTP1 and ANTP2 patterns; Grey shadow – ANTP-PRD pattern.
Number of sequences from each homeobox gene class or subclass showing a given amino acid pattern.
| Pattern | HoxL (202) | NKL (204) | PRD (200) | LIM (66) | POU (83) | HNF (15) | SINE (23) | TALE (66) | CUT (20) | PROS (3) | ZF (94) | CERS (10) | Uncl. (288) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HoxL 1 | 194 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 178 |
| HoxL 2 | 199 | 1 | 16 | 0 | 38 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 190 |
| NKL | 3 | 116 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 52 |
| PRD | 0 | 0 | 172 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 |
| ANTP 1 | 197 | 199 | 1 | 1 | 7 | 12 | 0 | 11 | 9 | 0 | 25 | 2 | 251 |
| ANTP 2 | 197 | 139 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 242 |
| ANTP-PRD | 200 | 200 | 190 | 52 | 0 | 0 | 23 | 0 | 0 | 0 | 0 | 0 | 275 |
Uncl. – unclassified sequences. The total number of sequences in each class is shown in parentheses.
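The counts in the table above can be read as per-class fractions by dividing each cell by the class size given in parentheses. The following is a minimal sketch of that arithmetic, using a few cells copied from the table; the `counts` dictionary and the `fraction` helper are illustrative names, not part of the original analysis.

```python
# (sequences showing the pattern, total sequences in the class),
# copied from the table above. Only a few cells are included here.
counts = {
    ("HoxL 1", "HoxL"): (194, 202),
    ("HoxL 2", "HoxL"): (199, 202),
    ("NKL", "NKL"): (116, 204),
    ("PRD", "PRD"): (172, 200),
    ("ANTP 2", "NKL"): (139, 204),
}


def fraction(pattern, gene_class):
    """Fraction of sequences in gene_class that show the given pattern."""
    hits, total = counts[(pattern, gene_class)]
    return hits / total


for pattern, gene_class in counts:
    print(f"{pattern} in {gene_class}: {fraction(pattern, gene_class):.3f}")
```

The ANTP 2 value for NKL (139 of 204) comes out at about 0.68, which agrees with the 68% figure quoted in the footnote to the non-bilaterian table.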
Number of sequences showing only the refined NKL amino acid patterns.
| Pattern | HoxL (202) | NKL (88) | PRD (200) | LIM (66) | POU (83) | HNF (15) | SINE (23) | TALE (66) | CUT (20) | PROS (3) | ZF (94) | CERS (10) | Uncl. (288) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DLX-BARX | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| EN | 11 | 28 | 0 | 0 | 0 | 0 | 1 | 0 | 3 | 0 | 0 | 0 | 21 |
| NOTO | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| NANOG | 6 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| VENTX | 8 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 6 |
Uncl. – unclassified sequences. The total number of sequences in each class is shown in parentheses.
DLX-BARX – [AKST] [QDENPS] [LAST] [Q] [V]; EN – [AKSTN] [DENPS] [LAST] [Q] [VI];
NOTO – [AKST] [DENPS] [LASTN] [Q] [V]; NANOG – [AKST] [DENPSY] [LASTK] [Q] [V];
VENTX – [AKST] [DENPS] [LASTV] [Q] [VI]
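Each pattern in the legend above lists, site by site, the residues allowed at that site, which is the same notation as a regular-expression character class. The sketch below shows one way such patterns could be tested against a five-residue segment. It assumes the five sites are adjacent in the sequence, which this excerpt does not state, and the test segments in the usage note are made up for illustration.

```python
import re

# Patterns copied from the legend above; each bracket lists the
# residues allowed at one site.
PATTERNS = {
    "DLX-BARX": "[AKST] [QDENPS] [LAST] [Q] [V]",
    "EN": "[AKSTN] [DENPS] [LAST] [Q] [VI]",
    "NOTO": "[AKST] [DENPS] [LASTN] [Q] [V]",
    "NANOG": "[AKST] [DENPSY] [LASTK] [Q] [V]",
    "VENTX": "[AKST] [DENPS] [LASTV] [Q] [VI]",
}


def compile_pattern(spec):
    # ASSUMPTION: the five sites are contiguous residues, so the spaces
    # between brackets are dropped to form a single regular expression.
    return re.compile(spec.replace(" ", ""))


def matching_patterns(segment):
    """Return the names of all patterns that match the whole segment."""
    return [
        name
        for name, spec in PATTERNS.items()
        if compile_pattern(spec).fullmatch(segment)
    ]
```

With these definitions, the hypothetical segment `"AQLQV"` matches only DLX-BARX and `"NDLQI"` matches only EN, whereas `"ADLQV"` satisfies all five patterns. Because the allowed residue sets overlap in this way, a segment has to be checked against every pattern before it can be said to show only one of them.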
Amino acid pattern presence in non-bilaterian sequences.
| Patterns other than ANTP-PRD | Expected pattern for | HoxL | NKL | PRD | Demox | LIM | SINE | TALE | Uncl. |
|---|---|---|---|---|---|---|---|---|---|
| ANTP1; ANTP2; | 13 | ||||||||
| ANTP1; ANTP2; | 1 | ||||||||
| ANTP1; ANTP2; | NKL | 22 | 12 | ||||||
| ANTP1; NKL | NKL* | 4 | 1 | 1 | 7 | ||||
| ANTP1; ANTP2; | EN (NKL) | 5 | |||||||
| ANTP1; ANTP2; | NOTO (NKL) | 3 | 4 | ||||||
| ANTP1; ANTP2; | VENTX (NKL) | 1 | |||||||
| ANTP1; ANTP2 | Demox | 1 | 3 | 14 | |||||
| PRD | PRD | 24 | 2 | ||||||
| Other pattern combinations | ? | 1 | 20 | 33 | 5 | 14 | 1 | 59 |
Only non-bilaterian sequences that span the regions containing the amino acid patterns described here, and that show the ANTP-PRD pattern, were used. In total, 251 non-redundant sequences were used (gi numbers are listed in the appendix). Cases where non-bilaterian genes are clearly misidentified by the amino acid patterns are shown underlined (see text for details). Uncl. – unclassified sequences.
* This signature can also be considered characteristic of NKL genes, since only 68% of bilaterian NKL sequences show the ANTP2 pattern.
Frequency of genes following amino acid rules specific for the PRD or ANTP classes.
| Position | 27 | | 29 | | 30 | | 30 | | 50 | |
|---|---|---|---|---|---|---|---|---|---|---|
| Rules | Not Ne | | Not A | | Not (LP, NP) | | Not (LH, VH) | | (P, Ne) | |
| Dataset | Phy | Unc | Phy | Unc | Phy | Unc | Phy | Unc | Phy | Unc |
| PRD | 0.060 (200) | 0.000 (14) | 0.305 (200) | 0.714 (14) | 0.020 (200) | 0.071 (14) | 0.020 (200) | 0.071 (14) | 0.315 (197) | 0.786 (14) |
| (HOXL and NKL) | 0.998 (406) | 1.000 (230) | 1.000 (406) | 1.000 (230) | 0.998 (406) | 1.000 (230) | 0.998 (406) | 1.000 (230) | 0.000 (404) | 0.000 (230) |
The total number of sequences analysed is given in parentheses.
Phy – set of sequences classified using a phylogenetic approach
Unc – set of sequences not classified using a phylogenetic approach. These sequences nevertheless show amino acid patterns typical of PRD, HOXL and NKL genes. Ne – negatively charged amino acids; P – positively charged amino acids; A – aromatic amino acids; LP – less-polar amino acids; NP – non-polar amino acids; LH – less hydrophobic amino acids; VH – very hydrophobic amino acids; S – small amino acids; T – tiny amino acids; Al – aliphatic amino acids.
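Each frequency in the table above is the fraction of sequences in a dataset that follow a rule of the form "residue at position N is not in group G". The sketch below illustrates that calculation for the two rules whose groups are easiest to pin down (positions 27 and 29). The group memberships are conventional choices and are an assumption, since the source does not list them here; the 1-based homeodomain numbering and the helper names are likewise assumptions.

```python
# ASSUMPTION: conventional group memberships, not taken from the source.
GROUPS = {
    "Ne": set("DE"),   # negatively charged amino acids
    "A": set("FWY"),   # aromatic amino acids
}

# (position, excluded group) pairs from the table above.
# ASSUMPTION: positions are 1-based within a 60-residue homeodomain.
ANTP_RULES = [(27, "Ne"), (29, "A")]


def follows_rule(homeodomain, position, excluded_group):
    """True if the residue at the 1-based position is not in the group."""
    residue = homeodomain[position - 1]
    return residue not in GROUPS[excluded_group]


def rule_frequency(homeodomains, position, excluded_group):
    """Fraction of sequences that follow a 'Not <group>' rule."""
    followed = sum(
        follows_rule(hd, position, excluded_group) for hd in homeodomains
    )
    return followed / len(homeodomains)
```

For example, given two toy 60-residue sequences, one with glutamate at position 27 and one without, `rule_frequency(seqs, 27, "Ne")` returns 0.5. Applied to real alignments, the same function would yield values comparable to the Phy and Unc columns, provided the group definitions match those used by the authors.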
Frequency of genes following amino acid rules specific for the HOXL or NKL sub-classes (positions 1 to 30).
| Position | 14 | | 15 | | 15 | | 15 | | 28 | |
|---|---|---|---|---|---|---|---|---|---|---|
| Rules | Not A | | Not (LH, VH) | | Not (LP, NP) | | Not (S, T) | | Not (S, T) | |
| Dataset | Phy | Unc | Phy | Unc | Phy | Unc | Phy | Unc | Phy | Unc |
| HoxL | 0.995 (202) | 1.000 (177) | 1.000 (202) | 1.000 (177) | 1.000 (202) | 1.000 (177) | 1.000 (202) | 1.000 (177) | 0.990 (202) | 1.000 (178) |
| NKL | 0.709 (203) | 0.808 (52) | 0.588 (204) | 0.346 (52) | 0.598 (204) | 0.346 (52) | 0.637 (204) | 0.346 (52) | 0.500 (204) | 0.673 (52) |
The total number of sequences analysed is given in parentheses.
Legend as in Table 4
Frequency of genes following amino acid rules specific for the HOXL or NKL sub-classes (positions 31 to 60).
| Position | 33 | | 33 | | 47 | | 54 | |
|---|---|---|---|---|---|---|---|---|
| Rules | Not (LH, VH) | | Not (NP, LP) | | Not (Non-Al) | | Not LP | |
| Dataset | Phy | Unc | Phy | Unc | Phy | Unc | Phy | Unc |
| HoxL | 1.000 (202) | 1.000 (178) | 1.000 (202) | 1.000 (178) | 1.000 (201) | 1.000 (178) | 1.000 (200) | 1.000 (178) |
| NKL | 0.716 (204) | 0.865 (52) | 0.701 (204) | 0.865 (52) | 0.730 (204) | 0.981 (52) | 0.431 (202) | 0.731 (52) |
The total number of sequences analysed is given in parentheses.
Legend as in Table 4