
The ABC transporter gene family of Caenorhabditis elegans has implications for the evolutionary dynamics of multidrug resistance in eukaryotes.

Jonathan A Sheps, Steven Ralph, Zhongying Zhao, David L Baillie, Victor Ling.

Abstract

BACKGROUND: Many drugs of natural origin are hydrophobic and can pass through cell membranes. Hydrophobic molecules must be susceptible to active efflux systems if they are to be maintained at lower concentrations in cells than in their environment. Multi-drug resistance (MDR), often mediated by intrinsic membrane proteins that couple energy to drug efflux, provides this function. All eukaryotic genomes encode several gene families capable of encoding MDR functions, among which the ABC transporters are the largest. The number of candidate MDR genes means that study of the drug-resistance properties of an organism cannot be effectively carried out without taking a genomic perspective.
RESULTS: We have annotated sequences for all 60 ABC transporters from the Caenorhabditis elegans genome, and performed a phylogenetic analysis of these along with the 49 human, 30 yeast, and 57 fly ABC transporters currently available in GenBank. Classification according to a unified nomenclature is presented. Comparison between genomes reveals much gene duplication and loss, and surprisingly little orthology among analogous genes. Proteins capable of conferring MDR are found in several distinct subfamilies and are likely to have arisen independently multiple times.
CONCLUSIONS: ABC transporter evolution fits a pattern expected from a process termed 'dynamic-coherence'. This is an unusual result for such a highly conserved gene family as this one, present in all domains of cellular life. Mechanistically, this may result from the broad substrate specificity of some ABC proteins, which both reduces selection against gene loss, and leads to the facile sorting of functions among paralogs following gene duplication.

Year:  2004        PMID: 15003118      PMCID: PMC395765          DOI: 10.1186/gb-2004-5-3-r15

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Background

ATP-binding cassette (ABC) transporters are one of the largest families of transport proteins and constitute the single largest gene family in Escherichia coli, comprising about 5% of its genome [1]. ABC transporters are grouped into several structural classes, or subfamilies, on the basis of amino acid sequence and domain organization [2] (Figure 1). The presence of a strongly conserved ATP-binding motif defines membership in the family, and the basic functional organization of an ABC transporter in the membrane is the same from bacteria to humans, and in all subclasses [3-5]. A complex of at least two ATP-binding domains, coupled to two blocks of membrane-spanning helices, appears to be the minimum requirement for a functional transporter. Often these domains are found in tandem within a single molecule, but in many cases they are distributed across separate proteins that must then assemble in the membrane. ABC transporters are collectively able to accommodate an unusually large array of different substrates. This diversity of function is manifest at the family level, but also in individual members of the family, for example those associated with multidrug resistance (MDR).
Figure 1

Structural diversity of ABC transporters. Illustration of the various domain organizations found among members of the ABC transporter family in C. elegans. TM indicates a transmembrane domain typically containing six predicted membrane-spanning helices. ABC indicates an ATP-binding cassette domain. The color codes for each structure are used throughout the figures to show the lack of concordance between structural categories and families defined on the basis of sequence similarity.

Decottignies and Goffeau [6] catalogued the entire ABC transporter family of the yeast Saccharomyces cerevisiae and in so doing delineated six of the major subgroups of eukaryotic ABC transporters. Allikmets et al. [7] catalogued all 33 then-known human ABC transporters, including those known only from partial expressed sequence tag (EST) sequences, and divided these into seven subfamilies. This scheme has been adopted, with a revised nomenclature, by the Human Genome Organisation (HUGO) [8] in order to provide a unified nomenclature for both human and mouse ABC transporters. Of these seven subfamilies, one, ABCA, has no exact equivalent in the yeast genome [9,10]. Genes considered to be part of subfamily ABCA have been identified in the slime mold Dictyostelium discoideum, as well as in malaria parasites [11] and Caenorhabditis elegans (this paper). With the completion of the human and Drosophila melanogaster genomes, a joint summary of the ABC transporter complements of both genomes was published [12]. This identified a new subfamily, ABCH, which appears to be the most divergent yet. One previously unclassified yeast ABC gene, YDR061w [13], appears to be a structurally aberrant member of subfamily H.

The phenotypes of five ABC transporter knockouts have been reported in C. elegans. Four of these involve genes expected, by homology to mammalian genes, to be involved in drug resistance: three P-glycoproteins (Pgp-1, Pgp-3 and Pgp-4) (subfamily B) and one multi-drug resistance protein (MRP) [14,15] (subfamily C). These ABC transporter mutants are associated with sensitivity to environmental insult [16]. Pgp-3 mutant strains of C. elegans are more sensitive to the drugs chloroquine and colchicine. Pgp-1 and mrp-1 strains are hypersensitive to toxic pigments produced by some bacteria [17]. All the nematode P-glycoproteins examined so far seem to be highly expressed in intestinal cells [18], and in the excretory cell, which functions somewhat like a kidney in C. elegans. The mrp-1, pgp-1 and pgp-3 mutant strains have been reported to be hypersensitive to the heavy metals cadmium and arsenite [15]. The fifth reported knockout is of the product of the ced-7 gene [19]. Mutant alleles of ced-7 cause a defect in engulfment of the cell corpses left behind by apoptosis. ced-7 is a member of the ABCA subfamily, and has a similar phenotype to the abca1 gene in humans. ABCA1 protein is required for engulfment of apoptotic cells by macrophages and is thought to regulate membrane fluidity through an increase in phosphatidylserine exposure on the outer leaflet of the cell membrane [20].

The term orthology is used to describe genes separated from one another by speciation events while paralogy describes those separated by gene duplication events [21]. Of particular interest, from the point of view of functional annotation, are the cases where a pair of genes, one from each of a pair of organisms, are found. In these cases it is reasonable to presume that the orthologous genes may share a conserved function retained from the same single gene present in the common ancestor of the two organisms. However, where a single gene (or set of duplicated genes) in one genome is most closely related to a set of duplicated (paralogous) genes in another genome this is sometimes termed co-orthology [22], and then no particular orthologous pair can be unambiguously specified. In the case of co-orthologs the argument for retention of analogous functions between members of the sets of descendant genes is much weaker.

Comparison of two complete genomes, those of C. elegans and S. cerevisiae [23], demonstrated a high fraction of ortholog pairs in gene families involved in core biological functions. Specifically, Chervitz et al. [23] found that, when pairing conserved yeast genes with their most similar worm homologs (subject to a BLAST score cut-off of < 10^-100), 57% of these highly conserved gene pairs involved orthologous, rather than paralogous, pairs of genes. In this category of core functions they included trafficking, and, as possibly the largest family of trafficking genes in animal genomes, ABC transporters should be expected to share in this high level of one-to-one correspondence between genomes. We expected therefore that this would allow us to assign predicted functions to newly discovered C. elegans ABC proteins on the basis of their already-characterized mammalian orthologs. Following a comprehensive phylogenetic analysis of ABC transporters from four eukaryote genomes, we found that the frequency of orthologous pairs among ABC transporters was substantially lower than we expected. Particular domain organizations and substrate specificities seem to have evolved independently several times in multiple lineages. This is expected to complicate the functional analysis of ABC transporters in newly characterized genomes.
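The distinction drawn above between simple orthologous pairs and co-orthologous sets can be made concrete with a small sketch. The function below is a hypothetical helper, not part of the published analysis: it assumes a clade of most closely related genes has already been extracted from a tree for a two-genome comparison, and it only counts how many genes each genome contributes. The gene names in the usage lines are taken from pairings discussed later in the text.

```python
from collections import Counter

def classify_sister_group(genes):
    """Classify one clade of closely related genes from a two-genome comparison.

    `genes` is a list of (genome, gene_name) tuples, e.g. ("Ce", "W09D6.6").
    """
    counts = Counter(genome for genome, _ in genes)
    if len(counts) > 2:
        raise ValueError("this sketch only handles pairwise genome comparisons")
    if len(counts) < 2:
        return "paralogs only"      # duplicates confined to a single genome
    if all(n == 1 for n in counts.values()):
        return "orthologous pair"   # exactly one gene from each genome
    return "co-orthologs"           # one-to-many or many-to-many

print(classify_sister_group([("Ce", "W09D6.6"), ("Hs", "MTABC3 (B6)")]))
print(classify_sister_group([("Ce", "F14F4.3"), ("Hs", "MRP5 (C5)"),
                             ("Hs", "ABCC11"), ("Hs", "ABCC12")]))
```

The first call reports an orthologous pair and the second a co-orthologous set, which is the case where no single pairing can be singled out for functional inference.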

Results and discussion

Here we present a classification of all ABC transporters encoded in the C. elegans genome, based on a phylogenetic analysis which includes the 49 currently known human ABC proteins for which there are reliable, public, sequence data. We took the approach of analyzing primarily the conserved ATP-binding cassettes from each protein, regardless of the structural class from which the domain is drawn. This allows evaluation of the evolutionary history of each protein in the family, without biases that might result from gene-fusion events resulting in convergent acquisition of similar domain structures by distantly related proteins. In addition, we re-evaluated the relationships of transporters within statistically reliable clusters whose members are closely related enough that structural variations do not lead to errors in alignment. We did this to capture additional phylogenetic information, which may be apparent in the less conservative transmembrane domains, at a level of analysis where it is less likely to be misleading. An example of our first-pass approach is given in Figure 2, which shows an analysis of isolated ATP-binding cassette domains from the human ABC transporters only. In particular, we find that all seven subfamilies recognized by Allikmets et al. [7] are recovered with significant bootstrap support. Their finding, that subfamily B is more closely related to the carboxy-terminal component of subfamily C than the two halves of ABCC molecules are with one another, is supported by our results.
Figure 2

Tree of human ATP-binding cassette domains. The evolution of the ABCB subfamily from within the ABCC subfamily, and the structural diversity of subfamily B, are shown here. Each cluster of ABC domains within each subfamily, except for subfamily B, is collapsed to form a single, representative, branch; n-term: amino-terminal ABC; c-term: carboxy-terminal ABC. The phylogeny of ATP-binding cassettes from human ABC transporters was produced according to the following procedure. Predicted amino-acid sequences were aligned using ClustalX [54]. Aligned sequences were used to generate matrices of mean distances among proteins, and these matrices were used to generate a phylogenetic tree according to the neighbor-joining algorithm [55], refined using the SPR branch-swapping technique under the minimum evolution criterion, implemented by PAUP*4.0b10 [56]. Bootstrapping [57] was used to determine the relative support for the various branches of the tree (1,000 replicates), and nodes with less than 50% support were collapsed to form polytomies. The structures of the proteins in which the domains are embedded are indicated according to the color scheme in Figure 1. It should be noted that branch lengths in the figures are not to scale and do not represent distances between protein sequences. The original alignment files are available as Additional data files 1-8.
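Two steps of the procedure in this legend, the distance matrix and the bootstrap resampling, can be illustrated with a minimal sketch. It assumes that "mean distance" means the mean character difference (the fraction of aligned positions that differ), and it does not reproduce ClustalX alignment, neighbor-joining, or the SPR refinement in PAUP*. All function names and the toy sequences are made up for illustration.

```python
import random

def mean_character_difference(seq_a, seq_b):
    """Fraction of aligned positions that differ, skipping columns with a gap."""
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    return sum(a != b for a, b in compared) / len(compared)

def distance_matrix(alignment):
    """Pairwise distances for a dict mapping sequence name -> aligned sequence."""
    names = sorted(alignment)
    return {(m, n): mean_character_difference(alignment[m], alignment[n])
            for m in names for n in names}

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}

def resolved_nodes(support_percent, threshold=50.0):
    """Nodes kept as resolved; those below the threshold become polytomies."""
    return {node for node, s in support_percent.items() if s >= threshold}

# Toy alignment with invented sequences, for illustration only.
toy = {"seqA": "GSGKST", "seqB": "GSGKTT", "seqC": "GAGK-T"}
matrix = distance_matrix(toy)
replicate = bootstrap_replicate(toy, random.Random(0))
print(matrix[("seqA", "seqC")])  # one difference over five ungapped columns
print(resolved_nodes({"node1": 72.0, "node2": 41.0}))
```

In a full analysis each of the 1,000 replicates would be turned into its own tree, and the support for a node would be the percentage of replicate trees that contain it.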

A collection of transporters

We found a total of 60 confirmed ABC transporters in the annotated protein set derived from the C. elegans genome sequence. This represents approximately 0.3% of the total number of genes (approximately 19,000) in the worm genome. Only 8 of the 60 predicted genes lack any corresponding mRNA (Table 1), and only one (F56F4.6) is structurally aberrant in a way that would suggest it is likely to be a pseudogene.
Table 1

Characterization of the 60 C. elegans ABC proteins

Subfamily | ORF name/CGC name | Chromosome | GenBank accession number | Size (amino acids) | Predicted topology | cDNA if known | RNAi phenotype
A, Abt | C24F3.5/Abt-1 | IV | CAA18775 | 1,429 | (6TM-ABC)2 | | None
 | C48B4.4/Ced-7 | III | NP_499115 | 1,704 | (8TM-ABC)2 | Complete | None
 | F12B6.1/Abt-2 | I | AAB54153 | 1,547 | (6TM-ABC)2 | Partial | None
 | F55G11.9/Abt-3 | IV | CAB05222 | 1,431 | (8/4TM-ABC)2 | | None
 | F56F4.6 | I | AAB54203 | 260 | ABC | | None
 | Y39D8C.1/Abt-4 | V | AAC69223 | 1,802 | (6/8TM-ABC)2 | Partial | None
 | Y53C10A.9/Abt-5 | I | CAA22142 | 1,564 | (6TM-ABC)2 | Partial | None
B, Pgp (full molecules) | C05A9.1/Pgp-5 | X | CAA94202 | 1,283 | (6TM-ABC)2 | Partial | None
 | C34G6.4/Pgp-2 | I | AAB52482 | 1,265 | (6TM-ABC)2 | Partial | None
 | C47A10.1/Pgp-9 | V | CAB03973 | 1,294 | (6TM-ABC)2 | Partial | None
 | C54D1.1/Pgp-10 | X | AAC48149 | 1,283 | (4TM-ABC)2 | Partial | None
 | DH11.3/Pgp-11 | II | CAA88940 | 1,270 | (6TM-ABC)2 | Partial | None
 | F22E10.1/Pgp-12 | X | CAA91799 | 1,318 | (6TM-ABC)2 | Partial | None
 | F22E10.2/Pgp-13 | X | CAA91800 | 1,291 | (6TM-ABC)2 | | None
 | F22E10.3/Pgp-14 | X | CAA91801 | 1,327 | (6TM-ABC)2 | Partial | None
 | F22E10.4/Pgp-15 | X | CAA91802 | 1,270 | (6TM-ABC)2 | | None
 | F42E11.1/Pgp-4 | X | CAA91463 | 1,266 | (6TM-ABC)2 | Partial | None
 | K08E7.9/Pgp-1 | IV | CAB01232 | 1,321 | (6TM-ABC)2 | Partial | None
 | T21E8.1/Pgp-6 | X | CAA94220 | 1,225 | (6TM-ABC)2 | Partial | None
 | T21E8.2/Pgp-7 | X | CAA94219 | 1,269 | (6TM-ABC)2 | | None
 | T21E8.3/Pgp-8 | X | CAA94203 | 1,243 | (6TM-ABC)2 | Partial | None
 | ZK455.7/Pgp-3 | X | CAA91467 | 1,268 | (6TM-ABC)2 | Partial | None
Haf (half molecules) | C30H6.6/Haf-1 | IV | CAB02812 | 586 | 4TM-ABC | Partial | None
 | F43E2.4/Haf-2 | II | AAC71121 | 761 | 8TM-ABC | Partial | None
 | F57A10.3/Haf-3 | V | CAB09418 | 733 | 6TM-ABC | Partial | None
 | W04C9.1/Haf-4 | I | AAC68724 | 787 | 8TM-ABC | Complete | Weak embryonic lethality, slow growth
 | W09D6.6/Haf-5 | III | CAB04947 | 801 | 8TM-ABC | Complete | None
 | Y48G8AL.11/Haf-6 | I | AAK29911 | 565 | 4TM-NBF | Partial |
 | Y50E8A.16/Haf-7 | V | CAB60586 | 807 | 6TM-ABC | Partial |
 | Y57G11C.1/Haf-8 | IV | CAB16503 | 633 | 4TM-ABC | | None
 | ZK484.2/Haf-9 | I | AAK39394 | 815 | 8TM-ABC | Complete | None
C, Mrp/Cft | C18C4.2/Cft-1 | V | AAK52175 | 1,247 | (5/6TM-NBF)2 | Partial | None
 | E03G2.2/Mrp-3 | X | CAA92148 | 1,398 | (6TM-ABC)2 | Partial | None
 | F14F4.3/Mrp-5 | X | CAB54225 | 1,427 | (6TM-ABC)2 | Partial | Slow growth, Clear
 | F20B6.3/Mrp-6 | X | AAA82317 | 1,396 | (6TM-ABC)2 | Partial | Egg laying defect
 | F21G4.2/Mrp-4 | X | CAB02667 | 1,573 | (10/6TM-ABC)2 | Partial | None
 | F57C12.4/Mrp-2 | X | AAB07022 | 1,525 | (10/6TM-ABC)2 | Complete | None
 | F57C12.5/Mrp-1 | X | AAD31550 | 1,528 | (12/6TM-ABC)2 | Complete | None
 | Y43F8C.12/Mrp-7 | V | CAA21622 | 1,119 | (12/2TM-ABC)2 | Partial |
 | Y75B8A.26/Mrp-8 | X | CAA22110 | 1,144 | (4/6TM-ABC)2 | Partial |
D | C44B7.8 | II | AAA68339 | 665 | 4TM-ABC | Partial |
 | C44B7.9 | II | AAA68340 | 661 | 4TM-ABC | Partial | None
 | C54G10.3 | V | CAA99810 | 660 | 6TM-ABC | Complete | None
 | T02D1.5 | IV | CAB05909 | 734 | 6TM-ABC | Partial | None
 | T10H9.5 | V | AAC19238 | 598 | 6TM-ABC | Complete | None
E | Y39E4B.1 | III | CAB54424 | 610 | ABC-ABC | Partial | Embryonic lethality
F | F18E2.2 | V | CAA99835 | 622 | ABC-ABC | Partial | None
 | F42A10.1 | III | AAA19072 | 712 | ABC-ABC | Partial | None
 | T27E9.7/GCN20-2 | III | CAB04880 | 622 | ABC-ABC | Complete | None
G | C05D10.3 | III | AAA20989 | 598 | ABC-4TM | Partial | None
 | C10C6.5 | IV | CAB05682 | 610 | ABC-6TM | Partial | None
 | C16C10.12 | III | CAA86750 | 610 | ABC-4TM | Partial | None
 | F02E11.1 | II | AAB66050 | 658 | ABC-4TM | Partial | None
 | F19B6.4 | IV | CAA93461 | 695 | ABC-6TM | Partial | None
 | T26A5.1 | III | AAC77504 | 608 | ABC-4TM | Partial | None
 | Y42G9A.6 | III | AAF60554 | 684 | ABC-6TM | Partial | None
 | Y47D3A.11 | III | CAB57891 | 547 | ABC-6TM | Partial | None
 | Y49E10.9 | III | CAB11549 | 454 | ABC-4TM | | None
H | C56E6.1 | II | AAA81093 | 1,667 | ABC-12TM | | Larval arrest
 | C56E6.5 | II | AAA81094 | 595 | ABC-6TM | Partial | None

Subfamily names given are according to the HUGO nomenclature (A-H) [33] as well as the CGC (Caenorhabditis Genetics Centre [61]) gene names for each subfamily. TM, transmembrane domain, where the number preceding it is the predicted number of membrane spanning helices or the number in the amino-terminal/carboxy-terminal TM domains, respectively. ABC, ATP-binding cassette. The existence of known cDNAs, whether complete or partial, is listed according to information in WormBase release WS112 [62]. RNAi phenotypes of genes on chromosome I are given according to [63], those on chromosomes II, IV, V and X are from [64], those of genes on chromosome III are from [65].

Thirty ABC transporters are described in the yeast genome, or approximately 0.5% of its approximately 6,000 proteins [13]. At present 49 human ABC transporters have been identified and, at least partially, cloned. They are included here (Figures 3,4,5,6,7 and Table 2) to illustrate their relationships with nematode proteins, which might then shed light on their biological roles. Inclusion of human as well as D. melanogaster ABC transporters in our tree allows us to explicitly classify C. elegans ABC transporters according to the current eight-subfamily taxonomic scheme for ABC transporters [12].
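The genome proportions quoted for worm and yeast follow from simple division, shown here as a short worked check. The gene counts are the approximate figures given in the text (60 of about 19,000 worm genes, 30 of about 6,000 yeast proteins); the function name is only illustrative.

```python
def percent_of_genome(family_size, genome_size):
    """Share of a genome's genes that belong to one gene family, in percent."""
    return 100.0 * family_size / genome_size

worm = percent_of_genome(60, 19_000)   # C. elegans ABC transporters
yeast = percent_of_genome(30, 6_000)   # S. cerevisiae ABC transporters

print(f"C. elegans: {worm:.1f}%")      # prints "C. elegans: 0.3%"
print(f"S. cerevisiae: {yeast:.1f}%")  # prints "S. cerevisiae: 0.5%"
```

So although the worm has twice as many ABC transporters as yeast, they make up a smaller share of its larger gene set.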
Figure 3

Phylogenetic tree of ABCA proteins in three eukaryote genomes. A phylogeny derived and displayed according to the procedure outlined in the legend to Figure 2, except that complete protein sequences were used, not just those of the ATP-binding cassettes. The genome of origin for each protein is indicated by prefixes before each gene name, according to the following scheme: Ce, C. elegans; Dm, D. melanogaster; Hs, H. sapiens; Sc, S. cerevisiae.

Figure 4

Phylogenetic tree of ABCB proteins in four eukaryote genomes. A phylogeny derived and displayed according to the procedure outlined in the legend to Figure 3. Shown here is the division between the half transporters, which are most of the ABCB genes in mammals, and the full-transporters (called P-glycoproteins (P-gps)) that have evolved from them. Four lineages of P-gps (exemplified by genes F22E10.1-4, T21E8.1-3, C47A10.1 and C54D1.1) have been lost in both flies and mammals, and of the two remaining P-gp lineages, one has been lost in each of the fly and human lines of descent. Subsequent duplications within the single remaining P-gp lineage in both flies and mammals have not been sufficient to keep pace with continuing P-gp duplications in the worm genome.

Figure 5

Phylogenetic tree of ABCC proteins in four eukaryote genomes. A phylogeny derived and displayed according to the procedure outlined in the legend of Figure 3.

Figure 6

Phylogenetic trees of ABCD, ABCE, and ABCF proteins in four eukaryote genomes. Phylogenies derived and displayed according to the procedure outlined in the legend of Figure 3.

Figure 7

Phylogenetic trees of ABCG and ABCH proteins in four eukaryote genomes. Phylogenies derived and displayed according to the procedure outlined in the legend of Figure 3.

Table 2

Alphabetic list, by taxon, of protein sequences used in this study

S. cerevisiae | Accession number | D. melanogaster | Accession number | C. elegans | Accession number | H. sapiens | Accession number
ADP1 | NP_009937 | 171D11.2 | AAF45509 | C05A9.1 | CAA94202 | ABCA1 | NP_005493
ATM1 | NP_014030 | Atet | AAF51027 | C05D10.3 | AAA20989 | ABCA2 | NP_001597
BPT1 | NP_013086 | Brown | AAF47020 | C10C6.5 | CAB05682 | ABCA3 | CAA65825
CAF16 | NP_116625 | CG10226 | AAF50670 | C16C10.12 | CAA86750 | ABCA5 | NP_061142
GCN20 | NP_116664 | CG10441 | AAF53737 | C18C4.2 | AAK52175 | ABCA6 | NP_525023
MDL1 | NP_013289 | CG10505 | AAF46706 | C24F3.5 | CAA18775 | ABCA7 | AF328787
MDL2 | NP_015053 | CG11069 | AAF56361 | C30H6.6 | CAB02812 | ABCA8 | AB020629
PDR10 | NP_014973 | CG11147 | AAF52284 | C34G6.4 | AAB52482 | ABCA9 | NP_525022
PDR11 | NP_012252 | CG11460 | AAF55727 | C44B7.8 | AAA68339 | ABCA10 | XP_085647
PDR12 | NP_015267 | CG11897 | AAF56869 | C44B7.9 | AAA68340 | ABCA12 | NP_056472
PDR15 | NP_010694 | CG11898 | AAF56870 | C47A10.1 | CAB03973 | ABCA13 | NP_689914
PDR5 | NP_014796 | CG12703 | AAF49018 | C48B4.4 | CAA82384 | ABCB5 | AAO73470
PXA1 | NP_015178 | CG14709 | AAF54656 | C54D1.1 | AAC48149 | ABCB7 | AB005289
PXA2 | NP_012733 | CG1494 | AAF50838 | C54G10.3 | CAA99810 | ABCB9 | AC002486
SNQ2 | NP_010294 | CG1703 | AAF48069 | C56E6.1 | AAA81093 | ABCC10 | NP_258261
Ste6 | NP_012713 | CG1718 | AAF50837 | C56E6.5 | AAA81092 | ABCC11 | NP_149163
YBT1 | NP_013052 | CG17338 | AAF53736 | DH11.3 | CAA88940 | ABCC12 | NM_033226
YCF1 | NP_010419 | CG17646 | AAF51341 | E03G2.2 | CAA92148 | ABCC13 | NP_742021
YDR061w | NP_010346 | CG1801 | AAF50836 | F02E11.1 | AAB66050 | ABCF1 | AAH34488
YDR091C | NP_010376 | CG1819 | AAF50847 | F12B6.1 | AAB54153 | ABCF2 | NP_005683
yEF3 | NP_013350 | CG1824 | AAF48177 | F14F4.3 | CAB54225 | ABCF3 | NP_060828
yEFB | P53978 | CG18633 | AAF56360 | F18E2.2 | CAA99835 | ABCG5 | AF320293
YER036C | NP_010953 | CG2316 | AAF59367 | F19B6.4 | CAA93461 | ABCG8 | AF320294
YHL035C | NP_011828 | CG3164 | AAF51548 | F20B6.3 | AAA82317 | ABCR (A4) | AF001945
YKR103W | NP_013030 | CG3327 | AAF51122 | F21G4.2 | CAB02667 | ALDP (D1) | CAA79922
YNR070w | NP_014468 | CG4225 | AAF55241 | F22E10.1 | CAA91799 | ALDR (D2) | NP_005155
YOL075C | NP_014567 | CG4562 | AAF55707 | F22E10.2 | CAA91800 | BCRP (G2) | XP_032425
YOR011w | NP_878167 | CG4794 | AAF55726 | F22E10.3 | CAA91801 | BSEP (B11) | AF091582
YOR1 | NP_011797 | CG4822 | AAF51552 | F22E10.4 | CAA91802 | CFTR (C7) | AAC13657
YPL226W | S65245 | CG5651 | AAF50342 | F42A10.1 | AAA19072 | MABC1 (B8) | AF047690
 | | CG5789 | AAF56312 | F42E11.1 | CAA91463 | MABC2 (B10) | XP_001871
 | | CG5853 | AAF52835 | F43E2.4 | AAC71121 | MDR1 (B1) | 4505769
 | | CG5944 | AAF49305 | F55G11.9 | CAB05222 | MDR3 (B4) | AAA36207
 | | CG6052 | AAF49312 | F56F4.6 | AAB54203 | MRP1 (C1) | AAB46616
 | | CG6162 | AAF56584 | F57A10.3 | CAB09418 | MRP2 (C2) | CAA65259
 | | CG6214 | AAF53223 | F57C12.4 | AAB07022 | MRP3 (C3) | AB010887
 | | CG7346 | AAF50035 | F57C12.5 | AAD31550 | MRP4 (C4) | NP_005836
 | | CG7491 | AAF53328 | K08E7.9 | CAB01232 | MRP5 (C5) | AAB71758
 | | CG7627 | AAF52648 | T02D1.5 | CAB05909 | MRP6 (C6) | AF076622
 | | CG7806 | AAF52639 | T10H9.5 | AAC19238 | MTABC3 (B6) | NP_005680
 | | CG7955 | AAF47525 | T21E8.1 | CAA94220 | PMP69 (D4) | AF009746
 | | CG8473 | AAF48511 | T21E8.2 | CAA94219 | PMP70 (D3) | CAA41416
 | | CG8799 | AAF58947 | T21E8.3 | CAA94203 | RNAse LI (E1) | CAA53972
 | | CG8908 | AAF57490 | T26A5.1 | AAC77504 | SUR1 (C8) | AAB02278
 | | CG9270 | AAF53950 | T27E9.7 | CAB04880 | SUR2 (C9) | AF061323
 | | CG9281 | AAF48493 | W04C9.1 | AAC68724 | TAP1 (B2) | CAA40741
 | | CG9330 | AAF49142 | W09D6.6 | CAB04947 | TAP2 (B3) | AAA59841
 | | CG9663 | AAF51130 | Y39D8C.1 | AAC69223 | WHITE 1 (G1) | AAC51098
 | | CG9664 | AAF51131 | Y39E4B.1 | CAB54424 | WHITE 2 (G4) | NP_071452
 | | CG9892 | AAF51223 | Y42G9A.6 | NP_498332 | |
 | | CG9990 | AAF56807 | Y43F8C.12 | CAA21622 | |
 | | Mdr49 | AAF58437 | Y47D3A.11 | CAB57891 | |
 | | Mdr50 | AAF58271 | Y48G8AL.11 | AAK29911 | |
 | | Mdr65 | AAF50669 | Y49E10.9 | CAB11549 | |
 | | Scarlet | AAF49455 | Y50E8A.16 | CAB60586 | |
 | | Sur | AAF52866 | Y53C10A.9 | CAA22142 | |
 | | White | AAF45826 | Y57G11C.1 | CAB16503 | |
 | | | | Y75B8A.26 | CAA22110 | |
 | | | | ZK455.7 | CAA91467 | |
 | | | | ZK484.2 | AAK39394 | |

Typing ABCs to subfamily

We define membership of a particular gene in an ABC transporter subfamily primarily on the basis of the position of its ATP-binding domains in our first phylogenetic tree (not shown). Genes that fell unambiguously within a clade containing genes already assigned to a given subfamily were included in that subfamily. Where we could not assign a gene to a particular clade with a significant bootstrap value, the assignment was made on the basis of which subfamily's members scored highest when that gene was used as query in a BLAST search. The subfamilies are sometimes named according to the well-characterized mammalian genes that typify each of them, for example, P-gp (P-glycoprotein), MRP, White gene homologs, RNAse L inhibitor, GCN20 homologs, ABC1 and ALDP [7]. These correspond to the HUGO-defined subfamilies B, C, G, E, F, A and D, respectively. Re-analysis of the full-length sequences confirmed the placement of all C. elegans genes within the preexisting subfamilies, with substantial bootstrap support (Figures 3,4,5,6,7).
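The two-step assignment rule described above can be sketched as a small decision function. This is an illustration under stated assumptions rather than the procedure actually used: the 50% bootstrap threshold is borrowed from the Figure 2 legend (the text here says only "significant"), BLAST hits are assumed to carry a score where higher is better, and the function and parameter names are hypothetical.

```python
def assign_subfamily(clade_subfamily, bootstrap, blast_hits, min_bootstrap=50.0):
    """Assign a gene to a subfamily by tree position, falling back to BLAST.

    clade_subfamily: subfamily of the clade containing the gene, or None
    bootstrap:       percent bootstrap support for that placement
    blast_hits:      list of (subfamily, score) with the gene used as query
    min_bootstrap:   assumed significance threshold (not given in the text)
    """
    if clade_subfamily is not None and bootstrap >= min_bootstrap:
        return clade_subfamily, "phylogeny"
    # Fallback: subfamily of the highest-scoring BLAST hit.
    best_subfamily, _ = max(blast_hits, key=lambda hit: hit[1])
    return best_subfamily, "blast"

# Well-supported placement: the tree decides.
print(assign_subfamily("B", 92.0, [("C", 300.0), ("B", 280.0)]))
# Weakly supported placement: the best BLAST hit decides.
print(assign_subfamily("B", 30.0, [("C", 300.0), ("B", 280.0)]))
```

Returning the evidence type alongside the subfamily makes it easy to flag the assignments that rest on similarity scores alone.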

Instances of orthology

In the set of worm and human ABC transporters, only 8 of 49 possible pairs (16%) of sister genes contained a single human protein and a nematode homolog (Table 3). Similarly, 10% of ABC transporters were found in orthologous pairs when the comparison is made between yeast and worm genomes. A more comprehensive comparison of worm and yeast genomes [23] came to the overall conclusion that 57% of genes in highly conserved gene families were found in orthologous pairs, and the study suggested that such gene families provide a conserved core proteome which forms the basis of eukaryote biochemistry. ABC transporters are conserved in all eukaryotic and prokaryotic genomes, so it is interesting to note that they are found in orthologous pairs much less frequently than most gene families that are roughly as well conserved. Clearly, ABC transporter evolution has not been typical of strongly conserved gene families, and while we might have inferred that ABC-transporter-mediated metabolism differs radically among eukaryotes, this seems improbable, given the broadly comparable set of substrates associated with ABC transporters in all eukaryotes where they have been studied.
Table 3

Frequency of orthologous pairs among ABC transporters

   | Sc | Ce | Dm | Hs
Sc |  | 10% | 17% | 10%
Ce | 3 |  | 14% | 16%
Dm | 5 | 8 |  | 22%
Hs | 5 | 8 | 11 |

Numbers below the diagonal represent the number of orthologous pairs of ABC transporters, according to our phylogeny, found in pairwise comparisons between each of the four genomes in this study. Percentages above the diagonal are calculated from the corresponding number given for that pair, divided by the smaller of the two counts of ABC transporters in that pair of genomes. Ce, C. elegans; Dm, D. melanogaster; Hs, H. sapiens; Sc, S. cerevisiae.
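The percentage rule stated in this note can be written out as a short calculation. The family sizes are those given in the Abstract (30 yeast, 60 worm, 57 fly and 49 human ABC transporters); the function name and the dictionary are illustrative, and rounding to the nearest whole percent is an assumption.

```python
ABC_COUNTS = {"Sc": 30, "Ce": 60, "Dm": 57, "Hs": 49}

def ortholog_percent(pairs, genome_a, genome_b, counts=ABC_COUNTS):
    """Orthologous pairs as a percentage of the smaller of the two families."""
    return 100.0 * pairs / min(counts[genome_a], counts[genome_b])

print(round(ortholog_percent(8, "Ce", "Hs")))   # 8 pairs out of 49 human genes
print(round(ortholog_percent(3, "Sc", "Ce")))   # 3 pairs out of 30 yeast genes
print(round(ortholog_percent(11, "Dm", "Hs")))  # 11 pairs out of 49 human genes
```

Dividing by the smaller family gives the maximum possible number of one-to-one pairs as the denominator, so the percentage is not deflated by the size of the larger genome's family.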

Within the P-gp-related ABCB subfamily, the only one-to-one pairings found between C. elegans and human genes are those of W09D6.6 (Haf-5) and MTABC3 (B6), and Y48G8AL.11 (Haf-6) and MABC1 (B8). These are both half-transporters localized to the mitochondria. MTABC3 (B6) is involved in iron homeostasis [24] and its rat ortholog, PRP, is overexpressed during hepatocarcinogenesis [25]. Two other mitochondrial ABC transporters in humans, MABC2 (B10) and ABCB7, have orthologs in flies and/or yeast, but not nematodes. Among ABCC molecules, whose range of functions broadly overlaps with P-gps, only C18C4.2 (Cft-1) and CFTR (C7) are indicated as orthologs in our analysis. However, the bootstrap value on this pairing is very low (51%, see Figure 5), so we cannot attach much confidence to this observation. It may simply be that C18C4.2 (Cft-1) is a highly divergent member of subfamily C, and does not bear much functional similarity to CFTR (C7). Although not forming simple pairs with any nematode gene, human MRP5 (C5), a transporter of nucleotide analogs [26,27], and ABCC11 and ABCC12 appear to be co-orthologous to worm F14F4.3 (Mrp-5), which may provide some hint as to the function of the latter. All four of the C. elegans members of subfamilies E and F (Figure 6) form strongly supported and unambiguous pairs with their homologs in D. melanogaster, Homo sapiens, and yeast. This unusually strong conservation, compared to the other subfamilies of ABC genes, argues for involvement in something indispensable, at least on an evolutionary timescale. The three genes in subfamily F, which lack transmembrane domains, are generally regarded as forming ribosome associated proteins involved in regulation of mRNA translation, rather than transporters. 
The RNase L inhibitor (E1), also known as the oligoadenylate-binding protein (OABP), is thought to be involved in the regulation of the interferon-induced antiviral response [28] that bears some similarities to the mechanism thought to underlie the now common molecular biology technique of double-stranded RNA-directed interference (RNAi). It also seems to have a role in muscle differentiation [29] in mammals. The critical role of the RNase L inhibitor is underlined by its conservation even in a highly reduced genome. In the rather minimal genome of the endosymbiotic Guillardia theta nucleomorph (302 genes) the RNase L inhibitor is the only ABC protein found [30]. The yeast ortholog of the RNase L inhibitor protein, YDR091c, is essential for growth, as is YER036c, the yeast ortholog of T27E9.7/ABCF2 [31]. On the other hand GCN20, the yeast version of F42A10.1/ABCF3, is not essential, although mutants do have specific defects in translation.

Processes of gene duplication and loss

While the conservation of simple orthologous gene pairs is a rare observation in our study, the numbers of genes in most ABC transporter subfamilies are about the same across genomes, despite numerous instances of gene duplication and loss. For example, within ABCB the number of half-transporters in each genome is almost identical. Furthermore, most mammalian half-transporters in subfamily B are found in clusters of functionally related, or at least co-localized, genes (the TAP (B2 and B3) genes, and the four mitochondrial ABCB genes, MABCs1 and 2 (B8 and B10), MTABC3 (B6) and ABCB7 [32]), paired with similarly sized groups of C. elegans genes. Likewise the number of genes in subfamilies A, C and D is much the same between genomes. However, it does appear that C. elegans, relative to humans, has undergone a massive expansion in the P-gp (full or pseudo-dimer configuration) subclass of subfamily B, and subfamily G, the 'White-like' genes. The likelihood that ABC transporter lineages have been lost repeatedly in evolution is evident from the phylogeny. The single group of P-gps in mammals contains only four members, while C. elegans has 15 P-gps, of which only three are closely related to their mammalian homologs. A literal reading of the tree (Figure 4) would suggest the presence of five additional P-gp lineages in the common ancestor of nematodes, flies and mammals that have been lost, independently, in both mammals and flies. These losses, and the species-specific expansion of the remaining lineages of genes, underline the peculiarly dynamic composition of this group of multifunctional transport proteins.

Conclusions

The completion of the C. elegans and D. melanogaster genome projects [33,34] makes it possible to analyze entire gene families in metazoans. The advantage of performing a combined analysis of all known ABC proteins from two organisms is that it allows unambiguous identification of orthologous pairs of genes, as well as allowing the pattern of evolution by a process of gene duplication, lineage sorting, and functional convergence to be explicitly modeled. Saurin et al. [35] surveyed the ABC transporters, considering both eukaryotic and prokaryotic systems, and found that there is a fundamental phylogenetic division among ABC transporters involved in import versus export processes. The importer class of ABCs is found only in prokaryotes, whereas exporters are found in all domains of life [35]. However, that survey, while covering all classes of ABC transporter, was not comprehensive with respect to any of the organisms surveyed. Most recently, Schriml and Dean [10] compared the human ABC family to that of the mouse Mus musculus, and found almost perfect identity between the two genomes. We have integrated previous information with the complete inventory of ABC transporters from the genome of the nematode worm C. elegans. We find that most of the ABC transporters in the worm can be classified into the existing human transporter taxonomy. We find 60 ABC transporters in the worm genome, representing an overall doubling in size of the ABC transporter family relative to yeast, whose genome contains one third as many protein-coding genes. No ABC genes were found that could be classified among the bacterial import proteins. At least three subfamilies of ABC transporter contain members capable of conferring an MDR phenotype, and transporters from at least two different subfamilies cause MDR in human tumors [36].
A multi-drug transporter is a single protein capable of specifically recognizing several structurally distinct classes of compounds, and which catalyzes their efflux from the cell or sequestration in a subcellular compartment. Proteins of the P-glycoprotein (P-gp) group (ABCB) transport hydrophobic compounds and function in transport of lipids and bile from the liver as well as generally defending the body from toxic natural products in the diet [37]. P-gps are also a component of the blood-brain barrier and function in tolerance of drugs normally minimally toxic to mammals, such as ivermectin [38]. Multi-drug resistance mediated by MRP group (ABCC) proteins depends on a slightly different mechanism. MRPs seem to function by co-transporting toxic compounds with glutathione, or as glutathione conjugates [36]. An MDR phenotype is also associated with some members of the ABCG group of transporters, in both yeast [39] and humans [40]. The MDR phenotype appears to have evolved not just once, but at least three times in the history of ABC transporters. Given the distribution of MDR-causing and non-MDR genes among mammalian P-gps, it seems reasonable to infer that MDR genes may well have arisen more than once among the P-gps themselves. It has been observed [41,42] that the entire ABC transporter family is characterized by a highly adaptable common mechanism for coupling substrate binding to ATP hydrolysis and extrusion. It has been pointed out that, because P-gp recognizes substrate directly within the cytoplasmic leaflet of the plasma membrane [43], it does so at a much higher effective substrate concentration than would be the case if it recognized aqueous substrate. As a result, P-gp drug-binding sites can operate at relatively low affinity, and this, in turn, facilitates recognition of multiple substrates. This flexibility may be the key to explaining the range of tasks performed by ABC transporters, but also their apparently anomalous evolutionary history.
The mammalian P-gps include proteins capable of producing an MDR phenotype (MDR1 (B1)), as well as members whose specificity is apparently restricted to single physiological substrates such as phosphatidylcholine (MDR3 (B4)). As none of these has a simple orthologous relationship with any of the C. elegans P-gps, no detailed predictions of function in nematode P-gps can be drawn on the basis of phylogeny alone. C. elegans P-gps do differ from one another in their ability to cause resistance to various environmental toxins [16], with no apparent correlation between phenotype and genetic distance from their mammalian homologs. Both human ABCA1 and nematode ced-7 mutants present similar apoptotic phenotypes, despite their rather distant relationship (Figure 3). ABCA1 mutations also cause defects in high-density lipoprotein cholesterol transport, and it is still an open question whether the analogous function of these two homologs in apoptosis accurately predicts a sharing of other functions. Similar limitations on the extent to which function may be predicted from sequence alone are likely to obtain in those subfamilies whose members are noted for variability and multiplicity of function, that is, subfamilies A, B, C and G. Schriml and Dean [10] speculated that the distinct clustering of amino- and carboxy-terminal halves of ABCA proteins suggests that full ABC transporters have generally evolved from half-transporters. The pattern of structural change within the closely related subfamilies ABCD, ABCC and ABCB does suggest that the half-transporter configuration was the ancestral one for at least these three subfamilies (Figure 2). It also reveals instances where half-transporters have evolved from duplicated genes, as in the origination of ABCB from a fragment of an ABCC gene, and shows that, in turn, some ABCB genes have duplicated again, giving rise to the P-gp genes.
A comprehensive comparison of worm and yeast genomes [23] noted that while most of the nematode genome did not closely resemble that of yeast, there was a strongly conserved 20% of the nematode genome that had a high degree of homology to a corresponding 40% of the yeast genome. Within this highly conserved subset of genes, there was a very frequent finding of orthology between members of the two genomes. As many as 57% of the most closely related gene pairs contained exactly one worm and one yeast gene. The obvious inference is that one corresponding gene was present in the common ancestor of the two species. Their overall picture of genome evolution is one in which a conserved cadre of proteins performs core biological functions required by all eukaryotes. These would remain essentially invariant throughout eukaryotes, and one expects analogous functions to be carried out by orthologous genes across large evolutionary distances. These gene families are presumably protected over the long run by their essential and irreplaceable roles in basic biochemical functions required by all organisms. However, as Chervitz et al. [23] point out, only a minority of gene families fit this mode, with most genes belonging to poorly conserved or taxonomically restricted families. We expected that the frequency of simple orthologous gene pairs typical of highly conserved gene families shared by both yeast and worm would hold true for our comparison between nematode and human versions of such a highly conserved gene family as ABC transporters. However, this generality clearly does not apply to ABC transporters, despite their strong conservation across all domains of life. It seems reasonable to suppose that the rather loose relationship between substrate specificity and amino acid sequence that characterizes ABC transporters allows for much more potential exchange and sorting of biological functions among homologous genes than is typical. 
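The 57% figure quoted above is a simple proportion taken over clusters of closely related genes. As a minimal sketch, assuming each cluster is available as a list of (species, gene) pairs, the calculation could be written as follows. The function name, the species labels and the toy clusters are hypothetical placeholders and are not data from Chervitz et al. [23].

```python
from collections import Counter

def fraction_one_to_one(clusters, species_a="worm", species_b="yeast"):
    """Fraction of clusters that hold exactly one gene from each species."""
    if not clusters:
        raise ValueError("no clusters given")
    simple = 0
    for cluster in clusters:
        # Count how many genes each species contributes to this cluster.
        counts = Counter(species for species, _gene in cluster)
        if len(cluster) == 2 and counts[species_a] == 1 and counts[species_b] == 1:
            simple += 1
    return simple / len(clusters)

# Hypothetical toy input: two simple 1:1 clusters and two with extra paralogs.
toy_clusters = [
    [("worm", "w1"), ("yeast", "y1")],
    [("worm", "w2"), ("yeast", "y2")],
    [("worm", "w3"), ("worm", "w4"), ("yeast", "y3")],
    [("worm", "w5"), ("yeast", "y4"), ("yeast", "y5")],
]
print(fraction_one_to_one(toy_clusters))
```

A low value of this fraction for a gene family, as we report for the ABC transporters, indicates that duplication and loss have obscured simple orthology.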
In turn, this pervasive pre-adaptation for functional overlap enables organisms to survive the occasional loss of substantial numbers of ABC transporters and to rapidly re-evolve lost functionality by co-opting homologous genes. The evolutionary dynamic we propose here is reminiscent of an explanation put forward by Huynen et al. [44] to explain a pattern observed in a comparative analysis of 11 microbial genomes. They found that the frequency distribution of gene-family sizes within each completely sequenced genome tended to follow a power-law distribution across a 30-fold range of genome sizes. Their model is one in which genes are duplicated or deleted randomly in time, but the gene families are coherent with respect to the probability of duplication or deletion in each time unit in the simulation. In other words, the probability of duplicating or deleting a gene may change over time, but every member of a gene family always has the same probability of duplication or deletion as every other member of the family. So, whereas a given family can be either favored for expansion or targeted for deletion in a given time period, all members of the family are equally favored or disfavored by selection at the same time. Huynen et al. argued that this property of 'dynamic coherence' in a gene family could arise if all gene-family members have more or less the same function, so that they are all favored or disfavored by selection at the same time, depending on how much that function is needed. Under a power-law distribution, gene families would tend to be subject to fluctuations of a size on the same order as the gene-family size itself [44]. We should then expect that typical gene families will have undergone very substantial episodes of expansion and near-extinction, and in Huynen et al.'s model all gene families do become extinct within a finite time. 
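The behavior described above can be illustrated with a small simulation. This is only a sketch in the spirit of the model of Huynen et al. [44], not a reimplementation of it: we assume that in each time unit a family is either favored or disfavored with equal probability, and that each member is then independently duplicated (if favored) or deleted (if disfavored) with the same probability `p_event`. The function name, the 50:50 choice and the default parameter values are all our assumptions.

```python
import random

def simulate_family(size, steps, p_event=0.1, rng=None):
    """Return the size of one gene family at every time step.

    Assumed rule: in each step the whole family is either favored or
    disfavored; every member then shares the same probability p_event
    of being duplicated (favored) or deleted (disfavored).
    """
    rng = rng or random.Random()
    history = [size]
    for _ in range(steps):
        if size > 0:
            favored = rng.random() < 0.5
            # Number of members hit by an event in this time unit.
            changed = sum(rng.random() < p_event for _ in range(size))
            size = size + changed if favored else size - changed
        # An extinct family (size 0) can never recover.
        history.append(size)
    return history

# Example: follow a family of 10 genes for 200 time units.
trajectory = simulate_family(10, 200, p_event=0.2, rng=random.Random(1))
print(min(trajectory), max(trajectory), trajectory[-1])
```

Because every member is hit by the same kind of event in a given step, the change in family size scales with the size itself, which is the source of the large fluctuations discussed above.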
It is evident that ABC transporters are highly atypical for a strongly conserved gene family, in that the family as a whole is highly conserved across genomes despite being subject to the same large fluctuations in size, which would tend to eventually eliminate gene families whose members are not individually indispensable. It should be noted that the ABC family does not seem uniformly subject to one or the other mode of evolution. Subfamilies E and F, which are not involved with transport, but rather have roles in translation and gene regulation, fit the 'strongly conserved' [23] model very well, retaining simple orthologous relationships over long spans of time. Only the transporter subfamilies themselves, because of their highly adaptable substrate-recognition capability, are subject to large fluctuations in size. We propose that finding large sets of paralogous genes, and infrequently conserved orthologs, in a gene family reflects ongoing cycles of gene loss and reacquisition of analogous functions in distantly related, newly expanded, lineages. Furthermore, we suggest that this is in fact the expected outcome of dynamic coherence, a mode shared, perhaps, by most of the less-conservative gene families, as well as the ABC genes. We expect that future functional studies, to determine the extent of parallel and convergent evolution among ABC transporters, will eventually allow us to discern the fundamental roles of ABC transporters that ensure their long-term survival as a group. Also of interest will be whether the functional suites of genes fulfilling these roles are bounded in any way that resembles the phylogenetic subdivisions into which we presently categorize these proteins.

Materials and methods

Identification of ABC transporter genes

A computer file, WormPep16 [45], containing 16,332 protein sequences predicted from the completed C. elegans genome was searched using the FASTA program [46]. Our initial query sequences were those of known C. elegans ABC proteins (for example, Pgp-1, the D. melanogaster white gene homolog T26A5.1, and so on). Matching protein sequences returned by FASTA were checked by BLAST [47], using either the NCBI [48] or Baylor College of Medicine (BCM) servers [49]. Only those with highly significant matches to annotated ABC proteins in the sequence database were retained. The most poorly matched, verified ABC protein from each FASTA run was used as the query sequence for an additional FASTA search, and this process was repeated until no new ABC proteins were found. At a later stage in the analysis, representative members of different ABC transporter subfamilies were used as query sequences to search the updated WormPep81 file using a BLAST server at the Sanger Centre [45]. Searches were conducted using multiple queries until all proteins already included in our dataset were found. No additional ABC proteins were identified, though some sequences were found to have been included in our dataset twice under different names. These redundant sequences were eliminated. FASTA searches were run on a SUN Microsystems UltraSPARC 5 computer. All other computer operations were carried out on an Apple Power Macintosh G3. Yeast and human ABC transporter sequences were obtained from NCBI and are described in the literature [10,13].
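The iterative search strategy described above can be summarized as a loop that stops when a round returns nothing new. The following is a minimal sketch under stated assumptions: `search` is a hypothetical stand-in for one FASTA run followed by BLAST verification, returning (protein ID, score) pairs in which a higher score means a better match, and the protein IDs and scores in the toy table are invented for illustration.

```python
def iterative_search(seed, search):
    """Collect proteins by repeatedly re-querying with the weakest verified hit.

    `search(query)` is assumed to return a list of (protein_id, score)
    pairs for verified hits only; higher scores are better matches.
    """
    found = {seed}
    query = seed
    while True:
        hits = search(query)
        new = {pid for pid, _score in hits} - found
        if not new:
            # No new proteins in this round: the search has converged.
            return found
        found |= new
        # The most poorly matched verified hit becomes the next query.
        query = min(hits, key=lambda hit: hit[1])[0]

# Hypothetical toy similarity table standing in for the sequence database.
TOY_HITS = {
    "prot_A": [("prot_B", 900.0), ("prot_C", 300.0)],
    "prot_C": [("prot_A", 300.0), ("prot_D", 850.0), ("prot_E", 120.0)],
    "prot_E": [("prot_C", 120.0), ("prot_F", 700.0)],
}

def toy_search(query):
    return TOY_HITS.get(query, [])

print(sorted(iterative_search("prot_A", toy_search)))
```

Querying with the weakest verified hit pushes each round toward the most divergent members of the family; the loop always terminates because the set of found proteins must grow in every round that continues.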

Identification of ABC protein features

BLAST + Beauty searches on the BCM server identified the location of the conserved Walker A and ABC signature motifs (Prosite motifs [50] PS00017 and PS00211, respectively) associated with the ATP-binding cassette(s) of each protein. The number and positions of transmembrane domains in each ABC protein were predicted by using TopPred II v1.3 [51] and then vetting the program's results by eye to exclude spurious transmembrane segments. Chromosomal locations of each ABC protein in the C. elegans genome were looked up in the C. elegans database AceDB [52].
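For readers who wish to locate these motifs without the BCM server, a rough equivalent can be sketched with regular expressions. The Walker A pattern below follows the Prosite PS00017 definition, [AG]-x(4)-G-K-[ST]. For the ABC signature we assume only the widely quoted LSGGQ core consensus, which is a deliberate simplification and not the full PS00211 pattern; the test sequence is invented.

```python
import re

# Lookahead groups let overlapping matches be reported as well.
MOTIFS = {
    # Prosite PS00017 (Walker A / P-loop): [AG]-x(4)-G-K-[ST]
    "walker_a": re.compile(r"(?=([AG].{4}GK[ST]))"),
    # Simplified stand-in for PS00211: the LSGGQ core consensus only.
    "abc_signature": re.compile(r"(?=(LSGGQ))"),
}

def find_motifs(sequence):
    """Return the 0-based start positions of each motif in a protein sequence."""
    sequence = sequence.upper()
    return {
        name: [match.start() for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }

# Invented toy sequence containing one copy of each motif.
toy_sequence = "MKTLGPSGSGKSTLLRALSGGQKQR"
print(find_motifs(toy_sequence))
```

Real ABC signature motifs often deviate from LSGGQ, so a search of this kind would miss divergent family members that a full Prosite scan would detect.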

Phylogenetic analyses

Using the information derived from each protein sequence (as above) we extracted only the sequence of each predicted ATP-binding cytoplasmic domain. These domains were assembled into a single file using the SeqApp1.9 multiple sequence editor [53], and aligned using ClustalX [54]. In those cases where two ATP-binding cassettes (ABCs) are present in a single protein with no intervening transmembrane domains (Subfamilies E and F, see Figure 1), the entire sequence was divided into two at an arbitrary point halfway between the two predicted ABC domains. As a result, 'two-domain' proteins are represented twice in our initial analysis. Once this approach had been used to assign genes to particular well-supported subgroups, we realigned the sequences and reanalyzed the relationships within each group using full-length amino acid sequence data. Aligned sequences were used to generate matrices of mean distances between proteins, and these matrices were used to generate phylogenetic trees according to the neighbour-joining algorithm [55], refined using the SPR branch-swapping technique under the minimum evolution criterion, implemented by PAUP*4.0b10 [56]. Bootstrapping (1,000 replicates) was done according to the method of Felsenstein [57], using the same parameters described above. Phylogenetic trees were visualized and manipulated using TreeView 1.6.2 [58] and MacClade 3.0.4 [59].
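Two of the steps above, the mean-distance matrix and the bootstrap resampling of alignment columns, are simple enough to sketch directly. The code below is an illustration only and does not reproduce PAUP*: we assume that positions with a gap in either sequence are skipped when computing a pairwise distance, and the toy alignment is invented. Tree building by neighbour-joining and SPR refinement is not shown.

```python
import random

def mean_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ between two sequences.

    Assumption: positions with a gap ('-') in either sequence are ignored.
    """
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_matrix(alignment):
    """Pairwise mean distances for a dict of name -> aligned sequence."""
    names = list(alignment)
    return {
        (p, q): mean_distance(alignment[p], alignment[q])
        for p in names
        for q in names
    }

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }

# Invented toy alignment of three short sequences.
toy_alignment = {"a": "ACDEFG", "b": "ACDEYG", "c": "AC-KYG"}
rng = random.Random(0)
replicates = [distance_matrix(bootstrap_replicate(toy_alignment, rng))
              for _ in range(100)]
print(distance_matrix(toy_alignment)[("a", "c")])
```

In a full analysis, a tree would be built from each replicate matrix, and the support for a branch would be the fraction of replicate trees in which that branch appears.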

Additional data files

The following additional data are included with the online version of this article: the protein sequence alignments for the ABCA subfamily (Additional data file 1), the ABCB subfamily (Additional data file 2), the ABCC subfamily (Additional data file 3), the ABCD subfamily (Additional data file 4), the ABCE and ABCF subfamilies (Additional data file 5), the ABCG subfamily (Additional data file 6), the ABCH subfamily (Additional data file 7), and the protein sequences from the nucleotide-binding folds only (Additional data file 8). In addition to the four genomes discussed in this paper, mouse (M. musculus) ABC transporter genes are included in some of these alignments. All eight files are in Nexus format, which is a plain-text format designed for use with the programs PAUP [56] and MacClade [59]. A Nexus Data Editor for Windows is also available [60].
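For orientation, the general layout of a Nexus DATA block can be sketched as follows. This is a minimal writer for illustration only; it is not the script used to produce the additional data files, the function name and toy sequences are invented, and it assumes taxon names are simple tokens of letters, digits and underscores that need no quoting.

```python
def to_nexus(alignment, datatype="PROTEIN"):
    """Format a dict of name -> aligned sequence as a minimal Nexus DATA block."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    nchar = lengths.pop()
    width = max(len(name) for name in alignment)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(alignment)} NCHAR={nchar};",
        f"  FORMAT DATATYPE={datatype} MISSING=? GAP=-;",
        "  MATRIX",
    ]
    # One row per taxon: padded name followed by its aligned sequence.
    lines += [f"    {name.ljust(width)}  {seq}" for name, seq in alignment.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"

# Invented toy alignment.
print(to_nexus({"seq_a": "ACDE", "seq_b": "AC-E"}))
```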
