Víctor González, Patricia Bustos, Miguel A Ramírez-Romero, Arturo Medrano-Soto, Heladia Salgado, Ismael Hernández-González, Juan Carlos Hernández-Celis, Verónica Quintero, Gabriel Moreno-Hagelsieb, Lourdes Girard, Oscar Rodríguez, Margarita Flores, Miguel A Cevallos, Julio Collado-Vides, David Romero, Guillermo Dávila.
Abstract
BACKGROUND: Symbiotic bacteria known as rhizobia interact with the roots of legumes and induce the formation of nitrogen-fixing nodules. In rhizobia, essential genes for symbiosis are compartmentalized either in symbiotic plasmids or in chromosomal symbiotic islands. To understand the structure and evolution of the symbiotic genome compartments (SGCs), it is necessary to analyze their common genetic content and organization as well as to study their differences. To date, five SGCs belonging to distinct species of rhizobia have been entirely sequenced. We report the complete sequence of the symbiotic plasmid of Rhizobium etli CFN42, a microsymbiont of beans, and a comparison with other SGC sequences available.
Year: 2003 PMID: 12801410 PMCID: PMC193615 DOI: 10.1186/gb-2003-4-6-r36
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Structure of the symbiotic plasmid p42d of R. etli CFN42. The structure of p42d is represented in five concentric circles. Outermost circle, relevant regions referred to in the text: NRa, b and c, regions containing nitrogenase structural genes; FIX1 and FIX2, clusters containing nitrogen-fixation genes; NOD, major cluster of nodulation genes; CPX, cluster for cytochrome P450; TRA, cluster for tra genes; REP, replicator region; TSSIII and IV, clusters for transport secretion system genes. The 125 kb region that contains most of the symbiotic genes, described in the text as a putative mobile element, is shown in green. Second circle, predicted CDSs arranged by direction of transcription and color-coded as below; those transcribed on the plus strand are shown in the outer half of the circle. For each class, the number of CDSs and the percentage of the total are: hypothetical (70) 19.5% (dark red); hypothetical conserved (62) 17.3% (red); integration recombination (55) 15.3% (purple); various enzymatic functions (45) 12.3% (khaki); transport secretion systems (37) 10.3% (gray); nitrogen fixation (35) 9.8% (yellow); nodulation (18) 5% (dark blue); transcriptional regulation (15) 4.2% (light blue); plasmid maintenance (10) 2.8% (orange); electron transfer (7) 2.1% (magenta); chemotaxis (3) 0.8% (pink); and polysaccharide synthesis (2) 0.6% (green). Third circle, elements related to insertion sequences (ERIS): putative partial ISs (purple) and putative complete ISs (black). Fourth circle, reiterated DNA families. The major reiterated families (see text) are shown in different colors. Innermost circle, potential genomic rearrangements. Arrowheads indicate the sites for homologous recombination leading to genomic rearrangements. Black lines connect sites for amplification or deletion events; red lines connect sites for inversion.
Figure 2. Compositional features of the coding sequences (CDSs) of p42d. (a) GC content and (b) codon usage (CU) of the 359 CDSs of p42d. Red lines indicate the average GC content (58.1%) and CU (0.58). Blue lines indicate one standard deviation from the average: ± 3.5% for GC and ± 0.16 for CU. The highest and lowest GC values are 69.4% and 45.8%, respectively. CU values range from 0.11 to 1.00. (c) CDS distribution with the color codes for functional classes and the relevant regions described in Figure 1.
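The GC panel of Figure 2 can be illustrated with a minimal sketch that computes the GC percentage of each CDS and flags those lying more than one standard deviation from the mean. The function names, the input dictionary, and the use of the population standard deviation are assumptions made for illustration; the source does not state which estimator was used.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def flag_atypical(cds, n_sd=1.0):
    """Return (mean, sd, outliers) for a dict mapping CDS name -> sequence.

    Outliers are CDSs whose GC% differs from the mean by more than
    n_sd standard deviations (population SD; an assumption).
    """
    values = {name: gc_percent(seq) for name, seq in cds.items()}
    gc_list = list(values.values())
    mu = mean(gc_list)
    sd = pstdev(gc_list)
    outliers = {n: v for n, v in values.items() if abs(v - mu) > n_sd * sd}
    return mu, sd, outliers
```

Applied to the 359 CDSs of p42d, a procedure of this kind would yield the mean and the one-standard-deviation band drawn as red and blue lines in the figure.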
Figure 3. Comparison of predicted proteins from p42d with those from other genomes and SGCs. Bidirectional best hits (BDBHs) between p42d and other genomes are shown. The bars in all rows represent the percentage identity (number of identities/length of the alignment) of BDBHs between p42d and the indicated genome (see below for color code). The horizontal red line in each row indicates 50% identity. A color code is shown for each genome or compartment. (a) Different organisms: Bacillus subtilis (dark magenta); Brucella melitensis (yellow); Caulobacter crescentus (red); Escherichia coli K12 (light magenta); Methanobacterium thermoautotrophicum (dark purple); and Ralstonia solanacearum (purple). (b) A. tumefaciens C58 circular chromosome (white), linear chromosome (pale gray), pAT (gray), and pTi (dark gray). (c) B. japonicum USDA110 SGC (turquoise). (d) pNGR234a (blue green). (e) M. loti R7A SGC (green). (f) M. loti MAFF303099 SGC (dark blue), and the rest of the chromosome (light blue). (g) S. meliloti pSymA (pale yellow), pSymB (yellow), and the chromosome (dark yellow). (h) CDS distribution for p42d with the color codes for functional classes and the relevant regions as indicated in Figure 1.
Number of bidirectional best hits between pairs of SGCs or complete genomes
| | p42d | pNGR234a | SGCBj | SmChr | pSymA | pSymB | MlChr | pMLa | pMLb | SGCMl |
|---|---|---|---|---|---|---|---|---|---|---|
| pNGR234a | 120 | | | | | | | | | |
| SGCBj | 88* | 133 | | | | | | | | |
| SmChr | 63 | 86 | 59 | | | | | | | |
| pSymA | 100 | 77 | 47 | ND | | | | | | |
| pSymB | 20 | 43 | 17 | ND | ND | | | | | |
| MlChr | 62 | 127 | 66 | 2367 | 321 | 613 | | | | |
| pMLa | 15 | 29 | 8 | 18 | 17 | 10 | ND | | | |
| pMLb | 5 | 12 | 11 | 8 | 23 | 12 | ND | ND | | |
| SGCMl | 81 | 116 | 79 | ND | 65 | ND | ND | ND | ND | |
| SGCR7A | 101 | 135 | 89 | 86 | 73 | 54 | 30 | 21 | 2 | 240 |
Bidirectional best hits (BDBHs) were calculated in pairwise comparisons using BLASTP. All reciprocal matches with an E-value of at most 1e-04 and a coverage of at least 50% of the length of the shorter CDS were collected. p42d, the symbiotic plasmid of R. etli, 371 kb, 359 CDS; pNGR234a, the symbiotic plasmid of Rhizobium sp. NGR234, 536 kb, 416 CDS; SGCBj, B. japonicum USDA110 symbiotic chromosomal region, 410 kb, 388 CDS; SmChr, S. meliloti chromosome, 3,600 kb, 3,396 CDS; pSymA, S. meliloti symbiotic plasmid A, 1,354 kb, 1,295 CDS; pSymB, S. meliloti symbiotic plasmid B, 1,683 kb, 1,571 CDS; MlChr, M. loti MAFF303099 chromosome without the symbiotic island, 6,425 kb, 6,172 CDS; pMLa, M. loti MAFF303099 cryptic plasmid a, 351 kb, 320 CDS; pMLb, M. loti MAFF303099 cryptic plasmid b, 208 kb, 209 CDS; SGCMl, M. loti MAFF303099 symbiotic island, 611 kb, 580 CDS; SGCR7A, M. loti R7A symbiotic island, 502 kb, 414 CDS. ND, not determined. *The number of BDBHs with the complete genome of B. japonicum USDA110 is 150.
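The BDBH criteria stated in this legend can be sketched as follows. This is only an illustration under stated assumptions: the `Hit` record, the function names, ranking candidate hits by bit score, and measuring coverage as aligned length divided by the length of the shorter protein are all hypothetical choices, since the legend gives the thresholds but not these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLASTP hit (hypothetical record; fields chosen for this sketch)."""
    query: str
    subject: str
    evalue: float
    bitscore: float
    aln_len: int  # aligned length in residues


def best_hits(hits, lengths, max_evalue=1e-4, min_coverage=0.5):
    """Map each query to its top-scoring subject among hits passing both filters."""
    best = {}
    for h in hits:
        shorter = min(lengths[h.query], lengths[h.subject])
        # Discard hits above the E-value cutoff or covering too little
        # of the shorter protein.
        if h.evalue > max_evalue or h.aln_len / shorter < min_coverage:
            continue
        current = best.get(h.query)
        # Ranking by bit score is an assumption of this sketch.
        if current is None or h.bitscore > current.bitscore:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a, lengths, **filters):
    """Return sorted (a, b) pairs that are each other's best hit."""
    fwd = best_hits(a_vs_b, lengths, **filters)
    rev = best_hits(b_vs_a, lengths, **filters)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)
```

Counting the pairs returned for each pair of replicons would give one cell of the table above.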
Figure 4. Analysis of synteny among the SGCs. Pairs of orthologous proteins among different genomes or SGCs are plotted. Each protein pair is plotted at the coordinates of the predicted translation starts of the corresponding genes on the two DNA regions. The axes correspond to the total length of the respective DNA region: p42d 371,255 bp; M. loti MAFF303099 symbiotic island 610,975 bp; M. loti R7A symbiotic island 502,000 bp; S. meliloti pSymA 1,354,226 bp; pNGR234a 536,165 bp and B. japonicum symbiotic region 410,573 bp. For each pair the first region mentioned corresponds to the x-axis. (a) p42d vs pNGR234a; (b) p42d vs pSymA; (c) p42d vs B. japonicum symbiotic region; (d) p42d vs M. loti MAFF303099 symbiotic island; (e) p42d vs M. loti R7A symbiotic island; (f) pNGR234a vs S. meliloti pSymA; (g) pNGR234a vs B. japonicum symbiotic region; (h) pNGR234a vs M. loti MAFF303099 symbiotic island; (i) pNGR234a vs M. loti R7A symbiotic island; (j) M. loti MAFF303099 symbiotic island vs M. loti R7A symbiotic island.
Figure 5. Distribution of the 20 genes common to all the SGCs analyzed. (a) p42d; (b) M. loti MAFF303099 SGC; (c) pNGR234a; (d) M. loti R7A SGC; (e) B. japonicum SGC; (f) S. meliloti pSymA. The color bars indicate the position of the genes. The nodulation genes nodABCDIJ are represented in blue, and the nitrogen-fixation genes nifHDKNEXAB, fixABCX and fdxBN are represented in yellow.