Adriana Giongo, Heather L Tyler, Ursula N Zipperer, Eric W Triplett.
Abstract
Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed to determine which genome assembly more closely resembles the chromosomal DNA, by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be due to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence, so that the community can make more accurate judgments on whether differences observed in a single strain are of biological origin or are simply caused by differences in genome assembly procedures.
Keywords: Gluconacetobacter; Optical Mapping; chromosomal rearrangements
Year: 2010 PMID: 21304715 PMCID: PMC3035290 DOI: 10.4056/sigs.972221
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Optical and in silico BglII restriction maps for G. diazotrophicus PAl 5
| | Optical map | In silico map, JGI (CP001189) | In silico map, RioGene (AM889285) |
|---|---|---|---|
| Map Length (bp) | 3,845,512 | 3,887,492 | 3,944,163 |
| Number of Fragments | 424 | 486 | 503 |
| Average fragment length (bp) | 9,070 | 7,999 | 7,841 |
| Maximum fragment length (bp) | 52,064 | 51,728 | 50,690 |
| Minimum fragment length (bp) | 562 | 24 | 28 |
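The summary rows above (map length, fragment count, average, maximum, and minimum fragment length) can be derived from any sequence by an in silico digest. The following is a minimal sketch, not the software used in the study: it assumes a circular chromosome, uses the BglII recognition site AGATCT with the cut after the first A, and the function names `cut_positions`, `fragment_lengths`, and `map_summary` are hypothetical.

```python
BGLII_SITE = "AGATCT"  # BglII recognition sequence
CUT_OFFSET = 1         # BglII cuts after the first A (A^GATCT)


def cut_positions(seq, site=BGLII_SITE, offset=CUT_OFFSET):
    """Return sorted cut coordinates, treating seq as circular (assumption)."""
    seq = seq.upper()
    n = len(seq)
    # Append the first len(site) - 1 bases so a site spanning the origin is found.
    wrapped = seq + seq[: len(site) - 1]
    cuts = set()
    start = wrapped.find(site)
    while start != -1:
        cuts.add((start + offset) % n)
        start = wrapped.find(site, start + 1)
    return sorted(cuts)


def fragment_lengths(seq):
    """Return restriction fragment lengths, in order around the circle."""
    n = len(seq)
    cuts = cut_positions(seq)
    if not cuts:
        return [n]  # uncut circle
    lengths = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # The last fragment wraps past the origin; a single cut yields length n.
        lengths.append((nxt - cut) % n or n)
    return lengths


def map_summary(lengths):
    """Summarize a fragment list with the same rows as the table above."""
    return {
        "map_length": sum(lengths),
        "fragments": len(lengths),
        "average": round(sum(lengths) / len(lengths)),
        "maximum": max(lengths),
        "minimum": min(lengths),
    }


if __name__ == "__main__":
    toy = "GG" + "AGATCT" + "CCCC" + "AGATCT" + "TTTTTTTT"  # toy sequence, not real data
    print(map_summary(fragment_lengths(toy)))
```

The average in each column of the table is consistent with map length divided by number of fragments, which is what `map_summary` computes.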
Figure 1. Alignment of the G. diazotrophicus PAl 5 optical map with in silico maps of the genome sequences. A: The BglII optical map of G. diazotrophicus PAl 5 aligned against in silico maps calculated from the genome sequences proposed by RioGene (AM889285) and JGI (CP001189). B-D: Misassemblies in the PAl 5 RioGene sequence when aligned against the optical map. B: Two large inversions in the RioGene sequence compared to the optical map. C: Large translocation in the RioGene sequence. D: Five translocations in the RioGene sequence. Dark blue represents cut sites, light blue represents aligned regions, red represents regions aligning to both sequences, and white represents unaligned regions. Alignment lines for inversions and translocations are highlighted in pink. Inverted and translocated regions are highlighted in yellow.
Rearrangement positions in G. diazotrophicus PAl 5 genome sequence from RioGene
| Rearrangement | Start (bp) | End (bp) | Length (bp) |
|---|---|---|---|
| Inversion | 391,267 | 955,614 | 564,347 |
| | 3,078,324 | 3,634,241 | 555,917 |
| Translocation | 149,268 | 358,682 | 209,414 |
| | 1,115,930 | 1,446,765 | 330,835 |
| | 1,581,823 | 1,651,595 | 69,772 |
| | 1,627,253 | 1,796,352 | 169,099 |
| | 1,796,352 | 2,662,168 | 865,816 |
| | 3,706,896 | 3,878,171 | 171,275 |
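The three numeric columns of this table read as start coordinate, end coordinate, and length. That reading is an assumption, since the column headers were lost, but it can be checked arithmetically. The sketch below, with hypothetical helper names, verifies that each length equals end minus start and flags intervals that overlap their neighbor.

```python
# Rows transcribed from the table above: (type, start, end, length).
REARRANGEMENTS = [
    ("Inversion", 391_267, 955_614, 564_347),
    ("Inversion", 3_078_324, 3_634_241, 555_917),
    ("Translocation", 149_268, 358_682, 209_414),
    ("Translocation", 1_115_930, 1_446_765, 330_835),
    ("Translocation", 1_581_823, 1_651_595, 69_772),
    ("Translocation", 1_627_253, 1_796_352, 169_099),
    ("Translocation", 1_796_352, 2_662_168, 865_816),
    ("Translocation", 3_706_896, 3_878_171, 171_275),
]


def length_mismatches(rows):
    """Return rows whose reported length differs from end - start."""
    return [row for row in rows if row[2] - row[1] != row[3]]


def overlapping_pairs(rows):
    """Return adjacent (start, end) pairs, sorted by start, that overlap."""
    spans = sorted((start, end) for _, start, end, _ in rows)
    return [(a, b) for a, b in zip(spans, spans[1:]) if b[0] < a[1]]


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(REARRANGEMENTS))
    print("overlapping pairs:", overlapping_pairs(REARRANGEMENTS))
```

Every row passes the length check. One pair of translocation intervals (1,581,823 to 1,651,595 and 1,627,253 to 1,796,352) overlaps, so summing the length column would count that shared stretch twice.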
Regions of in silico maps not aligned to the G. diazotrophicus PAl 5 optical map
| | JGI (CP001189) | RioGene (AM889285) |
|---|---|---|
| Total length of unaligned regions (bp) | 27,540 | 1,053,347 |
| Average unaligned fragment length (bp) | 574 | 5,885 |
| Maximum unaligned fragment length (bp) | 1,341 | 32,719 |
| Minimum unaligned fragment length (bp) | 24 | 28 |
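To put the unaligned totals in proportion, they can be divided by the in silico map lengths from the restriction map table. The sketch below assumes that the two columns here pair with the 3,887,492 bp and 3,944,163 bp maps respectively, based on the matching minimum fragment lengths (24 and 28 bp). The fragment counts are only approximations obtained from total length divided by average length, and the names are hypothetical.

```python
# Values transcribed from the tables; the column pairing is an assumption.
MAPS = {
    "column 1": {"map_length": 3_887_492, "unaligned": 27_540, "avg_unaligned": 574},
    "column 2": {"map_length": 3_944_163, "unaligned": 1_053_347, "avg_unaligned": 5_885},
}


def unaligned_summary(map_length, unaligned, avg_unaligned):
    """Return percent of the map left unaligned and an approximate fragment count."""
    return {
        "percent_unaligned": round(100 * unaligned / map_length, 1),
        "approx_fragments": round(unaligned / avg_unaligned),
    }


if __name__ == "__main__":
    for name, values in MAPS.items():
        print(name, unaligned_summary(**values))
```

Under these assumptions, roughly 0.7% of the first map and 26.7% of the second map is left unaligned.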
Comparison of coding sequences between G. diazotrophicus PAl 5 genome sequences based on percent identity
| Percent identity (%) | Number of CDS | % of CDS | Number of CDS | % of CDS |
|---|---|---|---|---|
| 100 | 2024 | 56.7 | 2069 | 56.0 |
| ≥ 99 | 2812 | 79.2 | 2876 | 77.8 |
| ≥ 90 | 3190 | 89.8 | 3313 | 89.6 |
| > 75 | 3267 | 92.0 | 3402 | 92.0 |
| ≥ 50 | 3326 | 93.7 | 3449 | 93.3 |
| < 50 | 225 | 6.3 | 247 | 6.7 |
| 0 | 187 | 5.3 | 168 | 4.5 |
Unique functional roles between G. diazotrophicus PAl 5 genome sequences
| | |
|---|---|
| Ribose ABC transport system, | Sorbitol dehydrogenase (EC 1.1.1.14) |
| D-alanine--D-alanine ligase (EC 6.3.2.4) | Transketolase, C-terminal section (EC 2.2.1.1) |
| UDP-N-acetylenolpyruvoylglucosamine | Transketolase, N-terminal section (EC 2.2.1.1) |
| Organic hydroperoxide resistance protein | COG0028: Thiamine pyrophosphate- |
| Organic hydroperoxide resistance | D-galactonate regulator, IclR family |
| Molybdenum cofactor biosynthesis protein B | Epi-inositol hydrolase (EC 3.7.1.-) |
| Flagellar biosynthesis protein fliL | Chromosome partition protein smc |
| Flagellar hook-associated protein flgL | dTDP-rhamnosyl transferase RfbF |
| Deoxyuridine 5’-triphosphate | Protein of unknown function DUF374 |
| Aminopeptidase S (Leu, Val, Phe, Tyr | Nicotinate-nucleotide adenylyltransferase |
| Leucyl/phenylalanyl-tRNA—protein | DNA repair exonuclease family protein |
| Cysteinyl-tRNA synthetase (EC 6.1.1.16) | ATP-dependent DNA helicase UvrD/PcrA, |
| tRNA:Cm32/Um32 methyltransferase | Outer membrane lipoprotein carrier |
| DNA-binding response regulator KdpE | |
| Osmosensitive K+ channel histidine kinase | |
| Potassium-transporting ATPase A chain | |
| Potassium-transporting ATPase B chain | |
| Beta-hexosaminidase (EC 3.2.1.52) | |
| Potassium-transporting ATPase C chain | |
| Protein-export membrane protein secD (TC | |
| H+/Cl- exchange transporter ClcA | |
Transposases in G. diazotrophicus PAl 5 genome sequences
| | | |
|---|---|---|
| Total transposase genes | 59 | 110 |
| Transposase | 6 | 19 |
| Transposase (class II) | 1 | 2 |
| Transposase (class III) | 1 | 0 |
| Transposase (class IV) | 1 | 0 |
| Putative transposase | 27 | 64 |
| Transposase IS3 family protein | 2 | 4 |
| Transposase IS3/IS911 family protein | 1 | 0 |
| Transposase IS4 family protein | 6 | 4 |
| Transposase IS5 family protein | 4 | 7 |
| Transposase IS256 | 1 | 0 |
| Transposase IS630 | 0 | 1 |
| Isrso16-transposase OrfA protein | 1 | 0 |
| Transposase and inactivated derivative | 2 | 1 |
| Transposase mutator type | 5 | 6 |
| Probable insertion sequence transposase protein | 1 | 0 |
| TRm2011-2a transposase | 0 | 2 |
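As a consistency check on the table above, the per-category counts can be summed and compared with the "Total transposase genes" row. This is a minimal sketch with hypothetical names; the two values in each tuple follow the table's column order, and no genome labels are assumed.

```python
# (first column, second column) counts transcribed from the table above.
TRANSPOSASE_COUNTS = {
    "Transposase": (6, 19),
    "Transposase (class II)": (1, 2),
    "Transposase (class III)": (1, 0),
    "Transposase (class IV)": (1, 0),
    "Putative transposase": (27, 64),
    "Transposase IS3 family protein": (2, 4),
    "Transposase IS3/IS911 family protein": (1, 0),
    "Transposase IS4 family protein": (6, 4),
    "Transposase IS5 family protein": (4, 7),
    "Transposase IS256": (1, 0),
    "Transposase IS630": (0, 1),
    "Isrso16-transposase OrfA protein": (1, 0),
    "Transposase and inactivated derivative": (2, 1),
    "Transposase mutator type": (5, 6),
    "Probable insertion sequence transposase protein": (1, 0),
    "TRm2011-2a transposase": (0, 2),
}
REPORTED_TOTALS = (59, 110)


def column_totals(counts):
    """Sum each column across all transposase categories."""
    first = sum(pair[0] for pair in counts.values())
    second = sum(pair[1] for pair in counts.values())
    return first, second


if __name__ == "__main__":
    print(column_totals(TRANSPOSASE_COUNTS) == REPORTED_TOTALS)
```

The category counts add up to the reported totals in both columns.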