Christophe Guyeux, Bashar Al-Nuaimi, Bassam AlKindy, Jean-François Couchot, Michel Salomon.
Abstract
BACKGROUND: Reconstructing the evolutionary history of DNA sequences requires novel models of increasing complexity in the number of free parameters that describe sequence evolution, as well as faster and more accurate algorithms and statistical and computational methods. In particular, because genome rearrangements (such as translocations and fusions) are the principal forces behind major structural changes, understanding their underlying mechanisms, among other things via ancestral genome reconstruction, is essential. Since finding the ancestral genomes that minimize the number of rearrangements in a phylogenetic tree is known to be NP-hard for three or more genomes, heuristics are commonly used to approximate the exact solution. The aim of this work is to show that another path is possible.
Entities:
Keywords: Ancestral reconstruction; Bacterial lineages; Evolution; Genome rearrangements; Mycobacterium tuberculosis; Pathogens
Year: 2018 PMID: 30458842 PMCID: PMC6245693 DOI: 10.1186/s12918-018-0618-2
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
The considered Mycobacterium strains
| Accession (GenBank) | Organism name | Sequence length (bp) | Nickname |
|---|---|---|---|
| CP010335.1 | | 4,419,839 | |
| CP010336.1 | | 4,405,033 | |
| NC_000962.3 | | 4,411,532 | |
| NC_002755.2 | | 4,403,837 | |
| NC_009525.1 | | 4,419,977 | |
| NC_009565.1 | | 4,424,435 | |
| NC_012943.1 | | 4,398,250 | |
| NC_016768.1 | | 4,394,985 | |
| NC_016934.1 | | 4,418,088 | |
| NC_017522.1 | | 4,405,981 | |
| NC_017524.1 | | 4,398,525 | |
| NC_018078.1 | | 4,399,120 | |
| NC_018143.2 | | 4,411,709 | |
| NC_020089.1 | | 4,421,197 | |
| NC_020559.1 | | 4,392,353 | |
| NC_021054.1 | | 4,411,128 | |
| NC_021194.1 | | 4,390,306 | |
| NC_021251.1 | | 4,414,325 | |
| NC_021740.1 | | 4,391,174 | |
| NC_022350.1 | | 4,408,224 | |
| NZ_AP014573.1 | | 4,415,078 | |
| NZ_CP002871.1 | | 4,407,929 | |
| NZ_CP002882.1 | | 4,401,899 | |
| NZ_CP002883.1 | | 4,399,405 | |
| NZ_CP002885.1 | | 4,414,346 | |
| NZ_CP007027.1 | | 4,410,911 | |
| NZ_CP007803.1 | | 4,385,518 | |
| NZ_CP007809.1 | | 4,410,788 | |
| NZ_CP009100.1 | | 4,411,507 | |
| NZ_CP009101.1 | | 4,411,515 | |
| NZ_CP009426.1 | | 4,379,376 | |
| NZ_CP009427.1 | | 4,410,945 | |
| NZ_CP009480.1 | | 4,396,119 | |
| NZ_CP010330.1 | | 4,421,903 | |
| NZ_CP010337.1 | | 4,401,829 | |
| NZ_CP010338.1 | | 4,417,090 | |
| NZ_CP010339.1 | | 4,399,422 | |
| CP010340.1 | | 4,426,489 | |
| NZ_CP012090.1 | | 4,418,548 | |
| NZ_CP012506.1 | | 4,379,515 | |
| NZ_HG813240.1 | | 4,412,379 | |
| CP010329.1 | | 4,428,621 | |
| NC_015758.1 | | 4,389,314 | |
| CP010334.1 | | 4,386,422 | |
| CP010333.1 | | 4,370,115 | |
| NC_015848.1 | | 4,482,059 | |
| NC_019951.1 | | 4,525,948 | |
| NC_019950.1 | | 4,432,426 | |
| NC_019952.1 | | 4,524,466 | |
| NC_019965.1 | | 4,420,197 | |
| NC_002945.3 | | 4,345,492 | |
| NC_008769.1 | | 4,374,522 | |
| NC_012207.1 | | 4,371,711 | |
| NZ_CP003494.1 | | 4,334,064 | |
| NC_016804.1 | | 4,350,386 | |
| NC_020245.2 | | 4,376,711 | |
| NZ_CP009449.1 | | 4,358,088 | |
| NZ_AM412059.1 | | 4,340,116 | |
| NZ_CP008744.1 | | 4,410,431 | |
| NZ_CP012095.1 | | 4,351,712 | |
| NZ_CP009243.1 | | 4,370,138 | |
| NZ_CP013741.1 | | 4,370,705 | |
| CP010331.1 | | 4,351,313 | |
| CP010332.1 | | 4,336,227 | |
| NZ_CP014566.1 | | 4,371,707 | |
Fig. 1 Indels on internal nodes of the tree of some M. canettii species
Fig. 2 Ancestral reconstruction of one problematic indel in the alignment
Single nucleotide polymorphism between species (100.X is the name of an ancestral species, cf. the phylogeny)
| Father | Children | No. of SNPs | Children | No. of SNPs |
|---|---|---|---|---|
|
|
| 1 |
| 5 |
|
| 9 |
| 14 | |
|
|
| 1041 |
| 1 |
|
| 12398 |
| 0 | |
|
|
| 28 |
| 0 |
|
| 735 |
| 0 | |
|
| - | - |
| 1 |
| - | - |
| 0 | |
| 100.4 |
| - |
| 0 |
|
| - |
| 1 | |
|
|
| 111 |
| 5 |
|
| 438 |
| 1 | |
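The per-edge SNP counts reported in the table above can be illustrated with a small sketch. It assumes each tree node (ancestral or leaf) carries one row of the same multiple sequence alignment, and that a SNP is a column where father and child hold different nucleotides and neither holds a gap. The function names, the `anc`/`leafA`/`leafB` labels and the toy sequences are hypothetical and are not taken from the paper.

```python
def count_snps(father: str, child: str, gap: str = "-") -> int:
    """Count aligned columns where father and child differ and neither is a gap."""
    if len(father) != len(child):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for f, c in zip(father.upper(), child.upper())
        if f != c and f != gap and c != gap
    )


def snps_per_edge(sequences: dict, children: dict) -> dict:
    """Map each (father, child) edge of the tree to its SNP count."""
    return {
        (father, kid): count_snps(sequences[father], sequences[kid])
        for father, kids in children.items()
        for kid in kids
    }


# Toy data (hypothetical): one ancestral node with two children.
sequences = {"anc": "ACGTAC", "leafA": "ACGTAC", "leafB": "ACCT-C"}
children = {"anc": ["leafA", "leafB"]}
edge_snps = snps_per_edge(sequences, children)
```

Gap columns are skipped here so that indels are not counted as SNPs; whether the paper treats them the same way is an assumption of this sketch.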
Fig. 3 Flowchart of the proposed approach
Fig. 4 Ancestral reconstruction of a M. canettii SNP
Fig. 5 Synteny blocks of Mycobacterium strains available online
Fig. 6 M. canettii phylogeny (outgroup: M. tuberculosis)
Fig. 7 M. tuberculosis phylogeny (GTR Gamma model and outgroup: M. africanum)
Number of columns of the MSA with SNPs or indels for M. canettii (large deletions are counted character by character)
| | | | | | | |
|---|---|---|---|---|---|---|
| | 3354 | 1150 | 27437 | 61346 | 7510 | 0 |
| | 4833 | 7971 | 27468 | 60987 | 0 | 7510 |
| | 60957 | 61233 | 62717 | 0 | 60987 | 61346 |
| | 27256 | 27260 | 0 | 62717 | 27468 | 27437 |
| | 3524 | 0 | 27260 | 61233 | 7971 | 1150 |
| | 0 | 3524 | 27256 | 60957 | 4833 | 3354 |
Species entries are in boldface
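The pairwise counts in the table above (number of alignment columns that differ between two genomes, with deletions counted character by character) can be sketched as follows. This is a minimal illustration that assumes the alignment is available as equal-length strings using `-` for gaps; the function names and the toy sequences are hypothetical and do not come from the paper.

```python
from itertools import combinations


def differing_columns(seq_a: str, seq_b: str) -> int:
    """Count columns where two aligned rows differ (substitution or indel).

    A gap facing a nucleotide counts as one difference per column, so a
    long deletion contributes one unit per deleted character.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def difference_matrix(alignment: dict) -> dict:
    """Build a symmetric matrix of pairwise differing-column counts."""
    names = list(alignment)
    matrix = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = differing_columns(alignment[a], alignment[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Toy alignment (hypothetical sequences, not real genomes).
toy = {"s1": "ACGT-ACGT", "s2": "ACGTTACGA", "s3": "AC---ACGT"}
m = difference_matrix(toy)
```

The result is symmetric with a zero diagonal, matching the shape of the table; columns where both rows hold a gap are treated as identical.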
Variations in the alignment of the M. tuberculosis clade under consideration
| | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 0 | 199770 | 214401 | 219205 | 216387 | 217235 | 216919 | 217186 |
| | 199770 | 0 | 212403 | 219039 | 216908 | 216672 | 216726 | 216953 |
| | 214401 | 212403 | 0 | 216808 | 216534 | 217011 | 216786 | 216882 |
| | 219205 | 219039 | 216808 | 0 | 216669 | 216916 | 216251 | 216678 |
| | 216387 | 216908 | 216534 | 216669 | 0 | 142974 | 189148 | 199505 |
| | 217235 | 216672 | 217011 | 216916 | 142974 | 0 | 189460 | 199412 |
| | 216919 | 216726 | 216786 | 216251 | 189148 | 189460 | 0 | 194315 |
| | 217186 | 216953 | 216882 | 216678 | 199505 | 199412 | 194315 | 0 |
Species entries are in boldface
Fig. 8 SNP locations of mononucleotidic variants of M. canettii
Fig. 9 SNP locations of mononucleotidic variants of M. tuberculosis
Fig. 10 Synteny blocks in M. canettii. Each genome is colored according to the position of the corresponding region in the first genome (gray if a region is unshared)
Differences in the alignment on chromosome 1 of Brucella abortus
| | | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0 | 2320 | 1030 | 4304 | 7194 | 7481 | 5308 | 4891 | 4850 | 7837 | 839 | 12693 | 4695 | 18486 |
| | 2320 | 0 | 1772 | 5150 | 6658 | 8371 | 4911 | 5071 | 5030 | 8022 | 1762 | 12841 | 5621 | 16724 |
| | 1030 | 1772 | 0 | 3996 | 6866 | 7116 | 5033 | 4603 | 4576 | 7470 | 537 | 12958 | 4385 | 18049 |
| | 4304 | 5150 | 3996 | 0 | 10010 | 5955 | 2649 | 853 | 2462 | 6271 | 3800 | 11488 | 4738 | 16568 |
| | 7194 | 6658 | 6866 | 10010 | 0 | 13161 | 9784 | 9884 | 9892 | 12820 | 6601 | 17617 | 10413 | 22727 |
| | 7481 | 8371 | 7116 | 5955 | 13161 | 0 | 6834 | 6408 | 6441 | 425 | 6911 | 15180 | 7869 | 16608 |
| | 5308 | 4911 | 5033 | 2649 | 9784 | 6834 | 0 | 2103 | 505 | 6494 | 4807 | 11411 | 5745 | 16113 |
| | 4891 | 5071 | 4603 | 853 | 9884 | 6408 | 2103 | 0 | 1907 | 6055 | 4393 | 11534 | 5321 | 16337 |
| | 4850 | 5030 | 4576 | 2462 | 9892 | 6441 | 505 | 1907 | 0 | 6102 | 4350 | 11524 | 5342 | 16581 |
| | 7837 | 8022 | 7470 | 6271 | 12820 | 425 | 6494 | 6055 | 6102 | 0 | 7253 | 14833 | 8210 | 16283 |
| | 839 | 1762 | 537 | 3800 | 6601 | 6911 | 4807 | 4393 | 4350 | 7253 | 0 | 12818 | 4157 | 17940 |
| | 12693 | 12841 | 12958 | 11488 | 17617 | 15180 | 11411 | 11534 | 11524 | 14833 | 12818 | 0 | 14057 | 24464 |
| | 4695 | 5621 | 4385 | 4738 | 10413 | 7869 | 5745 | 5321 | 5342 | 8210 | 4157 | 14057 | 0 | 18905 |
| | 18486 | 16724 | 18049 | 16568 | 22727 | 16608 | 16113 | 16337 | 16581 | 16283 | 17940 | 24464 | 18905 | 0 |
Species entries are in boldface
Single nucleotide polymorphism in Brucella melitensis
| Fathers | Children | No. of SNPs |
|---|---|---|
| | | 64 |
| | | 74 |
| | | 106 |
| | | 8 |
| | | 4458 |
| | | 104 |
| | | 840 |
| | melitensis5 | 997 |
| | | 372 |
| | 100.4 | 689 |
| | | 23 |
| | melitensis7 | 26 |
Single nucleotide polymorphism in Brucella abortus
| Fathers | Children | No. of SNPs | Children | No. of SNPs |
|---|---|---|---|---|
|
|
| 55 |
| 41 |
|
| 72 |
| 38 | |
|
|
| 37 |
| 25 |
|
| 55 |
| 15 | |
|
|
| 37 |
| 17 |
|
| 5 |
| 0 | |
|
| 100.3 | 24 | 100.3 | 15 |
| abortus4 | 84 | abortus4 | 51 | |
Brucella genus: genome information
| Accession (GenBank) | Organism name | Sequence length (bp) | Nickname |
|---|---|---|---|
| NC_006932.1 | | 2,124,241 | |
| NC_006933.1 | | 1,162,04 | |
| NC_010742.1 | | 2,122,487 | |
| NC_010740.1 | | 1,161,449 | |
| NC_016795.1 | | 2,123,773 | |
| NC_016777.1 | | 1,162,259 | |
| NZ_CP007663.1 | | 2,124,677 | |
| NZ_CP007662.1 | | 1,155,633 | |
| NZ_CP007681.1 | | 2,128,683 | |
| NZ_CP007680.1 | | 1,160,817 | |
| NZ_CP007682.1 | | 2,125,180 | |
| NZ_CP007683.1 | | 1,163,338 | |
| NZ_CP007700.1 | | 2,123,620 | |
| NZ_CP007701.1 | | 1,161,669 | |
| NZ_CP007705.1 | | 2,124,100 | |
| NZ_CP007706.1 | | 1,155,846 | |
| NZ_CP007709.1 | | 2,124,096 | |
| NZ_CP007710.1 | | 1,157,058 | |
| NZ_CP007738.1 | | 2,124,832 | |
| NZ_CP007737.1 | | 1,163,326 | |
| NZ_CP007765.1 | | 2,123,991 | |
| NZ_CP007764.1 | | 1,162,137 | |
| NZ_CP008774.1 | | 2,116,990 | |
| NZ_CP008775.1 | | 1,156,120 | |
| NZ_CP009626.1 | | 1,162,580 | |
| NZ_CP009625.1 | | 2,122,847 | |
| NZ_LN997863.1 | | 2,177,010 | |
| NZ_LN997864.1 | | 1,061,127 | |
| NZ_CP007759.1 | | 1,206,801 | |
| NZ_CP007758.1 | | 2,105,950 | |
| NC_010103.1 | | 2,105,69 | |
| NC_010104.1 | | 1,206,800 | |
| NC_016778.1 | | 2,107,023 | |
| NC_016796.1 | | 1,170,489 | |
| NZ_CP007629.1 | | 2,106,955 | |
| NZ_CP007630.1 | | 1,203,360 | |
| NC_022905.1 | | 2,117,718 | |
| NC_022906.1 | | 1,160,316 | |
| NC_007618.1 | | 2,121,359 | |
| NC_007624.1 | | 1,156,948 | |
| NZ_CP008751.1 | | 1,185,741 | |
| NZ_CP008750.1 | | 2,126,134 | |
| NZ_CP007762.1 | | 1,177,791 | |
| NZ_CP007763.1 | | 2,116,984 | |
| NZ_CP007761.1 | | 1,187,961 | |
| NZ_CP007760.1 | | 2,122,766 | |
| NC_017283.1 | | 1,176,758 | |
| NC_017248.1 | | 2,117,717 | |
| NC_017247.1 | | 1,185,778 | |
| NC_017246.1 | | 2,126,451 | |
| NC_017245.1 | | 1,185,615 | |
| NC_017244.1 | | 2,126,133 | |
| NC_012442.1 | | 1,185,518 | |
| NC_012441.1 | | 2,125,701 | |
| NC_013119.1 | | 2,117,050 | |
| NC_013118.1 | | 1,220,319 | |
| NC_009505.1 | | 2,111,370 | |
| NC_009504.1 | | 1,164,220 | |
| NC_015857.1 | | 2,138,342 | |
| NC_015858.1 | | 1,260,926 | |
| NZ_CP007743.1 | | 2,139,033 | |
| NZ_CP007742.1 | | 1,191,996 | |
| NZ_CP010851.1 | | 1,207,241 | |
| NZ_CP010850.1 | | 2,107,845 | |
| CP009095.1 | | 1,215,956 | |
| CP009094.1 | | 2,224,908 | |
| CP009097.1 | | 1,311,857 | |
| CP009096.1 | | 2,181,422 | |
| NZ_CP008756.1 | | 1,410,995 | |
| NZ_CP008757.1 | | 1,902,870 | |
| NZ_CP007718.1 | | 1,190,208 | |
| NZ_CP007719.1 | | 2,107,052 | |
| NZ_CP007716.1 | | 1,187,980 | |
| NZ_CP007717.1 | | 2,131,717 | |
| NZ_CP007696.1 | | 1,398,244 | |
| NZ_CP007695.1 | | 1,926,295 | |
| NZ_CP007721.1 | | 1,401,375 | |
| NZ_CP007720.1 | | 1,927,083 | |
| NZ_CP007698.1 | | 1,401,378 | |
| NZ_CP007697.1 | | 1,927,594 | |
| NC_004310.3 | | 2,107,794 | |
| NC_004311.2 | | 1,207,381 | |
| NC_010169.1 | | 1,923,763 | |
| NC_010167.1 | | 1,400,844 | |
| NC_017251.1 | | 2,107,783 | |
| NC_017250.1 | | 1,207,380 | |
| NC_016797.1 | | 2,108,637 | |
| NC_016775.1 | | 1,207,451 | |
| NZ_CP006961.1 | | 2,107,842 | |
| NZ_CP006962.1 | | 1,207,433 | |
| NZ_CP007691.1 | | 1,926,480 | |
| NZ_CP007692.1 | | 1,398,285 | |
| NZ_CP007693.1 | | 1,926,716 | |
| NZ_CP007694.1 | | 1,398,326 | |
Fig. 11 Brucella, chromosome 1: high sequence similarity with few recombination events
Fig. 12 Synteny map of Brucella abortus (a) chromosome 1 and (b) chromosome 2. Genome investigation tends to show high sequence similarity with few recombination events. Each genome is colored according to the position of the corresponding region in the first genome, or gray if a region is unshared
Fig. 13 Well-supported phylogeny of Brucella abortus species calculated on the entire chromosome 1. The outgroup is Brucella melitensis, and RAxML was run with the GTR Gamma model
Fig. 14 Well-supported phylogeny of Brucella melitensis species
Fig. 15 SNP locations in Brucella abortus species. (a) Chromosome 1, (b) chromosome 2
Fig. 16 Single nucleotide polymorphism in Brucella melitensis species
Fig. 17 Nucleotides in the ancestral nodes and their children in Brucella abortus species. (a) Chromosome 1, (b) chromosome 2
Fig. 18 Dotplot of Brucella melitensis species, chromosome 1
Fig. 19 Brucella abortus phylogenetic tree: estimation of the CRISPR lengths and locations using the CRISPRFinder web server [36]
Fig. 20 CRISPR investigation in B. melitensis