Pasi K Korhonen, Edoardo Pozio, Giuseppe La Rosa, Bill C H Chang, Anson V Koehler, Eric P Hoberg, Peter R Boag, Patrick Tan, Aaron R Jex, Andreas Hofmann, Paul W Sternberg, Neil D Young, Robin B Gasser.
Abstract
Trichinellosis is a globally important food-borne parasitic disease of humans caused by roundworms of the Trichinella complex. Extensive biological diversity is reflected in substantial ecological and genetic variability within and among Trichinella taxa, and major controversy surrounds the systematics of this complex. Here we report the sequencing and assembly of 16 draft genomes representing all 12 recognized Trichinella species and genotypes, define protein-coding gene sets and assess genetic differences among these taxa. Using thousands of shared single-copy orthologous gene sequences, we fully reconstruct, for the first time, a phylogeny and biogeography for the Trichinella complex, and show that encapsulated and non-encapsulated Trichinella taxa diverged from their most recent common ancestor ∼21 million years ago (mya), with taxon diversifications commencing ∼10–7 mya.
Year: 2016 PMID: 26830005 PMCID: PMC4740406 DOI: 10.1038/ncomms10513
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Assembly and gene prediction statistics for the draft genomes of all recognized encapsulated Trichinella taxa*.
| Country of origin | Poland | Norway | Italy | USA | USA | Tanzania | Namibia | Japan | Argentina |
| Host of origin | Domestic pig | Polar bear | Red fox | Coyote | Grizzly bear | Warthog | Lion | Raccoon dog | Cougar |
| Genome size (bp) | 50,035,721 | 48,088,508 | 51,516,808 | 49,039,267 | 50,855,377 | 47,453,436 | 49,332,017 | 49,096,069 | 49,773,209 |
| Number of scaffolds; contigs | 7,667; 9,209 | 4,635; 5,353 | 8,025; 9,709 | 5,255; 6,430 | 5,957; 7,471 | 2,973; 4,254 | 4,125; 5,876 | 5,824; 7,488 | 6,623; 8,003 |
| N50 (bp) | 212,546 | 141,195 | 147,150 | 106,482 | 158,103 | 293,867 | 239,129 | 212,690 | 154,103 |
| N90 (bp) | 36,812 | 30,368 | 9,232 | 15,500 | 17,717 | 70,070 | 46,127 | 43,251 | 23,618 |
| Genome GC content (%) | 33.62 | 33.56 | 33.72 | 33.58 | 33.44 | 33.49 | 33.57 | 33.48 | 33.62 |
| Coding GC content (%) | 43.08 | 43.14 | 43.31 | 43.14 | 43 | 43.1 | 43.15 | 43.06 | 43.24 |
| Exonic proportion; including introns (%) | 33.08; 71.63 | 33.71; 71.36 | 34.09; 71.23 | 33.62; 69.28 | 33.54; 69.81 | 33.87; 70.85 | 33.31; 69.28 | 35.26; 74.88 | 35.88; 72.27 |
| Number of putative coding genes | 14,745 | 13,662 | 16,067 | 14,863 | 15,242 | 13,232 | 14,920 | 13,127 | 15,319 |
| Mean gene size (bp) | 2,526 | 2,604 | 2,381 | 2,370 | 2,420 | 2,633 | 2,377 | 2,947 | 2,449 |
| Mean CDS length (bp) | 1,045 | 1,093 | 962 | 1,017 | 1,003 | 1,108 | 1,026 | 1,061 | 965 |
| Mean exon count per gene | 6.31 | 6.39 | 5.7 | 6.01 | 5.93 | 6.48 | 6.01 | 6.51 | 5.91 |
| Mean exon length (bp) | 186.2 | 193.9 | 202.09 | 193.32 | 198.14 | 196.76 | 192.52 | 216.2 | 209.06 |
| Mean intron length (bp) | 254.67 | 253.47 | 261.63 | 241.76 | 253.31 | 248.69 | 244.35 | 279.96 | 247.89 |
| Total length of coding sequences (bp) | 27,499,002 | 25,039,143 | 29,213,671 | 25,397,919 | 26,529,424 | 25,001,000 | 25,015,560 | 29,155,076 | 25,830,832 |
| Repetitive sequences (%) | 18.97 | 17.79 | 21.8 | 18.78 | 20.29 | 17.26 | 18.68 | 6.74 | 19.64 |
| CEG completeness: complete; partial (%) | 95.97; 97.18 | 95.97; 97.18 | 96.77; 97.18 | 95.97; 97.18 | 96.37; 97.18 | 96.37; 97.18 | 96.77; 97.18 | 95.56; 97.18 | 96.37; 97.18 |
CDS, coding DNA sequence; CEG, core essential gene; ISS, Istituto Superiore di Sanità.
International Trichinella Reference Center (http://www.iss.it/site/Trichinella/) ISS codes are indicated.
*T1=T. spiralis; T2=T. nativa; T3=T. britovi; T5=T. murrelli; T7=T. nelsoni; T12=T. patagoniensis; and Trichinella genotypes T6, T8 and T9.
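The N50 and N90 rows summarise assembly contiguity. As a minimal sketch, assuming the common convention (the record does not state which definition the authors used): sort sequence lengths from longest to shortest and report the length at which the running total first reaches 50% or 90% of the assembly size. The function name `nx` and the toy lengths are hypothetical and are not taken from the paper.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length of the sequence at which the cumulative sum of
    lengths, taken from longest to shortest, first reaches `percent`
    percent of the total assembly size (assumed convention).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Hypothetical toy scaffold lengths (bp); total = 300 bp.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_scaffolds, 50)
n90 = nx(toy_scaffolds, 90)
print(n50, n90)
```

Under this convention a higher N50 or N90 means that more of the assembly sits in long scaffolds, which is why the two values are reported alongside the scaffold and contig counts.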
Assembly and gene prediction statistics for the draft genomes of all recognized non-encapsulated Trichinella taxa*.
| Country of origin | Russia | Russia | Kazakhstan | USA | Australia | Thailand | Zimbabwe |
| Host of origin | Raccoon | Brown rat | Tawny eagle | Black vulture | Spotted quoll | Human | Nile crocodile |
| Genome size (bp) | 49,202,366 | 48,147,010 | 49,171,591 | 48,479,966 | 46,056,875 | 46,871,975 | 50,937,231 |
| Number of scaffolds; contigs | 7,287; 8,136 | 7,547; 7,647 | 6,600; 7,483 | 6,287; 7,079 | 1,381; 2,571 | 2,552; 3,122 | 11,275; 12,675 |
| N50 (bp) | 235,426 | 112,255 | 287,133 | 234,172 | 167,180 | 222,396 | 205,645 |
| N90 (bp) | 60,266 | 9,250 | 69,797 | 50,288 | 50,779 | 64,723 | 8,776 |
| Genome GC content (%) | 32.61 | 32.58 | 32.57 | 32.69 | 32.46 | 32.7 | 32.87 |
| Coding GC content (%) | 42.64 | 42.39 | 42.44 | 42.62 | 42.24 | 42.25 | 42.41 |
| Exonic proportion; including introns (%) | 33.63; 72.68 | 34.73; 71.61 | 34.24; 73.39 | 29.84; 61.96 | 33.73; 67.98 | 35.96; 76.49 | 33.92; 72.45 |
| Number of putative coding genes | 12,699 | 13,754 | 12,462 | 14,708 | 11,006 | 11,854 | 14,933 |
| Mean gene size (bp) | 2,955 | 2,620 | 3,053 | 2,071 | 2,944 | 3,169 | 2,591 |
| Mean CDS length (bp) | 1,041 | 1,006 | 1,052 | 994 | 1,122 | 1,133 | 933 |
| Mean exon count per gene | 6.58 | 6.22 | 6.66 | 5.91 | 6.64 | 6.92 | 5.87 |
| Mean exon length (bp) | 210.71 | 207.52 | 217.94 | 169.78 | 222.84 | 217.53 | 209.41 |
| Mean intron length (bp) | 281.73 | 255.09 | 283.56 | 217.87 | 260.03 | 281.51 | 280.34 |
| Total length of coding sequences (bp) | 25,932,768 | 21,637,764 | 25,191,644 | 15,904,360 | 22,611,694 | 24,407,202 | 26,665,876 |
| Repetitive sequences (%) | 18.01 | 18.41 | 17.73 | 17.71 | 16.12 | 14.47 | 20.99 |
| CEG completeness: complete; partial (%) | 96.77; 97.58 | 96.37; 97.58 | 96.77; 97.58 | 97.18; 97.58 | 95.97; 97.58 | 96.37; 97.58 | 97.18; 97.58 |
CDS, coding DNA sequence; CEG, core essential gene; ISS, Istituto Superiore di Sanità.
International Trichinella Reference Center (http://www.iss.it/site/Trichinella/) ISS codes are indicated.
*T4=T. pseudospiralis (including five distinct populations: T4.1–T4.5); T10=T. papuae; T11=T. zimbabwensis.
Figure 1: The evolution and biogeography of Trichinella taxa.
(a) The phylogeny of all 12 currently recognized taxa of Trichinella, based on analyses of amino-acid sequence data from shared single-copy orthologues (SCOs; n=1,284) employing Bayesian inference, maximum likelihood (ML) and maximum parsimony (MP) methods, with Trichuris suis and Ascaris suum as outgroups; 1,042 and 747 are the numbers of orthologous gene groups unique to encapsulated (red) and non-encapsulated (blue) Trichinella taxa, respectively. The topology of the trees constructed using each of these methods was the same; all nodes have absolute statistical support (1.00 or 100%), except for one node (*) in the ML analysis, where support was 99%. The grey bars on the nodes represent 95% confidence intervals for the estimate of species branching time. T. spiralis (ISS195; ref. 19) shares the same phylogenetic position (**) as T. spiralis (ISS3). Host animals: suids (Sus scrofa) represent both the sylvatic and domestic porcine hosts (left); the reproductive potential of a particular Trichinella taxon in S. scrofa (ref. 12) is indicated by the colour scale: white, not assessed; light grey, low; dark grey, medium; black, high. Other animals (right) represent examples of carnivorous sylvatic hosts in different geographic regions, including fox, lion, mountain lion, marsupial, crocodile and bird of prey, and the accidental human host.
(b) Representation of genome-wide synteny among Trichinella taxa (same order as listed on the right in a). Genomic scaffolds (black) sharing at least 10 SCOs between Trichinella taxa are displayed. A purple line indicates a single SCO and a red line an inverted SCO. The numbers on the right indicate the genomic length in megabases (Mb).
(c) Biogeography of Trichinella taxa proposed on the basis of known global (climate, extinctions and/or tectonic) events and diversification times (mya) for Trichinella taxa, estimated using a molecular clock approach.
Encapsulated taxa: T. spiralis, T1; T. nativa, T2; T. britovi, T3; T. murrelli, T5; T. nelsoni, T7; T. patagoniensis, T12; and Trichinella genotypes T6, T8 and T9. Non-encapsulated taxa (infecting mammals, reptiles and/or birds): T. pseudospiralis, T4; T. papuae, T10; T. zimbabwensis, T11 (ref. 12). Geographic distributions of Trichinella taxa were reported by Pozio and Zarlenga (ref. 11). The embedded public domain world map image (https://commons.wikimedia.org/wiki/File:BlankMap-World6.svg) has been modified using the programs World map tool v.1.16 (http://law.nagoya-u.ac.jp/en/appendix/software/worldmap) and GIMP v.2.8 (https://www.gimp.org).
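The display rule in panel (b), scaffolds sharing at least 10 SCOs, can be illustrated with a short sketch. It assumes the threshold applies to each pair of scaffolds from two taxa, which is one possible reading of the caption. The function name, the data layout (SCO identifier mapped to scaffold name) and the toy values are all hypothetical and do not come from the paper.

```python
from collections import Counter


def scaffold_pairs_with_min_scos(sco_to_scaffold_a, sco_to_scaffold_b, min_shared=10):
    """Keep scaffold pairs (taxon A, taxon B) linked by >= min_shared SCOs.

    Each argument maps an SCO identifier to the scaffold that carries it
    in one taxon (hypothetical layout for illustration only).
    """
    shared = Counter()
    for sco_id, scaffold_a in sco_to_scaffold_a.items():
        scaffold_b = sco_to_scaffold_b.get(sco_id)
        if scaffold_b is not None:
            # One SCO present in both taxa links this pair of scaffolds.
            shared[(scaffold_a, scaffold_b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Hypothetical toy data: 12 SCOs on A_scaf1, 3 on A_scaf2, all 15 on B_scaf7.
taxon_a = {f"sco{i}": "A_scaf1" for i in range(12)}
taxon_a.update({f"sco{i}": "A_scaf2" for i in range(12, 15)})
taxon_b = {f"sco{i}": "B_scaf7" for i in range(15)}

kept = scaffold_pairs_with_min_scos(taxon_a, taxon_b)
print(kept)
```

In this toy case only the pair linked by 12 SCOs passes the filter; the pair linked by 3 SCOs is dropped, which is the kind of thinning that keeps a synteny plot readable.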