Motoshige Yasuike, Jong Leong, Stuart G Jantzen, Kristian R von Schalburg, Frank Nilsen, Simon R M Jones, Ben F Koop.
Abstract
Sea lice are common parasites of both farmed and wild salmon. Salmon farming constitutes an important economic market in North America, South America, and Northern Europe. Infections with sea lice can result in significant production losses. A compilation of genomic information on different genera of sea lice is an important resource for understanding their biology as well as for the study of population genetics and control strategies. We report on over 150,000 expressed sequence tags (ESTs) from five different species (Pacific Lepeophtheirus salmonis (49,672 new ESTs in addition to 14,994 previously reported ESTs), Atlantic L. salmonis (57,349 ESTs), Caligus clemensi (14,821 ESTs), Caligus rogercresseyi (32,135 ESTs), and Lernaeocera branchialis (16,441 ESTs)). For each species, ESTs were assembled into complete or partial genes and annotated by comparisons to known proteins in public databases. In addition, whole mitochondrial (mt) genome sequences of C. clemensi (13,440 bp) and C. rogercresseyi (13,468 bp) were determined and compared to L. salmonis. Both nuclear and mtDNA genes show very high levels of sequence divergence between these ectoparasitic copepods, suggesting that the different species of sea lice have been in existence for 37-113 million years and that parasitic association with salmonids is also quite ancient. Our ESTs and mtDNA data provide a novel resource for the study of sea louse biology, population genetics, and control strategies. This genomic information provides the material basis for the development of a 38K sea louse microarray that can be used in conjunction with our existing 44K salmon microarray to study host-parasite interactions at the molecular level. This report represents the largest genomic resource for any copepod species to date.
Year: 2011 PMID: 21748342 PMCID: PMC3280385 DOI: 10.1007/s10126-011-9398-z
Source DB: PubMed Journal: Mar Biotechnol (NY) ISSN: 1436-2228 Impact factor: 3.619
Sea lice EST project summary
| | L. salmonis (Pacific)<sup>a</sup> | L. salmonis (Atlantic)<sup>b</sup> | C. clemensi | C. rogercresseyi | L. branchialis |
|---|---|---|---|---|---|
| Number of clones<sup>c</sup> | 38,880<sup>e</sup> | 51,607 | 7,680 | 19,200 | 8,448 |
| Number of sequences<sup>d</sup> | 64,666<sup>e</sup> | 57,349<sup>i</sup> | 14,821 | 32,135 | 16,441 |
| Average trimmed EST length (bp)<sup>f</sup> | 756 | 644 | 790 | 730 | 749 |
| Number of contigs<sup>g</sup> | 11,922 | 9,321 | 4,392 | 8,251 | 4,239 |
| Number of singletons | 4,186 | 5,145 | 1,662 | 3,106 | 2,199 |
| Number of putative transcripts | 16,108 | 14,466 | 6,054 | 11,357 | 6,438 |
| Maximum contig size (no. of ESTs) | 554 | 1482 | 15 | 34 | 21 |
| Average contig size (no. of ESTs) | 4.0 | 4.0 | 2.5 | 2.8 | 2.6 |
| Number of transcripts with BLAST hits<sup>h</sup> | 7,157 | 6,726 | 3,775 | 5,830 | 3,951 |
| Percent with significant BLAST hits | 44.4% | 46.5% | 62.4% | 51.3% | 61.4% |
<sup>a</sup> L. salmonis Pacific form
<sup>b</sup> L. salmonis Atlantic (Canada, Norway) form
<sup>c</sup> Number of clones from which at least one sequence (5′ or 3′) was obtained
<sup>d</sup> Number of 5′ and 3′ EST sequences obtained
<sup>e</sup> Twenty-eight thousand thirty-two clones and 49,672 sequences were obtained from this study, while 5,760 clones and 14,994 sequences were previously reported (Yazawa et al. 2008)
<sup>f</sup> Vector, low-quality, and contaminating bacterial sequences were trimmed
<sup>g</sup> A contig (contiguous sequence) contains two or more ESTs
<sup>h</sup> Number of transcripts that have an RPS-BLAST or BLASTX hit with an E value below 1E−10 to the Conserved Domain Database (CDD) or SwissProt databases
<sup>i</sup> 28K sequences were obtained from F. Nilsen (University of Bergen, Norway)
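The counts in the summary table are linked by simple arithmetic, which the following minimal Python sketch illustrates. It assumes that every putative transcript is either a contig or a singleton (so contigs = putative transcripts minus singletons) and that the percentage row is BLAST hits divided by putative transcripts. The dictionary layout and the function name `derived_stats` are illustrative choices, not part of the original work.

```python
# Counts copied from the EST project summary table, one entry per species/form.
EST_SUMMARY = {
    "L. salmonis (Pacific)": {"singletons": 4186, "transcripts": 16108, "blast_hits": 7157},
    "L. salmonis (Atlantic)": {"singletons": 5145, "transcripts": 14466, "blast_hits": 6726},
    "C. clemensi": {"singletons": 1662, "transcripts": 6054, "blast_hits": 3775},
    "C. rogercresseyi": {"singletons": 3106, "transcripts": 11357, "blast_hits": 5830},
    "L. branchialis": {"singletons": 2199, "transcripts": 6438, "blast_hits": 3951},
}


def derived_stats(row):
    """Return (contigs, percent_with_hits) derived from one column of counts."""
    # Assumption: each putative transcript is either a contig or a singleton.
    contigs = row["transcripts"] - row["singletons"]
    # Share of putative transcripts with a significant BLAST hit, to one decimal.
    percent = round(100.0 * row["blast_hits"] / row["transcripts"], 1)
    return contigs, percent


for name, row in EST_SUMMARY.items():
    contigs, percent = derived_stats(row)
    print(f"{name}: {contigs} contigs, {percent}% with BLAST hits")
```

Under these assumptions the derived percentages agree with the "Percent with significant BLAST hits" row of the table.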
Fig. 1 Screenshot of sea lice EST contig summary and search tools. The top panel allows users to perform homology searches for sequences of interest. The second provides the ability to search using sequence data, identifiers, accession numbers, and descriptive keywords. The third to seventh panels show a summary of the EST clustering results of C. clemensi, C. rogercresseyi, Pacific L. salmonis, Atlantic L. salmonis, and L. branchialis, respectively
Comparison of the Pacific and Atlantic L. salmonis, C. clemensi, C. rogercresseyi, and L. branchialis nuclear genes
| | Type | No. of queries | No. of matches<sup>a</sup> | Percentage with match<sup>b</sup> | Average length | Maximum length | Standard deviation for length | Average identities | Maximum identities | Minimum identities | Standard deviation for % identities | Average positive AAs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Atlantic form | blastn | 14,466 | 8,121 | 56% | 626 bp | 2,891 bp | 365.25 | 96% | 100% | 78% | 4.41 | – |
| | tblastx | 14,466 | 8,827 | 61% | 187 aa | 820 aa | 110.46 | 88% | 100% | 22% | 18.15 | 91% |
| | blastn | 6,054 | 1,598 | 26% | 327 bp | 1,316 bp | 208.89 | 81% | 98% | 76% | 2.83 | – |
| | tblastx | 6,054 | 3,852 | 64% | 151 aa | 569 aa | 79.38 | 72% | 100% | 12% | 17.09 | 83% |
| | blastn | 6,054 | 1,595 | 26% | 318 bp | 1,316 bp | 197.67 | 81% | 98% | 76% | 2.98 | – |
| | tblastx | 6,054 | 4,096 | 68% | 145 aa | 539 aa | 75.68 | 72% | 100% | 16% | 16.65 | 83% |
| | blastn | 6,054 | 1,893 | 31% | 338 bp | 1,305 bp | 202.55 | 83% | 100% | 77% | 3.11 | – |
| | tblastx | 6,054 | 3,715 | 61% | 142 aa | 456 aa | 71.72 | 71% | 100% | 21% | 17.09 | 82% |
| | blastn | 6,054 | 257 | 4% | 278 bp | 1,226 bp | 186.83 | 81% | 97% | 77% | 3.01 | – |
| | tblastx | 6,054 | 2,634 | 44% | 125 aa | 436 aa | 63.27 | 59% | 100% | 21% | 16.47 | 74% |
| | blastn | 11,357 | 1,931 | 17% | 301 bp | 1,309 bp | 178.46 | 82% | 99% | 76% | 2.86 | – |
| | tblastx | 11,357 | 5,937 | 52% | 139 aa | 557 aa | 72.40 | 70% | 100% | 18% | 17.45 | 81% |
| | blastn | 11,357 | 1,973 | 17% | 292 bp | 1,498 bp | 175.48 | 82% | 100% | 76% | 3.40 | – |
| | tblastx | 11,357 | 6,383 | 56% | 133 aa | 521 aa | 68.41 | 70% | 100% | 21% | 16.81 | 82% |
| | blastn | 6,438 | 417 | 6% | 264 bp | 1,079 bp | 168.51 | 81% | 99% | 76% | 2.99 | – |
| | tblastx | 6,438 | 3,284 | 51% | 132 aa | 466 aa | 69.24 | 62% | 100% | 22% | 16.22 | 77% |
| | blastn | 6,438 | 405 | 6% | 260 bp | 1,079 bp | 163.04 | 81% | 98% | 77% | 2.75 | – |
| | tblastx | 6,438 | 3,375 | 52% | 126 aa | 466 aa | 64.90 | 62% | 100% | 22% | 16.31 | 76% |
| | blastn | 6,438 | 254 | 4% | 250 bp | 1,211 bp | 156.59 | 81% | 99% | 76% | 2.73 | – |
| | tblastx | 6,438 | 3,021 | 47% | 121 aa | 465 aa | 59.81 | 59% | 100% | 17% | 16.38 | 74% |
<sup>a</sup> The number of queries that had a BLASTN hit with an E value <1E−10 and a minimum alignment length of 100 bp, or a tBLASTX hit with an E value <1E−10 and a minimum alignment length of 50 aa
<sup>b</sup> The first match that conformed to these parameters was taken from the top five hits of the BLAST output. If no suitable match was found in the top five hits, the query was not included in the results
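The selection rule described in these two footnotes can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the authors' pipeline: the `Hit` record, its field names, and the function `first_conforming_hit` are hypothetical, and the hits are assumed to be already parsed and ordered as in the BLAST output.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Hit:
    """One parsed BLAST hit (hypothetical record, not a real library type)."""
    subject_id: str
    evalue: float
    align_len: int  # bp for blastn, aa for tblastx


MAX_EVALUE = 1e-10                              # E value must be below this
MIN_ALIGN_LEN = {"blastn": 100, "tblastx": 50}  # minimum alignment length
TOP_N = 5                                       # only the top five hits are inspected


def first_conforming_hit(hits: Sequence[Hit], program: str) -> Optional[Hit]:
    """Return the first of the top five hits that meets both thresholds, else None."""
    min_len = MIN_ALIGN_LEN[program]
    for hit in hits[:TOP_N]:
        if hit.evalue < MAX_EVALUE and hit.align_len >= min_len:
            return hit
    return None  # the query is left out of the results


# Example: the best hit is too short, so the second hit is taken.
example_hits = [
    Hit("contig_A", 1e-30, 80),
    Hit("contig_B", 1e-15, 150),
]
match = first_conforming_hit(example_hits, "blastn")
print(match.subject_id if match else "no match")
```

A query whose first conforming hit sits at position six or later returns `None` here, mirroring the rule that such queries are not counted as matches.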
Comparison of the L. salmonis, C. clemensi, and C. rogercresseyi mtDNA genes
| Genes | Nucleic sequence (%), Pacific form | Nucleic sequence (%), Pacific form | Nucleic sequence (%) | Nucleic sequence (%), Atlantic form | Deduced amino acid sequence (%), Pacific form | Deduced amino acid sequence (%), Pacific form | Deduced amino acid sequence (%) | Deduced amino acid sequence (%), Atlantic form |
|---|---|---|---|---|---|---|---|---|
| rrnS | 77.2 | 76.4 | 74.9 | 98.8 | – | – | – | – |
| rrnL | 68.3 | 67.2 | 71.2 | 96.9 | – | – | – | – |
| atp8<sup>a</sup> | 72.0 | 67.7 | 72.0 | 96.8 | – | – | – | – |
| atp6 | 63.9 | 65.4 | 65.3 | 91.9 | 61.5 | 66.7 | 60.3 | 95.9 |
| cob | 71.0 | 70.8 | 71.3 | 93.7 | 79.7 | 80.3 | 77.6 | 98.5 |
| cox1 | 79.1 | 77.9 | 82.6 | 92.9 | 90.8 | 91.2 | 94.1 | 99.2 |
| cox2 | 76.6 | 75.7 | 78.5 | 93.5 | 75.4 | 81.2 | 85.5 | 100.0 |
| cox3 | 73.2 | 71.7 | 72.4 | 92.0 | 75.6 | 82.5 | 79.2 | 98.2 |
| nad1 | 72.2 | 71.6 | 70.5 | 92.8 | 69.2 | 75.3 | 66.8 | 96.9 |
| nad2 | 57.9 | 57.8 | 59.3 | 90.9 | 40.5 | 45.8 | 43.2 | 94.8 |
| nad3 | 68.5 | 57.8 | 65.0 | 91.6 | 64.1 | 54.7 | 59.3 | 97.5 |
| nad4 | 61.9 | 58.8 | 58.7 | 91.2 | 49.1 | 49.3 | 44.0 | 92.6 |
| nad4L<sup>b</sup> | N.A. | N.A. | N.A. | 94.3 | – | – | – | 97.3 |
| nad5 | 62.2 | 58.8 | 61.8 | 90.7 | 51.9 | 50.4 | 49.4 | 95.5 |
| nad6 | 59.6 | 56.1 | 59.1 | 93.8 | 48.3 | 42.0 | 40.0 | 97.3 |
| Average | 68.8 | 66.7 | 68.8 | 93.5 | 64.2 | 65.4 | 63.6 | 97.0 |
<sup>a</sup> Amino acid sequences of the atp8 genes were not compared because these sequences are very short (31 aa)
<sup>b</sup> nad4L genes are absent in the two Caligus species
Ranges of 16S rRNA gene divergence based on Kimura two-parameter distance and crustacean molecular clock calibrations
| | Distance (K2P) | Divergence (Myr), Ano | Divergence (Myr), Fid | Divergence (Myr), Gra (low) | Divergence (Myr), Gra (high) |
|---|---|---|---|---|---|
| Pacific form | 0.405 | 106.2 | 45.0 | 62.3 | 46.0 |
| Pacific form | 0.431 | 113.0 | 47.8 | 66.2 | 48.9 |
| | 0.333 | 87.4 | 37.0 | 51.2 | 37.8 |
The values for “Distance” are the Kimura two-parameter (K2P) distance between the species. Rates of molecular evolution used for the 16S rRNA gene include 0.38% K2P/million year (Myr) for anomurans (Ano; Cunningham et al. 1992), 0.90% K2P/Myr for fiddler crabs (Fid; Sturmbauer et al. 1996), and 0.65 (low)–0.88% (high) K2P/Myr obtained from grapsid crabs (Gra; Schubart et al. 1998)
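The calculation behind this table can be sketched as follows. The Kimura two-parameter distance is computed with the standard formula d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences, and each divergence time is the distance divided by a calibration rate. The function names and the toy sequences are illustrative assumptions; how the original alignment was prepared is not described here.

```python
import math

# Calibration rates from the table note, in % K2P distance per million years.
RATES_PERCENT_PER_MYR = {
    "Ano": 0.38,
    "Fid": 0.90,
    "Gra (low)": 0.65,
    "Gra (high)": 0.88,
}

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n                # transitions
    q = sum(1 for a, b in pairs if a != b and (a, b) not in TRANSITIONS) / n  # transversions
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_times(distance: float) -> dict:
    """Divergence time in Myr for each calibration: distance / (rate per Myr)."""
    return {name: distance / (rate / 100.0)
            for name, rate in RATES_PERCENT_PER_MYR.items()}


# Toy alignment with a single transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
# Divergence estimates for one of the tabulated distances.
for name, myr in divergence_times(0.405).items():
    print(f"{name}: {myr:.1f} Myr")
```

Values computed this way from the tabulated distances can differ from the table in the last digit (for example 0.405 / 0.0038 is about 106.6 rather than 106.2), which may reflect rounding of the published distances.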
Fig. 2 Genomic organization of the C. clemensi (13,440 bp) and the C. rogercresseyi (13,468 bp) mt genomes. The complete mt genomes of the Atlantic (15,445 bp) and Pacific (16,148 bp) L. salmonis were previously reported, and these mt genomes are identical in gene organization (Tjensvoll et al. 2005; Yazawa et al. 2008). Boxes represent mtDNA genes. tRNA genes are denoted by the single-letter amino acid code, and an underline indicates tRNA genes located on the negative strand. rrnL and rrnS refer to 16S and 12S rRNA; cox1, cox2, and cox3 refer to cytochrome oxidase subunits I, II, and III; cob refers to cytochrome b; nad1–6 and nad4L refer to NADH dehydrogenase subunits 1–6 and 4L; atp6 and atp8 refer to ATP synthase subunits 6 and 8, respectively; and CR refers to the control region. Transcription directions for the protein-coding and rRNA genes are shown by arrowheads