Sharen Bowman, Sophie Hubert, Brent Higgins, Cynthia Stone, Jennifer Kimball, Tudor Borza, Jillian Tarrant Bussey, Gary Simpson, Catherine Kozera, Bruce A Curtis, Jennifer R Hall, Tiago S Hori, Charles Y Feng, Marlies Rise, Marije Booman, A Kurt Gamperl, Edward Trippel, Jane Symonds, Stewart C Johnson, Matthew L Rise.
Abstract
Atlantic cod is a species that has been overexploited by the capture fishery. Programs to domesticate this species are underway in several countries, including Canada, to provide an alternative route for production. Selective breeding programs have been successfully applied in the domestication of other species, with genomics-based approaches used to augment conventional methods of animal production in recent years. Genomics tools, such as gene sequences and sets of variable markers, also have the potential to enhance and accelerate selective breeding programs in aquaculture, and to provide better monitoring tools to ensure that wild cod populations are well managed. We describe the generation of significant genomics resources for Atlantic cod through an integrated genomics/selective breeding approach. These include 158,877 expressed sequence tags (ESTs), a set of annotated putative transcripts and several thousand single nucleotide polymorphism markers that were developed from, and have been shown to be highly variable in, fish enrolled in two selective breeding programs. Our EST collection was generated from various tissues and life cycle stages. In some cases, tissues from which libraries were generated were isolated from fish exposed to stressors, including elevated temperature, or antigen stimulation (bacterial and viral) to enrich for transcripts that are involved in these response pathways. The genomics resources described here support the developing aquaculture industry, enabling the application of molecular markers within selective breeding programs. Marker sets should also find widespread application in fisheries management.
Year: 2010 PMID: 20396923 PMCID: PMC3084941 DOI: 10.1007/s10126-010-9285-z
Source DB: PubMed Journal: Mar Biotechnol (NY) ISSN: 1436-2228 Impact factor: 3.619
List of CGP libraries
| Library Name | Library Type | Breeding Program | Tissue Type | For./Rev. | Tissue Treatment | No. of sequences |
|---|---|---|---|---|---|---|
| gmnsil | N | N/A | noda. infected liver | N/A | none | 381 |
| gmnsul | N | N/A | uninfected liver | N/A | none | 136 |
| gmnbbr | N | NB | brain | N/A | none | 9322 |
| gmnbgi | N | NB | gill | N/A | none | 2224 |
| gmapht | N | NB,NL,NH | heart | N/A | none | 16983 |
| gmapov | N | NB,NL,NH | ovary | N/A | none | 11555 |
| gmnbbrts | N | NB | brain | N/A | thermal stress | 3358 |
| gmnblits | N | NB | liver | N/A | thermal stress | 15475 |
| gmnbhkas | N | NB | head kidney | N/A | A. sal. | 6057 |
| gmnbhkic | N | NB | head kidney | N/A | pIC | 7510 |
| gmnbmd | N | NB | mixed digestive | N/A | none | 6151 |
| gmnbmu | N | NB | muscle | N/A | none | 4600 |
| gmapte | N | NB,NL,NH | testis | N/A | none | 4225 |
| gmnlskic | N | NL | spleen/head kidney | N/A | pIC | 4665 |
| gmlbgits | N | NB,NL | gill | N/A | thermal stress | 1168 |
| gmnbpcic | N | NB | pyloric caeca | N/A | pIC | 4561 |
| gmnbspic | N | NB | spleen | N/A | pIC | 4544 |
| gmnl2pia | N | NL | blood | N/A | pIC & A. sal. | 2538 |
| gmnlem | N | NL | embryo | N/A | none | 12135 |
| gmnlla | N | NL | larvae | N/A | none | 20493 |
| gmnlpbas | N | NL | blood | N/A | A. sal. | 71 |
| gmnlpbia | N | NL | blood | N/A | pIC & A. sal. | 513 |
| gmnlpbic | N | NL | blood | N/A | pIC | 95 |
| gmnlkfic | SSH | NL | head kidney | F | pIC | 1005 |
| gmnlkric | SSH | NL | head kidney | R | pIC | 135 |
| gmnlsfic | SSH | NL | spleen | F | pIC | 3005 |
| gmnlsric | SSH | NL | spleen | R | pIC | 869 |
| gmnlkfta | SSH | NL | head kidney | F | heat shock | 1451 |
| gmnlkrta | SSH | NL | head kidney | R | heat shock | 93 |
| gmnllfta | SSH | NL | liver | F | heat shock | 1524 |
| gmnllrta | SSH | NL | liver | R | heat shock | 1586 |
| gmnlmfta | SSH | NL | skeletal muscle | F | heat shock | 1419 |
| gmnlbfic | SSH | NL | brain | F | pIC | 2067 |
| gmnlbric | SSH | NL | brain | R | pIC | 2111 |
| gmnlkfas | SSH | NL | head kidney | F | A. sal. | 1033 |
| gmnlkras | SSH | NL | head kidney | R | A. sal. | 986 |
| gmnlpfas | SSH | NL | blood | F | A. sal. | 81 |
| gmnlpras | SSH | NL | blood | R | A. sal. | 85 |
| gmnlpfic | SSH | NL | blood | F | pIC | 442 |
| gmnlpric | SSH | NL | blood | R | pIC | 90 |
| gmnlsfas | SSH | NL | spleen | F | A. sal. | 1048 |
| gmnlsras | SSH | NL | spleen | R | A. sal. | 1087 |
For each library, the type, breeding program, tissue(s) used and any treatments applied to fish prior to tissue collection are shown, together with the number of sequences generated. Libraries were generated using either normalized library (N) or suppression subtractive hybridization (SSH) protocols, with the latter being divided into forward (F) or reverse (R) libraries. Fish were collected for the New Brunswick (NB), Newfoundland (NL) or New Hampshire (NH) breeding programs, or originated from a pre-project collection (N/A). The scheme used for library nomenclature is described in the Methods section. N/A = not applicable, noda. = nodavirus. For details on the heat shock challenge, see Hori et al. 2010. For details on the formalin-killed, atypical Aeromonas salmonicida (A. sal.) challenge, see Feng et al. 2009. For details on the viral mimic (pIC) challenge, see Rise et al. 2008.
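As a quick consistency check, the per-library sequence counts listed in the table above can be tallied by library type. The following is a minimal sketch: the counts are transcribed from the table, while the variable names are illustrative and not part of the original work. The grand total should agree with the 158,877 ESTs reported in the abstract.

```python
# Sequence counts transcribed from the "List of CGP libraries" table.
# Dictionary and variable names are illustrative only.
normalized = {
    "gmnsil": 381, "gmnsul": 136, "gmnbbr": 9322, "gmnbgi": 2224,
    "gmapht": 16983, "gmapov": 11555, "gmnbbrts": 3358, "gmnblits": 15475,
    "gmnbhkas": 6057, "gmnbhkic": 7510, "gmnbmd": 6151, "gmnbmu": 4600,
    "gmapte": 4225, "gmnlskic": 4665, "gmlbgits": 1168, "gmnbpcic": 4561,
    "gmnbspic": 4544, "gmnl2pia": 2538, "gmnlem": 12135, "gmnlla": 20493,
    "gmnlpbas": 71, "gmnlpbia": 513, "gmnlpbic": 95,
}
ssh = {
    "gmnlkfic": 1005, "gmnlkric": 135, "gmnlsfic": 3005, "gmnlsric": 869,
    "gmnlkfta": 1451, "gmnlkrta": 93, "gmnllfta": 1524, "gmnllrta": 1586,
    "gmnlmfta": 1419, "gmnlbfic": 2067, "gmnlbric": 2111, "gmnlkfas": 1033,
    "gmnlkras": 986, "gmnlpfas": 81, "gmnlpras": 85, "gmnlpfic": 442,
    "gmnlpric": 90, "gmnlsfas": 1048, "gmnlsras": 1087,
}

# Subtotals per library type and the overall number of sequences.
total_normalized = sum(normalized.values())
total_ssh = sum(ssh.values())
grand_total = total_normalized + total_ssh

print(f"Normalized libraries: {len(normalized)}, sequences: {total_normalized}")
print(f"SSH libraries: {len(ssh)}, sequences: {total_ssh}")
print(f"All libraries: {grand_total}")
```

Keeping the two library types in separate dictionaries makes it easy to report how much of the collection comes from normalized versus SSH libraries.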
Normalized cDNA libraries created for Gadus morhua contributing sequences used for SNP identification
| Library Name | No. of Fish | Breeding Program | Age | Fish Population | No. of sequences |
|---|---|---|---|---|---|
| gmnsil | 3 | N/A | Adult | Pre-project collection from NS waters | 178 |
| gmnsul | 3 | N/A | Adult | Pre-project collection from NS waters | 67 |
| gmnbbr | 10 | NB program | Adult | Same population as NB YC1 parents | 5992 |
| gmnbgi | 10 | NB program | Adult | Same population as NB YC1 parents | 1346 |
| gmapht | 20 | NB, NL, NH program | Adult | Same population as NB YC1 and NL YC2 parents | 12374 |
| gmapov | 20 | NB, NL, NH program | Adult | Same population as NB YC1 and NL YC2 parents | 7683 |
| gmnbbrts | 14 | NB program | Juvenile | NB YC1 F1 progeny | 1959 |
| gmnblits | 14 | NB program | Juvenile | NB YC1 F1 progeny | 10650 |
| gmnbhkas | 14 | NB program | Juvenile | NB YC1 F1 progeny | 4665 |
| gmnbhkic | 18 | NB program | Juvenile | NB YC1 F1 progeny | 5460 |
| gmnbmd | 10 | NB program | Adult | NB YC1 parents | 4148 |
| gmnbmu | 10 | NB program | Adult | NB YC1 parents | 3720 |
| gmnbspic | 18 | NB program | Juvenile | NB YC1 F1 progeny | 3833 |
| gmnbpcic | 14 | NB program | Juvenile | NB YC1 F1 progeny | 3858 |
| gmapte | 17 | NB, NL, NH program | Adult | Same population as NB YC1 and NL YC2 parents | 3279 |
| gmnlskic | 21 | NL program | Juvenile | NL YC1 F1 progeny | 3795 |
| gmnlem | 340 | NL program | Embryo | NL YC2 F1 progeny | 8958 |
| gmnlla | 290 | NL program | Larvae | NL YC2 F1 progeny | 15550 |
| gmlbgits | 12 | NB, NL program | Juvenile | NB YC1 F1 progeny, NL YC1 F1 progeny | 241 |
| gmnlpbia | 26 | NL program | Juvenile blood | NL YC1 F1 progeny | 240 |
Libraries were generated by pooling one (or more) tissue types from multiple fish, from one (or more) breeding programs. YC1 and YC2 denote Year Class 1 and Year Class 2 fish, respectively.
SSH libraries created for Gadus morhua
| Library name | No. of fish treated | No. of fish control | Family |
|---|---|---|---|
| gmnlsfic | 18 | 12 | NL YC1 F1 family 32 |
| gmnlsric | 18 | 12 | NL YC1 F1 family 32 |
| gmnlkfic | 18 | 12 | NL YC1 F1 family 32 |
| gmnlkric | 18 | 12 | NL YC1 F1 family 32 |
| gmnlkfta | 32 | 32 | NL YC1 F1 family 4 |
| gmnlkrta | 32 | 32 | NL YC1 F1 family 4 |
| gmnllfta | 32 | 32 | NL YC1 F1 family 4 |
| gmnllrta | 32 | 32 | NL YC1 F1 family 4 |
| gmnlmfta | 32 | 32 | NL YC1 F1 family 4 |
| gmnlbfic | 24 | 16 | NL YC1 F1 family 32 |
| gmnlbric | 24 | 16 | NL YC1 F1 family 32 |
| gmnlkfas | 20 | 20 | NL YC1 F1 family 32 |
| gmnlkras | 20 | 20 | NL YC1 F1 family 32 |
| gmnlpfas | 20 | 20 | NL YC1 F1 family 32 |
| gmnlpras | 20 | 20 | NL YC1 F1 family 32 |
| gmnlpfic | 20 | 20 | NL YC1 F1 family 32 |
| gmnlpric | 20 | 20 | NL YC1 F1 family 32 |
| gmnlsfas | 20 | 20 | NL YC1 F1 family 32 |
| gmnlsras | 20 | 20 | NL YC1 F1 family 32 |
The numbers of fish contributing to each library are shown, together with the program and family details. In the “forward” SSH libraries (i.e. enriched for transcripts that were up-regulated by the treatment), the treated fish samples were used as the tester and the control samples were used as the driver. In the “reverse” SSH libraries (i.e. enriched for transcripts that were down-regulated by the treatment), the control fish samples were used as the tester and the treated samples were used as the driver as previously described (Rise et al. 2008). All fish were juveniles enrolled in the NL CGP breeding program.
EST assembly summary statistics
| | All Version 2.0 | 3′ reads only |
|---|---|---|
| Number of good sequences | 154,142 | 97,976 |
| Average trimmed EST length (bp) | 563 | 591 |
| Number of contigs | 23,838 | 13,448 |
| Number of singletons | 27,976 | 21,746 |
| Number of putative transcripts | 51,814 | 35,194 |
| Maximum no. of ESTs per contig | 154 | 83 |
| Average no. of ESTs per contig | 5.27 | 5.66 |
| Number of putative transcripts with | | |
| Significant BLASTX hits | 15,873 | 8,628 |
| No significant BLAST hits | 35,941 | 26,566 |
| Percentage with no significant BLAST hits | 69.37 | 75.48 |
| Number of contigs containing | | |
| 2 ESTs | 9,618 | 4,415 |
| 3 ESTs | 3,993 | 2,310 |
| 4 ESTs | 2,423 | 1,444 |
| 5-10 ESTs | 5,194 | 3,558 |
| 11-20 ESTs | 1,886 | 1,309 |
| 21-30 ESTs | 452 | 279 |
| 31-50 ESTs | 221 | 121 |
| >50 ESTs | 51 | 12 |
The number of putative transcripts is defined as the number of contigs plus the number of singletons. An E value of 1 × 10⁻⁵ was used as the cut-off for BLAST.
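The definition above, and the percentage of transcripts without a significant BLAST hit, can be reproduced from the table values. This is a minimal sketch: the numbers come from the table, whereas the function names and the choice to treat a hit at exactly the cut-off as significant are assumptions made for illustration.

```python
# Values transcribed from the EST assembly summary table.
assemblies = {
    "All Version 2.0": {"contigs": 23838, "singletons": 27976, "no_hits": 35941},
    "3' reads only": {"contigs": 13448, "singletons": 21746, "no_hits": 26566},
}

E_VALUE_CUTOFF = 1e-5  # cut-off stated in the table legend


def is_significant(e_value, cutoff=E_VALUE_CUTOFF):
    """Assumption: hits at or below the cut-off count as significant."""
    return e_value <= cutoff


def summarize(stats):
    """Return (putative transcripts, percent without a significant hit)."""
    putative = stats["contigs"] + stats["singletons"]
    pct_no_hit = round(100 * stats["no_hits"] / putative, 2)
    return putative, pct_no_hit


for name, stats in assemblies.items():
    putative, pct = summarize(stats)
    print(f"{name}: {putative} putative transcripts, {pct}% without a hit")
```

The computed values match the "Number of putative transcripts" and "Percentage with no significant BLAST hits" rows of the table.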
SNP characteristics
| Population | No. of fish genotyped | Polymorphic SNPs | Sequence contribution of population to SNP assembly |
|---|---|---|---|
| NB YC1 (Cape Sable) | 23 | 71% | 58% |
| NB YC2 (Georges Bank) | 24 | 71% | N/A |
| NL YC2 (Bay Bulls) | 23 | 69% | 41% |
| NL YC3 (Smith Sound) | 23 | 70% | N/A |
| Iceland | 26 | 61% | N/A |
| Ireland | 15 | 53% | N/A |
| Norway | 25 | 56% | N/A |
The percentage of SNPs tested that were polymorphic in different populations of fish is shown. Only NB YC1 and NL YC2 fish provided samples from which the sequence collection was generated. N/A = not applicable.
Examples of sequences which were used in microarray construction that also harbour a SNP
| Contig name | Best hit in NCBI nr: accession number | Best hit: annotation | Best hit: species | Best hit: E value | SNP name | SNP: Cape Sable | SNP: Bay Bulls |
|---|---|---|---|---|---|---|---|
| Ribosomal | | | | | | | |
| all_v2.0.609.C4 | gb\|ACQ58145.1 | 39S ribosomal protein L30, mitochondrial precursor | | 1.00E-11 | cgpGmo-S198 | 0.09/0.04 | 0.30/0.15 |
| all_v2.0.4842.C1 | gb\|ACO09691.1 | 39S ribosomal protein L32, mitochondrial precursor | | 7.00E-46 | cgpGmo-S967a | 0.61/0.43 | 0.52/0.43 |
| all_v2.0.9851.C1 | gb\|ACI67749.1 | 39S ribosomal protein L52, mitochondrial precursor | | 2.00E-35 | cgpGmo-S878 | 0.43/0.22 | 0.48/0.37 |
| all_v2.0.560.C2 | gb\|ACH70774.1 | ribosomal protein S7 | | 1.00E-101 | cgpGmo-S1917 | 0.30/0.20 | 0.26/0.22 |
| all_v2.0.3432.C1 | gb\|ACN10030.1 | 60S ribosomal protein L10 | | 1.00E-119 | cgpGmo-S105 | 0.17/0.13 | 0.30/0.15 |
| all_v2.0.1527.C1 | gb\|ACN10349.1 | 60S ribosomal protein L18a | | 1.00E-90 | cgpGmo-S1730 | 0.35/0.30 | 0.26/0.22 |
| all_v2.0.4345.C1 | gb\|ACI67287.1 | 60S ribosomal protein L27 | | 9.00E-65 | cgpGmo-S474 | 0.22/0.20 | 0.26/0.26 |
| all_v2.0.14203.C1 | gb\|ACN10033.1 | 60S ribosomal protein L38 | | 5.00E-25 | cgpGmo-S916 | 0.35/0.26 | 0.39/0.24 |
| all_v2.0.983.C1 | gb\|ACO09602.1 | 60S ribosomal protein L8 | | 1.00E-140 | cgpGmo-S357 | 0.43/0.22 | 0.39/0.32 |
| all_v2.0.64.C1 | dbj\|BAF98661.1 | Ribosomal protein L13a | | 1.00E-91 | cgpGmo-S1519 | 0.35/0.30 | 0.35/0.35 |
| all_v2.0.603.C1 | dbj\|BAF45898.1 | Ribosomal protein S10 | | 4.00E-69 | cgpGmo-S2265 | 1.00/0.50 | 1.00/0.50 (a) |
| all_v2.0.545.C7 | gb\|ACO09841.1 | 40S ribosomal protein S16 | | 5.00E-67 | cgpGmo-S649a | 0.30/0.20 | 0.52/0.26 |
| all_v2.0.3134.C1 | ref\|NP_001134397.1 | 40S ribosomal protein SA | | 1.00E-139 | cgpGmo-S699b | 0.22/0.11 | 0.09/0.04 |
| Immune/Stress | | | | | | | |
| all_v2.0.2956.C2 | gb\|ACO09614.1 | Complement component 1 Q subcomponent-binding protein | | 1.00E-115 | cgpGmo-S1200 | 0.52/0.48 | 0/0 |
| all_v2.0.4174.C1 | gb\|ACO14444.1 | Cystatin-F precursor | | 1.00E-27 | cgpGmo-S296 | 0.04/0.06 | 0.13/0.06 |
| all_v2.0.2115.C1 | gb\|ACN10355.1 | CXC chemokine receptor type 4 | | 1.00E-59 | cgpGmo-S525 | 0.43/0.26 | 0.35/0.17 |
| all_v2.0.2319.C4 | gb\|AAF72567.1 | Immunoglobulin D heavy chain constant region variant a | | 3.00E-33 | cgpGmo-S1542 | 1.00/0.50 | 1.00/0.50 (a) |
| all_v2.0.4114.C1 | ref\|XP_001923855.1 | Similar to interleukin 12 receptor beta 2.b | | 1.00E-23 | cgpGmo-S946 | 0.35/0.48 | 0.65/0.37 |
| all_v2.0.1550.C1 | gb\|ACO14448.1 | Macrophage migration inhibitory factor | | 1.00E-41 | cgpGmo-S2107 | 0.52/0.39 | 0.30/0.24 |
| all_v2.0.8001.C1 | ref\|XP_001894282.1 | T-cell receptor beta chain ANA 11 | | 2.00E-07 | cgpGmo-S1484 | 0.48/0.28 | 0.48/0.33 |
| all_v2.0.682.C1 | gb\|ACN10899.1 | T-complex protein 1 subunit alpha | | 1.00E-106 | cgpGmo-S772 | 0.30/0.28 | 0.52/0.43 |
| all_v2.0.3543.C1 | ref\|NP_001133482.1 | T-complex protein 1 subunit delta | | 1.00E-107 | cgpGmo-S2247 | 1.00/0.50 | 1.00/0.50 (a) |
The contig name in the All Version 2.0 assembly is given, together with the accession number and annotation of the best BLAST hit in the NCBI nr database. The SNP name for a SNP identified on that contig is shown, together with values for the observed heterozygosity and minor allele frequency for that SNP in fish collected from Cape Sable, Nova Scotia and Bay Bulls, Newfoundland.
(a) Indicates putative SNPs that may represent variation between different genes rather than different alleles of the same gene.
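To illustrate the two per-SNP statistics reported in this table, the sketch below computes observed heterozygosity and minor allele frequency from a list of genotypes. The genotype encoding (two-letter strings such as "AG"), the function names and the flagging rule are assumptions made for this example, not the authors' pipeline. The rule only mirrors the pattern seen in the footnoted rows, where every fish is heterozygous (1.00) and the minor allele frequency is 0.50, as would be expected if the two "alleles" actually came from different genes.

```python
from collections import Counter


def snp_summary(genotypes):
    """Return (observed heterozygosity, minor allele frequency).

    `genotypes` is a list of two-letter strings, one per fish, for a
    biallelic SNP (hypothetical encoding chosen for this sketch).
    """
    n = len(genotypes)
    heterozygotes = sum(1 for g in genotypes if g[0] != g[1])
    allele_counts = Counter("".join(genotypes))
    if len(allele_counts) < 2:
        maf = 0.0  # monomorphic: no minor allele observed
    else:
        maf = min(allele_counts.values()) / (2 * n)
    return heterozygotes / n, maf


def looks_like_paralogous_variation(ho, maf):
    """Illustrative flag: all individuals heterozygous at equal allele frequency."""
    return ho == 1.0 and maf == 0.5


# Made-up genotypes for 23 fish, all heterozygous.
ho, maf = snp_summary(["AG"] * 23)
print(ho, maf, looks_like_paralogous_variation(ho, maf))
```

A SNP flagged this way is a candidate for manual inspection rather than proof of paralogy.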
Fig. 1 An overview of the genomics workflow within the CGP. The sequence information has been generated from populations of fish enrolled in selective breeding and is being used to develop high-throughput molecular resources for Atlantic cod. These genomics tools (such as genetic markers and an oligonucleotide microarray) will be applied to dissection of QTL within the family-based breeding programs (adapted from Rise et al., in press).