Juan I Montoya-Burgos, Aurélia Foulon, Ilham Bahechar.
Abstract
BACKGROUND: Fast evolving genes are targets of an increasing panel of biological studies, from cancer research to population genetics and species-specific adaptations. Yet, their identification and isolation are still laborious, particularly for non-model organisms. We developed a method, named the Inter-Specific Selective Hybridization (ISSH) method, for generating cDNA libraries enriched in fast evolving genes. It utilizes transcripts of homologous tissues of distinct yet related species. Experimental hybridization conditions are monitored in order to discard transcripts that do not find their homologous counterparts in the two species sets as well as transcripts that display a strong complementarity between the two species. Only heteroduplexes that disanneal at low stringency are used for constructing the resulting cDNA library.
Year: 2010 PMID: 20175901 PMCID: PMC2838844 DOI: 10.1186/1471-2164-11-126
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schematic representation of the ISSH method. The cDNA pool of the species of interest, whose fast evolving transcripts are to be isolated, is called the "probed" while the mRNA pool of the species used as a template is called the "selector". Thick lines, probed transcripts; thin lines, selector mRNAs; small black dot, biotin; small opened or dashed bars at the donor transcript ends, tails of the short-tailed random primers A and B; grey ball, magnetic beads coated with streptavidin; magnet shape, magnetic separator; grey bars at the ends of short-tailed random primers, double strand adapters; arrows, PCR primers. Fast evolving transcripts which are isolated with the ISSH method are shown at the bottom of the chart.
Analysis of sequence divergence for the enriched and the control libraries.
| Size category (bp) | Mean (S.D.) | n | Mean (S.D.) | n | T-test (df-t) | P |
|---|---|---|---|---|---|---|
| Blast against zebrafish | ||||||
| 90-109 | 29.93 (7.56) | 126 | 26.93 (8.77) | 66 | -2.358 (116) | 0.010* |
| 110-129 | 33.32 (9.26) | 129 | 28.84 (10.06) | 67 | -3.038 (124) | 0.001** |
| 130-149 | 34.82 (9.86) | 138 | 31.61 (11.06) | 61 | -1.95 (103) | 0.027* |
| 150-169 | 36.84 (9.83) | 103 | 33.56 (11.43) | 47 | -1.701 (78) | 0.046* |
| 170-189 | 40.42 (9.88) | 100 | 33.34 (14.11) | 36 | -2.777 (47) | 0.004** |
| 190-209 | 40.97 (10.01) | 94 | 37.03 (13.14) | 39 | -1.681 (57) | 0.049* |
| 210-229 | 40.52 (10.92) | 80 | 28.91 (14.08) | 55 | -5.143 (96) | < 0.001** |
| 230-249 | 39.33 (11.01) | 64 | 34.63 (13.87) | 34 | -1.71 (55) | 0.045* |
| ≥ 250 | 42.39 (11.41) | 145 | 34.72 (15.60) | 34 | -2.702 (41) | 0.005** |
| Blast against | ||||||
| 90-109 | 30.09 (9.93) | 131 | 24.70 (12.86) | 66 | -2.986 (105) | 0.002** |
| 110-129 | 31.66 (11.29) | 140 | 22.12 (12.62) | 49 | -4.771 (78) | < 0.001** |
| 130-149 | 35.30 (12.17) | 127 | 27.48 (12.30) | 33 | -3.261 (49) | < 0.001** |
| 150-169 | 36.18 (12.55) | 115 | 29.40 (15.87) | 39 | -2.423 (55) | 0.009** |
| 170-189 | 37.93 (11.71) | 96 | 31.87 (13.66) | 19 | -1.807 (23) | 0.04* |
| 190-209 | 36.53 (12.15) | 103 | 27.73 (12.41) | 25 | -3.193 (36) | 0.001** |
| 210-229 | 43.66 (12.01) | 48 | 28.76 (15.48) | 18 | -3.689 (25) | < 0.001** |
| ≥ 230 | 36.16 (12.08) | 99 | 28.44 (18.10) | 18 | -1.740 (19) | 0.047* |
Sequence divergence was corrected using the K2P model with 2 transitions per transversion.
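The t values and degrees of freedom in the table appear consistent with an unequal-variance (Welch) t-test computed from the per-group means, standard deviations and sample sizes. That reading is an assumption, since the table does not name the test variant. A minimal sketch, with the hypothetical helper `welch_t`, recomputes the first zebrafish row (90-109 bp) from the summary statistics alone. The second group is passed first so that the sign matches the table.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from summary statistics (assumed test variant)."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group a
    var_b = sd_b ** 2 / n_b  # squared standard error of group b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# Row "90-109" of the zebrafish block: 29.93 (7.56), n = 126 versus
# 26.93 (8.77), n = 66. Second group minus first reproduces the sign.
t, df = welch_t(26.93, 8.77, 66, 29.93, 7.56, 126)
print(round(t, 3), int(df))  # -2.358 116
```

The P values would additionally require the cumulative t distribution, which the standard library does not provide, so they are left out of this sketch.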
Figure 2. Number of annotated contigs per category of gene expression level for the enriched and control libraries. Using Unigene database information, gene expression level is calculated as the number of ESTs of the gene under consideration in the studied tissue divided by the total ESTs of the tissue library, multiplied by 10,000.
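The expression-level measure described in the caption is a simple scaled proportion. The sketch below spells it out; the function name `expression_level` and the example counts are hypothetical and not taken from the study.

```python
def expression_level(gene_ests, tissue_total_ests, scale=10_000):
    """ESTs of the gene in the tissue, divided by all ESTs of that
    tissue library, multiplied by 10,000 (as in the caption)."""
    if tissue_total_ests <= 0:
        raise ValueError("tissue library must contain at least one EST")
    return gene_ests * scale / tissue_total_ests


# Hypothetical example: 5 ESTs of a gene in a library of 20,000 ESTs.
print(expression_level(5, 20_000))  # 2.5
```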
Tentatively annotated fast evolving transcript fragments and their sequence divergence as compared to the closest ortholog in the teleost mRNA refseq database and in the Hypostomus gr. plecostomus EST dataset.
| Contig | mRNA refseq annotation according to closest teleost ortholog | Species | Cds/UTR | Divergence vs. closest teleost refseq ortholog | Divergence vs. Hypostomus gr. plecostomus EST |
|---|---|---|---|---|---|
| 342 | zgc:175146 | Dr | 3'UTR | 0.4411 | 0.2990 |
| 435 | interferon regulatory factor 6 (irf6) | Dr | 3'UTR | 0.4824 | 0.3485 |
| 478 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, assembly factor 2 (ndufaf2), nuclear gene encoding mitochondrial protein | Dr | 3'UTR | 0.2680 | 0.2531 |
| 597 | similar to porcupine homolog (LOC100148644) | Dr | 3'UTR | 0.4824 | 0.3485 |
| 605 | zgc:158374 | Dr | cds | 0.4411 | 0.2833 |
| 710 | single-minded homolog 2 (sim2) | Dr | 3'UTR | 0.4411 | 0.3151 |
| 785 | similar to pol polyprotein (LOC796496) | Dr | cds | 0.3839 | 0.1324 |
| 809 | similar to ORF1-encoded protein (LOC100004717) | Dr | 5'UTR | 0.4824 | 0.3839 |
| 1137 | zgc:56382 | Dr | cds | 0.3485 | 0.2680 |
| 1451 | RMD5 homolog B (rmd5b) | Ss | 5'UTR | 0.5042 | 0.1324 |
| 1479 | similar to ORF1-encoded protein (LOC100004764) | Dr | 3'UTR | 0.2531 | 0.1573 |
| 1492 | wu:fc33e05 | Dr | 3'UTR | 0.3151 | 0.2680 |
| 1565 | hypothetical LOC570897 | Dr | cds | 0.3316 | 0.2385 |
| 1614 | ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1) like (rac1l) | Dr | 3'UTR | 0.2833 | 0.2990 |
| 1694 | similar to NLR family, pyrin domain containing 3 (LOC100002061) | Dr | 3'UTR | 0.3839 | 0.4024 |
| 1695 | monoacylglycerol O-acyltransferase 2 | Dr | 3'UTR | 0.3485 | 0.2531 |
| 1782 | similar to G protein-coupled receptor 128 (LOC100148710) | Dr | cds | 0.3839 | 0.2531 |
| 1819 | similar to Uromodulin precursor (Tamm-Horsfall urinary glycoprotein) (THP) (LOC100007639) | Dr | cds | 0.4824 | 0.4024 |
| 1902 | hypothetical protein LOC100150258 | Dr | cds | 0.3839 | 0.1324 |
| 1905 | si:dkeyp-27b10.2 | Dr | cds | 0.3485 | 0.2531 |
| 2016 | zgc:64076 | Dr | 3'UTR | 0.2680 | 0.1702 |
| 2029 | zgc:85811 | Dr | cds | 0.3151 | 0.1833 |
| 2066 | similar to CG6639 CG6639-PA (LOC100000002) | Dr | cds | 0.4824 | 0.3151 |
| 2085 | hypothetical protein LOC100149782 | Dr | cds | 0.2990 | 0.2103 |
| 2225 | zgc:158374 | Dr | cds | 0.4614 | 0.4214 |
| 2342 | similar to zymogen granule membrane glycoprotein 2 (LOC100005977) | Dr | cds | 0.4614 | 0.4411 |
| Reference fast evolving sequences | |||||
| | cytochrome oxidase subunit I (COI) | | cds | 0.239 | 0.145 |
| | reticulon 4 (RTN4) introns 1 & 2 | | introns | 0.748 | 0.170 |
Sequence divergence was calculated using the alignment region with sequence in all species compared. Divergences were corrected using the K2P model with 2 transitions per transversion. The lowest part of the table presents the sequence divergence of two published fast evolving markers used for characterizing species or genera. Dr: Danio rerio; Ss: Salmo salar; A.: Ancistrus; H.: Hypostomus.
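The K2P (Kimura two-parameter) correction mentioned in the note can be sketched as follows. The sketch assumes that the observed proportion of differing sites, p, is split between transitions and transversions in the stated 2:1 ratio before the standard K2P formula is applied. The helper name `k2p_from_p` is hypothetical, and the exact procedure used in the study may differ.

```python
import math


def k2p_from_p(p, ts_tv_ratio=2.0):
    """Kimura two-parameter distance from an uncorrected proportion of
    differing sites p, assuming a fixed transition/transversion ratio."""
    transitions = p * ts_tv_ratio / (ts_tv_ratio + 1.0)  # P in the K2P formula
    transversions = p / (ts_tv_ratio + 1.0)              # Q in the K2P formula
    a = 1.0 - 2.0 * transitions - transversions
    b = 1.0 - 2.0 * transversions
    if a <= 0.0 or b <= 0.0:
        raise ValueError("divergence too high for the K2P correction")
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# 30% observed differences correspond to a corrected distance near 0.4024.
print(round(k2p_from_p(0.30), 4))  # 0.4024
```

Under this assumption, an uncorrected difference of 30% maps to about 0.4024, a value that also occurs in the table, which suggests (but does not prove) that the tabulated divergences were obtained in this way.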
Figure 3. Gene ontology classification of the fraction of annotated transcripts belonging to the library enriched in fast evolving genes and the control library. Only the major categories of biological processes are used, according to Panther database. Dotted bars indicate biological processes involved in metabolism.