Zebrafish orthologs of human muscular dystrophy genes.

Leta S Steffen, Jeffrey R Guyon, Emily D Vogel, Rosanna Beltre, Timothy J Pusack, Yi Zhou, Leonard I Zon, Louis M Kunkel.

Abstract

BACKGROUND: Human muscular dystrophies are a heterogeneous group of genetic disorders which cause decreased muscle strength and often result in premature death. There is no known cure for muscular dystrophy, nor have all causative genes been identified. Recent work in the small vertebrate zebrafish Danio rerio suggests that mutation or misregulation of zebrafish dystrophy orthologs can also cause muscular degeneration phenotypes in fish. To aid in the identification of new causative genes, this study identifies and maps zebrafish orthologs for all known human muscular dystrophy genes.
RESULTS: Zebrafish sequence databases were queried for transcripts orthologous to human dystrophy-causing genes, identifying transcripts for 28 out of 29 genes of interest. In addition, the genomic locations of all 29 genes have been found, allowing rapid candidate gene discovery during genetic mapping of zebrafish dystrophy mutants. 19 genes show conservation of syntenic relationships with humans and at least two genes appear to be duplicated in zebrafish. Significant sequence coverage on one or more BAC clone(s) was also identified for 24 of the genes to provide better local sequence information and easy updating of genomic locations as the zebrafish genome assembly continues to evolve.
CONCLUSION: This resource supports zebrafish as a dystrophy model, suggesting maintenance of all known dystrophy-associated genes in the zebrafish genome. Coupled with the ability to conduct genetic screens and small molecule screens, zebrafish are thus an attractive model organism for isolating new dystrophy-causing genes/pathways and for use in high-throughput therapeutic discovery.

Year: 2007; PMID: 17374169; PMCID: PMC1851013; DOI: 10.1186/1471-2164-8-79

Source DB: PubMed; Journal: BMC Genomics; ISSN: 1471-2164; Impact factor: 3.969


Background

Muscular dystrophies are a heterogeneous group of genetic disorders characterized by loss of muscle strength and integrity. Common pathological hallmarks of the mammalian muscular dystrophies include the presence of necrotic muscle fibers, fiber size variation, centralized nuclei indicating fiber regeneration, inflammatory infiltrates, and replacement of muscle fibers by fat and connective tissue to varying degrees. However, muscular dystrophies differ in their age of onset, severity, the muscle groups affected, additional non-muscle phenotypes (such as reduced average IQ) and the genetic mode of inheritance (reviewed in [1]). To date, 31 distinct muscular dystrophies have been described and 25 distinct genes have been causatively linked to these muscular dystrophies [2]. The most common, Duchenne Muscular Dystrophy (DMD), accounts for the majority of dystrophy patients. DMD affects 1 in 3500 males and typically results in death by the third or fourth decade. The mutated gene in DMD, dystrophin, was identified in the 1980s [3] and its characterization has led to methods of genetic testing and a better understanding of dystrophic pathology. However, no cure has yet been identified. In addition, the causative genes remain unknown for several dystrophies and for additional patients with unclassified dystrophy phenotypes. Finally, while several dystrophy-associated genes encode proteins that directly or indirectly interact, others, including the nuclear proteins (lamin A/C and emerin) and cytoplasmic proteins (TRIM32), have not yet been linked in a common pathway that would make apparent the cause of their dystrophic phenotype. The small freshwater zebrafish, Danio rerio, has recently emerged as a promising model organism for the study of muscular dystrophies and other human diseases.
Due to its small size, large numbers of offspring (50–350 per week), rapid development of the skeletal musculature, and transparency in embryonic/juvenile stages, zebrafish provide an excellent system for genetic screens to identify novel muscular dystrophy-causing genes and pathways. More recent experiments have also proven zebrafish a useful organism for drug screens using whole vertebrates, suggesting that identification of dystrophic zebrafish mutants may allow drug screens for muscular dystrophy therapeutics. (See [4-8] for zebrafish drug screens.) The zebrafish sapje mutant was identified in 1996 with grossly normal muscle at 2 days post-fertilization (dpf) but decreased muscle organization and motility at 5 dpf [9]. The causative gene mutation was mapped to a nonsense mutation in dystrophin, suggesting conservation of this dystrophy-associated molecular pathway in fish [10]. Other studies have employed anti-sense morpholino or RNAi knockdowns to show similar dystrophy-like pathology and phenotypes when dystrophin (DMD/BMD), caveolin-3 (LGMD 1C), δ-sarcoglycan (LGMD 2F), or laminin α2 (MDC 1A) proteins are reduced, suggesting that this conservation may extend to other orthologs of human dystrophy-associated genes [10-16]. To aid in the further identification of zebrafish dystrophy mutants, we have interrogated current sequence databases to identify zebrafish orthologs of the known human dystrophy genes. Positioning these genes allows rapid candidate identification during genetic mapping of dystrophic zebrafish mutants and may allow prioritization of novel mutants – those with linkage to a genomic region containing no known dystrophy-associated ortholog. Due to the evolving nature of the Sanger Centre Zebrafish Genome assembly, we have also identified the BAC clone location of these genes. BAC sequences should allow more consistent local sequence information and easy updating when future genome alignments are released. 
This study has identified orthologous zebrafish transcripts for 24 out of 25 of the known human dystrophy-associated genes and 4 additional myopathy-related genes. Genomic positions have been identified for all 29 of these genes and BAC locations for 24. These genomic data suggest that at least two dystrophy genes are duplicated in the zebrafish genome. Localization of the closest mammalian gene neighbors also shows that syntenic relationships are conserved for 19 dystrophy- and myopathy-causing genes.

Results and discussion

Mutations in 25 genes have been linked to 27 distinct forms of human muscular dystrophy (MD). In humans, these genes are distributed across 17 of the 23 chromosomes. The protein products of these genes are found throughout the muscle fiber, from the extracellular matrix and sarcolemmal membrane to the sarcomere, the Golgi, the cytoplasm, and the nucleus. We surveyed the zebrafish GenBank database to identify putative orthologs of the 25 human muscular dystrophy-associated genes and four additional myopathy-associated genes. Results with a high degree of similarity and significant sequence coverage were confirmed by reciprocal BLAST against GenBank mammalian databases. This in silico approach identified orthologous transcripts for 23 out of 25 muscular dystrophy-associated genes and all 4 myopathy-associated genes within the GenBank database (Tables 1 and 2).
Table 1

Zebrafish orthologs of human muscular dystrophy genes and their genomic locations.

Columns "Scaffold" through "Genomic notes" describe the fish genomic location; "BAC name" and "BAC notes" describe the fish BAC location.

Gene | Symbol | Associated disease | Fish EST | EST notes | Scaffold | Clone | Gene location | Genomic notes | BAC name | BAC notes | Synteny | Synteny notes
Calpain-3 | CAPN3 | LGMD2A | NM_001004571 | | 2601 | NA-BX322589 | Chr17 62.6 Mb and 63.1 Mb | By blasts and synteny, appears split. The first ~1550 nt has no corresponding BAC. | N/A – zC283F6 | Human CAPN3 matches well with zK12H9 | Yes | Syntenic with GANC on genome and ZFP106 on genome and BAC (zC283F6)
Caveolin 3 | CAV3 | LGMD1C, hyperCKemia, Rippling muscle disease | NM_205738 | | 879 | BX664752 | Chr6 33.2 Mb | Organized in 2 exons, similar to human CAV3 | zKp111E5 | | On one side | Syntenic with OXTR on BAC and genome.
Dystrophin | DMD | Duchenne MD, Becker MD | XM_678461, XM_678362, XM_678552, partial | All three are partial transcripts, but in order, cover most of the DMD coding region | 42 | BX004756, CT033808 | Chr1 9.6 Mb-9.4 Mb | Duplications within gene likely incorrect. Additional partial sequence located on Chr1 scaffold 49 at 10.5 Mb | zC59A4, zC274B7 | Transcripts span these two overlapping BACs | No |
Dysferlin | DYSF | LGMD2B, Miyoshi Myopathy, Distal myopathy with anterior tibial onset | XM_684324 | Many transcripts are similar to dysferlin, but this is the only one that aligns closer to dysferlin (rather than myoferlin or otoferlin) on reciprocal blast | 1155 | CR847843 | Chr7 83.3 Mb-83.5 Mb | Human dysferlin also identifies Chr12 (BC063743, likely fish myoferlin) and Chr13 (XM_682373, similar to myoferlin, dysferlin, and otoferlin) | zKp78E10 – bZ50C18 | The first BAC contains the 5' ~1/7th while the second BAC overlaps and gives coverage to the transcript end | No | Flanking genes are on Chr7 but not in the same region or on the same scaffold as dysferlin
Emerin | EMD | Emery-Dreifuss MD | XM_685843, XM_549369 | Identical except for a single 7nt internal fragment. Likely alternatively spliced variants. Poor homology to mammalian emerin. | 3352 | No data | Chr23 18.96 Mb | Duplication of last 4/5 of transcript on Chr7 39.54 Mb, scaffold 1085. No synteny with Chr7. | zC133L21 | Identical matches on unfinished BACs zK233H12, zK181F15, and zK93L1. | Yes | RPL10 and FLNA are at 19.1 Mb and 19.0 Mb, on Chr23. Both are syntenic on BACs.
Fukutin related protein | FKRP | MDC1C, LGMD2I | XM_695011, partial | | 2206 | No data | Chr15 26.6 Mb | | zK31C13 | | Yes | STRN4 and SLC1A5 are syntenic on the genome and the BAC.
Lamin A/C | LMNA | Emery-Dreifuss MD, LGMD1B, CMD1A, etc. | NM_152971 | | 2371 | CR848742 | Chr16 37.8 Mb | | zK181C1 | | No | Flanking genes are not syntenic with each other, either
Myotilin | TTID | LGMD1A, myofibrillar myopathy | None found | Closest match is Zv5 predicted transcript ENSDARG015348 | 1999 | CT573287 | Chr14 12.2 Mb & 10.8 Mb | ENSDARG015348 split between two loci in Zv6. Genome incorrect. | zK101K8 | Complete and contiguous BAC coverage | No | Flanking genes are not syntenic with each other, either
Sarcoglycan alpha | SGCA | LGMD2D | XM_680178 | Close homology with SCGE | 1664 | BX548040 | Chr12 870 Kb | | zC190L11 | | On one side | Syntenic with Col1A1 on both genome and BAC
Sarcoglycan beta | SGCB | LGMD2E | NM_001034973 | First half of transcript aligns with full length of human SGCB | 2974 | CT583700 | Chr20 59.8 Mb | Second half of EST is located on Chr25 1.3 Mb (Scaffold 3566) | zC253J24 | | Yes | Both genes syntenic on genome and BAC
Sarcoglycan delta | SGCD | LGMD2F, CMD1L | NM_001001816 | | 3106 | BX294656 | Chr21 39.0–38.7 Mb | | zC238M13 | First 300nt located on zK189O20 | On one side | Syntenic with MRPL22
Sarcoglycan gamma | SGCG | LGMD2C | NM_001003748 | | 2184 | BX927291 | Chr15 20.2 Mb | Incomplete coverage. Duplicate exon also on scaffold 2184 but not on BAC. | zC261A10 | Complete coverage | On one side | Syntenic with SACS on both genome and BAC
Telethonin | TCAP | LGMD2G, Dilated cardio-myopathy (CMD1N) | XM_679011 | | 371 | CR387996 | Chr3 19.9 Mb | Human TCAP also identifies a locus on Chr16 scaffold 2377 but coverage is less complete and exons are not contiguous | zK183N6 | | No |
Tripartite Motif-containing protein 32 | TRIM32 | LGMD2H | XM_686142 | Human TRIM32 only has one coding exon. | NA688 | No data | No data | Coding sequence on scaffold NA688. Putative 5' UTR exons are located in duplicate on Chr8, scaffold 1244 | None | Putative 5' UTR exons are located on zK72L14 & zK65O14 | Yes | ASTN2 spans both the human and the zebrafish TRIM32 loci on Scaffold NA688
Titin | TTN | LGMD2J, Tibial MD, Hereditary Myopathy with early respiratory failure | XM_679005 (TTN2, partial), XM_678144 (TTN1, partial) | Locus is duplicated. Only partial transcripts available | 3186 | BX640499, BX571737, BX640465 | Chr9 41.8 Mb-42.2 Mb | Locus duplications are in tandem. Duplicate genes are divergent in sequence and likely to be true duplicates. | zKp67D2, dZ258D18, zK190I10, dZ249N21, zC198B21 | BACs overlap to cover the entire titin locus. | Yes | Syntenic with FLJ39502 and FKBP7 on genome and BAC.

MD-Muscular Dystrophy, LGMD-Limb Girdle Muscular Dystrophy, CMD-Congenital Muscular Dystrophy, nt-nucleotides.

Table 2

Zebrafish orthologs of human congenital muscular dystrophy and selected myopathy genes and their genomic locations. Genes associated with both non-congenital and congenital muscular dystrophies are in Table 1.

Columns "Scaffold" through "Genomic notes" describe the fish genomic location; "BAC name" and "BAC notes" describe the fish BAC location.

Gene | Symbol | Associated disease | Fish EST | EST notes | Scaffold | Clone | Gene location | Genomic notes | BAC name | BAC notes | Synteny | Synteny notes
Collagen 6A1 | Col6A1 | Bethlem myopathy, Ullrich CMD | XM_693161, partial | | 1607 | CR925698 | Chr11 35.3 Mb | | zK287I12 | | On one side | Syntenic with Col6A2 on genome and BAC.
Collagen 6A2 | Col6A2 | Bethlem myopathy, Ullrich CMD | XM_691072 | The first 320 bases are likely not part of this transcript. | 1607 | CR925698-BX323597 | Chr11 35.2 Mb | The first 320 bases are located in multiple places on other chromosomes | zK287I12 – zC227N13 | The first 320 bases are located on zC184B9. | On one side | Syntenic with Col6A1 on genome and BAC.
Collagen 6A3 | Col6A3 | Bethlem myopathy, Ullrich CMD | XM_679796 | XM_687365 is also orthologous to mammalian Col6A3, but is more similar to a second predicted Col6A3 mammalian locus. | 1361-1360 | No data | Chr9 19.0 Mb and 15.0 Mb | The beginning is located on scaffold 1361, the repeating middle elements are on both scaffolds, and the end is on 1360. Note that the genomic locus may be misorganized. | zC5M6 | Unfinished BAC covers entire transcript on various fragments | Yes | Syntenic with MLPH on Chr9 and with COPS8 on Chr9 and clone zC5M6.
Desmin | DES | DCM1, CMD1I, several skeletal and/or cardio-myopathies | NM_130963 | | 1342 | No data | Chr9 7.3 Mb | Several loci are orthologous to human desmin. Most ruled out due to closer homology with other proteins. Additional loci on Chr20 (scaffold 2945), and Chr13 (scaffold 1885) could not be ruled out and may be duplications. | None | Homologous sequences were found, but none were near-exact matches to the zebrafish transcript sequence. | No | Chr9 locus is not syntenic with the other desmin-like genes, either.
Fukutin | FCMD | Fukuyama CMD | XM_686729, partial | | 792, 793 | CR753888, CT027618, BX072578 | Chr5 78.4–79.0 Mb | Full match on the first two clones, partial match on the third. Likely a genomic misalignment. | zC286A10, zC154E10 | Full coverage of the partial transcript on both | On one side | FSD1CL is syntenic on both genome and BAC.
Filamin C | FLNC | Myofibrillar myopathy | XM_693754, XM_687344, partial | Duplicated. Divergent nucleotide sequences. First contains the Human FLNC unique region. Second transcript is only partial. | 505, 3643 | AL954190, No data | Chr4 7.5 Mb, Chr25 32.9 Mb | Human FLNC unique region is not part of XM_687344, but is located immediately after it on Chr25. | zC284B12, zK3006 | Both BACs match XM_693754. No BACs for XM_687344 | On one side | Chr4 locus not syntenic, though flanking genes are elsewhere on Chr4. Partial NAG6 matches on Chr25.
Integrin Alpha 7 | ITGA7 | CMD with integrin deficiency | None found | Closest EST is a closer match to mammalian ITGA6 | 1560 | No data | Chr11 2.5 Mb | Location identified using human ITGA7 only | zC245G15 | Used human ITGA7 | No | Flanking genes are not syntenic with each other, either
Acetyl-glucosaminyl-transferase-like protein | LARGE | MDC1D | NM_001004537 | LARGE1B (NM_001004538) is highly orthologous. | 570 | No data | Chr4 39.4 Mb | LARGE1B located on Chr18, scaffold 2725, clone BX908385. | None | LARGE1B located on both zC282N12 & zC206G24 | No | The closest flanking genes are predictions
Laminin alpha 2 | LAMA2 | Merosin-deficient CMD | XM_694983 | Partial, predicted | 2875 | No data | Chr20 3.8 Mb | Aligns with LAMA2 predicted transcripts GenScan01065 and FGENESH78171 | None | | On one side | Syntenic with ARHGAP18, but NOT the highly similar LAMA1 locus (on Chr24)
Polyadenylate-binding protein, nuclear 1 | PABPN1 | Oculo-pharyngeal MD | BC079522 | NM_213259 also matches but diverges over the 3' non-coding end. NM_213259 3' end is discontinuous with its 5' end on the genome and BACs and may not represent a real transcript. | 3471 | BX294113 and CT583644 | Chr24 21.4 Mb and 21.6 Mb | Duplication on Chr24 clones is likely due to genomic misalignment since clones overlap in the Sanger fingerprinted contigs. | zKp73G8 | | No | SLC22A17 is located on Chr24, but not in the same region.
Protein O-Mannose Beta-1,2-N-Acetyl-glucosaminyl-transferase | POMGNT1 | Muscle-eye-brain (MEB) | BC097123 | | 985 | No data | Chr6 69.0 Mb | | zK170G13, zC156B18 | Sequencing of first BAC is unfinished | On one side | Syntenic with TSPAN1 on both genome and BAC
Protein-O-mannosyl-transferase 1 | POMT1 | LGMD2K, Walker Warburg syndrome | XM_693177 | | 723 | BX511209 and No data | Chr5 56.2 Mb & 56.3 Mb | Split between 3 loci. Exons 1–3 at first location, exons 3–17 at second location. Exons 17–22 potentially on Chr17 at 37.47 Mb. | zC129A6 | Covers only first 3 exons. No matches for other exons. | No |
Sarcoglycan epsilon | SGCE | Myoclonic dystonia | NM_001002594 | Close homology with SCGA | 2827 | BX640469 | Chr19 41.07 Mb | | zK104M9 | | On one side | Syntenic with CASD1 on both genome and BAC
Selenoprotein N, 1 | SEPN1 | Rigid spine MD1 (RSMD1), Multiminicore disease | NM_001004294 | | 2451 | BX323794 & R626962 | Chr17 1.8 Mb & 2.3 Mb | Duplication likely due to genome misalignment since the BACs overlap. Both clones have full transcript coverage. | zC247C16, zC15D5 | BACs overlap, suggesting that the genomic duplication is a misalignment. | On one side | Syntenic with FAM54B on genome and BAC

CMD/DCM-Congenital Muscular Dystrophy, MD-Muscular Dystrophy, nt-nucleotides.

No orthologous transcript sequence was identified in the zebrafish Genbank database for the non-congenital MD gene, myotilin, or the congenital MD (CMD) gene, integrin alpha 7 (ITGA7). However, interrogation of Version 5 of the Sanger Centre Zebrafish Genome with human myotilin protein sequence identified a highly conserved ENSEMBL-predicted zebrafish myotilin transcript. Complete and contiguous sequence for this ENSEMBL-predicted myotilin was also found on a single BAC clone. However, the predicted transcript is no longer contiguous within Version 6 of the genome, suggesting that the current genomic alignment through this region may be incorrect. For integrin alpha 7 (ITGA7), a putative genomic and BAC location was identified by similarity to human ITGA7 over other integrins, though no transcript sequence has yet been identified. These data suggest that zebrafish have gene orthologs for all known human MD genes. In combination with mutant and morpholino data demonstrating zebrafish dystrophy phenotypes upon down-regulation of several MD gene orthologs, these data recommend the zebrafish as an excellent model organism for genetic screens to identify additional vertebrate MD-causing genes and pathogenic pathways.

Genomic positions of zebrafish dystrophy orthologs

Genomic loci of zebrafish orthologs were identified in version 6 of the Sanger Centre Zebrafish Genome using the blastn algorithm with zebrafish RNA sequences. Locations were independently confirmed using the tblastn algorithm with human protein sequences. Human protein sequences were also used in case gene duplications were present but not reported in the EST database. Human sequences often returned several locations, sometimes correlating with related genes within a gene family. Additional loci were ruled out where possible by performing similar analyses with paralogs and/or by synteny with paralogs. All 29 genes could be placed in whole or in part on Version 6 of the Sanger Centre Zebrafish Genome. TRIM32, responsible for Limb Girdle Muscular Dystrophy 2H (LGMD 2H), resides on an orphan scaffold that has not yet been integrated into the chromosomal organization of the genome. The remaining 28 genes are scattered across 18 chromosomes with the majority of chromosomes having only one dystrophy ortholog (Fig. 1). Only Chr 9 (Collagen 6A3, desmin, and duplicate titin genes) and Chr 11 (ITGA7 and two syntenic collagen genes) contain more than two dystrophy orthologs. It is interesting to note that there is currently no identified sex chromosome in zebrafish. Indeed, dystrophin and emerin, genes that reside on the human X chromosome, are found on different chromosomes in zebrafish, and characterization of the zebrafish dystrophin mutant, sapje, has demonstrated an autosomal recessive inheritance pattern.
Figure 1

Distribution of zebrafish muscular dystrophy orthologs. Orthologs of the 25 muscular dystrophy-associated genes and 4 additional myopathy-associated genes were identified on 20 of the 25 zebrafish chromosomes by computer searches of the Sanger Centre Zebrafish Genome. Duplicate loci were found for FLNC and TTN orthologs. TRIM32 is located on an orphan scaffold that has not yet been integrated within the genome.

Genomic loci identified in the Sanger Centre Database frequently showed non-contiguous organization of transcript sequences, suggesting that the genome is not yet correctly organized in these regions. Thus, BAC clone locations were identified within the Sanger Centre Zebrafish Clone Database to allow rapid updating of dystrophy ortholog positions as the genome assembly continues to evolve. Where possible, clone data were also used to distinguish real duplications from duplications caused by genomic misalignment, by determining whether the associated clones overlapped. Using both zebrafish nucleotide and human protein sequences, at least partial BAC coverage was identified for 24 out of the 29 genes of interest.

Genome loci verification

Version 6 of the Sanger Centre Zebrafish Genome contains better sequence coverage of the dystrophy-associated genes than the previous version, due in large part to use of physical data to integrate the clone sequence and whole shotgun method sequence (data not shown). Genomic positions could be found for all 29 genes, and 19 of these genes show conservation with humans of syntenic relationships with at least one neighboring gene, including TRIM32. Though more complete than previous versions, Version 6 of the zebrafish genome is not yet entirely correct, since several transcripts appear split between distant genomic loci or have portions (usually corresponding to exons found singly on a BAC) multiply identified in close proximity on the genome. In particular, genes with repeat-rich or modular elements, like dystrophin and collagen 6A3, may be more difficult to align electronically, resulting in genomic sequences that do not agree with BAC sequence data. However, 24 of the transcripts were found with nearly complete coverage spanning one or more BAC clones which should provide better local sequence coverage until complete clone information has been incorporated into the genome assembly. To test these data, we compared the in silico identification of genomic loci with those previously identified. To date, only four dystrophy-associated orthologs have been physically localized in the zebrafish genome. By radiation hybrid (RH) mapping using the T51 RH panel, researchers have mapped dystrophin to Chr 1 [13,17] and delta sarcoglycan to Chr 21 [12]. Similarly, caveolin 3 has been localized to Chr 6 [14]. Finally, a BAC walk between two genetic markers on Chr 9 identified and positioned titin in the interval [18]. All four positions agree with the data presented here from Version 6 of the Sanger Centre Zebrafish Genome. 
In addition, genetic mapping using polymorphic microsatellites within and flanking zebrafish titin has confirmed the duplication of zebrafish titin that was found in silico (data not shown). To expand the set of genes for which we have physical position information, calpain-3 was located using radiation hybrid analysis with the T51 RH panel. In silico mapping places calpain-3 on Chr 17. However, RH mapping of calpain-3 places the orthologous zebrafish transcript on Chr 22 nearest to marker fa11a04.s1. To reconfirm computer-based findings, an independent analysis was performed and again returned Chr 17 as the most likely calpain-3 locus with synteny to neighboring genes. While BLAST analysis did identify other loci with some similarity to human calpains, none were located on Chr 22. These data, the appearance of genomic duplications of genes in whole or in part, and the identification of non-contiguous transcripts in the genome suggest that the current Sanger Centre Zebrafish Genome still contains regions of misassembly, especially where continuity and singleness of transcripts is confirmed within the clone database. Nonetheless, locations for a greatly increased number of gene orthologs could be identified in Version 6 of the genome as compared with Version 5, suggesting improvement of the genome assembly over time (data not shown). In combination with the 80% success rate in the 5 genes with physical mapping data, this suggests a strong correlation between rough physical gene location and the current genome assembly.

Gene duplications

Multiple distinct zebrafish transcripts were identified for each of four genes: Filamin C, emerin, dystrophin, and titin. For emerin, the two identified transcripts differ only by a single 7 basepair internal fragment, suggesting differential splicing, or mis-prediction of one of the transcripts. Both transcripts identified the same genomic and BAC locus, further suggesting a single gene locus. In the case of dystrophin, the multiple transcripts all appear to be partial sequences of the large dystrophin mRNA (> 14 kb in humans) [19], and position to Chr 1. Two putative zebrafish FLNC transcripts were identified in Genbank that position to different genomic loci. Of these, only the FLNC predicted-transcript XM_693754 contains an exon highly orthologous to human exon 47, the conserved FLNC-unique region in mammals. A second FLNC transcript, XM_687344, is only a partial transcript and does not contain this exon. However, comparison of human FLNC exon 47 with the zebrafish genome identified a second locus for this exon immediately following the genomic locus of XM_687344, suggesting that a full transcript sequence would identify this gene as FLNC. Filamin C (FLNC) appears to be a true duplication, with transcripts divergent at the nucleotide and protein levels. Titin, which codes for an enormous mRNA in humans (> 82 kb) [20], shows multiple transcripts due to its length, as well as an apparent gene duplication event. Duplicated loci were found in head-to-tail juxtaposition with genes divergent at both the nucleotide and protein levels. Due to the large number of titin transcripts, only two transcripts from each gene locus have been listed. Two additional genes, dysferlin and desmin, may also have genomic duplications, identified by multiple zebrafish genomic loci orthologous to the human protein sequences. While many of these loci were ruled out due to closer homology with other human proteins within the gene families, not all additional loci could be eliminated. 
Studies of the Hox gene clusters in fish suggest a full genome duplication event in ancestral teleost lineages after the divergence of ray-finned fish (from which zebrafish derive) and lobe-finned fish (from which mammals derive) [21]. Further comparative genomics studies report that at least 20% of gene duplicates have been maintained in zebrafish, often by divergence of regulation between the duplicate loci that imposes an evolutionary constraint on both genes [22]. Of the zebrafish gene orthologs in this study, however, we find that only two genes, titin and FLNC, show strong evidence of duplicate gene maintenance, with at most four gene duplications suggested by the genomic sequence (including dysferlin and desmin). Further, the juxtaposition of duplicate titin loci strongly suggests a tandem gene duplication event after the teleost ancestral genome duplication. Thus, at least one and at most three of the 29 genes studied (3–10%) show evolutionary maintenance of duplicate gene sequences from the whole genome duplication event, below the 20% previously reported [22]. Given the widespread distribution of these genes (Fig. 1), it is unlikely that the absence of dystrophy gene duplications is due to lack of duplication of a specific chromosomal region, or to secondary loss of a specific chromosomal region after polyploidization. It is also unlikely that the low number of dystrophy gene duplications in zebrafish is the result of an overall detrimental effect of duplicate copies of these genes, since paralogs of many dystrophy genes are found in both mammals and fish. While it is quite possible that not all existing duplicates were identified in this study, it is also possible that these genes may evolve more slowly, preventing divergence of duplicate loci that would subject both to evolutionary constraint.
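The 3–10% range quoted above follows directly from the gene counts in the text. The short sketch below reproduces that arithmetic; the function name is hypothetical and the only inputs are the counts stated in this section (1 to 3 maintained duplicates out of 29 genes, compared with the 20% rate reported in [22]).

```python
def percent_of_genes(count: int, total: int = 29) -> float:
    """Percentage of the genes studied that `count` genes represent."""
    return 100.0 * count / total


# Lower bound: one duplicate maintained since the whole genome duplication.
low = percent_of_genes(1)
# Upper bound: three duplicates (the titin duplication is tandem and excluded).
high = percent_of_genes(3)
# Number of genes that would be duplicated at the previously reported 20% rate.
expected_at_20_percent = 0.20 * 29

print(f"{low:.1f}% to {high:.1f}% observed; "
      f"{expected_at_20_percent:.1f} of 29 genes expected at 20%")
```

Rounded to whole percentages, the two bounds correspond to the 3% and 10% figures given in the text.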

Conclusion

To aid in the development of zebrafish as a suitable candidate for genetic screens for dystrophy-causing mutations and to create a genomic map of dystrophy-associated zebrafish genes, we searched existing zebrafish sequence databases to identify zebrafish orthologs of dystrophy-causing genes. Using Genbank and Sanger Centre databases, 28 out of 29 genes studied showed identifiable ortholog transcripts. These data suggest that zebrafish may express muscle genes orthologous to those previously shown in mammals to be required for normal muscle maintenance and/or regeneration. Genomic loci were also identified for all 29 genes (though one, TRIM32, is currently located on an orphan scaffold). Comparison of in silico ortholog mapping with published physical mapping confirms that the current genome and in silico techniques were able to identify correct chromosomal locations for at least 4 genes out of 5 genes with available positional information. Only 3–10% of dystrophy gene duplicates appear to have been maintained since the teleost genome duplication, fewer than other gene groups studied in fish, indicating that the dystrophy-related genes may be slow to evolve independent functions or regulation. These data should aid in the genetic mapping of zebrafish dystrophy mutants, creation of mutant lines for high-throughput testing of dystrophy therapies, and identification of novel dystrophy-causing genes.

Methods

Computer identification of orthologous zebrafish ESTs

For each gene (Table 1 and Table 2), human transcript sequences (starting with NM) and human protein sequences (starting with NP) were identified in NCBI databases [23]. Zebrafish ESTs orthologous to the human protein sequences were identified by BLAST into the NCBI zebrafish nr database using the tblastn algorithm. Responses were prioritized by percentage similarity and amount of coverage. Where more than one reasonable candidate EST was returned, all such ESTs were reciprocally compared with mammalian sequences in NCBI (nr database) using the tblastx algorithm to determine which one was most similar to the mammalian gene being studied. Sequences starting with NM representing known EST sequences were preferred. Predicted sequences (starting with XM) were used only when they showed high percentage similarity to mammalian sequences and when no other highly correlated zebrafish ESTs were returned. In some cases, zebrafish ortholog candidates could still not be distinguished and all such candidates were noted and pursued for the following identification steps. Where more than one human isoform is known for a given gene, all isoforms were independently queried against zebrafish databases as above. In no case, however, did different isoforms of a single human gene identify disparate zebrafish genes.
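The reciprocal comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it assumes BLAST results have already been parsed into simple tuples, and the transcript identifiers (`zf_tx_A`, `zf_tx_B`), the scores, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Each hit is (subject_id, percent_similarity, query_coverage).
# All values below are invented for illustration only.
Hit = Tuple[str, float, float]


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the subject with the highest (similarity, coverage) pair."""
    if not hits:
        return None
    return max(hits, key=lambda h: (h[1], h[2]))[0]


def reciprocal_orthologs(
    forward: Dict[str, List[Hit]],
    reverse: Dict[str, List[Hit]],
) -> Dict[str, Optional[str]]:
    """Keep a zebrafish candidate only if its best mammalian hit is the
    human gene that retrieved it in the forward search."""
    confirmed: Dict[str, Optional[str]] = {}
    for human_gene, hits in forward.items():
        confirmed[human_gene] = None
        # Try candidates from best to worst forward score.
        ranked = sorted(hits, key=lambda h: (h[1], h[2]), reverse=True)
        for candidate, _, _ in ranked:
            if best_hit(reverse.get(candidate, [])) == human_gene:
                confirmed[human_gene] = candidate
                break
    return confirmed


# Forward search: human protein against zebrafish transcripts.
forward_hits = {"DYSF": [("zf_tx_A", 71.0, 0.95), ("zf_tx_B", 74.0, 0.90)]}
# Reverse search: each zebrafish candidate against mammalian sequences.
reverse_hits = {
    "zf_tx_A": [("DYSF", 70.0, 0.94), ("MYOF", 55.0, 0.90)],
    "zf_tx_B": [("MYOF", 78.0, 0.92), ("DYSF", 60.0, 0.90)],
}
result = reciprocal_orthologs(forward_hits, reverse_hits)
print(result)
```

In this invented example the top forward hit is rejected because it matches myoferlin more closely on the reverse search, which mirrors the dysferlin case noted in Table 1.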

Computer identification of genomic location

Zebrafish ESTs were then compared with the current zebrafish genome assembly, Zv6, in the Wellcome Trust Sanger Institute databases as of April 2006. To identify genomic locations of zebrafish ortholog ESTs, the Ensembl blast program [24] was used with the blastn algorithm and "Near exact match" parameters. Returned hits were ranked by e value and assessed for transcript coverage. Note that percentage of sequence identity was typically > 95% over short stretches (likely corresponding to exons). Where more than one location had similar levels of coverage and sequence identity, all such locations are noted. To confirm genomic loci (or if no zebrafish EST was identified), human protein sequences were compared with the genome using the tblastn algorithm and the parameter "Allow some local mismatch". Multiple loci were frequently identified with the human protein, but could often be ruled out based on a closer orthology to other genes within a gene group (using the analysis methods described herein).
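Ranking hits by e value and then assessing transcript coverage can be illustrated as follows. This is a sketch only: it assumes each candidate locus has been reduced to an e value plus a list of aligned query intervals, and the locus names, coordinates, and function names are hypothetical rather than output from the Ensembl server.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive coordinates on the query


def covered_bases(hsps: List[Interval]) -> int:
    """Count query bases covered by at least one aligned segment,
    merging overlapping or directly adjacent segments first."""
    total = 0
    current_start, current_end = None, None
    for start, end in sorted(hsps):
        if current_end is None or start > current_end + 1:
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


def rank_loci(loci, query_length: int):
    """Sort loci by ascending e value, breaking ties by higher coverage."""
    scored = [
        (name, evalue, covered_bases(hsps) / query_length)
        for name, evalue, hsps in loci
    ]
    return sorted(scored, key=lambda row: (row[1], -row[2]))


# Invented candidate loci for a 1000 nt transcript.
candidates = [
    ("locus_2", 1e-50, [(1, 200)]),
    ("locus_3", 1e-5, [(1, 1000)]),
    ("locus_1", 1e-50, [(1, 300), (250, 600)]),
]
ranking = rank_loci(candidates, query_length=1000)
for name, evalue, coverage in ranking:
    print(name, evalue, f"{coverage:.0%}")
```

Merging intervals before counting avoids double-counting exons that are reported in overlapping alignments, which matters when a transcript appears split or partially duplicated in the assembly.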

Computer identification of position on sequenced clones

Because the genome assembly is still not complete and certain regions may be misassembled, we also identified the clone locations of zebrafish genes where possible. Zebrafish ESTs were compared with finished and unfinished clone sequences using the Sanger D. rerio Blast Server [25]. The blastn algorithm was used with a filter for low-complexity regions and RepeatMasker to mask short repeat sequences. Returned sequences were ordered by E-value and analyzed for coverage and for exon breaks corresponding to those seen in genomic locations. All finished clone sequences with complete coverage are listed. Unfinished (incompletely sequenced) clones are noted only where there was no reasonable alignment with a finished clone. Where a gene spans more than one clone, the clones are listed joined by plus signs. Again, loci were confirmed by homology to human protein sequences using the tblastn algorithm (without RepeatMasker).
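The clone-listing convention can be illustrated with a small interval-merging sketch. It assumes each clone alignment is reduced to 1-based inclusive intervals on the EST, treats "complete coverage" as every base of the transcript being aligned, and joins all supplied clones with plus signs when no single clone suffices. The clone names and function names are hypothetical, and the strict coverage criterion is an assumption for the example.

```python
def merged_length(intervals: list[tuple[int, int]]) -> int:
    """Total number of transcript positions covered by the given intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def clone_labels(
    alignments: dict[str, list[tuple[int, int]]], transcript_len: int
) -> list[str]:
    """List single clones with complete coverage, else a '+'-joined set."""
    full = sorted(
        clone
        for clone, ivs in alignments.items()
        if merged_length(ivs) == transcript_len
    )
    if full:
        return full
    combined = [iv for ivs in alignments.values() for iv in ivs]
    if merged_length(combined) == transcript_len:
        return ["+".join(sorted(alignments))]
    return []


# Hypothetical alignments of a 1000-base EST to two overlapping clones.
spanning = clone_labels({"cloneA": [(1, 600)], "cloneB": [(550, 1000)]}, 1000)
```

A fuller version would select the smallest set of clones that covers the transcript rather than joining every clone with a hit.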

Determination of synteny

Neighboring genes and their orientations with respect to the human gene of interest were determined using NCBI Entrez GeneView. In many cases, the closest neighbors were predicted genes or non-coding RNAs. Non-coding RNAs were not used. Some predicted genes did retain syntenic relationships and are listed. Where neighboring predicted genes were not found in the zebrafish genome, the closest known coding gene was used instead. Genomic loci for neighboring genes were determined as above, using the human protein sequence and the tblastn algorithm against either the zebrafish genome or the clone database.
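One way to express the synteny check programmatically is shown below. The sketch assumes that the zebrafish locus of the gene of interest and the loci of its human neighbors' orthologs are already known, and it counts a neighbor as syntenic when it lies on the same chromosome within a fixed distance. The distance threshold, the `Locus` record, and the gene labels are hypothetical; the text does not state a numeric criterion for conserved synteny.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    chromosome: str
    position: int  # approximate position in base pairs


def conserved_neighbors(
    gene: Locus, neighbors: dict[str, Locus], max_distance: int = 1_000_000
) -> list[str]:
    """Names of human-neighbor orthologs located near the zebrafish gene.

    max_distance is an illustrative threshold, not a value from the paper.
    """
    return sorted(
        name
        for name, locus in neighbors.items()
        if locus.chromosome == gene.chromosome
        and abs(locus.position - gene.position) <= max_distance
    )


# Hypothetical zebrafish loci for a gene of interest and three neighbors.
gene_locus = Locus("chr5", 12_000_000)
neighbor_loci = {
    "neighbor_1": Locus("chr5", 12_400_000),
    "neighbor_2": Locus("chr5", 30_000_000),
    "neighbor_3": Locus("chr9", 12_100_000),
}
syntenic = conserved_neighbors(gene_locus, neighbor_loci)
```

Relative gene orientation, which the text also records, is left out of this sketch for brevity.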

Radiation hybrid mapping

Primers were designed to calpain-3 sequence NM_001004571 and used in PCR reactions with the zebrafish T51 radiation hybrid panel as previously described [26,27]. SAMapper was used to obtain LOD scores and map distances to known zebrafish markers [28]. Primers used were:

CAPN3 (Forward): 5'-CACTAGTGTCACAGGCAGCGTTTC-3'
CAPN3 (Reverse): 5'-GTTGCCGTCCATCATGAGCTTTGAG-3'

Authors' contributions

LSS identified ortholog sequences and genomic map locations and drafted the manuscript. JRG and LMK conceived of the project and assisted in drafting the manuscript. JRG also performed initial sequence searches. EDV and TJP participated in initial sequence searches. RB and YZ performed the radiation hybrid mapping of calpain-3 and assisted in project design. All authors read and approved the final manuscript.
  25 in total

1.  Small molecules that delay S phase suppress a zebrafish bmyb mutant.

Authors:  Howard M Stern; Ryan D Murphey; Jennifer L Shepard; James F Amatruda; Christian T Straub; Kathleen L Pfaff; Gerhard Weber; John A Tallarico; Randall W King; Leonard I Zon
Journal:  Nat Chem Biol       Date:  2005-12       Impact factor: 15.040

2.  Essential and overlapping roles for laminin alpha chains in notochord and blood vessel formation.

Authors:  Steven M Pollard; Michael J Parsons; Makoto Kamei; Ross N W Kettleborough; Kevin A Thomas; Van N Pham; Moon-Kyoung Bae; Annabelle Scott; Brant M Weinstein; Derek L Stemple
Journal:  Dev Biol       Date:  2006-01-01       Impact factor: 3.582

3.  Neuromuscular disorders: gene location.

Authors: 
Journal:  Neuromuscul Disord       Date:  2006-01       Impact factor: 4.296

Review 4.  Construction and characterization of zebrafish whole genome radiation hybrids.

Authors:  C Kwok; R Critcher; K Schmitt
Journal:  Methods Cell Biol       Date:  1999       Impact factor: 1.441

5.  An STS-based radiation hybrid map of the human genome.

Authors:  E A Stewart; K B McKusick; A Aggarwal; E Bajorek; S Brady; A Chu; N Fang; D Hadley; M Harris; S Hussain; R Lee; A Maratukulam; K O'Connor; S Perkins; M Piercy; F Qin; T Reif; C Sanders; X She; W L Sun; P Tabar; S Voyticky; S Cowles; J B Fan; C Mader; J Quackenbush; R M Myers; D R Cox
Journal:  Genome Res       Date:  1997-05       Impact factor: 9.043

6.  The use of zebrafish for assessing ototoxic and otoprotective agents.

Authors:  Christopher Ton; Chuenlei Parng
Journal:  Hear Res       Date:  2005-07-12       Impact factor: 3.208

7.  Zebrafish as a model for caveolin-associated muscle disease; caveolin-3 is required for myofibril organization and muscle cell patterning.

Authors:  Susan J Nixon; Jeremy Wegner; Charles Ferguson; Pierre-François Méry; John F Hancock; Peter D Currie; Brian Key; Monte Westerfield; Robert G Parton
Journal:  Hum Mol Genet       Date:  2005-05-11       Impact factor: 6.150

8.  Homozygosity mapping with SNP arrays identifies TRIM32, an E3 ubiquitin ligase, as a Bardet-Biedl syndrome gene (BBS11).

Authors:  Annie P Chiang; John S Beck; Hsan-Jan Yen; Marwan K Tayeh; Todd E Scheetz; Ruth E Swiderski; Darryl Y Nishimura; Terry A Braun; Kwang-Youn A Kim; Jian Huang; Khalil Elbedour; Rivka Carmi; Diane C Slusarski; Thomas L Casavant; Edwin M Stone; Val C Sheffield
Journal:  Proc Natl Acad Sci U S A       Date:  2006-04-10       Impact factor: 11.205

9.  Zebrafish hox clusters and vertebrate genome evolution.

Authors:  A Amores; A Force; Y L Yan; L Joly; C Amemiya; A Fritz; R K Ho; J Langeland; V Prince; Y L Wang; M Westerfield; M Ekker; J H Postlethwait
Journal:  Science       Date:  1998-11-27       Impact factor: 47.728

10.  Delta-sarcoglycan is required for early zebrafish muscle organization.

Authors:  Jeffrey R Guyon; Alycia N Mosley; Susan J Jun; Federica Montanaro; Leta S Steffen; Yi Zhou; Vincenzo Nigro; Len I Zon; Louis M Kunkel
Journal:  Exp Cell Res       Date:  2004-12-08       Impact factor: 3.905
