Literature DB >> 26114548

A Comparative Transcriptome Analysis between Wild and Albino Yellow Catfish (Pelteobagrus fulvidraco).

Ming Zou1, Xiaoting Zhang2, Zechao Shi3, Li Lin4, Gang Ouyang5, Guirong Zhang1, Huan Zheng2, Kaijian Wei1, Wei Ji4.   

Abstract

Body colours are important and striking features for individual survival and reproductive success, in particular in vertebrates where mating behaviour and mate preference may be strongly influenced by non-normal phenotypes. Pigmentation disorders may be generated by disruption of one or many independent genes as well as by environmental factors. The first discovery of albino yellow catfish (Pelteobagrus fulvidraco Richardson) with golden skin colour from fish farms in China provides us valuable material to study the molecular mechanism underlying the abnormalities of pigmentation. In this study, transcriptome sequencing of fin tissues corresponding to the distinct body colours, wild type and mutant albino yellow catfish, were performed using Illumina sequencing technology. Based on next-generation sequencing technology and de novo assembly, we generated a transcriptome of P. fulvidraco. A number of genes differentially expressed between the wild types and albinos were identified, suggesting their contribution to the different phenotypes and fitness. However, non-synonymous mutations result from single nucleotide substitutions residing in coding regions may not contribute to such differences. Based on the high-throughput expression data generated for the two different types of P. fulvidraco, we found that alterations of expression pattern may be more common than non-synonymous mutations. The transcriptome of P. fulvidraco will be an invaluable resource for subsequent comparative genomics and evolutionary analyses of this economically important fish.

Entities:  

Mesh:

Substances:

Year:  2015        PMID: 26114548      PMCID: PMC4482592          DOI: 10.1371/journal.pone.0131504

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Body colours and colour patterns are significant and striking phenotypic traits in vertebrates and have important functions such as contributing to mate selection, species recognition, crypsis, and warning or threatening of predators. Disorders of pigmentation can cause variation in body colour, among which albinism was reported as an inborn error of metabolism proposed by Garrod [1]. Albinism is characterized by a series of genetic abnormalities, which is caused by a deficiency of melanocytes or the malfunction of melanin production [2]. Albino individuals occur in a variety of vertebrates including mammals [3], reptiles [4], amphibians [5], birds [6] and fishes [7] and are usually accompanied by a series of pathology changes, such as anemia, osteopetrosis,reduced eyes,deafness,neurological and skeletal defects [8-10]. Albino individuals typically experience much higher levels of selection (i.e., phenotype-specific mortality and/or reduced reproductive fitness) in the natural environment than normal individuals and as a consequence albinos in the wild are usually rare. Pigment cells of vertebrates are derived from the dorsal neural crest during embryonic development. So far, at least seven types of pigment cells have been reported, including melanocytes (black or dark brown), xanthophores (yellow), erythrophores (red/orange), iridophores (reflective silver, blue, or gold), leucophores (white), cyanophores (blue) and irido-erythrophores (violet) [11, 12]. Only bony fishes possess all seven types of pigment cells [11]. These cells are concerned because of their medical importance since a number of human diseases result from abnormal melanocyte development, such as piebaldism, vitiligo, albinism, and abnormal growth and dispersal of melanoma. And the studies of colour pattern formation, neural crest development, and the cell biology and biochemistry of melanin formation and distribution in animals are also performed [13]. At present, over 100 mutations associated with albinism have been identified and described in mice and humans (see albinism database: http://www.ifpcs.org/albinism/). Colour mutations are also a common phenomenon in fish, especially in tropical ornamental fish, such as in group of Cichlid fish [7], paradise fish (Macropodus percularis), swordtail (Xiphophorus helleri) and goldfish (Carassius auratus) [14]. Albinism has also been observed in medaka (Oryzias zatipes), as well as in grass carp (Ctenopharyngodon idella) [15], rainbow trout (Oncorhynchus mykiss) [16], Japanese flounder (Paralichthys olicaceus) [17], black scabbard-fish (Aphanopus carbo) [9] and the channel catfish [18]. In most of these fishes, albinism is inherited either in an autosomal recessive manner or in an incomplete dominance manner [15, 19, 20]. Due to its biological and medical significance, the molecular mechanism of melanin biosynthesis has been extensively studied and the conserved melanin formation pathway has been elucidated in vertebrates [21-23]. Melanin is produced in the melanosome of melanocytes, which originate from melanoblast precursors. These precursor cells were developed from the dorsal neural tube during embryonic development, and migrate to their targeted destinations along defined pathways where they mature into functional melanocytes [1]. These processes are strictly controlled by many genes, among which tyrosinase (TYR) is the most important melanogenesis-related enzyme involved in the tyrosine metabolism pathway. Tyrosinase can catalyse the conversion of L-tyrosine to L-DOPA and regulate both the speed and specificity of melanogenesis [24]. In addition, tyrosinase-related protein 1 (TYRP1) and dopachrome tautomerase (DCT or TYRP2) are also important enzymes in melanin synthesis [25] and mainly regulate the eumelanin pathway. Recently, the slc7a11 (encoding solute carrier family 7 member 11) gene has been reported to be a critical genetic regulator for pheomelanin synthesis in hair and melanocytes [26]. Pigmentation in coat colour mutations have been studied over 100 years, while most studies focused on the mammals and birds [3, 6, 27]. Limited studies were reported about fish colour mutations, of which most reports were about colour gene cloning and expression. In recent years, there were sporadic reports of genomic analysis, which are performed to determine the complete set of duplicated pigmentation genes in fish to better understand how pigment synthesis pathways have been affected by the FSGD (fish specific whole-genome duplication) [27]. Xu et al. not only presented a draft genome of domesticated common carp, but also performed transcriptomic analyses between two distinct domesticated strains with body colour mutation (Songpu and Hebao), which provided valuable resources for genetic breeding and molecular studies of pigmentation formation in common carp [26]. Wang et al. also compared the skin transcriptome between two oujiang color varieties of common carp [28]. More studies of different fish species with colour mutations are needed to better understand the molecular mechanism underlying the pigmentation in teleosts. Yellow catfish, a teleost belonging to the family Bagridae, is one of the most important economic freshwater species in China due to its excellent meat quality. This species is extensively cultured in East and South Asia in semi-intensive and intensive systems [29]. In 2012, total production of yellow catfish reached 256,650 tonnes in China [30]. During the breeding of yellow catfish, we found a small number of fish with golden-pink eyes and golden colour all over the body (Fig 1). We consider this colour variation to result from an absence of melanin pigmentation, just like albinism reported in other species. Given the complexity of colour variation and albinism, an understanding of how melanin pigmentation is regulated has a long way to go. Transcriptome analysis of yellow catfish with golden colour will help us to better understand the mechanism for pigment biosynthesis. In this study, we sequenced the transcriptome of the normal and the newly discovered albino yellow catfish strain (thereafter called wild type and albino type, respectively) using Illumina sequencing technology. We report the differentially expressed genes (DEs) and mutations that may contribute to the phenotypic differences between these two strains, which may provide genetic information for fish colour variation.
Fig 1

Wild and albino yellow catfish.

Materials and Methods

Ethics statements

This study was approved by the Institutional Animal Care and Use Committees (IACUC) of Huazhong Agricultural University. No specific permits were required for the field studies described here. The study area is not privately-owned or protected in any way, and the field studies did not involve endangered or protected species.

2.1 Sampling

Wild type and albino yellow catfish for one year old (30 individuals for each type, ~20g each) were collected from a fish pond in Jingzhou, Hubei Province, and were transported to Huazhong Agricultural University. The yellow catfish were acclimated in an indoor tank with sufficient oxygen and fed daily at 9:00 am for 2 weeks before the experiment.

2.2 RNA extraction, cDNA library creation and Illumina sequencing

In order to identify potential genes subject to single-base substitution mutation and differential expression that may result in different phenotypes between wild types and albinos, we extracted and sequenced the transcriptome of both strains. For each type, we extracted and mixed equal quantity of total RNA extracted from anal fins from two individuals, and the same procedure was repeated one more time but using ten individuals instead to increase the reliability of the data. Because sequence data from public databases for P. fulvidraco is scarce, we extracted and mixed equal quantities of total RNA from skin, liver, spleen, kidney, head-kidney, and muscle of one individual for each strain to obtain coding sequences for this species. All individuals were euthanized with 300 mg/L tricaine methanesulfonate (MS 222) before dissection and total RNA was extracted using Trizol reagent (Invitrogen, USA) according to the manufacturer’s instructions. The RNA purity and integrity were evaluated by the ratio of OD260/OD280 and RIN value, respectively. RNA samples with OD260/OD280 ratio > 1.7 and RIN value > 7.5 were selected for deep sequencing. For transcriptome sequencing using Illumina’s sequencing technology, six cDNA libraries with insert sizes of 300–500 base pairs (bp) were prepared (Table 1). All libraries were sequenced using Illumina (San Diego, CA, USA) HiSeq 2000 platforms according to the manufacturer’s protocols with a read length of 100 bp. It should be noted that libraries of mixed total RNA from anal fins of ten individuals for each type (samples 5 and 6 in Table 1) were sequenced twice, once as paired-end sequence and once as single-end.
Table 1

Profiles of each library for the two strains subject to deep sequencing.

SampleStrainsNumber of individualsTissueReads typeNumber of clean reads1Number of clean reads2SRA accession number
1wild1skin, liver, spleen, kidney, head-kidney, and musclepaired-end23,206,40623,006,146SRR1630913
2albino1skin, liver, spleen, kidney, head-kidney, and musclepaired-end19,813,29919,632,145SRR1630912
3wild2anal finpaired-end19,920,42619,732,153SRR1630911
4albino2anal finpaired-end24,111,88623,910,559SRR1630910
5wild10anal finpaired-end25,564,94025,462,585SRR1630906
single-end23,880,719-SRR1630902
6albino10anal finpaired-end22,479,47522,388,185SRR1630909
single-end19,996,564-SRR1630908

Clean reads are clipped and trimmed reads.

Clean reads are clipped and trimmed reads.

2.3 Data filtering, de novo assembly and annotation

Quality controls for original reads of each library were mainly processed using the fastx-toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). First, sequencing adapters were clipped using fastx_clipper. Then fastq_quality_trimmer was used to trim the remaining data and clipped reads < 20 bp or nucleotides with phred quality scores lower than 20. Subsequently, fastx_quality_stats was used to summarize each library and GALAXY (http://galaxyproject.org/) was used to draw quality score boxplots and nucleotide distribution charts to evaluate the sequencing quality. We combined all reads1 and reads2 separately and assembled them into contigs using Trinity software [31]. To reduce redundancy and to preserve as many coding sequences as possible, we extracted open reading frames for each contig using TransDecoder (http://sourceforge.net/projects/transdecoder/) and clustered identical proteins using CD-HIT [32]. The corresponding nucleotide sequences were retained to generate a non-redundant transcriptome that was used for subsequent analyses. To evaluate the assembly, we compared the non-redundant transcriptome to protein sequence data deposited in UniProt (http://www.uniprot.org/) using BLASTX [33]. We employed an E value of 10−20, and calculated the coverage between the transcript and its top hit protein. The analyses were performed for Swiss-Prot and TrEMBL, separately. Functional annotations of the assembled sequences were achieved following the SFG pipeline [34]. Briefly, each sequence of the non-redundant transcriptome was first translated into amino acids in six frames and searched against GenBank's non-redundant protein database (NR), UniProt Swiss-Prot, and TrEMBL protein databases using local BLASTX with an E value of 10−4. The proteins with the highest sequence similarity were retrieved and annotated to each contig. Gene ontology (GO) terms [35] were obtained after parsing the corresponding UniProt flat files for each contig, and WEGO software [36] was then used to produce a GO functional classification for all the contigs to interpret the distribution of the species’ gene functions.

2.4 Differential expression analysis

Paired-end reads generated from mixed total RNA of anal fin for each sample (samples 3, 4, 5, and 6 in Table 1) were used to identify genes probably subject to differential expression between wild types and albinos. For each sample, paired-end reads were mapped back to the non-redundant transcriptome using bowtie [37]. The program rsem-calculate-expression within the software package RSEM [38] was used to estimate the expression levels for each sequence using the expectation—maximization (EM) algorithm. The expression profiles generated for each sample were combined and were used to detect differentially expressed genes between the two strains using edgeR [39] and EBSeq [40]. For edgeR, only those contigs with at least ten counts across samples were tested for diff-expression. For both tests, contigs with fold change value > 2 and the adjusted P value < 0.05 according to the False Discovery rate method of Benjamini and Hochberg [41] were selected as differentially expressed candidates.

2.5 Mutation sites analysis

All reads generated by this work were used to identify mutation sites by comparing the albino strain with the wild type strain. For each sample, all reads were mapped back to the non-redundant transcriptome using bwa (-n 0.005-k 5) [42], and all duplicate reads were removed using the MarkDuplicates program from the software package Picard (http://broadinstittute.github.io/picard). Then we realigned the alignments around indels using the Genome Analysis Toolkit (GATK, https://www.broadinstitute.org/gatk), and reads with minimum mapping quality < 20 were removed using samtools [43]. The processed alignments for all samples were combined using samtools and were used to generate a synchronized file using the program mpileup2sync included in the software package PoPoolation2 [44]. The synchronized file listed allele frequencies for every population at every base in the non-redundant transcriptome in a concise format. Sites with base frequencies more than five in every wild type samples but absent in every albino samples were identified as albefaction-related mutations and were compared with the predicted transcript structure to decide if they were located in five prime UTR (untranslated region), three prime UTR, or coding regions.

2.6 Pathway enrichment analysis

Contigs with mutation sites that changed the encoded amino acids or were identified as differentially expressed candidates were separately subject to pathway enrichment analysis using KOBAS2.0 [45] with Danio rerio (zebrafish) as the reference species.

2.7 Colour genes analysis

Using the reciprocal best blast hit method (RBB), orthologs between D. rerio and P. fulvidraco were identified. Briefly, the proteome of D. rerio was downloaded from Ensembl (http://www.ensembl.org/index.html, version 75), and RBB was performed between proteomes of the two species using blastp with the E value set to 1e-9. We also required that ortholog pairs overlapped more than 100 amino acids and more than 60% of them were identical. After that, colour-related genes in P. fulvidraco were extracted according to gene names (http://www.espcr.org/micemut/) and patterns of their expressions and mutations were scrutinized.

2.8 Experimental verification

In order to validate the RNA-Seq results, quantitative realtime PCR (qRT-PCR) was performed on the Roche LightCycler 480 Real-Time PCR System with three replicates using the SYBR Green PCR kit (Roche). The samples used for qRT-PCR were different from the samples used for RNA-Seq, and were collected from the anal fins of five new individuals for each strain. The gene-specific primers were designed based on the differentially expressed genes. The β-actin was used as an internal reference. The relative expression levels of DE genes were normorlized to β-actin (2-ΔΔCt), and all data were expressed as means±SEM. Statistical analysis was performed using GraphPad Prism 5.0 (unpaired t test). The gene and primer information is shown in Table 2.
Table 2

List of primers for the qRT-PCR validation of DEs identified by RNA-Seq.

GenePrimerssequences(5'to3')Annealing temperature(°C)Product Size(bps)Contig name
tyr Tyr-1278-F CTACATGGCACCCTTCATACC 6093comp71391_c1_seq1
Tyr-1370-R AGGGTCTTGAAGGTAGGAGTA
ednrbla ednrbla-F1 CTGTGGAAGTAACCCTCATCTG 58166comp76216_c0_seq17
ednrbla-R1 CAGCCACCAGTCCTTTATATCC
tyrp1 TYRP1-RT-F2 GAATCCAGCAGGGAACGTGA 58145comp70296_c0_seq1
TYRP1-RT-R2 GCCCTCGATGGTGTTTCTGA
slc7a11 slc7a11-RT-F1 GGAATCTTCATCGCACCC 58115comp77711_c1_seq25
slc7a11-RT-R1 CAGCGTAGGACAAGGCAC
Melar-5Melar-5-RT-F1 TATGTGCTTCATGTCCCACT 58116comp52331_c0_seq1
Melar-5-RT-R1 TCTCCTTGAGCGTCTTGC
CD8aCD8a-RT-F1 GGACAGCGGCACTTACAG 58287comp73354_c0_seq23
CD8a-RT-R1 TGACCAGAAGCAGGAGCA
GIMAP7GIMAP7-RT-F1 GGATTGTGCTGGTGGGAAAG 58289comp77845_c0_seq21
GIMAP7-RT-R1 TTGGTGAATCTTCCGGGTTGT
UtrophinUtrophin-RT-F1 GTGCAACTTTCGTCCACCAC 58266comp80248_c1_seq8
Utrophin-RT-R1 AGGGAATGGTAGAGCTCGGT
LipoproteinLipop-R-RT-F1 ATGGTGGATTGCGTCATA 58249comp71875_c1_seq1
receptorLipop-R-RT-R1 TCACTTGGTTCCCTTCTT
ETFSEFTS-RT-F1 CAAGACTACGGGCTACAC 58204comp69376_c0_seq2
EFTS-RT-R1 TTCACCTCCACCATAACA
CD99CD99-RT-F1 CAAGACTACGGGCTACAC 58212comp75592_c3_seq4
CD99-RT-R1 TCTCACAGTTCACCTCCA
HSP70HSP70-RT-F1 AAACTGGCTGTATGAGGAT 58101comp76788_c1_seq2
HSP70-RT-R1 TGTATCGGTCTTGAATGG
IRF-7IRF7-RT-F1 CTCAGGCATCTACGGCTA 58114comp80561_c1_seq7
IRF7-RT-R1 AGTTCATTCTGGCTTTGG
MYCBP2MYCBP2-F1 CATCAGGAACCGAAAGGAGAA 58150comp81235_c1_seq1
MYCBP2-R1 TGGAGACCAGGCATTGTAATAG
comp73484_c0_seq4LOC-RT-F1 GCACAGGCGTCTTTCTCAAC 58121comp73484_c0_seq4
LOC-RT-R1 AGTCTCAGCAATAGCAGCCG
NeoverrucotoxinNeo-RT-F1 TGGAGACCAGGCCTGTGATA 58154comp73484_c0_seq5
Neo-RT-R1 ATCCCCCGTCATGTCTCTGA
TRIMP-29TRIMP-RT-F2 TGCTGCTCCGATTCTTGTCC 58188comp73484_c0_seq10
TRIMP-RT-R2 GCTTAGGTCCAGTTGCGACA
comp73484_c0_seq7comp–RT-F2 GAGGGTTGGCAGTCCATTCA 58159comp73484_c0_seq7
comp–RT-R2 GGACCATACGCGAACAGGAA
CalpastatinCalpas-l-RT-F2 ACGACTCACTTCCTCCCG 58237comp73354_c0_seq5
Calpas-l-RT-R2 CCAGTTGGTCCATTACCG
PtprcPtprc-RT-F1 GACAGTGCGGACACAGAGAA 58287comp81041_c0_seq19
Ptprc-RT-R1 GCTCCCTCTTCAGGCGATAC

Results

3.1 Sequencing, assembly and annotation

The number of clean reads (clipped and trimmed reads) obtained was > 20 M from most libraries (Table 1). Integrating these data and using a de novo assembly algorithm, we obtained 310,143 contigs > 200 bp. According to the TransDecoder predictions, there were 106,201 contigs encoding 144,487 proteins longer than 100 amino acids. Identical predicted protein sequences were clustered and the corresponding nucleotide sequences were extracted to form a non-redundant transcriptome, which included 64,398 sequences with the total length reached more than 135 Mb. The lengths of most contigs were < 1,000 bp, but there were also many contigs > 5,000 bp (Fig 2). The N50 contig length, median contig length, and average contig length of the non-redundant transcriptome were 3,087 bp, 1,604 bp and 2,098 bp, respectively.
Fig 2

Contig length distribution of the P. fulvidraco transcriptome.

By comparing with protein sequences deposited in NR, Swiss-Prot and TrEMBL, we annotated 61,184 (95%), 54,495(84.6%), and 59,314 (92.1%) sequences within the non-redundant transcriptome of the yellow catfish, respectively. Table 3 shows summary of the coverage of predicted proteins overlapping the homologous proteins deposit in Swiss-Prot and TrEMBL. GO terms were assigned to 55,032 sequences according to their homologs annotated by UniProt. Within these sequences, about 89.6% (49,318) were assigned to cellular component terms, 82.6% (45,473) were assigned to molecular function terms, and 80% (44,003) were assigned to biological process terms. Most represented GO terms in cellular components, molecular functions and biological processes were “cell” and “cell part”, “binding” and “cellular process”, respectively (Fig 3).
Table 3

The coverage of predicted proteins compared with the best hit homologs available in UniProt Swiss-Prot and TrEMBL.

Coverage (%)Number of proteins deposit in Swiss-ProtNumber of proteins deposit in TrEMBL
90 ~ 1008,66013,078
80 ~ 902,0492,442
70 ~ 801,5042,024
60 ~ 701,3592,105
50 ~ 601,3932,182
40 ~ 501,4152,297
30 ~ 401,5362,329
20 ~301,6292,378
10 ~ 201,2991,828
0 ~ 1000
Fig 3

The distribution of GO terms assigned to the P. fulvidraco transcriptome.

3.2 Differentially expressed genes (thereafter called DEs)

Using edgeR, we detected 27 genes that were probably DEs between the wild type and albino strains (|logFC| > 1 and adjusted P value < 0.05), of which 17 (63%) were up-regulated in wild types and 10 (37%) were up-regulated in albino types. While using EBSeq, the number of DEs between wild types and albinos reached 813 (PostFC < 0.5 or > 2 and adjusted PPDE < 0.05), of which 286 (35.2%) were up-regulated in wild types and the remaining 527 (64.8%) were up-regulated in albino types. All contigs detected by edgeR except for contig comp72223_c2_seq1, which was annotated as “high mobility group protein B2”, were also detected by EBSeq. Summaries of the DEs are shown in S1 and S2 Tables compares results generated by edgeR and EBSeq for all DEs identified in this report. Summaries of DEs with adjusted P values less than 0.001 according to both of the two methods, which should be regarded as definitely DEs, are shown in Table 4. According to the pathway enrichment analyses, the DEs identified by EBSeq may participate in a number of KEGG pathways such as “Mismatch repair”, “ECM-receptor interaction”, etc (S3 Table).
Table 4

Summaries of differentially expressed genes (adjusted P value or PPDE < 0.001) between wild and albino types according to both edgeR and EBSeq.

ContigNamePPDEPostFCFDR1logFCPValueFDR2UniprotmatchevalueDescription
comp80248_c1_seq810.0014540-12.6892.47E-111.15E-06P469390Utrophin
comp73484_c0_seq41678.8816012.609867.67E-111.32E-06E7FBQ27.00E-110Uncharacterized protein
comp73484_c0_seq51622.9236012.485358.46E-111.32E-06A0ZSK32.00E-39Neoverrucotoxin subunit alpha
comp73354_c0_seq3810.0016670-12.42785.90E-106.88E-06C6KE062.00E-58CD8 alpha
comp73484_c0_seq101444.998011.999362.38E-092.22E-05Q8R2Q09.00E-07Tripartite motif-containing protein 29
comp73484_c0_seq71538.12011.930334.23E-093.29E-05E7FBQ22.00E-136Uncharacterized protein
comp73354_c0_seq431627.7454012.495471.30E-088.68E-05Q2HYK02.00E-69Calpastatin-like protein
comp73354_c0_seq2310.00260-11.79521.61E-089.40E-05C6KE062.00E-58CD8 alpha
comp81235_c1_seq11388.2891011.8011.19E-070.000615O755920E3 ubiquitin-protein ligase MYCBP2
comp71875_c1_seq110.0028160-11.66591.99E-070.000821Q288320Low-density lipoprotein receptor
comp73354_c0_seq51271.256409.404462.09E-070.000821Q2HYK03.00E-69Calpastatin-like protein
comp81041_c0_seq191356.9776011.679112.11E-070.000821Q9IBD80Receptor-type tyrosine-protein phosphatase C
comp77845_c0_seq2110.0022460-11.55982.51E-070.000902Q8NHV15.00E-34GTPase IMAP family member 7

PPDE posterior probability of differential expression

FC fold change

FDR1 adjusted P value

FDR2 adjusted PPDE

PPDE posterior probability of differential expression FC fold change FDR1 adjusted P value FDR2 adjusted PPDE

3.3 Mutation analyses

We detected 806 sites resident in 616 genes that possessed one or more nucleotides present in all wild types but absent in all albinos. That is to say, we assume that single-base substitution mutations at these sites may disable the normal function of the gene and result in albinism. However, none of these sites was homozygote in all samples of the wild type; summaries of these sites are shown in S4 Table. According to the analytical predictions, 572 (71%) mutation sites within 455 genes were probably located in coding regions, and 70 (8.7%) mutation sites within 54 genes were probably located in the five prime UTR, with 255 (31.6%) mutation sites being located within 203 genes that were probably located in three prime UTR. Amino acids were changed by 129 single-base substitution mutations that were possibly located in coding regions. The results of pathway enrichment analyses for the genes where these sites reside in are shown in S3 Table, and the most represented GO terms assigned to these genes were “integral component of membrane”, “cytoplasm” and “protein binding” (S5 Table). Interestingly, we found one of the differentially expressed candidates between normal and albino fishes (serial number: comp81166_c0_seq1, FC > 15, S2 Table), which was annotated as “MHC class I antigen”, harboring one nonsynonymous site in the coding region according to the mutation analyses. We also detected 2 deletions in 2 contigs that were homozygote in albino types, but they were both located in the three prime UTR, which will not result in a frameshift or premature stop codons that may disable the protein.

3.4 Colour-related genes and experimental verification

We assigned D. rerio genes as orthologs for 11,978 P. fulvidraco contigs using the RBB method, within which 140 contigs were identified as differentially expressed genes using edgeR and/or EBSeq (S2 Table). However, this number should be increased because of the long time divergence between the orders Cypriniformes and Siluriformes (about 250 Mya) [46] and the stringent parameters we used for identifying orthologs. Five colour genes, namely tyr, tyrp1, slc7a11, MREG, and ednrb1a, were identified as up-regulated candidates in albinos according to EBSeq. According to edgeR, the expression pattern was similar, but the adjusted P value was not significant. Mutations of these genes will result in disorders of pigmentation such as albinism (http://www.espcr.org/micemut/). Quantitative RT-PCR experiments were performed to confirm the expression patterns of the four color-related genes since their up-regulation in albino strains compared to wild strains is a bit counter-intuitive. Four genes, tyr, tyrp1, slc7a11 and ednrb1a, were successfully amplified and their expression patterns were in accord with the results obtained using high-throughput data. Moreover, no single-base substitution mutations were found in colour-related genes. To further validate the DE genes detected by RNA-Seq, another sixteen candidates (twelve were from Table 4 and another 4 were randomly selected from S2 Table) were selected and qRT-PCR amplification confirmed the expression patterns of 13 genes. For the other three DE candidates (lipoprotein receptor, Utrophin and Ptprc), no significant difference was detected (Fig 4).
Fig 4

The results of qRT-PCR for the DEs identified by RNA-Seq and subject to validation.

Discussion

In this report, a transcriptome represented most genes expressed in P. fulvidraco was generated and annotated based on the sequence homology to proteins deposit in public databases. The coverage of predicted proteins showed that 14,965 (23.2%) sequences overlapping for more than 50% of the length of homologous proteins deposit in Swiss-Prot, of which 10,709 (16.6%) sequences covered more than 80% (Table 3). Compared with homologous proteins in TrEMBL, the number of sequences overlapping for more than 50% of their length reached 21,831 (33.9%), of which 15,520 (24.1%) sequences overlapped more than 80%. Given the very strict parameter used (E value = 1e-20) when searching for homologs, the statistics presented above are expected to be conservative. Nonetheless, these statistics suggest that we have generated a high quality transcriptome for P. fulvidraco. A portion of contigs without any hits may represent lineage-specific genes and may partially explain the uniqueness of P. fulvidraco. Compared to other RNA-Seq reports, the ratio of annotated contigs was moderately high [47]: this is because we removed all contigs except those encoding more than 100 amino acids. Based on the de novo assembled transcriptome, a number of differentially expressed genes were identified. Elsewhere, Seyednasrollah et al. compared eight packages detecting DEs based on real data and claimed that the number of detected DEs increased when the number of replicate samples was increased for most packages, including edgeR and EBSeq [48]. Specifically, edgeR showed a large range of variability when the number of replicates was very small, while EBSeq may be too liberal. As a result, EBSeq detected many more DEs than edgeR when using only two replicate samples; increasing the number of replicates is expected to narrow the gap. Simulation and real data showed that most genuine DEs ranked at the top when sorting their significance such as nominal P value from the smallest to the largest [48, 49]. This method is not affected by the direction of differential expression and edgeR performed better than EBSeq when the sample size was very small. Correspondingly, based on the results obtained by edgeR and EBSeq, most DEs identified in this report (list in S2 Table) ranked in the top 2,000 in order of their significance. Pathway enrichment analyses of differentially expressed genes identified by EBSeq (S3 Table) revealed that eleven candidates take part in the pathway “Melanogenesis” and most of these genes, including the important tyrosinase (TYR) and tyrosinase-related protein 1 (TYRP1), were up-regulated in albino types (Fig 5). This may result from the up-regulation of cystine/glutamate transporter in albino types (slc7a11, S2 Table), which transports cystine into melanocytes, switch off the eumelanin (brown to black pigment) synthesis pathway and promote the synthesis of pheomelanin (yellow to red pigment). The lack of substrates for the synthesis of eumelanin may result in the high level of relevant enzymes and corresponding transcripts and reduce the melanin synthesis, which contribute to the abnormal body colour of the albino strain. The similar results were found in common carp [26]. Moreover, KEGG pathways such as “mismatch repair”, “base excision repair”, etc., were significantly enriched in differentially expressed genes (adjusted P value < 0.05, S3 Table). These are cellular mechanisms that repair damaged DNA. We suggest that without the protection of sufficient melanin, albino fishes were much more easily subject to DNA damage caused by ultraviolet (UV). Exposure to UV will trigger intracellular repair mechanisms with the result that the expression levels of many related genes in albino types were significantly different from those in wild types.
Fig 5

Differentially expressed genes between wild and albino types in anal fins and their involvement in the melanogenesis pathway.

Genes marked in red are up-regulated and in yellow represented genes are down-regulated in wild type versus in albino type, respectively. Comparison between RNA-Seq results (identified by both edgeR and EBSeq) and qRT-PCR validation results. X-axis shows genes validated in this study; Y-axis shows Log2Ratio of expression of wild versus albino types. “*” indicates a significant difference between the two strains.

Differentially expressed genes between wild and albino types in anal fins and their involvement in the melanogenesis pathway.

Genes marked in red are up-regulated and in yellow represented genes are down-regulated in wild type versus in albino type, respectively. Comparison between RNA-Seq results (identified by both edgeR and EBSeq) and qRT-PCR validation results. X-axis shows genes validated in this study; Y-axis shows Log2Ratio of expression of wild versus albino types. “*” indicates a significant difference between the two strains. Genes harboring mutations that may influence their normal functions in albinos were also identified. However, no pathway potentially related to different fish body colour was detected among these genes with nonsynonymous mutations (S3 Table). Thus, we conclude that phenotypic differences between wild types and albinos were mainly the result of differential expression of related genes rather than single-base substitution mutations in coding regions. This was in accordance with published analyses from the genome resequencing projects for Saccharomyces cerevisiae (yeast) and Gasterosteus aculeatus (threespine stickleback). For both it was claimed that at the beginning of divergence, alterations of expression pattern should be much more frequent than expected [50, 51], and should be even more frequent than substitutions of encoded amino acids [51].

Conclusion

We have generated the transcriptome of the Chinese yellow catfish, P. fulvidraco, which will be important for subsequent analyses of this economically important fish. More widely, it should also be important for comparative genomics and evolutionary analyses in the Siluriformes. We conclude that the dominating molecular events underlying phenotypic difference between albino and normal P. fulvidraco individuals result from differential expression of related genes rather than single nucleotide mutations located in coding regions. Moreover, we found genes that take part in pathways “mismatch repair” and “base excision repair” that may be responses to ultraviolet exposure and result in cellular damage, which were significantly enriched in differentially expressed genes between wild types and albinos.

Summaries of differentially expressed genes between wild and albino types identified by edgeR and EBSeq.

(XLS) Click here for additional data file.

Comparisons of results generated by edgeR and EBSeq for differentially expressed genes between wild and albino types.

(XLS) Click here for additional data file.

Pathway enrichment analyses for differentially expressed genes between wild and albino types identified by EBSeq and genes in which non-synonymous mutations reside.

(XLS) Click here for additional data file.

Summaries of mutation sites with one kind of nucleotide present in the wild type strain but absent in the albino strain.

Note that some contigs may encode more than one protein sequence according to the prediction. (XLS) Click here for additional data file.

Single-base substitution that was possibly located in coding regions and result in non-synonymous mutations.

(XLS) Click here for additional data file.
  38 in total

Review 1.  The genetic and evolutionary basis of colour variation in vertebrates.

Authors:  Michael Hofreiter; Torsten Schöneberg
Journal:  Cell Mol Life Sci       Date:  2010-03-14       Impact factor: 9.261

2.  EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments.

Authors:  Ning Leng; John A Dawson; James A Thomson; Victor Ruotti; Anna I Rissman; Bart M G Smits; Jill D Haag; Michael N Gould; Ron M Stewart; Christina Kendziorski
Journal:  Bioinformatics       Date:  2013-02-21       Impact factor: 6.937

Review 3.  The colours of extant mammals.

Authors:  Tim Caro
Journal:  Semin Cell Dev Biol       Date:  2013-04-05       Impact factor: 7.727

Review 4.  Adaptive colouration in amphibians.

Authors:  Andreas Rudh; Anna Qvarnström
Journal:  Semin Cell Dev Biol       Date:  2013-05-08       Impact factor: 7.727

5.  Inheritance of Albinism in the Medaka, ORYZIAS LATIPES, with Special Reference to Gene Interaction.

Authors:  T O Yamamoto
Journal:  Genetics       Date:  1969-08       Impact factor: 4.562

6.  Regulation of pigmentation in zebrafish melanophores.

Authors:  Darren W Logan; Sally F Burn; Ian J Jackson
Journal:  Pigment Cell Res       Date:  2006-06

7.  Identification of genes related to white and black plumage formation by RNA-Seq from white and black feather bulbs in ducks.

Authors:  Shijun Li; Cui Wang; Wenhua Yu; Shuhong Zhao; Yanzhang Gong
Journal:  PLoS One       Date:  2012-05-15       Impact factor: 3.240

8.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

9.  The genomic basis of adaptive evolution in threespine sticklebacks.

Authors:  Felicity C Jones; Manfred G Grabherr; Yingguang Frank Chan; Pamela Russell; Evan Mauceli; Jeremy Johnson; Ross Swofford; Mono Pirun; Michael C Zody; Simon White; Ewan Birney; Stephen Searle; Jeremy Schmutz; Jane Grimwood; Mark C Dickson; Richard M Myers; Craig T Miller; Brian R Summers; Anne K Knecht; Shannon D Brady; Haili Zhang; Alex A Pollen; Timothy Howes; Chris Amemiya; Jen Baldwin; Toby Bloom; David B Jaffe; Robert Nicol; Jane Wilkinson; Eric S Lander; Federica Di Palma; Kerstin Lindblad-Toh; David M Kingsley
Journal:  Nature       Date:  2012-04-04       Impact factor: 49.962

10.  Comparison of software packages for detecting differential expression in RNA-seq studies.

Authors:  Fatemeh Seyednasrollah; Asta Laiho; Laura L Elo
Journal:  Brief Bioinform       Date:  2013-12-02       Impact factor: 11.622

View more
  8 in total

1.  Identification of Quantitative Trait Loci for Resistance to RSIVD in Red Sea Bream (Pagrus major).

Authors:  Eitaro Sawayama; Shiho Tanizawa; Shin-Ichi Kitamura; Kei Nakayama; Kohei Ohta; Akiyuki Ozaki; Motohiro Takagi
Journal:  Mar Biotechnol (NY)       Date:  2017-11-10       Impact factor: 3.619

2.  A deletion in the Hermansky-Pudlak syndrome 4 (Hps4) gene appears to be responsible for albinism in channel catfish.

Authors:  Yueru Li; Xin Geng; Lisui Bao; Ahmed Elaswad; Kevin W Huggins; Rex Dunham; Zhanjiang Liu
Journal:  Mol Genet Genomics       Date:  2017-03-13       Impact factor: 3.291

3.  In Silico Comparative Transcriptome Analysis of Two Color Morphs of the Common Coral Trout (Plectropomus Leopardus).

Authors:  Le Wang; Cuiping Yu; Liang Guo; Haoran Lin; Zining Meng
Journal:  PLoS One       Date:  2015-12-29       Impact factor: 3.240

4.  The Dynamic Immune Response of Yellow Catfish (Pelteobagrus fulvidraco) Infected With Edwardsiella ictaluri Presenting the Inflammation Process.

Authors:  Xu Zhou; Gui-Rong Zhang; Wei Ji; Ze-Chao Shi; Xu-Fa Ma; Zun-Lan Luo; Kai-Jian Wei
Journal:  Front Immunol       Date:  2021-02-26       Impact factor: 7.561

5.  Analysis of yellow mutant rainbow trout transcriptomes at different developmental stages reveals dynamic regulation of skin pigmentation genes.

Authors:  Shenji Wu; Jinqiang Huang; Yongjuan Li; Lu Zhao; Zhe Liu
Journal:  Sci Rep       Date:  2022-01-07       Impact factor: 4.379

6.  Identification of Potential Blind-Side Hypermelanosis-Related lncRNA-miRNA-mRNA Regulatory Network in a Flatfish Species, Chinese Tongue Sole (Cynoglossus semilaevis).

Authors:  Yangzhen Li; Yuanri Hu; Peng Cheng; Songlin Chen
Journal:  Front Genet       Date:  2022-02-03       Impact factor: 4.599

7.  Molecular Characterization and Expression Analyses of the Complement Component C8α, C8β and C9 Genes in Yellow Catfish (Pelteobagrus fulvidraco) after the Aeromonas hydrophila Challenge.

Authors:  Huan Zheng; Wei Ji; Gui-Rong Zhang; Xiao-Ting Zhang; Ze-Chao Shi; Kai-Jian Wei; Rui-Bin Yang; Jonathan P A Gardner
Journal:  Int J Mol Sci       Date:  2016-03-08       Impact factor: 5.923

8.  Tracing key genes associated with the Pinctada margaritifera albino phenotype from juvenile to cultured pearl harvest stages using multiple whole transcriptome sequencing.

Authors:  Pauline Auffret; Jérémy Le Luyer; Manaarii Sham Koua; Virgile Quillien; Chin-Long Ky
Journal:  BMC Genomics       Date:  2020-09-25       Impact factor: 3.969

  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.