Yingshu Li, Hang Yang, Hujun Zhang, Yongjie Liu, Hanqiao Shang, Herong Zhao, Ting Zhang, Qiang Tu.
Abstract
Many differential gene expression analyses are conducted with an inadequate number of biological replicates. We describe an easy and effective RNA-seq approach using molecular barcoding to enable profiling of a large number of replicates simultaneously. This approach significantly improves the performance of differential gene expression analysis. Using this approach in medaka (Oryzias latipes), we discover novel genes with sexually dimorphic expression and genes necessary for germ cell development. Our results also demonstrate why the common practice of using only three replicates in differential gene expression analysis should be abandoned.
Keywords: Differential expression; Germ cell; Medaka; RNA-seq; Replication
Year: 2020 PMID: 32200760 PMCID: PMC7087377 DOI: 10.1186/s13059-020-01966-9
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Performance evaluation of Decode-seq. a Differential expression (DE) analysis with 3 pairs of replicates. True positive (TP, red dots), mouse genes that were called DE; false negative (FN, orange dots), mouse genes that were called non-DE; true negative (TN, gray dots), human genes that were called non-DE; false positive (FP, blue dots), human genes that were called DE. Specificity = TN/(TN + FP), fixed at 95% in all calculations. Sensitivity = TP/(TP + FN). False discovery rate = FP/(TP + FP). b DE analysis with 30 pairs of replicates. The sensitivity increased to 95.1%, and the false discovery rate dropped to 14.2%. c DE performance as a function of replicate number, calculated by random downsampling of the 30-pair data. Each replicate number was calculated 100 times. Sensitivity and false discovery rate improved dramatically as the number of replicates increased. d Spearman's correlations among replicates of Decode-seq and BRB-seq. Each replicate was compared with all other replicates of Decode-seq and BRB-seq, respectively. The distribution of these correlations is shown as a box for each replicate. In the Decode-seq group, replicates have higher correlations with each other, indicating higher reproducibility. e DE performance of Decode-seq and BRB-seq. Bars in three colors represent the DE performance of three sets: Decode-seq, BRB-seq with the same filter parameters as Decode-seq, and BRB-seq with the same total number of genes as Decode-seq. With the same filter parameters, BRB-seq detected fewer genes. With looser filter parameters chosen to give the same total number of genes, BRB-seq gave a lower sensitivity
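The three metrics defined in the Fig. 1 caption can be sketched in a few lines of Python. This is only an illustration of the formulas as stated in the caption, not the authors' analysis code; the `Confusion` class, the function names, and the counts in `example` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Gene counts for one DE call set (names are illustrative)."""
    tp: int  # mouse genes called DE
    fn: int  # mouse genes called non-DE
    tn: int  # human genes called non-DE
    fp: int  # human genes called DE


def sensitivity(c: Confusion) -> float:
    # Sensitivity = TP / (TP + FN)
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Specificity = TN / (TN + FP)
    return c.tn / (c.tn + c.fp)


def false_discovery_rate(c: Confusion) -> float:
    # False discovery rate = FP / (TP + FP)
    return c.fp / (c.tp + c.fp)


# Made-up counts, chosen only so that specificity comes out at 95%.
example = Confusion(tp=80, fn=20, tn=950, fp=50)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"FDR         = {false_discovery_rate(example):.3f}")
```

Note that specificity and false discovery rate both depend on FP but have different denominators, so a fixed 95% specificity can still coexist with a high false discovery rate when few true positives are recovered.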
Fig. 2 Differential gene expression analysis of sex determination of medaka. a Expression profiling of 30 pairs of medaka male/female fry with Decode-seq. Top 300 genes were color coded. Blue dots, genes expressed higher in male; red dots, genes expressed higher in female; black circled dots, known genes with dimorphic expression. b qPCR validation of identified DEGs, including 4 known markers and 25 novel DEGs. Error bar: SE. *p < 0.05, **p < 0.01. c Sample size downsampling shows that far fewer DEGs would be identified with fewer replicates. d Hybridization chain reaction validation of cd74a. Green signal: vasa, expressed in germ cells; red signal: sox9b, expressed in somatic cells; blue signal: cd74a, expressed in germ cells. Scale bar: 10 μm. e–h Functional validation of ENSORLG00000007290 using genetic knockout. Medaka fry at stage 39 (hatching) were used. Arrows indicate the GFP-labeled germ cells. Scale bar: 100 μm. e Wild-type female fry, a cluster of germ cells with strong GFP signal is visible. f Mutant female fry, germ cells are barely visible. g Wild-type male fry, a cluster of germ cells is visible. h Mutant male fry, germ cells are largely depleted