Jonas Bybjerg-Grauholm, Christian Munch Hagen, Sok Kean Khoo, Maria Louise Johannesen, Christine Søholm Hansen, Marie Bækvad-Hansen, Michael Christiansen, David Michael Hougaard, Mads V Hollegaard.
Abstract
Neonatal dried blood spots (DBS) are routinely collected on standard Guthrie cards for comprehensive national newborn screening programs for inborn errors of metabolism, hypothyroidism and other diseases. In Denmark, the Guthrie cards are stored at −20 °C in the Danish Neonatal Screening Biobank, and each sample is linked to extensive social and medical registries. This provides a unique biospecimen repository that enables large population-based research at the perinatal level. Here, we demonstrate the feasibility of obtaining gene expression data from DBS using next-generation RNA sequencing (RNA-seq). RNA-seq was performed on samples from five males and five females. Sequencing yielded an average of > 30 million reads per sample. A total of 26,799 annotated features could be identified, with 64% of features detectable when no fragments per kilobase of transcript per million mapped reads (FPKM) cutoff was applied; the share of detectable features dropped to 18% at FPKM ≥ 1. Sex could be discriminated using a blood-based sex-specific gene set identified by the Genotype-Tissue Expression consortium. These results show that biologically relevant gene expression data can be acquired from DBS using RNA-seq, which provides a new avenue for investigating perinatal diseases in a high-throughput manner.
Keywords: DBS, Dried blood spots; DNSB, Danish Neonatal Screening Biobank; Dried blood spots; GTEx, Genotype-Tissue Expression consortium; Gene expression; Guthrie Cards; Limited material; Neonatal Screening; RNA-seq
Year: 2016 PMID: 28053876 PMCID: PMC5198792 DOI: 10.1016/j.ymgmr.2016.12.004
Source DB: PubMed Journal: Mol Genet Metab Rep ISSN: 2214-4269
RNA input and RNA-seq data of 10 DBS (5 females and 5 males).
| Sample ID | Input RNA | Read length range (bases) | Average read length (bases) | Raw bases ≥ Q10 (%) | Raw bases ≥ Q20 (%) | Raw bases ≥ Q30 (%) | Read count | Total sequence generated (bases) |
|---|---|---|---|---|---|---|---|---|
| PKU_F_0 | 314 | 32–75 | 74.51 | 99.99 | 99.75 | 93.18 | 41,164,517 | 3,067,015,571 |
| PKU_F_1 | 70 | 32–75 | 74.48 | 99.99 | 99.52 | 93.64 | 49,891,753 | 3,715,888,452 |
| PKU_F_2 | 40 | 32–75 | 74.57 | 99.98 | 99.74 | 93.99 | 47,901,645 | 3,571,868,192 |
| PKU_F_3 | 40 | 35–75 | 74.41 | 99.87 | 99.42 | 89.84 | 28,409,088 | 2,113,940,315 |
| PKU_F_4 | < 0.2 | 35–75 | 63.65 | 76.11 | 51.85 | 44.73 | 4,520,783 | 287,745,978 |
| PKU_M_0 | 60 | 32–75 | 74.48 | 99.99 | 99.68 | 93.28 | 45,422,192 | 3,382,887,870 |
| PKU_M_1 | 500 | 32–75 | 74.51 | 99.99 | 99.81 | 94.51 | 19,988,616 | 1,489,344,449 |
| PKU_M_2 | 1000 | 35–75 | 74.44 | 99.88 | 99.44 | 88.95 | 58,476,079 | 4,352,678,456 |
| PKU_M_3 | 1000 | 35–75 | 74.54 | 99.96 | 99.48 | 89.81 | 20,580,446 | 1,533,986,335 |
| PKU_M_4 | 1000 | 35–75 | 74.54 | 99.97 | 99.48 | 89.82 | 10,949,721 | 816,165,766 |
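The last three numeric columns of the table are related: the total sequence generated should be close to the read count multiplied by the average read length. The following is a minimal sketch of that consistency check, using values copied from the table; the name `relative_gap` and the 0.1% tolerance are illustrative assumptions, not part of the original analysis.

```python
# Values copied from the table above:
# (sample ID, read count, average read length in bases, total bases generated)
ROWS = [
    ("PKU_F_0", 41_164_517, 74.51, 3_067_015_571),
    ("PKU_F_1", 49_891_753, 74.48, 3_715_888_452),
    ("PKU_F_2", 47_901_645, 74.57, 3_571_868_192),
    ("PKU_F_3", 28_409_088, 74.41, 2_113_940_315),
    ("PKU_F_4", 4_520_783, 63.65, 287_745_978),
    ("PKU_M_0", 45_422_192, 74.48, 3_382_887_870),
    ("PKU_M_1", 19_988_616, 74.51, 1_489_344_449),
    ("PKU_M_2", 58_476_079, 74.44, 4_352_678_456),
    ("PKU_M_3", 20_580_446, 74.54, 1_533_986_335),
    ("PKU_M_4", 10_949_721, 74.54, 816_165_766),
]


def relative_gap(read_count, avg_length, total_bases):
    """Relative difference between read_count * avg_length and the reported total."""
    return abs(read_count * avg_length - total_bases) / total_bases


# Hypothetical helper output: one relative gap per sample.
gaps = {sample: relative_gap(n, length, total) for sample, n, length, total in ROWS}

for sample, gap in gaps.items():
    # Small gaps are expected because the average read length is rounded to 2 decimals.
    status = "ok" if gap < 1e-3 else "check"
    print(f"{sample}: relative gap {gap:.2e} ({status})")
```

The check also makes the low-input sample PKU_F_4 stand out: its product is consistent, but its read count and average read length are far below those of the other samples.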
Supplementary Fig. 1. A) Group-level FPKM box plots. B) Sample-level FPKM box plots. C) Group-level density plot. D) Sample-level density plot. Variation exists within the generated data set, especially at the level of individual samples. At the group level, performance is homogeneous, with average read counts of 34.4 ± 18.7 million and 31.1 ± 20.0 million (mean ± SD) for females and males, respectively.
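The group-level read counts quoted in this caption can be recomputed from the per-sample read counts in the table. The sketch below assumes the reported SD is the sample standard deviation (n − 1 denominator), which the source does not state; the names `READ_COUNTS` and `summarize` are illustrative.

```python
from statistics import mean, stdev

# Read counts copied from the table (samples PKU_F_0..4 and PKU_M_0..4).
READ_COUNTS = {
    "female": [41_164_517, 49_891_753, 47_901_645, 28_409_088, 4_520_783],
    "male": [45_422_192, 19_988_616, 58_476_079, 20_580_446, 10_949_721],
}


def summarize(counts):
    """Return (mean, sample SD) of the read counts, in millions of reads."""
    # stdev() uses the n - 1 denominator; this is an assumption about the source.
    return mean(counts) / 1e6, stdev(counts) / 1e6


summary = {group: summarize(counts) for group, counts in READ_COUNTS.items()}

for group, (avg, sd) in summary.items():
    print(f"{group}: {avg:.1f} ± {sd:.1f} million reads")
```

Under that assumption the output rounds to the 34.4 ± 18.7 million (female) and 31.1 ± 20.0 million (male) given in the caption. The large SDs are driven mainly by the low-yield samples, such as PKU_F_4 and PKU_M_4.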
Supplementary Fig. 2. Based on the gene-level data, we calculated A) multidimensional scaling and B) principal component analysis. Both dimensionality reductions agree that PKU_M_3 and PKU_F_1 are outliers. PKU_M_0 is also an outlier, but only in A.
Fig. 1. A) Hierarchical clustering of FPKM values for genes differentially expressed in blood between males and females, as identified by the GTEx consortium. B) Heatmap of FPKM values for the same gene set.
Fig. 2. Hierarchical clustering (HC) and heat maps (HM) based on a gene set enrichment of hemoglobin species. A) HC of all the identified blood genes. B) HM of all the identified blood genes. C) HC of the identified blood genes, excluding HBA1. D) HM of all the identified blood genes, excluding HBA1. E) HC based on the entire gene set (unfiltered).