
Small RNA-seq analysis of circulating miRNAs to identify phenotypic variability in Friedreich's ataxia patients.

Marta Seco-Cervera1,2,3, Dayme González-Rodríguez2,4, José Santiago Ibáñez-Cabellos1,2,3, Lorena Peiró-Chova2,5, Federico V Pallardó1,2,3, José Luis García-Giménez1,2,3.   

Abstract

Friedreich's ataxia (FRDA; OMIM 229300), an autosomal recessive neurodegenerative mitochondrial disease, is the most prevalent hereditary ataxia. FRDA patients also show non-neurological features such as scoliosis, diabetes, and cardiac complications. Hypertrophic cardiomyopathy, which is found in two thirds of patients at the time of diagnosis, is the primary cause of death in these patients. Here, we used small RNA-seq to profile microRNAs (miRNAs) purified from plasma samples of FRDA patients and controls. Furthermore, we present the rationale, experimental methodology, and analytical procedures for dataset analysis. This dataset will facilitate the identification of miRNA signatures and provide new molecular explanations for pathological mechanisms occurring during the natural history of FRDA. Since miRNA levels change with disease progression and pharmacological interventions, miRNAs will contribute to the design of new therapeutic strategies and will improve clinical decisions.

Year:  2018        PMID: 29509186      PMCID: PMC5839159          DOI: 10.1038/sdata.2018.21

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Friedreich’s ataxia (FRDA) is the most common hereditary ataxia, with marked variability in symptoms between individuals and within families. Instability of the GAA expansion size is responsible for approximately 50% of the variability found in disease onset[1]. Ataxia stems from spinocerebellar degeneration, peripheral sensory neuropathy, and cerebellar and vestibular pathology, and pyramidal disabilities begin to appear after its onset[2]. Non-neurological features are also associated with Friedreich’s ataxia. For example, hypertrophic cardiomyopathy, which is the primary cause of death in these patients, is found very often in FRDA patients. Diabetes mellitus[3] and scoliosis are also associated with FRDA[4,5]. MicroRNAs (miRNAs) are made up of about 22 nucleotides and are the best-characterized small non-coding RNAs (sncRNAs). There is a continuously increasing number of human transcripts corresponding to miRNAs, and their sequences and annotations have been published[6]. miRNAs can target mRNAs and control their expression, either by promoting mRNA degradation when there is enough complementarity between the two or by repressing translation[7,8]. Some miRNAs are released from cells in membrane-bound vesicles which protect them from RNase activity[9], and for this reason miRNAs can be detected in circulating fluids such as plasma, serum, urine and saliva[10-12]. Besides their role in specific tissues, miRNAs have recently been recognized as stable molecules in circulating fluids[13] and have been proposed as biomarkers for diseases such as cancer[14,15], diabetes[16], and neurodegenerative diseases[17]. A small RNA profiling dataset for FRDA patients and healthy controls was generated to identify different miRNA signatures that could explain physiological and molecular pathways underlying this disease and to help determine the phenotypic variability of patients[18].
We found different expression profiles of miRNAs (hsa-miR-128-3p, hsa-miR-625-3p, hsa-miR-130b-5p, hsa-miR-151a-5p, hsa-miR-330-3p, hsa-miR-323a-3p, and hsa-miR-142-3p) between samples from patients and samples from healthy subjects. In addition, we found that hsa-miR-323a-3p is a biomarker for phenotypic differentiation in FRDA patients suffering from cardiomyopathy. To the best of our knowledge, this dataset represents the largest public small RNA-seq dataset from a cohort of FRDA patients. The potential for identifying miRNA signatures in FRDA goes beyond the discovery of physiological and molecular pathways underlying this disease. Understanding the phenotypic variability of patients is also necessary for designing the most appropriate therapy for each of them, according to their specific pattern of disease progression. In this study, blood samples of FRDA patients (i.e., FRDA patients with cardiomyopathy, “N+C”; FRDA patients with diabetes, “N+D”; and FRDA patients with only neurological features, “N”) and healthy controls were processed and plasma samples were obtained. Plasma samples were used to purify small RNA fractions, which were then used to construct the small RNA-seq libraries; the libraries were finally sequenced using the Illumina HiScanSQ platform (Figure 1). The sequence reads were mapped against the human hg38 build (UCSC Genome Browser). After that, the aligned positions of the reads were intersected with the miRNA coordinates taken from miRBase v21. In short, we have provided a small RNA-seq data resource on FRDA patients, which is useful for understanding the phenotypic variability of the disease. Furthermore, the data resource provides an opportunity to identify biomarkers for diagnosis, prognosis, and treatment monitoring in FRDA.
Figure 1

Overview of the study design.

Plasma samples from Friedreich’s ataxia patients (n=25) and healthy subjects (n=17) were analyzed. FRDA patients were classified into 3 groups: only neurological disorder (N; n=11), neurological disorder plus cardiomyopathy (N+C; n=9), and neurological disorder plus diabetes (N+D; n=6). One FRDA patient with neurological symptoms showed both comorbidities, cardiomyopathy and diabetes, and was therefore classified in both the neurological disorder plus cardiomyopathy (N+C) and the neurological disorder plus diabetes (N+D) groups. Small RNA from the plasma samples of each FRDA patient and healthy subject was isolated and sequenced to obtain a miRNA expression profile. Next, mapping of the sequencing reads provided the whole miRNome status of individual samples. Differential miRNA expression analysis between FRDA patients and healthy subjects and within patients was performed to identify the miRNA signatures of FRDA patients and their concomitant diseases.

Methods

We have already presented some of the methods and tools used in our primary publication[18]. With this paper, we want to expand our previous descriptions and provide a comprehensive resource for reproducing both the experimental and computational analyses. The experimental and analytical procedure we used is described in Figure 1.

Patient and healthy subject recruitment and clinical features

Patients diagnosed with FRDA without neoplastic diseases or active infection were recruited. Data about age, sex, history of diabetes, cardiopathy, and number of GAA repeats in both alleles were recorded (Table 1; n=25). The scale for assessment and rating of ataxia (SARA), a neurological examination-based method to assess the disease, was used to measure the clinical severity of the disease[19]. FRDA patients were enrolled in the study following study approval by the Biomedical Research Ethics Committee (CEIB) of Hospital La Paz (Madrid). Plasma biospecimens from FRDA patients were preserved in a public repository of FRDA in the CIBERER Biobank (www.ciberer-biobank.es; Spanish Biobank Registry number: 000161X02). Healthy volunteers (n=25) with no neoplastic diseases, active infection, cardiomyopathy, heart problems, hypertension, or diabetes were enrolled by the Basque Biobank for Research-OEHUN (www.biobancovasco.org) and by the Biobank for Biomedical Research and Public Health of the Valencian Community (IBSP-CV) through the Spanish National Biobank Network (RNBB 2013/12). The subjects of both groups (healthy volunteers and FRDA patients) were matched by sex and age, and their samples were processed identically. The selection process and all experimental methods were carried out in accordance with the relevant clinical guidelines, following standard operating procedures, and with the approval of the ethics and scientific committees. All experimental protocols were approved by the Biomedical Research Ethics Committee (CEIB) of Hospital La Paz (Madrid) and the ethics and scientific committees of the IBSP-CV. Informed consent was obtained from all participants.
Table 1

Description of FRDA patients and healthy control samples.

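| ID | Sex | Age | Neuropathy (N) | Diabetes (N+D) | Cardiopathy (N+C) | GAA repetitions (allele 1/allele 2) | Raw Data File |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 27 | FEMALE | 26 | 1 | 0 | 0 | 885/1150 | 27_ACAGTG_L007_R1_001 |
| 38 | FEMALE | 32 | 1 | 0 | 1 | 850/850 | 38_CCGTCC_L007_R1_001 |
| 42 | FEMALE | 68 | 1 | 0 | 1 | 75/75 | 42_GTGGCC_L007_R1_001 |
| 41 | FEMALE | 48 | 1 | 0 | 0 | 250/250 | 41_GTGAAA_L007_R1_001 |
| 2 | FEMALE | 39 | 1 | 0 | 0 | 380/912 | 2_CGATGT_L005_R1_001 |
| 6 | FEMALE | 56 | 1 | 0 | 1 | 67/450 | 6_GCCAAT_L005_R1_001 |
| 18 | FEMALE | 38 | 1 | 1 | 0 | 500/500 | 18_CCGTCC_L005_R1_001 |
| 5 | FEMALE | 46 | 1 | 0 | 0 | 720/950 | 5_ACAGTG_L005_R1_001 |
| 25 | FEMALE | 46 | 1 | 0 | 1 | 767/767 | 25_GTGGCC_L005_R1_001 |
| 29 | MALE | 28 | 1 | 0 | 0 | 112/112 | 29_CGTACG_L005_R1_001 |
| 1 | MALE | 34 | 1 | 0 | 1 | 634/967 | 1_ATCACG_L005_R1_001 |
| 4 | MALE | 35 | 1 | 0 | 0 | 700/834 | 4_TGACCA_L005_R1_001 |
| 13 | MALE | 35 | 1 | 0 | 0 | 985/985 | 13_GGCTAC_L005_R1_001 |
| 30 | MALE | 32 | 1 | 0 | 0 | 480/580 | 30_GAGTGG_L005_R1_001 |
| 14 | MALE | 41 | 1 | 0 | 0 | 10/100 | 14_CTTGTA_L005_R1_001 |
| 39 | FEMALE | 49 | 1 | 0 | 0 | 314/314 | 39_GTAGAG_L007_R1_001 |
| 17 | MALE | 47 | 1 | 1 | 0 | 650/850 | 17_ATGTCA_L005_R1_001 |
| 15 | MALE | 37 | 1 | 0 | 0 | 912/1080 | 15_AGTCAA_L005_R1_001 |
| 16 | MALE | 39 | 1 | 1 | 0 | 600/734 | 16_AGTTCC_L005_R1_001 |
| 3 | MALE | 52 | 1 | 0 | 0 | 185/385 | 3_TTAGGC_L005_R1_001 |
| 26 | FEMALE | 37 | 1 | 0 | 1 | 350/980 | 26_GCCAAT_L007_R1_001 |
| 40 | FEMALE | 37 | 1 | 1 | 0 | 480/580 | 40_GTCCGC_L007_R1_001 |
| 43 | MALE | 21 | 1 | 1 | 1 | 25/25 | 43_GTTTCG_L007_R1_001 |
| 37 | MALE | 19 | 1 | 0 | 1 | 1050/1185 | 37_ATGTCA_L007_R1_001 |
| 28 | FEMALE | 29 | 1 | 0 | 0 | 634/1000 | 28_GTTTCG_L005_R1_001 |
| 33 | FEMALE | 24 | 0 | 0 | 0 | Not referred to | 33_TGACCA_L007_R1_001 |
| 34 | FEMALE | 33 | 0 | 0 | 0 | Not referred to | 34_AGTCAA_L007_R1_001 |
| 49 | FEMALE | 56 | 0 | 0 | 0 | Not referred to | 49_CGATGT_L007_R1_001 |
| 46 | FEMALE | 53 | 0 | 0 | 0 | Not referred to | 46_GGTAGC_L007_R1_001 |
| 47 | FEMALE | 38 | 0 | 0 | 0 | Not referred to | 47_ATCACG_L007_R1_001 |
| 31 | FEMALE | 44 | 0 | 0 | 0 | Not referred to | 31_GGTAGC_L005_R1_001 |
| 35 | MALE | 30 | 0 | 0 | 0 | Not referred to | 35_AGTTCC_L007_R1_001 |
| 10 | MALE | 32 | 0 | 0 | 0 | Not referred to | 10_GATCAG_L005_R1_001 |
| 22 | MALE | 40 | 0 | 0 | 0 | Not referred to | 22_GTGAAA_L005_R1_001 |
| 7 | FEMALE | 54 | 0 | 0 | 0 | Not referred to | 7_CAGATC_L005_R1_001 |
| 20 | MALE | 47 | 0 | 0 | 0 | Not referred to | 20_GTAGAG_L005_R1_001 |
| 21 | MALE | 39 | 0 | 0 | 0 | Not referred to | 21_GTCCGC_L005_R1_001 |
| 9 | MALE | 51 | 0 | 0 | 0 | Not referred to | 9_ACTTGA_L005_R1_001 |
| 11 | FEMALE | 37 | 0 | 0 | 0 | Not referred to | 11_TAGCTT_L005_R1_001 |
| 50 | MALE | 20 | 0 | 0 | 0 | Not referred to | 50_TTAGGC_L007_R1_001 |
| 44 | MALE | 16 | 0 | 0 | 0 | Not referred to | 44_CGTACG_L007_R1_001 |
| 45 | FEMALE | 31 | 0 | 0 | 0 | Not referred to | 45_GAGTGG_L007_R1_001 |

0: absence of neuropathy, diabetes or cardiopathy. 1: presence of neuropathy (N), diabetes (N+D) or cardiopathy (N+C).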
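
The raw data file names in Table 1 appear to combine the sample ID, a six-nucleotide index and the sequencing lane. As a minimal sketch, assuming that every file follows the pattern `<ID>_<index>_<lane>_R1_001` (a pattern inferred from the table, not one stated by the authors), the names can be split as follows. The function and class names are hypothetical.

```python
import re
from typing import NamedTuple


class RawFile(NamedTuple):
    sample_id: int
    index: str
    lane: str


# Assumed naming pattern, inferred from Table 1: <ID>_<6-nt index>_<lane>_R1_001
_NAME_PATTERN = re.compile(r"^(\d+)_([ACGT]{6})_(L\d{3})_R1_001$")


def parse_raw_file(name: str) -> RawFile:
    """Split a raw data file name into sample ID, index and lane."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected raw data file name: {name!r}")
    sample_id, index, lane = match.groups()
    return RawFile(int(sample_id), index, lane)


print(parse_raw_file("27_ACAGTG_L007_R1_001"))
```

Parsing the names this way makes it straightforward to check, for instance, which samples were run on lane L005 and which on lane L007.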

Sample collection and small RNA extraction and quantification

Plasma samples were obtained from FRDA patients and healthy participants. To this end, blood samples were collected in EDTA tubes and centrifuged at 2500 rpm for 10 min. Once plasma was obtained, each sample was stored at −80 °C until RNA extraction. Cell-free total RNA (including miRNAs) was isolated from 500 μL of plasma using the miRNeasy Serum/Plasma kit (Qiagen, Valencia, CA, USA) following the manufacturer’s protocol. The RNA was eluted with 25 μL of RNase-free water. The concentration of cell-free total RNA (including miRNAs) was quantified using a NanoDrop ND 2000 UV-spectrophotometer (Thermo Scientific, Wilmington, DE, USA).

Small RNA-seq library preparation and sequencing

Small RNA samples were converted to Illumina sequencing libraries using the NEBNext Multiplex Small RNA Library Prep Set for Illumina (Set 1&2) (New England Biolabs, MA, USA), following the manufacturer’s protocol. Briefly, 5′ and 3′ adapters were ligated to small RNA molecules purified from plasma, followed by cDNA library construction and incorporation of index tags by reverse transcription-PCR (RT-PCR). The products of this RT-PCR were purified using 6% non-denaturing polyacrylamide gel electrophoresis, and then size selection of the 145–160 bp fraction was performed. The cDNA library samples were hybridized to a paired-end flow cell and individual fragments were clonally amplified by bridge amplification on the Illumina cBot cluster generation system. Then, the flow cell was loaded on the HiScanSQ platform and sequenced using Illumina’s sequencing by synthesis chemistry, generating 50 bp single-end reads.

Pre-processing and processing of the reads

The quality of the small RNA libraries was first evaluated using FastQC v0.11.5 software (Figure 2). The most important metrics checked were (i) the overall sequence quality, for which the mean Phred quality per base and per read was expected to be greater than 30, and (ii) the GC percentage distribution per read, for which the data (red curve) are expected to approximately follow the theoretical distribution (blue curve) (Figure 2c). Peaks on the left or right side of this distribution indicate the presence of adapters in the reads. We also checked the presence/absence of overrepresented sequences. Based on the results obtained, the sequence reads were trimmed to remove the following adapter from each sample: 5′-AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC-NNNNNN-ATCTCGTATGCCGTCTTCTGCTTG-3′. The six-nucleotide N sequence is the index, which is unique to each sample and serves as its barcode. After this step, the bases at the end of the sequences with a quality less than 20 were removed. Finally, all the sequences with length less than 18 nucleotides were discarded. These operations were performed using the tool Cutadapt[20]. The remaining sequences were aligned to the human genome reference (hg38) from the UCSC Genome Browser. The expression of every miRNA per sample was measured using an annotation file from miRBase (v21), which contained all the mature human miRNAs known at the time of that release. The alignment and quantification steps were performed using the Subread[21] and Rsubread[22] packages (http://subread.sourceforge.net/), respectively.
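
The quality trimming and length filtering described above can be illustrated with a short Python sketch. It is a simplification and not the Cutadapt implementation: Cutadapt trims 3′ ends with a running-sum algorithm, whereas this sketch only removes trailing bases whose quality falls below the cutoff. The function names and the Phred+33 quality encoding are assumptions.

```python
from typing import Tuple

MIN_QUALITY = 20   # quality cutoff stated in the text
MIN_LENGTH = 18    # minimum read length stated in the text
PHRED_OFFSET = 33  # assumption: Phred+33 FASTQ encoding


def trim_3prime(seq: str, qual: str, min_quality: int = MIN_QUALITY) -> Tuple[str, str]:
    """Remove trailing bases whose Phred quality is below min_quality."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def passes_length_filter(seq: str, min_length: int = MIN_LENGTH) -> bool:
    """Keep only reads with at least min_length nucleotides."""
    return len(seq) >= min_length


# Hypothetical 22-nt read whose last two bases have Phred quality 2 ('#').
trimmed_seq, trimmed_qual = trim_3prime("TGGAGTGTGACAATGGTGTTTG", "I" * 20 + "##")
print(trimmed_seq, passes_length_filter(trimmed_seq))
```

In this example the two low-quality bases are removed and the remaining 20-nt read is retained, because it is still at least 18 nucleotides long.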
Figure 2

A representative example of quality control metrics of RNA sequenced reads as indicated by FastQC after the preprocessing steps (sample: 11_TAGCTT_L005_R1_001).

(a) Phred quality score distribution over all reads in each base. (b) Quality score distribution over all sequences. (c) GC content (%) distribution over all sequences.

Differential expression analysis

Differential expression analysis between FRDA patients and healthy subjects

First, a differential expression analysis of the miRNAs was performed between patients (n=25) and controls (n=17). To this end, we filtered out the miRNAs that did not reach 1 cpm (counts per million) in at least 17 samples (the size of the smallest group). Subsequently, we calculated normalization factors using the TMM method[23] and derived the effective library sizes. We also estimated the miRNA-specific dispersions with the quantile-adjusted conditional maximum likelihood (qCML) method[24]. The differential expression analysis was executed using the exact test[24,25]. In addition, we normalized the original counts, estimating the effective library sizes using the geometric mean. The normalized values of the most significant miRNAs (FDR <1e-4) obtained from the differential expression test were used to assess their pairwise correlation. Those miRNAs with a level of correlation lower than 0.7 were used to fit a logistic regression model with LASSO penalty[26]. In order to select the most important miRNAs of the model, a leave-one-out cross-validation was performed. Those miRNAs that had non-zero coefficients at the value of λ that gave the minimum mean cross-validated error were selected.
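
The expression filter can be sketched in plain Python as follows. This is an illustration rather than the edgeR code used in the study; it assumes that a miRNA is kept when it reaches at least 1 cpm in at least as many samples as the smallest group, and the function name and toy counts are hypothetical.

```python
from typing import Dict, List


def filter_by_cpm(
    counts: Dict[str, List[int]],
    min_cpm: float = 1.0,
    min_samples: int = 17,
) -> Dict[str, List[int]]:
    """Keep miRNAs reaching min_cpm in at least min_samples samples."""
    n_samples = len(next(iter(counts.values())))
    # Library size of each sample = sum of its counts over all miRNAs.
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = {}
    for name, row in counts.items():
        n_expressed = sum(
            1
            for count, size in zip(row, library_sizes)
            if size > 0 and count / size * 1e6 >= min_cpm
        )
        if n_expressed >= min_samples:
            kept[name] = row
    return kept


# Toy example with three samples and hypothetical miRNA names.
toy_counts = {
    "miR-a": [500, 300, 800],
    "miR-b": [0, 1, 0],
    "miR-c": [999_500, 1_999_699, 1_499_200],
}
print(sorted(filter_by_cpm(toy_counts, min_samples=2)))
```

In the toy data, the library sizes are 1, 2 and 1.5 million reads, so the single read of `miR-b` corresponds to 0.5 cpm and that miRNA is removed.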

Differential expression analysis between FRDA patients grouped by phenotype

We divided the FRDA patients into 3 subgroups according to their phenotype, considering the features described in Table 1. Thus, patients were grouped as 1) patients showing only neurological symptoms, “N” (n=11); 2) patients showing neurological symptoms and cardiomyopathy, “N+C” (n=9); and 3) patients showing neurological symptoms and diabetes, “N+D” (n=6). One FRDA patient with neurological symptoms showed both comorbidities, cardiomyopathy and diabetes, and was therefore classified in both the neurological disorder plus cardiomyopathy (N+C) and the neurological disorder plus diabetes (N+D) groups. After this stratification, those miRNAs which did not reach 1 cpm (counts per million) in at least 5 samples (size of the smallest group) were filtered out. The data were normalized using the TMM method. Afterwards, a Cox-Reid dispersion[27] per miRNA was estimated. To find the differentially expressed miRNAs among the three groups compared, the GLM (generalized linear model)[25] approach was used. Additionally, we performed new analyses taking into account other variables such as age, sex and disease onset. Age and disease onset were dichotomized at their median values, 37 and 13 years, respectively. Finally, we carried out every comparison between the different groups using the GLM approach. All statistical analyses were performed using R software (version 3.2.2) and the following packages: edgeR (version 3.12.0), DESeq (version 1.22.0), caret (version 6.0–58), glmnet (version 2.0-2), and ROCR (version 1.0-7).
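
The stratification rule can be sketched as follows, using the diabetes and cardiopathy flags of Table 1. This is only an illustration: the function names are hypothetical, and assigning values equal to the median to the lower class is an assumption, since the text does not state how ties were handled.

```python
from typing import List


def phenotype_groups(diabetes: int, cardiopathy: int) -> List[str]:
    """Return the phenotype group(s) of an FRDA patient from the Table 1 flags."""
    groups = []
    if cardiopathy:
        groups.append("N+C")
    if diabetes:
        groups.append("N+D")
    # Patients without comorbidities show only neurological symptoms.
    return groups or ["N"]


def dichotomize(value: float, median: float) -> str:
    """Split a continuous variable at its median (ties go to the lower class: assumption)."""
    return "above_median" if value > median else "at_or_below_median"


# (ID, diabetes flag, cardiopathy flag, age) for three patients taken from Table 1.
patients = [(27, 0, 0, 26), (18, 1, 0, 38), (43, 1, 1, 21)]
for patient_id, diabetes, cardiopathy, age in patients:
    print(patient_id, phenotype_groups(diabetes, cardiopathy), dichotomize(age, 37))
```

Patient 43 is returned in both the N+C and N+D groups, which mirrors the double classification described above.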

Data Records

RNA-seq data files in FastQ format were deposited at NCBI Sequence Read Archive (Data Citation 1). This accession contains a total of 42 FastQ files resulting from the single end runs for each of the 42 samples. The FastQ format data serves as the raw data from sequencing, which are subjected to further downstream processing. The processed data were deposited at NCBI Gene Expression Omnibus (Data Citation 2).

Technical Validation

Sequencing quality control

We used FastQC v0.11.5 software to perform quality control assessments of the FastQ files before and after the pre-processing steps (filtering, quality trimming and adapter removal). We analysed several measurements, including the overall sequence quality, the GC percentage distribution (i.e. the proportion of guanine and cytosine bp across the reads) and the presence/absence of overrepresented sequences. A representative summary plot after the pre-processing steps is shown (11_TAGCTT_L005_R1_001). Here, the quality scores per base were high, with a median quality score above 30 suggesting high quality sequences across all bases (Figure 2a). The quality score distribution over all sequences was analyzed to see if a subset of sequences had universally poor quality. The average quality for most sequences was high, with scores above 37, which indicated that a significant proportion of the sequences in a run had overall high quality (Figure 2b). The GC distribution per base over all sequences was examined. Despite the GC composition pattern being more similar to the theoretical distribution after the pre-processing steps, it still had a bias (Figure 2c). In addition, overrepresented sequences were examined. Before the adapter removal, some of them were identified as the Illumina indexed adapters used in the sequencing process. After this step, the adapters were no longer identified but we still had overrepresented sequences, possibly because of highly expressed miRNAs (Table 2). All other FastQC files were shown to have similar quality metrics compared to sample (11_TAGCTT_L005_R1_001).
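
To put these quality scores in context, a Phred score Q corresponds to a base-calling error probability of 10^(−Q/10). The short sketch below applies this standard definition; the function names are hypothetical.

```python
import math


def phred_to_error_probability(quality: float) -> float:
    """Convert a Phred quality score into a base-calling error probability."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability: float) -> float:
    """Convert a base-calling error probability into a Phred quality score."""
    return -10 * math.log10(probability)


for quality in (20, 30, 37):
    print(quality, phred_to_error_probability(quality))
```

A score of 30 therefore corresponds to an expected error rate of 1 in 1,000 base calls, and the trimming cutoff of 20 to 1 in 100.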
Table 2

A representative example of quality control metrics of RNA sequenced reads as indicated by FastQC after the preprocessing steps (sample: 11_TAGCTT_L005_R1_001).

SequenceCount%Possible Source
Overrepresented sequences showing count, percentage and possible source for each sequence.   
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG126461241,255TruSeq Adapter, Index 10 (100% over 49 bp)
GGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTC883482,882TruSeq Adapter, Index 10 (100% over 50 bp)
GGCTGGTCCGATGGTAGTGGGTTATCAGAACTAGATCGGAAGAGCACACG534041,742No Hit
TGGAGTGTGACAATGGTGTTTAGATCGGAAGAGCACACGTCTGAACTCCA242030,790Illumina Multiplexing PCR Primer 2,01 (100% over 29 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGCTTATCTCGTATG183970,600TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGCTTATCTCGTATG98710,322TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATTTCGTATG90350,295TruSeq Adapter, Index 10 (97% over 49 bp)
AAACCGTTACCATTACTGAGTAGATCGGAAGAGCACACGTCTGAACTCCA81330,265Illumina Multiplexing PCR Primer 2,01 (100% over 29 bp)
TCCTGTACTGAGCTGCCCCGAAGATCGGAAGAGCACACGTCTGAACTCCA77120,252Illumina Multiplexing PCR Primer 2,01 (100% over 29 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGGCTTATCTCGTATG72820,238TruSeq Adapter, Index 10 (97% over 49 bp)
CGCGACCTCAGATCAGACGTGGCGACCCGCTGAATTTAGATCGGAAGAGC70560,230No Hit
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTATCTTATCTCGTATG67220,219TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGCTTATCTCGTATG63210,206TruSeq Adapter, Index 10 (97% over 49 bp)
GGCTGGTCCGATGGTAGTGGGTTATCAGAACAGATCGGAAGAGCACACGT61490,201No Hit
TGGAGTGTGACAATGGTGTTTGAGATCGGAAGAGCACACGTCTGAACTCC59770,195Illumina Multiplexing PCR Primer 2,01 (100% over 28 bp)
AGATCGGAAGAGCACACGTCTAAACTCCAGTCACTAGCTTATCTCGTATG59020,193TruSeq Adapter, Index 10 (97% over 49 bp)
GGCTGGTCCGAAGGTAGTGAGTTATCTCAATAGATCGGAAGAGCACACGT57450,187No Hit
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGCTTATCTCGTATG54920,179TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGGTTATCTCGTATG51670,169TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAGCTCCAGTCACTAGCTTATCTCGTATG51630,168TruSeq Adapter, Index 10 (97% over 49 bp)
GGCTGGTCCGATGGTAGTGGGTTATCAGAACCAGATCGGAAGAGCACACG50220,164No Hit
AGGTCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG46970,153TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTGGTTATCTCGTATG44470,145TruSeq Adapter, Index 3 (97% over 37 bp)
AGATCGGAGGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG44250,144TruSeq Adapter, Index 10 (97% over 49 bp)
GGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG42490,139TruSeq Adapter, Index 10 (100% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGATTATCTCGTATG42220,138TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCCGAACTCCAGTCACTAGCTTATCTCGTATG41360,135TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCGCACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG40220,131TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTCATCTCGTATG40150,131TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGCCACTAGCTTATCTCGTATG39970,130TruSeq Adapter, Index 10 (97% over 49 bp)
GGCTGGTCCGATGGTAGTGGGTTATCAGAACCAGATCGGAAGAGCACACG50220,164No Hit
AGGTCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG46970,153TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCGGTCACTAGCTTATCTCGTATG39290,128TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGGAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG38350,125TruSeq Adapter, Index 10 (97% over 49 bp)
GGCTGGTCCGATGGTAGTGGGTTATCAGAACTTAGATCGGAAGAGCACAC38180,125No Hit
AGATCGGAAGAGCACACGTCTGAACCCCAGTCACTAGCTTATCTCGTATG37640,123TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGGACTCCAGTCACTAGCTTATCTCGTATG37250,122TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACGCGTCTGAACTCCAGTCACTAGCTTATCTCGTATG36780,120TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGGGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG35990,117TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTACCTCGTATG34770,113TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGTTTATCTCGTATG32750,107TruSeq Adapter, Index 10 (97% over 49 bp)
AAATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG31320,102TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCCTATCTCGTATG31290,102TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAACTTATCTCGTATG31270,102TruSeq Adapter, Index 10 (97% over 49 bp)
AGATCGGAAGAGCACACGCCTGAACTCCAGTCACTAGCTTATCTCGTATG31100,101TruSeq Adapter, Index 10 (97% over 49 bp)
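
Several sequences in Table 2 consist of a short insert followed by the beginning of the adapter quoted in the pre-processing section. The sketch below separates the two parts by exact string matching. It is a hypothetical helper and not the Cutadapt algorithm, which also tolerates mismatches and partial adapter occurrences; the 13-nt adapter prefix used as the search key is an arbitrary choice.

```python
from typing import Tuple

# First 13 nt of the adapter quoted in the pre-processing section (search key: assumption).
ADAPTER_START = "AGATCGGAAGAGC"


def split_insert(read: str, adapter_start: str = ADAPTER_START) -> Tuple[str, str]:
    """Split a read into (insert, adapter part) at the first exact adapter match."""
    position = read.find(adapter_start)
    if position == -1:
        return read, ""
    return read[:position], read[position:]


# Two sequences taken from Table 2.
adapter_only = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATG"
with_insert = "TGGAGTGTGACAATGGTGTTTAGATCGGAAGAGCACACGTCTGAACTCCA"

print(repr(split_insert(adapter_only)[0]))
print(split_insert(with_insert)[0])
```

The first sequence has an empty insert, so it consists of adapter only, whereas the second carries a 21-nt insert in front of the adapter.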

Real-time qPCR validation of selected miRNAs from small RNA-seq

Reverse transcription reactions were performed using the TaqMan miRNA Reverse Transcription kit and miRNA-specific stem-loop primers (Part No. 4366597, Applied Biosystems, Inc., CA, USA) and 100 ng of input cell-free RNA in a 20 μL RT reaction. Real-time PCR reactions were performed in triplicate, in scaled-down 10 μL reaction volumes using 5 μL TaqMan 2x Universal PCR Master Mix with No UNG (Applied Biosystems, Inc., CA, USA), 0.5 μL TaqMan Small RNA assay (20x) (Applied Biosystems, Inc., CA, USA) [hsa-miR-128-3p (002216), hsa-miR-625-3p (002432), hsa-miR-130b-5p (002114), hsa-miR-151a-5p (002642), hsa-miR-330-3p (000544), hsa-miR-323a-3p (002227), hsa-miR-142-3p (000464), hsa-miR-16-5p (000391)], 3.5 μL of nuclease-free water and 1 μL of RT product. Real-time PCR was carried out on an Applied Biosystems 7900HT thermocycler (Applied Biosystems, Inc., CA, USA) programmed as follows: 50 °C for 2 min, 95 °C for 10 min followed by 45 cycles of 95 °C for 15 s and 60 °C for 1 min. To normalize the expression in plasma samples, we used hsa-miR-16-5p (000391), which is one of the most stable miRNAs in terms of read counts and has been used previously as an endogenous control[27]. RNU48 (001006), meanwhile, was used to normalize the expression in cell-line samples. All the fold-change data were obtained using the delta-delta CT method (2^(−ΔΔCT))[28]. The seven differentially expressed miRNAs detected after small RNA-seq were validated using RT-qPCR. All miRNAs were present in plasma at higher levels in patients (n=25) compared to healthy controls (n=25), in agreement with the results obtained by small RNA-seq. Relative expression levels of the miRNAs in plasma from FRDA patients compared to healthy subjects were shown in Seco-Cervera et al.[18]
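
The delta-delta CT calculation can be sketched as follows. The CT values below are hypothetical and serve only to illustrate the arithmetic; the function name is likewise an assumption.

```python
from statistics import mean


def fold_change(
    ct_target_patient: float,
    ct_reference_patient: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Relative expression by the 2^(-delta-delta CT) method."""
    delta_ct_patient = ct_target_patient - ct_reference_patient
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_patient - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicate CT values; the reference would be the endogenous control.
target_patient = mean([27.1, 26.9, 27.0])
reference_patient = mean([20.0, 20.1, 19.9])
target_control = mean([29.0, 29.1, 28.9])
reference_control = mean([20.0, 19.9, 20.1])

print(fold_change(target_patient, reference_patient, target_control, reference_control))
```

With these made-up values the target miRNA is detected two cycles earlier in the patient sample after normalization, which corresponds to a fold change of approximately 4.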

Usage Notes

Before processing the raw reads (Data Citation 1) we explored them visually by looking for the adapter used in the sequencing process. We saw adapters ligated to the 5′ end of some reads and to the 3′ end of other reads. Although the adapter is expected to always be at the 3′ end, the opposite situation can sometimes occur. Therefore, we removed the adapter specifying the -b option in Cutadapt. This option tells the program that the adapter may appear at the beginning (even degraded), within the read, or at the end of the read (even partially). The alignment can be performed using standard tools, such as Bowtie2[29], STAR[30], or Burrows-Wheeler Aligner (BWA)[31]. In our study, we selected the Subread aligner because it has been reported to be more accurate and faster than previous aligners (nearly four times as fast as the nearest competitor, Bowtie2)[21]. Additionally, the parameters needed when mapping miRNA-seq reads have been well documented. On the other hand, although known miRNA sequences from miRBase can be used as a reference, we suggest using the whole human genome. In this way, the reads aligning to miRNA sequences and to many other features in the genome at the same time can be discarded. The quantification of microRNA expression can be performed using tools like bedtools intersect[32] or featureCounts[22]. In this step, it is important to allow multiple hits of each read when mapping, since there are multiple copies of some microRNAs in the genome; if multiple hits are not allowed, the results might be misleading or wrong. Regarding differential expression analysis, we recommend using the popular R packages edgeR[25] and DESeq[33]. When using the edgeR package, it is necessary to filter out miRNAs which are not expressed in any condition, since they can add noise to the analysis. Another important aspect to note is the use of an appropriate method according to the different types of comparisons performed. 
When considering a single study factor, qCML is a good method to estimate the dispersions per miRNA, and the exact test is suitable for the differential expression analysis. However, when two or more study factors are included in the analysis, it is highly recommended to estimate dispersions per miRNA with the CR method and to use the GLM likelihood ratio test for differential expression.
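
The effect of allowing multiple hits during quantification can be illustrated with a toy example. The sketch below is not the featureCounts or bedtools implementation; the miRNA names, coordinates and reads are hypothetical, and intervals are treated as 1-based and inclusive.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (chromosome, start, end, strand), 1-based inclusive coordinates.
Interval = Tuple[str, int, int, str]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and strand and overlap."""
    return a[0] == b[0] and a[3] == b[3] and a[1] <= b[2] and b[1] <= a[2]


def count_mirnas(
    alignments: Dict[str, List[Interval]],
    mirnas: Dict[str, Interval],
    allow_multi: bool = True,
) -> Counter:
    """Count alignments overlapping each miRNA; optionally drop multi-mapping reads."""
    counts: Counter = Counter()
    for hits in alignments.values():
        if not allow_multi and len(hits) > 1:
            continue  # read maps to several locations and is discarded
        for hit in hits:
            for name, feature in mirnas.items():
                if overlaps(hit, feature):
                    counts[name] += 1
    return counts


# Hypothetical miRNA present in two genomic copies.
mirnas = {
    "mir-x_copy1": ("chr1", 100, 121, "+"),
    "mir-x_copy2": ("chr2", 500, 521, "+"),
}
alignments = {
    "read1": [("chr1", 100, 121, "+"), ("chr2", 500, 521, "+")],
    "read2": [("chr1", 101, 121, "+")],
}

print(count_mirnas(alignments, mirnas, allow_multi=True))
print(count_mirnas(alignments, mirnas, allow_multi=False))
```

When multi-mapping reads are discarded, the second copy receives no counts at all, which illustrates how the expression of multi-copy miRNAs can be underestimated.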

Additional information

How to cite this article: Seco-Cervera, M. et al. Small RNA-seq analysis of circulating miRNAs to identify phenotypic variability in Friedreich’s ataxia patients. Sci. Data 5:180021 doi: 10.1038/sdata.2018.21 (2018). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
  28 in total

1.  An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells.

Authors:  S M Hammond; E Bernstein; D Beach; G J Hannon
Journal:  Nature       Date:  2000-03-16       Impact factor: 49.962

2.  Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.

Authors:  K J Livak; T D Schmittgen
Journal:  Methods       Date:  2001-12       Impact factor: 3.608

3.  Glucose metabolism alterations in Friedreich's ataxia.

Authors:  G Finocchiaro; G Baio; P Micossi; G Pozza; S di Donato
Journal:  Neurology       Date:  1988-08       Impact factor: 9.910

4.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

5.  Typical Friedreich's ataxia without GAA expansions and GAA expansion without typical Friedreich's ataxia.

Authors:  D J McCabe; F Ryan; D P Moore; S McQuaid; M D King; A Kelly; K Daly; D E Barton; R P Murphy
Journal:  J Neurol       Date:  2000-05       Impact factor: 4.849

6.  Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors:  Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal:  J Stat Softw       Date:  2010       Impact factor: 6.440

7.  Differential expression analysis for sequence count data.

Authors:  Simon Anders; Wolfgang Huber
Journal:  Genome Biol       Date:  2010-10-27       Impact factor: 13.583

8.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

9.  The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote.

Authors:  Yang Liao; Gordon K Smyth; Wei Shi
Journal:  Nucleic Acids Res       Date:  2013-04-04       Impact factor: 16.971

10.  miRBase: annotating high confidence microRNAs using deep sequencing data.

Authors:  Ana Kozomara; Sam Griffiths-Jones
Journal:  Nucleic Acids Res       Date:  2013-11-25       Impact factor: 16.971
