Albert Geskin, Elizabeth Legowski, Anish Chakka, Uma R Chandran, M Michael Barmada, William A LaFramboise, Jeremy Berg, Rebecca S Jacobson.
Abstract
Next Generation Sequencing (NGS) methods are driving profound changes in biomedical research, with a growing impact on patient care. Many academic medical centers are evaluating potential models to prepare for the rapid increase in NGS information needs. This study sought to investigate (1) how and where sequencing data is generated and analyzed, (2) research objectives and goals for NGS, (3) workforce capacity and unmet needs, (4) storage capacity and unmet needs, (5) available and anticipated funding resources, and (6) future challenges. As a precursor to informed decision making at our institution, we undertook a systematic needs assessment of investigators using survey methods. We recruited 331 investigators from over 60 departments and divisions at the University of Pittsburgh Schools of Health Sciences and had 140 respondents, or a 42% response rate. Results suggest that both sequencing and analysis bottlenecks currently exist. Significant educational needs were identified, including both investigator-focused needs, such as selection of NGS methods suitable for specific research objectives, and program-focused needs, such as support for training an analytic workforce. The absence of centralized infrastructure was identified as an important institutional gap. Key principles for organizations managing this change were formulated based on the survey responses. This needs assessment provides an in-depth case study which may be useful to other academic medical centers as they identify and plan for future needs.
Year: 2015 PMID: 26115441 PMCID: PMC4483235 DOI: 10.1371/journal.pone.0131166
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Number of respondents by use of NGS.
| Label | Description | n (%) |
|---|---|---|
| Currently using NGS data OR plan to in next 2 years | | |
| Not currently using NGS data and don’t plan to in next 2 years | | |
Publicly available NGS datasets used for analysis (Current Users).
| Dataset | n (%) |
|---|---|
| TCGA | 11/32 (34%) |
| NCBI | 2/32 (6%) |
| SRA | 2/32 (6%) |
| 1000 Genomes | 2/32 (6%) |
| Numerous | 6/32 (19%) |
| Other | 9/32 (28%) |
| Not specified | 1/32 (3%) |
Reasons respondents are unable to sequence samples and analyze sequences (All Users).
| Category | Reason Cited by Respondent | n (%) |
|---|---|---|
| Unable to sequence samples | | |
| | Cost/limited funds | 20/31 (65%) |
| | Resources at university | 7/31 (23%) |
| | Time/waiting for results | 2/31 (6%) |
| | Other | 2/31 (6%) |
| Unable to analyze sequences | | |
| | Lack of Expertise | 8/28 (29%) |
| | Time | 6/28 (21%) |
| | Lack of help/support | 3/28 (11%) |
| | Lack of resources | 3/28 (11%) |
| | Funding | 2/28 (7%) |
| | Too much data | 2/28 (7%) |
| | Not complete dataset/recent acquisition of data | 2/28 (7%) |
| | Low throughput by collaborator | 1/28 (4%) |
| | Ongoing | 1/28 (4%) |
Research objectives of survey respondents (All Users).
| Research Objectives | n (%) |
|---|---|
| Cancer disease-specific variants, structural variation, or copy-number changes | 38/104 (37%) |
| Non-cancer disease-specific variants, structural variation, or copy-number changes | 41/104 (39%) |
| Population biology | 8/104 (8%) |
| Evolutionary biology | 8/104 (8%) |
| Metagenomics | 6/104 (6%) |
| DNA modification | 10/104 (10%) |
| Protein-DNA binding | 18/104 (17%) |
| Discovery of novel transcripts (gene discovery) | 22/104 (21%) |
| Discovery of novel splice forms | 13/104 (13%) |
| Small RNA discovery | 18/104 (17%) |
| Gene expression | 56/104 (54%) |
| Systems modeling and prediction | 23/104 (22%) |
| Other | 17/104 (16%) |
Applications and platforms/methods identified as best to suit objectives (All Users).
| Category | Applications and Platforms/Methods | n (%) |
|---|---|---|
| Applications | | |
| | Targeted sequencing (Ampli-Seq or Target Seq) | 43/104 (41%) |
| | Whole exome sequencing | 38/104 (37%) |
| | Whole genome sequencing | 41/104 (39%) |
| | RNAseq for gene expression | 68/104 (65%) |
| | RNAseq for intron splice junctions (novel RNA discovery) | 14/104 (13%) |
| | RNAseq for miRNA | 28/104 (27%) |
| | MethylSeq | 22/104 (21%) |
| | ChIPSeq | 28/104 (27%) |
| | Not sure | 9/104 (9%) |
| | Other | 8/104 (8%) |
| Platforms/Methods | | |
| | Ion semiconductor (Ion Torrent sequencing) | 29/104 (28%) |
| | Pyrosequencing (Roche 454) | 13/104 (13%) |
| | Sequencing by synthesis (Illumina: HiSeq or MiSeq) | 49/104 (47%) |
| | Sequencing by ligation (Life SOLiD sequencing) | 7/104 (7%) |
| | Chain termination (Sanger sequencing) | 11/104 (11%) |
| | Not sure | 47/104 (45%) |
| | Other | 8/104 (8%) |
Characteristics of Next Generation Sequencing workforce.
| Category | Training and Skills | n (%) |
|---|---|---|
| | | |
| | Entirely self-taught | 8/20 (40%) |
| | Bioinformatics short course | 7/20 (35%) |
| | Masters in bioinformatics, computational biology, computer science, or a related field | 8/20 (40%) |
| | PhD in bioinformatics, computational biology, computer science, or a related field | 10/20 (50%) |
| | | |
| | Unix and shell scripting | 24/42 (57%) |
| | Object oriented programming | 15/42 (36%) |
| | Database development and management | 15/42 (36%) |
| | Statistical programming | 22/42 (52%) |
| | Not sure | 13/42 (31%) |
| | | |
| | Unix and shell scripting | 15/23 (65%) |
| | Object oriented programming | 16/23 (70%) |
| | Database development and management | 13/23 (57%) |
| | Statistical programming | 17/23 (74%) |
| | Other (genetic and medical models) | 1/23 (4%) |
Current and future storage methods for NGS data (Current Users).
| Category | Storage Method | n (%) |
|---|---|---|
| Current | | |
| | External hard drive | 40/65 (62%) |
| | Servers (total) | 44/65 (68%) |
| | Servers in lab | 26/65 (40%) |
| | Servers outside lab | 27/65 (42%) |
| | Cloud storage | 6/65 (9%) |
| Future | | |
| | External hard drive | 53/65 (82%) |
| | Servers (total) | 47/64 (73%) |
| | Servers in lab | 39/65 (60%) |
| | Servers outside lab | 23/64 (36%) |
| | Cloud storage | 22/65 (34%) |
Funding allotted per year for Next Generation Sequencing (Current Users).
| Category | Funding | n (%) Past 3 Years | n (%) Next 3 Years |
|---|---|---|---|
| | | | |
| | None | 15/68 (22%) | 6/68 (9%) |
| | Less than $10,000 | 11/68 (16%) | 6/68 (9%) |
| | $10,000-$49,999 | 28/68 (41%) | 35/68 (51%) |
| | $50,000-$99,999 | 7/68 (10%) | 13/68 (19%) |
| | $100,000-$250,000 | 5/68 (7%) | 6/68 (9%) |
| | More than $250,000 | 2/68 (3%) | 2/68 (3%) |
| | | | |
| | None | 12/68 (18%) | 5/68 (7%) |
| | Less than $10,000 | 30/68 (44%) | 22/68 (32%) |
| | $10,000-$49,999 | 16/68 (24%) | 22/68 (32%) |
| | $50,000-$99,999 | 5/68 (7%) | 7/68 (10%) |
| | $100,000-$250,000 | 3/68 (4%) | 10/68 (15%) |
| | More than $250,000 | 2/68 (3%) | 2/68 (3%) |
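The n (%) cells in these tables can be reproduced from the raw counts. The sketch below is not part of the original study; it assumes percentages are rounded to the nearest whole number, and the helper name `percent_cell` is hypothetical. It uses the counts from the first group's "Past 3 Years" column of the funding table.

```python
def percent_cell(count: int, total: int) -> str:
    """Format a count as 'count/total (pct%)' with a whole-number percentage.

    Assumption: the tables round to the nearest whole percent.
    """
    return f"{count}/{total} ({round(100 * count / total)}%)"


# Counts copied from the first group's "Past 3 Years" column, in table order.
past_counts = {
    "None": 15,
    "Less than $10,000": 11,
    "$10,000-$49,999": 28,
    "$50,000-$99,999": 7,
    "$100,000-$250,000": 5,
    "More than $250,000": 2,
}

# The funding bands are mutually exclusive, so the counts should sum
# to the number of respondents who answered the question.
total = sum(past_counts.values())

cells = {label: percent_cell(n, total) for label, n in past_counts.items()}

for label, cell in cells.items():
    print(f"{label}: {cell}")
```

Because the bands do not overlap, the counts in a column summing to its denominator is a quick consistency check on any transcribed row. The same check does not apply to the multi-select tables, where a respondent can appear in several rows.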
Challenges to use of Next Generation Sequencing (All Users).
| Category | Challenges | Average Difficulty |
|---|---|---|
| | | |
| | Cost | 3.7 |
| | Finding a person to perform the analysis | 3.6 |
| | Access to computing power to perform the analysis | 3.4 |
| | Rapidly changing tools | 3.3 |
| | Management of the data | 3.2 |
| | Availability of storage space | 3.2 |
| | Lack of standardization of data formats | 3.1 |
| | Data transfer (networking) | 3.1 |
| | Difficulty of using open source software | 3.1 |
| | Compliance with regulations and policies | 2.8 |
| | Access control/security | 2.7 |
| | Other | 1.0 |
| | | |
| | Data analysis and construction | 3.5 |
| | Moving the data along the workflow | 3.1 |
| | Storage | 3.0 |
| | Sharing the data with collaborators | 2.7 |
| | Sequencing | 2.6 |
| | Sample prep or library construction | 2.4 |
| | | |
| | Data transfer issues | 3.4 |
| | Cost | 3.3 |
| | Security | 3.2 |
| | Knowledge | 3.2 |
| | Availability | 3.0 |
| | Not advanced enough | 2.9 |
| | Other | 2.3 |