Abstract
The NCBI Sequence Read Archive (SRA) is the primary archive of next-generation sequencing datasets. SRA makes metadata and raw sequencing data available to the research community to encourage reproducibility and to provide avenues for testing novel hypotheses on publicly available data. However, methods to programmatically access these data are limited. We introduce pysradb, a Python package that provides a collection of command line methods to query and download metadata and data from SRA, utilizing the curated metadata database available through the SRAdb project. We demonstrate the utility of pysradb on multiple use cases for searching and downloading SRA datasets. It is freely available at https://github.com/saketkc/pysradb.
Keywords: GEO; NCBI; NGS; SRA; bioinformatics; metadata
Year: 2019 PMID: 31114675 PMCID: PMC6505635 DOI: 10.12688/f1000research.18676.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
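The abstract describes pysradb as a collection of command line methods for querying SRA metadata. A minimal sketch of wrapping one such call from Python follows. The `metadata` subcommand and the `--detailed` and `--expand` flags are written here as assumptions about the command line interface and should be checked against `pysradb metadata --help`; the helper names `metadata_command` and `fetch_metadata` are hypothetical.

```python
import shutil
import subprocess


def metadata_command(accession, detailed=True, expand=False):
    """Build the argument list for a `pysradb metadata` call (flags are assumptions)."""
    command = ["pysradb", "metadata", accession]
    if detailed:
        command.append("--detailed")
    if expand:
        command.append("--expand")
    return command


def fetch_metadata(accession):
    """Run the command if pysradb is on PATH; return its stdout, or None otherwise."""
    if shutil.which("pysradb") is None:
        return None
    result = subprocess.run(
        metadata_command(accession),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only builds the command; nothing is executed until fetch_metadata is called.
command = metadata_command("SRP010679", expand=True)
```

Keeping command construction separate from execution makes the call easy to inspect or log before any network access happens.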
| study_accession | experiment_accession | sample_accession | run_accession |
| DRP003075 | DRX019536 | DRS026974 | DRR021383 |
| DRP003075 | DRX019537 | DRS026982 | DRR021384 |
| DRP003075 | DRX019538 | DRS026979 | DRR021385 |
| DRP003075 | DRX019540 | DRS026984 | DRR021387 |
| DRP003075 | DRX019541 | DRS026978 | DRR021388 |
| DRP003075 | DRX019543 | DRS026980 | DRR021390 |
| DRP003075 | DRX019544 | DRS026981 | DRR021391 |
| ERP013565 | ERX1264364 | ERS1016056 | ERR1190989 |
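The accessions in the table above follow the shared prefix convention of the three partner archives: the first letter identifies the archive (S for NCBI, E for EBI, D for DDBJ) and the third letter identifies the record level (P for study, X for experiment, S for sample, R for run). A minimal sketch of decoding that convention is given below; the function name `classify_accession` and the archive labels are illustrative choices, not part of pysradb.

```python
import re

# First letter: archive that issued the accession.
ARCHIVE = {"S": "NCBI SRA", "E": "EBI ENA", "D": "DDBJ DRA"}
# Third letter: level of the record in the study/experiment/sample/run hierarchy.
LEVEL = {"P": "study", "X": "experiment", "S": "sample", "R": "run"}

ACCESSION_RE = re.compile(r"^([SED])R([PXSR])(\d+)$")


def classify_accession(accession):
    """Return (archive, level) for an SRA-style accession, or None if it does not match."""
    match = ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    archive_code, level_code, _digits = match.groups()
    return ARCHIVE[archive_code], LEVEL[level_code]


# First data row of the table above.
row = ["DRP003075", "DRX019536", "DRS026974", "DRR021383"]
levels = [classify_accession(accession)[1] for accession in row]
```

GEO identifiers such as `GSE24355` do not match this pattern, so the function returns `None` for them.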
| study_accession | experiment_accession | sample_accession | run_accession |
| SRP010679 | SRX118285 | SRS290854 | SRR403882 |
| SRP010679 | SRX118286 | SRS290855 | SRR403883 |
| SRP010679 | SRX118287 | SRS290856 | SRR403884 |
| SRP010679 | SRX118288 | SRS290857 | SRR403885 |
| SRP010679 | SRX118289 | SRS290858 | SRR403886 |
| SRP010679 | SRX118290 | SRS290859 | SRR403887 |
| SRP010679 | SRX118291 | SRS290860 | SRR403888 |
| SRP010679 | SRX118292 | SRS290861 | SRR403889 |
| SRP010679 | SRX118293 | SRS290862 | SRR403890 |
| SRP010679 | SRX118294 | SRS290863 | SRR403891 |
| SRP010679 | SRX118295 | SRS290864 | SRR403892 |
| SRP010679 | SRX118296 | SRS290865 | SRR403893 |
| study_accession | experiment_accession | sample_accession | run_accession | sample_attribute |
| SRP010679 | SRX118285 | SRS290854 | SRR403882 | source_name: PC3 human |
| SRP010679 | SRX118286 | SRS290855 | SRR403883 | source_name: PC3 human |
| SRP010679 | SRX118287 | SRS290856 | SRR403884 | source_name: PC3 human |
| SRP010679 | SRX118288 | SRS290857 | SRR403885 | source_name: PC3 human |
| ... [truncated] | | | | |
| run_accession | cell_line | sample_type | source_name | treatment |
| SRR403882 | pc3 | polya rna | pc3 human prostate cancer cells | vehicle |
| SRR403883 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | vehicle |
| SRR403884 | pc3 | polya rna | pc3 human prostate cancer cells | rapamycin |
| SRR403885 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | rapamycin |
| SRR403886 | pc3 | polya rna | pc3 human prostate cancer cells | pp242 |
| SRR403887 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | pp242 |
| SRR403888 | pc3 | polya rna | pc3 human prostate cancer cells | vehicle |
| SRR403889 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | vehicle |
| SRR403890 | pc3 | polya rna | pc3 human prostate cancer cells | rapamycin |
| SRR403891 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | rapamycin |
| SRR403892 | pc3 | polya rna | pc3 human prostate cancer cells | pp242 |
| SRR403893 | pc3 | ribosome protected rna | pc3 human prostate cancer cells | pp242 |
| count | library_strategy |
| 999 | Bisulfite-Seq |
| 768 | ChIP-Seq |
| 121 | OTHER |
| 353 | RNA-Seq |
| 28 | WGS |
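The two-column table above appears to tally records per library strategy. A sketch of producing such a tally from a metadata table is shown below. The input rows are made-up placeholders, not data from the source, and `tally_by_strategy` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical (accession, library_strategy) rows; placeholders, not from the source.
rows = [
    ("RUN0001", "RNA-Seq"),
    ("RUN0002", "ChIP-Seq"),
    ("RUN0003", "RNA-Seq"),
    ("RUN0004", "WGS"),
]


def tally_by_strategy(records):
    """Count records per library strategy, sorted by strategy name."""
    counts = Counter(strategy for _accession, strategy in records)
    return sorted(counts.items())


tally = tally_by_strategy(rows)
for strategy, count in tally:
    # Same layout as the table above: count first, then strategy.
    print(f"| {count} | {strategy} |")
```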
| study_alias | study_accession |
| GSE24355 | SRP003870 |
| GSE25842 | SRP005378 |
| study_alias | study_accession | experiment_accession | sample_accession | experiment_alias | sample_alias |
| GSE100007 | SRP109126 | SRX2916198 | SRS2282390 | GSM2667747 | GSM2667747 |
| GSE100007 | SRP109126 | SRX2916199 | SRS2282391 | GSM2667748 | GSM2667748 |
| GSE100007 | SRP109126 | SRX2916200 | SRS2282392 | GSM2667749 | GSM2667749 |
| GSE100007 | SRP109126 | SRX2916201 | SRS2282393 | GSM2667750 | GSM2667750 |
| GSE100007 | SRP109126 | SRX2916202 | SRS2282394 | GSM2667751 | GSM2667751 |
| GSE100007 | SRP109126 | SRX2916203 | SRS2282395 | GSM2667752 | GSM2667752 |
| GSE100007 | SRP109126 | SRX2916204 | SRS2282396 | GSM2667753 | GSM2667753 |
| GSE100007 | SRP109126 | SRX2916205 | SRS2282397 | GSM2667754 | GSM2667754 |
| GSE100007 | SRP109126 | SRX2916206 | SRS2282400 | GSM2667755 | GSM2667755 |
| study_alias | experiment_alias |
| GSE41637 | GSM1020640_1 |
| GSE41637 | GSM1020641_1 |
| GSE41637 | GSM1020642_1 |
| GSE41637 | GSM1020643_1 |
| GSE41637 | GSM1020644_1 |
| GSE41637 | GSM1020645_1 |
| GSE41637 | GSM1020646_1 |
| GSE41637 | GSM1020647_1 |
| GSE41637 | GSM1020648_1 |
| study_alias | experiment_alias | sample_attribute |
| GSE41637 | GSM1020640_1 | source_name: mouse_brain || strain: DBA/2J || tissue: brain |
| GSE41637 | GSM1020641_1 | source_name: mouse_colon || strain: DBA/2J || tissue: colon |
| GSE41637 | GSM1020642_1 | source_name: mouse_heart || strain: DBA/2J || tissue: heart |
| GSE41637 | GSM1020643_1 | source_name: mouse_kidney || strain: DBA/2J || tissue: kidney |
| GSE41637 | GSM1020644_1 | source_name: mouse_liver || strain: DBA/2J || tissue: liver |
| GSE41637 | GSM1020645_1 | source_name: mouse_lung || strain: DBA/2J || tissue: lung |
| GSE41637 | GSM1020646_1 | source_name: mouse_skm || strain: DBA/2J || tissue: skeletal muscle |
| GSE41637 | GSM1020647_1 | source_name: mouse_spleen || strain: DBA/2J || tissue: spleen |
| GSE41637 | GSM1020648_1 | source_name: mouse_testes || strain: DBA/2J || tissue: testes |
| study_alias | experiment_alias | source_name | strain | tissue |
| GSE41637 | GSM1020640_1 | mouse_brain | dba/2j | brain |
| GSE41637 | GSM1020641_1 | mouse_colon | dba/2j | colon |
| GSE41637 | GSM1020642_1 | mouse_heart | dba/2j | heart |
| GSE41637 | GSM1020643_1 | mouse_kidney | dba/2j | kidney |
| GSE41637 | GSM1020644_1 | mouse_liver | dba/2j | liver |
| GSE41637 | GSM1020645_1 | mouse_lung | dba/2j | lung |
| GSE41637 | GSM1020646_1 | mouse_skm | dba/2j | skeletal muscle |
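The expanded table above can be derived from the `sample_attribute` strings by splitting on `||`, splitting each piece at the first colon, and lowercasing the values. A minimal sketch of that transformation follows, assuming the `key: value || key: value` layout shown in the earlier table; `expand_sample_attribute` is a hypothetical helper and may not match how pysradb implements the step.

```python
def expand_sample_attribute(attribute):
    """Split 'key: value || key: value' into a dict with lowercased values."""
    fields = {}
    for part in attribute.split("||"):
        key, separator, value = part.partition(":")
        if not separator:
            continue  # skip fragments that lack a 'key: value' shape
        fields[key.strip()] = value.strip().lower()
    return fields


example = "source_name: mouse_brain || strain: DBA/2J || tissue: brain"
expanded = expand_sample_attribute(example)
```

Splitting at the first colon only keeps values that themselves contain a colon intact.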
| experiment_alias | run_accession |
| GSM1020640_1 | SRR594393 |
| GSM1020646_1 | SRR594399 |
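Combining the expanded attributes with the experiment-to-run mapping above makes it possible to pick runs by tissue. The sketch below joins the two tables on `experiment_alias` using plain dictionaries; the values are copied from the rows shown above, and `runs_for_tissue` is a hypothetical helper.

```python
# experiment_alias -> tissue, copied from the expanded attribute table.
tissue_by_experiment = {
    "GSM1020640_1": "brain",
    "GSM1020646_1": "skeletal muscle",
}

# experiment_alias -> run_accession, copied from the final table.
run_by_experiment = {
    "GSM1020640_1": "SRR594393",
    "GSM1020646_1": "SRR594399",
}


def runs_for_tissue(tissue):
    """Return run accessions whose experiment is annotated with the given tissue."""
    return [
        run_by_experiment[experiment]
        for experiment, annotated in tissue_by_experiment.items()
        if annotated == tissue and experiment in run_by_experiment
    ]


brain_runs = runs_for_tissue("brain")
```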