Kate V Atkinson, Lisa A Bishop, Glenn Rhodes, Nicolas Salez, Neil R McEwan, Matthew J Hegarty, Julie Robey, Nicola Harding, Simon Wetherell, Robert M Lauder, Roger W Pickup, Mark Wilkinson, Derek Gatherer.
Abstract
Nasopharyngeal swabs were taken from volunteers attending a general medical practice and a general hospital in Lancaster, UK, and at Lancaster University, in the winter of 2014-2015. Fifty-one swabs were selected based on high RNA yield and allocated to deep sequencing pools as follows: patients with chronic obstructive pulmonary disease; asthmatics; adults with no respiratory symptoms; adults with feverish respiratory symptoms; adults with respiratory symptoms and presence of antibodies against influenza C; paediatric patients with respiratory symptoms (2 pools); adults with influenza C infection (2 pools), giving a total of 9 pools. Illumina sequencing was performed, with data yields per pool in the range of 345.6 megabases to 14 gigabases after removal of reads aligning to the human genome. The data were deposited in the Sequence Read Archive at NCBI, and constitute a resource for study of the viral, bacterial and fungal metagenome of the human nasopharynx in healthy and diseased states and comparison with other metagenomic studies on the human respiratory tract.
Year: 2017 PMID: 29064471 PMCID: PMC5654362 DOI: 10.1038/sdata.2017.161
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Deep sequencing pools A-I, their sources and references.
The number of bases per sequencing pool is given in gigabases (G), or in megabases (M) for those pools generating less than 1 G, after removal of reads of human genome origin. COPD: chronic obstructive pulmonary disease. IgG: immunoglobulin G.

| Pool | BioSample | Source | Symptoms / condition | Swabs (n) | SRA experiment | SRA sample | SRA run | Bases |
|---|---|---|---|---|---|---|---|---|
| A | SAMN05954283 | Paediatric, Low RNA | Coryza | 3 | SRX2310763 | SRS1768710 | SRR4733499 | 1.1 G |
| B | SAMN05954284 | Paediatric, High RNA | Coryza | 2 | SRX2310764 | SRS1768711 | SRR4733500 | 964.8 M |
| C | SAMN05954285 | Adult, High fluC IgG | Coryza | 10 | SRX2310765 | SRS1768712 | SRR4733501 | 345.6 M |
| D | SAMN05954286 | Adult | Fever | 8 | SRX2310766 | SRS1768713 | SRR4733502 | 714.3 M |
| E | SAMN05954287 | Adult | Asymptomatic | 10 | SRX2310759 | SRS1768706 | SRR4733495 | 909.7 M |
| F | SAMN05954288 | Adult | Asthmatic | 10 | SRX2310760 | SRS1768707 | SRR4733496 | 477.5 M |
| G | SAMN05954289 | Adult | COPD | 6 | SRX2310761 | SRS1768708 | SRR4733497 | 1.5 G |
| H | SAMN05954290 | Adult | Influenza C positive | 1 | SRX2310762 | SRS1768709 | SRR4733498 | 14 G |
| I | SAMN05954291 | Adult | Influenza C positive | 1 | SRX2310758 | SRS1768705 | SRR4733494 | 1.2 G |
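
The mixed G/M units in the pool table make the yields awkward to compare directly. The following is a minimal sketch, not part of the original data descriptor, that converts the reported yields to plain base counts and checks them against the figures quoted in the abstract (51 swabs, range 345.6 megabases to 14 gigabases). The names `POOLS`, `UNIT` and `to_bases` are illustrative assumptions; the numbers are copied from the table.

```python
# Swab counts and post-human-depletion yields, copied from the pool table.
# POOLS, UNIT and to_bases are illustrative names, not from the source.
POOLS = {
    "A": (3, "1.1 G"),
    "B": (2, "964.8 M"),
    "C": (10, "345.6 M"),
    "D": (8, "714.3 M"),
    "E": (10, "909.7 M"),
    "F": (10, "477.5 M"),
    "G": (6, "1.5 G"),
    "H": (1, "14 G"),
    "I": (1, "1.2 G"),
}

UNIT = {"M": 10**6, "G": 10**9}  # megabases and gigabases


def to_bases(text: str) -> int:
    """Convert a yield such as '345.6 M' or '14 G' to a number of bases."""
    value, unit = text.split()
    return round(float(value) * UNIT[unit])


yields = {pool: to_bases(reported) for pool, (_, reported) in POOLS.items()}
total_swabs = sum(n for n, _ in POOLS.values())
smallest = min(yields, key=yields.get)
largest = max(yields, key=yields.get)

print(f"swabs pooled: {total_swabs}")
print(f"smallest pool: {smallest} ({yields[smallest]:,} bases)")
print(f"largest pool: {largest} ({yields[largest]:,} bases)")
```

The swab counts sum to the 51 swabs stated in the abstract, and the smallest and largest pools (C and H) reproduce the quoted yield range.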
Figure 1. Clinical flowchart.
From 148 nasopharyngeal swabs, 51 were chosen for allocation to 7 symptom groups, 2 of which were each split across two separate runs, making a total of 9 deep sequencing pools.
Figure 2. Read processing flowchart.
The raw reads were cleaned and then subjected to sequential alignments to 3 versions of the human genome, with mapped reads discarded at each stage. The software used at each stage is shown.
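
The sequential filtering described in this caption can be sketched as a loop in which each stage sees only the reads that survived the previous one. This is a toy illustration under stated assumptions, not the authors' pipeline: the text here does not name the software, so `make_toy_aligner` is a hypothetical stand-in that treats a read as "mapped" when it is an exact substring of a short made-up reference string, and the labels `human_v1` to `human_v3` are placeholders for the three genome versions.

```python
from typing import Callable, Dict, List, Set, Tuple

Reads = Dict[str, str]                 # read ID -> sequence
Aligner = Callable[[Reads], Set[str]]  # returns the IDs of reads that mapped


def make_toy_aligner(reference: str) -> Aligner:
    """Hypothetical stand-in for a real aligner: exact substring matching."""
    def align(reads: Reads) -> Set[str]:
        return {rid for rid, seq in reads.items() if seq in reference}
    return align


def deplete_host(
    reads: Reads, stages: List[Tuple[str, Aligner]]
) -> Tuple[Reads, List[Tuple[str, int]]]:
    """Run the stages in order, discarding mapped reads after each one."""
    remaining = dict(reads)
    discarded = []
    for name, align in stages:
        mapped = align(remaining)  # only reads that survived earlier stages
        remaining = {r: s for r, s in remaining.items() if r not in mapped}
        discarded.append((name, len(mapped)))
    return remaining, discarded


# Made-up reference strings and reads, for illustration only.
stages = [
    ("human_v1", make_toy_aligner("ACGTACGTTTGACC")),
    ("human_v2", make_toy_aligner("GGGCCCAAATTT")),
    ("human_v3", make_toy_aligner("TTTTGGGGCCCC")),
]
reads = {
    "r1": "ACGTAC",    # matches human_v1
    "r2": "CCCAAA",    # matches human_v2
    "r3": "GGGGCC",    # matches human_v3
    "r4": "ATATATAT",  # matches nothing, kept
    "r5": "CGCGAGAG",  # matches nothing, kept
}

remaining, discarded = deplete_host(reads, stages)
print(sorted(remaining))
print(discarded)
```

Running the stages sequentially rather than against a single merged reference means a read missed by one genome version can still be caught by a later one, which is the effect the flowchart describes.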
Proportion of common reagent contaminants per sequencing pool.
The pools are labelled A-I as in the table of deep sequencing pools above.

| Pool | | | |
|---|---|---|---|
| A | 21 | 0.3 | 0.6 |
| B | 14 | 0.2 | 0.5 |
| C | 3.0 | 0.1 | 0.1 |
| D | 20 | 0.4 | 0.7 |
| E | 12 | 0.6 | 1.2 |
| F | 0.9 | 0.1 | 0.2 |
| G | 9.1 | 0.7 | 1.0 |
| H | 16 | 0.2 | 0.5 |
| I | 12 | 0.2 | 0.5 |