Yu-Feng Huang1, Yen-Ju Chen1, Tan-Chi Fan2, Nai-Chuan Chang2, Yi-Jie Chen1, Mohit K Midha1,3, Tzu-Han Chen1, Hsiao-Hsiang Yang1, Yu-Tai Wang4, Alice L Yu2,5, Kuo-Ping Chiu6,7.
Abstract
BACKGROUND: Cell-free circulating DNA (cfDNA) is becoming a useful biopsy for noninvasive diagnosis of diseases. Microbial sequences in plasma cfDNA may provide important information to improve prognosis and treatment. We have developed a stringent method to identify microbial species via microbial cfDNA in the blood plasma of early-onset breast cancer (EOBC) patients and healthy females. Microbe-derived sequence reads were identified by mapping non-human paired-end (PE) reads from cfDNA libraries to microbial databases. Reads that mapped concordantly to a unique microbial species were assembled into contigs, which were then aligned to the same databases. Microbial species with uniquely aligned contigs were identified and compared across all individuals on an MCRPM (Microbial CfDNA Reads Per Million quality PE reads) basis.
Keywords: Cell-free circulating DNA (cfDNA); Microbial cfDNA; Microbial cfDNA reads per million quality PE reads (MCRPM)
Year: 2018 PMID: 29504912 PMCID: PMC5836824 DOI: 10.1186/s12920-018-0329-y
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
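The filtering step described in the abstract, keeping only read pairs that map concordantly to a single microbial species, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the input format (an iterable of `(pair_id, species)` tuples taken from concordant alignments) and the function name `uniquely_mapped_pairs` are assumptions made for the example.

```python
from collections import defaultdict


def uniquely_mapped_pairs(hits):
    """Return {pair_id: species} for read pairs that hit exactly one species.

    `hits` is an iterable of (pair_id, species) tuples, one per concordant
    alignment. This input layout is hypothetical, chosen for the example.
    """
    species_by_pair = defaultdict(set)
    for pair_id, species in hits:
        species_by_pair[pair_id].add(species)
    # A pair is kept only if all of its alignments point to the same species.
    return {
        pair_id: next(iter(species))
        for pair_id, species in species_by_pair.items()
        if len(species) == 1
    }


hits = [
    ("pair1", "species_A"),
    ("pair1", "species_A"),  # second hit, same species: still unique
    ("pair2", "species_A"),
    ("pair2", "species_B"),  # hits two species: ambiguous, dropped
    ("pair3", "species_C"),
]
unique = uniquely_mapped_pairs(hits)
print(unique)
```

Here `pair2` is discarded because it maps to two species, which mirrors the stringency the abstract describes.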
Microbial databases employed in the study
| | Bacteria | Fungi | Viruses |
|---|---|---|---|
| #Contigs | 39,434 | 20 | 0 |
| #Scaffold | 36,076 | 170 | 3 |
| #Chromosome | 978 | 38 | 22 |
| #Complete genome | 6,711 | 7 | 7,175 |
| #Subtotal | 83,199 | 235 | 7,200 |
| #Species | 7,689 | 45 | 7,197 |
| #Sequences | 15,849 | 913 | 9,050 |
| Total #, after plasmid sequences excluded | 9,336 | 913 | 9,050 |
RefSeq genome: April 6, 2017
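The last row of the table counts sequences after plasmid sequences were excluded. The source does not say how plasmid records were recognized, so the sketch below simply assumes they can be identified by the word "plasmid" in the FASTA header line. The function name and the toy records are hypothetical.

```python
def drop_plasmid_records(fasta_lines):
    """Yield FASTA lines, skipping every record whose header mentions 'plasmid'."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: plasmid records are recognizable by the word
            # "plasmid" in the header line.
            keep = "plasmid" not in line.lower()
        if keep:
            yield line


# Made-up records, for illustration only.
toy_fasta = [
    ">seq1 Example bacterium chromosome, complete genome",
    "ACGTACGT",
    ">seq2 Example bacterium plasmid pEX1, complete sequence",
    "GGGTTTAA",
    ">seq3 Another bacterium chromosome",
    "TTGACA",
]
kept = list(drop_plasmid_records(toy_fasta))
n_kept = sum(1 for line in kept if line.startswith(">"))
print(n_kept)
```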
Library statistics
| Library | BBC (normal) | EJC (normal) | BC0145 (EOBC) | BC0190 (EOBC) | CGBC025 (EOBC) |
|---|---|---|---|---|---|
| Raw PE reads | 384,623,309 | 420,790,943 | 392,322,204 | 434,928,103 | 628,712,403 |
| Quality PE reads | 371,837,085 | 388,551,037 | 375,255,962 | 426,187,686 | 579,192,276 |
| Normalization factor | 372 | 389 | 375 | 426 | 579 |
| hg19-mapped reads (%) | 351,841,885 (94.62%) | 376,097,300 (96.79%) | 345,714,693 (92.13%) | 409,083,110 (95.99%) | 551,181,421 (95.16%) |
| hg19-unmappable PE reads (%) | 19,995,200 (5.38%) | 12,453,737 (3.21%) | 29,541,269 (7.87%) | 17,104,576 (4.01%) | 28,010,855 (4.84%) |
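The derived rows of this table can be recomputed from the read counts. The sketch below assumes that the normalization factor is the number of quality PE reads in millions, rounded to the nearest integer, and that percentages are taken relative to quality PE reads. Both assumptions reproduce the tabulated values, but the source does not state the rounding rule explicitly.

```python
# Read counts copied from the table above.
quality_pe_reads = {
    "BBC": 371_837_085,
    "EJC": 388_551_037,
    "BC0145": 375_255_962,
    "BC0190": 426_187_686,
    "CGBC025": 579_192_276,
}
hg19_unmappable = {
    "BBC": 19_995_200,
    "EJC": 12_453_737,
    "BC0145": 29_541_269,
    "BC0190": 17_104_576,
    "CGBC025": 28_010_855,
}


def normalization_factor(n_quality_reads):
    """Quality PE reads expressed in millions, rounded to an integer (assumed rule)."""
    return round(n_quality_reads / 1_000_000)


def percent(part, whole):
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100 * part / whole, 2)


factors = {lib: normalization_factor(n) for lib, n in quality_pe_reads.items()}
unmappable_pct = {
    lib: percent(hg19_unmappable[lib], quality_pe_reads[lib])
    for lib in quality_pe_reads
}
print(factors)
print(unmappable_pct)
```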
Statistics of contigs and alignment
| | | BBC | EJC | BC0145 | BC0190 | CGBC025 |
|---|---|---|---|---|---|---|
| hg19-unmapped PE reads | | 19,995,200 | 12,453,737 | 29,541,269 | 17,104,576 | 28,010,855 |
| Bacteria | Mapped PE reads (%) | 15,504 (0.08%) | 31,653 (0.25%) | 1,432,423 (4.85%) | 50,310 (0.29%) | 45,858 (0.16%) |
| Fungi | Mapped PE reads (%) | 451 (0.00%) | 456 (0.00%) | 590 (0.00%) | 1,153 (0.01%) | 996 (0.00%) |
| Viruses/Phages | Mapped PE reads (%) | 1,367 (0.01%) | 677 (0.01%) | 1,528 (0.01%) | 26,749 (0.16%) | 1,267 (0.00%) |
| Bacteria | #Contigs | 894 | 1,495 | 7,971 | 2,609 | 2,588 |
| | Max contig length | 900 | 2,287 | 16,248 | 2,035 | 3,911 |
| | Min contig length | 64 | 64 | 64 | 64 | 64 |
| | Median contig length | 211 | 227 | 231 | 214 | 220 |
| | N50 | 215 | 266 | 1,616 | 219 | 237 |
| | #Contigs w/ size ≥250 bp (%) | 80 (9.0%) | 558 (37.3%) | 3,602 (45.2%) | 333 (12.8%) | 748 (28.9%) |
| | #Aligned contigs (also see Table | 54 | 381 | 2,456 | 225 | 451 |
| Fungi | #Contigs | 44 | 38 | 215 | 92 | 78 |
| | Max contig length | 315 | 235 | 352 | 427 | 262 |
| | Min contig length | 65 | 64 | 64 | 64 | 64 |
| | Median contig length | 71 | 84 | 71 | 84 | 73.5 |
| | N50 | 99 | 192 | 78 | 192 | 127 |
| | #Contigs w/ size ≥250 bp (%) | 2 (4.6%) | 0 (0.0%) | 3 (1.4%) | 3 (3.3%) | 1 (1.3%) |
| | #Aligned contigs | 2 | 0 | 0 | 2 | 1 |
| Viruses/Phages | #Contigs | 75 | 62 | 287 | 155 | 72 |
| | Max contig length | 655 | 546 | 336 | 695 | 274 |
| | Min contig length | 64 | 64 | 64 | 64 | 64 |
| | Median contig length | 68 | 84 | 70 | 73 | 74 |
| | N50 | 194 | 210 | 85 | 189 | 126 |
| | #Contigs w/ size ≥250 bp (%) | 9 (12.0%) | 7 (11.3%) | 10 (3.5%) | 13 (8.4%) | 1 (1.4%) |
| | #Aligned contigs | 7 | 5 | 2 | 10 | 0 |
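For readers unfamiliar with the contig statistics in this table, the sketch below computes N50 under its standard definition (the largest length L such that contigs of length at least L together cover at least half of the total assembled length) and the count and share of contigs at or above a size cutoff. The 250 bp default mirrors the table row. The toy lengths are invented, and the source does not say which tool produced its statistics.

```python
def n50(lengths):
    """Largest L such that contigs of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def count_at_least(lengths, min_len=250):
    """Return (count, percent) of contigs with length >= min_len."""
    kept = sum(1 for x in lengths if x >= min_len)
    return kept, round(100 * kept / len(lengths), 1)


# Invented contig lengths, for illustration only.
toy_lengths = [64, 100, 200, 250, 300, 900]
print(n50(toy_lengths))
print(count_at_least(toy_lengths))
```

In the toy set the total length is 1,814 bp. The two longest contigs (900 and 300 bp) already cover more than half of it, so N50 is 300.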
Bacterial species identified
| ID | Species | No. of contigs | Total aligned length | Total no. of associated PE reads | Total no. of associated SE reads | Sum of PE and SE reads | MCRPM |
|---|---|---|---|---|---|---|---|
| BBC ctl | | | | | | | (sum/372) |
| | | 17 | 5,428 | 69 | 30 | 99 | 0.27 |
| | | 2 | 708 | 6 | 35 | 41 | 0.11 |
| | | 4 | 2,124 | 73 | 70 | 143 | 0.38 |
| | | 4 | 1,227 | 31 | 12 | 43 | 0.12 |
| EJC ctl | | | | | | | (sum/389) |
| | | 12 | 4,345 | 59 | 30 | 89 | 0.23 |
| | | 231 | 99,403 | 1,719 | 616 | 2,335 | |
| | | 4 | 1,948 | 30 | 7 | 37 | 0.10 |
| | | 17 | 6,186 | 87 | 31 | 118 | 0.30 |
| | | 4 | 1,770 | 43 | 16 | 59 | 0.15 |
| | | 4 | 1,314 | 31 | 51 | 82 | 0.21 |
| | | 2 | 588 | 2 | 48 | 50 | 0.13 |
| | | 18 | 5,102 | 54 | 18 | 72 | 0.19 |
| | | 4 | 2,238 | 53 | 14 | 67 | 0.17 |
| | | 8 | 2,751 | 30 | 15 | 45 | 0.12 |
| | | 8 | 2,620 | 36 | 63 | 99 | 0.25 |
| | | 1 | 276 | 17 | 28 | 45 | 0.12 |
| | | 7 | 2,478 | 20 | 34 | 54 | 0.14 |
| | | 8 | 2,899 | 37 | 40 | 77 | 0.20 |
| | | 11 | 3,533 | 27 | 28 | 55 | 0.14 |
| BC0145 | | | | | | | (sum/375) |
| | | 714 | 230,779 | 1,999 | 1,309 | 3,308 | |
| | | 2 | 1,824 | 254 | 78 | 332 | 0.89 |
| | | 6 | 5,093 | 896 | 170 | 1,066 | |
| | | 1,675 | 2,678,493 | 918,733 | 109,787 | 1,028,520 | |
| | | 3 | 1,195 | 41 | 36 | 77 | 0.21 |
| | | 3 | 1,717 | 220 | 21 | 241 | 0.64 |
| | | 34 | 12,418 | 144 | 65 | 209 | 0.56 |
| BC0190 | | | | | | | (sum/426) |
| | | 2 | 1,078 | 29 | 17 | 46 | 0.11 |
| | | 71 | 24,930 | 366 | 168 | 534 | |
| | | 4 | 1,172 | 24 | 76 | 100 | 0.23 |
| | | 38 | 10,970 | 123 | 72 | 195 | 0.46 |
| | | 5 | 2,599 | 67 | 28 | 95 | 0.22 |
| | | 7 | 2,432 | 52 | 88 | 140 | 0.33 |
| | | 6 | 2,195 | 38 | 31 | 69 | 0.16 |
| | | 7 | 2,370 | 19 | 53 | 72 | 0.17 |
| | | 17 | 5,030 | 63 | 44 | 107 | 0.25 |
| CGBC025 | | | | | | | (sum/579) |
| | | 24 | 6,818 | 118 | 48 | 166 | 0.29 |
| | | 7 | 2,059 | 41 | 16 | 57 | 0.10 |
| | | 8 | 3,140 | 108 | 33 | 141 | 0.24 |
| | | 8 | 5,088 | 262 | 43 | 305 | 0.53 |
| | | 336 | 157,346 | 5,397 | 2,503 | 7,900 | |
| | | 4 | 1,261 | 36 | 30 | 66 | 0.11 |
| | | 51 | 21,410 | 720 | 271 | 991 | |
Only microbial species with MCRPM ≥ 0.1 are listed. Those with MCRPM ≥ 1 are listed in bold.
Ctl, control; MCRPM, microbial cfDNA reads per million quality PE reads; sp. (same as spp.), species with unspecified species name
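The MCRPM column follows from the "(sum/N)" notes in the table: the sum of associated PE and SE reads divided by the library's normalization factor. A minimal sketch of that arithmetic and of the two footnote thresholds is given below. Applying the thresholds to the value rounded to two decimals is an assumption, made because some listed rows (for example 37 reads in EJC) only reach 0.10 after rounding. The function names are hypothetical.

```python
def mcrpm(pe_reads, se_reads, normalization_factor):
    """Microbial cfDNA reads per million quality PE reads, to two decimals."""
    return round((pe_reads + se_reads) / normalization_factor, 2)


def classify(value):
    """Apply the footnote thresholds to a rounded MCRPM value (assumed rule)."""
    if value >= 1:
        return "bold"      # MCRPM >= 1
    if value >= 0.1:
        return "listed"    # MCRPM >= 0.1
    return "omitted"


first_bbc_row = mcrpm(69, 30, 372)              # 0.27, matches the first BBC row
largest_bc0145_row = mcrpm(918_733, 109_787, 375)
print(first_bbc_row, classify(first_bbc_row))
print(largest_bc0145_row, classify(largest_bc0145_row))
```

Under this formula the BC0145 row with 918,733 PE and 109,787 SE reads works out to 2,742.72. That figure is a recomputation from the tabulated counts, not a value quoted from the source.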
Fig. 1 Workflow showing the stepwise procedure of sequence data processing leading to the identification of microbes in the body