Kazuma Nakano, Akino Shiroma, Makiko Shimoji, Hinako Tamotsu, Noriko Ashimine, Shun Ohki, Misuzu Shinzato, Maiko Minami, Tetsuhiro Nakanishi, Kuniko Teruya, Kazuhito Satou, Takashi Hirano.
Abstract
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Keywords: De novo assembly; Extra-long reads; PacBio RS II; Structural variations; Targeted sequencing
Year: 2017 PMID: 28364362 PMCID: PMC5486853 DOI: 10.1007/s13577-017-0168-8
Source DB: PubMed Journal: Hum Cell ISSN: 0914-7470 Impact factor: 4.174
List of genomes sequenced on PacBio RS II in the Okinawa genome projects
| Sample name | Methods | Replicon name | Genome length (b) | G+C content (%) | Hard-to-sequence regions | Accession no. | Published year [Ref.] |
|---|---|---|---|---|---|---|---|
|  | PacBio | Chromosome | 4,415,078 | 65.60 | G+C content of 80% region (2,000 bp), 117 sets of >1000-bp identical sequence pairs | AP014573 | 2015 [ |
| Multidrug-resistant | PacBio | Chromosome | 4,000,970 | 39.15 | 41 sets of >1000-bp identical sequence pairs (5355-bp maximum) | AP014649 | 2015 [ |
|  |  | Plasmid | 189,354 | 39.53 |  | AP014650 |  |
| Multidrug-resistant | PacBio | Chromosome | 6,850,954 | 65.96 | 6 sets of >10,000-bp identical sequence pairs (27,239-bp maximum) | AP014646 | 2016 [ |
|  | PacBio | Chromosome 1 | 4,238,972 | 35.00 | Plasmid, methylation | CP011931 | 2015 [ |
|  |  | Chromosome 2 | 358,378 | 34.91 |  | CP011932 |  |
|  |  | Plasmid pLIMLP1 | 70,055 | 34.54 |  | CP011933 |  |
|  |  | Chromosome 1 | 4,238,922 | 35.00 |  | CP011934 |  |
|  |  | Chromosome 2 | 358,377 | 34.91 |  | CP011935 |  |
|  |  | Plasmid pLIMLP1 | 70,055 | 34.54 |  | CP011936 |  |
|  | PacBio | Chromosome | 1,633,212 | 38.81 | 8227-bp identical pair, methylation | CP006820 | 2014 [ |
|  |  | Chromosome | 1,637,925 | 38.81 |  | CP006821 |  |
|  |  | Chromosome | 1,553,826 | 38.97 |  | CP006822 |  |
|  |  | Chromosome | 1,599,700 | 38.80 | G+C content of 28.7% region (2000 bp) | CP006823 |  |
|  |  | Chromosome | 1,634,852 | 38.83 |  | CP006824 |  |
|  |  | Chromosome | 1,595,058 | 38.82 |  | CP006825 |  |
|  |  | Chromosome | 1,600,345 | 38.80 |  | CP006826 |  |
|  |  | Chromosome | 1,634,875 | 38.83 |  | CP006827 |  |
| Influenza virus Okinawa strain | PacBio | cDNA | Data not published | Data not published | Full-length sequencing of all eight segments without assembly or resequencing | Data not published | This article |
|  | PacBio | BAC | 124,623 | 70.74 | G+C content of 76.2% region (2000 bp) | LC006086 | 2014 [ |
|  | PacBio | Not opened | Not opened | Not opened | Not opened | LC095592 | 2015 [ |
|  | PacBio | Chromosome | 1,848,756 | 42.1 | 43 sets of >1000-bp identical sequence pairs (3118-bp maximum), G+C content of 26.9% region | CP016028 | 2016 [ |
|  | PacBio, SOLiD3 | Chromosome | 1,451,062 | 47.00 | 39 sets of >1000-bp identical sequence pairs | NZ-AP014563 | 2014 [ |
| Endosymbiont of | PacBio, 454, Sanger, Illumina | Chromosome | 1,469,434 | 38.70 | Subpopulation | AP013042 | 2015 [ |
| Bacterial symbiont “TC1” of | PacBio | Chromosome | 1,586,453 | 32.8 | 207 sets of >1000-bp identical sequence pairs, G+C content of 23.5% region | CP014606 | 2016 [ |
|  |  | Plasmid | 35,795 | 29.7 |  | CP014607 |  |
|  | PacBio, Illumina | 11 Chromosome | All scaffolds | – | Repetitive regions (50.6% of genome) | AP015034-AP017294 | 2015 [ |
|  | PacBio | Chromosome | 4,793,299 | 52.20 | 5420-bp identical sequence pair | CP009102 | 2014 [ |
|  |  | Plasmid | 38,457 | 40.70 |  | CP009103 |  |
|  | PacBio | Chromosome | 2,755,072 | 32.86 | 29 sets of >1000-bp identical sequence pairs (,063-bp maximum), tandem repeats (384 bp × 5 copies) | CP011526 | 2015 [ |
|  |  | Plasmid | 27,490 | 30.69 |  | CP011527 |  |
|  | PacBio | Chromosome | 6,317,050 | 66.52 | 5288-bp identical pair, 183 tandem repeats (246 bp × 20.7 copies) | CP012001 | 2015 [ |
|  | PacBio | Chromosome | 4,142,990 | 27.98 | 86 sets of >1000-bp identical sequence pairs (4911-bp maximum), 380 tandem repeats (369 bp × 8.5 copies maximum), variable number tandem repeat | CP011663 | 2015 [ |
| CM-SJS/TEN-associated | PacBio | Targeted long amplicon | Data not published | Data not published | Targeted region (17 kb) of diploid genome | Data not published | This article |
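
Several entries in the "Hard-to-sequence regions" column report a local G+C extreme over a 2000-bp region. The following is a minimal sketch of how such a window scan could be computed; it is not the authors' pipeline. The function names `gc_percent` and `extreme_gc_windows`, the default window size, and the step of 1 bp are assumptions made for illustration.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C percentage of a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def extreme_gc_windows(seq: str, window: int = 2000):
    """Slide a fixed-size window along seq, one base at a time.

    Returns ((start, gc%), (start, gc%)) for the lowest and highest
    G+C windows; start is a 0-based offset. Ties keep the first window.
    """
    seq = seq.upper()
    if window <= 0 or len(seq) < window:
        raise ValueError("window must be positive and no longer than seq")
    is_gc = [1 if base in ("G", "C") else 0 for base in seq]
    count = sum(is_gc[:window])          # G+C count in the first window
    lowest = highest = (0, 100.0 * count / window)
    for start in range(1, len(seq) - window + 1):
        # Update the running count: add the entering base, drop the leaving one.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        pct = 100.0 * count / window
        if pct < lowest[1]:
            lowest = (start, pct)
        if pct > highest[1]:
            highest = (start, pct)
    return lowest, highest
```

On a real replicon, `extreme_gc_windows(genome_sequence)` would be called with the default 2000-bp window; the running count keeps the scan linear in genome length.
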
Fig. 1 Targeted sequencing of the CM-SJS/TEN-associated IKZF1 SNP region in Japanese using PacBio RS II. (1) Bacterial artificial chromosome (BAC) (RP11_663L2) sequences, including the 3 CM-SJS/TEN-associated IKZF1 SNPs, were obtained by PacBio RS II sequencing. (2) The targeted region (17 kb) of the RefSeq (GRCh38_Chr7) sequence was validated against the BAC (RP11_663L2) sequences. (3) The RefSeq and BAC sequences had no differences in the targeted region (17 kb), including the 3 SNPs. (4) Primers covering the target SNPs were designed based on RefSeq, with an expected size of 8 kb for each product. (5) DNA from a Japanese cell line (NA18940) was amplified by PCR with the primer pairs. (6) The minimum number of primer pairs (5 products) was selected. (7) Long reads were produced on the PacBio RS II sequencing platform for the selected primer pairs. (8) PacBio RS II single-molecule sequencing technology produced maternal and paternal reads separately for an individual. (9) A reference diploid genome sequence was then constructed from the reads originating from maternal and paternal molecules.
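
Step (6) of the caption, choosing a minimum number of PCR products that still span the target, can be viewed as an interval-cover problem. The sketch below uses the standard greedy algorithm for that problem; it is an illustration under stated assumptions, not the selection procedure used in the article. The function name `min_amplicon_cover` and all coordinates in the usage example are hypothetical, and real primer selection would also weigh amplification success and overlap between neighbouring products.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the reference


def min_amplicon_cover(candidates: List[Interval],
                       target_start: int,
                       target_end: int) -> List[Interval]:
    """Pick the fewest candidate amplicons that cover [target_start, target_end).

    Greedy rule: among amplicons starting at or before the first
    uncovered position, take the one that reaches furthest right.
    """
    remaining = sorted(candidates)
    chosen: List[Interval] = []
    covered = target_start   # everything left of this position is covered
    i = 0
    while covered < target_end:
        best = None
        # Consider every amplicon that starts inside the covered prefix.
        while i < len(remaining) and remaining[i][0] <= covered:
            if best is None or remaining[i][1] > best[1]:
                best = remaining[i]
            i += 1
        if best is None or best[1] <= covered:
            raise ValueError(f"no amplicon extends coverage past {covered}")
        chosen.append(best)
        covered = best[1]
    return chosen


# Hypothetical ~8-kb candidate products over a 17-kb target region.
candidates = [(0, 8000), (2000, 10000), (7000, 15000),
              (7500, 15500), (9500, 17500), (14000, 22000)]
selected = min_amplicon_cover(candidates, 0, 17000)
```

Sorting the candidates once and scanning them with a single index keeps the selection at O(n log n); amplicons skipped in one round can never extend coverage in a later round, so discarding them is safe.
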