Hongming Wang, Yongxin Yu, Taigang Liu, Yingjie Pan, Shuling Yan, Yongjie Wang.
Abstract
Two genomic fragments (5,662 and 1,269 nt in size, GenBank accession no. JQ756122 and JQ756123, respectively) of novel, positive-strand RNA viruses that infect archaea were first discovered in an acidic hot spring in Yellowstone National Park (Bolduc et al., 2012). To investigate the diversity of these newly identified putative archaeal RNA viruses, global metagenomic datasets were searched for sequences that were significantly similar to those of the viruses. A total of 3,757 associated reads were retrieved solely from the Yellowstone datasets and were used to assemble the genomes of the putative archaeal RNA viruses. Nine contigs with lengths ranging from 417 to 5,866 nt were obtained, 4 of which were longer than 2,200 nt; one contig was 204 nt longer than JQ756122, representing the longest genomic sequence of the putative archaeal RNA viruses. These contigs revealed more than 50% sequence similarity to JQ756122 or JQ756123 and may be partial or nearly complete genomes of novel genogroups or genotypes of the putative archaeal RNA viruses. Sequence and phylogenetic analyses indicated that the archaeal RNA viruses are genetically diverse, with at least 3 related viral lineages in the Yellowstone acidic hot spring environment.
Keywords: Putative archaeal RNA viruses; Sequence assembly; Viral diversity; Yellowstone acidic hot spring
Year: 2015 PMID: 25918685 PMCID: PMC4405519 DOI: 10.1186/s40064-015-0973-z
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Figure 1. Schematic presentation of the sequence assembly procedures.
Data on the metagenomic assembly of nine novel genomic sequences of putative archaeal RNA viruses

| Contig | Length (nt) | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 1 | 5,866 | 3,273 | 5,344 | 98.4 | 195.5 | 8 | 463 | 50.6 |
| 2 | 2,929 | 1,437 | 2,551 | 97.7 | 169.2 | 6 | 361 | 50.5 |
| 3 | 2,439 | 142 | 2,397 | 98.8 | 21.5 | 2 | 45 | 49.6 |
| 4 | 2,241 | 99 | 2,202 | 98.6 | 16.5 | 2 | 40 | 52.0 |
| 5 | 986 | 17 | 970 | 97.3 | 5.8 | 2 | 13 | 55.1 |
| 6 | 863 | 20 | 851 | 98.7 | 8.1 | 2 | 16 | 53.5 |
| 7 | 663 | 72 | 647 | 97.9 | 36.4 | 7 | 62 | 49.6 |
| 8 | 631 | 11 | 529 | 99.0 | 6.6 | 1 | 11 | 50.4 |
| 9 | 417 | 4 | 315 | 99.2 | 3.3 | 1 | 4 | 54.9 |
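
The contig lengths in the second column of the table above can be cross-checked against two statements in the abstract: that 4 contigs exceed 2,200 nt, and that the longest contig is 204 nt longer than JQ756122 (5,662 nt). The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not taken from the paper.

```python
# Contig lengths (nt), copied from the second column of the table above.
contig_lengths = {
    1: 5866, 2: 2929, 3: 2439, 4: 2241, 5: 986,
    6: 863, 7: 663, 8: 631, 9: 417,
}

# Length of the reference fragment JQ756122 (nt), as stated in the abstract.
JQ756122_LENGTH = 5662

# Identify the longest contig and how far it extends beyond the reference.
longest_id = max(contig_lengths, key=contig_lengths.get)
excess_over_reference = contig_lengths[longest_id] - JQ756122_LENGTH

# Contigs longer than the 2,200 nt threshold mentioned in the abstract.
long_contigs = sorted(cid for cid, n in contig_lengths.items() if n > 2200)

print(longest_id, excess_over_reference)  # 1 204
print(long_contigs)                       # [1, 2, 3, 4]
```

Both results agree with the abstract: contig 1 is 204 nt longer than JQ756122, and contigs 1 to 4 are the four that exceed 2,200 nt.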

Figure 2. Schematic illustration of sequence similarity (red, 90-100%; blue, 70-90%; gray, 50-70%) between the 9 contigs and JQ756122 (A) or JQ756123 (B), and the alignment positions. The squares represent reverse repeat sequences, while the dots represent palindromic sequences. Repeats shown in the same color are the same repeat sequence. The RNA-dependent RNA polymerase gene is labeled with an arrow box. RT, reverse transcriptase-like family domain; CP, capsid protein domain described in Bolduc et al. (2012).
Repeat sequences in the genomic sequences of putative archaeal RNA viruses

| Sequence | Repeat | | | |
|---|---|---|---|---|
| JQ756122 | R1 | 16 | | 2.10e-3 |
| | R2 | 15 | | 8.40e-3 |
| 1 | R1 | 16 | | 2.25e-3 |
| | R2 | 15 | | 9.01e-3 |
| 2 | R1 | 16 | | 5.62e-4 |
| | R2 | 15 | | 2.25e-3 |
| 3 | P1 | 14 | | 6.23e-3 |
| 5 | P2 | 14 | | 1.02e-3 |
| 6 | P3 | 14 | | 7.80e-4 |
| | R3 | 13 | | 3.12e-3 |
| 7 | R4 | 18 | | 1.80e-6 |
| 9 | R5 | 14 | | 1.82e-4 |

R represents reverse repeat sequences. P represents palindromic repeat sequences. The arrows indicate repeat units.
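
For readers unfamiliar with the term, a nucleic acid sequence is conventionally called palindromic when it equals its own reverse complement. The sketch below illustrates that general definition only; it is not the repeat-detection method used in the paper, and the example sequences are made up for illustration rather than taken from the contigs.

```python
# Complement table for DNA letters; RNA input is normalized to DNA first.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, treating U as T."""
    dna = seq.upper().replace("U", "T")
    return dna.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same as its reverse complement."""
    dna = seq.upper().replace("U", "T")
    return dna == reverse_complement(dna)


# Hypothetical examples (not sequences from the paper).
print(is_palindromic("GAATTC"))          # True
print(is_palindromic("ACGUACGCGUACGU"))  # True (14 nt, RNA letters)
print(is_palindromic("GATTACA"))         # False
```

Under this definition a palindromic sequence must have even length, which is consistent with the 14 nt reported for P1, P2, and P3 in the table above.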
Figure 3. Unrooted phylogenetic tree (maximum likelihood; model: HKY85; 1,000 bootstrap replicates) showing 3 lineages of the putative archaeal RNA viruses, marked by different background colors.