Jessy Vibin1,2, Anthony Chamings1,2, Fiona Collier1,2,3, Marcel Klaassen2, Tiffanie M Nelson1,2, Soren Alexandersen4,5,6.
Abstract
We present an optimised metagenomics method for detection and characterisation of all virus types including single and double stranded DNA/RNA and enveloped and non-enveloped viruses. Initial evaluation included both spiked and non-spiked bird faecal samples as well as non-spiked human faecal samples. From the non-spiked bird samples (Australian Muscovy duck and Pacific black ducks) we detected 21 viruses, and we also present a summary of a few viruses detected in human faecal samples. We then present a detailed analysis of selected virus sequences in the avian samples that were somewhat similar to known viruses, had good quality (Q20 or higher) and quantity of next-generation sequencing reads, and were of interest from a virological point of view, for example, avian coronavirus and avian paramyxovirus 6. Some of these viruses were closely related to known viruses while others were more distantly related with 70% or less identity to currently known/sequenced viruses. Besides detecting viruses, the technique also allowed the characterisation of host mitochondrial DNA present, and thus identification of the host species, while ribosomal RNA sequences provided insight into the "ribosomal activity microbiome", into gut parasites, and into food eaten such as plants or insects, which we correlated to non-avian host associated viruses.
Year: 2018 PMID: 29875375 PMCID: PMC5989203 DOI: 10.1038/s41598-018-26851-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Method variations with a different combination of virus enrichment techniques carried out. The figure gives the flowchart of the different virus enrichment techniques used among the six variations A to F for enriching the faecal sample with virus particles. The sample preparation, homogenisation, centrifugation and filtration using 0.8 µm PES filter to remove larger particles, nuclease treatment and finally nucleic acid extraction remained constant in each variation. However, detergent treatment, ultracentrifugation and filtration using 0.45 µm filter were tried in different combinations to identify the optimal method for virus detection from faecal samples.
Viruses detected in the Muscovy duck (MUD) faecal sample.
| Virus | Family | Characteristics | NCBI virus reference sequence used for mapping | Percentage of identity to the closest virus for individual NGS reads | Likely source |
|---|---|---|---|---|---|
| Ngewotan virus (Nam Dinh like virus) | | Monopartite, linear ssRNA(+) genome; enveloped, spherical, about 60–80 nm in diameter | MF176279 | 96–100% | Mosquitoes |
| Hubei chryso-like virus 1 | Unclassified RNA viruses | dsRNA genome; 4 segments | MF176261-MF176264 | 99–100% | Mosquitoes |
| Culex Negev-like virus 3 (Biggie/Goutanap virus like) | Unclassified RNA viruses; Negev virus related | ssRNA positive-strand genome | MF176277 | 92–99% | Mosquitoes |
| Virus related to Hubei reo-like virus 7 | Unclassified RNA viruses | dsRNA genome | KX884635 | 88–95% | Mosquitoes |
| Israeli acute paralysis like virus | | Monopartite, linear ssRNA(+) genome; non-enveloped, icosahedral capsid, about 30 nm in diameter | NC009025 | 98–99% | Bees |
| Virus related to invertebrate iridescent virus 30 | | Linear dsDNA genome; polyhedral virions, capsid present, envelope may or may not be present | NC023611 | 77–92% | Moths |
| Black grass cryptic like virus 2 like virus | | Segmented dsRNA genome; other details unknown | NC026799 | 76–98% | Plants |
| Virus related to Hordeum vulgare endornavirus | | Linear dsRNA genome; no true capsid | NC028949 | 82% | Plants |
| Enterobacteria phage phi 92 | | Linear dsDNA genome; non-enveloped, head-tail structure | FR775895 | 97–99% | Bacteria |
This table lists the viruses detected and characterised in the MUD faecal sample, together with the virus family and its characteristics. The viruses were identified by mapping the NGS reads to the reference sequences given in column 4. The identity between our sequences and the closest virus in the NCBI dataset was determined using MEGA 6 or 7 software; for individual NGS reads, BLASTN was used to identify the closest virus in the NCBI dataset. The likely source of each virus in our sample was inferred from the NGS reads generated and from the literature.
Viruses detected in the six juvenile Pacific black duck (MAD) faecal sample pool.
| Virus | Family | Characteristics | NCBI virus reference sequence used for mapping | Percentage of identity to the closest virus for individual NGS reads | Likely source |
|---|---|---|---|---|---|
| Avian paramyxovirus 6 | | Negative-stranded RNA linear genome; enveloped, spherical, about 150 nm in diameter | AB759118 | 95–98% | Host |
| Gammacoronavirus | | Monopartite, linear ssRNA(+) genome; enveloped, spherical, about 120 nm in diameter | KM454473 | 81–100% | Host |
| Deltacoronavirus | | Monopartite, linear ssRNA(+) genome; enveloped, spherical, about 120 nm in diameter | NC016994 | 82–89% | Host |
| Virus related to chicken/duck/goose megrivirus | | Monopartite, linear ssRNA(+) genome; non-enveloped, spherical, about 30 nm in diameter | KC663628 and NC023857 | 77–83% | Host |
| Rotavirus G | | Segmented linear dsRNA genome; non-enveloped with a double capsid structure | NC021580 | 80–89% | Host |
| Virus related to goose adenovirus 4 and duck adenovirus 2 | | Non-segmented, linear dsDNA; non-enveloped capsid with a pseudo T = 25 icosahedral symmetry | NC024486 and NC017979 | 75–87% | Host |
| Virus related to duck dependovirus/AAV | | Linear ssDNA genome; non-enveloped, round, T = 1 icosahedral symmetry, 18–26 nm in diameter with capsid | KX583629 | 88% | Host |
| Avian encephalomyelitis virus | | Monopartite, linear ssRNA(+) genome; non-enveloped, spherical, about 30 nm in diameter | AY517471 | 82–86% | Host |
| Avian calicivirus: related to chicken/goose calicivirus | | Monopartite, linear ssRNA(+) genome; non-enveloped, capsid of about 27–40 nm in diameter, with T = 3 icosahedral symmetry | NC024078 | 70–84% | Host |
| Virus related to Hubei picorna-like virus 19 | Unclassified RNA viruses | Positive sense ssRNA genome | KX883724 | 79–82% | Leech |
| Virus related to Hubei picorna-like virus 51 | Unclassified RNA viruses | Positive sense ssRNA genome | KX883953 | 98% | Dragonfly |
| Bacteriophage related to Enterobacteria phage N4 | | Linear dsDNA genome; non-enveloped, head-tail structure | EF056009 | 79–87% | Bacteria |
This table lists the viruses detected and characterised in the MAD faecal sample pool, together with the virus family and its characteristics. The viruses were identified by mapping the NGS reads to the reference sequences given in column 4. The identity between our sequences and the closest virus in the NCBI dataset was determined using MEGA 6 or 7 software; for individual NGS reads, BLASTN was used to identify the closest virus in the NCBI dataset. The likely source of each virus in our sample was inferred from the NGS reads generated and from the literature.
Description of the sequences on the representative phylogenetic trees generated for four selected viruses analysed using MEGA 6 or 7 software.
| Figure and Table | Long name (Format: Sample-virus-protein-length-quality-coverage-year) | Short name (Format: Sample-virus-protein-length) | Coverage | No. of nucleotides | Mapping quality threshold | NCBI accession number | Country of collection | Year of collection |
|---|---|---|---|---|---|---|---|---|
| 2 and 4 | MAD-Avian-paramyxovirus-6-large-polymerase-protein-459nt-Q32-C-9-192-2016 | MAD-APMV6-Pol-459nt | 9–192 | 459 | 32 | MH000419 | | |
| 3 and 5 | MAD-Avian-paramyxovirus-6-hemagglutinin-neuraminidase-1839nt-Q32-C-5-103_2016 | MAD-APMV6-HN-1839nt | 5–103 | 1839 | 32 | MH000415 | | |
| 4 and 6 | MAD-Avian-paramyxovirus-6-fusion-protein-651nt-Q32-C-2-8-2016 | MAD-APMV6-FP-651nt | 2–8 | 651 | 32 | MH000412 | | |
| | AB759118-Avian-paramyxovirus-6-viral-cRNA-complete-genome-strain:red-necked-stint/Japan/8KS0813/2008 | AB759118-APMV6-JP | | | | | Japan | 2008 |
| | GQ406232-Avian-paramyxovirus-6-strain-duck/Italy/4524-2/07-complete-genome | GQ406232-APMV6-IT | | | | | Italy | 2007 |
| | KP762799-Avian-paramyxovirus-6-isolate-red-crested-pochard/Balkhash/5842/2013-complete-genome | KP762799-APMV6-KZ | | | | | Kazakhstan | 2013 |
| | AY029299-Avian-paramyxovirus-6-complete-genome | AY029299-APMV6-TW | | | | | Taiwan | — |
| | KT962980-Avian-paramyxovirus-6-isolate-teal/Novosibirsk_region/455/2009-complete-genome | KT962980-APMV6-RU | | | | | Russia | 2009 |
| | JN571486-Avian-paramyxovirus-6-strain-APMV6/mallard/Belgium/12245/07-nucleoprotein(NP)-phosphoprotein(P)-matrix-protein(M)-fusion-protein(F)-small-hydrophobic-protein(SH)-hemagglutinin-neuramis | JN571486-APMV6-BE | | | | | Belgium | 2007 |
| | KF267717-Avian-paramyxovirus-6-isolate-mallard/Jilin/127/2011-complete-genome | KF267717-APMV6-CN | | | | | China | 2011 |
| 5 and 8 | MAD-Deltacoronavirus-Orf1a-10650nt-Q20-C-6-2326−2016 | MAD-DCoV-Orf1a-10650nt | 6–2326 | 10650 | 20 | MH013332 | | |
| 6 and 9 | MAD-Deltacoronavirus-Orf1b-polymerase-2076nt-Q20-C-5-475-2016 | MAD-DCoV-RPP-2076nt | 5–475 | 2076 | 20 | MH013331 | | |
| 7 and 10 | MAD-Deltacoronavirus-spike-glycoprotein-3702nt-Q20-C-36-6274-2016 | MAD-DCoV-SP-3702nt | 36–6274 | 3702 | 20 | MH013337 | | |
| | JQ065049-Common-moorhen coronavirus-HKU21-strain-HKU21-8295-complete-genome | JQ065049-MCoV-CN | | | | | China | 2007 |
| | JQ065046-Magpie-robin-coronavirus HKU18-strain-HKU18-chu3-complete-genome | JQ065046-MrCoV-CN | | | | | China | 2007 |
| | FJ376622-Munia-coronavirus-HKU13-3514-complete-genome | FJ376622-MuCoV-CN | | | | | China | 2007 |
| | FJ376621-Thrush-coronavirus-HKU12-600-complete-genome | FJ376621-TCoV-CN | | | | | China | 2007 |
| | MF431743-Porcine-deltacoronavirus-strain-SD-complete-genome | MF431743-PDCoV-CN | | | | | China | 2014 |
| | FJ376620-Bulbul-coronavirus-HKU11-796-complete-genome | FJ376620-BCoV-CN | | | | | China | 2007 |
| | JQ065047-Night-heron-coronavirus-HKU19-strain-HKU19-6918-complete genome | JQ065047-NhCoV-CN | | | | | China | 2007 |
| | JQ065048-Wigeon-coronavirus-HKU20-strain-HKU20-9243-complete-genome | JQ065048-WCoV-CN | | | | | China | 2008 |
| GoA4 | | | | | | | | |
| 8 and 11 | MAD-Adenovirus-encapsidation-protein-IVa2-279nt-Q20-C-4-69-2016 | MAD-AV-IVa2-279nt | 4–69 | 279 | 20 | MH028885 | | |
| DuA2 | | | | | | | | |
| 9 and 12 | MAD-Adenovirus-III-177nt-Q20-C-21-261-2016 | MAD-AV-III-177nt | 21–261 | 177 | 20 | MH028886 | | |
| 10 and 13 | MAD-Adenovirus-pVIII-114nt-Q32-C-19-67-2016 | MAD-AV-pVIII-114nt | 19–67 | 114 | 32 | MH028887 | | |
| | KR135164-Duck-adenovirus-2-strain-CH-GD-12-2014-complete-genome | KR135164-DAd2-CN | | | | | China | 2014 |
| | JF510462-Goose-adenovirus-4-strain-P29-complete-genome | JF510462-GAd4-HU | | | | | Hungary | — |
| | FN824512-Pigeon-adenovirus-1-complete-genome-strain-IDA4 | FN824512-PAd1-NL | | | | | Netherlands | 1995 |
| | KC493646-Fowl-adenovirus-5-strain-340-complete-genome | KC493646-FAd5-IE | | | | | Ireland | 1970 |
| 11 and 14 | MUD-Hubei-chryso-like-virus-1-seg1-RdRp-1496nt-Q20-C-4-58-2016 | MUD-HCLV1-s1-Rp-1496nt | 4–58 | 1496 | 20 | MH085092 | | |
| Segment 1 | | | | | | | | |
| | MF176368-Hubei-chryso-like-virus-1-strain-mosWSgb49785-segment1-complete-sequence | MF176368-HCLV1-s1-WA | | | | | Western Australia | 2015 |
| | MF176309-Hubei-chryso-like-virus-1-strain-mos191gb77171-segment1-complete-sequence | MF176309-HCLV1-s1-WA | | | | | Western Australia | 2015 |
| | MF176261-Hubei-chryso-like-virus-1-strain-mos172gb42656-segment1-complete-sequence | MF176261-HCLV1-s1-WA | | | | | Western Australia | 2015 |
| | MF176388-Hubei-chryso-like-virus-1-strain-mosWSX51080-segment-1-complete-sequence | MF176388-HCLV1-s1-WA | | | | | Western Australia | 2015 |
| | MF176280-Hubei-chryso-like-virus-1-strain-mos172X13576-segment1-complete-sequence | MF176280-HCLV1-s1-WA | | | | | Western Australia | 2015 |
| | KX882962-Hubei-chryso-like-virus-1-strain-mosHB233224-hypothetical-protein-gene-partial-cds | KX882962-HCLV1-CN | | | | | China | 2013 |
The table gives details of the sequences used for phylogenetic analysis by the Maximum Likelihood method of APMV6, DCoV, AV and HCLV1. The first half of the table for each virus provides details to identify the sample from which the virus was isolated, protein encoded by the consensus sequences that are being analysed, the length of the consensus sequences and gene being analysed, minimum mapping quality threshold used in IGV for the generation of the consensus sequences, the coverage of the consensus sequences from IGV, the year of collection of the sample and the NCBI accession number assigned to the particular consensus sequence. Corresponding short names have been used in the phylogenetic trees which contain the sample, virus, protein and the number of nucleotides. The second half of the table for each virus provides the details of the sequences that were used for the comparative molecular phylogenetic analysis. They were found to be the most closely related sequences to the consensus sequences generated using BLASTN. Corresponding short names have been used in the phylogenetic trees which contain the NCBI accession number, virus and the country of collection. The collection date is given as retrieved from the corresponding NCBI nucleotide reference dataset which together with the country of collection can provide an insight into the possible evolution of the virus through the years.
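The long-name format described above (Sample-virus-protein-length-quality-coverage-year) can be split back into its fields programmatically. The following is a minimal sketch, not part of the original analysis: the function name `parse_long_name`, the `ConsensusName` type and the regular expression are illustrative assumptions, and the pattern presumes that the trailing fields are written as `<length>nt-Q<quality>-C-<min>-<max>-<year>` with plain hyphens.

```python
import re
from typing import NamedTuple


class ConsensusName(NamedTuple):
    """Fields encoded in a consensus-sequence long name (hypothetical type)."""
    sample: str
    description: str   # virus and protein part, left unsplit
    length_nt: int
    min_quality: int   # minimum mapping quality threshold
    coverage_min: int
    coverage_max: int
    year: int


# Assumed layout: <sample>-<virus-protein>-<N>nt-Q<q>-C-<min>-<max>-<year>
_LONG_NAME = re.compile(
    r"^(?P<sample>[A-Za-z]+)-(?P<description>.+)-(?P<length>\d+)nt"
    r"-Q(?P<quality>\d+)-C-(?P<cmin>\d+)-(?P<cmax>\d+)-(?P<year>\d{4})$"
)


def parse_long_name(name: str) -> ConsensusName:
    """Split a long name into its fields; raise ValueError if it does not fit."""
    match = _LONG_NAME.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed format: {name!r}")
    return ConsensusName(
        sample=match["sample"],
        description=match["description"],
        length_nt=int(match["length"]),
        min_quality=int(match["quality"]),
        coverage_min=int(match["cmin"]),
        coverage_max=int(match["cmax"]),
        year=int(match["year"]),
    )


if __name__ == "__main__":
    example = ("MAD-Avian-paramyxovirus-6-large-polymerase-protein-"
               "459nt-Q32-C-9-192-2016")
    print(parse_long_name(example))
```

The virus and protein parts are kept together in `description` because virus names themselves contain hyphens, so they cannot be separated reliably by position. Names that use a different separator before the year are rejected with a `ValueError` rather than guessed at.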
Figure 2. Molecular Phylogenetic analysis by Maximum Likelihood method of APMV6 partial Pol gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura 3-parameter model[70]. The tree with the highest log likelihood (−1237.61) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.5082)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 459 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1. MAD-APMV6-Pol-459nt | | | | | | | |
| 2. AB759118-APMV6-JP | 12 | | | | | | |
| 3. GQ406232-APMV6-IT | 15 | 3 | | | | | |
| 4. KP762799-APMV6-KZ | 114 | 113 | 114 | | | | |
| 5. AY029299-APMV6-TW | 116 | 115 | 116 | 15 | | | |
| 6. KT962980-APMV6-RU | 115 | 114 | 115 | 12 | 9 | | |
| 7. JN571486-APMV6-BE | 114 | 115 | 116 | 18 | 11 | 8 | |
| 8. KF267717-APMV6-CN | 116 | 115 | 116 | 5 | 12 | 9 | 15 |
The number of base differences between sequences is shown. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 459 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
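To make the counting rule in this caption concrete, the sketch below computes pairwise base differences after removing every alignment column that contains a gap or missing data in any sequence (complete deletion). It is an illustration only and not the MEGA7 implementation: the function names and the toy alignment are assumptions, and only the unambiguous bases A, C, G and T are treated as valid.

```python
from itertools import combinations

VALID_BASES = set("ACGT")  # assumption: anything else counts as gap/missing


def complete_deletion(alignment: dict[str, str]) -> dict[str, str]:
    """Drop every column in which at least one sequence lacks a valid base."""
    seqs = {name: seq.upper() for name, seq in alignment.items()}
    if len({len(seq) for seq in seqs.values()}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    if not seqs:
        return {}
    n_columns = len(next(iter(seqs.values())))
    keep = [i for i in range(n_columns)
            if all(seq[i] in VALID_BASES for seq in seqs.values())]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


def base_differences(alignment: dict[str, str]):
    """Return ({(name_a, name_b): differences}, number of positions compared)."""
    cleaned = complete_deletion(alignment)
    counts = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(cleaned.items(), 2):
        counts[(name_a, name_b)] = sum(x != y for x, y in zip(seq_a, seq_b))
    n_positions = len(next(iter(cleaned.values()))) if cleaned else 0
    return counts, n_positions


if __name__ == "__main__":
    toy = {  # hypothetical toy alignment, not data from the study
        "seq1": "ACGTACGT-A",
        "seq2": "ACGTTCGTAA",
        "seq3": "ACGAACGTAN",
    }
    counts, n_positions = base_differences(toy)
    print(n_positions, counts)
```

Dividing a count by the number of positions gives the proportion of differing sites; for example, 12 differences over 459 positions correspond to roughly 2.6% difference, or about 97.4% identity.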
Figure 3. Molecular Phylogenetic analysis by Maximum Likelihood method of APMV6 complete HN gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura 3-parameter model[70]. The tree with the highest log likelihood (−5515.26) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 54.95% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1842 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1. MAD-APMV6-HN-1839nt | | | | | | | |
| 2. AB759118-APMV6-JP | 52 | | | | | | |
| 3. GQ406232-APMV6-IT | 54 | 24 | | | | | |
| 4. KP762799-APMV6-KZ | 532 | 541 | 537 | | | | |
| 5. AY029299-APMV6-TW | 532 | 541 | 537 | 47 | | | |
| 6. KT962980-APMV6-RU | 536 | 542 | 538 | 35 | 44 | | |
| 7. JN571486-APMV6-BE | 535 | 540 | 538 | 72 | 50 | 63 | |
| 8. KF267717-APMV6-CN | 537 | 544 | 540 | 24 | 51 | 39 | 74 |
The number of base differences between sequences is shown. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1842 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Figure 4. Molecular Phylogenetic analysis by Maximum Likelihood method of APMV6 partial FP gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model[71]. The tree with the highest log likelihood (−1837.17) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.4985)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 59.29% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 651 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
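As background to the Kimura 2-parameter model named in this caption, the sketch below evaluates its closed-form pairwise distance, d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences. This is only an illustration of the substitution model: the maximum likelihood tree search in MEGA7 uses the model inside a likelihood calculation (here with +G and +I rate variation), which this sketch does not attempt. The function name is an assumption, and positions with gaps or ambiguous bases are simply skipped for the pair being compared.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in BASES or y not in BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1     # A<->G or C<->T
        else:
            transversions += 1   # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term_1 = 1.0 - 2.0 * p - q
    term_2 = 1.0 - 2.0 * q
    if term_1 <= 0.0 or term_2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(term_1) - 0.25 * math.log(term_2)


if __name__ == "__main__":
    # Hypothetical toy pair with one transition in ten sites.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Because the correction accounts for multiple substitutions at the same site, the resulting distance is always at least as large as the raw proportion of differing sites.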
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1. MAD-APMV6-FP-651nt | | | | | | | |
| 2. AB759118-APMV6-JP | 20 | | | | | | |
| 3. GQ406232-APMV6-IT | 21 | 7 | | | | | |
| 4. KP762799-APMV6-KZ | 174 | 174 | 174 | | | | |
| 5. AY029299-APMV6-TW | 172 | 175 | 173 | 14 | | | |
| 6. KT962980-APMV6-RU | 174 | 176 | 174 | 12 | 6 | | |
| 7. JN571486-APMV6-BE | 169 | 176 | 174 | 32 | 22 | 26 | |
| 8. KF267717-APMV6-CN | 173 | 173 | 173 | 4 | 14 | 12 | 32 |
The number of base differences between sequences is shown. The analysis involved 8 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 651 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Assembled deltacoronavirus contigs divided into partial protein-coding genes identified using BLASTX and compared to the closest related deltacoronaviruses in the NCBI database.
| | Length (nucleotides) | Percentage identity, nucleotides* | Percentage identity, amino acids* |
|---|---|---|---|
| Orf1a (NSP 3, 4, 5, 7, 8, 9 & 10) | 10650 | 51.3– | 41.2– |
| Orf1bStart (NSP 10, 12) | 1095 | 60.5– | 61.5– |
| Orf1b-pol (NSP 12) | 2076 | 67.4– | 76.6– |
| Orf1b (NSP11) | 1914 | 63.4– | 65.6– |
| Orf1b (NSP11 & 13) | 2619 | 58.6– | 58.1– |
| Spike-S1-S2 | 3702 | 56.5– | 48.8– |
| E | 261 | 53.6– | 40.7– |
| M | 654 | 56.4– | 53.2– |
| NS6 | 276 | 49.3– | 38.0– |
The table displays the percentage identity of nucleotides and amino acids of generated deltacoronavirus contigs to the closest related deltacoronaviruses in the NCBI database (Wigeon deltacoronavirus JQ065048). The percentage was determined using MEGA 6 software. *The highest percentage identity is indicated in boldface and was consistently found when compared to the Wigeon deltacoronavirus JQ065048.
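Percentage identity of the kind reported in this table can be computed from any pair of aligned sequences, whether nucleotides or amino acids. The sketch below is a minimal illustration and not the MEGA 6 procedure: the function name, the choice of gap characters and the toy sequences are assumptions, and positions where either sequence has a gap are excluded from the comparison.

```python
def percent_identity(seq_a: str, seq_b: str, gap_chars: str = "-?") -> float:
    """Percentage of identical residues over the aligned, gap-free positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in gap_chars or y in gap_chars:
            continue  # ignore positions with a gap or missing residue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    print(percent_identity("ATGGCC", "ATGACC"))   # nucleotide comparison
    print(percent_identity("MKV-L", "MKVAL"))     # amino acid comparison
```

Nucleotide and amino acid identities for the same region generally differ, since synonymous substitutions lower nucleotide identity without changing the protein, while a single non-synonymous change affects one residue out of a third as many positions.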
Figure 5. Molecular Phylogenetic analysis by Maximum Likelihood method of DCoV Orf1a gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the General Time Reversible model[72]. The tree with the highest log likelihood (−87186.89) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 1.4644)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 9 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 10137 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Figure 7. Molecular Phylogenetic analysis by Maximum Likelihood method of DCoV partial SP gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the General Time Reversible model[72]. The tree with the highest log likelihood (−30555.18) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 1.7612)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 11.02% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 9 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 3312 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1. JQ065049-MCoV-CN | | | | | | | | |
| 2. JQ065046-MrCoV-CN | 3984 | | | | | | | |
| 3. FJ376622-MuCoV-CN | 3769 | 2302 | | | | | | |
| 4. FJ376621-TCoV-CN | 3388 | 3579 | 3397 | | | | | |
| 5. MF431743-PDCoV-CN | 3814 | 2979 | 2991 | 3311 | | | | |
| 6. FJ376620-BCoV-CN | 3500 | 3464 | 3356 | 2534 | 3223 | | | |
| 7. JQ065047-NhCoV-CN | 4883 | 5173 | 5054 | 4962 | 5073 | 4838 | | |
| 8. JQ065048-WCoV-CN | 4855 | 5174 | 5085 | 4987 | 5107 | 4956 | 5052 | |
| 9. MAD-DCoV-Orf1a-10650nt | 4782 | 5188 | 5127 | 4983 | 5148 | 4980 | 5100 | 4240 |
The number of base differences between sequences is shown. The analysis involved 9 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 10137 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1. JQ065049-MCoV-CN | | | | | | | | |
| 2. JQ065046-MrCoV-CN | 1521 | | | | | | | |
| 3. FJ376622-MuCoV-CN | 1431 | 1531 | | | | | | |
| 4. FJ376621-TCoV-CN | 1428 | 1547 | 1504 | | | | | |
| 5. MF431743-PDCoV-CN | 1447 | 1566 | 1040 | 1493 | | | | |
| 6. FJ376620-BCoV-CN | 1341 | 1543 | 1102 | 1484 | 1058 | | | |
| 7. JQ065047-NhCoV-CN | 1535 | 1509 | 1590 | 1507 | 1552 | 1564 | | |
| 8. JQ065048-WCoV-CN | 1489 | 1507 | 1533 | 1551 | 1544 | 1542 | 1599 | |
| 9. MAD-DCoV-SP-3702nt | 1548 | 1563 | 1612 | 1598 | 1594 | 1602 | 1593 | 1427 |
The number of base differences between sequences is shown. The analysis involved 9 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 3312 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Figure 8. Molecular Phylogenetic analysis by Maximum Likelihood method of AV partial IVa2 gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model[71]. The tree with the highest log likelihood (−1171.71) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.8585)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 279 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1. KR135164-DAd2-CN | | | | |
| 2. JF510462-GAd4-HU | 53 | | | |
| 3. FN824512-PAd1-NL | 83 | 89 | | |
| 4. KC493646-FAd5-IE | 78 | 79 | 66 | |
| 5. MAD-AV-IVa2-279nt | 69 | 50 | 112 | 90 |
The number of base differences between sequences is shown. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 279 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Figure 9. Molecular Phylogenetic analysis by Maximum Likelihood method of AV partial III gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model[71]. The tree with the highest log likelihood (−805.13) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 1.2536)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 177 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1. KR135164-DAd2-CN | | | | |
| 2. JF510462-GAd4-HU | 40 | | | |
| 3. FN824512-PAd1-NL | 61 | 76 | | |
| 4. KC493646-FAd5-IE | 55 | 50 | 63 | |
| 5. MAD-AV-III-177nt | 34 | 37 | 78 | 54 |
The number of base differences between sequences is shown. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 177 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Figure 10. Molecular Phylogenetic analysis by Maximum Likelihood method of AV partial pVIII gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura 3-parameter model[70]. The tree with the highest log likelihood (−415.13) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 62.84% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 114 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1. KR135164-DAd2-CN | | | | |
| 2. JF510462-GAd4-HU | 13 | | | |
| 3. FN824512-PAd1-NL | 27 | 22 | | |
| 4. KC493646-FAd5-IE | 29 | 33 | 28 | |
| 5. MAD-AV-pVIII-114nt | 15 | 18 | 28 | 30 |
The number of base differences between sequences is shown. The analysis involved 5 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 114 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Figure 11. Molecular Phylogenetic analysis by Maximum Likelihood method of HCLV1 partial Rp gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura 3-parameter model[70]. The tree with the highest log likelihood (−2284.45) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 7 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1496 positions in the final dataset. Evolutionary analyses were conducted in MEGA7[63].
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1. MUD-HCLV1-s1-Rp-1496nt | | | | | | |
| 2. MF176388-HCLV1-s1-WA | 2 | | | | | |
| 3. MF176368-HCLV1-s1-WA | 2 | 2 | | | | |
| 4. MF176309-HCLV1-s1-WA | 2 | 2 | 0 | | | |
| 5. MF176280-HCLV1-s1-WA | 2 | 2 | 2 | 2 | | |
| 6. MF176261-HCLV1-s1-WA | 2 | 2 | 0 | 0 | 2 | |
| 7. KX882962-HCLV1-CN | 38 | 38 | 38 | 38 | 38 | 38 |
The number of base differences between sequences is shown. The analysis involved 7 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1496 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Estimates of Evolutionary Divergence between Sequences.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1. JQ065049-MCoV-CN | | | | | | | | |
| 2. JQ065046-MrCoV-CN | 497 | | | | | | | |
| 3. FJ376622-MuCoV-CN | 465 | 269 | | | | | | |
| 4. FJ376621-TCoV-CN | 434 | 403 | 356 | | | | | |
| 5. MF431743-PDCoV-CN | 504 | 323 | 327 | 388 | | | | |
| 6. FJ376620-BCoV-CN | 447 | 393 | 366 | 287 | 385 | | | |
| 7. JQ065047-NhCoV-CN | 590 | 613 | 577 | 597 | 585 | 573 | | |
| 8. JQ065048-WCoV-CN | 572 | 656 | 625 | 615 | 634 | 599 | 628 | |
| 9. MAD-DCoV-RPP-2076nt | 566 | 676 | 629 | 618 | 659 | 627 | 629 | 557 |
The number of base differences between sequences is shown. The analysis involved 9 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 2076 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.