Liuying Tang, Junjing An, Zhengde Xie, Shoaleh Dehghan, Donald Seto, Wenbo Xu, Yixin Ji.
Abstract
The genome of HAdV-B14p1 strain BJ430, isolated from a six-month-old baby diagnosed with bronchial pneumonia at the Beijing Children's Hospital in December 2010, was sequenced, analyzed, and compared with reference adenovirus genome sequences archived in GenBank. The genome is 34,762 bp in length and shares 99.9% identity with the genome of HAdV-B14p1 strain 303600, which was isolated in the USA in 2006. Even more remarkably, it is 99.7% identical to the genome of HAdV-B14p (prototype "de Wit" strain), isolated in the Netherlands in 1955. The patient and the patient's parents presumably had little or no contact with persons from the USA or Ireland, both of which recently reported outbreaks of the re-emergent virus HAdV-B14p1. These genome data, their analysis, and this report provide a reference for any additional HAdV-B14 outbreak in China and a basis for the development of adenovirus vaccines and molecular pathogen surveillance protocols in high-risk areas.
Year: 2013 PMID: 23555956 PMCID: PMC3612040 DOI: 10.1371/journal.pone.0060345
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Annotation of coding sequences from the genome of HAdV-B14p1 strain BJ430.
| Gene | Product | Location |
| E1A | 29.1 kDa protein | 586–1165, 1250–1458 |
| E1A | 25.7 kDa protein | 586–1072, 1250–1458 |
| E1A | 6.5 kDa protein | 586–660, 1250–1354 |
| E1B | 20 kDa protein | 1628–2170 |
| E1B | 54.9 kDa protein | 1933–3417 |
| pIX | pIX protein | 3500–3919 |
| IVa2 | IVa2 protein | 3987–5320, 5596–5608 (-) |
| E2B | DNA polymerase | 5087–8659, 13644–13652 |
| Hypothetical protein | L1 agnoprotein | 7865–8275 |
| E2B | pTP protein | 8458–10419, 13644–13652 |
| VA-RNA I | VA-RNA I | 10451–10612 |
| L1 | 43 kDa protein | 10669–11829 |
| L1 | protein IIIa precursor | 11855–13618 |
| L2 | penton protein | 13701–15377 |
| L2 | protein VII | 15382–15960 |
| L2 | protein V precursor | 16003–17058 |
| L2 | protein X | 17087–17317 |
| L3 | protein VI | 17398–18138 |
| L3 | hexon protein | 18254–21091 |
| L3 | 23 kDa protein | 21128–21757 |
| E2A | DNA binding protein | 21835–23391 (-) |
| L4 | 100 kDa hexon-assembly associated protein | 23422–25860 |
| L4 | 22 kDa protein | 25592–26167 |
| L4 | 33 kDa protein | 25592–25910, 26080–26441 |
| L4 | protein VIII | 26491–27174 |
| E3 | 11.7 kDa protein | 27174–27491 |
| E3 | 14.6 kDa protein | 27445–27840 |
| E3 | 18.4 kDa protein | 27825–28325 |
| E3 | 20.1 kDa protein | 28345–28890 |
| E3 | 20.8 kDa protein | 28908–29459 |
| E3 | 10.1 kDa protein | 29502–29777 |
| E3 | 14.9 kDa protein | 29782–30186 |
| E3 | 15 kDa protein | 30179–30586 |
| U | U protein | 30610–30774 (-) |
| L5 | fiber protein | 30789–31760 |
| E4 | Orf6/7 protein | 31796–32047, 32770–32943 (-) |
| E4 | Orf6 protein | 32044–32943 (-) |
| E4 | Orf4 protein | 32846–33214 (-) |
| E4 | Orf3 protein | 33223–33576 (-) |
| E4 | Orf2 protein | 33573–33962 (-) |
| E4 | Orf1 protein | 34005–34382 (-) |
Nucleotide locations indicate the start and stop codons, with coding sequences transcribed from the complementary strand designated by “(-)”, e.g., “5596–5608 (-)”. Coding sequences with spliced regions are indicated by double entries in the location column.
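The location notation described above can be read mechanically. The following is a minimal Python sketch, not part of the original annotation workflow; the function names `parse_location` and `coding_length` are hypothetical and introduced only for illustration. It assumes coordinates are 1-based and inclusive, as the table note implies.

```python
import re
from typing import List, Tuple

def parse_location(text: str) -> Tuple[List[Tuple[int, int]], str]:
    """Parse a Location cell such as '3987–5320, 5596–5608 (-)'.

    Returns (segments, strand): segments is a list of 1-based inclusive
    (start, end) pairs, one per spliced region; strand is '-' when the
    cell carries the '(-)' marker and '+' otherwise.
    """
    strand = "-" if "(-)" in text else "+"
    # Drop the strand marker first so its hyphen is not read as a range dash.
    cleaned = text.replace("(-)", "")
    # Accept an en dash or an ASCII hyphen between the two coordinates.
    pairs = re.findall(r"(\d+)\s*[–-]\s*(\d+)", cleaned)
    return [(int(a), int(b)) for a, b in pairs], strand

def coding_length(segments: List[Tuple[int, int]]) -> int:
    """Total coding length in bp, summed over all spliced segments."""
    return sum(end - start + 1 for start, end in segments)

# Example rows taken from the table above.
for cell in ("586–1165, 1250–1458", "3987–5320, 5596–5608 (-)"):
    segments, strand = parse_location(cell)
    print(cell, "->", segments, strand, coding_length(segments), "bp")
```

For the spliced E1A 29.1 kDa entry this gives 580 + 209 = 789 bp, which is a multiple of three, as expected for a coding sequence whose coordinates include the stop codon.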
Comparison of non-coding sequence motifs between HAdV-B14p1 strains BJ430 and 303600.
| Motif | Function | BJ430 | 303600 | Length (bp) |
| CATCAT...TGACGT | inverted terminal repeat | 1–136 | 1–136 | 136 |
| | DNApol-pTP binding site | 9–18 | 9–18 | 10 |
| | NFI binding site | 26–39 | 26–39 | 14 |
| ctgtgtgg | Sp1 recognition site | 72–79 | 72–79 | 8 |
| TATTTA | TATA signal for E1A | 493–498 | 493–498 | 6 |
| AATAAA | polyA signal for E1A | 1517–1522 | 1517–1522 | 6 |
| TATATA | TATA signal for E1B | 1574–1579 | 1574–1579 | 6 |
| AGTAAA | polyA signal for E1B | 3467–3472 | 3467–3472 | 6 |
| TAAGGT | TATA signal for IX | 3415–3420 | 3415–3420 | 6 |
| AAAAAT | polyA signal for IX | 3470–3475 | 3470–3475 | 6 |
| TTGATT | polyA signal for IVa2 | 3963–3968 | 3963–3968 | 6 |
| | inverted CAAT box for MLP | 5858–5867 | 5858–5867 | 10 |
| TTCACGTGA | upstream element for MLP | 5877–5885 | 5878–5886 | 9 |
| | MAZ binding site for MLP | 5898–5907 | 5899–5908 | 10 |
| TATAAAA | TATA signal for major late promoter (MLP) | 5908–5914 | 5909–5915 | 7 |
| | MAZ/SP1 binding site for MLP | 5915–5925 | 5916–5926 | 11 |
| TCACTGT | initiator element for MLP | 5937–5943 | 5938–5944 | 7 |
| | DE1 for MLP | 6024–6034 | 6025–6035 | 11 |
| | DE2a & DE2b for MLP | 6039–6054 | 6040–6055 | 16 |
| AATAAA | polyA signal for L1 | 13626–13631 | 13624–13629 | 6 |
| AATAAA | polyA signal for L2 | 17337–17342 | 17335–17340 | 6 |
| AATAAA | polyA signal for L3 | 21781–21786 | 21780–21785 | 6 |
| AATAAA | polyA signal for E2A | 21793–21798 (-) | 21792–21797 (-) | 6 |
| TATAAA | TATA box for E3 | 26856–26861 | 26855–26860 | 6 |
| AATAAA | polyA signal for L4 | 27335–27340 | 27334–27339 | 6 |
| AATAAA | polyA signal for E3 | 30597–30602 | 30597–30602 | 6 |
| AATAAA | polyA signal for L5 | 31763–31768 | 31763–31768 | 6 |
| AATAAA | polyA signal for E4 | 31779–31784 | 31779–31784 | 6 |
| TATATATA | TATA box for E4 | 34453–34460 | 34454–34461 | 8 |
| AACCTC…ATGATG | inverted terminal repeat | 34626–34762 | 34627–34763 | 137 |
Nucleotide signatures and putative functions are indicated, along with locations noted for the genome.
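The last column and the coordinate offsets between the two strains can be recomputed from the listed positions. The sketch below is illustrative only; `span_length` and `compare_motif` are hypothetical helper names, and the coordinates are assumed to be 1-based and inclusive.

```python
from typing import Tuple

Span = Tuple[int, int]

def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive coordinate range."""
    return end - start + 1

def compare_motif(bj430: Span, ref: Span) -> Tuple[int, int, int]:
    """Return (length in BJ430, length in 303600, shift).

    shift is the 303600 start minus the BJ430 start, so a positive value
    means the motif sits further downstream in strain 303600.
    """
    return span_length(*bj430), span_length(*ref), ref[0] - bj430[0]

# Rows copied from the table above: (function, BJ430 span, 303600 span).
rows = [
    ("upstream element for MLP", (5877, 5885), (5878, 5886)),
    ("polyA signal for L1", (13626, 13631), (13624, 13629)),
    ("polyA signal for E3", (30597, 30602), (30597, 30602)),
]
for name, bj430, ref in rows:
    print(name, compare_motif(bj430, ref))
```

A shift of zero means the motif occupies the same coordinates in both genomes; nonzero shifts reflect insertions or deletions upstream of the motif in one strain relative to the other.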
Figure 1. Genome organization of HAdV-B14p1 coding sequences.
The genome is represented by a central black horizontal line marked at 5-kbp intervals. Protein-encoding regions are shown as arrows indicating transcriptional orientation. Forward arrows (above the horizontal black line) show coding regions in the 5’ to 3’ direction, and arrows pointing to the left (below the horizontal black line) show coding regions encoded on the complementary strand. Spliced regions are indicated by a black line joining the coding sequences. Colors are added only for contrast between groups and indicate nothing beyond the grouping of genes by transcript; for example, the two red genes of “L1” are unrelated to the eight red genes of “E3”.
Percent identities of selected HAdV-B14p1 strain BJ430 proteins with representative HAdVs from all species, including all species B viruses.
| Protein | E1A 29.1-kDa protein | E1B 20-kDa protein | DNA polymerase | pTP | L1 43-kDa protein | L2 penton | L3 hexon | pVIII | L5 fiber | E4 34-kDa protein |
| HAdV-12 (A) | 44 | 44 | 70 | 76 | 75 | 72 | 77 | 77 | 31 | 53 |
| HAdV-11 (B2) | 97 | 98 | 99 | 99 | 99 | 97 | 92 | 99 | 92 | 99 |
| HAdV-14 (B2) | 98 | 100 | 99 | 99 | 100 | 99 | 99 | 100 | 99 | 100 |
| HAdV-34 (B2) | 98 | 99 | 99 | 99 | 99 | 95 | 94 | 99 | 63 | 98 |
| HAdV-35 (B2) | 96 | 98 | 99 | 99 | 95 | 97 | 91 | 99 | 62 | 98 |
| HAdV-3 (B1) | 77 | 88 | 90 | 93 | 98 | 85 | 85 | 94 | 57 | 98 |
| HAdV-7 (B1) | 78 | 89 | 90 | 93 | 92 | 85 | 86 | 94 | 91 | 97 |
| HAdV-16 (B1) | 78 | 90 | 91 | 94 | 92 | 85 | 85 | 94 | 51 | 89 |
| HAdV-21 (B1) | 79 | 90 | 91 | 93 | 92 | 91 | 90 | 94 | 62 | 97 |
| HAdV-50 (B1) | 78 | 90 | 91 | 93 | 92 | 91 | 93 | 94 | 61 | 97 |
| SAdV-21 (B1) | 80 | 88 | 92 | 93 | 92 | 90 | 93 | 95 | 55 | 86 |
| HAdV-1 (C) | 36 | 49 | 76 | 78 | 82 | 69 | 76 | 80 | 29 | 62 |
| HAdV-9 (D) | 41 | 54 | 80 | 79 | 80 | 77 | 82 | 80 | 29 | 69 |
| HAdV-4 (E) | 56 | 59 | 84 | 88 | 90 | 82 | 82 | 89 | 28 | 70 |
| HAdV-40 (F) | 37 | 47 | 70 | 74 | 78 | 72 | 78 | 79 | 35 | 47 |
| HAdV-52 (G) | 36 | 42 | 76 | 78 | 82 | 72 | 79 | 78 | 28 | 51 |
The proteins span the entire genome.
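The record does not state how the percent identities were calculated. As a rough illustration only, one common convention divides the number of identical positions by the number of aligned columns in a pairwise alignment, skipping columns where both sequences have a gap. The sketch below assumes that convention; the function name `percent_identity` and the toy sequences are hypothetical and are not adenovirus data.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention: identical columns divided by all columns in which
    at least one sequence has a residue (double-gap columns are skipped).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns

# Toy aligned peptides (hypothetical): 6 identical columns out of 8.
print(round(percent_identity("MKTAY-LV", "MKSAYQLV")))
```

Other conventions (for example, dividing by the length of the shorter sequence) give slightly different values, so figures computed this way may not match the table exactly.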
Figure 2. Phylogenetic analysis.
E1A, fiber, and hexon genes, as well as whole-genome sequences of HAdV, were analyzed with respect to their phylogenetic relationships. Genes from the three recent HAdV-B14p1 strains are closely related to each other and to the prototype HAdV-B14p genome. It is remarkable that HAdV-B14p1 retains a high level of sequence similarity to the prototype genome after approximately 50 years. Phylogenetic analysis was performed using the software MEGA v4.0 (Molecular Evolutionary Genetics Analysis; http://www.megasoftware.net), specifically applying a maximum-composite-likelihood method that generated neighbor-joining and bootstrapped trees of phylogeny with 1,000 replicates; all other parameters were set by default.
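The bootstrap procedure mentioned in the caption can be outlined in a few lines. The sketch below is not MEGA's implementation and does not build trees: it shows only the column-resampling step that produces each bootstrap replicate, paired with a simple p-distance (proportion of differing sites) standing in for the maximum-composite-likelihood distance named above. All function names and the toy alignment are hypothetical.

```python
import random
from typing import Dict, Tuple

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0

def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement; every sequence gets the same columns."""
    n = len(next(iter(alignment.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances keyed by (name, name)."""
    names = sorted(alignment)
    return {(p, q): p_distance(alignment[p], alignment[q]) for p in names for q in names}

# Toy alignment (hypothetical sequences, not real adenovirus data).
toy = {"strainA": "ACGTACGT", "strainB": "ACGTACGA", "strainC": "ACG-ACGA"}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [distance_matrix(bootstrap_alignment(toy, rng)) for _ in range(1000)]
print(len(replicates), "replicate distance matrices")
```

In a full analysis, each replicate matrix would be passed to a neighbor-joining routine, and the bootstrap support for a clade would be the fraction of replicate trees that contain it.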