Literature DB >> 30254627

Adenoviromics: Mining the Human Adenovirus Species D Genome.

Ashrafali M Ismail¹, Ji Sun Lee¹, Jeong Yoon Lee^1,2, Gurdeep Singh^1,3, David W Dyer⁴, Donald Seto⁵, James Chodosh¹, Jaya Rajaiya¹.

Abstract

Human adenovirus (HAdV) infections cause disease world-wide. Whole genome sequencing has now distinguished 90 distinct genotypes in 7 species (A-G). Over half of these 90 HAdVs fall within species D, with essentially all of the HAdV-D whole genome sequences generated in the last decade. Herein, we describe recent new findings made possible by mining of this expanded genome database, and propose future directions to elucidate new functional elements and new functions for previously known viral components.

Entities: CellLine Chemical Disease Gene Species

Keywords: adenovirus; evolution; genome; interactome; transcription factor

Year: 2018 PMID： 30254627 PMCID： PMC6141750 DOI： 10.3389/fmicb.2018.02178

Source DB: PubMed Journal: Front Microbiol ISSN： 1664-302X Impact factor: 5.640

Introduction

Human adenovirus (HAdV) infections represent a significant source of morbidity and mortality, world-wide and at all ages, through highly transmittable infections at mucosal sites, including the eye, and urinary, respiratory, and gastrointestinal tracts (Horwitz, 1996). HAdV causes fatal acute respiratory distress syndrome in healthy adults and is especially lethal in infants and the immune compromised (Bhanthumkosol, 1998; Ryu et al., 2003; Wallot et al., 2006; Engelmann et al., 2016; Tan et al., 2016; Zhang et al., 2016). No FDA-approved therapy for acute HAdV infection is available. At resolution of acute infection, persistence may develop within nasopharyngeal lymphoid tissue (Neumann et al., 1987; Garnett et al., 2002, 2009; Zhang et al., 2010; Assadian et al., 2016), as yet uncharacterized cells in the gastrointestinal tract (Roy et al., 2009), and possibly the ocular surface (Kaye et al., 2005), permitting evolution of new HAdVs through homologous recombination between two or more HAdVs infecting the same cell(s) (Lee et al., 2005, 2018; Echavarria et al., 2006; McCarthy et al., 2009; Seto et al., 2010). HAdVs are divided phylogenetically into seven species (A-G), with a total of 90 recognized genotypes with whole genome sequences in GenBank, including the original 51 “serotypes”—determined by serum neutralization—which now all have been fully sequenced (Table 1) (Robinson et al., 2013a) Human adenovirus species D (HAdV-D) is the largest and most rapidly growing among all HAdV species, and contains viruses associated with epidemic keratoconjunctivitis (EKC), a severe, hyperacute ocular surface infection (Butt and Chodosh, 2006). A collaboration funded by the American Recovery and Reinvestment Act of 2009 came to fruition with the complete whole genome sequencing and analysis of all previously unsequenced HAdV-D serotypes (Robinson et al., 2013a), leading to a new understanding of adenovirus ontogeny (Jones et al., 2007; Robinson et al., 2008, 2009a,b; Robinson et al., 2011b,c; Robinson et al., 2013a,b; Walsh et al., 2009, 2010a,b; Arnold et al., 2010; Torres et al., 2010; Dehghan et al., 2011, 2013a,b; Walsh et al., 2011; Liu et al., 2011; Seto et al., 2011, 2013; Singh et al., 2012, 2013; Zhou et al., 2012)—including those HAdV-Ds associated with EKC (Robinson et al., 2008, 2009b, 2011b; Walsh et al., 2009; Zhou et al., 2012)—and ultimately to a new typing system for HAdV based on genomics (Seto et al., 2011).

Table 1

Species and type designations for the 51 human adenovirus (HAdV) serotypes.

Type	GenBank accession no.	Genome length	Year published
HAdV-C1	AC_000017	36001	2004
HAdV-C2	AC_000007	35937	2003
HAdV-B3	AY599834	35345	2006
HAdV-E4	AY599837	35964	2006
HAdV-C5	AY601635	35931	2006
HAdV-C6	FJ349096	35758	2011
HAdV-B7	KP670856.2	35239	2016
HAdV-D8	AB448767	34980	2009
HAdV-D9	AJ854486	35083	2008
HAdV-D10	JN226746	35105	2013
HAdV-B11	AF532578	34794	2003
HAdV-A12	X73487	34125	1979
HAdV-D13	JN226747	35209	2013
HAdV-B14	JQ824845	34767	2012
HAdV-D15	KF268204	35100	2013
HAdV-B16	JN860680	35384	2011
HAdV-D17	HQ910407	35139	2011
HAdV-A18	GU191019	34177	2010
HAdV-D19	JQ326209	35153	2011
HAdV-D20	JN226749	35181	2013
HAdV-B21	AY601633	35382	2006
HAdV-D22	FJ619037	35152	2009
HAdV-D23	JN226750	35050	2013
HAdV-D24	JN226751	35166	2013
HAdV-D25	JN226752	35248	2013
HAdV-D26	EF153474	35152	2007
HAdV-D27	JN226753	35154	2013
HAdV-D28	FJ824826	35130	2010
HAdV-D29	JN226754	35214	2013
HAdV-D30	JN226755	35178	2012
HAdV-A31	AM749299	33763	2005
HAdV-D32	JN226756	35248	2013
HAdV-D33	JN226758	35131	2013
HAdV-B34	AY737797	34775	2004
HAdV-B35	AC_000019	34794	2004
HAdV-D36	GQ384080	35152	2010
HAdV-D37	AB448775	35215	2009
HAdV-D38	JN226759	35221	2013
HAdV-D39	JN226760	35152	2013
HAdV-F40	NC_001454	34214	1993
HAdV-F41	DQ315364.2	34188	2007
HAdV-D42	JN226761	35231	2013
HAdV-D43	JN226762	35012	2013
HAdV-D44	JN226763	35214	2013
HAdV-D45	JN226764	35154	2013
HAdV-D46	AY875648	35178	2006
HAdV-D47	JN226757	35106	2013
HAdV-D48	EF153473	35206	2007
HAdV-D49	DQ393829	35215	2006
HAdV-B50	AY737798	35385	2007
HAdV-D51	JN226765	35114	2013

Species and type designations for the 51 human adenovirus (HAdV) serotypes. Recent published work demonstrates how genome “mining,” in-depth analyses of the growing HAdV genome database, can bring about new realizations and add critical new information to prior ones. The trimeric fiber protein on adenoviruses mediates viral entry through interaction of the distal most “knob” structure on the fiber with host cell receptors. In a phylogenetic analysis of HAdV-D fiber genes, HAdV-D types associated with EKC were recently shown to form a unique clade (Ismail et al., 2016). By proteotyping, a new in silico methodology described in detail below, EKC virus-associated fiber knobs were uniquely shared, and signature amino acid positions distinguished EKC from non-EKC types. Remarkably, human corneal epithelial cell tropism could be predicted by the presence of a lysine or alanine at residue 240, and this amino acid residue in EKC viruses showed evidence for positive selection. These data added to the prior observation by Huang and coworkers that artificial mutation to a lysine at residue 240 in a non-EKC virus could confer infection of Chang cells, a conjunctiva derived continuous cell line (Huang et al., 1999). However, because Chang cells came later known to be contaminated by HeLa cells, the importance of residue 240 to ocular tropism was until this new observation, in some doubt. Another recently published effort provided further evidence of the importance and potential for HAdV genome mining. Late adenoviral gene expression is initiated by the adenovirus major late promoter (Ramke et al., 2017), followed by splicing of mRNAs to the viral tripartite leader for translation (Chow et al., 1977; Akusjärvi and Pettersson, 1978; Chow and Broker, 1978; Logan and Shenk, 1984). The HAdV tripartite leader is a 200-nucleotide 5' noncoding region that circumvents the requirement for eukaryotic initiation factor 4F or cap binding protein complex (Ziff and Evans, 1978; Akusjärvi and Pettersson, 1979; Dolph et al., 1988; Zhang et al., 1989), and permits translation of HAdV mRNAs at late times in infection when cap-dependent translation is blocked due to shut down of host cellular cap-dependent mRNA translation. HAdV 5′ untranslated regions (5′UTRs) are critical for cap-independent initiation, and impact mRNA localization and stability. The HAdV tripartite leader (TPL), composed of three introns (TPL 1-3), drives translation of HAdV late mRNA. The annotation of 72 HAdV genotypes for the HAdV TPL and another previously described leader, the i-leader, let to identification of newly identified polycistronic mRNAs for RID-α and RID-β within the E3 transcription unit, and a potential new open reading frame (ORF) within the i-leader sequence, with termination of this potential protein in TPL3 (Ramke et al., 2017). In addition, the authors also identified a potential new leader sequence embedded within the E3 region, tentatively named the j-leader (Figure 1).

Figure 1

Putative “j”-leader located within the CR1-α E3 gene. (A) Schematic for the location of a newly detected leader (“j”-leader) embedded within the E3 CRI-α gene, experimentally determined to be spliced to some, but not all mRNAs of the E3 genes. (B) Gel photomicrograph of mRNA transcripts amplified with forward primer from TPL1 and reverse primers from CR1-γ, CR1-β, and RID-α. Primers were chosen to elicit similarly sized bands to facilitate subsequent sequencing. (C) Nucleotide sequence of the PCR product for CR1-β. The putative j-leader sequence and splice sites are shown in yellow and green, respectively. Note an additional 4 nucleotide 5′UTR (AACC) prior to the CR1-β start site (red). The 5′UTR in (C) prior to the splice site for the j-leader is from TPL3. Adapted from Ramke et al. (2017) with permission.

Structure and infection

The HAdV is non-enveloped, icosahedral in shape, and contains a double stranded DNA genome of ~36,000 base pairs (bp) with ~1 open reading frame (ORF) for every 1000 nucleotides. Viral DNA is associated with four (interior) core proteins including Mu, VII, V, and terminal protein. The histone-like protein (p) VII protects viral DNA from cellular DNA damage responses (Lischwe and Sung, 1977; Karen and Hearing, 2011; Avgousti et al., 2017). The outer protein coat (capsid) of the virus consists of 240 hexon capsomers and 12 penton capsomers, along with several minor capsid proteins. The latter include pVI, pIIIa, pVIII, and pIX and are important to capsid stability. Each penton capsomer contains a ring of five penton base proteins which bind and support the trimeric fiber protein with its distal fiber knob. During viral infection, the fiber knob binds to one of several host cell receptors (Nemerow, 2000; Goosney and Nemerow, 2003; Nemerow et al., 2009). The penton base protein contains two hypervariable loops. The interaction between fiber knob and a host cell receptor brings about secondary contact between the hypervariable loop 2 (HVL2) arginine-glycine-aspartic acid (RGD) motif in each penton base protein (five per penton base capsomer) with host cell integrins αvβ3, αvβ5, and αvβ1, that in turn induce endocytosis of the virus (Li et al., 1998a,b; Li et al., 2000). HAdV structural proteins can serve multiple functions. For example, the minor capsid structural protein VI (pVI) plays a critical role in at least three distinct aspects of the viral “life” cycle: endosomal escape during cell entry, nuclear assembly during viral replication, and stability of the intact, infectious virus outside the host (Wodrich et al., 2003; Wiethoff et al., 2005; Moyer et al., 2011, 2016). These findings suggest that, as with pVI, other HAdV structural proteins may have multiple functions yet to be elucidated.

Genomics and evolution

The relatively large genome database for HAdV-D (over 50 unique viruses with available whole genome sequences) (Tables 1, 2) has permitted detailed analyses of genome relationships within this clinically important adenovirus species. HAdV-D genomes are highly conserved (>90%). However, whole genome analyses of HAdV-D have revealed specific loci of genetic hypervariability in the hexon, penton base, fiber, and E3 CR1α, β, and γ genes (Figure 2), dictating nonsynonymous amino acid changes in corresponding proteins (Figure 3). GC content confers genome stability and resistance to recombination (Gruss et al., 1991), and the genomes of HAdV-D have among the highest GC content among HAdV species (~56%). The hypervariable regions in HAdV-D were found to be sharply reduced in GC nucleotide content relative to the rest of the genome (Robinson et al., 2013a). Mutations in HAdV are relatively infrequent, with genome stability now documented in some types across decades (Hofmayer et al., 2009; Mahadevan et al., 2010; Seto et al., 2010; Dehghan et al., 2013b; Robinson et al., 2013a; Alkhalaf et al., 2015). However, those regions of the genome shown to be hypervariable and relatively low in GC content are the very same also shown to undergo homologous recombination (Robinson et al., 2009a, 2011b; Walsh et al., 2009; Zhou et al., 2012; Singh et al., 2013), driving the evolution of new genotypes.

Table 2

Species and molecular types of human adenovirus (HAdV) genotypes 52–90.

HAdV type	^#Name	GenBank accession no.	Genome length	Year published
HAdV-G52	P52H52F52/2003/USA	DQ923122.2	34250	2007
HAdV-D53	P37H22F8/2005/DEU	FJ169625	34909	2009
HAdV-D54	P54H54F8/2000/JPN	AB333801	34920	2008
HAdV-B55	P14H11F14/2006/CHN	FJ643676	34755	2010
HAdV-D56	P56H15F9/2008/FRA	HM770721	35066	2011
HAdV-C57	P1H57F6/2001/RUS	HQ003817	35818	2011
HAdV-D58	P58H58F29/1996/ARG	HQ883276	35217	2011
HAdV-D59	P64H25F56/2007/USA	JF799911	35072	2012
HAdV-D60	P60H20F60/2009/CAN	HQ007053	35050	2013
HAdV-A61	P31H31F31/2004/JPN	JF964962	33776	2011
HAdV-D62	P62H62F62/1993/GBR	JN162671	35127	2014
HAdV-D63	P30H30F29/1959/USA	JN935766	35168	2012
HAdV-D64	P22H19F37/1993/USA	EF121005	35231	2012
HAdV-D65	P58H10F9/2004/BGD	AP012285	35172	2012
HAdV-B66	P66H7F3/1987/ARG	JN860676	35080	2012
HAdV-D67	P67H9F67/2005/BGD	AP012302	35075	2013
HAdV-B68	P16H3F16/2004/ARG	JN860678	35538	Unpublished
HAdV-D69	P53H15F69/1955/SAU	JN226748	35124	2013
HAdV-D70	P70H70F29/2014/DEU	KP641339	35186	2015
HAdV-D71	P9H20F71/1987/DEU	KF268207	35192	2013
HAdV-D72	P72H30F72/1985/DEU	KF268335	34553	2013
HAdV-D73	P67H45F27/2015/DEU	KY618676	35190	2017
HAdV-D74	P70H74F51/2015/DEU	KY618677	35155	2017
HAdV-D75	P75H26F29/2015/DEU	KY618678	35104	2017
HAdV-B76	P21H21F16/DEU	KF633445	35586	2013
HAdV-B77	P35H34F7/1985/DEU	KF268328	34653	2013
HAdV-B78	P11H11F7/2000/ARG	KT970440	34881	Unpublished
HAdV-B79	P11H34F11/2015/JPN	LC177352	34779	2017
HAdV-D80	P19,23H28F22/2014/DEU	TBA	34909	Unpublished
HAdV-D81	P65H48F60/2012/JPN	AB765926.1	35198	2014
HAdV-D82	P56H15F37/2011/JPN	LC066535.1	35122	Unpublished
HAdV-D83	P83H9F15/2010/PAR	KX827426.1	35207	2017
HAdV-D84	P43H17F84/2011/PAN	MF416150	35257	2017
HAdV-D85	P37H19F8/2015/JPN	LC314153	35203	2018
HAdV-D86	P9H25F25/1978/SWE	TBA	35147	Unpublished
HAdV-D87	P9H15F25/1967/USA	MF476841	35159	Unpublished
HAdV-D88	P88H15F9/1963/USA	MF476842	35115	Unpublished
HAdV-C89	P89H2F2/2015/DEU	TBA	35998	Unpublished
HAdV-D90	P33H27F67/2017/BGD	TBA	34207^*	Unpublished

Name indicates molecular type (P, penton base; H, hexon; F, fiber)/year of isolation/country of isolation.

TBA: GenBank accession number, to be assigned.

Metagenomics project, missing the inverted terminal repeat sequences.

Figure 2

Nucleotide diversity plots, by HAdV species, generated with DnaSP, represent the average number of nucleotide differences per site between each type in every HAdV species. The % diversity is calculated on the y-axis; the x-axis illustrates the nucleotide position on the genome. HAdV-Ds (red line on bottom half of plot) show particular diversity in the penton base, hexon, E3, and fiber coding regions, with otherwise very high conservation. From Robinson et al. (2013a) with permission.

Figure 3

Amino acid diversity calculated in MEGA 4.02, measuring the average amino acid substitution for each HAdV-D protein. Each bar in the graph corresponds to a protein as represented by arrows. Red = early genes. Dark green = late genes. Black = intermediate genes. The hypervariable loops of penton base and hexon proteins were also analyzed separately (light green) and showed particularly high levels of amino acid substitutions. From Robinson et al. (2013a) with permission.

Species and molecular types of human adenovirus (HAdV) genotypes 52–90. Name indicates molecular type (P, penton base; H, hexon; F, fiber)/year of isolation/country of isolation. TBA: GenBank accession number, to be assigned. Metagenomics project, missing the inverted terminal repeat sequences. Nucleotide diversity plots, by HAdV species, generated with DnaSP, represent the average number of nucleotide differences per site between each type in every HAdV species. The % diversity is calculated on the y-axis; the x-axis illustrates the nucleotide position on the genome. HAdV-Ds (red line on bottom half of plot) show particular diversity in the penton base, hexon, E3, and fiber coding regions, with otherwise very high conservation. From Robinson et al. (2013a) with permission. Amino acid diversity calculated in MEGA 4.02, measuring the average amino acid substitution for each HAdV-D protein. Each bar in the graph corresponds to a protein as represented by arrows. Red = early genes. Dark green = late genes. Black = intermediate genes. The hypervariable loops of penton base and hexon proteins were also analyzed separately (light green) and showed particularly high levels of amino acid substitutions. From Robinson et al. (2013a) with permission. Adenoviruses recombine specifically during viral replication (Williams et al., 1975; Meinschad and Winnacker, 1980; Munz et al., 1983), and do so by both homologous and heterologous mechanisms (Young et al., 1984; Epstein and Young, 1991; Crawford-Miksza and Schnurr, 1996). However, the evidence for homologous recombination as the major mechanism driving HAdV-D evolution is unassailable (Robinson et al., 2013a; Singh et al., 2013). Specifically, recombination occurs in the two penton base hypervariable regions (these code for two hypervariable loops (HVLs) on the penton base protein, separated from one another by ~125 conserved amino acids), seven hexon hypervariable regions (these are closely adjacent in the hexon gene and determine two adjacent HVLs on the hexon protein), fiber (fiber gene and protein are entirely hypervariable), and E3 CR1α, β, and γ (each also entirely hypervariable). For homologous recombination between two HAdVs to occur, at least two virus types with high nucleotide sequence homology at corresponding locations in both genomes must co-infect the same cell, and viral DNA replication should be ongoing. Co-infection by two or more HAdVs has been well documented (Lee et al., 2005; Echavarria et al., 2006; Vora et al., 2006; McCarthy et al., 2009; Halstead et al., 2010; Seto et al., 2010), as has the presence of two HAdV types in archived clinical samples (Singh et al., 2012). “Proteotyping” is a novel approach to the study of genome evolution (Obenauer et al., 2006), and has been applied to characterize recombination among HAdV-D (Robinson et al., 2013a; Singh et al., 2013). In this method, maximum likelihood trees are used to align amino acid sequences of hypervariable, frequently recombined proteins. Each amino acid is assigned a unique, arbitrary color. Consensus residues are colored white, and gaps in the alignment are colored black. A threshold of <10% sequence divergence is used to distinguish unique proteotypes. An example of proteotyping is shown (Figure 4), comparing an amino acid alignment from E3 14.7K, a highly conserved gene with one distinct proteotype, with the hypervariable E3 CR1α (Singh et al., 2013), with six distinct proteotypes observed among 38 HAdV-Ds. E3 14.7K is therefore not hypervariable and not recombinant. E3 CR1α is hypervariable and recombinant.

Figure 4

Proteotyping analysis comparing the HAdV-D E3 14.7K (A) and CR1α (B) proteins. The 14.7K protein was conserved, while CR1α demonstrated 6 unique proteotypes. Maximum likelihood phylogenetic trees are shown to the left for each putative protein, and amino acid signatures to the right. The scale bar at the bottom left of each sub-figure denotes the phylogenetic distance reflected in horizontal dimension of the respective tree. To construct the amino acid signatures shown, each amino acid was assigned a unique color (upper right corner), consensus amino acids at each position across all 38 viruses were assigned white, and gaps in the alignment were colored black. Horizontal red lines delineate distinct proteotypes. Adapted from. Singh et al. (2013) with permission. Another way to interpret the analyses for those proteins like E3 CR1α, with more than one proteotype is that those proteotypes containing more than one HAdV type have previously recombined in nature, while those proteotypes with only one HAdV type are those that have not (yet) been shown to recombine in nature. HAdV-D37 and 29 fall within different hexon proteotypes (Figure 5). HAdV-D37 shares a hexon proteotype with HAdV-D13 and 30 (Robinson et al., 2013a), while HAdV-D29 shares a hexon proteotype with HAdV-D15, 56, and 69 (Singh et al., 2015). These two hexon proteotypes therefore have undergone prior homologous recombination. In contrast, the hexon proteins of HAdV-D10 and 28 are each in a proteotype with only one member; hexon recombination for these two viruses has therefore not yet been documented in nature. In sum, these data show by independent means that homologous recombination within HAdV-D is common, and confirm previously recognized patterns of homologous recombination among HAdV-D (Robinson et al., 2009a, 2011b, 2013a,b; Walsh et al., 2009, 2010a; Singh et al., 2012, 2013; Zhou et al., 2012; Gonzalez et al., 2014).

Figure 5

Proteotyping for 38 HAdV-Ds, sorted for the hexon proteotype column. Numbers and colors are arbitrary, and distinguish distinct proteotypes. Recombinants can be identified by rows. For example, HadV-D56, -D29, and -D15 fall within the same proteotype and are predicted to share highly similar nucleotide sequences for their respective hexon hypervariable regions (as confirmed by Singh et al., 2015). For HAdV-D29 and –D15, the recombination event extended through the E3 CR1β ORF gene and then ended. The local sequence and/or structure of DNA in regions flanking recombinogenic sites is significant for directing cellular recombination machinery to those regions. In bacteria, a signal for recombination between homologous DNA is the crossover hotspot instigator, or Chi nucleotide sequence. This was first discovered in bacteriophage lambda, then in bacterial DNA, and later shown to mediate recombination between them (Stahl, 1998). The Chi sequence in E. coli (ChiEC) is 5′-GCTGGTGG-3′ (Smith et al., 1981; Bianco and Kowalczykowski, 1997), and its presence induces the conversion of the RecBCD enzyme from a helicase to an exonuclease, producing ssDNA that can invade homologous dsDNA during recombination (Taylor et al., 1985). The RecA enzyme of E. coli is loaded onto unwound ssDNA by RecBCD and promotes ssDNA exchange/recombination with homologous dsDNA (Cox, 1999; Smith, 2012). RecA has significant homology to eukaryotic Rad51 and its paralogs (Suwaki et al., 2011), enzymes that repair dsDNA breaks in human cells, facilitate homologous recombination, and during adenovirus infection, bind to the E2 DNA binding protein (Tookman et al., 2016). In our study of the region just 5′ to HVL2 on the penton base gene, a recombination hot-spot for HAdV-D (Robinson et al., 2009a), we found Chi-like sequences (ChiAD), e.g., 5′-ACTTCTGA-3′ in the proteotype containing HAdV-D64, and 5′-TCTCCTGA-3′ in the proteotype including HAdV-D37 (Lee et al., 2018). The putative ChiAD sequences we identified in HAdV-D were found within the GC-rich component of GC/AT transition zones that precede and include HVL2, and were conserved within each proteotype. In vitro, E. coli lysates containing RecA protein increased recombination of two HAdV-D genotypes with the same penton base HVL2 proteotype. RecA was shown by ChIP to bind specifically to ChiAD nucleotide sequence in the same regions, and also colocalize with adenovirus DNA within infected cell nuclei. These data suggest that Chi-like nucleotide sequences adjacent to the junction of conserved and hypervariable gene segments in HAdV-D may be an important signal for homologous recombination, and provide evidence in support of the idea that local bacterial flora might enhance natural recombination through Chi-like nucleotide sequences at HAdV-D recombination hotspots. Another explanation for homologous recombination between HAdV, not exclusive of a role for ChiAD, is the potential for GC-low (AT-rich) single stranded DNA (ssDNA) to form hairpin loops (Nagy and Bujarski, 1997; Ohshima et al., 2007), a physical nonlinearity that would facilitate binding of ssDNA of one HAdV-D type to a homologous segment of ss or dsDNA from a physically adjacent but different HAdV-D type during co-infection of the same cell. Hairpin loops and other alterations in the physical configuration of ssDNA during DNA replication might also contribute to polymerase jumping (Jennings et al., 1983; Spaan et al., 1983; Pääbo et al., 1990; Viswanathan et al., 1999), in which physical constraints to polymerization lead to translocation of the DNA polymerase to an adjacent DNA from a different virus, resulting in a recombined DNA. Polymerase jumping has been shown to occur during HAdV DNA replication (King et al., 1997; de Jong et al., 2003), although it has not been suggested previously as a mechanism for HAdV-D evolution. Analysis of 38 HAdV-D whole genome sequences identified instances of 15 nucleotide-long GC-rich sequence adjacent to 15 nucleotide-long AT-rich sequence (sometimes with a 15 or 30 nucleotide-long GC-moderate sequence intervening), located just 5′ and 3′ to frequently recombined gene segments, and which were shown by in silico analysis of their corresponding ssDNA to form hairpin loops (Robinson et al., 2013a). Taken together, these data suggest covariant effects of nucleotide sequence and ssDNA secondary structures on homologous recombination between two HAdV-Ds.

Transcriptome

Regions of the HAdV-D genome currently thought to be “noncoding,” may contain functional elements. Because viruses exist on the nano-scale, viral genomes are by necessity constrained by size, and “junk” nucleotide sequences represent an extravagance. The National Human Genome Research Institute project to identify functional elements in the human genome (Encyclopedia of DNA Elements, or ENCODE) identified functionality in much of the human genome previously without known utility (Consortium et al., 2007; Qu and Fang, 2013; Kellis et al., 2014) The double-stranded DNA genomes of HAdV also contain regions with no known function. Transcriptional profiling of host gene expression has been studied after HAdV infection (Dorer et al., 2011) However, although viral transcriptomes have been reported for several viruses, most notably dengue, varicella zoster, and Epstein-Barr viruses (Ortmann et al., 2008; Ertl et al., 2011; Nagel et al., 2011, 2013; Arvey et al., 2013; Sujayanont et al., 2014), a de novo HAdV transcriptome has not been reported. Wu and coworkers used deep RNA sequencing to confirm known bat AdV transcripts (Wu et al., 2013), but did not investigate “noncoding” regions. In silico ORF prediction in HAdV can be difficult due to splice variants and inconsistencies in banked gene annotations (Davison et al., 2003), but in a prior annotation of HAdV-D37, ~60 new additional ORFs were predicted using in combination, the NCBI ORF finder, TIGR annotation engine, and GeneMark Heuristic model (Robinson et al., 2008) Putative genes were found within the large regions of noncoding DNA on the complementary strand opposite to established HAdV genes (Figure 6), in smaller regions on the coding strand within established transcription units but between confirmed genes, and overlapping or completely within established genes. Work is in progress in our laboratories to identify putative new genomic elements in HAdV by high-throughput sequencing of the viral transcriptome of HAdV-D37.

Figure 6

Transcription map for HAdV-D37. Genes are divided by early (shaded) or late expression. Red brackets denote large areas of “noncoding” DNA, but many additional, smaller, potential coding regions exist between and within known genes. Adapted from Robinson et al. (2008) with permission.

Transcription factor binding sites

HAdV uses host TFs nuclear factor I and III (NF-I and NF-III) as part of the viral DNA replication complex (Pruijn et al., 1988; Mul et al., 1990; Hatfield and Hearing, 1991, 1993) Simian AdVs typically lack the NF-I binding site, while human viruses express it. It was previously reported that HAdV-E4, originally isolated in 1952, is a product of recombination between HAdV-B16 and the simian AdV, SAdV-E26. Clinical strains of HAdV-E4 isolated recently contain a NF-I binding site in the inverted terminal repeat (Houng et al., 2006; Dehghan et al., 2013a,b) that is absent in the original isolate (Purkayastha et al., 2005), suggesting that NF-I binding may be important to viral fitness in humans. To further elucidate mechanisms of viral gene expression, we are exploring novel TF binding sites on HAdV-D DNA, using ENCODE validated methodologies (Gerstein et al., 2012; Landt et al., 2012).

Protein interactomes

The major HAdV capsid structural proteins—hexon, penton base, and fiber—interact directly with extracellular mediators of host immunity (Gahéry-Ségard et al., 1998; Molinier-Frenkel et al., 2002; Schoggins and Falck-Pedersen, 2006; Tamanini et al., 2006; Kalyuzhniy et al., 2008; Chintakuntlawar et al., 2010; Bradshaw et al., 2012; Flatt et al., 2013). The hexon, penton base, and fiber proteins also exhibit distinct amino acid signatures, characterizing discrete proteotypes (Robinson et al., 2013a). Gene products from the E3 transcription unit of HAdV function in viral immune evasion (Horwitz, 2004; Lichtenstein et al., 2004b; Windheim et al., 2004). In HAdV-D, the open reading frames for three of eight E3 genes—CR1α, CR1β, and CR1γ–are uniquely hypervariable compared to the other ORFs within the E3 transcription unit, and also segregate into discrete proteotypes (Singh et al., 2013). Highly conserved genes, such as DNA binding protein, DNA polymerase, and E3 14.7K, show no such variability (Robinson et al., 2013a; Singh et al., 2013). While it may be assumed that hypervariablity in major capsid and E3 proteins is driven through evolutionary selection by the extracellular interactome, amino acid differences in a hypervariable protein can also lead to differences in that protein's intracellular interactome, the set of intrinsic host cell proteins which network with the viral protein, as was recently confirmed for E3 CR1 genes across HAdV species (Martinez-Martin et al., 2016). Viral capsid structural proteins are critical to virion stability. For the nonenveloped HAdV, fiber and penton base proteins on the external surface of the capsid serve as ligands for attachment to the host cell (Huang et al., 1999) and initiate viral entry (Wickham et al., 1993), respectively. HAdVs are typically internalized via endosomes. Endosomal acidification leads to structural instability of the capsid and endosomal release into the cytosol. HAdV capsid is then transported by microtubules to the nuclear membrane. Viral DNA then enters the nucleus through nuclear pores, leaving almost all the viral structural proteins in the cytosol (Henaff et al., 2011). Viral capsid proteins within the cell are eventually targeted for ubiquitination (Ko et al., 2010; Marvin and Wiethoff, 2012; Horan et al., 2013; Li et al., 2013) and degraded (Greber et al., 1993), but there are many opportunities for interaction with intracellular host cellular proteins during entry, trafficking, translation, assembly, and egress. Penton base HVL2, with its RGD motif, is critical to viral internalization through the interaction with host cell integrins,; (Wickham et al., 1993) but function of penton base HVL1 is unknown, and might be revealed though knowledge of its protein interactome. The closely adjacent hexon HVL1 and 2 form the epsilon epitope that determines serum neutralization, and interactions between the hexon protein and serum coagulation factor X confers liver tropism to HAdV-C5 (Sumarheni et al., 2014). However, nothing is known about potential hexon interactions with intracellular proteins during infection. The E3 transcription unit of HAdV codes for proteins that mediate immune evasion by the virus (Horwitz, 2004). Although E3 is labeled as an early transcription region, its transcripts are expressed both early and late during viral infection (Chow et al., 1977; Chow and Broker, 1978; Bhat and Wold, 1986), and there is evidence for at least one E3 protein that late transcripts are translated (Robinson et al., 2011a). E3 gene products are not required for viral replication in cultured cells (Morin et al., 1987), but inhibit cellular and cytokine mediated host immune responses to infection (Horwitz, 2004; Lichtenstein et al., 2004b; Windheim et al., 2004). Almost all of what is known about the function of specific E3 proteins derives from studies on HAdV-C. For example, HAdV-C2 E3 CR1α directs another E3 protein (19K) to the endoplasmic reticulum of cytotoxic T cells (Wilson-Rawls et al., 1994), where 19K binds to and retains MHC class I proteins (Jefferies and Burgert, 1990), preventing presentation of viral peptides within MHC class I at the cell surface (Burgert and Kvist, 1985, 1987; Andersson et al., 1987; Burgert et al., 1987; Cox et al., 1991). CR1α, RIDα, and RIDβ proteins cooperate to evade TNFα-related apoptosis through TRAIL (Elsing and Burgert, 1998; Tollefson et al., 1998; Benedict et al., 2001; Lichtenstein et al., 2004a). CR1β (Wold et al., 1984), also called the adenovirus death protein (Tollefson et al., 1992), is required for cell lysis (Tollefson et al., 1996) and viral spread (Doronin et al., 2003). The ORF size of each E3 gene varies across HAdV species (Figure 7) (Robinson et al., 2011c). Similarly, immune evasion functions of E3 gene products may not be the same across HAdV species, or function similarly in all cell types (Routes and Cook, 1990). Windheim and coworkers recently showed that the CR1β protein of the eye pathogen HAdV-D64 suppresses natural killer cell function (Windheim et al., 2013). The E3 CR1 genes are uniquely hypervariable within HAdV-D, and as predicted, overlapping but distinguishable intracellular interactomes across proteotypes were recently reported by Martinez-Martin and colleagues, who used protein microarrays to identify novel CR1β binding partners (Martinez-Martin et al., 2016).

Figure 7

Comparison of E3 transcription unit from HAdV-C and -D. Note in particular the difference in ORF size between CR1β of the two HAdV species. Adapted from Robinson et al. (2011c) with permission.

Conclusions

HAdV was critical to the dual discoveries of viral oncogenesis and RNA splicing (Berget et al., 1977; Chow et al., 1977; Whyte et al., 1988). HAdV is also a significant agent of disease for which there is no approved treatment. Recent mining of HAdV genomes has been highly productive, and there is ample evidence to suggest that further whole genome analyses will elucidate new and fundamental mechanisms in HAdV biology. In the last decade, of 27 newly identified HAdVs, 19 were HAdV-Ds, suggesting the continuing evolution of new pathogens from species D. Analyses of 38 fully sequenced HAdV-D whole genomes identified homologous recombination of specific regions within the hexon, penton base, fiber, and E3 CR1 genes as the major mechanism behind HAdV-D evolution, a new finding (Robinson et al., 2013a; Singh et al., 2013). Stereotypical reductions in GC content at the junction of conserved and hypervariable regions, along with Chi-like sequence motifs (also a new finding), appear likely to augment the intrinsic tendency of HAdV to undergo homologous recombination in vivo (Lee et al., 2018). Recently, the whole genome sequences of 85 HAdVs from archives and current collections were determined, including both historical and circulating strains, respectively (Ismail et al., 2018). Of these, 3 novel recombinants within HAdV-B and 15 within HAdV-D were identified. Only two of the 15 HAdV-Ds were found to contain novel genes (penton base and fiber); these were subsequently typed as HAdV-D71 and 72. Isolates of HAdV-D53 and HAdV-D58, two novel genotypes recently recognized, were also identified, adding confidence in their clinical importance. Fully genotyped HAdVs now number 90, with more awaiting type numbers, and the scientific community has a 10-fold larger database of unique HAdV genomes than available only 15 years ago. Published and validated ENCODE methodologies can now be applied, and comparisons made across disparate HAdV genomes. We suggest that the HAdV genome contains previously uncharacterized functional elements, and that every HAdV protein has pleiotropic interactions. Current technologies should afford a wave of new and important discoveries that may lead to needed therapies against adenoviral diseases.

Author contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

160 in total

1. Modeling adenovirus latency in human lymphocyte cell lines.

Authors: Yange Zhang; Wen Huang; David A Ornelles; Linda R Gooding
Journal: J Virol Date: 2010-06-23 Impact factor: 5.103

2. Overreliance on the hexon gene, leading to misclassification of human adenoviruses.

Authors: Gurdeep Singh; Christopher M Robinson; Shoaleh Dehghan; Timothy Schmidt; Donald Seto; Morris S Jones; David W Dyer; James Chodosh
Journal: J Virol Date: 2012-02-01 Impact factor: 5.103

3. Deep RNA sequencing reveals complex transcriptional landscape of a bat adenovirus.

Authors: Lijun Wu; Peng Zhou; Xingyi Ge; Lin-Fa Wang; Michelle L Baker; Zhengli Shi
Journal: J Virol Date: 2012-10-24 Impact factor: 5.103

4. Recombination in adenovirus. I. Analysis of recombinant viruses under non-selective conditions.

Authors: C Meinschad; E L Winnacker
Journal: J Gen Virol Date: 1980-05 Impact factor: 3.891

5. Coincidence of the promoter and capped 5' terminus of RNA from the adenovirus 2 major late transcription unit.

Authors: E B Ziff; R M Evans
Journal: Cell Date: 1978-12 Impact factor: 41.582

6. Does the higher order structure of the influenza virus ribonucleoprotein guide sequence rearrangements in influenza viral RNA?

Authors: P A Jennings; J T Finch; G Winter; J S Robertson
Journal: Cell Date: 1983-09 Impact factor: 41.582

7. Architecture of the human regulatory network derived from ENCODE data.

Authors: Mark B Gerstein; Anshul Kundaje; Manoj Hariharan; Stephen G Landt; Koon-Kiu Yan; Chao Cheng; Xinmeng Jasmine Mu; Ekta Khurana; Joel Rozowsky; Roger Alexander; Renqiang Min; Pedro Alves; Alexej Abyzov; Nick Addleman; Nitin Bhardwaj; Alan P Boyle; Philip Cayting; Alexandra Charos; David Z Chen; Yong Cheng; Declan Clarke; Catharine Eastman; Ghia Euskirchen; Seth Frietze; Yao Fu; Jason Gertz; Fabian Grubert; Arif Harmanci; Preti Jain; Maya Kasowski; Phil Lacroute; Jing Jane Leng; Jin Lian; Hannah Monahan; Henriette O'Geen; Zhengqing Ouyang; E Christopher Partridge; Dorrelyn Patacsil; Florencia Pauli; Debasish Raha; Lucia Ramirez; Timothy E Reddy; Brian Reed; Minyi Shi; Teri Slifer; Jing Wang; Linfeng Wu; Xinqiong Yang; Kevin Y Yip; Gili Zilberman-Schapira; Serafim Batzoglou; Arend Sidow; Peggy J Farnham; Richard M Myers; Sherman M Weissman; Michael Snyder
Journal: Nature Date: 2012-09-06 Impact factor: 49.962

8. Replication and virus-induced transcriptome of HAdV-5 in normal host cells versus cancer cells--differences of relevance for adenoviral oncolysis.

Authors: Dominik E Dorer; Frank Holtrup; Kurt Fellenberg; Johanna K Kaufmann; Sarah Engelhardt; Jörg D Hoheisel; Dirk M Nettelbeck
Journal: PLoS One Date: 2011-11-30 Impact factor: 3.240

9. Genomic and bioinformatics analysis of human adenovirus type 37: new insights into corneal tropism.

Authors: Christopher M Robinson; Fatemeh Shariati; Allison F Gillaspy; David W Dyer; James Chodosh
Journal: BMC Genomics Date: 2008-05-09 Impact factor: 3.969

10. The extracellular interactome of the human adenovirus family reveals diverse strategies for immunomodulation.

Authors: Nadia Martinez-Martin; Sree R Ramani; Jason A Hackney; Irene Tom; Bernd J Wranik; Michelle Chan; Johnny Wu; Maciej T Paluch; Kentaro Takeda; Philip E Hass; Hilary Clark; Lino C Gonzalez
Journal: Nat Commun Date: 2016-05-05 Impact factor: 14.919

17 in total

Review 1. Genomic foundations of evolution and ocular pathogenesis in human adenovirus species D.

Authors: Ashrafali Mohamed Ismail; Xiaohong Zhou; David W Dyer; Donald Seto; Jaya Rajaiya; James Chodosh
Journal: FEBS Lett Date: 2019-12-11 Impact factor: 4.124

2. Human adenovirus species in children with acute respiratory illnesses.

Authors: Varvara Probst; Emily K Datyner; Zaid Haddadin; Danielle A Rankin; Lubna Hamdan; Herdi K Rahman; Andrew Spieker; Laura S Stewart; Claudia Guevara; Erin Yepsen; Jonathan E Schmitz; Natasha B Halasa
Journal: J Clin Virol Date: 2020-12-11 Impact factor: 3.168

3. Human Desmoglein-2 and Human CD46 Mediate Human Adenovirus Type 55 Infection, but Human Desmoglein-2 Plays the Major Roles.

Authors: Ying Feng; Changhua Yi; Xinglong Liu; Linbing Qu; Wan Su; Tao Shu; Xuehua Zheng; Xianmiao Ye; Jia Luo; Mingli Hao; Xikui Sun; Liang Li; Xiaolin Liu; Chenchen Yang; Suhua Guan; Ling Chen; Liqiang Feng
Journal: J Virol Date: 2020-08-17 Impact factor: 5.103

4. Human Adenovirus Subtype 21a Isolates From Children With Severe Lower Respiratory Illness in China.

Authors: Wenkuan Liu; Li Zhang; Yong Cai; Qiong Zhang; Dehui Chen; Shuyan Qiu; Yanqun Wang; Duo Xu; Shujun Gu; Xiao Li; Jing Dai; Qian Liu; Rong Zhou; Xingui Tian
Journal: Front Microbiol Date: 2022-06-16 Impact factor: 6.064

5. RANBP2 and USP9x regulate nuclear import of adenovirus minor coat protein IIIa.

Authors: Ashrafali M Ismail; Amrita Saha; Ji S Lee; David F Painter; Yinghua Chen; Gurdeep Singh; Gabriela N Condezo; James Chodosh; Carmen San Martín; Jaya Rajaiya
Journal: PLoS Pathog Date: 2022-06-16 Impact factor: 7.464

6. Molecular and Clinical Characterization of Human Adenovirus E4-Associated Conjunctivitis.

Authors: Russell N Van Gelder; Lakshmi Akileswaran; Kenji Nakamichi; David Stroman
Journal: Am J Ophthalmol Date: 2021-11-03 Impact factor: 5.488

Review 7. Mystery eye: Human adenovirus and the enigma of epidemic keratoconjunctivitis.

Authors: Rahul A Jonas; Lawson Ung; Jaya Rajaiya; James Chodosh
Journal: Prog Retin Eye Res Date: 2019-12-28 Impact factor: 21.198

8. Analysis of severe human adenovirus infection outbreak in Guangdong Province, southern China in 2019.

Authors: Wenkuan Liu; Shuyan Qiu; Li Zhang; Hongkai Wu; Xingui Tian; Xiao Li; Duo Xu; Jing Dai; Shujun Gu; Qian Liu; Dehui Chen; Rong Zhou
Journal: Virol Sin Date: 2022-01-17 Impact factor: 6.947

9. Human Adenovirus Type 55 Distribution, Regional Persistence, and Genetic Variability.

Authors: Jun Hang; Adriana E Kajon; Paul C F Graf; Irina Maljkovic Berry; Yu Yang; Mark A Sanborn; Christian K Fung; Anima Adhikari; Melinda S Balansay-Ames; Christopher A Myers; Leonard N Binn; Richard G Jarman; Robert A Kuschner; Natalie D Collins
Journal: Emerg Infect Dis Date: 2020-07 Impact factor: 6.883

10. Seroepidemiological investigation of HAdV-4 infection among healthy adults in China and in Sierra Leone, West Africa.

Authors: Busen Wang; Jianhua Li; Shipo Wu; Yi Chen; Zhe Zhang; Yanfang Zhai; Qiang Guo; Jinlong Zhang; Xiaohong Song; Zhenghao Zhao; Lihua Hou; Wei Chen
Journal: Emerg Microbes Infect Date: 2018-12-05 Impact factor: 7.163