Endogenous avian leukosis virus subgroup E elements of the chicken reference genome.

Andrew S Mason, Janet E Fulton, Jacqueline Smith.

Abstract

The chicken reference genome contains 2 endogenous avian leukosis virus subgroup E (ALVE) insertions, but gaps and unresolved repetitive sequences in previous assemblies have hindered their precise characterization. Detailed analysis of the most recent reference genome (GRCg6a) now shows both ALVEs within contiguous chromosome assemblies for the first time. ALVE6 (ALVE-JFevA) and ALVE-JFevB are both located on chromosome 1, with ALVE6 close to the p-arm telomere. ALVE-JFevB is a structurally intact element containing the ALVE gag, pol, and env genes and is capable of forming replication-competent viruses. In contrast, ALVE6 contains a 3,352 bp 5' truncation and lacks the entire 5' long terminal repeat and gag gene. Despite this, ALVE6 remains able to produce intact envelope protein, likely due to a mutation in the recognition site for a known inhibitory miRNA (miR-155). Whole genome resequencing data sets from layers, broilers, and 3 independent sources of wild-caught red junglefowl were surveyed for the presence of each of these reference genome ALVEs. ALVE-JFevB was found in no other chicken or red junglefowl genomes, whereas ALVE6 was identified in some layers, broilers, and native breeds but not within any other red junglefowl genome. Improved assembly contiguity has facilitated better characterization of the 2 ALVEs of the chicken reference genome. However, both the limited ALVE content and the unique presence of ALVE-JFevB suggest that the reference individual is unrepresentative of ancestral Gallus gallus ALVE diversity.
Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  ALVE; ALVE-JFevB; ALVE6; ERV; reference genome

Year:  2020        PMID: 32475424      PMCID: PMC7597685          DOI: 10.1016/j.psj.2019.12.074

Source DB:  PubMed          Journal:  Poult Sci        ISSN: 0032-5791            Impact factor:   3.352


Introduction

Endogenous retroviruses (ERV) constitute approximately 3% of the chicken (Gallus gallus) genome, a consequence of millions of years of retroviral integrations into the germline (Mason et al., 2016). The avian leukosis virus (ALV) is the only known chicken retrovirus with recurrent exogenous and endogenous activity, with the endogenous subgroup E (the ALVE, historically identified as ev genes) limited to the domestic chicken and its wild progenitor, the red junglefowl (RJF) (Frisby et al., 1979, Borysenko et al., 2008, Payne and Nair, 2012). Owing to their recent genome integration, ALVEs are typically present in low copy numbers, but many retain some structural integrity, facilitating persistent retroviral gene expression, and recombination with other ERV or exogenous retroviruses (Katzourakis et al., 2005, Payne and Nair, 2012). Recent in-depth studies have revealed the great diversity present across chicken populations, with more than 400 different ALVEs described to date (Benkel, 1998, Rutherford et al., 2016, Mason, 2018). In commercial populations, ALVE-induced viremia elicits reductions in growth rate and total body weight in broilers (Fox and Smyth, 1985, Ka et al., 2009), and egg weight, specific gravity, and lifetime egg production in layers (Kuhnlein et al., 1989, Gavora et al., 1991). Expression of replication-competent proviruses (Crittenden et al., 1984, Gavora et al., 1995), or even gag glycoproteins alone (Astrin and Robinson, 1979, Robinson et al., 1981), can induce tolerance to novel ALV infections, resulting in delayed immune response and a higher incidence of lymphoid tumors. Furthermore, coinfection with Marek's disease virus, including attenuated vaccine viruses, has been shown to reactivate otherwise silenced ALVE in the genome and increase the incidence of spontaneous lymphoid tumors (Cao et al., 2015). 
However, ALVE effects are complex as expression of env glycoproteins prohibits some of these effects by receptor interference (Smith et al., 1990, Smith et al., 1991). Despite extensive research into the effects of ALVEs, the 2 ALVEs present in the chicken reference genome remain incompletely described, most likely due to their locations within repetitive DNA, including one near the telomere of chromosome 1 (Benkel and Rutherford, 2014, Mason, 2018). The release of an updated, highly contiguous assembly (GRCg6a) provides a new opportunity to fully describe the ALVEs of the chicken reference genome. This study characterizes the location and structural integrity of both ALVE6 (ALVE-JFevA) and ALVE-JFevB in the GRCg6a assembly and determines their abundance in diverse chicken populations.

Materials and methods

ALVEs were detected in the new chicken genome assembly (GRCg6a; GenBank: GCA_000002315.5) by BLASTn (Altschul et al., 1990) using the ALVE1 reference sequence (GenBank: AY013303.1). Open reading frames (ORFs) were predicted with GLIMMER3 (Delcher et al., 2007), and the miR-155 AGCATTA recognition site (Hu et al., 2016) was annotated by the EMBOSS fuzznuc tool (Rice et al., 2000). Sequence surrounding each ALVE was annotated for other repetitive elements using RepBase CENSOR (Kohany et al., 2006) and identified repeat abundance was assessed by BLASTn. Sixteen whole-genome resequencing (WGS) data sets (totaling 142 chickens; summarized in Table 1), which were previously analyzed for their unassembled ALVE content (Mason, 2018), were used for this study. These samples included commercially used elite layer lines (White Leghorn, White Plymouth Rock, and Rhode Island Red breeds), Indonesian native breeds (Black Java, Black Sumatra, Kedu Hitam, and Sumatera), wild-caught RJF from Java, Sumatra, and Tibet, and an experimental research broiler line. WGS data were reanalyzed to specifically detect the presence of both ALVE6 (ALVE-JFevA) and ALVE-JFevB. Paired-end reads from each WGS data set were mapped to the GRCg6a assembly using BWA-mem v0.7.10 (Li, 2013), filtering out reads with a mapping quality less than 20. In all cases, average genome coverage across the assembled chromosome exceeded 10X. The presence of ALVE6 and ALVE-JFevB was detected by identifying reads with sequence homology to both the ALVE and the neighboring genome sequence. Such reads reflect contiguous sequence in the host genome and the presence of that specific ALVE insertion (Mason, 2018).
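The junction-read criterion described above can be illustrated with a minimal Python sketch. It assumes each read has already been aligned and reduced to read-coordinate intervals matching either the ALVE or the flanking genome. The `Read` record, the interval representation, and the 20 bp minimum anchor are hypothetical choices made for illustration and are not parameters reported in this study; only the mapping-quality threshold of 20 comes from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in read coordinates


@dataclass
class Read:
    name: str
    mapq: int
    genome_match: Optional[Interval]  # part of the read matching flanking genome
    alve_match: Optional[Interval]    # part of the read matching the ALVE


MIN_MAPQ = 20    # from the text: reads with mapping quality below 20 are discarded
MIN_ANCHOR = 20  # hypothetical: minimum matched bases required on each side


def span(iv: Optional[Interval]) -> int:
    """Number of read bases covered by an interval (0 if there is no match)."""
    return 0 if iv is None else max(0, iv[1] - iv[0])


def is_junction_read(read: Read) -> bool:
    """True if the read matches both the ALVE and the neighboring genome."""
    if read.mapq < MIN_MAPQ:
        return False
    return (span(read.genome_match) >= MIN_ANCHOR
            and span(read.alve_match) >= MIN_ANCHOR)


# Toy reads, invented for illustration only
reads = [
    Read("r1", 60, (0, 70), (70, 150)),    # spans the insertion junction
    Read("r2", 60, (0, 150), None),        # genome only
    Read("r3", 10, (0, 70), (70, 150)),    # junction-like but low mapping quality
    Read("r4", 45, (0, 140), (140, 150)),  # ALVE portion too short to trust
]
junction_reads = [r.name for r in reads if is_junction_read(r)]
print(junction_reads)
```

In this toy set only `r1` qualifies. Requiring a minimum anchor on both sides is one way to avoid calling an insertion from a few chance matching bases; the appropriate value would depend on read length and on the repeat content of the flanking sequence.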
Table 1

Summary of whole genome resequencing datasets surveyed for ALVE6 and ALVE-JFevB.

Name                                      | Library preparation | Reference/Accession
Hy-Line International elite layer lines   |                     | Kranis et al., 2013
  5 x White Leghorn                       | 5 x Pool of 10      |
  2 x White Plymouth Rock                 | 2 x Pool of 10      |
  1 x Rhode Island Red                    | 1 x Pool of 10      |
Indonesian natives                        |                     | DDBJ: DRA003951
  Black Java                              | Pool of 10          |
  Black Sumatra                           | Pool of 10          |
  Kedu Hitam                              | Pool of 10          |
  Sumatera                                | Pool of 5           |
  Red junglefowl from Java                | Pool of 3           |
  Red junglefowl from Sumatra             | Pool of 2           |
INRA experimental broiler line            | 16 individuals      | ENA: PRJNA247952
Red junglefowl from Tibet                 | 6 individuals       | ENA: PRJNA241474

Abbreviation: ALVE, avian leukosis virus subgroup E.
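As a quick consistency check, the rows of Table 1 can be tallied against the totals quoted in the methods (16 data sets, 142 chickens). The sketch below assumes that each pool counts as one data set and that the INRA broiler and Tibetan red junglefowl collections each count as a single data set of individually sequenced birds; that grouping is an interpretation of the table, not something the text states explicitly.

```python
# (group, number of data sets, birds per data set), read from Table 1
table1 = [
    ("White Leghorn", 5, 10),
    ("White Plymouth Rock", 2, 10),
    ("Rhode Island Red", 1, 10),
    ("Black Java", 1, 10),
    ("Black Sumatra", 1, 10),
    ("Kedu Hitam", 1, 10),
    ("Sumatera", 1, 5),
    ("Red junglefowl from Java", 1, 3),
    ("Red junglefowl from Sumatra", 1, 2),
    ("INRA experimental broiler line", 1, 16),  # assumed to be one data set
    ("Red junglefowl from Tibet", 1, 6),        # assumed to be one data set
]

n_datasets = sum(n for _, n, _ in table1)
n_birds = sum(n * birds for _, n, birds in table1)
print(n_datasets, n_birds)  # 16 142
```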

Results and discussion

For the first time, the current chicken genome assembly (GRCg6a) contains both the endogenous ALV integrations present in the reference genome RJF (International Chicken Genome Sequencing Consortium, 2004). Previous assemblies had correctly assigned ALVE-JFevB to chromosome 1 (1p2.3), but ALVE6 (ALVE-JFevA) is located near the telomere of chromosome 1 (1p2.10), so remained unassembled and incompletely sequenced (Benkel and Rutherford, 2014). With the improvements in the current assembly, both ALVEs can now be more completely described. ALVE6 (1:210601-214776) is a 5′ truncated, 4,176 bp insertion in the forward orientation, with the previously identified target site duplication GGCGCT (Benkel, 1998) assembled at the 3′ end (Figure 1). The 5′ truncation has deleted 3,352 bp of the ALVE, without any associated flanking genomic sequence deletion, removing the 5′ long terminal repeat (LTR), gag domain and 67 bp of reverse transcriptase. The remaining sequence has 2 ORFs. The first (ALVE6:43-2013, first frame) encodes the reverse transcriptase thumb domain, RNaseH, and integrase, and the second (ALVE6:1877-3736, second frame) encodes an intact envelope. Chickens containing ALVE6 have long been known to express high titers of envelope glycoproteins (Robinson et al., 1981), perhaps due, in part, to a previously undescribed mutation in the recognition site of miR-155 ([A > G]GCATTA), a miRNA which typically regulates ALVE envelope expression by targeting transcripts for degradation (Hu et al., 2016).
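The recognition-site comparison above can be sketched as a short motif scan. The example below searches for the miR-155 site AGCATTA while tolerating one mismatch, so that both the intact site and a single-base variant such as GGCATTA are reported. It is written in the spirit of a pattern search with mismatches and is not the fuzznuc command used in the study. The two test sequences are invented toy strings, not real ALVE sequence, and only the forward strand is scanned.

```python
MIR155_SITE = "AGCATTA"  # recognition site given in the text (Hu et al., 2016)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_site(seq: str, site: str = MIR155_SITE, max_mismatch: int = 1):
    """Yield (1-based position, matched window, mismatches) for near matches."""
    seq = seq.upper()
    k = len(site)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, site)
        if d <= max_mismatch:
            yield i + 1, window, d


# Hypothetical toy sequences, NOT real ALVE sequence
intact_env = "CCGTAGCATTACGGT"   # carries the intact site
mutated_env = "CCGTGGCATTACGGT"  # carries the A > G variant at the first base

intact_hits = list(scan_site(intact_env))
mutated_hits = list(scan_site(mutated_env))
print(intact_hits)   # [(5, 'AGCATTA', 0)]
print(mutated_hits)  # [(5, 'GGCATTA', 1)]
```

A hit with zero mismatches corresponds to an intact site; a hit with one mismatch flags a candidate mutated site that would then need to be checked by eye against the alignment.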
Figure 1

ALVE6 integration site showing 2 putative open reading frames. ALVE6 pol is truncated at the 5′ end but encodes the reverse transcriptase (RT) thumb domain, intact RNaseH (RH), and integrase (INT) domains. The envelope ORF is complete and features a mutated recognition site for the known inhibitory miRNA miR-155 (vertical bar). Abbreviations: ALVE, avian leukosis virus subgroup E; LTR, long terminal repeat; ORF, open reading frame; SU, surface; TM, transmembrane.

ALVE-JFevB (1:32724216-32731739) is an intact, 7,524 bp insertion in the forward orientation, with a GGCTTG target site duplication assembled at both ends (Figure 2). The two ALVE-JFevB LTRs are 100% identical to each other and share 97.8% identity with the ALVE1 LTRs, with no variants affecting the TATA box, the transcription start site, or transcription factor binding sites. ALVE-JFevB contains intact ORFs for gag-pol (ALVE-JFevB:479-5364), taking into account the ribosomal -1 frameshift just before the gag termination codon (Leblanc et al., 2013), and for the envelope domain (ALVE-JFevB:5228-7084). Taken together, the intactness of ALVE-JFevB supports a recent integration and the ability to form replication-competent viral particles. However, the presence of the intact miR-155 site within the envelope domain may inhibit complete expression of ALVE-JFevB.
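The coordinates reported for both elements can be checked with simple interval arithmetic. The sketch below assumes 1-based, inclusive coordinates and numbers reading frames from the first base of each element; both conventions are assumptions that happen to reproduce the lengths and frames given in the text. For gag-pol, a single -1 ribosomal frameshift means one nucleotide is read twice, so the translated length is the span plus one.

```python
def span_len(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def frame_of(start: int) -> int:
    """Reading frame (1, 2 or 3) of a 1-based start within the element."""
    return (start - 1) % 3 + 1


# Element lengths from the chromosome 1 coordinates in the text
alve6_len = span_len(210601, 214776)
jfevb_len = span_len(32724216, 32731739)

# ALVE6 ORFs, coordinates relative to the element
pol_len = span_len(43, 2013)
env6_len = span_len(1877, 3736)

# ALVE-JFevB ORFs; gag-pol is read through one -1 frameshift,
# so the translated nucleotide count is the span plus one
gagpol_translated = span_len(479, 5364) + 1
envb_len = span_len(5228, 7084)

print(alve6_len, jfevb_len)                 # 4176 7524
print(pol_len % 3, env6_len % 3)            # 0 0
print(frame_of(43), frame_of(1877))         # 1 2
print(gagpol_translated % 3, envb_len % 3)  # 0 0
```

Each ORF length is a whole number of codons, and the two ALVE6 ORFs fall in the first and second frames as stated.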
Figure 2

ALVE-JFevB integration site within a GGERV20 element. ALVE-JFevB is intact with gag, pol, and env ORFs with detectable subunits. GGERV20 is in the negative orientation and contains putative gag and pol ORFs. pol is truncated at the 3′ end, but the 4 core catalytic components remain unaffected. Abbreviations: ALVE, avian leukosis virus subgroup E; CA, capsid; INT, integrase; LTR, long terminal repeat; MA, matrix; NC, nucleocapsid; ORF, open reading frame; PR, protease; RH, RNaseH; RT, reverse transcriptase; SU, surface; TM, transmembrane.

The ALVE-JFevB integration site is complex because of its location within another transposable element: GGERV20, an ERV related to spumaviruses (Huda et al., 2008, Benkel and Rutherford, 2014). GGERV20 is a 5,827-bp element, which retains the ability to retrotranspose within the genome and is therefore polymorphic between chicken populations, with at least 65 full-length copies throughout the GRCg6a assembly. ALVE-JFevB has inserted within a reverse orientation GGERV20, 801 bp from the GGERV20 3′LTR. While this does disrupt the 3′ end of the GGERV20 pol ORF, the longer 5′ fragment contains all the core polymerase catalytic domains (Figure 2) and therefore may retain functional activity if expressed.

The Prevalence of ALVE6 and ALVE-JFevB Among Chickens

ALVE-JFevB was detected in no other analyzed dataset, including the wild-caught RJF from Java, Sumatra, and Tibet. Furthermore, the GGERV20 element in which ALVE-JFevB has inserted was also not found in any other data set, suggesting a sequential GGERV20 retrotransposition followed by ALVE-JFevB integration in the reference RJF lineage. Crucially, the LTR pairs of both ALVE-JFevB and the outer GGERV20 share 100% identity, supporting evolutionarily recent integrations. ALVE6 was also not found in any other RJF genome in this study. However, ALVE6 was identified in the INRA broiler population, 4 Hy-Line elite layer lines (2 White Leghorn and 2 White Plymouth Rocks), and in the Black Java and Black Sumatra birds from Indonesia. Although this distribution is quite broad, it does not unambiguously support the presence of ALVE6 in the common G. gallus ancestor. Recent work has revealed the large diversity of chicken ALVEs within and between populations. Noncommercial chicken populations harbor large numbers of low frequency and lineage-specific ALVEs, with individual bird genomes typically containing more than 6 ALVE loci (Rutherford et al., 2016, Mason, 2018). The presence of only 2 ALVEs in the RJF reference genome is another measure of how unrepresentative this ‘reference’ bird is of extant G. gallus genomic diversity (Ulfah et al., 2016). Consequently, great care needs to be taken when using the reference genome as a background for the study of ALVEs, as in the recent work by Sun et al. (2017), who identified postdomestication piRNA-mediated defense against these elements. Such comparative genomic approaches, particularly those attempting to unpick the complex chicken domestication process (Rubin et al., 2010), should only be undertaken once the complete ALVE complement of that bird or line has been ascertained by methods such as obsERVer (Mason, 2018).
This is indicative of a wider issue of confidence in reference genomes, as ALVEs are just one marker that highlights how unrepresentative of chicken diversity the reference genome can be. The GRCg6a assembly contiguity improvements facilitate more comprehensive ALVE and broader structural variant identification from WGS projects. However, ALVEs may still be missed when present on the largely incomplete, or absent, assemblies of chromosomes 29, 34–38 and W, or in the poorly assembled centromeres and telomeres.
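The LTR comparison used above as evidence of recent integration rests on a simple idea: the 2 LTRs of a provirus are identical at the moment of integration and diverge only as mutations accumulate, so a pair that is still 100% identical points to a recent event. A minimal sketch of the identity calculation follows. It assumes the two sequences are already aligned to the same length with `-` as the gap character, and the sequences shown are invented toy strings, not real LTR sequence.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns, skipping columns that are gaps in both."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = [(x, y) for x, y in zip(a.upper(), b.upper())
            if not (x == "-" and y == "-")]
    if not cols:
        raise ValueError("no comparable columns")
    matches = sum(x == y and x != "-" for x, y in cols)
    return 100.0 * matches / len(cols)


# Hypothetical toy alignments, NOT real LTR sequence
ltr5 = "TGTAGTCTTATGCAATACTC"
ltr3 = "TGTAGTCTTATGCAATACTC"
ltr_other = "TGTAGTCTTGTGCAATACTC"  # one substitution relative to ltr5

same_pair = percent_identity(ltr5, ltr3)
diverged = percent_identity(ltr5, ltr_other)
print(same_pair, diverged)  # 100.0 95.0
```

Counting a gap opposite a base as a mismatch is one of several possible conventions; a different choice would change identity values for alignments containing indels.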

Conclusions

The current chicken genome assembly (GRCg6a) now shows both the genomic location and complete sequence of the 2 ALVEs known to exist within the reference RJF. ALVE-JFevB is structurally intact and is unique to the reference genome. ALVE6 (ALVE-JFevA), while truncated by nearly half its length, is found across diverse chicken breeds and is capable of producing envelope protein, potentially due to a mutation identified here in the recognition site of miR-155, a miRNA that typically marks envelope transcripts for degradation. Examination of genome sequences from diverse chicken populations and wild-caught RJF did not reveal the universal presence of either of these 2 ALVEs, and showed that individual chicken genomes typically contain over 6 ALVE integrations. These 2 observations suggest that the reference genome is not representative of G. gallus ALVE abundance and diversity and is unlikely to represent ALVE content in ancestral RJF. Therefore, caution must be applied when using the current reference genome as a baseline for the predomesticated state.

References (10 of 31 shown)

1.  EMBOSS: the European Molecular Biology Open Software Suite.

Authors:  P Rice; I Longden; A Bleasby
Journal:  Trends Genet       Date:  2000-06       Impact factor: 11.639

2.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

3.  The evolutionary dynamics of endogenous retroviruses.

Authors:  Aris Katzourakis; Andrew Rambaut; Oliver G Pybus
Journal:  Trends Microbiol       Date:  2005-10       Impact factor: 17.079

4.  The influence of ev6 on the immune response to avian leukosis virus infection in rapid-feathering progeny of slow- and rapid-feathering dams.

Authors:  E J Smith; A M Fadly; I Levin; L B Crittenden
Journal:  Poult Sci       Date:  1991-08       Impact factor: 3.352

5.  The effects of recessive white and dominant white genotypes on early growth rate.

Authors:  W Fox; J R Smyth
Journal:  Poult Sci       Date:  1985-03       Impact factor: 3.352

6.  Gs, an allele of chickens for endogenous avian leukosis viral antigens, segregates with ev 3, a genetic locus that contains structural genes for virus.

Authors:  S M Astrin; H L Robinson
Journal:  J Virol       Date:  1979-08       Impact factor: 5.103

7.  Influence of selection for egg production and Marek's disease resistance on the incidence of endogenous viral genes in White Leghorns.

Authors:  U Kuhnlein; M Sabour; J S Gavora; R W Fairfull; D E Bernon
Journal:  Poult Sci       Date:  1989-09       Impact factor: 3.352

8.  Posttranscriptional regulation of retroviral gene expression: primary RNA transcripts play three roles as pre-mRNA, mRNA, and genomic RNA.

Authors:  Jason Leblanc; Jason Weil; Karen Beemon
Journal:  Wiley Interdiscip Rev RNA       Date:  2013-06-10       Impact factor: 9.957

9.  Endogenous viral genes influence infection with avian leukosis virus.

Authors:  J S Gavora; J L Spencer; B Benkel; C Gagnon; A Emsley; A Kulenkamp
Journal:  Avian Pathol       Date:  1995-12       Impact factor: 3.378

10.  Development of a high density 600K SNP genotyping array for chicken.

Authors:  Andreas Kranis; Almas A Gheyas; Clarissa Boschiero; Frances Turner; Le Yu; Sarah Smith; Richard Talbot; Ali Pirani; Fiona Brew; Pete Kaiser; Paul M Hocking; Mark Fife; Nigel Salmon; Janet Fulton; Tim M Strom; Georg Haberer; Steffen Weigend; Rudolf Preisinger; Mahmood Gholami; Saber Qanbari; Henner Simianer; Kellie A Watson; John A Woolliams; David W Burt
Journal:  BMC Genomics       Date:  2013-01-28       Impact factor: 3.969

