| Literature DB >> 29706541 |
Stephanie S Pavlovich1, Sean P Lovett2, Galina Koroleva2, Jonathan C Guito3, Catherine E Arnold2, Elyse R Nagle2, Kirsten Kulcsar2, Albert Lee4, Françoise Thibaud-Nissen5, Adam J Hume6, Elke Mühlberger6, Luke S Uebelhoer3, Jonathan S Towner3, Raul Rabadan4, Mariano Sanchez-Lockhart7, Thomas B Kepler8, Gustavo Palacios9.
Abstract
Bats harbor many viruses asymptomatically, including several notorious for causing extreme virulence in humans. To identify differences between antiviral mechanisms in humans and bats, we sequenced, assembled, and analyzed the genome of Rousettus aegyptiacus, a natural reservoir of Marburg virus and the only known reservoir for any filovirus. We found an expanded and diversified KLRC/KLRD family of natural killer cell receptors, MHC class I genes, and type I interferons, which dramatically differ from their functional counterparts in other mammals. Such concerted evolution of key components of bat immunity is strongly suggestive of novel modes of antiviral defense. An evaluation of the theoretical function of these genes suggests that an inhibitory immune state may exist in bats. Based on our findings, we hypothesize that tolerance of viral infection, rather than enhanced potency of antiviral defenses, may be a key mechanism by which bats asymptomatically host viruses that are pathogenic in humans.Entities:
Keywords: Chiroptera; antiviral immunity; filovirus; genome; innate immunity; natural killer cell receptors; type I interferon
Mesh:
Substances:
Year: 2018 PMID: 29706541 PMCID: PMC7112298 DOI: 10.1016/j.cell.2018.03.070
Source DB: PubMed Journal: Cell ISSN: 0092-8674 Impact factor: 66.850
Contiguity and Coverage among Bat Genomes
| Species (Assembly Name) | Scaffold N50 (Kb) | Contig N50 (Kb) | Total Length (Gb) | Total Gap Length (Mb) | Coverage/Sequencing Technology |
|---|---|---|---|---|---|
| 2,007.2 | 1,489.0 | 1.910 | 0.482 | 169.2× Illumina HiSeq 2500 and Pacific Biosciences RS II | |
| 5,954.0 | 21.9 | 2.198 | 181.040 | 188.0× Illumina | |
| 15,954.8 | 31.8 | 1.986 | 41.334 | 110× Illumina HiSeq 2000 | |
| 4,293.3 | 64.3 | 2.035 | 68.155 | 7× Sanger | |
| 3,225.8 | 23.3 | 2.107 | 125.473 | 120× Illumina HiSeq 2000 | |
| 3,454.5 | 15.2 | 2.060 | 181.338 | 110× Illumina HiSeq 2000 | |
| 4,315.2 | 29.8 | 1.803 | 68.170 | 77.0× Illumina HiSeq | |
| 13,455.9 | 21.4 | 2.027 | 215.248 | 84× Illumina HiSeq | |
| 16.9 | 7.0 | 1.736 | 20.607 | 18.0× Illumina HiSeq | |
| 27.7 | 12.7 | 1.838 | 7.320 | 18.0× Illumina HiSeq | |
| 21.2 | 11.7 | 1.926 | 4.731 | 17.0× Illumina HiSeq | |
| 3,754.4 | 37.8 | 2.073 | 58.015 | 146.44× Illumina HiSeq | |
| 22.7 | 9.5 | 1.960 | 13.7 | 17.0× Illumina HiSeq | |
| 2,328.2 | 39.9 | 2.237 | 281.734 | 218.6× Illumina HiSeq |
The contiguity statistics (scaffold and contig N50, total length, and total gap length) are reported for each bat genome available in the NCBI GenBank database (accessed on 8/8/17), along with the GenBank Assembly Name and the given coverage. Kb, kilobases; Mb, megabases; Gb, gigabases. See also Figure S1, Tables S1 and S2, and STAR Methods.
Approximately 145× coverage of Illumina HiSeq 2500 data and 24× coverage of Pacific Biosciences RS II data.
Figure S1Genome Characteristics of Raegyp2.0, Related to Table 1
(A) K-mer frequency distribution in Raegyp2.0. The percentage (frequency, y axis) of all 25-mers that are present a given number of times (depth, x axis) in the Rousettus aegyptiacus genome sequence.
(B) Heterozygosity of Raegyp2.0. SNPs: Single-nucleotide polymorphisms; indels: insertions or deletions. Scaffold and contig.
(C and D) (C) Nx and (D) NGx plots in megabases (Mbp) for Raegyp2.0 with an estimated reference genome size of 2.11 Gb. See STAR Methods for detailed description of Nx and NGx plots.
Figure 1Gene Family Expansion and Contraction across a Phylogenetic Tree of 15 Mammalian Species
A maximum likelihood tree based on 2,400 orthologous proteins was generated and used to infer expansion and contraction of 7,698 gene families. The number of expanded and contracted gene families is in blue and red, respectively. Numbers in black are the bootstrap evidence for partitions based on 1,000 bootstrap replicates. Images used under a creative commons license. MRCA, most recent common ancestor.
See also Tables S3 and S4, Data S1, and STAR Methods.
Figure S2Model Selection Hierarchy for Positive and Purifying Selection Analysis, Related to Table 2 and STAR Methods
RA = R. aegyptiacus, Mega = megabats other than R. aegyptiacus, Micro = microbats, Non-bat = all species in non-bat branches (human, crab-eating macaque, rhesus macaque, mouse, dog, cow, pig, guinea pig, hamster, and horse). Arrows indicate nested models (e.g., an arrow pointing from Model 1a to 2a means that Model 1a is nested in Model 2a). For each applicable model, color indicates which branches were used to estimate which evolution rate (orange - ω1; green - ω2; purple - ω3; gray - ω0, i.e., the background rate of evolution). Model 0 was the best-fitting for 330 genes, Model 1a for 5 genes, Model 1b for 65 genes, Model 1c for 23 genes, Model 2a for 0 genes, model 2b for 32 genes, and Model 3 for 1 gene.
Positive and Relaxed Purifying Selection in R. aegyptiacus Immune Genes
| Symbol | Gene | ω0 | ω1 | FDR |
|---|---|---|---|---|
| Suppressor of IKBKE 1 | 0.117 | 1.107 | 1.85E−09 | |
| Interferon stimulated gene 15 ubiquitin-like modifier | 0.151 | 1.841 | 3.01E−08 | |
| Profilin 1 | 0.161 | 2.003 | 1.58E−05 | |
| CD48 molecule | 0.638 | 2.074 | 1.62E−04 | |
| Interleukin 1 receptor like 1 | 0.417 | 1.246 | 5.83E−04 | |
| NAD(P)H quinone dehydrogenase 1 | 0.204 | 1.210 | 6.23E−04 | |
| Interleukin 17A | 0.316 | 1.146 | 2.22E−03 | |
| Leptin | 0.360 | 2.289 | 2.42E−03 | |
| Oncostatin M receptor | 0.534 | 1.053 | 6.85E−03 | |
| TNF receptor superfamily member 1A | 0.346 | 1.133 | 1.58E−02 | |
| Interferon alpha and beta receptor subunit 1 | 0.527 | 1.193 | 4.64E−02 | |
| Dexd/h-box helicase 58 | 0.339 | 0.586 | 5.77E−03 | |
| Toll like receptor 8 | 0.346 | 0.608 | 1.82E−03 | |
| Nucleotide binding oligomerization domain containing 2 | 0.150 | 0.267 | 1.02E−03 | |
| Lymphotoxin beta receptor | 0.314 | 0.803 | 5.24E−04 | |
| Nuclear factor kappa B subunit 2 | 0.083 | 0.324 | 6.67E−12 | |
| REL proto-oncogene, NF-κB subunit | 0.225 | 0.424 | 1.11E−02 | |
| RELA proto-oncogene, NF-κB subunit | 0.093 | 0.424 | 4.10E−11 | |
| RELB proto-oncogene, NF-κB subunit | 0.087 | 0.545 | 8.31E−15 | |
| Ribonuclease L | 0.538 | 0.906 | 5.77E−03 | |
| Janus kinase 2 | 0.090 | 0.303 | 1.60E−03 | |
| Signal transducer and activator of transcription 3 | 0.002 | 0.016 | 4.72E−02 |
Evolution rates were estimated under several models, and the best fitting model was chosen. ω0 refers to the background rate of evolution and always includes non-bat species, but also includes some bat species under certain models. ω1 refers to the rate of evolution of the R. aegyptiacus gene but also includes other bat species under certain models. FDR, false discovery rate. See Figure S2 and STAR Methods for full description of the models. See also Table S5.
ω1 estimate under Model 1a.
ω1 estimate under Model 1b.
ω1 estimate under Model 1c.
ω1 estimate under Model 3.
Figure 2Expansion of the NKG2 Genes in R. aegyptiacus
(A) CD94 and NKG2 genes in the natural killer complex in Raegyp2.0. Each arrow designates a scaffold sequence in the Raegyp2.0 genome (see STAR Methods for accessions). Not pictured are pseudogenes and non-coding genes. The ellipse indicates the presence of additional non-NKG2 genes on the same scaffold.
(B) Multiple sequence alignments showing activating and inhibitory signaling motifs in NKG2 genes in humans and three bats. There were no putative functional bat NKG2 genes identified in in P. alecto or M. davidii except NKG2-D. ITIM (immunoreceptor tyrosine-based inhibition motif) residues are in red, and the signal anchor residue lysine (K) or arginine (R) are in green and blue respectively. Dashes represent gaps in the alignment.
(C) Expression of putative functional NKG2 genes in transcriptomic data from 10 tissues in a R. aegyptiacus bat. Rows are ordered by highest average expression of transcripts across all tissues for a given gene. Expression is reported in log2(TPM), where TPM refers to transcripts per million. Data analyzed from Lee et al. (2015).
(D) Maximum likelihood phylogenetic tree of bat NKG2 proteins and homologs in other species. NKG2 proteins from R. aegyptiacus are colored by predicted function. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if 65 or over.
See also Figures S3, S5, and S7 and STAR Methods.
Figure S3Multiple Sequence Alignments and Locus Maps of NKG2 Proteins in Bats, Related to Figures 2 and 3
Dots in alignments represent identity to the human protein sequence.
(A and B) Alignment of human NKG2A and C and putative functional bat NKG2 protein sequences. (A) shows the NKG2 residues that are known to contact CD94 in the human NKG2A protein (in orange), and (B) shows the NKG2 residues that are known to contact HLA-E/peptide in human NKG2A protein (in blue) (Kaiser et al., 2008).
(C) Locus maps of the NKG2 and CD94 genes in the natural killer complex. Each arrow designates a scaffold sequence in corresponding bat genome (see STAR Methods for accessions). Not pictured are unrelated pseudogenes and non-coding genes. The ellipse indicates the presence of additional non-NKG2 genes on the same scaffold.
(D) Alignment of the transmembrane domain of putative functional NKG2D proteins in humans and bats. Dashes indicate positions of diversity in the consensus sequence. The conserved arginine residue that serves as a signal anchor is shown in red. NKG2D-1 is shown for M. davidii, and NKG2D-2 is shown for M. lucifugus.
Figure S5Related to Figures 2 and 3
(A–C) Expanded maximum likelihood phylogenies of (A) NKG2, (B) CD94 proteins, and (C) NKG2D proteins. R. aegyptiacus proteins are marked by red dots. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if over 65.
Figure S7Diversity and Expression of Type I IFN Genes in R. aegyptiacus, Related to Figure 5
(A) Phylogeny of R. aegyptiacus type I IFN proteins. Maximum likelihood phylogenetic tree of bat type I IFN proteins. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if over 65.
(B) Antiviral effect of recombinant R. aegyptiacus IFN-ω4. RoNi cells were treated with recombinant IFN-ω4 (rIFN-ω4), rIFN-β1, or an unrelated protein (rPA-D1) for 4 hr, infected with VSV-eGFP at an MOI of 0.05, and imaged for eGFP expression 1 day post infection. Higher concentrations of recombinant IFN-ω4 inhibit viral replication as demonstrated by the absence of eGFP expression in cells after multiple viral replication cycles. Brightness was increased by 20% on all images.
(C) Sendai virus (SeV) infection of RoNi cells elicits an IFN response, including IFN-ω. RoNi cell monolayers were infected with SeV strain Cantell at an MOI of 1.0 or mock infected, and harvested for total RNA extraction and sequencing at 3, 8, and 24 hr. Sequencing data were quantified by IFN subtype in transcripts per million (TPM). Values plotted are the mean ± standard deviation of three replicates for each time point. IFN-ε and IFN-δ were not expressed. Adj. p values from unpaired t test between SeV and mock: ∗ < 0.05, ∗∗ < 0.005, ∗∗∗ < 0.0005. See STAR Methods.
Figure 3Expression and Diversity of CD94 in R. aegyptiacus
(A) Multiple sequence alignments showing conserved cysteine residues in CD94 genes in humans and five bats. Pseudogenes are indicated with the letter p in protein name. Asterisks indicate missing residues from a partial P. vampyrus CD94.
(B) Expression of CD94 genes in transcriptomic data from ten tissues in an Egyptian rousette bat. Rows are ordered by highest average expression of transcripts across all tissues for a given gene. Expression is reported in log2(TPM), where TPM refers to transcripts per million. Data analyzed from Lee et al. (2015).
(C) Maximum likelihood phylogenetic tree of bat CD94 proteins and homologs in other species. CD94 proteins from R. aegyptiacus are marked by red dots. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if over 65.
See also Figure S3, Figure S4, Figure S5 and STAR Methods.
Figure S4Multiple Sequence Alignments of CD94 Proteins in Bats, Related to Figure 3
Dots in alignments represent identity to the human protein sequence, while dashes represent gaps in the alignment.
(A and B) Alignment of human CD94 and putative functional bat CD94 protein sequences. (A) shows the CD94 residues that are known to contact NKG2A in the human CD94 protein (in orange), and (B) shows the CD94 residues that are known to contact HLA-E/peptide in human CD94 protein (in blue) (Kaiser et al., 2008).
(C) Alignment of the transmembrane domain of CD94 in human, mouse, rat, and bats. The lysine (K) residue that serves as a signal anchor for DAP10 and DAP12 in rodents is shown in blue. This residue is not conserved in bat CD94 proteins.
(D) Alignment of the cytoplasmic domain of CD94 in human and bats.
Figure 4Characterization of the MHC Class I Region in Raegyp2.0
(A and B) Locus maps of (A) the MHC class I region and (B) MHC class I genes outside the canonical class I region in Raegyp2.0. Each arrow designates a scaffold sequence in the Raegyp2.0 genome (see STAR Methods for accessions). Not pictured are non-MHC pseudogenes and non-coding genes. The ellipse indicates the presence of additional genes on the same scaffold. Black, MHC class I genes; gray, non-MHC genes; dark blue, MICB; unfilled boxes, MHC class I pseudogenes. The α, κ, and β class I duplication blocks are shown in red, purple, and green, respectively.
(C) Expression of MHC class I genes in transcriptomic data from 10 tissues in an Egyptian rousette bat. Rows are ordered by highest average expression of transcripts across all tissues for a given gene. Expression is reported in log2(TPM), where TPM refers to transcripts per million. Data analyzed from Lee et al. (2015).
(D) Maximum likelihood phylogenetic tree of bat MHC class I proteins (R. aegyptiacus proteins in red) with human MHC class I proteins as an outgroup. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if 65 or over.
(E) Sequence logo plots showing the sequence diversity of predicted nonamer peptides derived from the signal sequences of MHC class I genes from human (H. sapiens), mouse (M. musculus), R. aegyptiacus, and the gray mouse lemur (M. murinus). The y axis shows information content in bits, and the x axis shows position in the nonamer.
See also STAR Methods.
Figure S6NKG2D Ligands in Bats, Related to Figure 2
Maximum likelihood phylogenetic tree of bat and human NKG2D ligands or NKG2D ligand-like proteins. R. aegyptiacus sequences are shown in red. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if over 65. Green – MILL and MIC-like proteins, purple – RAET1E-like proteins, blue – human ULBP proteins, yellow – two groups of bat proteins. See STAR Methods for abbreviations.
Figure 5Diversity of the Type I Interferons in Raegyp2.0
(A) Locus map of the type I IFNs in Raegyp2.0. Each arrow designates a scaffold sequence in the Raegyp2.0 genome (see STAR Methods for scaffold accessions). Unfilled boxes indicate pseudogenes. Orange, IFNβ; blue, IFNω; red, IFNα; yellow, IFNδ; purple, IFNε; green, IFNκ. The single non-IFN gene within the locus (KLHL9) is in gray. Not pictured are non-coding genes. The ellipse indicates the presence of additional non-IFN genes on the same scaffold.
(B) Maximum likelihood phylogenetic tree of bat type I IFN proteins and homologs in other species. R. aegyptiacus proteins are marked in red, with groups of closely related proteins collapsed. Bootstrap evidence (percentage of 500 bootstrap replicates) is labeled on branches if 65 or over.
See also Figure S6, Table S6, and STAR Methods.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Sendai virus, Cantell strain | Charles River Laboratories | Cat# 10100774 |
| Vesicular stomatitis virus-eGFP | N/A | |
| Healthy, wild-caught, older juvenile male Egyptian Rousette bat ( | This paper | |
| Recombinant 6xHis-IFN-β1 (R. aegyptiacus protein) | This paper (Blue Heron Biotech) | N/A |
| Recombinant 6xHis-IFN-ω4 (R. aegyptiacus protein) | This paper (Blue Heron Biotech) | N/A |
| Recombinant 6xHis-PA-D1 (Domain 1 of | This paper (Blue Heron Biotech) | N/A |
| Human: FreeStyle 293-F Cells | Thermo Fisher Scientific | Cat# R79007 |
| FreeStyle MAX Reagent | Thermo Fisher Scientific | Cat# 16447500 |
| OptiPRO SFM (1X) | Thermo Fisher Scientific | Cat# 12309-050 |
| Lysis/Binding Solution Concentrate | Thermo Fisher Scientific | Cat# 4462362 |
| Capturem His-tagged purification maxiprep kit | Clontech, Takara Bio | Cat# 635713 |
| Vivaspin 2 Protein Concentrators, MWCO 10000 | GE Life Sciences | Cat# 28932247 |
| UltraClean Blood DNA Isolation Kit (customized protocol) | MO BIO Laboratories | Cat# 12000-100 |
| Prep X Complete ILMN DNA Library Kit | WaferGen Biosystems | Cat# 640101 |
| KAPA Library Amplification kit | KAPA Biosystems | Cat# KK2621 |
| KAPA Library Quantification kit | KAPA Biosystems | Cat# KK4824 |
| TruSeq PE Cluster Kit v3-cBot-HS | Illumina | Cat# PE-401-3001 |
| TruSeq SBS Kit – HS (200 cycle) | Illumina | Cat# FC-401-3001 |
| MagMAX-96 Total RNA Isolation Kit | Ambion | Cat# AM1830 |
| SMRTbell Template Prep Kit 1.0 | Pacific Biosciences | Cat# 100-259-100 |
| DNA/Polymerase Binding Kit P4 | Pacific Biosciences | Cat# 100-236-500 |
| DNA/Polymerase Binding Kit P5 | Pacific Biosciences | Cat# 100-256-000 |
| DNA Sequencing Reagent 2.0 | Pacific Biosciences | Cat# 100-216-400 |
| DNA Sequencing Reagent 3.0 | Pacific Biosciences | Cat# 100-254-800 |
| SMRT Cell 8Pack V3 | Pacific Biosciences | Cat# 100-171-800 |
| Bioanalyzer DNA 12000 Kit | Agilent | Cat# 5067-1508 |
| TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat High Throughput (96 samples, 96 indexes) | Illumina | Cat# 20020597 |
| HiSeq PE Cluster Kit V4 - cBot | Illumina | Cat# PE-401-4001 |
| HiSeq SBS Kit V4 250 cycle kit | Illumina | Cat# FC-401-4003 |
| High Sensitivity D1000 ScreenTape | Agilent | Cat# 5067-5582 |
| High Sensitivity D1000 Reagents | Agilent | Cat# 5067-5583 |
| Raegyp2.0 genome sequence | This paper | GenBank: GCA_001466805.2; RefSeq: GCF_001466805.2; GenBank: LOCP00000000.2 |
| GenBank: GECF00000000.1; SRA Project: SRP066106 | ||
| Sendai virus-infected RoNi cell transcriptome data | This paper | NCBI GEO: |
| Ensembl database release 90 | ||
| RefSeq proteins – Pteropus vampyrus | Baylor College of Medicine; NCBI RefSeq | RefSeq: GCF_000151845.1 |
| RefSeq proteins – Pteropus alecto | RefSeq: GCF_000325575.1 | |
| RefSeq proteins – Myotis davidii | RefSeq: GCF_000327345.1 | |
| RefSeq proteins – Myotis lucifugus | RefSeq: GCF_000147115.1 | |
| RefSeq proteins – Macaca fascicularis | Washington University; NCBI RefSeq | RefSeq: GCF_000364345.1 |
| RefSeq proteins – Macaca mulatta | Baylor College of Medicine Genome Sequencing Center; NCBI RefSeq | RefSeq: GCF_000772875.2 |
| RefSeq proteins – Cavia porcellus | The Genome Sequencing Platform, The Genome Assembly Team; NCBI RefSeq | RefSeq: GCF_000151735.1 |
| RefSeq proteins – | Beijing Genomics Institute; NCBI RefSeq | RefSeq: GCF_000223135.1 |
| RefSeq proteins – Equus caballus | The Genome Assembly Team; NCBI RefSeq | RefSeq: GCF_000002305.2 |
| RefSeq proteins – | Genome Reference Consortium; NCBI RefSeq | RefSeq: GCF_000001635.24 |
| RefSeq proteins – Sus scrofa | The Swine Genome Sequencing Consortium (SGSC); NCBI RefSeq | RefSeq: GCF_000003025.5 |
| RefSeq proteins – Bos taurus | Cattle Genome Sequencing International Consortium; NCBI RefSeq | RefSeq: GCF_000003205.7 |
| RefSeq proteins – Canis familiaris | Dog Genome Sequencing Consortium; NCBI RefSeq | RefSeq: GCF_000002285.3 |
| RefSeq proteins – Homo sapiens | Genome Reference Consortium; NCBI RefSeq | RefSeq: GCF_000001405.33 |
| RefSeq proteins – Rousettus aegyptiacus | This paper | RefSeq: GCF_001466805.2 |
| N/A | ||
| pCAGGS/6xHis–IFN-ω4 | This paper | N/A |
| pCAGGS/6xHis–IFN-β1 | This paper | N/A |
| pET22b/6xHis-PA-D1 | This paper | N/A |
| Jellyfish | ||
| SparseAssembler | ||
| LoRDEC v0.5 | ||
| DBG2OLC | ||
| LINKS v1.5.1 | ||
| L_RNA_Scaffolder (downloaded 9/28/2015) | ||
| SSPACE v3.0 | ||
| Bowtie2 | ||
| Pilon | ||
| Quiver | Pacific Biosciences | |
| QUAST v3.0 | ||
| CEGMA v2.5 | ||
| bwa-mem v0.7.10 | ||
| SAMtools v0.1.18; v1.3 | ||
| NCBI Eukaryotic Genome Annotation Pipeline; Gnomon tRNAscan-SE | ||
| WindowMasker | ||
| Splign; ProSplign | ||
| OrthoMCL v2.0.9 | ||
| bioDB | ||
| RepeatMasker | ||
| “One code to find them all” script | ||
| PicardTools v1.131 | ||
| BCFTools v1.3.1 | ||
| Mafft v7.305b | ||
| trimAL v1.3; v1.4 | ||
| RAxML v8.2.9 | ||
| CAFE v3.1 | ||
| Mega 6.0 | ||
| MUSCLE v3.8.31 | ||
| kallisto v.0.43.0 | ||
| pheatmap | ||
| PAML v4.9b | ||
| LMAP v1.0.0 | ||
| BioEdit v7.0.0 | ||
| Trimmomatic-0.33 | ||
| DIGS for EVE, EVE library v0.1 | Gifford n.d. | |
| PANTHER database | ||
| Ensembl release 90 | ||