Irene Bassano1,2, Swee Hoe Ong1, Nathan Lawless3, Thomas Whitehead3, Mark Fife3, Paul Kellam4,5.
Abstract
BACKGROUND: Interferon inducible transmembrane (IFITM) proteins are effectors of the immune system widely characterized for their role in restricting infection by diverse enveloped and non-enveloped viruses. The chicken IFITM (chIFITM) genes are clustered on chromosome 5, and to date four genes have been annotated: chIFITM1, chIFITM3, chIFITM5 and chIFITM10. However, because this locus is poorly assembled in the Gallus gallus v4 genome, accurate characterization has so far proven problematic. Recently, a new chicken reference genome assembly, Gallus gallus v5, was generated using Sanger, 454, Illumina and PacBio sequencing technologies, identifying considerable differences in the chIFITM locus relative to the previous genome releases.
Keywords: Chicken IFITM; Genetic characterization; Illumina MiSeq; PacBio RSII; RNA-seq
Year: 2017 PMID: 28558694 PMCID: PMC5450142 DOI: 10.1186/s12864-017-3801-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
PacBio and Illumina MiSeq de novo assembly and mapping statistics
| | PacBio RSII | | Illumina MiSeq | |
|---|---|---|---|---|
| Number of reads | 78,140 | | 665,450 | |
| Number of bases | 401,758,407 | | 199,635,000 | |
| Mean read length (bp) | 5141 | | 300 | |
| Assembly software | HGAP | | IVA | |
| Polished contigs | 6 | | 13 | |
| Sum of contig lengths | 4,818,915 bp | | 277,830 bp | |
| Largest fragment | 2,323,934 bp | | 73,284 bp | |
| N50 (a) | 1,102,549 bp | | NA | |
| Mapping | | | | |
| Reference | Mapped reads | Mean coverage | Mapped reads | Mean coverage |
| Chr.5 | 33,892 | 193 | 586,297 | 607 |
| Chr.5 | 34,068 | 196 | 693,474 | 440 |
| PacBio_contig N.2 | NA | NA | 606,994 | 599 |

(a) N50 read length metric: the read length at which 50% of the bases are in reads longer than, or equal to, this value
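
The N50 definition in the footnote can be made concrete with a minimal sketch. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper; the code simply walks the lengths from longest to shortest until half of all bases are accounted for.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least 50% of all bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths): total is 20 bases, and the two
# longest sequences (6 + 5 = 11) already hold more than half of them.
toy_n50 = n50([2, 3, 4, 5, 6])
```

Applied to the six PacBio contig lengths listed in the per-contig table of this record, the sketch returns 1,102,486 (the length of contig 3), which differs slightly from the 1,102,549 bp reported here; the record does not explain the difference.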
Basic statistic of de novo assembled contigs from PacBio reads
| Contig | Length (bp) | Bases called (a) | Consensus accuracy (b) | Base coverage (c) |
|---|---|---|---|---|
| 1 | 2323934 | 1.0 | 0.99 | 38.8 |
| 2 | 223345 | 0.99 | 0.99 | 419.37 |
| 3 | 1102486 | 1.0 | 0.99 | 40.56 |
| 4 | 623652 | 0.99 | 0.99 | 38.0 |
| 5 | 537146 | 0.99 | 0.99 | 36.7 |
| 6 | 17862 | 1.0 | 0.99 | 28.56 |

(a) Bases called: the percentage of reference sequence that has ≥ 1x coverage. % Bases Called + % Missing Bases should equal 100. (b) Consensus accuracy: the accuracy of the consensus sequence compared to the reference. (c) Base coverage: the mean depth of coverage across the reference sequence
Fig. 1 Locus comparison between the PacBio consensus sequence (contig 2) and a portion of chromosome 5 in the two versions of the chicken genome. a The 203 kb BAC reference sequence contained in PacBio contig 2 (middle) is compared with chromosome 5 of Gallus gallus v4 (top) or v5 (bottom) using the Artemis Comparison Tool (ACT). The annotation files for Gallus gallus v4 and PacBio contig 2 have been compressed to allow visualization of the whole BAC; the annotation for Gallus gallus v5 was drawn manually, only to show the location of the locus. b The chIFITM locus (circled in a) is enlarged to show only the chIFITM genes and their flanking genes (a 40 kb region extracted from the 203 kb total). Gaps in Gallus gallus v4 are visible as white bars (N nucleotides); these are absent in the comparison with the more complete Gallus gallus v5. The graph gives only an overall view of the locus and does not show differences at the nucleotide level. c Dot plot comparison of the assembled PacBio contig 2 against Gallus gallus v5, showing differences in the 40 kb region that are not visible with ACT. The region enlarged in the right-hand dot plot is a stretch within an intron of the chIFITM3 gene that differs from chicken genome assembly v5. d Clustal Omega alignment of the PacBio contig 2 consensus sequence and the chicken genome v5 (portion of the IFITM3 gene corresponding to the gap seen in 2C). The gap is highlighted in yellow
Fig. 2 Artemis coverage and stack view of Illumina MiSeq reads mapped against the PacBio consensus sequence (contig 2). a Overall coverage and GC content of the Illumina MiSeq BAC reads (203 kb region) mapped against PacBio contig 2. This reference was built using the annotation of Gallus gallus v4 as scaffold. The chIFITM genes are located between positions 138150 and 177724 in the 203 kb region. b Stack view of the Illumina MiSeq reads showing the chIFITM locus
Coordinates of the chIFITM genes within the PacBio consensus sequence (contig 2)
| Gene | Location in contig 2 |
|---|---|
| chIFITM1 | 162068..163611 |
| chIFITM2 | 164151..165395 |
| chIFITM3 | 158589..159917 |
| chIFITM5 | 165955..167524 |
| ATHL1 | 168807..177724 |
| B4GALNT4 | 138150..157395 |
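
The coordinates above can be turned into gene order and span lengths with a short sketch. It assumes the `start..end` notation is 1-based and inclusive (as in GenBank-style feature locations); the helper names are illustrative and not part of the paper.

```python
# Coordinates copied from the table above.
# Assumption: "start..end" is 1-based and inclusive.
GENE_COORDS = {
    "chIFITM1": "162068..163611",
    "chIFITM2": "164151..165395",
    "chIFITM3": "158589..159917",
    "chIFITM5": "165955..167524",
    "ATHL1": "168807..177724",
    "B4GALNT4": "138150..157395",
}


def parse_span(span):
    """Split 'start..end' into a (start, end) tuple of integers."""
    start_text, end_text = span.split("..")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start is greater than end in {span!r}")
    return start, end


def span_length(span):
    """Length in bp of an inclusive 'start..end' span."""
    start, end = parse_span(span)
    return end - start + 1


def genes_in_contig_order(coords):
    """Gene names sorted by their start position on the contig."""
    return sorted(coords, key=lambda gene: parse_span(coords[gene])[0])


def locus_extent(coords):
    """Smallest start and largest end over all listed genes."""
    spans = [parse_span(span) for span in coords.values()]
    return min(s for s, _ in spans), max(e for _, e in spans)
```

Under that assumption, the genes lie along contig 2 in the order B4GALNT4, chIFITM3, chIFITM1, chIFITM2, chIFITM5, ATHL1, and the listed features span positions 138150 to 177724 (39,575 bp), consistent with the roughly 40 kb region referred to in the figure captions.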
Fig. 3 a Artemis coverage and stack view of the IFITM locus in DF1 cells following pull-down of the locus using SureSelect probes and sequencing with PacBio. The figure shows an intact locus and successful mapping against the Gallus gallus reference sequence, despite two gaps observed within the B4GALNT4 and IFITM3 genes. b Artemis coverage and stack view of the same DF1 pull-down, with the PacBio reads mapped instead against the new PacBio contig 2 reference sequence. As in a, two gaps (one partial) are observed within the B4GALNT4 and IFITM3 genes, although more reads cross the gaps, allowing full coverage. c Artemis coverage and stack view of the IFITM locus in turkey breast tissue following pull-down of the locus using SureSelect probes and sequencing with Illumina MiSeq. The graph shows successful mapping of the MiSeq reads even though chicken probes were used to pull down the locus in turkey tissue. The white bars represent actual gaps in the turkey reference as published on both Ensembl and NCBI; the probes cannot map to these regions because the gaps appear in the reference as “NNN”
Fig. 4 Clustal Omega alignment of the amino acid sequences of the IFITM proteins derived from the consensus sequences of the DF1 and turkey samples following targeted SureSelect pull-down. The amino acid sequences are compared to the Gallus gallus v5 sequences. Domain structures are represented as: IM1 and IM2, intramembrane domains 1 and 2; CIL, conserved intracellular loop. These have yet to be defined for chIFITM5
chIFITM transcripts average coverage values in the stable cell lines
| Cell line | Average coverage |
|---|---|
| 293T | 35 |
| 293T - chIFITM1 | 34 |
| 293T - chIFITM2 | 339 |
| 293T - chIFITM3 | 746 |
Expression levels of the IFITM transcripts calculated as RPKM in the different RNA-seq studies deposited in the European Nucleotide Archive (ENA) database
RPKM values were calculated for all the samples present in each study by Artemis. (NA: BWA did not detect any BAM alignment across the reference provided.) [17, 42–60]
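
For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) can be sketched as below. This is the standard textbook formula, not a reproduction of Artemis's internal calculation, which may differ in which reads it counts; the function name and the example numbers are hypothetical.

```python
def rpkm(feature_reads, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    RPKM = reads_on_feature * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0:
        raise ValueError("feature_length_bp must be positive")
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return feature_reads * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1,000 bp transcript in a library
# with 10 million mapped reads.
example_value = rpkm(500, 1000, 10_000_000)
```

Normalizing by both feature length and library size is what allows expression values to be compared across transcripts of different lengths and across studies sequenced to different depths.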
Tissue types, experimental conditions and species considered in the different RNA-seq studies
| N. | Tissue | Condition | Species |
|---|---|---|---|
| 1 | Lung | H5N3 AIV | Fayoumi and leghorn |
| 2 | DF-1 | IRF7 overexpression and knockdown assays/poly I:C | East Lansing Line (ELL-0) |
| 3 | DF-1 | Cell-adapted Infectious Bursal Disease Virus (ca-IBDV) infection | East Lansing Line (ELL-0) |
| 4 | Trachea | Infectious laryngotracheitis virus vaccine | 15-day-old SPF white leghorn chickens |
| 5 | DT40 CL18 chicken B lymphoma cells | Basal | Bursal lymphoma cell line derived from a Hyline SC chicken |
| 6 | Caecal tissue | C. | Barred Rock chickens |
| 7 | Breast muscle | Basal | White rock/Xinghua chickens |
| 8 | Abdominal adipose tissue | Body weight | 7 week old broiler chickens |
| 9 | Primary hepatocellular carcinoma epithelial cell line | Heat stress response | Chicken male white leghorn hepatocellular (LMH) cell line |
| 10 | Spleen | J Subgroup Avian Leukosis Virus (ALV-J) Infection | White Recessive Rock |
| 11 | Facial | Talpid2 heterozygous carriers | HH25 chickens |
| 12 | DT40 cells | Splicing factor SRSF10 | Bursal lymphoma cell line derived from a Hyline SC chicken |
| 13 | MSB1 cell line | Marek’s disease virus 1 | Chicken lymphoblastoid cell line |
| 14 | Liver | Heat stress response | Broiler chickens |
| 15 | Endocardial cells | Endocardial EMT | HH18 chicken/embryo |
| 16 | Brain (cerebral cortex/whole brain without cerebellum), cerebellum, heart, kidney, liver and testis | Basal | Red jungle fowl |
| 17 | Liver/muscle | Basal | 7 day red jungle fowl and broiler |
| 18 | CEF/HD11 | Lipopolysaccharide | 11-day white leghorn |
| 19 | Mid shaft tibial bone | Basal | White leghorn |
| 20 | Ileum/lung | H5N2/H5N1 | White leghorn/Domestic Gray Mallards |
| 21 | Adrenal gland, adipose, cerebellum, testis, ovary, heart, hypothalamus, kidney, liver, lung, breast muscle, sciatic nerve, proventriculus, spleen | Basal | Red Jungle Fowl |
| 22 | Whole embryo | Basal | UE1295 PEAT/F-37380 cross |
| 23 | Testis | | New Hampshire |
| 24 | Spleen | IBDV | Gallus gallus |
| 25 | CEF | chIFNα | CEF |
| 26 | Chicken embryo | Basal | Gallus gallus |
Fig. 5 Read alignment views in Artemis showing RNA-seq data from the different studies. Top panel: the “coverage view”, showing a separate plot for each BAM file mapped to our PacBio contig 2 (40 kb region). The coverage shows only data on the constitutive expression level of chIFITMs in immune-relevant tissues and cell lines (lung, trachea, spleen, liver, DF1, CEF, HD11, DT40). Bottom panel: the “stack view” (paired reads: blue; single reads and/or reads with an unmapped pair: black; reads spanning the same region: green), showing read depth across each chIFITM transcript in more detail. All features were annotated manually by BLAST searching with the sequences from the latest version of the chicken genome. Cyan: CDS region; grey: mRNA; white: gene (overlapping with mRNA features)
Fig. 6 RNA-seq read alignments from the immune-relevant tissues and cell lines under treated conditions: infection or treatment with IBDV, ALV-J, ILVV, LPS, H5N5/H5N1, or heat stress. The graph shows that, in these conditions too, levels of chIFITM are lower than those of chIFITM2 and chIFITM3. Top panel: overall coverage. Bottom panel: stack view of each chIFITM transcript