Antonio Grimaldi1, Francesco Panariello1, Patrizia Annunziata1,2, Andrea Ballabio1,3,4,5,6, Davide Cacchiarelli7,8,9, Teresa Giuliano1, Michela Daniele1,2, Biancamaria Pierri10, Chiara Colantuono1,2, Marcello Salvi1,2, Valentina Bouché1, Anna Manfredi1,2, Maria Concetta Cuomo10, Denise Di Concilio10, Claudia Tiberio11, Mariano Fiorenza3, Giuseppe Portella3, Ilaria Cimmino3,12, Antonio Sorrentino12, Giovanna Fusco10, Maria Rosaria Granata12, Pellegrino Cerino10, Antonio Limone10, Luigi Atripaldi11.
Abstract
BACKGROUND: Genomic surveillance of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the only approach to rapidly monitor and tackle emerging variants of concern (VOC) of the COVID-19 pandemic. Such scrutiny is crucial to limit the spread of VOC that might escape the immune protection conferred by vaccination strategies or previous virus exposure. It is also becoming clear now that efficient genomic surveillance would require monitoring of the host gene expression to identify prognostic biomarkers of treatment efficacy and disease progression. Here we propose an integrative workflow to both generate thousands of SARS-CoV-2 genome sequences per week and analyze host gene expression upon infection.
Year: 2022 PMID: 35962405 PMCID: PMC9372932 DOI: 10.1186/s13073-022-01098-8
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 15.266
Fig. 1 A systematic approach allows the generation of large and robust genomic data in a cost-effective manner. A Schematic representation of the workflow set up to collect, process, and analyze a considerable number of viral genomes. Top: Oronasopharyngeal swabs are performed to diagnose the presence of the SARS-CoV-2 genome in patients and extract its RNA. Subsequently, viral RNA is retrotranscribed and subjected to two PCR steps to amplify and index the obtained cDNA. After circularization and nanoball generation, the obtained library is then sequenced and analyzed. Bottom: As a faster alternative, an optimized protocol performs amplification and indexing in a single PCR step. B Multiple solutions were tested to optimize the workflow. The table reports the input RNA volume, the number of reads produced per sample, the number of samples loaded per flowcell, the average time required to process a 96-well plate, and the relative cost per sample. Cost details are reported in Additional file 2: table S1. C Boxplot showing the percentage of samples submitted to the GISAID platform, divided by tested solution. Only samples with an average Ct value < 33 were considered. D Violin plot showing the distribution of the percentage of SARS-CoV-2 reads detected for different Ct ranges. n: sample size. E Variant annotation, cumulative frequency, and sequencing coverage of each position of the SARS-CoV-2 genome. F Venn diagram showing the intersection between mutations detected in all the sequenced genomes worldwide (yellow) and the mutations found in this study (light blue). G Representation of all the 156 lineages identified in this study. The length of each bar indicates the number of samples for that lineage on a logarithmic scale. Colored bars indicate VOC.
Fig. 2 Characterization of SARS-CoV-2 genome evolution in the south of Italy. A Geographic map of European countries, colored by the number of months in 2021 in which at least 5% of viral genomes were sequenced relative to new cases. Individual regions are displayed for Italy only. B Top: geographic map of Italian regions, colored by the number of genomes deposited on the GISAID platform. Bottom: percentage of genomes deposited on GISAID over the total Italian sequences, divided into Northern (green) and Southern (blue) regions. 28% of the overall Italian sequences were produced by this study (dark blue). C Geographic distribution in Campania of the genomes analyzed in this study (top) relative to the population density (bottom). D Density plots showing the distribution over time of the most frequent variants described in this study (middle) or in Italy (bottom) relative to the Campania infection curve (top) and waves (red-colored areas). Red arrows highlight periods in which variant dynamics differ between the regional and national levels. E Distribution of the average Ct value across different variants of concern (VOC). Only non-significant (n.s.) pairwise comparisons are reported (Bonferroni-adjusted p-value > 0.05).
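The Bonferroni adjustment mentioned for panel E can be illustrated with a minimal sketch. The variant pairs and raw p-values below are hypothetical placeholders, not values from the study, and the statistical test that produced the raw p-values is not specified in this record; the sketch only shows how adjusted p-values are derived and how non-significant pairs (adjusted p > 0.05) would be selected.

```python
def bonferroni_adjust(raw_pvalues):
    """Multiply each raw p-value by the number of comparisons, capped at 1.0."""
    m = len(raw_pvalues)
    return {pair: min(1.0, p * m) for pair, p in raw_pvalues.items()}


# Hypothetical raw p-values for pairwise Ct comparisons between variants.
raw = {
    ("Alpha", "Delta"): 0.001,
    ("Alpha", "Gamma"): 0.04,
    ("Delta", "Gamma"): 0.3,
}

adjusted = bonferroni_adjust(raw)

# Pairs that would be labeled n.s. under the caption's threshold.
not_significant = [pair for pair, p in adjusted.items() if p > 0.05]
print(not_significant)
```

Note that a comparison with a raw p-value below 0.05 (here the hypothetical Alpha/Gamma pair) can become non-significant after adjustment, which is why the adjusted value is the one the caption reports.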
Fig. 4 Transcriptional profiling of SARS-CoV-2 infected patients reveals a gene signature correlated with viral load and preserved across different lineages. A Correlation analysis between Ct values and gene expression of B.1 patients, performed on 8100 genes, shown as a barplot. For each gene (x-axis), its correlation value (y-axis) and significance (p-value < 0.0001, red) are reported. Bottom: highlight of the significant results (161 genes). The top 10 most anti-correlated genes are reported (black box). B Pathway and gene set enrichment analysis performed for different databases using the gene signature previously identified. Each barplot shows the significance (x-axis) and the percentage of overlap (fill color) between the input signature and the tested public gene sets. C Heatmap of z-scored, log2-transformed, and normalized gene counts for the 161 significantly correlated genes from A. Values have been averaged in 4 groups of samples depending on the Ct value (x-axis) or whether they were negative. D Venn diagram of significantly anti-correlated genes between B.1 (161 genes) and Delta (16 genes) variant-infected patients.
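The transformation behind the panel C heatmap (log2 transform, z-score, then averaging within sample groups) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the pseudocount of 1, the use of the population standard deviation, the group labels, and the example counts are all hypothetical.

```python
import math
from statistics import mean, pstdev


def log2_zscores(counts, pseudocount=1.0):
    """Log2-transform one gene's normalized counts, then z-score across samples.

    The pseudocount (an assumption) avoids log2(0) for unexpressed genes.
    """
    logged = [math.log2(c + pseudocount) for c in counts]
    mu, sd = mean(logged), pstdev(logged)
    if sd == 0:
        # A gene with constant expression carries no signal; return zeros.
        return [0.0] * len(logged)
    return [(x - mu) / sd for x in logged]


def group_means(values, groups):
    """Average per-sample values within each group label."""
    buckets = {}
    for value, group in zip(values, groups):
        buckets.setdefault(group, []).append(value)
    return {group: mean(vals) for group, vals in buckets.items()}


# Hypothetical normalized counts for one gene across six samples.
counts = [0, 1, 3, 7, 15, 31]
# Hypothetical group labels: negative swabs and two Ct bins.
groups = ["negative", "negative", "high Ct", "high Ct", "low Ct", "low Ct"]

z = log2_zscores(counts)
averaged = group_means(z, groups)
print(averaged)
```

In this toy example the gene's averaged z-score rises from the negative group to the low-Ct group, which is the pattern expected for a gene anti-correlated with Ct, that is, positively associated with viral load.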
Fig. 3 High-throughput genomic surveillance allows the identification of new SARS-CoV-2 lineages. A Donut chart representing the number of analyzed genomes presenting the Spike E484K mutation, divided by lineage. The definition of Expected lineage is described in the Methods. B Section of the phylogenetic tree representation of the whole dataset (n = 12,998), colored by lineage. The identified lineage is reported (blue dots, left) and zoomed in (right). n: sample size. C Geographic distribution of genomic variants belonging to the identified lineage, colored by collection date. The size of each pie chart is proportional to the number of samples in each geographic position. n: sample size. D Line plot showing the frequency trend of the selected mutations over time. E Section of the phylogenetic tree representation of the whole dataset (n = 12,998), colored by lineage. The identified lineage is reported (arrow, blue dots). n: sample size. F Geographic distribution of genomic variants belonging to the identified lineage, colored by collection date. The size of each pie chart is proportional to the number of samples in each geographic position. n: sample size. G Genomic characterization of twenty patients with long COVID-19 infection. The number of detected mutations is reported as a function of the number of days from the first swab. The assigned lineage (colors) and consistency (transparency) are also displayed. H Patient 8 genomic characterization relative to the number of detected mutations (colors), the infection load (y-axis), and symptom severity (+++: severe; ++: moderate).
Detailed clinical status of patients from Fig. 3G
| Patient | Immune compromised | Main clinical symptoms | Comorbidities | Age | Vaccine | Outcome |
|---|---|---|---|---|---|---|
| 1 | Yes | Pneumonia | NHL | 64 | None | Healed |
| 2 | No | Pneumonia | Hemoperitoneum, anemia | 30 | None | Deceased |
| 3 | Yes | Respiratory failure | Pulmonary hypertension, NHL | 64 | Pfizer (×2) | N/A |
| 4 | No | ARDS | Diabetes, hypertension, ischemic heart disease | 76 | None | Healed |
| 5 | No | Mild respiratory failure | Necrotizing-hemorrhagic pancreatitis | 60 | None | Deceased |
| 6 | No | ARDS | Hypertension, dyslipidemia | 61 | None | Deceased |
| 7 | No | Bilateral pneumonia, fever, asthenia, myalgia, dyspnea | T2D, obesity, hypertension | 59 | None | Healed |
| 8 | No | Severe symptoms, not specified | Atrial fibrillation, T2D | 78 | None | Deceased |
| 9 | Yes | ARDS | Anemia, ALS, COPD | 73 | Pfizer (×1) | Healed |
| 10 | No | Respiratory failure | ARDS, sepsis, anemia, pulmonary hypertension | 64 | None | Healed |
| 11 | No | Bilateral pneumonia | None | 88 | None | Deceased |
| 12 | No | Bilateral pneumonia, fever, asthenia, myalgia, dyspnea | Hypertension, T2D, HCV, dyslipidemia, obesity | 68 | None | Healed |
| 13 | Yes | Pneumonia | NHL | 73 | Pfizer (×1) | Healed |
| 14 | N/A | N/A | N/A | 87 | N/A | Healed |
| 15 | No | Respiratory failure | Psoriasis | 44 | None | Healed |
| 16 | No | Pneumonia, dyspnea, chest pain | Hypothyroidism, severe obesity | 71 | Pfizer (×1) | Healed |
| 17 | N/A | N/A | N/A | 26 | N/A | Healed |
| 18 | Secondary to chemotherapy | Asymptomatic | Ewing sarcoma | 13 | None | Healed |
| 19 | No | Fever, cough, dyspnea, pneumonia | Mixed dyslipidemia, obesity, hyperthyroidism, hypovitaminosis D | 64 | None | Healed |
| 20 | Yes | Pneumonia | Thymoma, Good’s syndrome | 60 | Pfizer (×1) | Healed |