
SARS-CoV-2 genome sequencing from post-mortem formalin-fixed, paraffin-embedded lung tissues.

Claude Van Campenhout¹, Ricardo De Mendonça¹, Barbara Alexiou¹, Sarah De Clercq¹, Marie-Lucie Racu¹, Claire Royer-Chardon¹, Stefan Rusu¹, Marie Van Eycken¹, Maria Artesi², Keith Durkin², Patrick Mardulyn³, Vincent Bours⁴, Christine Decaestecker⁵, Myriam Remmelink¹, Isabelle Salmon⁶, Nicky D'Haene⁷.

Abstract

Implementation of SARS-CoV-2 testing in the daily practice of pathology laboratories requires procedure adaptation to formalin-fixed and paraffin-embedded (FFPE) samples. So far, one study reported the feasibility of SARS-CoV-2 genome sequencing on FFPE tissues with only one contributory case out of two. The present study aimed to optimize SARS-CoV-2 genome sequencing using the Ion AmpliSeq SARS-CoV-2 Panel on 22 FFPE lung tissues from 16 deceased COVID-19 patients. SARS-CoV-2 was detected in all FFPE blocks using a real-time RT-qPCR targeting the E gene with Crossing Point (Cp) values ranging from 16.02 to 34.16. Sequencing was considered as contributory (i.e. with a uniformity >55%) for 17 FFPE blocks. Adapting the number of target amplification PCR cycles according to the RT-qPCR Cp values made it possible to optimize the sequencing quality for the contributory blocks; i.e. 20 PCR cycles for blocks with a Cp value <28 and 25 PCR cycles for blocks with a Cp value between 28 and 30. The majority of blocks with a Cp value >30 were non-contributory. Comparison of matched frozen and FFPE tissues revealed discordance for only three FFPE blocks, all with a Cp value >28. Variant identification and clade classification were possible for 13 patients. The present study validates SARS-CoV-2 genome sequencing on FFPE blocks and opens the possibility of exploring correlations between virus genotype and histopathological lesions.


Year: 2021 PMID: 34153515 PMCID: PMC8219372 DOI: 10.1016/j.jmoldx.2021.05.016

Source DB: PubMed Journal: J Mol Diagn ISSN: 1525-1578 Impact factor: 5.568

The coronavirus disease 2019 (COVID-19) pandemic is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Coronaviruses are a family of enveloped, single-stranded, positive-sense RNA viruses that cause a wide spectrum of respiratory diseases. Since the initial report on this novel coronavirus in Wuhan, China,1, 2, 3 mortality and morbidity rapidly increased around the globe. Researchers worldwide are contributing to sequencing initiatives to try to understand how the virus is spreading. As of April 2021, up to 1,211,666 SARS-CoV-2 genomes had been sequenced and uploaded to the Global Initiative on Sharing All Influenza Data (GISAID; last accessed April 30, 2021). SARS-CoV-2 genome sequencing allows the detection of genetic modifications that could have occurred. Most SARS-CoV-2 virus detection and genotyping methods are based on fresh samples from the upper or lower respiratory tract, such as nasopharyngeal swab, oropharyngeal swab, sputum, or bronchoalveolar lavage. In the current COVID-19 pandemic, pathology laboratories face the major challenge of implementing SARS-CoV-2 testing in their daily practice. In pathology laboratories, most surgical and cytology specimens are formalin fixed, paraffin embedded (FFPE). Post-mortem studies indicated that SARS-CoV-2 could be detected by RT-qPCR on FFPE blocks of lungs and other organs.5, 6, 7, 8, 9 Adapting the SARS-CoV-2 genome sequencing protocols to FFPE blocks may provide valuable diagnostic tools for its detection and genotyping. Virus sequencing can be achieved by Sanger and/or next-generation sequencing (NGS). NGS is now well implemented in pathology laboratories for detection of cancer-related molecular alterations, using FFPE tissues. The use of targeted NGS panels allows the identification of tumor molecular profiles using small quantities of nucleic acids from FFPE blocks. However, only a few studies have reported the use of NGS to detect pathogens in FFPE blocks.15, 16, 17 Sekulic et al showed the feasibility of SARS-CoV-2 sequencing on FFPE blocks, but only one case of two was contributory. The present study aimed to optimize SARS-CoV-2 genome sequencing using NGS on 22 post-mortem FFPE tissues.

Materials and Methods

Clinical Series

Lung samples were collected from the first 16 confirmed COVID-19 patients (positive RT-qPCR assay on nasopharyngeal swab and/or bronchoalveolar lavage) who died in Hôpital Erasme (Brussels, Belgium) since March 13, 2020, and who had a positive SARS-CoV-2 E gene RT-qPCR on lung FFPE blocks (see below). The study protocol was approved by the local ethics committee (P2020/218). The autopsy procedure, clinical courses, and histopathologic findings have already been described. Briefly, six samples per lung lobe (ie, a total of 30 samples) were collected, formalin fixed, and paraffin embedded (except for two patients who had previously undergone lobectomy for cancer and for whom only 18 samples were taken). One or two blocks were randomly selected for molecular analysis among FFPE blocks showing histopathologic lesions. When two blocks were tested, they included one FFPE block from the left lung and one FFPE block from the right lung, to evaluate the heterogeneity of viral spread. Moreover, one sample was snap frozen for each lung lobe. The material was biobanked by the Biobanque Hôpital Erasme-Université Libre de Bruxelles (BE_BERA1), Cliniques Universitaires de Bruxelles Hôpital Erasme, Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium. Semiquantitative evaluation of hemorrhage on hematoxylin and eosin slides was performed by two senior pathologists (N.D. and M.R.) as follows: negative or <10% (0); between 10% and 20% of lung parenchyma showing intra-alveolar hemorrhage (+); between 20% and 30% of lung parenchyma showing intra-alveolar hemorrhage (++); and >30% of lung parenchyma showing intra-alveolar hemorrhage (+++). Necrosis was evaluated as follows: negative (0) or positive (+).

Nucleic Acid Extraction and SARS-CoV-2 Detection by RT-qPCR

For FFPE blocks, total nucleic acids were extracted from two unstained slides (10 μm thick) using the Maxwell RSC DNA FFPE kit and the Promega Maxwell extractor following the protocol described by the manufacturer (Promega Corp., Madison, WI) in an elution volume of 50 μL. For frozen tissues, RNAs were extracted using PureLink RNA Mini Kit (ThermoFisher Scientific, Waltham, MA) following manufacturer's instructions. The RNA yield was quantified using a Qubit 2.0 Fluorometer (ThermoFisher Scientific). For FFPE blocks, RNA quality was analyzed with the Agilent RNA 6000 Pico Kit on a Bioanalyzer 2100 (Agilent, Santa Clara, CA). The RNA from the FFPE blocks showed a fragmented profile, with a mean peak height of 130 nucleotides. The mean percentage of RNA fragments >200 nucleotides was of 60%, and no samples showed a percentage of RNA fragments >200 nucleotides of <30% (data not shown). The detection of the SARS-CoV-2 virus in the nucleic acid extracts was performed by RT-qPCR. One-step RT-qPCR assay specific for the amplification of SARS-CoV-2 E gene was adapted from the protocol described by Corman et al and as previously described. Briefly, 100 ng of RNA was amplified in 20-μL reaction mixture containing 5 μL of 4× TaqMan Fast Virus 1-step master mix (ThermoFisher Scientific), 0.4 μmol/L of forward (5′-ACAGGTACGTTAATAGTTAATAGCGT-3′) and reverse (5′-ATATTGCAGCAGTACGCACACA-3′) primers, and 0.2 μmol/L of probe (5′-FAM-ACACTAGCCATCCTTACTGCGCTTCG-BBQ-3′). Amplification was performed on the LightCycler 480 type II (F. Hoffmann-La Roche SA, Basel, Switzerland) following the manufacturer's instructions. Amplification condition was 50°C for 10 minutes for reverse transcription, followed by 95°C for 20 seconds and then 45 cycles at 95°C for 3 seconds and 58°C for 30 seconds. Crossing point (Cp) values were calculated using the second derivative maximum method from the Roche LightCycler software version 1.5.1.62 SP3. 
A clinical sample highly positive for SARS-CoV-2 (with a low Cp), diluted 1:1000, was used as positive control; and a clinical sample obtained from a patient autopsied before the pandemic was used as negative control in each analysis.

Library Preparation and Sequencing

For library construction, 10 ng of RNA (5 and 1 ng for testing robustness) was reverse transcribed with the SuperScript VILO (ThermoFisher Scientific) in accordance with the manufacturer's instructions. The Ion AmpliSeq SARS-CoV-2 Research Panel (ThermoFisher Scientific) was used to manually prepare the libraries. The panel consists of two 5× primer pair pools that target 237 amplicons specific to the SARS-CoV-2 coronavirus and 5 human expression controls. The amplicon lengths range from 125 to 275 bp and are designed to provide >99% coverage of the SARS-CoV-2 genome, covering from position 43 to position 29,842 (positions relative to the reference sequence). Amplification conditions were 98°C for 2 minutes for initial denaturation, followed by 20, 25, or 30 cycles (Supplemental Table S1) at 98°C for 15 seconds and 60°C for 4 minutes. Then, the amplicons were digested, barcoded, and purified using AMPure XP Beads (Beckman Coulter, Brea, CA). The libraries were amplified by PCR, and size selection was performed using AMPure XP Beads. The Ion 510, Ion 520, and Ion 530 Kit-Chef and the Ion Chef (ThermoFisher Scientific) were used for template preparation and chip loading. Sequencing was performed using the S5 Gene Studio instrument (ThermoFisher Scientific). The SARS-CoV-2 whole genome was also sequenced using Oxford Nanopore (Oxford, UK) technology, as previously described.

Data Analysis

The raw sequencing data were analyzed using the Torrent Suite software version 5.12 (ThermoFisher Scientific). The sequencing metric analysis was performed using the coverage analysis plug-in. For fresh samples, the manufacturer (ThermoFisher Scientific) recommends obtaining 1 M reads per sample and reports that the uniformity is >85%. The following sequencing quality classification was used: optimal if the mapped reads were >1,000,000 and uniformity >90%; suboptimal if the mapped reads were between 500,000 and 1,000,000 and/or uniformity between 80% and 90%. If the mapped reads were <500,000 and/or uniformity between 55% and 80%, the sequencing quality was considered as poor. If the uniformity was <55%, the sequencing was considered as non-contributory. The sequenced fragments were assembled using Iterative Refinement Meta-Assembler. Alignment to the SARS-CoV-2 genome reference and variant detection were performed using the Variant Caller plug-in COVID19AnnotateSnpEff version 1.0.0.1 (ThermoFisher Scientific). The variants were defined as sequence variations from the reference sequence of the severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1 NC_045512.2. Each variant with an allelic frequency (AF) >90% and recurrent variants (Supplemental Table S2) reported in the literature21, 22, 23, 24 were verified in the Integrative Genomics Viewer (IGV) from the Broad Institute (last accessed November 9, 2020). Sequences were aligned using the MUSCLE algorithm. Clades were assigned according to GISAID definitions (ie, clade G for patients with C241T, C3037T, and A23403G variants; clade GR for patients with C241T, C3037T, A23403G, and GGG28881AAC variants; and clade GH for patients with C241T, C3037T, A23403G, and G25563T variants). The occurrence of variants was checked on the GISAID (using CoVsurver) and Nextstrain websites to detect new variants. Viral sequences from eight patients with <25 variants in the variant list were deposited in GISAID (last accessed April 4, 2021). For all contributory sequences, clades were attributed using Nextstrain (last accessed April 4, 2021) and the Pangolin COVID-19 classification according to Rambaut et al (Pangolin, last accessed April 4, 2021).
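The quality classification above can be written as a small decision function. The following Python sketch is illustrative only and is not the authors' code: the names `Quality` and `classify_quality` are hypothetical, the "and/or" wording is read as "the worse of the two metrics decides", and values lying exactly on a threshold (for example, exactly 90% uniformity) are assigned to the lower grade, which the text does not specify.

```python
from enum import IntEnum


class Quality(IntEnum):
    """Sequencing quality grades, ordered from worst to best."""
    NON_CONTRIBUTORY = 0
    POOR = 1
    SUBOPTIMAL = 2
    OPTIMAL = 3


def classify_quality(mapped_reads: int, uniformity_pct: float) -> Quality:
    """Grade a library from its mapped reads and uniformity (in percent).

    Assumption: each metric is graded separately and the worse grade wins.
    """
    # Uniformity below 55% makes the block non-contributory regardless of reads.
    if uniformity_pct < 55:
        return Quality.NON_CONTRIBUTORY

    # Grade based on the number of mapped reads.
    if mapped_reads > 1_000_000:
        reads_grade = Quality.OPTIMAL
    elif mapped_reads >= 500_000:
        reads_grade = Quality.SUBOPTIMAL
    else:
        reads_grade = Quality.POOR

    # Grade based on uniformity (55% to 80% is poor).
    if uniformity_pct > 90:
        uniformity_grade = Quality.OPTIMAL
    elif uniformity_pct >= 80:
        uniformity_grade = Quality.SUBOPTIMAL
    else:
        uniformity_grade = Quality.POOR

    return min(reads_grade, uniformity_grade)


# Illustrative inputs only (not per-block measurements).
print(classify_quality(1_748_009, 96.4).name)
print(classify_quality(305_249, 81.0).name)
```

Using an ordered enumeration makes the "worse of the two metrics" reading explicit; a laboratory adopting different boundary conventions would only need to change the comparison operators.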

Statistical Analysis

To select the optimal library preparation protocol, uniformities, numbers of mapped reads, and coverages were analyzed for each block and considered as independent. For evaluation of the sequencing performance for the selected PCR condition, the number of variants (total and with an AF >90%) was also analyzed for each block and considered as independent. The U-test was applied for the comparison of two independent groups of ranked data. The Friedman test was applied for the comparison of multiple dependent groups. Spearman correlation analysis was used to analyze the relationship between the RT-qPCR Cp values and uniformities. Statistical analyses were performed using Statistica 7.1 (Statsoft, Tulsa, OK).
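As an illustration of the Spearman analysis, the sketch below computes the rank correlation in plain Python (ranks with ties averaged, then a Pearson correlation on the ranks). The study correlated RT-qPCR Cp values with uniformity, but per-block uniformities are not listed in this text, so the example instead pairs the Cp values with the FFPE variant counts of the 17 contributory blocks from Table 1, purely to show the mechanics. It is not an analysis reported by the authors, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Table 1, 17 contributory FFPE blocks: RT-qPCR Cp value and total variants.
cp = [18.69, 28.76, 23.13, 19.32, 29.16, 31.14, 16.02, 21.57, 27.96,
      30.55, 21.98, 23.05, 28.46, 29.69, 20.59, 20.56, 28.87]
variants = [6, 865, 12, 13, 774, 19, 8, 9, 896,
            340, 18, 26, 1025, 589, 15, 18, 707]

rho = spearman_rho(cp, variants)
print(f"Spearman rho (Cp vs FFPE variant count): {rho:.2f}")
```

The sketch omits the P value; a statistics package would normally be used for significance testing.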

Results

Sequencing Protocol Optimization

This study included 16 confirmed COVID-19 deceased patients with a positive SARS-CoV-2 E gene RT-qPCR on lung FFPE blocks. For six patients, two different lung lobes were tested, leading to 22 FFPE blocks. RT-qPCR Cp values for the different FFPE blocks ranged from 16.02 to 34.16 (Supplemental Table S1). For SARS-CoV-2 genome sequencing, the Ion AmpliSeq SARS-CoV-2 Research Panel was used, which is an amplicon-based library preparation method. Because the 22 FFPE blocks were relatively heterogeneous in terms of RT-qPCR Cp values, three different numbers of target amplification cycles were tested: 20, 25, and 30 PCR cycles for all the blocks. Libraries suitable for sequencing were obtained for all blocks, except for one (block 2-2) for which the library concentration at 20 PCR cycles was too low for sequencing (Supplemental Table S1). Globally, no significant differences were observed in terms of sequencing metrics (number of mapped reads and coverage) with increased numbers of PCR cycles. Only the uniformity appears higher at 20 PCR cycles (median, 95.72%) than at 25 and 30 PCR cycles (medians, 92% and 88%, respectively; Friedman test: P = 0.04) (Supplemental Table S1). Next, the analyses were refined according to the RT-qPCR Cp values (Figure 1 ). For blocks with low RT-qPCR Cp values (<24), the average number of mapped reads is higher with 20 PCR cycles (Friedman test: P = 0.002). In contrast, for blocks with an RT-qPCR Cp value between 24 and 30, the average number of mapped reads is higher with 25 or 30 cycles of PCR (Friedman test: P = 0.009). For RT-qPCR Cp values >30, a similar but slighter variation appeared in the number of mapped reads but was not significant (Friedman test: P = 0.135). Uniformity clearly decreased with the increase of the RT-qPCR Cp value for the three tested conditions (20, 25, and 30 cycles), as confirmed by the negative Spearman correlations (Figure 2 ). 
In particular, of the seven blocks with an RT-qPCR Cp value >30, five showed a uniformity of <55% for all the tested conditions. These five blocks were considered as non-contributory; 17 blocks were thus considered as contributory. These 17 contributory blocks came from 13 patients (including four patients with two blocks tested).

Figure 1

Variation of mapped read numbers according to RT-qPCR crossing point (Cp) values and number of PCR cycles. Data are displayed as medians (circle, 20 PCR cycles; square, 25 PCR cycles; diamond, 30 PCR cycles), 25% to 75% quartiles (box plots), and nonoutliers (bars).

Figure 2

Dot plot of SARS-CoV-2 genome uniformity against RT-qPCR crossing point (Cp) values. For the same block, three different library preparation protocols were tested by varying the number of target amplification PCR cycles. The different conditions are indicated as follows: circle, 20 PCR cycles; square, 25 PCR cycles; diamond, 30 PCR cycles. Spearman correlation between RT-qPCR Cp value and uniformity is −0.63 (P = 0.002) at 20 cycles, −0.86 (P < 10⁻⁶) at 25 cycles, and −0.81 (P = 0.000006) at 30 cycles.

For the 17 contributory blocks, the aim was to establish the best PCR condition for sequencing performance and variant analyses. As sequencing quality criteria, the uniformity was selected as the most important factor because it is related to the homogeneity of the coverage distribution. The PCR condition with the highest uniformity was thus selected. If there were conditions with similar uniformities (±3%), the condition with the highest number of mapped reads was selected. If there were conditions with similar uniformities (±3%) and number of mapped reads (±20%), the condition with the fewest PCR cycles was preferred (Supplemental Table S1). This allowed selection of 20 PCR cycles for blocks with an RT-qPCR Cp value <28 and 25 PCR cycles for blocks with an RT-qPCR Cp value between 28 and 30. It was not possible to establish rules for blocks with an RT-qPCR Cp value >30, with most of them being non-contributory (Supplemental Table S1).
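The selection procedure and the resulting Cp-based rule can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: `Condition`, `select_condition`, and `cycles_for_cp` are hypothetical names, "±3%" is assumed to mean within 3 percentage points of the best uniformity, "±20%" is assumed to mean within 20% of the highest read count among the retained conditions, and the metric values in the usage example are invented for demonstration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Condition:
    cycles: int             # target amplification PCR cycles (20, 25, or 30)
    uniformity_pct: float   # uniformity of coverage, in percent
    mapped_reads: int       # number of mapped reads


def select_condition(conditions: List[Condition]) -> Condition:
    """Pick the preferred PCR condition for one block.

    1. Keep conditions whose uniformity is within 3 points of the best.
    2. Among those, keep conditions within 20% of the highest read count.
    3. Among those, prefer the fewest PCR cycles.
    """
    best_uniformity = max(c.uniformity_pct for c in conditions)
    similar_u = [c for c in conditions
                 if best_uniformity - c.uniformity_pct <= 3]
    best_reads = max(c.mapped_reads for c in similar_u)
    similar_reads = [c for c in similar_u
                     if c.mapped_reads >= 0.8 * best_reads]
    return min(similar_reads, key=lambda c: c.cycles)


def cycles_for_cp(cp: float) -> Optional[int]:
    """Cycle number suggested by the study for a given RT-qPCR Cp value."""
    if cp < 28:
        return 20
    if cp <= 30:
        return 25
    return None  # no rule could be established for Cp > 30


# Hypothetical metrics for one block under the three tested conditions.
tested = [
    Condition(20, 96.0, 1_500_000),
    Condition(25, 95.0, 1_600_000),
    Condition(30, 88.0, 1_900_000),
]
print(select_condition(tested).cycles)
print(cycles_for_cp(18.69), cycles_for_cp(28.76), cycles_for_cp(31.62))
```

Returning `None` for Cp values above 30 mirrors the statement that no rule could be established for those blocks, rather than guessing a cycle number.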

Sequencing Performances Obtained after Optimization

After adapting the number of target amplification PCR cycles according to the RT-qPCR Cp values for the 17 contributory blocks, the median number of mapped reads and uniformity were 1,642,150 (minimum-maximum: 305,249 to 2,094,563) and 95.9% (minimum-maximum: 81% to 98%), respectively. The sequencing quality was considered as optimal for 10 blocks (Materials and Methods), with a median number of mapped reads of 1,748,009, a median coverage of 10,644, and a median uniformity of 96.4% (Table 1). The RT-qPCR Cp values of these 10 FFPE blocks varied from 18.69 to 31.14. The sequencing quality was considered as suboptimal for six blocks, with a median number of mapped reads of 1,128,420, a median coverage of 5385, and a median uniformity of 89%. Their RT-qPCR Cp values ranged from 16.02 to 30.55. The sequencing quality of one block was considered as poor (Table 1). According to the RT-qPCR Cp values, significant differences were observed between contributory blocks with an RT-qPCR Cp value <24 and those with an RT-qPCR Cp value between 24 and 30 in terms of number of mapped reads, uniformity, and the number of variants (Figure 3). As it would be easier in daily practice to use the same protocol for each block, a comparison between the sequencing metrics and the data obtained by the Variant Caller plug-in was performed for each block for the three different conditions (20, 25, and 30 cycles) and the selected condition, as proposed above. The adaptation of the number of PCR cycles to the RT-qPCR Cp value (selected condition) yielded more blocks with an optimal or suboptimal result (Supplemental Table S3). Moreover, increasing the number of PCR cycles led to a higher number of variants with an AF <0.9, which can reflect sequencing artifacts.

Table 1

Sequencing Metrics for Matched FFPE and Frozen Tissues

Patient no.	RT-qPCR Cp value	Selected PCR cycles for target amplification	Sequencing quality for FFPE tissue	Sequencing quality for frozen tissue	Variants for FFPE tissue, n	Variants for frozen tissue, n	Variants with AF >90% for FFPE tissue, n	Variants with AF >90% for frozen tissue, n
1	18.69	20	Optimal	Optimal	6	7	5	5
2-1	28.76	25	Optimal	Optimal	865	238	5	5
2-2	31.62	/	NC	Optimal	/	189	/	5
3	23.13	20	Optimal	Optimal	12	11	9	9
4	19.32	20	Optimal	Optimal	13	10	6	6
5-1	29.16	25	Suboptimal	Suboptimal	774	184	6	5
5-2	31.41	/	NC	Suboptimal	/	231	/	5
6	34.16	/	NC	NC	/	/	/	/
7	31.14	20	Optimal	Optimal	19	15	9	9
8-1	16.02	20	Suboptimal	Poor	8	10	7	7
8-2	21.57	20	Optimal	Suboptimal	9	7	7	7
9-1	27.96	20	Poor	Suboptimal	896	284	4	4
9-2	30.55	30	Suboptimal	Optimal	340	160	22	4
10	33.03	/	NC	Suboptimal	/	69	/	9
11-1	21.98	20	Optimal	Optimal	18	15	10	10
11-2	23.05	20	Optimal	Optimal	26	15	10	10
12-1	28.46	25	Suboptimal	Suboptimal	1025	293	6	7
12-2	29.69	25	Suboptimal	Optimal	589	210	9	7
13	20.59	20	Optimal	Suboptimal	15	9	7	7
14	30.88	/	NC	Poor	/	162	/	7
15	20.56	20	Optimal	Optimal	18	21	6	6
16	28.87	25	Suboptimal	Poor	707	314	8	5

AF, allelic frequency; Cp, crossing point; FFPE, formalin fixed, paraffin embedded; NC, non-contributory; /, not applicable.
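The quality columns of Table 1 can be tallied to cross-check the counts quoted in the Results. The short Python sketch below transcribes the block identifiers and the two quality columns from the table; the variable names are illustrative.

```python
from collections import Counter

# (block, FFPE quality, frozen quality) transcribed from Table 1.
TABLE_1 = [
    ("1", "Optimal", "Optimal"),
    ("2-1", "Optimal", "Optimal"),
    ("2-2", "NC", "Optimal"),
    ("3", "Optimal", "Optimal"),
    ("4", "Optimal", "Optimal"),
    ("5-1", "Suboptimal", "Suboptimal"),
    ("5-2", "NC", "Suboptimal"),
    ("6", "NC", "NC"),
    ("7", "Optimal", "Optimal"),
    ("8-1", "Suboptimal", "Poor"),
    ("8-2", "Optimal", "Suboptimal"),
    ("9-1", "Poor", "Suboptimal"),
    ("9-2", "Suboptimal", "Optimal"),
    ("10", "NC", "Suboptimal"),
    ("11-1", "Optimal", "Optimal"),
    ("11-2", "Optimal", "Optimal"),
    ("12-1", "Suboptimal", "Suboptimal"),
    ("12-2", "Suboptimal", "Optimal"),
    ("13", "Optimal", "Suboptimal"),
    ("14", "NC", "Poor"),
    ("15", "Optimal", "Optimal"),
    ("16", "Suboptimal", "Poor"),
]

# Count blocks per quality grade for each tissue type.
ffpe_counts = Counter(ffpe for _, ffpe, _ in TABLE_1)
frozen_counts = Counter(frozen for _, _, frozen in TABLE_1)

# Frozen-tissue quality for the blocks that were suboptimal on FFPE.
frozen_for_suboptimal_ffpe = Counter(
    frozen for _, ffpe, frozen in TABLE_1 if ffpe == "Suboptimal"
)

print("FFPE:", dict(ffpe_counts))
print("Frozen:", dict(frozen_counts))
print("Frozen quality of suboptimal FFPE blocks:", dict(frozen_for_suboptimal_ffpe))
```

The tallies agree with the text: 10 optimal, 6 suboptimal, 1 poor, and 5 non-contributory FFPE blocks, against 11 optimal, 7 suboptimal, 3 poor, and 1 non-contributory frozen tissues.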

Figure 3

Variation of sequencing performances [number of mapped reads (A), uniformity (B), number of variants (C), and number of variants with an allelic frequency (AF) >0.9 (D)] obtained with the selected PCR condition, according to RT-qPCR crossing point (Cp) values for contributory blocks. Data are displayed as medians, 25% to 75% quartiles (box plots), and nonoutliers (bars). The U-test was applied. ∗P < 0.05, ∗∗P < 0.01.

Factors Influencing Sequencing Performances

The presence of hemorrhage and/or necrosis on the 22 FFPE blocks was evaluated to identify whether histologic features can affect the sequencing performance and quality (Supplemental Table S4). Hemorrhage was observed for eight blocks, and necrosis was observed for four blocks. The sequencing quality was more often optimal when neither hemorrhage nor necrosis was present (7/11 blocks with optimal sequencing when neither hemorrhage nor necrosis was present versus 3/11 blocks with optimal sequencing when hemorrhage and/or necrosis was present). To examine the impact of formalin fixation (a well-known cause of RNA damage–induced changes and sequencing artifacts), the same library preparation (with the adaptation of the PCR amplification cycles to the RT-qPCR Cp values) and sequencing protocols were used on matched frozen tissues. Using this method on the 22 frozen tissues, sequencing quality was considered as optimal for 11 (median number of mapped reads of 1,549,686, median coverage of 9971, and median uniformity of 97%). Sequencing quality was considered as suboptimal for seven frozen tissues (median number of mapped reads of 1,256,654, median coverage of 5375, and median uniformity of 90%). Finally, for three frozen tissues, the sequencing quality was considered as poor; and for one frozen tissue, it was considered as non-contributory (Table 1). When considering the six suboptimal FFPE blocks and the matched frozen tissues, optimal quality on frozen tissues was observed for two of them, whereas sequencing remained suboptimal for two and was poor for the remaining two. Block 9-1, which showed poor sequencing quality from FFPE, was suboptimal from frozen tissue. The five FFPE blocks categorized as non-contributory showed various results when frozen tissue was sequenced: one optimal, two suboptimal, one poor, and one non-contributory.

Variant Analysis

Among the 17 contributory FFPE blocks, between 6 and 1025 variants were detected, with an AF varying between 2% and 100%. Between 4 and 22 variants with an AF >90% were detected, with a mean of 8 variants per FFPE block. Each variant with an AF >90% and recurrent variants reported in the literature were verified in the IGV (Materials and Methods). Verification using IGV validated all variants with an AF >90%, except for two deletions that were detected by the Variant Caller plug-in but not confirmed (Patients 5 and 12). Moreover, for Patient 2, the variant G11083T was detected by the Variant Caller plug-in with an AF of 63% in the FFPE block and with an AF of 73% in the matched frozen tissue, but IGV verification revealed an AF of almost 100% for the two conditions. For Patient 12, the variant GGG28881AAC was detected by the Variant Caller plug-in with an AF of 74% and with an AF of 95% in the matched frozen tissue; verification using IGV revealed an AF of almost 100% for the FFPE block. For Patient 16, IGV verification showed the presence of the variants C241T and GGG28881AAC in both FFPE block and matched frozen tissue. However, the C27476T variant identified in the FFPE block with an AF of 95% was not observed in the matched frozen tissue. For all optimal (10 of 10) FFPE blocks, the same variants with an AF >90% were detected in the matched frozen tissues. For the six suboptimal FFPE blocks, comparison of the variant caller plug-in results between FFPE and matched frozen tissue revealed additional variants for four FFPE blocks (5-1, 9-2, 12-2, and 16), a missing variant for one FFPE block (12-1), and the same profile for one FFPE block (8-1) (Table 1). However, IGV verification showed that the profile was concordant between FFPE and matched frozen tissue for blocks 5-1 and 12-1. 
In summary, the comparison of matched frozen and FFPE tissues identified three blocks presenting discordance (9-2, 12-2, and 16), with additional variants in the FFPE blocks that were absent in the matched frozen tissue. All three FFPE blocks were characterized by suboptimal sequencing quality and by an RT-qPCR Cp value >28. Regarding the four patients with two different lung lobes tested, two presented the same variant profile (Patients 8 and 11). Discordances were observed between lobes for Patients 9 and 12, but comparison with matched frozen tissues revealed that additional variants observed in one lobe were related to sequencing artifacts. Regarding recurrent variants reported in the literature, all patients harbored the C241T, C3037T, C14408T, and A23403G nucleotide variants (Figure 4 and Supplemental Appendix S1). Distinct variant profiles were identified across the patients (Tables 2 and 3). According to the GISAID definitions (Materials and Methods), clade G was assigned for seven patients, clade GR was assigned for four patients, and clade GH was assigned for two patients. For four patients (Patients 9, 11, 12, and 16), some genomic positions could not be assessed because of an AF of around 40% to 60%. Using Nextstrain classification, eight patients were classified as clade 20A (because of the C14408T and A23403G variants), one patient was classified as clade 20B (because of the GGG28881AAC variant), one patient was classified as clade 20C (because of the C1059T and G25563T variants), and three patients were classified as clade 20D (because of the C4002T, G10097A, C13536T, and C23731T variants) (Figure 5). According to Pangolin COVID-19 classification from Rambaut et al, 11 patients were classified as B.1 and two patients were classified as C.11 (alias of B.1.1.1.11). Variants were checked on the GISAID (using CoVsurver) and Nextstrain websites, and three variants that have never been described before were detected [i.e. A16166G (AF of 99.8% for both tested lobes) for Patient 11, C710T (AF of 100%) for Patient 13, and C21805T (AF of 99.7%) for Patient 15]. Interestingly, these three variants were also detected in the matched frozen tissues.
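The GISAID clade definitions given in Materials and Methods amount to a simple marker lookup, sketched below in Python. The function name `gisaid_clade` is hypothetical, only the three clades defined in this study are handled, and the order in which GR and GH are tested is an assumption for the undefined case where a profile carries both markers. The example profiles are taken from Table 3.

```python
from typing import Optional

# Variants shared by clades G, GR, and GH in the study's definitions.
G_MARKERS = {"C241T", "C3037T", "A23403G"}


def gisaid_clade(profile: str) -> Optional[str]:
    """Assign clade G, GR, or GH from a dash-separated variant profile."""
    variants = set(profile.split("-"))
    if not G_MARKERS <= variants:
        return None  # outside the three clades defined in this study
    if "GGG28881AAC" in variants:
        return "GR"
    if "G25563T" in variants:
        return "GH"
    return "G"


# Profiles of Patients 1, 8, and 12 as listed in Table 3.
profiles = {
    "1": "C241T-C3037T-C14408T-C15324T-A23403G",
    "8": "C241T-C1059T-C3037T-C14408T-A23403G-G25563T-G29291A",
    "12": "C241T-C3037T-C14408T-T15978C-A23403G-G24794T-GGG28881AAC",
}
for patient, profile in profiles.items():
    print(patient, gisaid_clade(profile))
```

Applied to these three profiles, the lookup reproduces the clades listed in Table 3 (G, GH, and GR, respectively).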

Figure 4

Table 2

Variant Frequencies

Nucleotide variation	Gene	Mutation type	Amino acid change	Frequency
C241T	Upstream (5′UTR)ORF1ab			13/13
C710T	ORF1ab/NSP1	Missense	L149F	1/13
C1059T	ORF1ab/NSP2	Missense	T265I/T85I	1/12
C2113T	ORF1ab	Synonymous	-	1/13
C3037T	ORF1ab	Synonymous	-	13/13
C4002T	ORF1ab/NSP3	Missense	T1246I/T428I	3/13
C7765T	ORF1ab	Synonymous	-	1/13
C8782T	ORF1ab	Synonymous	-	0/13
G10097A	ORF1ab/NSP5	Missense	G3278S/G15S	3/13
G11083T	ORF1ab/NSP6	Missense	L3606F/L37F	1/13
C13536T	ORF1ab	Synonymous	-	2/12
C14408T	ORF1ab/NSP12	Missense	P4715L/P323L	13/13
C15324T	ORF1ab	Synonymous	-	6/12
T15978C	ORF1ab	Synonymous	-	1/13
A16166G	ORF1ab/NSP12	Missense	N5301S/N909S	1/13
C17690T	ORF1ab/NSP13	Missense	S5809L/S485L	1/13
C18060T	ORF1ab	Synonymous	-	0/13
C18877T	ORF1ab	Synonymous	-	1/13
C21805T	S	Synonymous	-	1/13
A23403G	S	Missense	D614G	13/13
C23731T	S	Synonymous	-	2/12
G24794T	S	Missense	A1078S	1/13
G25563T	ORF3a	Missense	Q57H	2/12
G26144T	ORF3a	Missense	G251V	0/13
T28144C	ORF8	Missense	L84S	0/13
G28690T	N	Missense	L139F	1/13
A28765G	N	Synonymous	-	1/13
GGG28881AAC	N	Missense	RG203KR	4/13
G29291A	N	Missense	D340N	1/12

UTR, untranslated region; -, no amino acid change.

Table 3

Variant Profile per Patient

Patient no.	Profile	GISAID clade	GISAID ID∗	Nextstrain clade	Pangolin COVID-19 classification²⁸
1	C241T-C3037T-C14408T-C15324T-A23403G	G	SARS-CoV-2/human/Brussels/1/2020_EPI_ISL_451935	20A	B.1
2	C241T-C3037T-G11083T-C14408T-C15324T-A23403G	G	-	20A	B.1
3	C241T-C2113T-C3037T-C7765T-C14408T-C17690T-C18877T-A23403G-G25563T	GH	SARS-CoV-2/human/Brussels/3/2020_EPI_ISL_452142	20A	B.1.9
4	C241T-C3037T-C14408T-C15324T-A23403G-A28765G	G	SARS-CoV-2/human/Brussels/4/2020_EPI_ISL_452148	20A	B.1.83
5	C241T-C3037T-C14408T-C15324T-A23403G	G	-	20A	B.1
7	C241T-C3037T-C4002T-G10097A-C13536T-C14408T-A23403G-C23731T-GGG28881AAC	GR	SARS-CoV-2/human/Brussels/7/2020_EPI_ISL_452140	20D	C.11
8	C241T-C1059T-C3037T-C14408T-A23403G-G25563T-G29291A	GH	SARS-CoV-2/human/Brussels/8/2020_EPI_ISL_452149	20C	B.1.321
9	C241T-C3037T-C14408T-A23403G	G	-	20A	B.1.6
11	C241T-C3037T-C4002T-G10097A-C13536T-C14408T-A16166G-A23403G-C23731T-GGG28881AAC	GR	SARS-CoV-2/human/Brussels/11/2020_EPI_ISL_452150	20D	B.1.1.1
12	C241T-C3037T-C14408T-T15978C-A23403G-G24794T-GGG28881AAC	GR	-	20B	B.1.1
13	C241T-C710T-C3037T-C14408T-C15324T-A23403G-G28690T	G	SARS-CoV-2/human/Brussels/13/2020_EPI_ISL_452151	20A	B.1
15	C241T-C3037T-C14408T-C15324T-C21805T-A23403G	G	SARS-CoV-2/human/Brussels/15/2020_EPI_ISL_452152	20A	B.1
16	C241T-C3037T-C4002T-G10097A-C14408T-A23403G-GGG28881AAC	GR	-	20D	C.11

COVID-19, coronavirus disease 2019; GISAID, Global Initiative on Sharing All Influenza Data; ID, identifier; -, viral sequences not deposited in GISAID.

∗GISAID (last accessed May 7, 2021).

Figure 5

Nextstrain classification for 13 patients.

Figure 4. Partial sequence alignments of 21 formalin-fixed, paraffin-embedded blocks with three different numbers of target amplification cycles against the reference sequence NC_045512.2. Key residue nucleotides for Global Initiative on Sharing All Influenza Data clade classification are indicated. Sequences for block 6 are not included in the alignment as they are much shorter than the others and do not align sufficiently well to the other sequences to give useful information.

To confirm the variants identified using the Ion Torrent sequencing platform, the SARS-CoV-2 genome from the frozen tissues matching the 17 contributory FFPE blocks was also sequenced using Oxford Nanopore technology. Sequences were obtained for all tissues, except one (block 2). Variants reported in Supplemental Table S2 could be confirmed using this third-generation sequencing platform.

Robustness Analysis

To evaluate the robustness of the SARS-CoV-2 genotyping on FFPE blocks, the technique was challenged by lowering the amount of RNA used in the reverse transcription reaction. Instead of 10 ng, 5 or 1 ng was used as input to prepare the libraries for five different FFPE blocks: two blocks with an RT-qPCR Cp value <24 (1 and 4), two blocks with an RT-qPCR Cp value between 24 and 30 (2-1 and 16), and one block with an RT-qPCR Cp value >30 (7). For three of the five blocks, genotyping results (variants with an AF >90%) remained identical regardless of the amount of input RNA. For two blocks (both with an RT-qPCR Cp value >24), the decrease of viral input was associated with discordant results in the number of identified variants (data not shown).

Discussion

Currently, many questions remain about the origin, evolution, and spread of SARS-CoV-2. SARS in 2003, Middle East respiratory syndrome in 2014, and the current COVID-19 pandemic highlight the need for coronavirus genome characterization. Laboratories worldwide are using their sequencing infrastructure and expertise to deliver and characterize SARS-CoV-2 genome sequences. Most of these sequences are generated from fresh samples; therefore, library preparation and sequencing protocols are not adapted to FFPE blocks. This study aimed to optimize SARS-CoV-2 genotyping on post-mortem FFPE lung tissues using the Ion AmpliSeq SARS-CoV-2 Research Panel.

According to the manufacturer, the number of target amplification cycles should be adapted to the viral load. Although the RT-qPCR Cp value can be affected by batch effects and cannot be used as a precise quantitative measure of viral load, it can indirectly reflect the viral load. Because the RT-qPCR Cp values were heterogeneous across the FFPE blocks, different numbers of target amplification cycles were tested to optimize the sequencing. The numbers of amplification cycles were selected to avoid overamplification of smaller fragments, which leads to lower uniformity. Low template input and biased amplification of biological material by PCR are also sources of distortion and can potentially affect the accuracy of variant detection.32, 33, 34 The present data highlight the importance of the RT-qPCR Cp value in sequencing optimization. Indeed, an increase in the number of target amplification PCR cycles is required for blocks with a higher RT-qPCR Cp value (>28). Nevertheless, most FFPE blocks with an RT-qPCR Cp value >30 were non-contributory, even when the number of target amplification PCR cycles was increased.
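The Cp-based choice of cycle number can be written as a small lookup, sketched here in Python. The thresholds follow the values reported in the abstract of this study (20 cycles for Cp <28, 25 cycles for Cp between 28 and 30). The function name `target_amplification_cycles`, the inclusive handling of Cp values of exactly 28 and 30, and the choice to return `None` for Cp >30 are assumptions made for illustration, and any such rule would need validation in each laboratory.

```python
from typing import Optional


def target_amplification_cycles(cp: float) -> Optional[int]:
    """Suggest a number of target amplification PCR cycles from an RT-qPCR Cp value.

    Returns None for Cp > 30, where most blocks in the study were
    non-contributory even with more cycles (assumed handling, not a rule
    stated by the authors).
    """
    if cp < 28:
        return 20
    if cp <= 30:  # assumption: "between 28 and 30" is treated as inclusive
        return 25
    return None


# Example Cp values: 16.02 and 34.16 are the extremes reported in the
# abstract; 29.0 is a made-up intermediate value.
for cp in (16.02, 29.0, 34.16):
    print(cp, target_amplification_cycles(cp))
```

Returning `None` rather than a larger cycle count keeps the sketch from implying a recommendation that the data do not support for high Cp values.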
SARS-CoV-2 sequences available from databases (National Center for Biotechnology Information and GISAID) are generated with different sequencing platforms and methods, and quality criteria are not well defined. In the present study, sequencing quality was categorized as optimal, suboptimal, or poor based on the number of mapped reads and the uniformity. Using this classification, all the variants identified in the optimal FFPE blocks were confirmed on matched frozen tissue. Discordances were observed only for blocks with suboptimal or poor sequencing. These data suggest that the proposed sequencing quality evaluation allows the identification of FFPE blocks with reliable results when the sequencing quality is optimal. If the sequencing quality is suboptimal or poor, new variants should be analyzed with caution, especially if a high number of variants was identified.

This study also aimed to identify factors that can affect the sequencing quality. SARS-CoV-2 genotyping results are influenced by several factors, such as the presence of hemorrhage and/or necrosis in the tissues, RT-qPCR Cp values, and formalin fixation. The sequencing quality was more often optimal when neither hemorrhage nor necrosis was present. Among the five non-contributory blocks, three presented hemorrhage and/or necrosis. Regarding RT-qPCR Cp values, after optimization, a significant difference in terms of sequencing metrics was still observed between FFPE blocks with low or high RT-qPCR Cp values. Because formalin fixation leads to cross-linking and fragmentation of nucleic acids, nucleic acids extracted from FFPE tissue are typically fragmented into pieces <300 bp long. The Ion AmpliSeq SARS-CoV-2 Research Panel amplicon lengths range from 125 to 275 bp, with an average length of 202 bp. This relatively short amplicon length can explain the success of sequencing on FFPE blocks.
Indeed, the comparison of matched frozen and FFPE tissues revealed discordance only for three blocks, all with a suboptimal sequencing result and with an RT-qPCR Cp value >28. Moreover, three variants never described before were detected using FFPE blocks and confirmed on the matched frozen tissue. These data confirm that FFPE material is suitable for SARS-CoV-2 genotyping.

The present study has some limitations: i) The sample size was relatively small. ii) The autopsies were performed from 72 to 96 hours after death. This delay can alter the quality of the nucleic acids. iii) Total nucleic acids were used as starting input. The impact of viral enrichment strategies should be investigated. iv) The RT-qPCR Cp value was used to determine the number of PCR cycles. However, RT-qPCR Cp values can vary and should be validated in each laboratory. v) The amount of available material was relatively large as it was autopsy tissue. However, in the daily practice of pathology laboratories, molecular testing should be adapted to small biopsies and low quantities of nucleic acids. To investigate the robustness of the test, the amount of starting RNA was decreased, with concordant results for FFPE blocks with low RT-qPCR Cp values. These data have to be confirmed in a larger study using biopsies. vi) This study is limited to lung tissues, and other organs were not investigated. vii) No comparison was possible with premortem samples.

Several publications have shown that third-generation sequencing methods (Oxford Nanopore sequencing and PacBio Sequel) can be used to genotype viral pathogens, such as SARS-CoV-2.35, 36, 37 Direct RNA sequencing using nanopores allows virus identification without the amplification biases linked to other sequencing technologies. Third-generation sequencing also offers near real-time genome sequencing and consequently a short turnaround time (hours compared with days with Ion Torrent and Illumina).
In the context of a newly emerging infectious disease, these methods provide a powerful tool to rapidly identify pathogens. Nevertheless, third-generation NGS platforms are less compatible with FFPE than second-generation platforms. For this reason, the most commonly used NGS platforms in pathology laboratories still belong to the second generation.

The data obtained in the present study allowed us to classify SARS-CoV-2 genomes using the clade nomenclature from GISAID for all contributory sequences as well as the Nextstrain and Pangolin COVID-19 classification tools.27, 28 The variant profile was used only for the purpose of classification, as the functional and clinical impacts of these mutations remain unknown. According to the clade classification, most of the 13 patients (8/13) are classified as clade 20A, 1 is classified as clade 20B, 1 is classified as clade 20C, and 3 are classified as clade 20D (Nextstrain, last accessed April 23, 2021).

In conclusion, the present study proposes adapting the number of target amplification PCR cycles according to the RT-qPCR Cp value to optimize SARS-CoV-2 genome sequencing on FFPE samples and obtain reliable results. This opens the possibility of exploring correlations between virus genotype and histopathologic lesions.

39 in total

1. Sources of PCR-induced distortions in high-throughput sequencing data sets.

Authors: Justus M Kebschull; Anthony M Zador
Journal: Nucleic Acids Res Date: 2015-07-17 Impact factor: 16.971

2. Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands.

Authors: Aura Timen; Marion Koopmans; Bas B Oude Munnink; David F Nieuwenhuijse; Mart Stein; Áine O'Toole; Manon Haverkate; Madelief Mollers; Sandra K Kamga; Claudia Schapendonk; Mark Pronk; Pascal Lexmond; Anne van der Linden; Theo Bestebroer; Irina Chestakova; Ronald J Overmars; Stefan van Nieuwkoop; Richard Molenkamp; Annemiek A van der Eijk; Corine GeurtsvanKessel; Harry Vennema; Adam Meijer; Andrew Rambaut; Jaap van Dissel; Reina S Sikkema
Journal: Nat Med Date: 2020-07-16 Impact factor: 53.440

3. Nextstrain: real-time tracking of pathogen evolution.

Authors: James Hadfield; Colin Megill; Sidney M Bell; John Huddleston; Barney Potter; Charlton Callender; Pavel Sagulenko; Trevor Bedford; Richard A Neher
Journal: Bioinformatics Date: 2018-12-01 Impact factor: 6.931

4. Direct RNA sequencing on nanopore arrays redefines the transcriptional complexity of a viral pathogen.

Authors: Daniel P Depledge; Kalanghad Puthankalam Srinivas; Tomohiko Sadaoka; Devin Bready; Yasuko Mori; Dimitris G Placantonakis; Ian Mohr; Angus C Wilson
Journal: Nat Commun Date: 2019-02-14 Impact factor: 14.919

5. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding.

Authors: Roujian Lu; Xiang Zhao; Juan Li; Peihua Niu; Bo Yang; Honglong Wu; Wenling Wang; Hao Song; Baoying Huang; Na Zhu; Yuhai Bi; Xuejun Ma; Faxian Zhan; Liang Wang; Tao Hu; Hong Zhou; Zhenhong Hu; Weimin Zhou; Li Zhao; Jing Chen; Yao Meng; Ji Wang; Yang Lin; Jianying Yuan; Zhihao Xie; Jinmin Ma; William J Liu; Dayan Wang; Wenbo Xu; Edward C Holmes; George F Gao; Guizhen Wu; Weijun Chen; Weifeng Shi; Wenjie Tan
Journal: Lancet Date: 2020-01-30 Impact factor: 79.321

6. Postmortem examination of COVID-19 patients reveals diffuse alveolar damage with severe capillary congestion and variegated findings in lungs and other organs suggesting vascular dysfunction.

Authors: Thomas Menter; Jasmin D Haslbauer; Ronny Nienhold; Spasenija Savic; Helmut Hopfer; Nikolaus Deigendesch; Stephan Frank; Daniel Turek; Niels Willi; Hans Pargger; Stefano Bassetti; Joerg D Leuppi; Gieri Cathomas; Markus Tolnay; Kirsten D Mertz; Alexandar Tzankov
Journal: Histopathology Date: 2020-07-05 Impact factor: 5.087

7. Population genomics of intrapatient HIV-1 evolution.

Authors: Fabio Zanini; Johanna Brodin; Lina Thebo; Christa Lanz; Göran Bratt; Jan Albert; Richard A Neher
Journal: Elife Date: 2015-12-11 Impact factor: 8.140

8. Isolation and Full-Length Genome Characterization of SARS-CoV-2 from COVID-19 Cases in Northern Italy.

Authors: Danilo Licastro; Sreejith Rajasekharan; Simeone Dal Monego; Ludovica Segat; Pierlanfranco D'Agaro; Alessandro Marcello
Journal: J Virol Date: 2020-05-18 Impact factor: 5.103

9. Unspecific post-mortem findings despite multiorgan viral spread in COVID-19 patients.

Authors: Myriam Remmelink; Ricardo De Mendonça; Nicky D'Haene; Sarah De Clercq; Camille Verocq; Laetitia Lebrun; Philomène Lavis; Marie-Lucie Racu; Anne-Laure Trépant; Calliope Maris; Sandrine Rorive; Jean-Christophe Goffard; Olivier De Witte; Lorenzo Peluso; Jean-Louis Vincent; Christine Decaestecker; Fabio Silvio Taccone; Isabelle Salmon
Journal: Crit Care Date: 2020-08-12 Impact factor: 9.097

10. A Recurrent Mutation at Position 26340 of SARS-CoV-2 Is Associated with Failure of the E Gene Quantitative Reverse Transcription-PCR Utilized in a Commercial Dual-Target Diagnostic Assay.

Authors: Maria Artesi; Sébastien Bontems; Marie-Pierre Hayette; Vincent Bours; Keith Durkin; Paul Göbbels; Marc Franckh; Piet Maes; Raphaël Boreux; Cécile Meex; Pierrette Melin
Journal: J Clin Microbiol Date: 2020-09-22 Impact factor: 5.948