| Literature DB >> 33110627 |
Christian R Marshall1, Shimul Chowdhury2, Ryan J Taft3, Mathew S Lebo4,5, Jillian G Buchan6,7, Steven M Harrison4,5, Ross Rowsey8, Eric W Klee8,9, Pengfei Liu10, Elizabeth A Worthey11,12, Vaidehi Jobanputra13,14, David Dimmock2, Hutton M Kearney8, David Bick11, Shashikant Kulkarni10,15, Stacie L Taylor3, John W Belmont3, Dimitri J Stavropoulos1, Niall J Lennon5.
Abstract
Whole-genome sequencing (WGS) has shown promise in becoming a first-tier diagnostic test for patients with rare genetic disorders; however, standards addressing the definition and deployment practice of a best-in-class test are lacking. To address these gaps, the Medical Genome Initiative, a consortium of leading healthcare and research organizations in the US and Canada, was formed to expand access to high-quality clinical WGS by publishing best practices. Here, we present consensus recommendations on clinical WGS analytical validation for the diagnosis of individuals with suspected germline disease with a focus on test development, upfront considerations for test design, test validation practices, and metrics to monitor test performance. This work also provides insight into the current state of WGS testing at each member institution, including the utilization of reference and other standards across sites. Importantly, members of this initiative strongly believe that clinical WGS is an appropriate first-tier test for patients with rare genetic disorders, and at minimum is ready to replace chromosomal microarray analysis and whole-exome sequencing. The recommendations presented here should reduce the burden on laboratories introducing WGS into clinical practice, and support safe and effective WGS testing for diagnosis of germline disease.
Keywords: Genetic testing; Laboratory techniques and procedures; Next-generation sequencing
Year: 2020 PMID: 33110627 PMCID: PMC7585436 DOI: 10.1038/s41525-020-00154-9
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Fig. 1Clinical whole-genome sequencing workflow.
The workflow for clinical WGS involves three major analysis steps spanning wet laboratory and informatics processes: primary (blue) analysis refers to the technical production of DNA sequence data from biological samples through the process of converting raw sequencing instrument signals into nucleotides and sequence reads; secondary (green) analysis refers to the identification of DNA variants through read alignment and variant calling; and tertiary (yellow) analysis refers to variant annotation, filtering and prioritization, classification, interpretation, and reporting. Health record information and phenotype can be mined and converted to Human Phenotype Ontology (HPO) terms to aid variant interpretation. Primary analysis involves sample and library preparation, and sequencing with base calling, followed by extensive quality control (QC). During this stage, genotyping with an orthogonal method (SNP-array or targeted assay) is performed for QC purposes. Secondary analysis involves mapping, read alignment, and variant calling. Different classes of variation (SNVs, SVs, CNVs, mitochondrial, and repeat expansions) will use different algorithms that can be run in parallel. Aside from QC of alignment and variant calling, the orthogonal genotyping can be used to ensure no sample mix-up has occurred throughout the workflow. Tertiary analysis begins with the annotation of variants followed by filtering, prioritization, and variant classification depending on the phenotype and clinical indication for testing. Classification of variants according to ACMG guidelines may be automated, but the final interpretation involves human intervention and will ultimately be driven by the case phenotype. Variants are reported, following any necessary confirmation, based on relevance to the primary indication for testing; secondary or incidental findings not associated with the reason for testing may also be reported.
Confirmation may be performed with an orthogonal wet laboratory method or in silico examination of the data based on how the test was validated. Clinical correlation (pink) is performed by the ordering physician, which may involve iterative feedback and collaboration with the laboratory (dotted arrows). Throughout the process, collection of aggregate data will be necessary to generate internal allele frequencies and for sharing of interpreted data with repositories.
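The orthogonal genotyping step described above is used to confirm sample identity through the workflow: genotypes derived from the WGS pipeline are compared against genotypes from a SNP-array or targeted assay run on the same sample. A minimal sketch of that concordance check follows; the function name, the `./.` missing-genotype convention, and the 0.99 pass cutoff are illustrative assumptions, not values from the article.

```python
# Sketch of an orthogonal sample-identity check: WGS-derived genotypes are
# compared against SNP-array genotypes for the same nominal sample.
# Names and the pass threshold are illustrative only.

def genotype_concordance(wgs_calls, array_calls):
    """Fraction of shared, non-missing sites where the two assays agree."""
    shared = [site for site in array_calls
              if site in wgs_calls
              and wgs_calls[site] != "./."     # "./." marks a missing genotype
              and array_calls[site] != "./."]
    if not shared:
        return 0.0
    matches = sum(1 for site in shared if wgs_calls[site] == array_calls[site])
    return matches / len(shared)

wgs = {"chr1:1000": "0/1", "chr2:2000": "1/1", "chr3:3000": "0/0"}
array = {"chr1:1000": "0/1", "chr2:2000": "1/1", "chr3:3000": "0/1"}
concordance = genotype_concordance(wgs, array)

# A mismatch rate well above expected genotyping error suggests a sample
# mix-up somewhere in the workflow.
sample_identity_pass = concordance >= 0.99  # illustrative cutoff
```

In practice the comparison is restricted to a panel of well-behaved SNP sites, and relatedness checks against family structure can serve the same purpose when an orthogonal assay is unavailable.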
Fig. 2Key steps in the analytical validation of a clinical WGS test.
Key steps in the analytical validation of clinical WGS include test development optimization, test validation, and quality management. Each step involves activities that lead to defined outcomes.
Summary of key questions and recommendations for the analytical validation of whole-genome sequencing.
| Current state/question | Consensus | Recommendation | Comments and future outlook |
|---|---|---|---|
| Test development and optimization: test definition considerations | | | |
| What variant classes should be reported in a clinical WGS test? | Yes | • A clinical whole-genome sequencing test should aim to analyze and report on all detectable variant types (Table • At a minimum, we recommend reporting on SNVs, indels, and CNVs. | • Test definitions are evolving, and laboratories should further aim to offer reporting of mitochondrial variants, repeat expansions, some structural variants, and selected clinically relevant genes whose analytical assessment is made difficult by pseudogenes or highly homologous sequence. |
| Test performance considerations | | | |
| What comparisons are necessary for WGS to replace other tests such as CMA or WES? | Yes | • Clinical WGS test performance should meet or exceed that of any test it replaces. • Current evidence suggests WGS is analytically sufficient to replace WES and CMA (Table | • If clinical WGS is deployed with any established gaps in performance compared to current gold-standard tests, these should be noted on the test report. • Robust detection by WGS of some variants is either not equivalent (mosaic SNVs) or has yet to be established (e.g., repeat expansions), but such variants should still be included in the test definition as long as limitations in test sensitivity are defined. |
| Which variant types should be confirmed before reporting on a WGS test? | Yes | • The laboratory should have a strategy in place to define which variants need confirmatory testing before reporting. • Until the accuracy of more complex variants (SVs, REs, etc.) is equivalent to that of currently accepted assays, confirmation with an orthogonal method is necessary before reporting. | • As algorithms improve for complex variant calling and data are acquired to support WGS accuracy relative to currently accepted assays, it is expected that orthogonal confirmation will no longer be necessary. |
| Upfront considerations for test design | | | |
| How should labs define and evaluate high-quality genome coverage and callability? | Some | • Metrics that measure genome completeness should be used to define the performance of clinical WGS, including overall depth and evenness of coverage. • These should be monitored with respect to callable regions of the genome and the related calling accuracy for each variant type compared to orthogonally investigated truth sets. • An assessment of callable regions should use depth of coverage, base quality, and mapping quality. | • Although consensus was achieved on the concept of evaluating genome coverage and callability, consistent methodology and universal cutoffs could not be established. Expected and suggested values and ranges are shown in Table |
| What reference standards and positive controls should be used for performing clinical WGS validation? | Yes | • Reference standards are useful for the evaluation of calling accuracy across variant type, size, and location. • Analytical validation of clinical WGS should include publicly available reference standards (e.g., NIST and Platinum Genomes) in addition to commercially available and laboratory-held positive controls for each variant type (see Supplementary Table | • Reference standards are not sufficient on their own for validation; laboratory-held positive controls derived from the same specimen type should also be used. |
| How many controls should be used for variant types commonly addressed by the field vs those where standards are still evolving (REs)? | Yes | • For commonly addressed variants such as small variants, a low minimal number of controls can be used if they include well-accepted reference standards. • For more complex and emerging variant types such as REs, a larger number of samples is necessary (see Supplementary Table | • As algorithms and reference-standard datasets improve, it is expected that fewer samples will be needed for test validation. |
| Test validation: performance metrics, variant type, and genomic context | | | |
| What factors should be considered when deciding which metrics to evaluate during test validation? | Some | • The analytical framework should include metrics that account for genome complexity, with special attention to sequence content and variant type. | • Small variants and CNVs have different calling constraints that can be affected differently by low-complexity sequence. |
| Which performance metrics should be utilized to ensure the accuracy of clinical WGS? | Some | • Use GA4GH and FDA recommendations for sensitivity (PPA) and precision (TPPV), and the lower bound of the 95% CI, when truth sets are available. • Repeatability, reproducibility, and limits of detection should be tested. • PPA and NPA against positive controls assessed using a precedent technology are recommended when standard truth sets are not available (see Supplementary Table | • Precision (TPPV) is a more useful metric than specificity due to the large number of true negatives expected from clinical WGS. |
| What limitations affecting variant calling performance should be considered for a WGS test? | Some | Variant class: • Mosaics: sensitivity for detection of mosaic variants is low due to the overall read depth. • Structural variants: limitations in the detection of balanced changes (translocations and inversions). • Repeat expansions: limitations in accurately determining expanded sizes. Variant size: • Indels/SVs: limitations in calling accuracy for larger indels (>20 bp) and smaller SVs (<1000 bp). Genomic context: • Limitations in accuracy for all variant types in regions of high homology, low complexity, and other technically challenging regions. | • Consensus on limitations in calling performance by variant type was achieved; however, consensus on specific definitions related to variant size and genomic context was not. • Regions that are defined as technically challenging should be documented and made available to those ordering the test. |
| Sample number and type for validation | | | |
| How many samples and what types are needed for validation? | Some | • Reference standards are sufficient to assess the global accuracy of small variants but should be supplemented with positive controls. • Beyond small variants, more positive controls are needed; these should reflect the most common pathogenic loci or variants. | • Little consensus on the number of positive controls needed. • The number of samples needed for emerging uses of clinical WGS will evolve as more laboratories validate clinical WGS. |
| Quality management: control samples | | | |
| What should be included in ongoing quality control of a clinical WGS test? | Yes | • Identification of a comprehensive set of performance metrics and continual monitoring. • Use of positive controls on a periodic basis. | • Less reliance on positive controls for each run; more on the continual monitoring of defined metrics. |
| Sequencing quality and performance metrics | | | |
| What sequencing metrics should be monitored for performance? | No | • General agreement on some of the sequencing and analysis metrics that should be used for pass/fail (Table • Unable to reach consensus on which metrics should be used for sample-level QC and monitoring, and the corresponding thresholds that need to be met. | • Standards for metrics calculations should allow for consensus on thresholds. |
Metrics for clinical whole-genome sequencing.
| Metric | Description | Type (threshold) or typical expected value |
|---|---|---|
| Examples of pass/fail metrics | ||
| Sample identity | Concordance with genotype (orthogonal and/or family structure when available). | Pass/fail (match) |
| Contaminationa | The estimated level of sample cross-individual contamination based on a genotype-free estimation. | Pass/fail (≤2%) |
| Gb ≥ Q30b | Total aligned gigabases (Gb) of data with base quality score ≥Q30. | Pass/fail (≥80 Gb) |
| Autosome mean coveragec | The mean coverage across human autosomes, after all filters are applied. | Pass/fail (≥30) |
| % Callabilityd | Percent of non-N reference positions in autosomal chromosomes with a passing genotype call. | Pass/fail (>95%) |
| Examples of metrics to monitor | ||
| %Q30 bases total | The percentage of bases that meet Q30 scores. | ≥85% |
| 20×%e | The fraction of non-N autosome bases that attained at least 20× sequence coverage in post-filtering bases. | ≥90% |
| PF reads aligned % | The percentage of passing filter (PF) reads that align to the reference sequence. | >98% |
| PF aligned Q20 basesf | The number of bases aligned to the reference sequence in PF reads that were mapped at high quality and where the base call quality was Q20 or higher. | >1.0E+11 |
| Adapter-dimer % | The fraction of PF reads that are unaligned and match to a known adapter sequence right from the start of the read. | <0.2% |
| Chimera % | The percentage of reads that map outside of a maximum insert size (usually 100 kb) or that have the two ends mapping to different chromosomes. | <1% |
| Duplication % | The percentage of mapped sequence that is marked as duplicate. | <10% |
| Median insert sizeg | The median insert size of all paired end reads where both ends mapped to the same chromosome. | >300 bp |
| Excluded total % | The percentage of aligned bases excluded due to all filters. | <15% |
aLaboratories in the Medical Genome Initiative use a threshold of <1% for germline WGS from peripheral blood.
bSome laboratories in the Medical Genome Initiative use a similar metric of ≥85 Gb unaligned Q30 sequence.
cLaboratories in the Medical Genome Initiative use either 30× or 40× mean coverage as a cutoff.
dCallability or the fraction of the genome where accurate calls can be made can be calculated in different ways. The description in the table represents one way to calculate callability, but there are others including using the percentage of base pairs that reach a read depth (RD) of 20, with base quality (BQ) and mapping quality (MQ) of 20.
eMeasure of completeness. Depth of coverage at 15× also used by some laboratories along with a mapping quality cutoff (>10). Targets will also vary; laboratories may measure across genome, exome, OMIM morbid map genes, and positions or exons with known pathogenic regions.
fSome laboratories in the Medical Genome Initiative use Q10 Bases.
gMean insert size may also be used. The mean is computed over the “core” of the distribution, because artifactual outliers in the distribution often cause calculation of nonsensical mean and standard deviation values. To avoid this, the distribution is first trimmed to a “core” distribution of ±N median absolute deviations around the median insert size. By default, N = 10, but this is configurable.
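The pass/fail metrics in the table above lend themselves to a simple rule-driven QC gate. The sketch below encodes the example thresholds from the table (contamination ≤2%, ≥80 Gb at ≥Q30, ≥30× autosome mean coverage, >95% callability); the metric names and the gate structure are illustrative assumptions, not a reference implementation, and individual laboratories use different cutoffs as the footnotes note.

```python
# Sketch of a per-sample QC gate using the example pass/fail thresholds
# from the metrics table. Operator direction matters: contamination is an
# upper bound, while yield, coverage, and callability are lower bounds.

THRESHOLDS = {
    "contamination_pct": ("<=", 2.0),   # cross-individual contamination estimate
    "gb_q30":            (">=", 80.0),  # aligned gigabases at >=Q30
    "autosome_mean_cov": (">=", 30.0),  # mean autosomal coverage after filters
    "callability_pct":   (">",  95.0),  # non-N autosomal positions with passing call
}

OPS = {"<=": lambda v, t: v <= t,
       ">=": lambda v, t: v >= t,
       ">":  lambda v, t: v > t}

def qc_gate(metrics):
    """Return (overall_pass, per-metric pass/fail) for one sample."""
    results = {name: OPS[op](metrics[name], threshold)
               for name, (op, threshold) in THRESHOLDS.items()}
    return all(results.values()), results

# Illustrative sample that clears every threshold.
ok, detail = qc_gate({"contamination_pct": 0.4, "gb_q30": 92.1,
                      "autosome_mean_cov": 34.2, "callability_pct": 96.3})
```

Separating the thresholds from the comparison logic makes it straightforward to swap in laboratory-specific cutoffs (e.g., the <1% contamination or 40× coverage values some initiative members use) without touching the gate itself.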
Variant types detectable and reportable from clinical WGS.
| Variant type | Gene(s) (if applicable) | Disorder(s) | References (if applicable) |
|---|---|---|---|
| SNVs and small insertions and deletionsa (1–50 base pairs) | N/A | Heritable disease | Zook et al.; Eberle et al. |
| Copy number variationa (deletions and duplications) | N/A | Heritable disease, including known microdeletion/duplication syndromes | Gross et al.; Stavropoulos et al.; Lindstrand et al. |
| Mitochondrial variationb (SNVs, deletions, duplications, and heteroplasmy of at least 5%) | N/A | Known mitochondrial disorders | Duan et al. |
| Structural variantsb | N/A | Heritable disease, including those caused by translocations, inversions, and other genomic rearrangements | Lindstrand et al. |
| Repeat expansionsc | | Fragile X and related disorders; Huntington disease; spinocerebellar ataxia 1; myotonic dystrophy 1; amyotrophic lateral sclerosis | Dolzhenko et al. |
| Selected pseudogenesc | | Spinal muscular atrophy; 21-hydroxylase deficiency; codeine sensitivity; alpha thalassemia; colorectal cancer; polycystic kidney disease 1 | Chen et al. |
aRecommended minimum variant types for clinical validation of WGS. Copy number variation is defined here as unbalanced changes (deletions and duplications) that are at the resolution of chromosomal microarray analysis.
bSome initiative groups have clinically validated. Structural variants are defined here as any genomic alteration >50 base pairs, including balanced and unbalanced changes.
cExamples of targeted loci that could be validated and reported as part of a clinical WGS test.
| Term | Definition |
|---|---|
| Analytical validity | A measure of the accuracy with which a test detects the presence or absence of a genetic change. |
| Callable region (callability) | Regions of the genome where accurate single-nucleotide variant genotypes can be reliably derived. Typically expressed as a percentage of non-N reference calls with a passing genotype across a target (whole genome, OMIM genes). |
| Completeness | Proportion of the genome, or a select region of interest (e.g., exons), that have sufficient, high-quality sequencing reads to enable identification of variants. |
| Negative percent agreement (NPA) | Equivalent to specificity. The proportion of correct calls in the absence of a variant, reflecting the frequency of false positives (FP). Calculated as the number of true negatives (TN) detected divided by the total number of positions where a variant is absent (TN plus FP). |
| No-call or invalid call | A position within the testing interval where no variant call can be made. |
| Orthogonal confirmation | Verification of a specific variant call using a different testing modality. |
| Positive percent agreement (PPA) | Equivalent to recall/sensitivity. Ability of the test to correctly identify variants that are present in a sample, reflecting the frequency of false negatives (FN). Calculated as the number of known variants detected (true positives; TP) divided by the total number of known variants tested (TP plus FN). |
| Precision | Equivalent to TPPV. The fraction of variant calls that match the expected, reflecting the number of FP per test. Calculated as the number of TP divided by the total number of positive calls made (TP plus FP). |
| Predicted zygosity | In diploid organisms, one allele is inherited from the male parent and one from the female parent. Zygosity is a description of whether those two alleles have identical or different DNA sequences. |
| Read depth | A measure of the number of sequence reads that are aligned to a specific base or locus. |
| Repeatability | The percent agreement between the results of successive tests carried out under the same conditions of measurement. |
| Reproducibility | The percent agreement between the results of tests carried out under a variety of conditions (e.g., different operators, machines, and time frames). |
| Sensitivity or recall | Equivalent to PPA. Ability of the test to correctly identify variants that are present in a sample, reflecting the frequency of FN. Calculated as the number of known variants detected (TP) divided by the total number of known variants tested (TP plus FN). |
| Specificity | Equivalent to NPA. The proportion of correct calls in the absence of a variant, reflecting the frequency of FP. Calculated as the number of TN detected divided by the total number of positions where a variant is absent (TN plus FP). |
| Technical positive predictive value (TPPV) | Equivalent to precision. The fraction of variant calls that match the expected, reflecting the number of FP per test. Calculated as the number of TP divided by the total number of positive calls made (TP plus FP). |
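The agreement metrics defined in the glossary (PPA, NPA, precision/TPPV) reduce to simple ratios over TP/FP/TN/FN counts, and the recommendations above call for reporting the lower bound of the 95% confidence interval alongside the point estimate. The sketch below computes these quantities; the choice of a Wilson score interval is an assumption for illustration (the text does not prescribe a particular interval method), and the counts are invented, not taken from the article.

```python
import math

def ppa(tp, fn):
    """Positive percent agreement (sensitivity/recall): TP / (TP + FN)."""
    return tp / (tp + fn)

def npa(tn, fp):
    """Negative percent agreement (specificity): TN / (TN + FP)."""
    return tn / (tn + fp)

def precision(tp, fp):
    """Precision (technical positive predictive value): TP / (TP + FP)."""
    return tp / (tp + fp)

def wilson_lower(successes, n, z=1.96):
    """Lower bound of the 95% Wilson score interval for a proportion.

    Chosen here as one common way to obtain a CI lower bound; other
    interval methods (e.g., Clopper-Pearson) are equally valid.
    """
    p = successes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

# Invented truth-set comparison counts for illustration only.
tp, fp, fn = 9990, 12, 10
ppa_point = ppa(tp, fn)
ppa_ci_lower = wilson_lower(tp, tp + fn)
prec_point = precision(tp, fp)
print(f"PPA = {ppa_point:.4f} (95% CI lower bound {ppa_ci_lower:.4f})")
print(f"Precision (TPPV) = {prec_point:.4f}")
```

The CI lower bound is the conservative figure to quote: for the counts above it sits below the point estimate, and the gap widens as the number of variants assessed shrinks, which is why small validation truth sets yield weaker performance claims.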