Nele A Haelterman, Lichun Jiang, Yumei Li, Vafa Bayat, Hector Sandoval, Berrak Ugur, Kai Li Tan, Ke Zhang, Danqing Bei, Bo Xiong, Wu-Lin Charng, Theodore Busby, Adeel Jawaid, Gabriela David, Manish Jaiswal, Koen J T Venken, Shinya Yamamoto, Rui Chen, Hugo J Bellen.
Abstract
Forward genetic screens using chemical mutagens have been successful in defining the function of thousands of genes in eukaryotic model organisms. The main drawback of this strategy is the time-consuming identification of the molecular lesions causative of the phenotypes of interest. With whole-genome sequencing (WGS), it is now possible to sequence hundreds of strains, but determining which mutations are causative among thousands of polymorphisms remains challenging. We have sequenced 394 mutant strains, generated in a chemical mutagenesis screen for essential genes on the Drosophila X chromosome, and describe strategies to reduce the number of candidate mutations from an average of ∼3500 to 35 single-nucleotide variants per chromosome. By combining WGS with a rough mapping method based on large duplications, we were able to map 274 (∼70%) mutations. We show that these mutations are causative, using small 80-kb duplications that rescue lethality. Hence, our findings demonstrate that combining rough mapping with WGS dramatically expands the toolkit necessary for assigning function to genes.
Year: 2014 PMID: 25258387 PMCID: PMC4199363 DOI: 10.1101/gr.174615.114
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. A sequencing depth of 30× permits identification of 95% of SNVs. (A) Graph displaying the number of identified SNVs at different sequencing depths for the isogenized FRT19A X chromosome (FRT19A). A 30× coverage allows identification of ∼95% of the SNVs identified at 50× coverage. (B) Percentage of the X chromosome that is covered 1×–5× (black), 5×–10× (dark gray), 10×–20× (gray), or ≥20× (light gray) at various average sequencing depths. An average sequencing depth of 30× allows reliable heterozygous SNV-calling (requiring 10 or more reads) of 95% of the X chromosome. (C) Description of SNVs identified in the X chromosome of FRT19Aiso sequenced at 48× when compared to the reference sequence (y; cn bw sp) (Adams et al. 2000).
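The coverage criterion in Figure 1B (a position counts as callable for heterozygous SNVs when it is covered by 10 or more reads) can be illustrated with a minimal sketch. The function name `callable_fraction` and the toy depth values are assumptions made for illustration; they are not taken from the study or its pipeline.

```python
from typing import Sequence

# Threshold stated in the Figure 1B caption: 10 or more reads per position.
MIN_READS_HET = 10


def callable_fraction(depths: Sequence[int], min_reads: int = MIN_READS_HET) -> float:
    """Return the fraction of positions covered by at least `min_reads` reads."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_reads for d in depths) / len(depths)


# Hypothetical per-base read depths for ten positions (made-up values).
toy_depths = [3, 12, 30, 28, 9, 41, 10, 35, 0, 22]
fraction = callable_fraction(toy_depths)
print(f"{fraction:.0%} of toy positions meet the {MIN_READS_HET}-read threshold")
```

On real data, the per-base depths would come from an alignment file rather than a hand-written list.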
Figure 2. Filtering process to identify candidate genes in heterozygous mutants. (A) Flowchart of filters applied to identify candidate mutations in heterozygous mutants [y w (*) FRT19A/y w FRT19A]. All identified SNVs (brown) were first filtered against SNVs identified in the isogenized FRT19A X chromosome (ΔIso, orange). Subsequently, only SNVs that affect the coding sequence or splice sites were retained (functional, green). Next, the remaining SNVs were filtered against a database containing polymorphisms found in a homozygous state in a collection of 205 viable, wild-type strains from the Drosophila Genetic Reference Panel (ΔDGRP, blue). (B, left) Impact of the filters introduced in A on the total number of SNVs identified on the X chromosome. (Right) In a 1-Mb interval, the number of remaining candidate mutations is ∼4.
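The three-step cascade in Figure 2A (ΔIso, functional, ΔDGRP) can be sketched as successive set filters. This is only an illustration under stated assumptions: the `SNV` record, the effect labels in `FUNCTIONAL_EFFECTS`, and all toy variants are hypothetical, and the source does not describe how the authors implemented the filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SNV:
    pos: int      # position on the X chromosome
    ref: str      # reference base
    alt: str      # variant base
    effect: str   # hypothetical annotation label


# Assumed labels for "affects the coding sequence or splice sites".
FUNCTIONAL_EFFECTS = {"missense", "nonsense", "splice_site"}


def variant_key(v: SNV) -> tuple:
    """Identify a variant by position and base change."""
    return (v.pos, v.ref, v.alt)


def filter_candidates(snvs, iso_background, dgrp_homozygous):
    """Apply the filters in the order given in the Figure 2A caption."""
    # Delta-Iso: drop SNVs already present on the isogenized chromosome.
    step1 = [v for v in snvs if variant_key(v) not in iso_background]
    # Functional: keep only coding-sequence or splice-site changes.
    step2 = [v for v in step1 if v.effect in FUNCTIONAL_EFFECTS]
    # Delta-DGRP: drop polymorphisms homozygous in viable wild-type strains.
    return [v for v in step2 if variant_key(v) not in dgrp_homozygous]


# Toy data (made-up positions and annotations).
snvs = [
    SNV(100, "G", "A", "missense"),
    SNV(200, "C", "T", "missense"),
    SNV(300, "G", "A", "intergenic"),
    SNV(400, "C", "T", "nonsense"),
    SNV(500, "G", "A", "splice_site"),
]
iso_background = {(100, "G", "A")}
dgrp_homozygous = {(400, "C", "T")}

candidates = filter_candidates(snvs, iso_background, dgrp_homozygous)
print([v.pos for v in candidates])
```

The rationale for the ΔDGRP step is that a variant found homozygous in a viable strain is unlikely to cause lethality, so it can be discarded as a candidate.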
Figure 3. Mapping and sequencing strategy. General strategy to map lethal mutations on the X chromosome. (A) Duplication (Dp) mapping: For every mutant, lethality was mapped to an ∼1.4-Mb region by Dp mapping. (B) Complementation (Compl) testing: Mutations that map to the same duplication were intercrossed to identify Compl groups. (C) Sequencing: Whole-genome sequencing (WGS) was performed on a total of 394 transheterozygous mutations (mut 1/mut 3) whose lethalities map to a different duplication. The 394 mutations correspond to 258 single alleles and 68 complementation groups with two alleles. (D) Validation: We used 80-kb P[acman] duplications to rescue the lethality and confirm the mapping.
Subset of genes, identified through WGS and validated by 80-kb P[acman] dp rescue and/or complementation tests
Figure 4. Filtering strategy to identify candidate genes in transheterozygous (mut 1/mut 2) mutants. (A) The same filters were applied as in Figure 2, and additional filters were added to remove SNVs identified repeatedly in multiple sequenced genomes (ΔXscreen [red]). A final filter was added to exclude genes that appear to be difficult to sequence (technical [purple]). (B) Building a background-specific filter (ΔXscreen). The largest drop in SNVs is seen when the ΔXscreen filter is built based on recurring SNVs found in 12 transheterozygous mutant genomes. (C) Building a technique-specific filter (technical). Approximately 95 genes appear difficult to sequence or analyze, since SNVs in these genes are called in nearly every sequenced genome. Hence, these genes were excluded from analysis (see Supplemental Table 1). (D) Distribution of the number of SNVs per chromosome that were identified in all analyzed sequence files. On average, 15 to 25 SNVs were identified for the two X chromosomes sequenced in the same reaction. (E) Distribution of the number of identified candidate mutations in an ∼1.4-Mb region to which lethality was mapped by duplication mapping. On average, one to two candidate mutations were found per duplication. (F) Mapping efficiency. For complementation groups consisting of multiple alleles, the causative mutation could be identified in 85% of the sequenced lines, as they could be rescued by an 80-kb P[acman] construct. For single alleles, the mutation could be validated in 62% of the sequenced lines. (G) Characteristics of the identified mutations.
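The two additional filters in Figure 4A can be sketched as follows: a background-specific filter built from SNVs that recur across independently sequenced genomes (ΔXscreen), and a technical filter that excludes a list of hard-to-sequence genes. The recurrence threshold `min_genomes`, the variant strings, and the gene names are assumptions for illustration; the source does not state the cutoff or the implementation the authors used.

```python
from collections import Counter


def build_recurrence_filter(genomes, min_genomes=2):
    """Collect SNVs seen in at least `min_genomes` genomes (assumed cutoff)."""
    counts = Counter()
    for snvs in genomes:
        counts.update(set(snvs))  # count each SNV once per genome
    return {snv for snv, n in counts.items() if n >= min_genomes}


def apply_filters(snvs, recurrent, excluded_genes, gene_of):
    """Drop recurring SNVs, then SNVs in genes flagged as hard to sequence."""
    kept = [
        s for s in snvs
        if s not in recurrent and gene_of.get(s) not in excluded_genes
    ]
    return sorted(kept)


# Toy data: three hypothetical genomes with made-up variants.
genomes = [
    {"X:101G>A", "X:250C>T", "X:900G>A"},
    {"X:101G>A", "X:400C>T"},
    {"X:101G>A", "X:250C>T", "X:777C>T"},
]
gene_of = {"X:900G>A": "geneA", "X:777C>T": "geneB", "X:400C>T": "geneC"}
excluded_genes = {"geneB"}  # stands in for the technical exclusion list

recurrent = build_recurrence_filter(genomes)
remaining = [apply_filters(g, recurrent, excluded_genes, gene_of) for g in genomes]
print(remaining)
```

The reasoning behind the recurrence filter is that independent mutagenesis events rarely hit the same base, so an SNV shared by several mutant genomes more likely reflects the common genetic background than a newly induced lesion.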
Primer pairs for PCR verification of causative mutations