Vera B Kaiser1, Victoria Svinti2, James G Prendergast3, You-Ying Chau2, Archie Campbell4, Inga Patarcic5, Inês Barroso6, Peter K Joshi7, Nicholas D Hastie2, Ana Miljkovic5, Martin S Taylor2, Stefan Enroth8, Yasin Memari6, Anja Kolb-Kokocinski6, Alan F Wright2, Ulf Gyllensten8, Richard Durbin6, Igor Rudan7, Harry Campbell7, Ozren Polašek9, Åsa Johansson8, Sascha Sauer10, David J Porteous4, Ross M Fraser7, Camilla Drake2, Veronique Vitart2, Caroline Hayward2, Colin A Semple2, James F Wilson11.
Abstract
Homozygous loss of function (HLOF) variants provide a valuable window on gene function in humans, as well as an inventory of the human genes that are not essential for survival and reproduction. All humans carry at least a few HLOF variants, but the exact number of inactivated genes that can be tolerated is currently unknown, as are the phenotypic effects of losing function for most human genes. Here, we make use of 1432 whole exome sequences from five European populations to expand the catalogue of known human HLOF mutations; after stringent filtering of variants in our dataset, we identify a total of 173 HLOF mutations, 76 (44%) of which have not been observed previously. We find that population isolates are particularly well suited to surveys of novel HLOF genes because individuals in such populations carry extensive runs of homozygosity, which we show are enriched for novel, rare HLOF variants. Further, we make use of extensive phenotypic data to show that most HLOFs, ascertained in population-based samples, appear to have little detectable effect on the phenotype. Conversely, we document several genes directly implicated in disease that seem to tolerate HLOF variants. Overall, HLOF genes are enriched for olfactory receptor function and are expressed in testes more often than expected, consistent with reduced purifying selection and incipient pseudogenisation.
Year: 2015 PMID: 26173456 PMCID: PMC4572071 DOI: 10.1093/hmg/ddv272
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Filtering putative HLOF variants
| Filter | Stop gain mutations | Frameshift mutations |
|---|---|---|
| Total number of variants before filtering | 1084 | 1185 |
| Relative_position < 0.9 | 883 | 973 |
| Duke (mappability) | 960 | 1027 |
| DAC excluded regions (mappability) | 1083 | 1175 |
| CRg (mappability) | 893 | 917 |
| Min. 200 | 945 | 942 |
| SureSelect & TruSeq | 794 | 829 |
| Ancestral allele | 1032 | NA |
| MNP/frameshift nearby | 982 | 817 |
| Hardy–Weinberg | 323 | 373 |
| All filters applied, Sanger Sequencing | 94 | 79 |
The number of putative HLOF variants remaining after each filter was applied individually. The filters are defined as follows: ‘Relative_position < 0.9’ restricts variants to the first 90% of all splice variants of a gene. ‘Duke’, ‘DAC’ and ‘CRg’ are mappability scores from UCSC. ‘Min. 200’ requires that at least 200 individuals were sampled for each variant. ‘SureSelect & TruSeq’ is the intersection of the two exome sequencing kits. ‘Ancestral allele’ indicates whether the LOF variant is found in other primates. The ancestral allele filter was applied to frameshift mutations only after all other filters had been applied, leading to the exclusion of seven variants. ‘MNP/frameshift nearby’ indicates whether a restoring variant was found in proximity to the focal variant. ‘Hardy–Weinberg’ indicates whether the variant passed our Hardy–Weinberg filter.
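The ‘Relative_position < 0.9’ filter can be illustrated with a minimal sketch. It assumes that each variant comes with a hypothetical mapping from transcript ID to a pair (position of the variant in the coding sequence, coding sequence length), and it reads "all splice variants" as requiring the threshold to hold in every transcript. The function name, the input layout and the example values are illustrative assumptions, not the authors' pipeline.

```python
from typing import Dict, Tuple


def passes_relative_position(
    transcripts: Dict[str, Tuple[int, int]], threshold: float = 0.9
) -> bool:
    """Return True if the variant lies within the first `threshold`
    fraction of the coding sequence of every transcript (assumed reading
    of the 'Relative_position < 0.9' filter)."""
    if not transcripts:
        # No transcript annotation: treat the variant as failing the filter.
        return False
    return all(pos / length < threshold for pos, length in transcripts.values())


# Hypothetical variant: early in one transcript, near the end of another.
example = {"tx1": (300, 1000), "tx2": (300, 320)}
print(passes_relative_position(example))
```

Under this reading, a variant that truncates only the last few residues of any one isoform is excluded, since it may leave that isoform functional.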
Sanger-sequencing confirmed HLOFs
| Position | Gene | Number of homozygotes | Population |
|---|---|---|---|
| chr1:55076137 | | 1 | NSPHS |
| chr4:113539281 | | 1 | CROATIA-Korčula |
| chr5:96222446 | | 1 | CROATIA-Vis |
| chr6:28358464 | | 1 | CROATIA-Vis |
| chr6:31106500 | | 26 | CROATIA-Vis, ORCADES, GS:SFHS, NSPHS |
| chr12:70088219 | | 2 | CROATIA-Vis, ORCADES |
| chr14:57672624 | | 2 | NSPHS, GS:SFHS |
| chr15:44091290 | | 1 | CROATIA-Vis |
| chr17:46882286 | | 2 | CROATIA-Vis, GS:SFHS |
| chr17:47921435 | | 1 | CROATIA-Vis |
| chr19:36230499 | | 1 | ORCADES |
| chr19:51729103 | | 3 | NSPHS, GS:SFHS |
| chrX:50659021 | | 1 | ORCADES |
The number of individuals who were homozygous for each novel, Sanger-sequencing-confirmed variant, and the population(s) in which the variant was found.
Figure 1. Rare HLOFs are found within ROHs. Allele frequencies of HLOFs that are biased towards being inside or outside runs of homozygosity (ROHs) in the five populations studied (using a binomial test with P < 0.1). For GS:SFHS, CROATIA-Vis, ORCADES and NSPHS, the allele frequencies of variants that were enriched in ROHs were significantly lower compared with variants that were found in the autozygome (Wilcoxon test; P < 0.05). In CROATIA-Korčula, only two variants were underrepresented in ROHs, and the Wilcoxon test was not significant.
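The binomial classification mentioned in the Figure 1 legend can be sketched as follows. The legend does not give the counts, the null proportion or the sidedness of the test, so everything here is an assumption: `p0` is a hypothetical expected probability that a copy of the variant falls inside an ROH by chance, and each direction is tested with a one-sided exact tail at the P < 0.1 threshold.

```python
from math import comb


def binom_tails(k: int, n: int, p0: float) -> tuple:
    """Exact binomial tail probabilities (P[X <= k], P[X >= k])."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    return sum(pmf[: k + 1]), sum(pmf[k:])


def roh_bias(k_in_roh: int, n_copies: int, p0: float, alpha: float = 0.1) -> str:
    """Classify a variant by whether its copies are over- or
    under-represented inside ROHs relative to the assumed null p0."""
    lower, upper = binom_tails(k_in_roh, n_copies, p0)
    if upper < alpha:
        return "enriched in ROHs"
    if lower < alpha:
        return "depleted in ROHs"
    return "no bias"


# Hypothetical variant: 9 of 10 copies inside ROHs, with 20% expected by chance.
print(roh_bias(9, 10, 0.2))
```

The allele frequencies of the two resulting groups could then be compared with a rank-based test, as the legend describes.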
Summary of the numbers of HLOFs found
| Population | N | N(HLOFs) | Yield | Private LOFs | Private HLOFs |
|---|---|---|---|---|---|
| GS:SFHS | 844 | 137 | 0.16 | 1 | 26 |
| CROATIA-Vis | 193 | 104 | 0.54 | 2 | 8 |
| ORCADES | 197 | 103 | 0.52 | 3 | 13 |
| NSPHS | 98 | 91 | 0.93 | 0 | 6 |
| CROATIA-Korčula | 100 | 74 | 0.74 | 1 | 4 |
The sample size, N, and the number of HLOF mutations that were found in each population, N(HLOFs); the average yield per individual (N(HLOFs)/N); the extent to which the mutations are shared across populations. Private LOFs are seen only in one population, including as heterozygotes; private HLOFs are shared as heterozygotes across populations.
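As a quick check of the yield column, the ratio N(HLOFs)/N can be recomputed from the sample sizes and HLOF counts in the table. The snippet below is only an arithmetic sketch; the dictionary layout and variable names are illustrative.

```python
# (sample size N, number of HLOF mutations N(HLOFs)) per population, from the table
summary = {
    "GS:SFHS": (844, 137),
    "CROATIA-Vis": (193, 104),
    "ORCADES": (197, 103),
    "NSPHS": (98, 91),
    "CROATIA-Korčula": (100, 74),
}

# Average yield per individual, rounded to two decimals as in the table
yields = {pop: round(hlofs / n, 2) for pop, (n, hlofs) in summary.items()}

# Total number of exomes across the five populations
total_n = sum(n for n, _ in summary.values())

for pop, y in yields.items():
    print(f"{pop}: {y:.2f}")
print("Total N:", total_n)
```

The recomputed yields match the table, and the sample sizes sum to the 1432 exomes quoted in the abstract. The smaller isolate cohorts show a markedly higher yield per individual than GS:SFHS.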
Figure 2. Number of HLOFs per individual and predicted deleteriousness. Boxplot of the number of HLOFs carried by each individual in GS:SFHS, CROATIA-Vis, CROATIA-Korčula, ORCADES and NSPHS (A) and the C-scores associated with HLOF variants in the five populations (B).
Gene Ontology
| GO term | Description | FDR-adjusted P-value | Enrichment (N, B, n, b) |
|---|---|---|---|
| GO:0004984 | Olfactory receptor activity | 3.35E-11 | 7.65 (17424, 362, 151, 24) |
| GO:0004930 | G-protein coupled receptor activity | 7.86E-06 | 3.81 (17424, 788, 151, 26) |
| GO:0004888 | Transmembrane signalling receptor activity | 8.29E-04 | 2.80 (17424, 1154, 151, 28) |
| GO:0004872 | Receptor activity | 8.60E-04 | 2.52 (17424, 1464, 151, 32) |
| GO:0038023 | Signalling receptor activity | 2.50E-03 | 2.58 (17424, 1253, 151, 28) |
| GO:0004871 | Signal transducer activity | 9.62E-02 | 2.09 (17424, 1548, 151, 28) |
| GO:0060089 | Molecular transducer activity | 8.25E-02 | 2.09 (17424, 1548, 151, 28) |
GO analysis of genes containing HLOF variants, using, as a background set, all genes captured by the intersection of the SureSelect and TruSeq exome sequencing kits.
Enrichment = (b/n)/(B/N), where N = total number of genes; B = number of genes associated with a GO term; n = number of genes in the target set; b = number of genes in the intersection.
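The enrichment formula can be checked directly against the tabulated values. The sketch below recomputes (b/n)/(B/N) for each GO term from the four counts in the table; the function and variable names are illustrative.

```python
def enrichment(N: int, B: int, n: int, b: int) -> float:
    """Fold enrichment: fraction of target genes carrying the GO term (b/n)
    divided by the fraction of background genes carrying it (B/N)."""
    return (b / n) / (B / N)


# (N, B, n, b) per GO term, as listed in the table
go_counts = {
    "GO:0004984": (17424, 362, 151, 24),
    "GO:0004930": (17424, 788, 151, 26),
    "GO:0004888": (17424, 1154, 151, 28),
    "GO:0004872": (17424, 1464, 151, 32),
    "GO:0038023": (17424, 1253, 151, 28),
    "GO:0004871": (17424, 1548, 151, 28),
    "GO:0060089": (17424, 1548, 151, 28),
}

for term, counts in go_counts.items():
    print(term, f"{enrichment(*counts):.2f}")
```

For olfactory receptor activity, for example, 24 of the 151 HLOF genes carry the term against 362 of the 17424 background genes, giving an enrichment of about 7.65.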