
Prevalence estimates of putatively pathogenic leptin variants in the gnomAD database.

Luisa Sophie Rajcsanyi^1,2, Yiran Zheng^1,2, Pamela Fischer-Posovszky^3, Martin Wabitsch^3, Johannes Hebebrand^1, Anke Hinney^1,2.

Abstract

Homozygosity for pathogenic variants in the leptin gene leads to congenital leptin deficiency, causing severe early-onset obesity. This monogenic form of obesity has mainly been detected in patients from consanguineous families. Prevalence estimates for the general population based on the Exome Aggregation Consortium (ExAC) database indicated a low frequency of leptin mutations: approximately one in 15 million individuals will be homozygous for a deleterious leptin variant. With the present study, we aimed to extend these findings using the augmented Genome Aggregation Database (gnomAD) v2.1.1, which includes more than 140,000 samples. In total, 68 non-synonymous and 7 loss-of-function leptin variants were deposited in gnomAD. After predicting functional implications with in silico tools such as SIFT, PolyPhen2 and MutationTaster2021, the prevalence of hetero- and homozygosity for putatively pathogenic variants (n = 32; pathogenic prediction by at least two tools) in the leptin gene was calculated. Across all populations, the estimated prevalence of functionally relevant variants was approximately 1:2,100 for heterozygosity and 1:17,830,000 for homozygosity. This prevalence differed between the individual populations: people from East Asia and individuals of mixed ethnicities ('Others') were at greater risk of carrying a possibly damaging leptin variant. Overall, this study emphasises the scarcity of pathogenic leptin variants in the general population, with varying prevalence across distinct study groups.


Substances: Leptin

Year: 2022 PMID: 36121795 PMCID: PMC9484668 DOI: 10.1371/journal.pone.0266642

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752

Introduction

The leptin-melanocortin system modulates the energy homeostasis and body weight regulation via the hypothalamic arcuate nucleus (ARC). The hormone leptin (LEP) is secreted by the adipose tissue into the bloodstream. In the ARC, LEP binds to the leptin receptor on pro-opiomelanocortin (POMC) and agouti-related peptide (AgRP) expressing neurons, stimulating POMC’s release and inhibiting AgRP’s expression. Subsequently, POMC is post-translationally processed into the α-melanocyte-stimulating hormone. Eventually, the signalling of melanocortin-4-receptor is stimulated leading to a decreased food intake due to satiety signals [1-5]. Homozygous mutations in the LEP gene cause congenital leptin deficiency disrupting the normal regulation of the body weight. Leptin levels in homozygous carriers of deleterious mutations are in most cases extremely low to undetectable [5, 6]. Some deleterious mutations lead to a biologically inactive leptin. Leptin levels in these patients are seemingly normal for their body mass index (BMI) [6]. A rapid weight gain eventually leads to extreme obesity with hyperphagia, hypogonadism and impaired immune functions being concomitant symptoms [5, 7–10]. This form of monogenic obesity is infrequent, with a prevalence between 1 and 5% and predominantly affecting individuals with parental consanguinity [5, 7, 8, 11–14]. In 1997, the first deleterious LEP mutation (p.Gly133Valfs*15) was reported by Montague and colleagues [7]. It was detected in the homozygous state in two cousins descending from a consanguineous family with the unaffected parents being heterozygous for the variant. Due to this frameshift mutation, the LEP protein was truncated as 14 aberrant amino acids and a premature stop codon were introduced. This led to a rapid onset of obesity after normal birth weight [7]. Subsequent treatment with recombinant leptin led to a substantial weight loss and a decrease in energy intake [11, 15]. 
Besides frameshift mutations, pathogenic nonsense and non-synonymous variants as well as deletions in LEP have been reported [5]. The functional effects of these mutations are diverse. For instance, a deletion (p.Ile35del) detected in two homozygous obese patients leads to a complete loss of the second exon of LEP and the removal of an isoleucine from the N-terminus of the protein [16, 17]. Additionally, the non-synonymous variant p.Asp100Tyr was detected in an extremely obese boy from a consanguineous family. He showed high serum leptin levels and a pronounced history of infections. Functional analyses revealed normal leptin expression and secretion but a bio-inactive leptin that did not induce Stat3 phosphorylation [6, 14]. In 2017, Nunziata et al. [18] estimated the prevalence of putatively damaging mutations in the LEP gene using the Exome Aggregation Consortium (ExAC) database. Based on data from 60,706 samples, it was estimated that one in 15,000,000 individuals is potentially a homozygous carrier of a deleterious LEP mutation (determined by in silico tools), while approximately one in 2,000 individuals harbours a heterozygous leptin variant [18]. Upon inclusion of functionally relevant LEP variants described in the literature, the authors estimated higher prevalence of hetero- and homozygosity of 1:1,050 and 1:4,400,000, respectively [18]. Since then, ExAC has been expanded into the Genome Aggregation Database (gnomAD), which includes more than 140,000 samples (version v2.1.1) [19]. Therefore, we aimed to estimate the prevalence of putatively deleterious non-synonymous, frameshift and nonsense (loss-of-function; LoF) mutations in the LEP gene based on this extended dataset represented in gnomAD v2.1.1.

Materials and methods

gnomAD

The gnomAD database (https://gnomad.broadinstitute.org/, accessed: Jan 24th, 2022) [19] encompasses 15,708 whole-genome and 125,748 exome sequencing datasets from individuals of various populations (v2.1.1, GRCh37/hg19), comprising more than 200 million genetic variants. The sequencing data predominantly originate from case-control studies of diseases diagnosed in adulthood, such as cardiovascular diseases or psychiatric disorders. To ensure high data quality, all samples were subjected to quality control, excluding samples with low sequencing quality, samples from second-degree or closer relatives, and data from patients with severe paediatric diseases. In total, six global and eight sub-continental populations are included, while populations from the Middle East, Central and Southeast Asia, Oceania and Africa are generally underrepresented. The mean coverage of the LEP gene was ~80x for exome and ~30x for genome data [19].

Leptin variants and their predicted functional implications

In gnomAD, the LEP gene (canonical transcript ENST00000308868.4) was analysed, and data pertaining to non-synonymous and LoF variants as well as the corresponding population-specific allele counts and frequencies were extracted (see S1 Table). Consequences of non-synonymous variants for the leptin protein were predicted with various in silico tools, namely Sorting Intolerant From Tolerant (SIFT) [20], Polymorphism Phenotyping v2 (PolyPhen2) [21], MutationTaster2021 [22], Functional Analysis through Hidden Markov Models–multiple kernel learning (FATHMM-MKL) [23] and Protein Variation Effect Analyzer (PROVEAN) [24]. Predictions by SIFT, FATHMM-MKL and PROVEAN were obtained with the help of the Variant Effect Predictor (VEP) [25]. For LoF variants, gnomAD indicates whether the respective variant is a high- or low-confidence LoF based on results of either the LOFTEE tool or a manual curation, shown below the VEP information on gnomAD’s variant page [19, 26]. SIFT classifies variants as either ‘tolerated’ or ‘deleterious’, while PolyPhen2 categorizes the mutations into ‘benign’, ‘possibly damaging’ and ‘probably damaging’. For PolyPhen2, the ‘HumVar’ classifier model was applied. MutationTaster2021 itself subjects each variant to several in silico tools and subsequently annotates each substitution as either ‘benign’ or ‘deleterious’. FATHMM-MKL and PROVEAN classify the variants into two categories: ‘neutral’ and ‘damaging’. Except for MutationTaster2021, all these tools exclusively analyse non-synonymous variants. Thus, we annotated frameshift mutations with the LoF confidence predictions stated on the variant’s page. To obtain additional hints for a putative clinical significance of a given variant (non-synonymous and LoF), the database ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) [27] was checked.
Based on the preceding in silico analyses, the probability of hetero- and homozygous variants predicted to be pathogenic was calculated by applying the Hardy-Weinberg equilibrium (HWE) under the assumption of an ideal population (see Eq (1); p = allele frequency of allele A, q = allele frequency of allele a), as performed by Nunziata et al. [18].

$$p^2 + 2pq + q^2 = 1 \qquad (1)$$

Hence, the prevalence of heterozygous ($2pq$) and homozygous (including compound heterozygous; $q^2$, with $q$ being the combined frequency of all putatively pathogenic alleles) variants was determined (see Eq (1)). To assess the prevalence of homozygous variants alone, the squared frequencies ($q^2$) of the individual alleles were calculated and subsequently summed up. Subtracting this prevalence of homozygosity from the prevalence of homozygous including compound heterozygous variants yielded the corresponding frequency of compound heterozygotes. When analysing the individual populations, substitutions were considered pathogenic if at least two of the applied in silico tools identified the variant as ‘damaging’ or ‘deleterious’ or if it was a high-confidence LoF variant.

Further, a literature search was performed. The PubMed database was screened for the term ‘congenital leptin deficiency’ and for each individual non-synonymous or LoF variant extracted from gnomAD, to compile a list containing all obese subjects carrying a LEP variant and putative functional implications. This list was extended with references for each individual variant deposited in NCBI (https://www.ncbi.nlm.nih.gov/), Online Mendelian Inheritance in Man (OMIM; https://www.omim.org/), Ensembl (https://www.ensembl.org/) and LitVar (https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar/) and with two review articles [18, 28]. Allele counts were derived from gnomAD.
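The Hardy-Weinberg calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual pipeline: the function and class names are hypothetical, and the example pools the 67 heterozygous carriers reported across all populations into a single allele frequency under the assumption that all 141,456 samples (282,912 alleles) were called at every site.

```python
from dataclasses import dataclass


@dataclass
class HwePrevalence:
    heterozygous: float         # 2pq
    hom_or_compound_het: float  # q^2 with q summed over all variants
    homozygous: float           # sum of the per-variant q_i^2
    compound_het: float         # difference of the two values above


def hwe_prevalence(allele_freqs):
    """Estimate genotype prevalence from per-variant allele frequencies."""
    q = sum(allele_freqs)       # combined frequency of all pathogenic alleles
    p = 1.0 - q
    hom_or_compound = q ** 2
    homozygous = sum(f ** 2 for f in allele_freqs)
    return HwePrevalence(
        heterozygous=2 * p * q,
        hom_or_compound_het=hom_or_compound,
        homozygous=homozygous,
        compound_het=hom_or_compound - homozygous,
    )


def one_in(prevalence):
    """Express a prevalence as the N in '1:N'."""
    return 1.0 / prevalence


# Illustration only: one pooled allele frequency from 67 heterozygous carriers,
# assuming 2 * 141,456 called alleles (an assumption of this sketch).
pooled = hwe_prevalence([67 / (2 * 141_456)])
print(round(one_in(pooled.heterozygous)))
print(round(one_in(pooled.hom_or_compound_het), -4))
```

Under these assumptions the pooled estimate lands close to the reported 1:2,100 and 1:17,830,000. Separating homozygous from compound heterozygous prevalence requires the per-variant allele frequencies (S1 Table), which would be passed as individual list entries instead of one pooled value.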

Results

In total, 75 non-synonymous and LoF variants in the LEP gene were deposited in gnomAD. Of these, 68 were non-synonymous (90.67%), five were frameshift (6.67%) and one each was an in-frame deletion (1.33%) or splice acceptor variant (1.33%). Across all populations, the non-synonymous variant rs17151919 (p.Val94Met) was the most frequent polymorphism with an overall allele frequency (AF) of 0.84% (see Table 1 and S1 Table). A total of 105 homozygous and 2,167 heterozygous carriers of rs17151919 were observed (see S1 and S2 Tables). Yet, the in silico tools predicted this variant to be non-pathogenic (see S2 Table).

Table 1

Summary of non-synonymous and LoF variants in the LEP gene deposited in gnomAD.

Population	Sample size	Total number of variants*	Number of non-synonymous variants	Number of LoF variants	Most common variant (AF)
All populations	141,456	75	68	7	rs17151919 (0.84%)
All populations (females)	64,754	49	45	4	rs17151919 (1%)
All populations (males)	76,702	50	45	5	rs17151919 (0.71%)
African-American	12,487	13	12	1	rs17151919 (8.41%)
Ashkenazi Jewish	5,185	1	1	0	rs17151919 (0.26%)
East-Asian	9,977	14	13	1	rs148407750 (0.31%)
European, Finnish	12,562	3	3	0	rs751272426 (0.04%)
European, non-Finnish	64,603	41	37	4	rs17151919 (0.04%)
Latino/Admixed American	17,720	14	14	0	rs17151919 (0.45%)
Others^a	3,614	8	8	0	rs17151919 (0.38%)
South Asian	15,308	16	14	2	rs17151919 (0.03%)

Table 1 summarizes the number of LEP variants (*non-synonymous and LoF) deposited in gnomAD (see S1 Table for full dataset). The most common variants detected in various populations were predicted to be benign (see S2 Table). The population with the term ‘Others’ refers to individuals of mixed population, for whom an unambiguous ethnicity could not be assigned (a). AF: allele frequency. LoF: loss-of-function.

Considering the populations individually, the samples of the group ‘Others’ for which no population could unambiguously be assigned, showed the highest occurrence of non-synonymous and LoF variants when correcting for the respective population size and assuming that all variants are equally frequent (0.0022; eight variants in total; see Table 1 and S1 Table). Occurrence rates of LEP variants in populations from East and South Asian countries were lower with 0.0014 (total of 14 variants) and 0.0010 (total of 16 variants), respectively. Within the African-American population, non-synonymous and LoF variants showed a population size-corrected frequency of 0.00104 (total of 13 variants). Lower occurrences were detected in the Latino/Admixed population (0.0008; total of 14 variants), the European, non-Finnish (0.0006; total of 41 variants), the European, Finnish (0.00024; total of three variants) and the Ashkenazi Jewish population (0.0002; one variant; see Table 1 and S1 Table). Generally, for all populations, the majority of variants was annotated as non-synonymous mutations (> 85%) and were rare (AF < 1%, see Table 1 and S1 Table). Solely, the non-synonymous and putatively benign single nucleotide polymorphism (SNP) rs17151919 (see S1 and S2 Tables) was frequent in African-Americans with an AF of 8.4%. Further, this SNP was the most commonly detected variant in European, non-Finnish individuals as well as in the African-American, Latino/Admixed American, Ashkenazi Jewish, South Asian and ‘Others’ populations (see Table 1 and S1 Table). Conversely, in Finnish samples, the variant rs751272426 (AF = 0.0047%) was the most common, while rs148407750 (AF = 0.311%) was the most abundant variant in people from East Asia.

To assess functional implications of the LEP variants in gnomAD, we initially analysed the variants with various in silico tools (see S2 Table). Accordingly, SIFT assigned 19 variants as ‘deleterious’, while PolyPhen2 predicted 13 variants to be ‘possibly damaging’ and 16 to be ‘probably damaging’. Ten variants were assigned as ‘deleterious’ by MutationTaster2021. ‘Damaging’ classifications for 24 and 20 variants were obtained by FATHMM-MKL and PROVEAN, respectively (see Fig 1 and S2 Table). Additionally, five of six LoF variants were indicated to be high-confidence LoF variants or in-frame deletions (see S2 Table).
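The population-size-corrected occurrence rates quoted for Table 1 correspond to the number of variants divided by the sample size of each population. The following sketch recomputes them from the Table 1 values; the dictionary layout and function name are hypothetical and serve only as an illustration.

```python
# (sample size, number of non-synonymous and LoF variants), taken from Table 1
TABLE_1 = {
    "African-American": (12_487, 13),
    "Ashkenazi Jewish": (5_185, 1),
    "East-Asian": (9_977, 14),
    "European, Finnish": (12_562, 3),
    "European, non-Finnish": (64_603, 41),
    "Latino/Admixed American": (17_720, 14),
    "Others": (3_614, 8),
    "South Asian": (15_308, 16),
}


def variants_per_sample(table):
    """Number of distinct variants divided by the population's sample size."""
    return {pop: n_var / n_samples for pop, (n_samples, n_var) in table.items()}


rates = variants_per_sample(TABLE_1)
# Print the populations from highest to lowest corrected occurrence.
for pop, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pop:26s} {rate:.5f}")
```

Note that this measure treats every variant as equally frequent, as stated in the text, so it reflects variant diversity per sampled individual rather than carrier prevalence.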

Fig 1

Predictions of the applied in silico tools.

All 75 non-synonymous and LoF variants located in LEP were analysed with SIFT, PolyPhen2, MutationTaster2021, FATHMM-MKL and PROVEAN. Except for MutationTaster2021, none of the tools was able to predict implications of the seven LoF variants (grey).

In ClinVar, which was examined as an additional pathogenicity prediction tool, solely six of the 75 non-synonymous and LoF gnomAD variants were deposited (not shown). Of these, four were of ‘uncertain significance’, while the remaining two were classified as ‘benign’ (rs17151919 and rs28954113). The preceding in silico analyses had already assigned rs17151919 as ‘benign’, whereas rs28954113 was classified as ‘deleterious’ by all five computational tools. Further, previous studies have reported clinical cases with severe obesity caused by the amino acid exchange of rs28954113 (p.Asn103Lys) [6, 29–31]. Accordingly, we retained the classification of rs28954113 as ‘pathogenic’. Thus, twenty-two variants across all populations were predicted to be ‘benign’ (see Table 2 and S2 Table). Fifty-three variants were indicated to be ‘pathogenic’ by at least one in silico tool, while 32 and 19 revealed a pathogenic effect in at least two and three tools, respectively.

Collectively, approximately one in 53 individuals will be a carrier of a non-synonymous or LoF variant located in LEP regardless of the pathogenicity (see Table 2). The prevalence of a homozygous and a compound heterozygous variant was ~1:14,100 and ~1:50,000, respectively. Slightly lower prevalence estimates were obtained for variants predicted to be ‘benign’ (see Table 2 and S2 Table). Generally, for variants across all populations indicated to be pathogenic by at least one tool, the prevalence of compound heterozygosity is higher than the prevalence of homozygous variants (see Table 2).

Table 2

Estimated prevalence of hetero- and homozygous as well as compound heterozygous variants across all populations.

Number of tools predicting pathogenic effect	Number of (pathogenic) variants*	Number of heterozygous carriers of pathogenic variants	Number of homozygous carriers of pathogenic variants	Estimated prevalence of heterozygous mutations	Estimated prevalence of homozygous and compound heterozygous mutations	Estimated prevalence of homozygous mutations	Estimated prevalence of compound heterozygous mutations
0	22^{a, b}	2,256^b	105^b	1: 58	1: 13,200	1: 14,200	1: 186,000
≥ 0	75^b	2,486^b	105^b	1: 53	1: 11,000	1: 14,100	1: 50,000
≥ 1	53	230	0	1: 616	1: 1,510,000	1: 7,050,000	1: 1,930,000
≥ 2	32	67	0	1: 2,100	1: 17,830,000	1: 328,200,000	1: 18,850,000
≥ 3	19	32	0	1: 4,400	1: 78,160,000	1: 741,000,000	1: 87,410,000

Here, the rounded estimated prevalence for variants across all populations applying different pathogenicity definitions is presented. Variants were considered deleterious if the stated number of in silico tools revealed a pathogenic prediction (*). If no tool indicated a damaging effect (a), the variant is likely ‘benign’. Due to varying allele frequencies across the individual populations, the Hardy-Weinberg equilibrium is not fulfilled when investigating all or exclusively benign variants (b).

Further, when applying various pathogenicity definitions based on the number of in silico tools predicting a damaging effect, it is evident that the more stringent this definition, the lower the prevalence (see Table 2). Consequently, we decided to classify variants as ‘pathogenic’ if at least two in silico tools indicated a deleterious impact (definition applied for subsequent analyses). A total of 67 individuals throughout all populations carried at least one of these variants heterozygously, while no homozygous carriers were detected. Hence, the estimated prevalence of heterozygosity for a putatively harmful LEP mutation was approximately 1:2,100, while the prevalence for a homozygous variant was ~1:17,830,000 for individuals of all populations (see Tables 2 and 3).
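The effect of the stringency of the pathogenicity definition can be illustrated with a small sketch that counts, for each threshold, how many variants are called pathogenic by at least that many tools. The variant names and predictions below are hypothetical placeholders rather than entries from S2 Table, and counting PolyPhen2's ‘possibly damaging’ as a pathogenic call is an assumption of this sketch.

```python
# Labels treated as a pathogenic call. Counting 'possibly damaging' as
# pathogenic is an assumption made for this illustration.
PATHOGENIC_LABELS = {
    "deleterious", "damaging", "possibly damaging", "probably damaging",
}


def pathogenic_calls(predictions):
    """Number of tools whose prediction label counts as pathogenic."""
    return sum(1 for label in predictions.values() if label in PATHOGENIC_LABELS)


def count_by_threshold(variants, max_tools=5):
    """For each k, count variants called pathogenic by at least k tools."""
    return {
        k: sum(1 for preds in variants.values() if pathogenic_calls(preds) >= k)
        for k in range(max_tools + 1)
    }


# Hypothetical predictions for three made-up variants.
EXAMPLE = {
    "variant_A": {"SIFT": "deleterious", "PolyPhen2": "probably damaging",
                  "MutationTaster2021": "deleterious",
                  "FATHMM-MKL": "damaging", "PROVEAN": "damaging"},
    "variant_B": {"SIFT": "tolerated", "PolyPhen2": "possibly damaging",
                  "MutationTaster2021": "benign",
                  "FATHMM-MKL": "damaging", "PROVEAN": "neutral"},
    "variant_C": {"SIFT": "tolerated", "PolyPhen2": "benign",
                  "MutationTaster2021": "benign",
                  "FATHMM-MKL": "neutral", "PROVEAN": "neutral"},
}

counts = count_by_threshold(EXAMPLE)
print(counts)
```

Because every variant that passes a threshold of k tools also passes k - 1, the counts can only stay equal or decrease as the definition becomes more stringent, which mirrors the decreasing prevalence in Table 2.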

Table 3

Estimated prevalence of putatively pathogenic LEP variants for the individual populations in gnomAD.

Population	Sample size	Number of putatively deleterious mutations*	Estimated prevalence for heterozygous mutations	Estimated prevalence for homozygous/compound heterozygous mutations
All populations	141,456	32	1: 2,100	1: 17,830,000
All populations (females)	64,754	19	1: 2,200	1: 18,640,000
All populations (males)	76,702	22	1: 2,100	1: 17,190,000
African-American	12,487	4	1: 1,800	1: 12,730,000
Ashkenazi Jewish	5,185	0	NA	NA
East-Asian	9,977	6	1: 770	1: 2,360,000
European, Finnish	12,562	0	NA	NA
European, non-Finnish	64,603	16	1: 2,700	1: 28,980,000
Latino/Admixed American	17,720	5	1: 3,000	1: 34,890,000
Others^a	3,614	5	1: 720	1: 2,090,000
South Asian	15,308	3	1: 1,300	1: 6,510,000

Estimated rounded prevalence for deleterious LEP variants is shown. Variants were considered deleterious if at least two in silico tools revealed a pathogenic prediction (*). Individuals of the ‘Others’ population could not be assigned unambiguously to one of the other ethnicities (a). NA: not available.

gnomAD further provides sex-specific allele counts for each variant (see Table 1). Thus, we replicated the probability estimations of possibly pathogenic variants (as defined above) for both sexes separately. This revealed that about one in 2,200 women carries a heterozygous and possibly harmful LEP variant. In males, the prevalence of a heterozygous variant was marginally higher with ~1:2,100. The chance to harbour a homozygous/compound heterozygous, pathogenic variant in females was estimated to be ~1:18,640,000. For males, this prevalence was again higher at ~1:17,190,000.

Next, we determined the likelihood of a putatively deleterious LEP variant in the distinct populations. As none of the variants detected in the Finnish and Ashkenazi Jewish populations was predicted to have a pathogenic effect, we were unable to calculate the corresponding prevalence (see Table 3 and S2 Table). Generally, pronounced variations between the global populations were observed (see Table 3). Individuals whose ethnicity could not be clearly assigned (‘Others’) were determined to be at highest risk of harbouring a putatively pathogenic LEP variant either hetero- or homozygously (see Table 3). The second greatest risk of being a pathogenic LEP variant carrier was detected for individuals of the East Asian population. Conversely, the lowest risk for both hetero- and homozygous variants was found in the Latino/Admixed population (see Table 3).

In order to expand the pathogenicity predictions with reported clinical cases, we performed a literature search (e.g. PubMed, OMIM, etc.) and found 20 variants reported in at least one clinical case (see S3 Table).
Of these, five were listed among the non-synonymous and LoF variants extracted from gnomAD. In turn, three of those had already been assigned as ‘pathogenic’ by our in silico analyses (by at least two tools). All other variants reported in a clinical case were not available in gnomAD. When we considered both the variants declared as ‘pathogenic’ by at least two in silico tools and the variants reported in a clinical case for our estimates, we obtained higher prevalence for heterozygous (1:1,300) as well as homozygous carriers (1:6,380,000) across all populations (see S4 Table). Likewise, higher or similar prevalence was found when repeating this calculation for the individual populations. Once again, individuals whose ethnicity could not be unambiguously assigned (‘Others’) and East Asians were at higher risk of being a carrier of a putatively pathogenic leptin variant (see S4 Table). Additionally, we conducted a literature search for any functional implication of the variants. In total, seven non-synonymous and LoF variants listed in gnomAD were found to be functionally characterised by either a comprehensive computational analysis or by in vitro studies (see S2 Table). Of these, six were already assigned as ‘pathogenic’ by our in silico analyses. Solely rs17151919 had previously been classified as ‘benign’ (by in silico tools and ClinVar), but was reported to be functionally relevant [32]. Calculating the prevalence estimates for variants that have been characterized as ‘pathogenic’ by at least two tools or have a functional relevance yielded estimates that were similarly elevated as those obtained upon inclusion of variants found in clinical cases (see S4 Table). This again held for all populations. Similarly higher prevalence estimates were obtained when variants from case reports as well as variants with a functional implication were added to the mutations predicted as ‘pathogenic’ by in silico tools (see S4 Table).
Again, the prevalence rates for the individual ethnicities vary considerably. Statistically, one in six African-Americans carries a heterozygous and pathogenic LEP variant, while the risk for being a carrier in Finnish Europeans is lower (1: 1,400; see S4 Table).

Discussion

Homozygous pathogenic mutations in the leptin gene lead to a deficiency of biologically active leptin and cause severe early-onset obesity [5, 7, 8, 11, 14]. Through the implementation of reference databases, such as ExAC and gnomAD, prevalence assessments of potentially harmful variants in the general population have become feasible. Yet, solely one study has explored the prevalence of LEP variants in the general population using these reference datasets [18]. As of today, the gnomAD database is the largest publicly available repository of genetic variant data [26]. More than 125,000 exome and 15,000 whole-genome sequence datasets are contained in gnomAD v2.1.1 [19]. Based on these datasets, it has been estimated that each individual carries approximately 200 coding variants with allele frequencies less than 0.1%. Despite the large sample size, gnomAD will lack on average 27 ± 13 novel coding mutations per exome based on the current number of samples included [26]. The data contained in gnomAD have been subjected to a stringent quality control excluding data of participants with known severe paediatric diseases and of related individuals [19, 26]. Notably, due to this removal of samples with known paediatric diseases, potentially relevant and pathogenic variants with regard to early manifested obesity may have been omitted. Additionally, variation data from global cohorts are deposited in gnomAD. Still, non-Finnish European samples are overrepresented, while samples from the Middle East, Central Asia and Africa are generally underrepresented [19]. Congenital leptin deficiency caused by mutations in leptin is more prevalent in patients from Pakistan and the Middle East [5, 7, 11, 16], yet data pertaining to deleterious leptin mutations in the general Middle Eastern population are lacking. It can be assumed that a higher incidence of putatively harmful variants might be observed in these populations.
Additionally, no individual-level phenotype data is available. Thus, it is unclear whether the datasets might be skewed for overweight or obese individuals, which is feasible considering the globally increasing prevalence of both [33]. Across all populations, we detected that approximately one in 2,100 carries a potentially deleterious (at least two in silico tools indicated a pathogenicity) heterozygous variant in LEP. In addition, the prevalence of a homozygous variant across all populations was about 1:17,830,000. Despite the larger sample size and a resultant greater number of variants in gnomAD, our results resemble the estimated prevalence based on the ExAC database reported by Nunziata and colleagues [18]. Further, when including variants reported in clinical cases and functional studies, we also obtained higher prevalence rates for hetero- and homozygosity and were thus able to confirm the previous findings [18]. Heterozygous variants were estimated to be more prevalent in the general public. Previously, these were predominantly detected in healthy unaffected individuals [5, 7]. Heterozygous carriers generally show lower BMI z-scores and lower body fat percentages than homozygous individuals [34]. For some variants in LEP, heterozygous carriers suffering from obesity have been reported [35-37]. Still, it can be assumed that heterozygous variants generally have an additive effect on the carrier’s body weight. Across all populations, we report that these compound heterozygous variants are less prevalent than single heterozygous mutations but more frequent than homozygosity. In addition, we observed deviations in prevalence rates between populations. For example, individuals from East Asia and individuals, for whom no ethnicities could be determined (‘Others’), showed a higher prevalence of both hetero- and homozygous mutations than other populations. Strong disparities were also evident at the SNP level. 
For instance, the polymorphism rs17151919 was generally infrequently detected. In African-Americans, however, it was a common variant (AF > 5%), which has already been reported for other study groups [32]. Further, an African-American-specific association of rs17151919 with lower leptin levels was reported. This SNP was also associated with a higher BMI in African children, but not in adults [32]. We are aware that in silico tools are no substitute for functional in vitro analyses. This is particularly evident for the deletion p.Ile35del, as neither gnomAD nor most in silico tools provide predictions of its functional implications. Yet, it is known that this deletion causes the loss of exon 2 of the LEP gene and thus a congenital leptin deficiency with resultant obesity [5, 16]. Additionally, the performance of the individual tools varies considerably, even across different populations and variant types [38, 39]. For instance, SIFT and the predecessor of PolyPhen2, PolyPhen, were found to perform better when predicting LoF than gain-of-function variants [38]. Likewise, the pathogenicity of variants with an AF < 1% across all populations or variants with an AF between 1 and 25% in individual ethnicities was shown to be more challenging to predict accurately [39]. Previously, one study demonstrated that SIFT and PROVEAN yield the most accurate prediction of pathogenicity, while MutationTaster2021 and FATHMM had comparatively low accuracy and specificity [40]. Conversely, other studies have shown that especially SIFT, PolyPhen2 and MutationTaster2021 exhibited a high sensitivity but a low specificity [39, 41]. Thus, the usage and evaluation of diverse tools appears to be essential. Initially, we tested how the number of tools indicating a pathogenic effect affected our prevalence estimates (see Table 2). We observed that the more stringently this criterion of pathogenicity was defined, the lower the obtained prevalence.
Accordingly, we classified variants as potentially harmful if at least two tools indicated a damaging effect. Still, it remains uncertain whether these classifications can be corroborated by clinical and functional data. Due to these considerations, we additionally checked the ClinVar database to obtain further pathogenicity indications and performed a literature search to find reported clinical cases carrying LEP variants and to identify mutations that have been described to be functionally relevant (see S2 Table). Notably, ClinVar solely contained six of the 75 non-synonymous and LoF variants listed in gnomAD. The majority of those were of ‘uncertain significance’, while two were assigned as ‘benign’. Interestingly, one of the latter was predicted to be ‘pathogenic’ by all computational tools investigated here. This pathogenic indication was even supported by multiple clinical cases of severe obesity (see S3 Table) [6, 29–31]. Hence, further research is urgently required to elucidate the unambiguous significance of many LEP variants for the phenotype of congenital leptin deficiency. Similarly, the literature search screening e.g. PubMed, OMIM and LitVar identified 20 LEP mutations in total that were detected in at least one obese individual. Again, solely five of those were included in the variant list extracted from gnomAD. The non-synonymous variants p.Asp100Asn (rs724159998) [42] and p.Asn103Lys (rs28954113) [6, 29–31], the frameshift mutation p.Gly133ValfsTer15 (rs1307773933) [7, 15–17, 43–45] and the in-frame deletion p.Ile35del (rs747703977) [16, 17] were detected in extremely obese patients being homozygous carriers [5]. Of the five variants, solely one mutation (rs1800564) deviated in its pathogenicity predictions. Likewise, seven variants from functional studies were included in the gnomAD list. Again, the only variant showing a deviating pathogenicity classification between the in silico analyses and the literature was rs17151919.
As we have detected higher prevalence rates when including variants reported either in clinical cases or functional studies, these might be caused by the allele counts and frequencies of rs17151919 and rs1800564. These two variants were more frequently found than the rest of the non-synonymous and LoF variants in gnomAD. For instance, relative to the majority of the as ‘pathogenic’ predicted variants by the in silico tools, rs1800564 has been found more frequently in the gnomAD population (43 heterozygous carriers) and thus presumably accounts for the higher prevalence rates. The same applies to the SNP rs17151919. Despite all these remarks that need to be considered in the interpretation of our results, in silico tools do help to gain preliminary indications of putatively pathogenic variants. It is even recommended by the American College of Medical Genetics and Genomics (ACMG) and the European Society of Human Genetics (ESHG) to use computational predictions to support the interpretation of variants [39]. Generally, the application of the HWE is affected by several factors, like mutations, natural selection, non-random mating, genetic drift, gene flow, population structures and sizes [46, 47]. For instance, for the fulfilment of the HWE an infinite population size is assumed. Yet, this can never be met by any population in nature [47]. Further, the ‘Wahlund effect’ influences the HWE. In populations with multiple subpopulations, individuals might mate within those subpopulations but never between them, resulting in an underestimation of homozygotes by the HWE in the overall population [46]. Generally, it is challenging to predict the impact of the gnomAD populations and their characteristics on the HWE and thus our results.
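The sensitivity of the HWE to population structure can be illustrated with the carrier counts of rs17151919 reported in the Results (2,167 heterozygous and 105 homozygous carriers). The sketch below compares the observed homozygote count with the count expected under HWE for the pooled allele frequency; it assumes that all 141,456 samples were genotyped at this site, and the function name is hypothetical.

```python
def expected_homozygotes(n_het, n_hom, n_samples):
    """Homozygote count expected under HWE for the pooled allele frequency."""
    q = (n_het + 2 * n_hom) / (2 * n_samples)  # pooled allele frequency
    return q ** 2 * n_samples


# Carrier counts for rs17151919 as reported in the Results; treating all
# 141,456 samples as genotyped at this site is an assumption of this sketch.
observed_hom = 105
expected_hom = expected_homozygotes(n_het=2_167, n_hom=105, n_samples=141_456)
print(f"expected under HWE: {expected_hom:.1f}, observed: {observed_hom}")
```

Under these assumptions, the pooled HWE expectation is roughly one tenth of the observed number of homozygotes. Such an excess is what would be expected when an allele is concentrated in one subpopulation (here, AF of 8.41% in African-Americans versus 0.84% overall), in line with the note to Table 2 that the HWE is not fulfilled when benign variants are included.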

Conclusion

The gnomAD database is the largest publicly available reference dataset including various global study groups. Using this dataset, we estimated the prevalence of putatively damaging leptin variants. We identified 32 possibly damaging mutations in 67 heterozygous and no homozygous carriers. Across all populations, the estimated prevalence of heterozygosity was roughly 1:2,100, while that of homozygosity was 1:17,830,000. When each study group was investigated separately, this prevalence varied considerably, with individuals of mixed, unknown ethnicity (‘Others’) and East Asians being at greater risk of harbouring a hetero- or homozygous mutation with a harmful consequence. Yet, higher prevalence estimates for functionally relevant variants were obtained when variants from reported clinical cases and functional studies were included. In general, mutations in the LEP gene, which frequently result in congenital leptin deficiency, are extremely rare in the general population. Continued analysis of leptin mutations alongside phenotypic and clinical data may improve our understanding of monogenic obesity.
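As a rough plausibility check of the two prevalence figures above, the Hardy-Weinberg relation can be applied to the rounded heterozygote prevalence. This is a simplified sketch and not the authors' calculation: it assumes a single combined allele frequency q for all putatively pathogenic variants and back-calculates it from 2pq = 1/2,100. The function names are illustrative.

```python
import math


def allele_freq_from_het_prevalence(het: float) -> float:
    """Solve 2*q*(1 - q) = het for the smaller root q (the rare allele)."""
    return (1.0 - math.sqrt(1.0 - 2.0 * het)) / 2.0


def one_in(frequency: float) -> float:
    """Express a frequency as '1 in N' and return N."""
    return 1.0 / frequency


# Rounded heterozygote prevalence reported in the text (1:2,100).
q = allele_freq_from_het_prevalence(1 / 2100)
hom = q * q  # expected homozygote frequency under HWE

print(f"combined allele frequency q: about 1 in {one_in(q):,.0f}")
print(f"expected homozygotes:        about 1 in {one_in(hom):,.0f}")
```

Under these assumptions the sketch gives a homozygote prevalence of roughly 1 in 17.6 million, the same order of magnitude as the reported 1:17,830,000. A small difference is to be expected, presumably because the published estimate was derived from the underlying allele frequencies rather than from the rounded heterozygote figure.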

Raw data as extracted from gnomAD.

This file contains the raw data as extracted from gnomAD [19], comprising the 75 non-synonymous and LoF LEP variants with their corresponding allele counts and frequencies. It represents the minimal dataset underlying the performed analyses. (XLSX)

Summary of the results of the in silico analyses of LEP variants deposited in gnomAD.

This represents the in silico predictions for each variant present in all populations by various tools, namely SIFT [20], PROVEAN [24], PolyPhen2 [21], MutationTaster2021 [22] and FATHMM-MKL [23]. The tools are ordered by their reported accuracy (left: highest accuracy; right: lowest accuracy) [40]. The literature references refer either to a reported clinical case or a functional study. (PDF)

Reported cases with LEP mutations.

This represents all reported clinical cases with obesity and/or congenital leptin deficiency carrying a LEP variant. Not all reported variants were listed in gnomAD. NA: not available. (PDF)

Estimated prevalence for the populations in gnomAD based on in silico tools and literature references.

This presents the prevalence estimations using varying definitions of pathogenicity. Here, all variants are classified as ‘pathogenic’ if they were predicted as harmful by at least two in silico tools and were either reported in a clinical case or in a functional study. Further, solely variants listed in the non-synonymous and LoF LEP variants of gnomAD were included in the calculations. NA: not available. (PDF)

12 May 2022

PONE-D-22-08349

Prevalence Estimates of Putatively Pathogenic Leptin Variants in the gnomAD Database

PLOS ONE Dear Dr. Hinney, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected. Specifically: This study does not provide any novel information to the scientific world as mentioned by one of the reviewer. However I am sure comments of the learned reviewer will help you to do a better analysis at not only a gene level but at genome or disease level. I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision. Kind regards, Tiratha Raj Singh Academic Editor PLOS ONE

Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: N/A ********** 3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Dear authors, It was my pleasure to read your manuscript entitled “Prevalence Estimates of Putatively Pathogenic Leptin Variants in the gnomAD Database”, but unfortunately I am suggesting the journal to reject it for publication. My recommendation is based on the fact that it doesn’t add anything new and the whole paper is actually based on a very simple calculation. How many potentially pathogenic variants have been identified in the investigated gene, extract the combined minor allele frequency and then by using the Hardy Weinberg equilibrium, to estimate the prevalence of affected individuals. 
As I hope you can understand, this calculation while valid (I would suggest though using the ACMG criteria and not merely basing potential pathogenicity on a majority vote among prediction algorithms) and well presented, is not enough for a full research article. If you are interested in pursuing further this type of research I would suggest to concentrate not on one gene, but a disease, correlate with clinical data and draw conclusions about the concordance or discordance between them, while making a mini literature review of the subject. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] - - - - - For journal use only: PONEDEC3 31 May 2022 Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Response: We thank the reviewer for the positive evaluation. ________________________________________ 2. Has the statistical analysis been performed appropriately and rigorously? 
Reviewer #1: N/A Response: We have performed the (statistical) analysis appropriately and rigorously. We have even added more analyses and data regarding the leptin gene as a comparable study very recently published (PLoS One) for the FTO gene in weight regulation (Souza Junior et al. 2022 PMID: 34990463). ________________________________________ 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Response: We again thank the reviewer for the positive evaluation. ________________________________________ 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Response: We thank the reviewer. ________________________________________ 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Dear authors, It was my pleasure to read your manuscript entitled "Prevalence Estimates of Putatively Pathogenic Leptin Variants in the gnomAD Database", but unfortunately I am suggesting the journal to reject it for publication. My recommendation is based on the fact that it doesn't add anything new and the whole paper is actually based on a very simple calculation. How many potentially pathogenic variants have been identified in the investigated gene, extract the combined minor allele frequency and then by using the Hardy Weinberg equilibrium, to estimate the prevalence of affected individuals. As I hope you can understand, this calculation while valid (I would suggest though using the ACMG criteria and not merely basing potential pathogenicity on a majority vote among prediction algorithms) and well presented, is not enough for a full research article. If you are interested in pursuing further this type of research I would suggest to concentrate not on one gene, but a disease, correlate with clinical data and draw conclusions about the concordance or discordance between them, while making a mini literature review of the subject. Response: We are grateful for the very positive evaluation and that the manuscript was a pleasurable read. Thus, we hope that other reader would also profit from our manuscript. We like to point out our novel findings. Our analysis adds to the scientific knowledge pertaining to the leptin gene, as we have calculated not only the prevalence of homozygotes and heterozygotes of leptin variants, but also of carriers of compound heterozygous variants. This has never been reported before for leptin variants derived from a public database. We are the first to report frequencies derived from the currently largest database comprising more the 125,000 individuals (Gnomad). We have also calculated population-specific and gender-specific prevalences. 
These analyses have not been published before. We have also checked a large number of in silico tools to obtain educated knowledge about the functional relevance or pathogenicity of the detected variants. Thus, our study adds substantial knowledge to genetic variations in the leptin gene and the implications for obesity in general. To point out that the HWE test is appropriate (as pointed out by the reviewer) we added a respective reference (Lines 128-129): ‘assumption of a perfect population (see Equation (1); p = allele frequency of allele A, q = allele frequency of allele a) as performed by Nunziata et al. (17).’ Submitted filename: Response to Reviewers.docx

12 Jul 2022

PONE-D-22-08349R1

Prevalence Estimates of Putatively Pathogenic Leptin Variants in the gnomAD Database

PLOS ONE Dear Dr. Hinney, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Aug 26 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Alvaro Galli Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. 
We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section. 3. Please expand the acronym “BMBF” (as indicated in your financial disclosure) so that it states the name of your funders in full. This information should be included in your cover letter; we will change the online submission form on your behalf. 4. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability. Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized. Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. 
Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter. 5. Please upload a new copy of Figure 1 as the detail is not clear. Please follow the link for more information: https://blogs.plos.org/plos/2019/06/looking-good-tips-for-creating-your-plos-figures-graphics/ Additional Editor Comments (if provided): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #2: (No Response) Reviewer #3: (No Response) Reviewer #4: (No Response) ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #2: Yes Reviewer #3: Partly Reviewer #4: Partly ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: N/A Reviewer #3: N/A Reviewer #4: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #2: Yes Reviewer #3: Yes Reviewer #4: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes Reviewer #3: Yes Reviewer #4: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #2: Introduction: Appropriate Methodology: In accordance to goal of study, Used relative tools and equations. Tools used for various analysis are standard and have been used in similar studies. Statistical analysis: Statistics applied is as per study demand Results: Presented in comprehensive way and highlight the aim of study Discussion: Could add more discussion on the methodology and tools used for prevalence estimations, this will add more weight-age and authenticity to results Overall: Study design is good. Methodology is appropriate. 
Results are in accordance with previous reported individual studies. Discussion might need more substance Reviewer #3: The authors submitted the prevalence estimates for homozygous/compound heterozygous LEP pathogenic variants. They estimated the pathogenicity of the variants registered to gnomAD from the results of several in silico tools' analysis results, and calculated the prevalence from the allele frequencies registered in the gnomAD based on the estimated pathogenicity. I strongly disagree to estimate the pathogenicity of variants only by in silico tools. At least, clinical information is needed. This kind of studies, we usually pick up only variants which were evaluated pathogenic by clinical databases such as ClinVar or HGMD professional. Those databases evaluate the pathogenicity by their own algorithms including clinical information. When we have clinical data, we can evaluate the pathogenicity by ourselves using ACMG criteria. However, this study lacks the evaluation of clinical information. I don't think this algorithm can determine the pathogenicity of varints in LEP. Reviewer #4: Thank you for the invitation to review and sorry for the delay. I have a few comments: 1. I think that through a systematic review the authors could derived greater confidence in which variants they are assessing as pathogenic or not. The phenotype of congenital leptin deficiency is very clear and with the widespread adpotion of exome sequencing for suspected monogenic obesity, the majority of pathogenic variants have probably been reported. In addition, there is extensive literature of in vitro functional characterisation of leptin mutants. The authors only have to consider a comparatively small number of variants (i.e. <100) so it would be possible to annotate each variant as to whether it had ever been described in a clinical case and whether it had been shown to be a loss of function (or hypomorphic) variant in vitro. 2. 
The leptin gene is highly resistant to loss of function variants and this analysis assumes that compound heterozygotes will be present at the expected rate based on the prevalence of the individual alleles. This may be true but I'm not sure we have evidence to support that. The public availability of the UK BioBank data would allow for identification of exactly the number of compound heterozygotes in a large cohort, though I appreciate this is a substantial undertaking compared to your current analysis. [It would also be biased by my point (3) below.] 3. The gnomAD database is primarily composed of individuals not known to have a monogenic disease. Leptin deficiency causes quite a dramatic phenotype of hyperphagia. I anticipate they they may be under-represented in gnomAD compared to their true prevalence. (Unlike for asymptomatic or very late-onset conditions.) ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #2: No Reviewer #3: No Reviewer #4: Yes: ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. 
Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

4 Aug 2022

Response to Editor

- The acronym ‘BMBF’ stands for ‘Bundesministerium für Bildung & Forschung’.
- Our ‘financial disclosure’ shall be as follows: ‘This study was funded by the Deutsche Forschungsgemeinschaft (DFG; A.H.: HI 865/2-1; P.F.P.: Heisenberg professorship; project number: 398707781), the Bundesministerium für Bildung & Forschung (BMBF; A.H.: 01GS0820; PALGER 2017-33: 01DH19010). We further acknowledge support by the Open Access Publication Fund of the University of Duisburg-Essen. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.’
- We have further added our minimal underlying dataset as our new S1 Table.
- We have updated the manuscript according to your recommendations.

Response to Reviewers

Response: We thank all Reviewers and the Academic Editor for the kind feedback to our manuscript “Prevalence estimates of putatively pathogenic leptin variants in the gnomAD database”! We hope that future readers will enjoy our work as well. Generally, we have corrected some grammatical errors. By incorporating the feedback below, we have done some restructuring, particularly of the discussion. As we have cross-checked all our calculations, we have discovered some minor errors (especially rounding errors). These were also corrected. Please find below our answers to the helpful comments (in italic). Our changes as found in the ms have been highlighted in red (see also Track Changes version). The lines stated are based on the Track Changes version (with a complete markup).
Reviewer #2: Introduction: Appropriate Methodology: In accordance to goal of study, Used relative tools and equations. Tools used for various analysis are standard and have been used in similar studies. Statistical analysis: Statistics applied is as per study demand Results: Presented in comprehensive way and highlight the aim of study Response: We thank the Reviewer for this positive feedback. Discussion: Could add more discussion on the methodology and tools used for prevalence estimations, this will add more weight-age and authenticity to results Response: We thank the Reviewer for this comment. We have added a deeper discussion regarding the applied in silico tools, as well as a short paragraph regarding factors influencing the Hardy-Weinberg-Equilibrium. Please find our changes in lines 341 ff., 386 ff. and 423 ff. Lines 341 ff.: Additionally, the performance of the individual tools varies considerably, even across different populations and variant types [38, 39]. For instance, SIFT and the predecessor of PolyPhen2, PolyPhen, were found to perform better when predicting LoF than gain-of-function variants [38]. Likewise, the pathogenicity of variants with an AF < 1% across all populations or variants with an AF between 1 and 25 % in individual ethnicities was shown to be more challenging to accurately predict [39]. Previously, one study has demonstrated that SIFT and PROVEAN yield the most accurate prediction of pathogenicity, while MutationTaster2021 and FATHMM had comparatively low accuracy and specificity [40]. Conversely, other studies have shown that especially SIFT, PolyPhen2 and MutationTaster2021 exhibited a high sensitivity but a low specificity [39, 41]. Thus, the usage and evaluation of diverse tools appears to be essential. Initially, we have tested, how the number of tools indicating a pathogenic effect, affected our prevalence estimates (see Table 2). 
We have seen that the more stringent this criteria of pathogenicity was defined, the lower the obtained prevalence. Accordingly, we classified variants as potentially harmful if at least two tools indicated a damaging effect. Still, it remains uncertain whether these classifications can be corroborated by clinical and functional data. Lines 386 ff.: Despite all these remarks that need to be considered in the interpretation of our results, in silico tools do help to gain preliminary indications of putatively pathogenic variants. It is even recommended by the American College of Medical Genetics and Genomics (ACMG) and the European Society of Human Genetics (ESHG) to use computational predictions to support the interpretation of variants [39]. Lines 423 ff.: Generally, the application of the HWE is affected by several factors, like mutations, natural selection, non-random mating, genetic drift, gene flow, population structures and sizes [46, 47]. For instance, for the fulfilment of the HWE an infinite population size is assumed. Yet, this can never be met by any population in nature [47]. Further, the ‘Wahlund effect’ influences the HWE. In populations with multiple subpopulations, individuals might mate within those subpopulations but never between them, resulting in an underestimation of homozygotes by the HWE in the overall population [46]. Generally, it is challenging to predict the impact of the gnomAD populations and their characteristics on the HWE and thus our results. Overall: Study design is good. Methodology is appropriate. Results are in accordance with previous reported individual studies. Discussion might need more substance Response: We again thank the Reviewer for this positive feedback. We are grateful that our work and its relevance was recognised. Reviewer #3: The authors submitted the prevalence estimates for homozygous/compound heterozygous LEP pathogenic variants. 
They estimated the pathogenicity of the variants registered to gnomAD from the results of several in silico tools' analysis results, and calculated the prevalence from the allele frequencies registered in the gnomAD based on the estimated pathogenicity. I strongly disagree to estimate the pathogenicity of variants only by in silico tools. At least, clinical information is needed. This kind of studies, we usually pick up only variants which were evaluated pathogenic by clinical databases such as ClinVar or HGMD professional. Those databases evaluate the pathogenicity by their own algorithms including clinical information. When we have clinical data, we can evaluate the pathogenicity by ourselves using ACMG criteria. However, this study lacks the evaluation of clinical information. I don't think this algorithm can determine the pathogenicity of varints in LEP. Response: We thank the Reviewer for this relevant critic. We are aware that in silico tools are no substitute to in vitro studies and studying these without clinical data might have some disadvantages. Still, we think that they give a reasonable first indication towards pathogenicity. To consider your comment, we have checked ClinVar and HGMD. Unfortunately, we do not have access to HGMD Professional. Yet, as another Reviewer suggested to perform a literature search to obtain information about all clinical data, we have already obtained all (and even additional) data deposited in the normal version of HGMD. In ClinVar, solely six of our 75 investigated variants of gnomAD were included. Of those, four were assigned to be of ‘uncertain significance’, while the remaining two (rs17151919 and rs28954113) were classified as ‘benign’. Yet, rs28954113 (p.Asn103Lys) is a variant that was reported in various clinical cases (see S3 Table) and was predicted to be ‘pathogenic’ by all five of our in silico tools. Thus, we retained the ‘pathogenic’ implication for this variant. 
Therefore, the ClinVar analysis could not add further information about the pathogenicity of the variants. We still included our ClinVar examination in the manuscript, while we did not include the information regarding HGMD. Please find the updated information in lines 130-131, 196 ff. and 358 ff.

Lines 130-131: To obtain additional hints for a putative clinical significance of a given variant (non-synonymous and LoF), the database ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) [27] was checked.

Lines 196 ff.: In ClinVar, which was examined as an additional pathogenicity prediction tool, solely six of the 75 non-synonymous and LoF gnomAD variants were deposited (not shown). Of these, four were of ‘uncertain significance’, while the remaining two were predicted to be ‘benign’ (rs17151919 and rs28954113). The preceding in silico analyses have already assigned rs17151919 as ‘benign’, whereas rs28954113 was classified as ‘deleterious’ by all five computational tools. Further, previous studies have reported clinical cases with severe obesity caused by the amino acid exchange of rs28954113 (p.Asn103Lys) [6, 29-31]. Accordingly, we retained the classification of rs28954113 as ‘pathogenic’.

Lines 358 ff.: Due to these considerations, we have additionally checked the ClinVar database to obtain additional pathogenicity indications and have performed a literature search to find reported clinical cases carrying LEP variants and to identify mutations that have been described to be functionally relevant (see S2 Table). Notably, ClinVar solely contained six of the 75 non-synonymous and LoF variants listed in gnomAD. The majority of those were of ‘uncertain significance’, while two were assigned as ‘benign’. Interestingly, one was predicted to be ‘pathogenic’ by all computational tools investigated here. This pathogenic indication was even supported by multiple clinical cases of severe obesity (see S3 Table) [6, 29-31].
Hence, further research is urgently required to elucidate the unambiguous significance of many LEP variants for the phenotype of congenital leptin deficiency.

Reviewer #4: Thank you for the invitation to review and sorry for the delay. I have a few comments:

1. I think that through a systematic review the authors could derive greater confidence in which variants they are assessing as pathogenic or not. The phenotype of congenital leptin deficiency is very clear and with the widespread adoption of exome sequencing for suspected monogenic obesity, the majority of pathogenic variants have probably been reported. In addition, there is extensive literature of in vitro functional characterisation of leptin mutants. The authors only have to consider a comparatively small number of variants (i.e. <100) so it would be possible to annotate each variant as to whether it had ever been described in a clinical case and whether it had been shown to be a loss of function (or hypomorphic) variant in vitro.

Response: We thank the Reviewer for this valid comment. We have thus performed a literature search in PubMed (for ‘congenital leptin deficiency’ and each individual variant), OMIM (‘leptin deficiency’), NCBI (each variant), Ensembl (each variant) and LitVar (each variant) to obtain information about reported clinical cases and functional implications. The determined variants can be found in S3 Table. We have further annotated in our S2 Table whether each variant was reported in a clinical case or functional study. If the resultant variants were included in the gnomAD non-synonymous and LoF variant list, we have recalculated the prevalence estimates when including variants of clinical cases and functional studies. The results are presented in S4 Table. Please find further information in lines 146 ff., 258 ff. and 368 ff.

Lines 146 ff.: Further, a literature search was performed.
The PubMed database was screened for the term ‘congenital leptin deficiency’ and each individual non-synonymous or LoF variant extracted from gnomAD, to compile a list containing all obese subjects carrying a LEP variant and putative functional implications. This list was extended with references for each individual variant deposited in NCBI (https://www.ncbi.nlm.nih.gov/), Online Mendelian Inheritance in Man (OMIM; https://www.omim.org/), Ensembl (https://www.ensembl.org/) and LitVar (https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar/) and two review articles [18, 28]. Allele counts were derived from gnomAD.

Lines 258 ff.: In order to expand the pathogenicity predictions with reported clinical cases, we have performed a literature search (e.g. PubMed, OMIM, etc.) and have found 20 variants reported in at least one clinical case (see S3 Table). Of these, five were listed in the non-synonymous and LoF variants extracted from gnomAD. In turn, three of those have already been assigned as ‘pathogenic’ by our in silico analyses (by at least two tools). Generally, all other variants reported in a clinical case were not available in gnomAD. When we considered the variants declared as ‘pathogenic’ by at least two in silico tools and variants reported in a clinical case for our estimates, we obtained a higher prevalence for heterozygous (1:1,300) as well as homozygous carriers (1:6,380,000) across all populations (see S4 Table). Likewise, higher or similar prevalences were found when repeating this calculation for the individual populations. Once again, individuals whose ethnicity could not be unambiguously assigned ('Others') and East Asians were at higher risk of being a carrier of a putatively pathogenic leptin variant (see S4 Table). Additionally, we have conducted a literature search for any functional implication of the variants.
In total, seven non-synonymous and LoF variants listed in gnomAD were found to be functionally characterised by either comprehensive computational analyses or in vitro studies (see S2 Table). Of these, six were already assigned as ‘pathogenic’ by our in silico analyses. Solely rs17151919 has been previously classified as ‘benign’ (by in silico tools and ClinVar), but was reported to be functionally relevant [32]. Calculating the prevalence estimates for variants that have been characterized as 'pathogenic' by at least two tools and have a functional relevance yielded a prevalence that was elevated to a similar degree as when including variants found in clinical cases (see S4 Table). This was again valid for all populations. Similarly higher prevalence estimates were detected when variants from case reports as well as variants with a functional implication were added to the mutations predicted as ‘pathogenic’ by in silico tools (see S4 Table). Again, the prevalence rates for the individual ethnicities vary considerably. Statistically, one in six African-Americans carries a heterozygous and pathogenic LEP variant, while the risk for being a carrier in Finnish Europeans is lower (1:1,400; see S4 Table).

Lines 368 ff.: Similarly, the literature search screening e.g. PubMed, OMIM and LitVar determined 20 LEP mutations in total that were detected in at least one obese individual. Again, solely five of those were included in the variant list extracted from gnomAD. The non-synonymous variants p.Asp100Asn (rs724159998) [42], p.Asn103Lys (rs28954113) [6, 29-31], the frameshift mutation p.Gly133ValfsTer15 (rs1307773933) [7, 15-17, 43-45] and the in-frame deletion p.Ile35del (rs747703977) [16, 17] were detected in extremely obese patients being homozygous carriers [5]. Of these, solely one mutation (rs1800564) deviated in its pathogenicity predictions. Likewise, in functional studies, seven variants were included in the gnomAD list.
Again, the only variant showing a deviating pathogenicity classification between the in silico analyses and the literature was rs17151919. As we have detected higher prevalence rates when including variants reported either in clinical cases or functional studies, these might be caused by the allele counts and frequencies of rs17151919 and rs1800564. These two variants were more frequently found than the rest of the non-synonymous and LoF variants in gnomAD. For instance, relative to the majority of the variants predicted as ‘pathogenic’ by the in silico tools (highest number of heterozygous carriers in all populations was 7 for rs1307773933), rs1800564 has been found more frequently in the gnomAD population (43 heterozygous carriers) and thus presumably accounts for the higher prevalence rates. The same applies to the SNP rs17151919.

2. The leptin gene is highly resistant to loss of function variants and this analysis assumes that compound heterozygotes will be present at the expected rate based on the prevalence of the individual alleles. This may be true but I'm not sure we have evidence to support that. The public availability of the UK BioBank data would allow for identification of exactly the number of compound heterozygotes in a large cohort, though I appreciate this is a substantial undertaking compared to your current analysis. [It would also be biased by my point (3) below.]

Response: We again thank the Reviewer for this important feedback. Unfortunately, we do not have access to the UK Biobank data and, because of the considerable time and, above all, costs incurred, it is beyond our capacity to gain access to the data. However, we might consider this approach for an individual paper in the future. But to still consider your comment, we analysed genotyping data from the NCBI ALFA (over 2 million subjects included) and 1000Genomes projects. Sadly, solely few of our analysed variants were included in these datasets.
For instance, in the 1000Genomes project (Phase 3), deposited in Ensembl, only seven of the 75 variants were detected. Thus, we decided to not include these findings in our ms.

3. The gnomAD database is primarily composed of individuals not known to have a monogenic disease. Leptin deficiency causes quite a dramatic phenotype of hyperphagia. I anticipate they may be under-represented in gnomAD compared to their true prevalence. (Unlike for asymptomatic or very late-onset conditions.)

Response: Again, thank you for this comment. We know that paediatric and monogenic diseases are under-represented in gnomAD. Nevertheless, the aim of our study was to estimate the prevalence of pathogenic LEP variants in the general population without known monogenic diseases. As we have stated in our discussion, it is still feasible that gnomAD encompasses multiple overweight or obese individuals causing a bias in our results. Yet, we believe that we still get a comprehensive insight into the prevalence of these putatively pathogenic variants in the general public.

Submitted filename: Response to Reviewers.pdf

2 Sep 2022

Prevalence estimates of putatively pathogenic leptin variants in the gnomAD database

PONE-D-22-08349R2

Dear Dr. Hinney,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date.
If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Alvaro Galli
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #2: All comments have been addressed
Reviewer #3: (No Response)

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #2: Yes
Reviewer #3: Partly

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #2: Yes
Reviewer #3: N/A

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #2: Yes
Reviewer #3: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #2: Yes
Reviewer #3: Yes

6. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #2: Study design is good. Methodology is appropriate. Results are in accordance with previously reported individual studies. Authors have incorporated the suggestions.
Reviewer #3: I really appreciate the authors' effort to answer my previous comment. But I'm very sorry to say that I still can't agree with the method of estimating the pathogenicity of the variants only by in silico tools. They tried to connect the clinical data with variants from previous reports, but these were only partially connected. I think the prevalence calculated in this paper is an overestimate.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #2: No
Reviewer #3: No

9 Sep 2022

PONE-D-22-08349R2

Prevalence estimates of putatively pathogenic leptin variants in the gnomAD database

Dear Dr. Hinney:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Alvaro Galli
Academic Editor
PLOS ONE
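
Illustrative note on the arithmetic discussed in this correspondence: the prevalence estimates rest on two steps, a variant counting as putatively pathogenic when at least two in silico tools flag it, and carrier frequencies then following from Hardy-Weinberg proportions. The Python sketch below shows one way this could be computed. It is not the authors' code: the function names, the input layout, and the pooling of all putatively pathogenic variants into a single "pathogenic allele" with frequency q (which makes q² cover homozygotes and compound heterozygotes together) are assumptions made for illustration, and the allele counts in the usage part are hypothetical.

```python
from typing import Iterable, Mapping, Tuple


def is_putatively_pathogenic(predictions: Mapping[str, bool], min_tools: int = 2) -> bool:
    """Return True if at least `min_tools` tools predict a damaging effect.

    `predictions` maps a tool name to True (damaging) or False (tolerated).
    """
    return sum(1 for damaging in predictions.values() if damaging) >= min_tools


def pooled_allele_frequency(variants: Iterable[Tuple[int, int]]) -> float:
    """Sum allele_count / allele_number over all selected variants.

    Assumption: the variants are rare, so their frequencies can simply be added
    and treated as one combined pathogenic allele.
    """
    return sum(allele_count / allele_number for allele_count, allele_number in variants)


def hwe_carrier_frequencies(q: float) -> Tuple[float, float]:
    """Return (heterozygous, homozygous) frequencies under Hardy-Weinberg: 2pq and q^2."""
    if not 0.0 < q < 1.0:
        raise ValueError("allele frequency q must lie strictly between 0 and 1")
    p = 1.0 - q
    return 2.0 * p * q, q * q


def one_in(frequency: float) -> float:
    """Express a frequency as the N in '1 in N'."""
    return 1.0 / frequency


if __name__ == "__main__":
    # Hypothetical tool calls for a single variant.
    calls = {"SIFT": True, "PolyPhen2": True, "MutationTaster2021": False}
    print("putatively pathogenic:", is_putatively_pathogenic(calls))

    # Hypothetical (allele_count, allele_number) pairs, not taken from gnomAD.
    q = pooled_allele_frequency([(3, 24000), (2, 16000)])
    het, hom = hwe_carrier_frequencies(q)
    print(f"heterozygous: about 1 in {one_in(het):,.0f}")
    print(f"homozygous:   about 1 in {one_in(hom):,.0f}")
```

With the hypothetical counts above, q is 1/4,000, which gives roughly 1 in 2,000 heterozygous and 1 in 16 million homozygous carriers. For orientation, the abstract's figures of about 1:2,100 and 1:17,830,000 are mutually consistent under this model, since both correspond to a pooled q of roughly 1/4,200. The caveats raised in the responses above (finite and structured populations, the Wahlund effect) apply to any such estimate.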

46 in total

1. Identification of two novel missense mutations in the human OB gene.

Authors: S M Echwald; S B Rasmussen; T I Sørensen; T Andersen; A Tybjaerg-Hansen; J O Clausen; L Hansen; T Hansen; O Pedersen
Journal: Int J Obes Relat Metab Disord Date: 1997-04

2. Using SIFT and PolyPhen to predict loss-of-function and gain-of-function mutations.

Authors: Sarah E Flanagan; Ann-Marie Patch; Sian Ellard
Journal: Genet Test Mol Biomarkers Date: 2010-08

3. Testing for Hardy-Weinberg proportions: have we lost the plot?

Authors: Robin S Waples
Journal: J Hered Date: 2014-11-25 Impact factor: 2.645

4. Changes in levels of peripheral hormones controlling appetite are inconsistent with hyperphagia in leptin-deficient subjects.

Authors: Sadia Saeed; Paul R Bech; Tayyaba Hafeez; Rabail Alam; Mario Falchi; Mohammad A Ghatei; Stephen R Bloom; Muhammad Arslan; Philippe Froguel
Journal: Endocrine Date: 2013-07-04 Impact factor: 3.633

5. Human leptin deficiency caused by a missense mutation: multiple endocrine defects, decreased sympathetic tone, and immune system dysfunction indicate new targets for leptin action, greater central than peripheral resistance to the effects of leptin, and spontaneous correction of leptin-mediated defects.

Authors: M Ozata; I C Ozdemir; J Licinio
Journal: J Clin Endocrinol Metab Date: 1999-10 Impact factor: 5.958

Review 6. Molecular genetic aspects of weight regulation.

Authors: Johannes Hebebrand; Anke Hinney; Nadja Knoll; Anna-Lena Volckmar; André Scherag
Journal: Dtsch Arztebl Int Date: 2013-05-10 Impact factor: 5.594

7. Identification of new sequence variants in the leptin gene.

Authors: M K Karvonen; U Pesonen; P Heinonen; M Laakso; A Rissanen; H Naukkarinen; R Valve; M I Uusitupa; M Koulu
Journal: J Clin Endocrinol Metab Date: 1998-09 Impact factor: 5.958

8. Severe Early Onset Obesity due to a Novel Missense Mutation in Exon 3 of the Leptin Gene in an Infant from Northwest India

Authors: Devi Dayal; Keerthivasan Seetharaman; Inusha Panigrahi; Balasubramaniyan Muthuvel; Ashish Agarwal
Journal: J Clin Res Pediatr Endocrinol Date: 2017-12-08

9. The mutational constraint spectrum quantified from variation in 141,456 humans.

Authors: Konrad J Karczewski; Laurent C Francioli; Grace Tiao; Beryl B Cummings; Jessica Alföldi; Qingbo Wang; Ryan L Collins; Kristen M Laricchia; Andrea Ganna; Daniel P Birnbaum; Laura D Gauthier; Harrison Brand; Matthew Solomonson; Nicholas A Watts; Daniel Rhodes; Moriel Singer-Berk; Eleina M England; Eleanor G Seaby; Jack A Kosmicki; Raymond K Walters; Katherine Tashman; Yossi Farjoun; Eric Banks; Timothy Poterba; Arcturus Wang; Cotton Seed; Nicola Whiffin; Jessica X Chong; Kaitlin E Samocha; Emma Pierce-Hoffman; Zachary Zappala; Anne H O'Donnell-Luria; Eric Vallabh Minikel; Ben Weisburd; Monkol Lek; James S Ware; Christopher Vittal; Irina M Armean; Louis Bergelson; Kristian Cibulskis; Kristen M Connolly; Miguel Covarrubias; Stacey Donnelly; Steven Ferriera; Stacey Gabriel; Jeff Gentry; Namrata Gupta; Thibault Jeandet; Diane Kaplan; Christopher Llanwarne; Ruchi Munshi; Sam Novod; Nikelle Petrillo; David Roazen; Valentin Ruano-Rubio; Andrea Saltzman; Molly Schleicher; Jose Soto; Kathleen Tibbetts; Charlotte Tolonen; Gordon Wade; Michael E Talkowski; Benjamin M Neale; Mark J Daly; Daniel G MacArthur
Journal: Nature Date: 2020-05-27 Impact factor: 69.504