Alina Urnikyte, Laura Pranckeniene, Ingrida Domarkiene, Svetlana Dauengauer-Kirliene, Alma Molyte, Ausra Matuleviciene, Ingrida Pilypiene, Vaidutis Kučinskas.
Abstract
Most genetic variants are rare and population-specific, which highlights the importance of characterizing local population genetic diversity. Many countries have initiated population-based whole-genome sequencing (WGS) studies, but genomic variation within Lithuanian families is not available in public databases. Here, we describe initial findings from high-coverage (36.27× on average) whole-genome sequencing of 25 trios from the Lithuanian population. Each genome carried on average approximately 4,701,473 (±28,255) variants, of which 80.6% (3,787,626) were single nucleotide polymorphisms (SNPs) and the remaining 19.4% were indels. On average, 12.45% of variants were novel according to dbSNP (build 150). Structural variation (SV) analysis of the WGS data identified on average 9133 (±85.10) SVs, of which 95.85% were novel. De novo single nucleotide variation (SNV) analysis identified 4417 variants, of which 1.1% were exonic, 43.9% intronic, 51.9% intergenic, and the remaining 3.13% in UTR or downstream sequence. Three potentially pathogenic de novo variants were identified in the ZSWIM8, CDC42EP1, and RELA genes. Our findings provide useful information on local human population genomic variation, especially de novo variants, and will be a valuable resource for further genetic studies and for assessing their medical implications.
Keywords: SNV; de novo variation; newborns; trios; whole genome sequencing
Year: 2022 PMID: 35456375 PMCID: PMC9028680 DOI: 10.3390/genes13040569
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
The average number of autosomal single nucleotide genetic variants per genome identified in the Lithuanian cohort.
| | Parents | Newborns |
|---|---|---|
| Raw reads (M) | 784.47 | 837.55 |
| Bases (Gb) | 118.32 | 126.33 |
| Coverage depth | 35.46 | 37.88 |
| SNVs | 4,704,096 | 4,696,226 |
| SNPs | 3,791,674 | 3,783,578 |
| Insertions (Hom) | 155,784 | 153,755 |
| Insertions (Het) | 277,288 | 276,262 |
| Deletions (Hom) | 152,914 | 150,766 |
| Deletions (Het) | 293,193 | 291,467 |
| Indels (Het) | 22,332 | 22,377 |
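The SNP share quoted in the abstract (about 80.6%) can be checked against the per-group averages in the table above. The snippet below is only a small arithmetic sketch. It assumes that the row labelled "SNVs" holds the total variant count per genome, which is an interpretation based on its agreement with the abstract's figure of roughly 4.70 million variants. The names `counts` and `snp_share` are illustrative.

```python
# Per-genome averages copied from the table above.
# Assumption: the "SNVs" row is the total number of variants per genome.
counts = {
    "Parents": {"all_variants": 4_704_096, "snps": 3_791_674},
    "Newborns": {"all_variants": 4_696_226, "snps": 3_783_578},
}


def snp_share(group):
    """Return SNPs as a percentage of all variants for one group."""
    row = counts[group]
    return 100.0 * row["snps"] / row["all_variants"]


for group in counts:
    share = snp_share(group)
    # The remainder corresponds to the indel categories listed in the table.
    print(f"{group}: SNPs {share:.1f}%, other variants {100.0 - share:.1f}%")
```

Both groups round to the same split reported in the abstract, so the parent and newborn averages are consistent with the cohort-wide summary.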
Figure 1. The minor allele frequency distribution for 1,508,407 SNPs in the Lithuanian population samples.
Figure 2. Principal component analysis of the first two PCs of individuals from Lithuania and 27 populations from the 1000 Genomes Project Phase 3 dataset. Abbreviations as indicated in the text.
Summary statistics for deletions, duplications, and insertions in parent and newborn groups.
| Statistics | Parents | Newborns | |
|---|---|---|---|
| Deletions | | | |
| Mean | 4159.39 | 4191.33 | 0.5611 |
| SE | 33.63 | 47.08 | |
| Median | 4126.0 | 4172.5 | |
| Mode | 3963 | N/A | |
| SD | 235.43 | 230.64 | |
| Duplications | | | |
| Mean | 348.84 | 354.08 | 0.2747 |
| SE | 3.94 | 4.76 | |
| Median | 346 | 356 | |
| Mode | 365 | 355 | |
| SD | 27.61 | 23.31 | |
| Insertions | | | |
| Mean | 4620.49 | 4661.54 | 0.5033 |
| SE | 49.98 | 63.05 | |
| Median | 4612.0 | 4697.5 | |
| Mode | N/A | N/A | |
| SD | 349.83 | 308.89 | |
SE—standard error, SD—standard deviation.
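Because the standard error of a mean is SE = SD / √n, the SE and SD rows of the table can be cross-checked against each other, and the group size they imply can be back-calculated as n = (SD / SE)². The sketch below does this for the second block of the table (SE 3.94 and 4.76; SD 27.61 and 23.31). It is a consistency check only: the helper `implied_n` is illustrative, and the resulting sizes are inferred from rounded values, not reported by the authors.

```python
def implied_n(sd, se):
    """Back-calculate the sample size from SE = SD / sqrt(n)."""
    return (sd / se) ** 2


# SD and SE values copied from the second block of the table above.
groups = {
    "Parents": {"sd": 27.61, "se": 3.94},
    "Newborns": {"sd": 23.31, "se": 4.76},
}

for name, stats in groups.items():
    n = implied_n(stats["sd"], stats["se"])
    print(f"{name}: implied n = {n:.1f} (rounded: {round(n)})")
```

Under this relation the reported values correspond to roughly 49 parents and 24 newborns, which suggests that slightly fewer samples than the full 25 trios entered the summary, although the record itself does not state this.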
Figure 3. The distribution of de novo indels across genomic regions, according to genome sequence function.
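A de novo variant is, by definition, an allele present in the child but absent from both parents, which is why trio sequencing is needed to detect it. The sketch below illustrates only that core genotype comparison. It is not the authors' pipeline. The function name and the genotype representation (a tuple of allele strings per individual) are assumptions made for illustration, and real analyses add filters on read depth, genotype quality, and allele balance that are omitted here.

```python
def is_de_novo_candidate(child, mother, father):
    """Flag a site where the child carries an allele seen in neither parent.

    Each genotype is a tuple of alleles, for example ("A", "G").
    """
    parental_alleles = set(mother) | set(father)
    return any(allele not in parental_alleles for allele in child)


# Hypothetical sites: (child, mother, father) genotypes.
sites = {
    "site_1": (("A", "G"), ("A", "A"), ("A", "A")),  # G is new in the child
    "site_2": (("A", "G"), ("A", "G"), ("A", "A")),  # G inherited from mother
    "site_3": (("C", "C"), ("C", "T"), ("C", "T")),  # both alleles inherited
}

for name, (child, mother, father) in sites.items():
    print(name, is_de_novo_candidate(child, mother, father))
```

This check only tests for the presence of a novel allele. It does not catch other Mendelian inconsistencies, such as a homozygous child whose second copy could not have come from one of the parents.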