Jenefer M Blackwell, Timo Lassmann, Alexia L Weeks, Heather A D'Antoine, Melita McKinnon, Genevieve Syn, Dawn Bessarab, Ngiare Brown, Steven Y C Tong, Bo Reményi, Andrew Steer, Lesley-Ann Gray, Michael Inouye, Jonathan R Carapetis.
Abstract
Whole exome sequencing (WES) is widely used in both research and clinical settings. However, there is a paucity of reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 50 Aboriginal individuals from the Northern Territory (NT) of Australia and compare these to 72 previously published exomes from a Western Australian (WA) population of Martu origin. Sequence data for both NT and WA samples were processed using an 'intersect-then-combine' (ITC) approach, with GATK and SAMtools used to call variants. A total of 289,829 variants were identified in at least one individual in the NT cohort and 248,374 variants in at least one individual in the WA cohort. Of these, 166,719 variants were present in both cohorts, whilst 123,110 variants were private to the NT cohort and 81,655 were private to the WA cohort. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.
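The shared and private counts in the abstract amount to set arithmetic over the two cohorts' variant lists. The following is a minimal sketch of that comparison, not the authors' pipeline: it assumes each variant is keyed by a hypothetical `(chrom, pos, ref, alt)` tuple, and `compare_cohorts` is an illustrative name. The final assertions only confirm that the counts reported above are internally consistent.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a variant is identified by chromosome, position, ref and alt allele.
Variant = Tuple[str, int, str, str]


class CohortComparison(NamedTuple):
    shared: Set[Variant]     # seen in at least one individual of both cohorts
    private_a: Set[Variant]  # seen only in cohort A
    private_b: Set[Variant]  # seen only in cohort B


def compare_cohorts(a: Set[Variant], b: Set[Variant]) -> CohortComparison:
    """Split two cohort-level variant sets into shared and private parts."""
    return CohortComparison(shared=a & b, private_a=a - b, private_b=b - a)


# Toy data (hypothetical variants, for illustration only).
nt = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("2", 50, "G", "A")}
wa = {("1", 100, "A", "G"), ("3", 75, "T", "C")}
result = compare_cohorts(nt, wa)

# Consistency check on the counts reported in the abstract:
# shared + private must equal each cohort's total.
NT_TOTAL, WA_TOTAL = 289_829, 248_374
SHARED, NT_PRIVATE, WA_PRIVATE = 166_719, 123_110, 81_655
assert SHARED + NT_PRIVATE == NT_TOTAL
assert SHARED + WA_PRIVATE == WA_TOTAL
```

In practice the sets would be populated from the cohort VCF files; allele normalisation (for example, left-aligning indels) would be needed before tuples from different call sets can be compared reliably.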
Year: 2020 PMID: 32350262 PMCID: PMC7190730 DOI: 10.1038/s41597-020-0463-1
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Whole exon coverage statistics. Panels show coverage at 20X and 30X for the WA population (a,b) and 20X and 30X coverage for the NT population (c,d). Each bar represents an individual sample and the percentage of bases with at least 20X or 30X coverage. Red lines mark 80% and 60% coverage at 20X and 30X depths, respectively, which all NT samples and most WA samples achieve.
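The per-sample statistic plotted in Fig. 1 is the fraction of targeted bases covered at or above a given depth. As a rough sketch, assuming per-base depths for one sample are already available as a list of integers (for example, parsed from a depth report), the two thresholds marked by the red lines could be checked as follows. The function names are illustrative and not taken from the paper.

```python
from typing import Sequence


def fraction_covered(depths: Sequence[int], min_depth: int) -> float:
    """Fraction of bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one base")
    return sum(d >= min_depth for d in depths) / len(depths)


def meets_fig1_thresholds(depths: Sequence[int]) -> bool:
    """True if >= 80% of bases reach 20X and >= 60% reach 30X (the red lines)."""
    return (fraction_covered(depths, 20) >= 0.80
            and fraction_covered(depths, 30) >= 0.60)


# Toy sample: 7 bases at 35X, 2 at 25X, 1 at 5X.
sample = [35] * 7 + [25] * 2 + [5]
print(fraction_covered(sample, 20))   # share of bases at >= 20X
print(fraction_covered(sample, 30))   # share of bases at >= 30X
print(meets_fig1_thresholds(sample))
```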
Fig. 2 Matrix layout for the intersections of variants identified in the WA and NT populations relative to the hg19 human reference genome. Dark circles in the matrix indicate sets that are part of the intersection.
Fig. 3 Annotation of identified variants (a) and consequence (b) for the WA and NT populations.
Fig. 4 Ts/Tv ratio calculated individually for all individuals using SNVs passing the VQSR threshold. The first 50 samples on the X-axis are the NT samples; the remainder are the WA samples. The differences in average Ts/Tv ratios between NT and WA samples reflect differences in the exonic/intronic sequence ratios in the two different capture panels employed for WES.
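The Ts/Tv ratio in Fig. 4 is the number of transitions (A<->G, C<->T) divided by the number of transversions (all other single-base substitutions) in a sample. A minimal sketch of the per-sample calculation is given below, assuming the SNVs that passed filtering are supplied as hypothetical `(ref, alt)` pairs; it is an illustration of the definition, not the tool used by the authors.

```python
from typing import Iterable, Tuple

BASES = {"A", "C", "G", "T"}
# Transitions swap purine<->purine or pyrimidine<->pyrimidine.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ts_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Transition/transversion ratio over (ref, alt) single-nucleotide variants."""
    ts = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        # Skip anything that is not a simple single-base substitution.
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


# Toy sample: three transitions and two transversions.
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(calls))
```

Because coding sequence tends to show a higher Ts/Tv ratio than non-coding sequence, a capture panel that includes more intronic sequence would be expected to shift the per-sample average, which is consistent with the caption's explanation.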
| Measurement(s) | Aboriginal Australian • DNA • sequence feature annotation |
| Technology Type(s) | Whole Exome Sequencing • DNA sequencing • sequence annotation |
| Factor Type(s) | ancestry • sex • age |
| Sample Characteristic - Organism | Homo sapiens |
| Sample Characteristic - Location | Northern Territory |