Yanhuang Jiang, Chengkun Wu, Yanghui Zhang, Shaowei Zhang, Shuojun Yu, Peng Lei, Qin Lu, Yanwei Xi, Hua Wang, Zhuo Song.
Abstract
BACKGROUND: An important task in the interpretation of sequencing data is to highlight pathogenic genes (or detrimental variants) in Mendelian diseases. This task remains challenging despite the recent rapid development of genomics and bioinformatics. A typical interpretation workflow includes annotation, filtration, manual inspection and literature review. These steps are time-consuming and error-prone in the absence of systematic support. Therefore, we developed GTX.Digest.VCF, an online DNA sequencing interpretation system, which prioritizes genes and variants for novel disease-gene relation discovery and integrates text mining results to provide literature evidence for the discovery. Its phenotype-driven ranking and biological data mining approach significantly speed up the whole interpretation process.
Keywords: Distributed parallel computing; Gene prioritization; NGS data interpretation; Neural network; Text mining
Year: 2019 PMID: 31856831 PMCID: PMC6923899 DOI: 10.1186/s12920-019-0637-x
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 The architecture of GTX.Digest.VCF system
Fig. 2 The managing web page of VCF cards
Fig. 3 The result web page
Fig. 4 DTM algorithm running on the AWS platform
Fig. 5 Experimental results
1: HP:0000465: Webbed neck
2: HP:0001520: Large for gestational age
3: HP:0001744: Splenomegaly
4: HP:0003623: Neonatal onset
5: HP:0005580: Duplication of renal pelvis
6: HP:0008752: Laryngeal cartilage malformation
7: HP:0011703: Sinus tachycardia
8: HP:0200128: Biventricular hypertrophy
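To make the idea of phenotype-driven ranking from the abstract more concrete, the sketch below scores candidate genes by the overlap between a patient's HPO terms (the eight terms listed above) and each gene's annotated HPO terms. This is only an illustration under stated assumptions: the Jaccard similarity is a simple stand-in and is not the scoring method of GTX.Digest.VCF, and the gene names (`GENE_A`, `GENE_B`, `GENE_C`) and their annotation sets are hypothetical placeholders, not real gene-phenotype associations.

```python
# Patient phenotype: the eight HPO terms listed above.
PATIENT_TERMS = {
    "HP:0000465",  # Webbed neck
    "HP:0001520",  # Large for gestational age
    "HP:0001744",  # Splenomegaly
    "HP:0003623",  # Neonatal onset
    "HP:0005580",  # Duplication of renal pelvis
    "HP:0008752",  # Laryngeal cartilage malformation
    "HP:0011703",  # Sinus tachycardia
    "HP:0200128",  # Biventricular hypertrophy
}

# HYPOTHETICAL annotations: placeholder gene names with made-up term sets,
# used only to show the mechanics of the ranking.
GENE_ANNOTATIONS = {
    "GENE_A": {"HP:0000465", "HP:0001744", "HP:0011703", "HP:0200128"},
    "GENE_B": {"HP:0001520", "HP:0003623"},
    "GENE_C": {"HP:0000118"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_genes(patient_terms, gene_annotations):
    """Return (gene, score) pairs sorted by descending score, then by gene name."""
    scored = [
        (gene, jaccard(patient_terms, terms))
        for gene, terms in gene_annotations.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for gene, score in rank_genes(PATIENT_TERMS, GENE_ANNOTATIONS):
        print(f"{gene}\t{score:.3f}")
```

A set-overlap score ignores the ontology hierarchy (a parent and child term count as unrelated), so a practical system would likely use a semantic similarity measure over the HPO graph instead; the sketch keeps to exact matches for brevity.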