Zeeshan Ahmed, Saman Zeeshan, Ruoyun Xiong, Bruce T Liang.
Abstract
BACKGROUND: The last decade has seen a dramatic increase in the availability of scientific data. Human-related biological databases have grown not only in number but also in volume, posing unprecedented challenges in data storage, processing, analysis, exchange, and curation. Next generation sequencing (NGS) advancements have facilitated and accelerated the identification of genetic variations. Adopting NGS with whole-genome and RNA sequencing in a diagnostic context has the potential to improve disease-risk detection in support of precision medicine and drug discovery. Several bioinformatics pipelines have been developed to strengthen variant interpretation by efficiently processing and analyzing sequence data, and many published results show how genomics data can be proactively incorporated into medical practice to improve the use of clinical information. To make use of the wealth of genomics and health data, there is a crucial need for appropriate gene-disease annotation repositories that can be accessed through modern technology.
Year: 2019 PMID: 31586224 PMCID: PMC6778157 DOI: 10.1186/s40169-019-0243-8
Source DB: PubMed Journal: Clin Transl Med ISSN: 2001-1326
Fig. 1 PAS-Gen graphical user interfaces, with examples of searched Gene, Gene to Disease, and Disease to Gene results. The PAS-Gen (iPhone XS and 8) screen display includes the About, Register User, Reset Password, Main, Menu, Genomics, Clinical Genomics, Genes, and Genes and Disease interfaces. Example 1 shows a search with the incomplete gene name “BRCA” (BReast CAncer gene), which returns the protein coding genes “BRCA1” and “BRCA2” and related details. Example 2 is a search with the keyword “cancer”, which returns 6443 genes known to be involved in different kinds of cancer. In example 3, a search for the specific disease “lung cancer” returned a total of 11 genes and related diseases. Example 4 demonstrates a search for the gene “RFWD2”, which returned 17 disease matches, including a protein coding gene with Ensembl ID “ENSG00000143207” on chromosome 1 associated with the disease “Autism”. Detailed results are attached in Additional file 1
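The search behaviour described in the Fig. 1 examples (partial gene name, disease keyword, and exact gene lookups) can be sketched in a few lines. This is only an illustration in Python, not the app's Swift or PHP code: the `GeneDisease` record, the three function names, and the toy records are assumptions made for the example, and case-insensitive substring matching is an assumed matching rule that the source does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneDisease:
    """One gene-disease pair (hypothetical record layout)."""
    gene: str
    gene_type: str
    disease: str


# Toy records for illustration only; NOT an extract of the PAS-Gen database.
RECORDS = [
    GeneDisease("BRCA1", "protein_coding", "Breast cancer"),
    GeneDisease("BRCA2", "protein_coding", "Breast cancer"),
    GeneDisease("RFWD2", "protein_coding", "Autism"),
]


def search_genes(partial_name, records=RECORDS):
    """Gene search: unique symbols containing the (possibly incomplete) name."""
    needle = partial_name.strip().lower()
    if not needle:
        return []
    return sorted({r.gene for r in records if needle in r.gene.lower()})


def genes_for_disease(keyword, records=RECORDS):
    """Disease to Gene: unique symbols whose disease name contains the keyword."""
    needle = keyword.strip().lower()
    if not needle:
        return []
    return sorted({r.gene for r in records if needle in r.disease.lower()})


def diseases_for_gene(gene, records=RECORDS):
    """Gene to Disease: unique diseases recorded for an exact gene symbol."""
    target = gene.strip().lower()
    return sorted({r.disease for r in records if r.gene.lower() == target})
```

With these toy records, `search_genes("BRCA")` mirrors Example 1 by matching both BRCA genes, while `diseases_for_gene("RFWD2")` mirrors the Gene to Disease direction of Example 4.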
Fig. 2 PAS-Gen components design, development, and data flow. PAS-Gen is an iOS app developed with the Swift programming language, the Xcode integrated development environment for macOS, the MySQL database management system, the PHP scripting language, and UNIX-based web and database servers
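The data flow in Fig. 2 (app request, server-side script, database query, response to the app) can be illustrated with a small stand-in. The sketch below is an assumption-laden analogy rather than the PAS-Gen implementation: it uses Python with an in-memory SQLite database in place of PHP and MySQL, and the table name `gene_disease`, its columns, the JSON response shape, and both function names are hypothetical. The single row reuses the RFWD2 details quoted in Fig. 1.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite stand-in for the server-side database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene_disease ("
        "gene TEXT, ensembl_id TEXT, chromosome TEXT, gene_type TEXT, disease TEXT)"
    )
    # Single illustrative row, taken from the RFWD2 example in Fig. 1.
    conn.execute(
        "INSERT INTO gene_disease VALUES (?, ?, ?, ?, ?)",
        ("RFWD2", "ENSG00000143207", "1", "protein_coding", "Autism"),
    )
    conn.commit()
    return conn


def handle_gene_request(conn, gene):
    """Play the role of the server-side script: query the database, return JSON."""
    # The gene name is bound as a parameter, never spliced into the SQL string.
    cur = conn.execute(
        "SELECT gene, ensembl_id, chromosome, gene_type, disease "
        "FROM gene_disease WHERE gene = ? ORDER BY disease",
        (gene,),
    )
    columns = [col[0] for col in cur.description]
    rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    return json.dumps({"query": gene, "count": len(rows), "results": rows})
```

Binding the search term as a query parameter is a design choice of this sketch: user-typed text from a mobile client should not be concatenated into SQL, whichever database engine sits behind the server.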
PAS-Gen database description: type and sub-types of genes
| # | Gene types | Gene sub-types |
|---|---|---|
| 1 | Protein coding | Coding |
| 2 | processed_transcript | non_coding |
| 4 | lincRNA | non_coding |
| 5 | Antisense | non_coding |
| 6 | IG_C_gene | non_coding |
| 7 | bidirectional_promoter_lncRNA | non_coding |
| 8 | polymorphic_pseudogene | non_coding |
| 9 | transcribed_unitary_pseudogene | non_coding |
| 10 | transcribed_unprocessed_pseudogene | non_coding |
| 11 | transcribed_processed_pseudogene | non_coding |
| 12 | sense_overlapping | non_coding |
| 13 | scRNA | non_coding |
| 14 | non_coding | non_coding |
| 15 | unprocessed_pseudogene | non_coding |
| 16 | IG_V_gene | non_coding |
| 17 | unitary_pseudogene | non_coding |
| 18 | vaultRNA | non_coding |
| 19 | TR_C_gene | non_coding |
| 20 | sense_intronic | non_coding |
| 21 | snRNA | non_coding |
| 22 | processed_pseudogene | non_coding |
| 23 | TEC | non_coding |
| 24 | TR_V_pseudogene | non_coding |
| 25 | TR_V_gene | non_coding |
| 26 | macro_lncRNA | non_coding |
PAS-Gen database includes protein coding and 25 non-coding gene types (processed transcript, lincRNA, antisense, IG C gene, bidirectional promoter lncRNA, polymorphic pseudogene, transcribed unitary pseudogene, transcribed unprocessed pseudogene, transcribed processed pseudogene, sense overlapping, scRNA, non coding, unprocessed pseudogene, IG V gene, unitary pseudogene, vaultRNA, TR C gene, sense intronic, snRNA, processed pseudogene, TEC, TR V pseudogene, TR V gene, macro lncRNA)
PAS-Gen database description and statistics
| Categories | Count |
|---|---|
| Genes-disease combinations | 98,064 |
| Gene types | 26 |
| Chromosomes | 24 |
| Genes (including aliases) | 13,216 |
| Genes (Ensembl IDs) | 10,598 |
| Unique diseases | 12,257 |
| Genes-disease combinations based on actionable genes | 32,089 |
| Distinguished genes-disease source combinations | 809 |
| Cancer leading genes | 8,063 |
The PAS-Gen database statistics cover genes-disease combinations, gene types, chromosomes, genes (including aliases), genes (Ensembl IDs), unique diseases, genes-disease combinations based on actionable genes, distinguished genes-disease source combinations, and cancer leading genes
Fig. 3 PAS-Gen (iPhone 8) screenshot examples of gene results (top two shown) from searches for the four most common diseases: a 931 results for Diabetes, b 60 results for Obesity, c 391 results for Schizophrenia, and d 313 results for Autism. Detailed results are attached in Additional file 1
Fig. 4 PAS-Gen screenshot examples of gene results (top two shown) from searches of the most common diseases: a 512 results for Heart and related diseases, b 168 results for Polydactyly, c 79 results for Spina Bifida, and d 6443 results for Cancer. Detailed results are attached in Additional file 1
Fig. 5 PAS-Gen screenshot examples of gene results (top two shown) for searches of common genetic diseases: a 117 results for Thalassemia, b 49 results for Down syndrome, c 91 results for Cystic Fibrosis, and d 18 results for Sickle Cell Anemia. Detailed results are attached in Additional file 1
Fig. 6 PAS-Gen screenshot examples of gene results (top two shown) for searches of common genetic diseases: a 16 results for Tay-Sachs disease, b 31 results for Fragile X Syndrome, c 64 results for Hemophilia, and d 81 results for Huntington. Detailed results are attached in Additional file 1