Clara W T Koh1, Justin S G Ooi1, Gabrielle L C Joly1, Kuan Rong Chan2.
Abstract
Opening and processing gene expression data files in Excel carries the risk of inadvertently converting gene names to dates. Because pathway analysis tools rely on gene symbols to query pathway databases, genes that have been converted to dates are not recognized, potentially leaving gaps in pathway analysis. Molecular pathways related to cell division, exocytosis, cilium assembly, protein ubiquitination and nitric oxide biosynthesis were found to be most affected by Excel auto-conversion. A plausible solution is therefore to update these genes and dates to the newly approved gene names recommended by the HUGO Gene Nomenclature Committee (HGNC), which are resilient to Excel auto-conversion. Here, we developed a web tool with Streamlit that converts old gene names and dates back into the new gene names recommended by HGNC. The web app, named Gene Updater, is open source and can be hosted locally or accessed at https://share.streamlit.io/kuanrongchan/date-to-gene-converter/main/date_gene_tool.py . Additionally, as Mar-01 and Mar-02 can each map to two different gene names, users can assign these date terms to the appropriate gene names within the Gene Updater web tool. This user-friendly web tool helps preserve the accuracy and integrity of gene expression data by minimizing gene name labelling errors caused by Excel auto-conversion.
Year: 2022 PMID: 35882976 PMCID: PMC9325790 DOI: 10.1038/s41598-022-17104-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Schematic of Gene Updater. If old gene names are provided, they are automatically converted to the updated approved gene names. If dates are provided, all of them except MAR-01 and MAR-02 are converted to the new approved gene names. For MAR-01 and MAR-02, users can assign the terms to MTARC1 or MARCHF1 and to MTARC2 or MARCHF2, respectively, within Gene Updater.
Human gene names that are most frequently converted to dates in Excel. The respective updated gene names and gene descriptions are also provided.
| Previous gene name | Excel date conversion | Updated HGNC gene name | Entrez gene description (Homo sapiens) |
|---|---|---|---|
| DEC1 | Dec-01 | DELEC1 | Deleted in oesophageal cancer 1 |
| MARC1 | Mar-01 | MTARC1 | Mitochondrial amidoxime reducing component 1 |
| MARCH1 | Mar-01 | MARCHF1 | Membrane associated ring finger 1 |
| MARCH2 | Mar-02 | MARCHF2 | Membrane associated ring finger 2 |
| MARC2 | Mar-02 | MTARC2 | Mitochondrial amidoxime reducing component 2 |
| MARCH3 | Mar-03 | MARCHF3 | Membrane associated ring finger 3 |
| MARCH4 | Mar-04 | MARCHF4 | Membrane associated ring finger 4 |
| MARCH5 | Mar-05 | MARCHF5 | Membrane associated ring finger 5 |
| MARCH6 | Mar-06 | MARCHF6 | Membrane associated ring finger 6 |
| MARCH7 | Mar-07 | MARCHF7 | Membrane associated ring finger 7 |
| MARCH8 | Mar-08 | MARCHF8 | Membrane associated ring finger 8 |
| MARCH9 | Mar-09 | MARCHF9 | Membrane associated ring finger 9 |
| MARCH10 | Mar-10 | MARCHF10 | Membrane associated ring finger 10 |
| MARCH11 | Mar-11 | MARCHF11 | Membrane associated ring finger 11 |
| SEPT1 | Sep-01 | SEPTIN1 | Septin 1 |
| SEPT2 | Sep-02 | SEPTIN2 | Septin 2 |
| SEPT3 | Sep-03 | SEPTIN3 | Septin 3 |
| SEPT4 | Sep-04 | SEPTIN4 | Septin 4 |
| SEPT5 | Sep-05 | SEPTIN5 | Septin 5 |
| SEPT6 | Sep-06 | SEPTIN6 | Septin 6 |
| SEPT7 | Sep-07 | SEPTIN7 | Septin 7 |
| SEPT8 | Sep-08 | SEPTIN8 | Septin 8 |
| SEPT9 | Sep-09 | SEPTIN9 | Septin 9 |
| SEPT10 | Sep-10 | SEPTIN10 | Septin 10 |
| SEPT11 | Sep-11 | SEPTIN11 | Septin 11 |
| SEPT12 | Sep-12 | SEPTIN12 | Septin 12 |
| SEPT14 | Sep-14 | SEPTIN14 | Septin 14 |
| SEP15 | Sep-15 | SELENOF | 15 kDa selenoprotein |
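The mapping in the table above can be expressed as a small lookup. The following is a minimal sketch and not the Gene Updater source code: the names `OLD_TO_NEW`, `DATE_TO_NEW`, `AMBIGUOUS` and `update_symbol` are hypothetical, and the sketch assumes dates appear in the `Mon-DD` form shown in the table (other Excel date displays are not handled).

```python
# Old symbols from the table and their HGNC replacements.
OLD_TO_NEW = {"DEC1": "DELEC1", "MARC1": "MTARC1", "MARC2": "MTARC2", "SEP15": "SELENOF"}
OLD_TO_NEW.update({f"MARCH{i}": f"MARCHF{i}" for i in range(1, 12)})
OLD_TO_NEW.update({f"SEPT{i}": f"SEPTIN{i}" for i in list(range(1, 13)) + [14]})

# Date terms that map to exactly one gene (upper-cased "Mon-DD" keys).
DATE_TO_NEW = {"DEC-01": "DELEC1", "SEP-15": "SELENOF"}
DATE_TO_NEW.update({f"MAR-{i:02d}": f"MARCHF{i}" for i in range(3, 12)})
DATE_TO_NEW.update({f"SEP-{i:02d}": f"SEPTIN{i}" for i in list(range(1, 13)) + [14]})

# Date terms that could come from either of two genes.
AMBIGUOUS = {"MAR-01": ("MTARC1", "MARCHF1"), "MAR-02": ("MTARC2", "MARCHF2")}


def update_symbol(term, choices=None):
    """Return the updated symbol for an old gene name or a date term.

    `choices` is an optional dict such as {"MAR-01": "MTARC1"} that
    carries the user's assignment for the ambiguous date terms.
    Terms not listed in the table are returned unchanged.
    """
    key = term.strip().upper()
    if key in OLD_TO_NEW:
        return OLD_TO_NEW[key]
    if key in DATE_TO_NEW:
        return DATE_TO_NEW[key]
    if key in AMBIGUOUS:
        chosen = (choices or {}).get(key)
        if chosen not in AMBIGUOUS[key]:
            raise ValueError(f"{term!r} is ambiguous; choose one of {AMBIGUOUS[key]}")
        return chosen
    return term
```

For example, `update_symbol("Sep-01")` gives `"SEPTIN1"`, while `update_symbol("Mar-01")` raises an error unless a choice such as `{"MAR-01": "MTARC1"}` is supplied, mirroring the user assignment step described for Gene Updater.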
Figure 2. Top 10 enriched pathways based on genes that are frequently converted to dates in Excel. Genes were analysed against the GO Biological Processes database with Enrichr, and all presented pathways had adjusted p-values < 0.05.
Figure 3. Gene Updater dashboard used to resolve duplicate gene symbols. Red arrows indicate the dropdown widgets that allow users to assign the correct gene name for Mar-01 and Mar-02.
Journals publishing text or Excel files with date-related errors in June 2022.
| Journal | Text/Excel files found | Text/Excel files with gene symbols | Total number of files with date-related errors | Files without MAR-01 or MAR-02 date terms (%) | Files with MAR-01 or MAR-02 date terms, with gene description column (%) | Files with MAR-01 or MAR-02 date terms, without gene description column (%) |
|---|---|---|---|---|---|---|
| BMC Genomics | 87 | 7 | 3 | 3 (100.0) | 0 (0.0) | 0 (0.0) |
| Nature | 8 | 1 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Genome Biology | 8 | 7 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Nucleic Acids Research | 40 | 10 | 1 | 1 (100.0) | 0 (0.0) | 0 (0.0) |
| Human Molecular Genetics | 14 | 7 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| BMC Bioinformatics | 1 | 1 | 1 | 1 (100.0) | 0 (0.0) | 0 (0.0) |
| Nature Communications | 143 | 41 | 18 | 6 (33.3) | 7 (38.9) | 5 (12.2) |
| PLoS One | 55 | 7 | 5 | 4 (80.0) | 0 (0.0) | 1 (14.3) |
| Genome Research | 0 | 0 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Genes & Development | 0 | 0 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| RNA | 0 | 0 | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Total | 356 | 81 | 28 | 15 (53.6) | 7 (25.0) | 6 (21.4) |
The numbers of files with and without MAR-01/MAR-02 date terms are indicated, with the respective percentages in parentheses. The Gene Updater web tool can rectify most datasets, except those that contain Mar-01 or Mar-02 terms but lack gene description information (rightmost column).
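The role of the gene description column can be illustrated with a short sketch. This is an assumption about how a description might be used to settle the Mar-01 and Mar-02 cases, not the method implemented in Gene Updater: the function name `resolve_mar_date` and the keyword matching ("amidoxime", "ring finger", "ring-ch") are hypothetical and based only on the descriptions listed in the gene name table.

```python
def resolve_mar_date(term, description=None):
    """Resolve 'Mar-01' or 'Mar-02' using a gene description, if one exists.

    Returns the updated gene symbol, or None when the description is
    missing or uninformative, in which case the term cannot be
    resolved without the user's input.
    """
    key = term.strip().upper()
    if key not in ("MAR-01", "MAR-02"):
        raise ValueError(f"{term!r} is not an ambiguous MAR date term")
    number = key[-1]  # "1" or "2"
    text = (description or "").lower()
    # Keyword checks are an illustrative assumption, not an official rule.
    if "amidoxime" in text:
        return f"MTARC{number}"
    if "ring finger" in text or "ring-ch" in text:
        return f"MARCHF{number}"
    return None
```

With a description such as "Membrane associated ring finger 1", the sketch returns `MARCHF1` for `Mar-01`; with no description it returns `None`, which corresponds to the files in the rightmost column of the table.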