Literature DB >> 30833669

Using Machine Learning to Predict Sensorineural Hearing Loss Based on Perilymph Micro RNA Expression Profile.

Matthew Shew1, Jacob New2, Helena Wichova3, Devin C Koestler4, Hinrich Staecker3.   

Abstract

Hearing loss (HL) is the most common neurodegenerative disease worldwide. Despite its prevalence, clinical testing does not yield a cell or molecular based identification of the underlying etiology of hearing loss making development of pharmacological or molecular treatments challenging. A key to improving the diagnosis of inner ear disorders is the development of reliable biomarkers for different inner ear diseases. Analysis of microRNAs (miRNA) in tissue and body fluid samples has gained significant momentum as a diagnostic tool for a wide variety of diseases. In previous work, we have shown that miRNA profiling in inner ear perilymph is feasible and may demonstrate distinctive miRNA expression profiles unique to different diseases. A first step in developing miRNAs as biomarkers for inner ear disease is linking patterns of miRNA expression in perilymph to clinically available metrics. Using machine learning (ML), we demonstrate we can build disease specific algorithms that predict the presence of sensorineural hearing loss using only miRNA expression profiles. This methodology not only affords the opportunity to understand what is occurring on a molecular level, but may offer an approach to diagnosing patients with active inner ear disease.

Entities:  

Year:  2019        PMID: 30833669      PMCID: PMC6399453          DOI: 10.1038/s41598-019-40192-7

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Hearing loss is the most common neurodegenerative disease worldwide and is estimated to affect over 432 million adults and 34 million children worldwide[1]. Unaddressed hearing loss is estimated to pose an annual global cost of over 750 billion US dollars[1]. Despite the significant disease burden and economic impact of hearing loss, diagnosing and treating this condition remains a significant challenge because of the limited ability to perform biopsies in order to understand what aberrant mechanisms are occurring on a molecular level. There are myriad of etiologies that can lead to hearing loss, including: genetic, infectious, noise trauma, and multifactorial disorders such as presbycusis. Clinicians often rely on an assortment of objective testing, including audiometry and vestibular testing, to guide diagnosis and treatment. While these tests provide a measure of function, they do not provide a molecular diagnosis and often do not correctly reflect the cellular site of lesion[2]. MiRNAs are 19–23 base pair single stranded RNA sequences that regulate post translational gene expression[3]. These molecules have been identified in all body fluids and are recognized for their promising role as a diagnostic and prognostic marker for neurodegenerative diseases such as Alzheimer’s and various cancers[4-8]. We recently demonstrated that miRNA profiling within the inner ear is a feasible methodology and can potentially offer insight into what is occurring on a cellular and molecular level in various inner ear pathologies[9]. In our search for specific hearing loss related biomarkers, we were able to demonstrate that various inner ear diseases, from Meniere’s disease to otosclerosis, express different and distinct miRNA profiles[9]. Similarly, recent investigations have also identified several key and distinct miRNAs within the venous blood in patients with sudden sensorineural hearing loss compared to healthy controls[10]. However, one of the challenges facing analysis of miRNA data from the inner ear is the immense and variable expression patterns across various diseases that may not be common to all cases. Machine learning (ML) is a subdiscipline of artificial intelligence (AI) and borrows from multiple disciplines including mathematics, statistics, and computer science[11,12]. The field of ML is broadly concerned with two types of tasks: supervised and unsupervised learning. Supervised learning uses prior information on the outcome of interest (labeled data) with the goal of learning a function that, given data on the both the outcome and predictor variables, best approximates the relationship between the predictors and outcome. Supervised learning can be further subdivided into classification and regression depending on the nature of the outcome variable; the former being used when the outcome is categorical and the latter when the outcome is continuous. Conversely, unsupervised learning does not use labeled outputs, but rather seeks to learn and infer the underlying structure present within a set of data. An example unsupervised learning would the use of gene expression microarray data to identify molecular subtypes of a given disease, or otherwise groups/clusters of subjects with a similar gene expression profile. Simply put, ML methods are used by researchers to analyze large amounts of data to find patterns, and in doing so, better solve problems. With the ever-advancing nature of computing power, ML has gained significant popularity within the scientific community. While ML has been slow to assimilate into healthcare, we have seen ML applications ranging from diagnosing skin cancer[13], diagnosing brain tumors through MRI[14], diagnosing glaucoma[15], optimizing drug therapies[16], analyzing large genome sequencing[17,18], to predicting various diseases and clinical outcomes[12,19-21]. In the current study, we used ML to build disease specific algorithms to predict the presence of sensorineural hearing loss in different inner ear pathologies based on perilymph-derived miRNA expression profiles of the inner ear. Subsequently, we applied our algorithms to de-identified patient samples and established the presence and varying degree of sensorineural hearing loss through miRNA expression profile alone. This methodology offers a promising approach for inner diagnosis, prognosis, and monitoring for various neurotologic diseases. Likewise, using this approach we may be able to understand what may be occurring on a molecular level in inner ear disease specific states in a manner that was previously not possible.

Results

We collected perilymph from a total of sixteen patients. Four patients had otosclerosis and perilymph was collected while undergoing stapedectomy. These patients had a pure conductive hearing loss and served as controls since sampling perilymph from patients without an active ear disorder that requires a surgery is not possible at this point in time. Twelve patients underwent cochlear implantation in which perilymph was collected upon opening up the round window for electrode insertion. Seven out of twelve patients had profound SNHL and were classified as not having any residual hearing (PTA > 80 dB), with a mean PTA of 108.9 dB. A total of five out of twelve patients with severe SNHL were classified as having residual hearing (PTA < 80 dB), with a mean PTA of 69.4 dB (Fig. 1). As the use of ML to predict inner ear disease using miRNA signatures represents a novel application, we elected to build and test multiple ML models to ascertain if one was superior to others.
Figure 1

Average pure tone average (PTA) in decibels (dB) for cochlear implant patients with sensorineural hearing loss with and without residual hearing.

Average pure tone average (PTA) in decibels (dB) for cochlear implant patients with sensorineural hearing loss with and without residual hearing.

Conductive Hearing Loss versus Sensorineural Hearing Loss

A ML model was constructed to differentiate between SNHL (patients undergoing cochlear implantation) and CHL (patients undergoing stapedectomy). Using a 70/30 split of the data into training/testing sets, both the decision forest and logistic regression ML models were able to differentiate between SNHL and CHL with 100% accuracy in the testing set. The decision jungle and neural network ML were able to differentiate between SNHL and CHL with 80% accuracy in the testing set. We cross validated the models using a leave-one-out cross approach, and all 4 models had a misclassification error of 6.25%. In order to understand how the model was built, we applied the permutation feature of importance to the ML models. The permutation feature of importance is a feature option within Azure ML software that allows the model to understand and evaluate the weighted importance of each data point, in this case miRNA, within the constructed model. The most heavily weighted miRNAs used to construct the ML model included: human miRNA 4767, miRNA 182 5p, miRNA 6754 5p, miRNA 6797 3p, miRNA 6806 3p, miRNA 6860, miRNA 4278, miRNA 3975, miRNA 4655, and miRNA 4732. The miRNAs were considered significant if they were used in the construction of two or more ML models. The miRNAs identified were compared to gene expression in the inner ear, and miRNA function and potential relationship to hearing loss was evaluated using Ingenuity Pathway Analysis (IPA) software. miRNA 4767, 6754 5p, 6797 3p, 6860, and 4732 had no significant known interactions identified. On the other hand, miRNA 182 5p, 6806 3p, 4278, 3975, and 4655 were identified as high probability or with proven interactions in previous microRNA – gene interaction experiments. Table 1 summarizes the key miRNAs with predicted downstream interactions identified that were significant in building ML model to differentiate SNHL from CHL.
Table 1

Critical miRNA identified and predicted downstream gene expression targets for sensorineural hearing loss vs conductive hearing loss.

miRNA 182Entrez Gene NameLocationFamilymiRNA - mRNA Match
ADCY6Adenylate Cyclase 6Plasma MembraneenzymeProven
IGF1RInsulin Like Growth Factor 1 ReceptorPlasma Membranetransmembrane receptorProven
MITFMelanogenesis Associated Transcription FactorNucleustranscription regulatorProven
MTDHMetadherinCytoplasmtranscription regulatorProven
PPARAPeroxisome Proliferator Activated Receptor AlphaNucleusligand-dependent nuclear receptorProven
RARGRetinoic Acid Receptor GammaNucleusligand-dependent nuclear receptorProven
TP53Tumor Protein p53Nucleustranscription regulatorProven
miRNA 4278
ARCN1archain 1CytoplasmotherHigh Probability
CASC4cancer susceptibility 4CytoplasmotherHigh Probability
CD81CD81 moleculePlasma MembraneotherHigh Probability
EIF4A1eukaryotic translation initiation factor 4A1Cytoplasmtranslation regulatorHigh Probability
FOSBFosB proto-oncogene, AP-1 transcription factor subunitNucleustranscription regulatorHigh Probability
GDI1GDP dissociation inhibitor 1CytoplasmotherHigh Probability
HSP90B1heat shock protein 90 beta family member 1CytoplasmotherHigh Probability
KAT2Blysine acetyltransferase 2BNucleustranscription regulatorHigh Probability
NT5DC35′-nucleotidase domain containing 3OtherotherHigh Probability
RHOCras homolog family member CPlasma MembraneenzymeHigh Probability
SRSF2serine and arginine rich splicing factor 2Nucleustranscription regulatorHigh Probability
STOMstomatinPlasma MembraneotherHigh Probability
THRAthyroid hormone receptor, alphaNucleusligand-dependent nuclear receptorHigh Probability
miRNA 3975
ACTG1actin gamma 1CytoplasmotherHigh Probability
BSGbasigin (Ok blood group)Plasma MembranetransporterHigh Probability
DCTN5dynactin subunit 5CytoplasmotherHigh Probability
EPS15epidermal growth factor receptor pathway substrate 15CytoplasmotherHigh Probability
GJB2gap junction protein beta 2Plasma MembranetransporterHigh Probability
IFI6interferon alpha inducible protein 6CytoplasmotherHigh Probability
PPP1R1Bprotein phosphatase 1 regulatory inhibitor subunit 1BCytoplasmphosphataseHigh Probability
SC5Dsterol-C5-desaturaseCytoplasmenzymeHigh Probability
miRNA 4655
CDC42cell division cycle 42CytoplasmenzymeHigh Probability
FAM126Afamily with sequence similarity 126 member ACytoplasmotherHigh Probability
FOXO1forkhead box O1Nucleustranscription regulatorHigh Probability
GPX1glutathione peroxidase 1CytoplasmenzymeHigh Probability
NFIXnuclear factor I XNucleustranscription regulatorHigh Probability
SERPINF1serpin family F member 1Extracellular SpaceotherHigh Probability
THRAthyroid hormone receptor, alphaNucleusligand-dependent nuclear receptorHigh Probability
UBE3Aubiquitin protein ligase E3ANucleusenzymeHigh Probability
XRCC6X-ray repair cross complementing 6NucleusenzymeHigh Probability
Critical miRNA identified and predicted downstream gene expression targets for sensorineural hearing loss vs conductive hearing loss.

Severity of sensorineural hearing loss

A ML model was constructed to differentiate between cochlear implant candidates with different degrees of sensorineural hearing loss. Groups were categorized as either having residual hearing (cochlear implantation patients with PTA < 80 dB) or no residual hearing (cochlear implant patients with PTA > 80 dB). All four ML models, which included: decision forest, decision jungle, logistic regression, and neural networks, were able to differentiate between cochlear implant candidates with and without residual hearing with 100% accuracy (Fig. 2B). To cross validate these models, we used a leave-one-out cross validation approach. The misclassification error for each model is: decision forest, 0%; logistic regression, 8.33%; decision jungle, 25%; and neural network, 41.67%. In order to understand how the model was built, we applied the permutation feature of importance to the ML models. The most heavily weighted miRNAs used to construct the ML model included human miRNA 184, miRNA 660, miRNA let 7a 5p, miRNA 3142, and miRNA 335. The miRNAs were considered significant if they were used in the construction of two or more ML models. The miRNAs identified were compared to gene expression in the inner ear, and miRNA function and potential relationship degree of SNHL were evaluated using IPA software. Table 2 summarizes the key miRNA and predicted downstream targets which included miRNA 184, miRNA 660, and miRNA let 7a 5p. No significant known interactions were predicted using IPA software for miRNA 3142 and miRNA 335.
Figure 2

Representative evaluation and scoring maps for machine learning (ML) training models and testing set. X axis represents the actual diagnosis while the Y axis represents the predicted diagnostic class based on the testing set. (A) Decision forest and decision jungle ML model built to diagnose sensorineural hearing loss (SNHL) compared to conductive hearing loss (CHL) using a 70/30 split. Decision forest is able to diagnose with 100% accuracy while decision jungle is able to diagnose with 80% accuracy. (B) Decision forest, decision jungle, logistic regression, and neural networks ML models built to diagnose SNHL with and without residual hearing. All four ML models were able to diagnose with 100% accuracy.

Table 2

Critical miRNA identified and predicted downstream gene expression targets for sensorineural hearing loss with and without residual hearing.

miRNA -184Entrez Gene NameLocationFamilymiRNA - mRNA Match
AKT2AKT serine/threonine kinase 2CytoplasmKinaseProven
NFATC2nuclear factor of activated T cells 2Nucleustranscription regulatorProven
ADD1adducin 1CytoplasmOtherHigh Probability
AGRNagrinPlasma MembraneOtherHigh Probability
APLNapelinExtracellular SpaceOtherHigh Probability
BCL2L1BCL2 like 1CytoplasmOtherHigh Probability
CAMK2Bcalcium/calmodulin dependent protein kinase II betaCytoplasmkinaseHigh Probability
CDC25Acell division cycle 25ANucleusphosphataseHigh Probability
CSF1colony stimulating factor 1Extracellular SpacecytokineHigh Probability
CYCScytochrome c, somaticCytoplasmtransporterHigh Probability
GRIN1glutamate ionotropic receptor NMDA type subunit 1Plasma Membraneion channelHigh Probability
HTR1A5-hydroxytryptamine receptor 1APlasma MembraneG-protein coupled receptorHigh Probability
ID1inhibitor of DNA binding 1, HLH proteinNucleustranscription regulatorHigh Probability
IL15RAinterleukin 15 receptor subunit alphaPlasma Membranetransmembrane receptorHigh Probability
IL7Rinterleukin 7 receptorPlasma Membranetransmembrane receptorHigh Probability
IQSEC. 3IQ motif and Sec. 7 domain 3CytoplasmotherHigh Probability
LASP1LIM and SH3 protein 1CytoplasmtransporterHigh Probability
LRRC8Aleucine rich repeat containing 8 VRAC subunit APlasma Membraneion channelHigh Probability
LYNX1Ly6/neurotoxin 1Plasma MembranetransporterHigh Probability
NFATC2nuclear factor of activated T cells 2Nucleustranscription regulatorHigh Probability
NFATC2IPnuclear factor of activated T cells 2 interacting proteinNucleusotherHigh Probability
PACSIN1protein kinase C and casein kinase substrate in neurons 1CytoplasmkinaseHigh Probability
PDGFBplatelet derived growth factor subunit BExtracellular Spacegrowth factorHigh Probability
PGA5 (includes others)pepsinogen 3, group I (pepsinogen A)Extracellular SpacepeptidaseHigh Probability
PIGQphosphatidylinositol glycan anchor biosynthesis class QCytoplasmenzymeHigh Probability
PLPP3phospholipid phosphatase 3Plasma MembranephosphataseHigh Probability
PLTPphospholipid transfer proteinExtracellular SpaceenzymeHigh Probability
S100A7AS100 calcium binding protein A7ACytoplasmotherHigh Probability
SIRPAsignal regulatory protein alphaPlasma MembranephosphataseHigh Probability
SLASrc like adaptorPlasma MembraneotherHigh Probability
SRCSRC proto-oncogene, non-receptor tyrosine kinaseCytoplasmkinaseHigh Probability
TNK2tyrosine kinase non receptor 2CytoplasmkinaseHigh Probability
TSPEARthrombospondin type laminin G domain and EAR repeatsExtracellular SpaceotherHigh Probability
VAMP1vesicle associated membrane protein 1CytoplasmtransporterHigh Probability
miRNA-660
AOC3Amine Oxidase, Copper Containing 3Plasma MembraneenzymeHigh Probability
APOBEC3FApolipoprotein B MRNA Editing Enzyme Catalytic Subunit 3FNucleusDeaminaseHigh Probability
CDH13Cadherin 13Plasma MembraneotherHigh Probability
DNTTDNA NucleotidylexotransferaseNucleusDNA polymeraseHigh Probability
EXOC1Exocyst Complex Component 1CytoplasmTransporterHigh Probability
FOLH1Folate Hydrolase 1Plasma MembraneenzymeHigh Probability
GRIN2BGlutamate Ionotropic Receptor NMDA Type Subunit 2BPlasma Membranetransmembrane receptorHigh Probability
HIF1AHypoxia Inducible Factor 1 Subunit AlphaNucleustranscription regulatorHigh Probability
KCNJ2Potassium Voltage-Gated Channel Subfamily J Member 2plasma Membraneion channelHigh Probability
NRCAMNeuronal Cell Adhesion MoleculePlasma MembraneotherHigh Probability
QPRTQuinolinate PhosphoribosyltransferaseCytoplasmenzymeHigh Probability
SAA1Serum Amyloid A1CytoskeletonotherHigh Probability
VDAC1Voltage Dependent Anion Channel 1Plasma Membraneion channelHigh Probability
Let 7a 5p
FAM105Afamily with sequence similarity 105 member AOtherotherProven
FAM96Afamily with sequence similarity 96 member AExtracellular SpaceotherProven
GRPEL2GrpE like 2, mitochondrialCytoplasmotherProven
KCNJ16potassium voltage-gated channel subfamily J member 16Plasma Membraneion channelProven
MARS2methionyl-tRNA synthetase 2, mitochondrialCytoplasmenzymeProven
MIR4500microRNA 4500CytoplasmmicroRNAProven
SLC1A4solute carrier family 1 member 4Plasma MembranetransporterProven
SLC38A1solute carrier family 38 member 1Plasma MembranetransporterProven
SMOXspermine oxidaseCytoplasmenzymeProven
SYPL1synaptophysin like 1Plasma MembranetransporterProven
ADAMTS8ADAM metallopeptidase with thrombospondin type 1 motif 8Extracellular SpacepeptidaseHigh Probability
AGBL2ATP/GTP binding protein like 2CytoplasmenzymeHigh Probability
DSCR8Down syndrome critical region 8 (non-protein coding)OtherotherHigh Probability
FRMD4BFERM domain containing 4BCytoplasmotherHigh Probability
INTS6Lintegrator complex subunit 6 likeNucleusotherHigh Probability
MIR4500microRNA 4500CytoplasmmicroRNAHigh Probability
PQLC2PQ loop repeat containing 2CytoplasmtransporterHigh Probability
SDR42E1short chain dehydrogenase/reductase family 42E, member 1OtherenzymeHigh Probability
SLC35D2solute carrier family 35 member D2CytoplasmtransporterHigh Probability
TMEM211transmembrane protein 211OtherotherHigh Probability
TTLL4tubulin tyrosine ligase like 4CytoplasmenzymeHigh Probability
Representative evaluation and scoring maps for machine learning (ML) training models and testing set. X axis represents the actual diagnosis while the Y axis represents the predicted diagnostic class based on the testing set. (A) Decision forest and decision jungle ML model built to diagnose sensorineural hearing loss (SNHL) compared to conductive hearing loss (CHL) using a 70/30 split. Decision forest is able to diagnose with 100% accuracy while decision jungle is able to diagnose with 80% accuracy. (B) Decision forest, decision jungle, logistic regression, and neural networks ML models built to diagnose SNHL with and without residual hearing. All four ML models were able to diagnose with 100% accuracy. Critical miRNA identified and predicted downstream gene expression targets for sensorineural hearing loss with and without residual hearing. A regression ML model was constructed using pure tone average (PTA) in decibels to further study the relationship between miRNA and severity of SNHL. The root mean squared error for our best model was 21.26 dB, allowing us to predict hearing loss with an expected error in the 21 dB range. We observed a decrease in sensitivity of the model when comparing directly with PTA as a continuous variable, compared to categorical characterization using 80 dB as a cut off. The decreased accuracy is consistent with the heterogenous nature of hearing loss within our patient cohort.

Discussion

MiRNAs were initially discovered in 1993 and have since been shown to play an essential role in post translational regulation of gene expression through messenger RNA (mRNA) degradation and splicing[3,22]. Because miRNA play a critical role in cell gene expression, miRNA activity has been increasingly recognized as a critical component to many disease states making them a promising biomarker[5,23]. Interestingly, miRNA profiles have shown unique expression patterns to different diseases states, from heart failure[24] to individual cancers[7] (colon[25-27], ovarian[28], and clear cell carcinoma[29]), neurodegenerative diseases[4], different cell types[30], and play an integral role embryonic development[8]. Pertinent to hearing loss, miRNA have shown to be a promising biomarker and diagnostic marker for otherwise difficult to diagnose neurodegenerative diseases such as Alzheimer’s and Parkinsons[4,5,31]. Additionally, miRNA have shown to play a critical role in inner ear development, demonstrate tissue and site specific expression, and may exhibit expression patterns specific to SNHL[10,32-34]. Taken together, miRNA expression profiling may serve as a promising diagnostic and prognostic tool for the inner ear. We sought to see if ML could differentiate between distinct inner ear miRNA expression profiles specific to different active inner ear disease states. To date, we have no biopsy equivalent for inner ear disease and have no objective insight into what is occurring on a molecular level in patients with active inner ear disease. The knowledge we have assimilated to date is based on objective testing and the study of various otologic and neurotologic pathologies using temporal bone pathology and animal models. Here, we have shown that various inner ear diseases may demonstrate specific and unique miRNA expression profiles in the very simple model of presence or absence of sensorineural loss, and a model differentiating varying severity of sensorineural hearing loss. Utilizing the unique expression profiles, we can use machine learning to build algorithms to differentiate between different degrees of sensorineural hearing loss with high accuracy, opening the door to further refining the application of this methodology. Artificial intelligence (AI) and its subdiscipline machine learning (ML) have seamlessly integrated themselves into our modern-day culture, but they have been slow to assimilate into health care. ML enables one to analyze large amounts of data, understand pattern recognition, and make predictions using what it has learned. ML borrows from many subdisciplines including statistics and computer science. However while statistics primarily focuses on inferences related to causation and examines how a system of components relate to one another, ML is novel in that it makes predictions based on large sets of data and experiences[35]. We can supply an un-analyzed perilymph miRNA profile into our ML algorithms and subsequently make accurate predictions to discern between different degrees of sensorineural hearing loss. By analyzing the function of miRNAs identified by this process we may be able to indirectly identify the molecular pathways involved in different inner ear diseases. ML is optimally used when applied to large and complex datasets, such as gene expression profiles, where it can analyze various patterns to make predications without being specifically programmed to do so. Using the permutation feature of importance we can gain significant insight into how the different decision algorithms are built and how certain factors are weighted (Tables 1 and 2). ML and permutation feature of importance also offers a novel way to analyze miRNA that may be playing a critical role in various inner ear pathologies. For example, looking at SNHL vs CHL, we inputted the critical miRNA identified (4278, 3975, 4655) into the IPA software, and isolated downstream targets that have been experimentally proven. We then analyzed downstream targets that overlap with two or more of the critical miRNA identified. Of interest we identified KCNJ10, HCN, and Otoferlin. All three genes have been experimentally proven with the individual miRNAs and shown to play critical roles in SNHL on a molecular level. KCNJ10 has been shown within the stria vascularis to have a critical role in generating endocochlear potentials[36]. Hyperpolarized-activated cyclic nucleotide cation (HCN) channels have been shown to play an essential role within the spiral ganglia and propagating post synaptic potentials[37]. Otoferlin has been shown to play a critical role in Ca2+ evoked vesicular exocytosis within inner hair cells[38]. Similarly, miRNA Let 7 family was critical in differentiating CI patients with and without residual hearing. Investigators have experimentally shown how Let 7 miRNA family plays a critical role in RAS signaling[39], and along similar lines RAS/MAPK pathway is crucial in inner hair survival following noise induced hearing loss[40]. The unique miRNA profiles identified through ML can offer significant understanding to aberrant disease mechanisms on a molecular level and also potentially identify site specific pathology. Furthermore, using SNHL with and without residual as a categorical differentiation we are able to construct ML models with nearly 100% accuracy. However, when analyzing PTA as a continuous variable we lose some accuracy with a root mean squared error of 21.26 dB, allowing us to predict +/−21 dB. While these results do raise some inconsistency issues using miRNAs as a direct measure to varying degrees of SNHL, they are not surprising given the heterogeneity of hearing loss within our patient cohorts. PTA is an average hearing loss across multiple frequencies. While patients may have similar PTAs, they can have significantly different patterns and frequencies affected. Secondly patients have different underlying disease mechanisms leading to hearing loss, such as presbycusis versus early onset genetic hearing loss. Taking both these factors into account, the decreased accuracy using PTA as a continuous variable is not surprising. As we continue to grow our patient perilymph database and homogenize our SNHL patients with similar patterns and frequencies, we would expect an improvement within our regression ML model. Despite the increasing recognition of the unique role miRNAs play in development and various disease states, identifying and validating miRNA target genes has been a great challenge[41,42]. Computational analysis on miRNAs and regulation of gene expression are based on aligning and predicting 5′ miRNA and 3′ complimentary known mRNA sequences. While this methodology is the mainstay of miRNA discovery and gene expression analysis, it does have its limitations[42,43]. As we continue to grow our miRNA perilymph database, key and critical miRNA identified will need to be validated through miRNA/mRNA target validation, co expression, and ultimately their regulatory effect on gene expression through luciferase and other validated expression assays[41]. Validation experiments will be critical in moving forward, particularly as the key miRNA unique to different inner ear disease states are pursued as potential drug therapy targets. While the ML methodologies employed herein offer an exciting prospect, we must acknowledge several limitations. Definitive conclusions are limited by our small sample size of 16 patients. As we continue to grow our database and include a wider range of pathologies, we do not expect to maintain 100% accuracy in our predictions; however, we do expect the sensitivity of our algorithms to improve. ML algorithms are adaptive, therefore, as we continue to recruit additional patients, the ML algorithms will continue to learn from these new “experiences” and adapt its output accordingly. However, our first goal was to demonstrate that inner ear miRNA expression profiling may be unique to disease specific inner ear pathologies. Using ML, we were able to successfully delineate these unique miRNA expression profiles and use them to predict ongoing inner ear pathologies in patients in real time, a methodology not previously possible on a molecular level. Finally, one of the major limitations of ML is the “black box” that are the predictive algorithms. We can control the data that goes in and the desired predictions we wish to build, but we have limited insight into the exact mechanics behind each algorithm[44]. One of the shortcomings of ML as compared to traditional statistical methodologies is that we cannot put any significantly meaningful inferences behind how the model is built. There are no confidence intervals or odds ratios equivalent for ML. The only meaningful insight we have into the algorithm is using the different analytical features such as permutation feature of importance to assess the contribution of each miRNA to the constructed model. While we can understand which limited number of the miRNA being used and at what weighted fraction, unfortunately we do not know if its upregulation or downregulation of a pathway for each specific miRNA. There are many ongoing efforts to understand and unlock this “black box” but unfortunately we still have yet to find a definitive solution[45].

Conclusions

MiRNAs’ unique role and expression profiles are well recognized to play an integral role in different disease states from cancer to neurodegenerative states making them an exciting biomarker. MiRNAs are well established to play a critical role in inner ear development and some preliminary investigations have shown its potential role in SNHL. In this study, we demonstrate using ML we can delineate the miRNA expression profile unique to different inner ear pathologies. This methodology not only provides an understanding of inner ear pathology on a molecular level, but may also offer a novel method to diagnose and prognose patients with active inner ear disease. Similar to how a patient undergoes a lumbar puncture to collect CSF for meningitis, one could theoretically undergo a “round window tap” to diagnose, prognose, and monitor therapeutic interventions for Meniere’s disease or SNHL. While our methodology may offer an exciting prospect for stratification of patients with inner ear disease, multiple safety and validation studies will need to be performed. Additional patient recruitment will be needed to continue to grow our perilymph miRNA database to improve the predictive algorithms and its potential ability to differentiate various inner ear pathologies.

Methods

Human perilymph sampling was approved by the University of Kansas Human Studies Committee and Institutional Review Board (IRB). All experiments performed were in accordance with relevant guidelines and regulations approved by the University of Kansas IRB. Patients were recruited if they were undergoing a surgical procedure in which the inner ear was opened, and informed consent was obtained prior to perilymph collection. Procedures in which patients were recruited included patients undergoing stapedectomy for otosclerosis or cochlear implantation for sensorineural hearing loss (SNHL). All patients received standard of care and underwent standard surgical treatment for either stapedectomy or cochlear implantation. Only difference is when patient’s inner ear was opened for their indicated procedure, a small sterile glass capillary tube was used to collect approximately 2–5 µL volume of perilymph[9].

Sampling for stapedectomy

The skin of the external auditory canal was injected with 0.5 ml of 1% lidocaine + 1:100,000 epinephrine. Using a round knife, a cut was made in the skin of the external canal and the dependent middle flap was carefully elevated medially revealing the middle ear structures. Using the Omniguide™ CO2 laser with a power setting a four watts, 0.1 second single bursts, the stapes superstructures were removed. Using the laser, a rosette fenestration was made in the stapes footplate. Upon making the fenestration, perilymph could be seen coming out laterally from the vestibula. We then removed excess perilymph with a sterile glass capillary tube. After successful collection of perilymph fluid, the stapes footplate fenestration was enlarged to accommodate the stapes prosthesis. The prosthesis was then hooked around the incus and the surgery was completed in a standard fashion.

Sampling for cochlear implantation

Through a post auricular incision, a mastoidectomy and facial recess exposure of the round window were completed in a standard fashion. The wound was irrigated with antibiotic solution. The oval and round windows were identified, and the bony overhang of the round window niche was removed with a 1 mm micro drill. Using an angled pick, the round window was opened. At this point there was free flow of excess perilymph out of the cochlea which was sampled using a sterile glass capillary tube. The implant electrode was then inserted into the cochlea.

microRNA analysis

The perilymph was collected as described above. Total RNA was extracted with Trizol reagent (Thermofisher, cat #15596018) and purified by centrifuging with phase lock heavy gel (Tiagen, cat # WMS-2302830). RNA was analysed using The Agilent RNA6000 Pico kit using an Agilent Bioanalyzer 2100 yielding on average 0.5–2 ng of total RNA per sample. Samples were processed and analysed with an Affymetrix miRNA 4.0 array to determine the presence of micro RNAs. The Affymetrix miRNA 4.0 array interrogates all miRNA sequences listed in miRBase Release 20; interrogating 30,434 mature miRNAs from 203 organisms of which 2,578 are from humans. The arrays were background corrected, normalized and gene-level summarized using the Robust Multichip Average (RMA) algorithm[9]. This normalization step makes inferences on miRNA expression across conditions possible. In order to ascertain which miRNAs were significantly expressed in each array, for each miRNA probe in the array, a detection p-value was computed based on a Wilcoxon Rank-Sum test of the miRNA probe set signals compared to the distribution of a GC matched background signal comprising of anti-genomic probes in the same array. The detection p-values were adjusted for multiple hypothesis testing (FDR) using the Benjamini and Hochberg method. These analyses were performed using Affymetrix Expression Console Software. miRNAs with a normalized log2 signal intensity ≥7 and an adjusted detection p-value (FDR) ≤ 0.05 were considered significantly expressed in the assay and were considered in downstream ML model development.

Data analysis

Patient selection and audiometry

All patients underwent pure tone audiometry prior to their respective procedures. Air and bone conduction thresholds were determined, and a CT scan of the temporal bone was performed for all patients with diagnosis of otosclerosis. Only patients with conductive hearing loss (CHL) and radiologically confirmed otosclerosis were included in the CHL control group (n = 4). For patients that met audiologic criteria for cochlear implantation, the frequency pure tone average (PTA) was determined. A PTA of 70 dB was used as a limit identifying patients with residual hearing (PTA < 80 dB) (n = 5) and no residual hearing (PTA > 80 dB) (n = 7) (Fig. 1).

Machine learning analysis

In order to analyse the miRNA dataset unique to various inner ear pathologies, we constructed a supervised machine learning classification model using the opensource Azure Machine Learning Studio (Microsoft Corporation). Data was uploaded form the Affymetrix miRNA 4.0 array, and formatted for Azure ML. The log2 transformed signal intensities were used to construct the ML models. The data was randomly split 70/30 into training and testing sets. ML models were built using the training data and subsequently tested on the remaining 30% of data comprising the testing set. We considered multiple multiclass decision models, including: multiclass decision forest (minimum samples per lead node 1; number of random splits 128; maximum depth of decision tree 32; number of decision trees 8), multiclass decision jungle (number of optimization steps per decision DAG layer 2048; maximum width of decision DAGs 128; maximum depth of the decision DAGs 32; number of decision DAGs 8), multiclass logistic regression, and multiclass neural network. Boostrap aggregation was built into model generation for each of the fitted models. “Tune model hyperparameters”, an option within Azure ML software, was used to apply an entire grid wide sweep and determine the optimum parameters settings. After the model was built and tuned, it was then scored and evaluated using the testing set data (Figs 2 and 3). A leave-one-out cross validation approach was subsequently used to better evaluate each model’s performance. To represent the results of the cross validation, the leave-one-out misclassification error rate was determined for each model as the ratio of the number of misclassified samples to the total number of samples. A permutation-based approach was applied to assess feature importance within each multiclass trained decision model in order to analyse which miRNAs had the most influence, along with their corresponding weight in construction of the respective trained model. The metric used in the permutation feature analysis was accuracy, and permutation feature importance scores are determined as: permutation feature importance score = base model accuracy − model accuracy after shuffling a given feature. Thus, a high permutation feature importance score is indicative of a feature with a large influence on the model’s accuracy. We compared the ability of the models to distinguish between patients with CHL and SNHL (stapedectomy vs. cochlear implant patients) and within the cochlear implant group evaluated the ability of the models to distinguish between patients with and without residual hearing.
Figure 3

Representative machine learning (ML) experiment mapped for sensorineural hearing loss with and without residual hearing using Azure Machine Learning Studio (Microsoft Corporation).

Representative machine learning (ML) experiment mapped for sensorineural hearing loss with and without residual hearing using Azure Machine Learning Studio (Microsoft Corporation).

Evaluation of permutation features

Data analysis was performed along two fronts to understand pathogenesis and look for patterns of expression unique to different disease types. miRNAs identified as significantly influencing the model development were identified. Using Ingenuity Pathway Analysis (IPA) software (Qiagen Bioinformatics) these miRNAs were analysed alongside a human inner ear cDNA library. Known and highly predicted miRNA cochlear mRNA interactions were identified and mapped out to delineate and understand the different regulatory pathways.
  7 in total

1.  Establishment of a prediction tool for ocular trauma patients with machine learning algorithm.

Authors:  Seungkwon Choi; Jungyul Park; Sungwho Park; Iksoo Byon; Hee-Young Choi
Journal:  Int J Ophthalmol       Date:  2021-12-18       Impact factor: 1.779

2.  Objective Assessment System for Hearing Prediction Based on Stimulus-Frequency Otoacoustic Emissions.

Authors:  Qin Gong; Yin Liu; Runyi Xu; Dong Liang; Zewen Peng; Honghao Yang
Journal:  Trends Hear       Date:  2021 Jan-Dec       Impact factor: 3.293

3.  Proteome profile of patients with excellent and poor speech intelligibility after cochlear implantation: Can perilymph proteins predict performance?

Authors:  Martin Durisin; Caroline Krüger; Andreas Pich; Athanasia Warnecke; Melanie Steffens; Carsten Zeilinger; Thomas Lenarz; Nils Prenzler; Heike Schmitt
Journal:  PLoS One       Date:  2022-03-03       Impact factor: 3.240

4.  Isolation of sensory hair cell specific exosomes in human perilymph.

Authors:  Pei Zhuang; Suiching Phung; Athanasia Warnecke; Alexandra Arambula; Madeleine St Peter; Mei He; Hinrich Staecker
Journal:  Neurosci Lett       Date:  2021-10-04       Impact factor: 3.197

Review 5.  Precision machine learning to understand micro-RNA regulation in neurodegenerative diseases.

Authors:  Lucile Mégret; Cloé Mendoza; Maialen Arrieta Lobo; Emmanuel Brouillet; Thi-Thanh-Yen Nguyen; Olivier Bouaziz; Antoine Chambaz; Christian Néri
Journal:  Front Mol Neurosci       Date:  2022-09-09       Impact factor: 6.261

6.  Distinct MicroRNA Profiles in the Perilymph and Serum of Patients With Menière's Disease.

Authors:  Matthew Shew; Helena Wichova; Madeleine St Peter; Athanasia Warnecke; Hinrich Staecker
Journal:  Front Neurol       Date:  2021-06-16       Impact factor: 4.003

Review 7.  Extracellular Biomarkers of Inner Ear Disease and Their Potential for Point-of-Care Diagnostics.

Authors:  Sahar Sadat Mahshid; Aliaa Monir Higazi; Jacqueline Michelle Ogier; Alain Dabdoub
Journal:  Adv Sci (Weinh)       Date:  2021-12-26       Impact factor: 16.806

  7 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.