Literature DB >> 27635401

BP Neural Network Could Help Improve Pre-miRNA Identification in Various Species.

Limin Jiang¹, Jingjun Zhang², Ping Xuan³, Quan Zou⁴.

Abstract

MicroRNAs (miRNAs) are a set of short (21-24 nt) noncoding RNAs that play significant regulatory roles in cells. In the past few years, research on miRNA-related problems has become a hot field of bioinformatics because of miRNAs' essential biological function. miRNA-related bioinformatics analysis is beneficial in several aspects, including the functions of miRNAs and other genes, the regulatory network between miRNAs and their target mRNAs, and even biological evolution. Distinguishing miRNA precursors from other hairpin-like sequences is important and is an essential procedure in detecting novel microRNAs. In this study, we employed backpropagation (BP) neural network together with 98-dimensional novel features for microRNA precursor identification. Results show that the precision and recall of our method are 95.53% and 96.67%, respectively. Results further demonstrate that the total prediction accuracy of our method is nearly 13.17% greater than the state-of-the-art microRNA precursor prediction software tools.

Entities: Chemical Disease Gene Species

Mesh：

Substances：
MicroRNAs
RNA Precursors

Year: 2016 PMID： 27635401 PMCID： PMC5011242 DOI： 10.1155/2016/9565689

Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411

1. Introduction

MicroRNAs are some of the most important noncoding RNA genes with rather short length. They regulate the expression of whole organism genes at the posttranscriptional level [1]. miRNA is widely involved in the metabolic activity of the body as well as in many important life processes, including cell proliferation and apoptosis, cell differentiation, growth and development of plants and animals, and organ formation [2-4]. Recently, several studies have shown that microRNAs are related to several cancers [5-7] and other diseases [8-10]. Caligiuri et al. [11] proposed that methods and compositions involving miRNAs are useful for the treatment of various diseases and cancers. Some miRNAs are regarded as potential therapeutic targets for various diseases [12]. Recently, the target gene (cancer gene) drugs, which developed in accordance with the theory on miRNA's gene silencing, have been used for incurable disease that has become a threat to human health problems for years [13]. In addition, the viral genome can encode a large number of miRNAs by itself. Through combination with target genes and coding by viruses or host cell, these miRNAs can lead to immune escape or antiviral effect against the host cell. Therefore, the accurate prediction of miRNA and its target genes, as well as the correct understanding of miRNA mechanism, has important practical significance in medical treatments. Thus, the research on novel miRNA identification is rather essential. Feature selection mainly dominated the performance of the prediction model in the machine learning process [14-20]. In addition, effective features can represent the characteristics of the entire sequence data, which enables easy-to-build better prediction model. To represent the microRNA precursors, Xue et al. [21] proposed 32D novel triplet features, which involved secondary structure information. Jiang et al. [22] found that random rearrangement of the sequence could help obtain significant free-energy features. However, the free-energy computation for many random rearrangement sequences is very time consuming. Wei et al. [23] combined Xue et al.'s features and triplet nucleotide frequency to 98D features and obtained good performance result in human pre-miRNA identification. However, more features would not mean better performance because of some irrelevant and redundant features in the high dimensional or ultra-high dimensional feature set. The purpose of feature selection is to eliminate the irrelevant and redundant features of the feature set. In addition, the training time could be reduced effectively by the feature selection optimization [24]. Some studies focus on developing computational predictors by incorporating the sequence-order or structure-order effects [25, 26]. Several works indicated that proper features could improve the prediction performance of classification in a certain extent. For example, Wang et al. [27] employed the feature selection techniques to optimize the features in miR-SF. They proved that an optimized feature subset could improve the prediction performance. In addition, some popular recently proposed multiobjective optimization evolutionary algorithms can also be used as a possibly promising feature selection approach [28, 29]. Another factor that affects the performance of machine learning prediction method is the classifier algorithm. The selection of different classifiers often leads to the difference of classification results. Several different classifiers and strategies were employed for miRNA identification. Bayesian classifier algorithm was tested for predicting miRNA across different species in 2006 [30]. The method also utilized the multiple species of miRNA sequences and structural features. It proved that miRNA genes could be detected effectively in large scale of different species genomes. MiPred classifier was tested for predicting miRNA in 2007 [22]. The method utilized random forest classifier algorithm. The prediction accuracy of MiPred is 10% higher than that of Triplet-SVM; the sensitivity and specificity of MiPred can reach to 95.09% and 98.21%. CSHMM classifier was also used for mining miRNA sequences from the genome [31], which utilized the Markov model. Overall, the accuracy of machine learning algorithm was up to 90%. The machine learning method is more accurate than the other methods. In this study, we chose backpropagation neural network as the classifier. It has three advantages, including better generalization performance, faster learning speed, and good learning ability.

2. miRNA Identification with BP Neural Network

2.1. Pre-miRNA Features

2.1.1. n-Gram Frequency

Some studies showed that the local primary sequence is crucial to the pre-miRNA sequence [32]. Thus, the n-gram frequency is often applied for the feature map in the selection of the primary sequence feature [33, 34]. However, no good methods are still available for tuning the value of n. In general, we choose n by comparing the effect of n-gram frequency with different n-values. In our feature set, we select the different n values (n = 2,3, 4) for comparison. The different frequency characteristics have almost the same effect on the classifier. Thus, consider that its base and adjacent base have practical biological significance. We chose n as 3. A total of 64 (64 = 43)-dimensional frequency features were calculated.

2.1.2. Triple Structure Sequence

In addition to high specificity of the primary sequence features, the secondary structure sequence of pre-miRNA is also a contributing factor. To analyze the contribution of the secondary structure, the secondary structure prediction software RNAfold is used to calculate the potential structures. In the secondary structure, each nucleotide of the sequence corresponds to two states, matching and nonmatching: record matching as “(” or “)” and nonmatching as “·.” In the structure, three character groups are considered as a unit, and every “)” is replaced as “(.” Thus, 8 (8 = 23) different combinations are available as a unit, including “(((,” “((·,” “(·(,” “·((,” “(··,” “·(·,” “··(,” and “···.” To characterize pre-miRNA sequence better, the first nucleotide of the corresponding subsequence was added to the front of each structure unit. This provides 32 (32 = 4 × 8) different combinations, that is, “A(((,” “U((·,” …, “G·((,” “G···.” For a sequence, the occurrence frequency of each combination is determined and coded into the 32D feature vector as the input of the classifier. This calculated 32D triple structure sequence feature is used to train the SVM classifier; the inclusion of the SVM classifier significantly improved the classification ability of pre-miRNA sequences [21].

2.1.3. Energy Characteristics

The real pre-miRNA sequences are generally more stable and show a lower minimum of free energy (MFE) than the randomly generated pre-miRNA. Therefore, energy characterization is often used to describe the structure pre-miRNA sequence as an aspect of feature extraction of the pre-miRNA sequence. To do this, the MFE value is obtained by using RNAfold to calculate the structure.

2.1.4. Structural Diversity Characteristics

The potential for nucleotide pairing in the sequence is a significant characteristic that can also be used to describe the pre-miRNA sequence. This includes both traditional Watson-Crick nucleotide pairing (A–U pairing and C–G pairing) and also other forms of nucleotide pairing, such as the G–U pairing that can occur in the loop of RNA hairpin structures. We included possible G–U pairing in our description of base pairing. To summarize, we extracted 98 features for the input of the neural network, including 64-dimensional n-gram frequency characteristics, 32-dimensional triple structure sequence characteristics, one-dimensional energy feature, and one-dimensional structural diversity characteristics.

2.2. Fixing the Number of Nodes in the Hidden Layer

In general, to select the number of nodes in the hidden layer in changing the BP neural network structure is difficult. Technically, a hidden layer could facilitate operation. However, too many hidden layers can reduce the operation rate. Currently, no theoretical methods are available to fix the number of nodes in the hidden layer. However, the number generally depends on the empirical formula, as calculated in where M represents the neuron number of the hidden layers, N is the neuron number of the input layers, L is the neuron number of the output layers, and α is a constant between 1 and 10. In this study, N = 98 and L = 1. Therefore, (1) can be used for any values between 11 and 20. A comprehensive analysis of the training results with different numbers of nodes in the hidden layer was performed with the error set to 0.0001. A total of 621 samples were used to train the network, and one sample was used to test the network. The results are shown in Table 1.

Table 1

Corresponding training results with different numbers of nodes in the hidden layers.

Hidden layers	Training times	Training errors	Hidden layers	Training times	Training errors
11	43	9.57718e − 005	12	39	9.88418e − 005
13	17	9.42136e − 005	14	65	9.92537e − 005
15	34	9.88206e − 005	16	74	8.38658e − 005
17	48	7.82527e − 005	18	157	6.63468e − 005
19	7	9.46711e − 005	20	47	9.3627e − 005

From the data shown in Table 1, the increased number of nodes in the hidden layer did not result in better convergence. Additionally, the increased number of nodes increased the network parameters and greatly increased the amount of calculation of the classifier. Thus, keeping 13 nodes in the hidden layers required relatively less training times and less error and still produced relatively good training effects.

2.3. Fixing the Number of Nodes in the Output Layer

Two kinds of output exist, positive and negative, which are represented as 1 for a positive sample and 0 for a negative sample. The topology structure of this prediction method based on BP neural network is shown in Figure 1.

Figure 1

Topology structure of the BP neural network.

2.4. Selecting Training and Test Model Samples

The collection and organization of training samples are often limited by the objective conditions. Appropriate numbers of training samples are required to achieve sufficient precision. Therefore, it refers to the rule of experience:where P represents the numbers of training samples and P is the total of network connection weight equal to the sum of nodes of the input and hidden layers. In this study, 2236 samples were used for training. The data set used for the pre-miRNAs was downloaded from http://bioinf.sce.carleton.ca/SMIRP [35], and these data include negative and positive samples for Arabidopsis lyrata. The FASTA file was converted to ARFF file using a jar package written by Java converting the reference index to numerical form. We randomly selected real pre-miRNAs and pseudo pre-miRNAs to evaluate our algorithm.

2.5. Error Evaluation Steps Based on BP

The structure of the intelligent diagnosis model contains three layers of 98-13-1. First, we set the nodes of the input, output, and hidden layers as N, M, and L, respectively. Assuming the training sample set {ξ , Y} ⊂ R × R , the weight matrix between the input and hidden layers can be written as V = (v ), where V = (v , v ,…,v ) ∈ R and m = 1,2,…, M. We assume the connection weight matrix between the hidden and output layers as W = (w ), where W = (v , v ,…,v ) ∈ R , l = 1,2,…, L. Then, respectively, take g and f as the activation function of each node of the hidden and output layers. To simplify the derivation, we use the vector function G(X) for X = (x 1, x 2,…,x ) ∈ R , where G(X) = (g(x 1), g(x 2),…,g(x )) ∈ R . After input of the sample ξ ∈ R , the actual output can be calculated by The error function is defined in Objectively, the target of BP training is to compute the W and V to minimize the solution of the error function E(W, V). With this, a combination of gradient descent, common, and simple derivatives was used. To simplify the derivation process, we derive Then, the error function can be written as The corresponding gradient function of W and V can then be expressed as For arbitrary initial values of W 0 ∈ R and V 0 ∈ R , gradient descent rules to modify the weight of the BP learning algorithm are applied in where η represents the learning efficiency. ΔW is the partial derivative of the error function relative to W. ΔV is the partial derivative of the error function relative to V.

2.6. Selection of Training Functions and Related Parameters

The above analysis allows fixing of the BP neural network structure. Table 2 shows the chosen training functions and the relevant parameters.

Table 2

Basic parameters of the classifier based on BP neural network.

Setting items	The value set
The learning rate	0.1
Error bounds	0.0001
The number of iterations	1000
Transfer function of hidden layer nodes	Tansig
Transfer function of output nodes	Purelin
The training function	Trainlm

This condition allows establishment of a complete classifier based on BP neural network structure. The model generation and training are summarized in Figure 2.

Figure 2

Process flow of model generation and training.

2.7. Measurement

The use of pattern recognition and machine learning methods can be used as a two-way classification problem. Four kinds of prediction results are presented in Table 3.

Table 3

Measurements for the classification problems.

Classification result
Actual result	Forecast result
Actual result	P	N
P	TP	FN
N	FP	TN

The four kinds of prediction results are true positive (TP), the number of positive cases that were correctly predicted; false positive (FP), the number of positive cases represented by error prediction; true negative (TN), the number of counter negative examples that were correctly predicted; and false negative (FN), the number of negative cases represented by error prediction. Many evaluation indicators can be used for the classification results. First, the accuracy rate (ACC) is the ratio of the correctly predicted cases for the entire data set. Precision and recall can also be used as evaluation indicators in tests of pattern recognition models. Precision is expressed as the ratio of the correctly predicted values for the entire positive data set and recall reflects the number correctly judged as positive examples in the positive example test set [36]. The above three indicators are expressed in Additionally, sensitivity and specificity parameters may be used to evaluate the function of the model. Sensitivity record (SE) is the same as the recall and specificity record (SP) calculated in accordance with A challenge may be presented if the positive and negative test sets are unbalanced in the study of biological information. In most cases, the number of positive samples is far less than the number of negative samples. In a few cases, the number of positive samples may be much larger than the number of negative samples. We can easily obtain ACC-SP when the number of positive samples is greater than the negative samples. In this case, the classifier only reflects the classification effect of the negative samples and is unable to accurately express the prediction effect of the classifier on the entire test data set. To solve this problem, researchers typically use the geometric mean (Gm) as described in Matthew's correlation coefficient (MCC) [16, 21, 37, 38] can provide more equitable response forecast ability when a large difference exists between the number of positive samples and the number of negative samples. MCC can be expressed as Currently, studies on miRNA commonly use one or more of these above evaluation indices. In this work, we estimate the overall performance of the classifier by analysis of ACC, SE, SP, Gm, and MCC.

3. Results and Discussion

3.1. Analysis of Feature Set Performance

To select a better feature set for classification, we needed to determine the effect of different feature subsets on the performance of the classifier. To do this, we used the BP neural network method with the same training set (553 positive samples and 1150 samples) to test different feature sets, with the results shown in Table 4.

Table 4

Comparison of classification results based on different feature sets.

Features	SP (%)	SE (%)	Gm (%)	ACC (%)
B	67.89	68.25	68.07	68.00
C	92.74	76.42	84.19	88.03
A + B	91.79	90.41	91.10	91.31
A + C	94.03	80.85	87.19	89.67
B + C	96.12	85.21	90.50	92.49
A + B + C	96.33	86.51	91.29	93.42

Notes: A: energy feature and structural diversity; B: 32-dimensional triad structure characteristic; C: 64-dimensional n-gram frequency characteristics.

From Table 4, we learn that the accuracy of the entire feature sets can be as high as 93.42%. This result indicates that our feature set is more effective for processing of a more complex structure or sequence diversity. Considering that the feature sets used here are not very large and each feature subset is highly independent, reducing the dimension of the feature vector is no longer needed.

3.2. Performances of BP

V-fold cross-validation with moderate computational complexity is widely used for model selection. The selection of V is important because V not only determines the number of samples but also determines the computational complexity. Usually, a value of V between 5 and 10 is selected based on experience. Statistical performance shows little improvement when V selection is greater than 10. Again, computational complexity must be considered; thus a value between 5 and 10 is best [32]. We divided the samples into two cases for training and testing. In the first one, a large difference was observed between the number of positive and negative samples: 518 positive samples and 1078 negative samples as the training set and 166 positive samples and 366 negative samples for the test set. The second case included equal numbers of positive and negative samples: 552 positive samples and 552 negative samples as the training set and 138 positive samples and 138 negative samples for the test set. These training and testing were repeated five times. The testing performance is shown in Figures 3 and 4.

Figure 3

Different test results for varying sample quantities.

Figure 4

Different test results for same sample quantity.

From comparison of the data in Figures 3 and 4, no significant difference was observed between the actual output and the expected output of each test. As described above, the evaluation of the reference index is shown in Table 5.

Table 5

Evaluation of the reference index.

	Training sample	Test sample	Output sample	Correct sample	Precision (%)	Recall (%)	Gm (%)
D
Positive	553	138	128	124	96.0	90.0	93.43
Negative	1150	287	296	282	95.38	98.19	93.43

E
Positive	552	138	136	128	94.10	92.82	93.98
Negative	552	138	140	130	92.87	94.12	93.98

Note: D: sample set has different numbers of positive and negative samples; E: the sample set has equal numbers of positive and negative samples; correct sample: the number of correct predictions.

From the data presented in Table 5, the number of samples affects the accuracy and recall rate of the positive samples. In particular, the precision and recall rate of the negative samples decreased with the decrease in the number of negative samples in the training set. This result indicates that the more the samples in the training process, the better the classification effect of the classifier. At the same time, the precision and recall rate of the number of positive samples were affected. With the number of negative samples in the training set increased, the number of correct predictions increased by four and the number of error predictions was reduced by eight. This result shows that the precision and recall rate of the positive samples decreased with the increase in the number of the negative samples.

3.3. Comparison with Other Methods

The performance of our method was compared with other methods: J48, random forest, LibD3C [39], Adaboost, string kernel SVM [40], LibSVM, and GBDT, which were classified on the same data set. The data set contains 691 real pre-miRNAs and 1437 pseudo pre-miRNAs. As shown in Table 6 and Figure 5, the results demonstrate that the total prediction accuracy of our method is 13.64% greater than the string kernel SVM model and nearly 2% greater than the LibD3C and LibSVM models. The overall performance of the models as measured by MCC was in the following order: GBDT (0.8682), BP (0.8662), LibSVM (0.8510), LibD3C (0.8510), Adaboost (0.8120), random forest (0.7720), J48 (0.7200), and string kernel SVM (0.6002).

Table 6

Comparison of the BP with alternative models.

	ACC	Precision	Recall	MCC
BP	95.53%	96.00%	96.67%	0.8662
GBDT	94.27%	94.76%	96.87%	0.8682
LibSVM	93.52%	93.60%	93.50%	0.8510
LibD3C	93.52%	93.50%	93.50%	0.8510
Adaboost	91.82%	91.80%	91.80%	0.8120
Random forest	90.13%	90.00%	90.10%	0.7720
J48	87.78%	87.70%	87.80%	0.7200
String kernel SVM	81.89%	99.37%	46.31%	0.6002

Figure 5

Comparison results of different models.

Thus, we conclude that the BP method allows improved recognition accuracy.

3.4. Performance on Different Species

To demonstrate the validity and the universal applicability of the BP method, we analyzed six other species: Anolis carolinensis, Arabidopsis thaliana, Drosophila melanogaster, Drosophila pseudoobscura, Epstein-Barr virus, and Xenopus tropicalis. The results shown in Figure 6 indicate that the accuracy of the GBDT is better than BP method in some situations, but the BP method has been achieved fairly good results in terms of ACC, precision, recall, and MCC.

Figure 6

Test comparison results for six different species.

4. Conclusions

Identification of miRNAs is the first step toward understanding their biological characteristics. Many approaches have been proposed to predict pre-miRNAs in recent years. However, feature extraction in these methods can result in information redundancy. To overcome this drawback, a BP neural network algorithm together with optimal 98D features was employed for this analysis. We compare our method with the existing methods of J48, random forest, LibD3C, Adaboost, GBDT, string kernel SVM, and LibSVM, which were trained on the same training data set. The results demonstrate that the total prediction accuracy of our method is 13.17% greater than the string kernel SVM model and nearly 2% greater than LibD3C and LibSVM. After the identification step, functional analysis is also important for miRNA research. If human miRNA and diseases were focused on, two main approaches would be employed to predict the relationship. The first one is the statistical comparison analysis for the miRNA or isomiR expression [41]. The second one is the network analysis and prediction for miRNA-disease relationship [42-45]. Several advanced machine learning, network techniques, and bioinspired models can be utilized on this problem, including random forest [46], semisupervised learning [47], HeteSim Scores [48], spiking neural P systems [49-52], and membrane computing_ENREF_51 [53-57]. Functional analysis of the novel detected miRNAs would be our future works.

43 in total

1. Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences.

Authors: Eric Bonnet; Jan Wuyts; Pierre Rouzé; Yves Van de Peer
Journal: Bioinformatics Date: 2004-06-24 Impact factor: 6.937

2. ncRDeathDB: A comprehensive bioinformatics resource for deciphering network organization of the ncRNA-mediated cell death system.

Authors: Deng Wu; Yan Huang; Juanjuan Kang; Kongning Li; Xiaoman Bi; Ting Zhang; Nana Jin; Yongfei Hu; Puwen Tan; Lu Zhang; Ying Yi; Wenjun Shen; Jian Huang; Xiaobo Li; Xia Li; Jianzhen Xu; Dong Wang
Journal: Autophagy Date: 2015 Impact factor: 16.016

3. Combining multi-species genomic data for microRNA identification using a Naive Bayes classifier.

Authors: Malik Yousef; Michael Nebozhyn; Hagit Shatkay; Stathis Kanterakis; Louise C Showe; Michael K Showe
Journal: Bioinformatics Date: 2006-03-16 Impact factor: 6.937

Review 4. Briefing in family characteristics of microRNAs and their applications in cancer research.

Authors: Qicong Wang; Leyi Wei; Xinjun Guan; Yunfeng Wu; Quan Zou; ZhiLiang Ji
Journal: Biochim Biophys Acta Date: 2013-08-14

5. Representation of fluctuation features in pathological knee joint vibroarthrographic signals using kernel density modeling method.

Authors: Shanshan Yang; Suxian Cai; Fang Zheng; Yunfeng Wu; Kaizhi Liu; Meihong Wu; Quan Zou; Jian Chen
Journal: Med Eng Phys Date: 2014-08-03 Impact factor: 2.242

6. On some classes of sequential spiking neural p systems.

Authors: Xingyi Zhang; Xiangxiang Zeng; Bin Luo; Linqiang Pan
Journal: Neural Comput Date: 2014-02-20 Impact factor: 2.026

7. Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources.

Authors: Yuansheng Liu; Xiangxiang Zeng; Zengyou He; Quan Zou
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2016-04-05 Impact factor: 3.710

8. Prediction and Validation of Disease Genes Using HeteSim Scores.

Authors: Xiangxiang Zeng; Yuanlu Liao; Yuansheng Liu; Quan Zou
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2016-02-12 Impact factor: 3.710

9. Identification of bacteriophage virion proteins by the ANOVA feature selection and analysis.

Authors: Hui Ding; Peng-Mian Feng; Wei Chen; Hao Lin
Journal: Mol Biosyst Date: 2014-08

10. RBMMMDA: predicting multiple types of disease-microRNA associations.

Authors: Xing Chen; Chenggang Clarence Yan; Xiaotian Zhang; Zhaohui Li; Lixi Deng; Yongdong Zhang; Qionghai Dai
Journal: Sci Rep Date: 2015-09-08 Impact factor: 4.379

10 in total

1. CL-PMI: A Precursor MicroRNA Identification Method Based on Convolutional and Long Short-Term Memory Networks.

Authors: Huiqing Wang; Yue Ma; Chunlin Dong; Chun Li; Jingjing Wang; Dan Liu
Journal: Front Genet Date: 2019-10-11 Impact factor: 4.599

2. New 3D graphical representation for RNA structure analysis and its application in the pre-miRNA identification of plants.

Authors: Xiangzheng Fu; Bo Liao; Wen Zhu; Lijun Cai
Journal: RSC Adv Date: 2018-09-03 Impact factor: 4.036

3. A novel method to identify pre-microRNA in various species knowledge base on various species.

Authors: Tianyi Zhao; Ningyi Zhang; Ying Zhang; Jun Ren; Peigang Xu; Zhiyan Liu; Liang Cheng; Yang Hu
Journal: J Biomed Semantics Date: 2017-09-20

4. MiR-93-5p Promotes Cell Proliferation through Down-Regulating PPARGC1A in Hepatocellular Carcinoma Cells by Bioinformatics Analysis and Experimental Verification.

Authors: Xinrui Wang; Zhijun Liao; Zhimin Bai; Yan He; Juan Duan; Leyi Wei
Journal: Genes (Basel) Date: 2018-01-22 Impact factor: 4.096

5. Prediction of N-linked glycosylation sites using position relative features and statistical moments.

Authors: Muhammad Aizaz Akmal; Nouman Rasool; Yaser Daanial Khan
Journal: PLoS One Date: 2017-08-10 Impact factor: 3.240

6. An improved method for identification of small non-coding RNAs in bacteria using support vector machine.

Authors: Ranjan Kumar Barman; Anirban Mukhopadhyay; Santasabuj Das
Journal: Sci Rep Date: 2017-04-06 Impact factor: 4.379

7. Missing Value Estimation Methods Research for Arrhythmia Classification Using the Modified Kernel Difference-Weighted KNN Algorithms.

Authors: Fei Yang; Jiazhi Du; Jiying Lang; Weigang Lu; Lei Liu; Changlong Jin; Qinma Kang
Journal: Biomed Res Int Date: 2020-06-21 Impact factor: 3.411