
A Deep Learning-Based Text Classification of Adverse Nursing Events.

Wenjing Lu¹, Wei Jiang¹, Na Zhang¹, Feng Xue¹.

Abstract

Adverse nursing events occur suddenly, unpredictably, or unexpectedly during the course of clinical diagnosis and treatment in hospitals. These events adversely affect the patient's diagnosis and treatment results and can increase the patient's pain and burden. They are also highly likely to cause accidents and disputes, affect normal medical work and personnel safety, and hinder the development of the health system. With the rapid development of modern medicine, the health and safety of patients have become a major public concern, and patient safety is an important part of medical care management. Research and practice have shown that classified management of adverse nursing events, event analysis, and improvement measures help the health system to continuously improve the quality of medical care and reduce the occurrence of adverse nursing events. In the management of adverse nursing events, it is very important to sort the text reports of these events into different categories and levels. Traditional reports of adverse nursing events are mostly unstructured, simple data that rely on manual classification and are difficult to analyze. Furthermore, the data is relatively inaccurate, and its practical reference value is limited. With the development of science and technology, text classification methods based on deep learning have become practical. In this paper, we evaluate various deep learning-based classification methods designed for healthcare systems and propose a text classification model for adverse nursing events in the health system. We perform experiments and comparison tests of the proposed deep learning-based method and existing methods on the text classification of adverse nursing events. The results show that the proposed mechanism performs strongly on the evaluation metrics considered.

Year: 2021 PMID： 34840707 PMCID： PMC8616661 DOI： 10.1155/2021/9800114

Source DB: PubMed Journal: J Healthc Eng ISSN： 2040-2295 Impact factor: 2.682

1. Introduction

Due to the rapid development of information technology and the continuous updating of hospital information systems, nursing data is growing explosively. In hospital records of adverse nursing events, complex data types coexist in large quantities: structured data alongside text and other unstructured data [1, 2]. A challenging issue is how to integrate big data technology with adverse nursing event data and process it to obtain comprehensive, efficient, and accurate prediction results. This has become an urgent need in adverse nursing event research [3, 4]. In recent years, medical and nursing safety has received more and more attention. Adverse events have a high incidence, a large impact, and serious consequences, and may cause disability or death of patients. They also affect the personal safety of medical and nursing staff, prolong the hospitalization time of patients, and increase medical costs and economic burdens [5]. In clinical work, nursing staff have frequent and prolonged contact with patients, and because nursing work is heavy and cumbersome, mistakes commonly occur [6]. At present, adverse nursing events account for close to half of all adverse medical events [7, 8]. Therefore, it is very important to pay attention to the management and prevention of nursing safety and to control the occurrence of adverse nursing events. Most hospitals have introduced a nursing adverse event reporting system that supports reporting, review, and simple statistics. However, at the reporting stage, the content of adverse nursing event reports is not standardized. Reporting standards differ across medical institutions, which results in unstructured report content such as narration and description of the course of the event. 
Moreover, these reports lack reasonable classification features, which makes manual analysis difficult and subject to many human factors [9, 10]. In addition, adverse nursing event management focuses on identifying and analyzing problems, yet nursing staff may underreport events or artificially lower their level for a variety of reasons. To address the problem of how to intelligently analyze unstructured texts in adverse nursing events and reduce the impact of human factors, Cao and Ball [11] developed and implemented a hospital nursing adverse event reporting system based on the system development life cycle. However, the analysis of adverse nursing events in that system mainly targets structured data. Similarly, Clark [12] used a Bayesian algorithm to analyze the correlation between observations of nonclinical adverse events and observations of the same events for many approved drugs in clinical trials. Tomita et al. used the Text Mining Studio tool to perform adverse event analysis on medical text data such as electronic health records in nursing services [13]. Roy et al. proposed a machine learning model to improve current evaluation and prediction techniques for the risk of adverse events related to a variety of chronic diseases [14]. Dev et al. compared traditional machine learning and deep learning methods for automatically classifying adverse events in pharmacovigilance [15]. Song Jie et al. verified the feasibility of applying natural language processing to the unstructured text of adverse nursing events, showing that natural language processing technology can effectively identify such text [16]. Kim designed the TextCNN model, which uses a CNN (convolutional neural network) for text classification and is simple and efficient [17]. 
Yin found through comparative experiments that, in text classification, a CNN trains faster than an RNN (recurrent neural network) while still performing well [18]. Despite these techniques, current research on adverse nursing events still needs to achieve statistical analysis of both structured and unstructured text information, process this data, and report the results for adverse events. Unstructured information is very important for adverse nursing events, yet there is relatively little research on this textual information [19]. To make full use of the effective information in the unstructured text of adverse nursing events, avoid dependence on hand-crafted features, and improve the accuracy of adverse event level prediction, deep learning-based methods have been applied to natural language tasks. To address these issues, we propose a character-level, deep learning-based Chinese text classification model. The proposed model does not need pretrained word vectors, grammatical structure, or other such information, and it can avoid the curse of dimensionality when solving nonlinear problems. Additionally, the proposed model is easy to extend to the rapid classification of multiple languages and achieves a better classification effect. A graphical representation of the proposed deep learning-based text classification model is presented in Figure 1.

Figure 1

A graphical representation of the CNN model.

The remainder of the manuscript is organized as follows. The next section reviews the related literature, noting the problems identified in each existing method where applicable.

2. Related Work

Traditional text classification methods are those based on shallow machine learning models. The process is roughly divided into five submodules, as depicted in Figure 2.

Figure 2

The general process of traditional text classification.

Text pretreatment → Text representation → Feature dimension reduction → Classifier construction → Effect evaluation

2.1. Traditional Text Classification Methods

Text classification is a classic problem in the field of natural language processing, and related research can be traced back to the 1950s. At that time, texts were classified by expert rules (patterns), and by the early 1980s this approach had developed into expert systems built through knowledge engineering. The advantage of such systems is that they solve well-defined problems quickly and easily, but they are time-consuming and labor-intensive to build, and their coverage and accuracy are very limited. Later, with the development of statistical learning methods, and especially with the growth in the number of online texts after the 1990s and the rise of machine learning, a set of classic models for solving large-scale text classification problems gradually formed. The main routine at this stage is manual feature engineering. The process of training the text classifier is shown in Figure 3.

Figure 3

Feature engineering.

The whole text classification problem is split into two parts: feature engineering and classifier.

2.2. Feature Engineering

Feature engineering is often the most time-consuming and labor-intensive part of machine learning, but it is extremely important. Broadly, a machine learning problem is the process of converting data into information and then refining it into knowledge. Feature construction is the “data-to-information” step, which determines the upper limit of the result, and the classifier is the “information-to-knowledge” step, which tries to approach this upper limit. Unlike the classifier model, however, feature engineering is not broadly reusable; it often requires an understanding of the specific task. The natural language field, in which the text classification problem sits, has its own feature processing logic, and most of the work in traditional text classification is spent here. Text feature engineering includes three parts: text pretreatment, feature extraction, and text representation. The goal is to convert the text into a computer-understandable format that encapsulates enough information for classification, that is, one with strong feature expression ability [20, 21]. The text pretreatment stage mainly includes operations such as text segmentation and removal of stop words. English text also involves operations such as spell checking, stemming, or lemmatization. English text is segmented naturally by the spaces between words. Chinese text segmentation, in contrast, either matches words directly with string matching methods or uses models such as the N-gram model, hidden Markov models, or conditional random fields to measure the probability that characters form a word according to their co-occurrence frequency or probability [22]. 
Methods based on string matching cannot handle out-of-vocabulary words, whereas word segmentation methods based on machine learning models require manually constructed features, which involves a large amount of engineering, and the quality of that construction also affects the segmentation.

2.3. Text Pretreating

Text pretreating is the process of extracting keywords from text to represent it. Chinese text processing mainly includes two stages: text segmentation and stop word removal. The reason for word segmentation is that various studies have shown that word-level feature granularity is much better than character-level granularity: most classification algorithms do not consider word order, so character-level features lose too much “n-gram” information. Unlike English, which has natural space intervals, Chinese requires a dedicated word segmentation algorithm. Traditional algorithms mainly include forward, reverse, and two-way maximum matching based on string matching; disambiguation through syntactic and semantic analysis based on understanding; and mutual information and CRF methods based on statistics. In recent years, with the development of deep learning, the word embedding + Bi-LSTM + CRF method has gradually become the mainstream. Stop words are high-frequency pronouns, conjunctions, prepositions, and other words that carry no meaning for text classification. A stop word list is usually maintained, and words appearing in it are deleted during feature extraction, which is essentially a part of feature selection. The purpose of text representation is to convert pretreated text into a form a computer can process, and it is the most important factor in determining the quality of text classification. Traditionally, the bag-of-words (BOW) model or the vector space model (VSM) is used. Its biggest disadvantage is that it ignores the contextual relationships in the text: words are treated as independent of each other, so semantic information cannot be represented. An example of a bag-of-words vector is (0, 0, 0, 0, ..., 1, ..., 0, 0, 0, 0). Because the vocabulary typically contains at least one million words, the model has two major problems: high dimensionality and sparsity. 
The bag-of-words model is the basis of the vector space model; the vector space model reduces the dimension through feature item selection and increases the density through feature weight calculation.
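
To make the bag-of-words representation concrete, the following is a minimal sketch in plain Python. The two tokenized sentences are toy data invented for illustration, and the helper names `build_vocabulary` and `bag_of_words` are hypothetical; the paper does not describe an implementation.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map every distinct token to a column index, in first-seen order."""
    vocab = {}
    for tokens in documents:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(tokens, vocab):
    """Return a count vector with one slot per vocabulary entry."""
    vector = [0] * len(vocab)
    for token, count in Counter(tokens).items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector


# Toy, already-segmented documents (illustrative only, not from the dataset).
docs = [["patient", "fell", "from", "bed"], ["bed", "rail", "was", "down"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
```

Even with seven distinct tokens, most entries of each vector are zero, and word order is lost. With a realistic vocabulary the same effect produces the high dimensionality and sparsity described above.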

2.4. Feature Extraction

In the vector space model, feature extraction corresponds to the selection of feature items and the calculation of feature weights. The basic idea of feature selection is to rank the original feature items (terms) independently according to some evaluation index, select the feature items with the highest scores, and filter out the rest. Commonly used evaluation indices include document frequency, mutual information, information gain, and the χ² statistic. Feature weights are mainly computed with the classic TF-IDF method and its various extensions [23]. The main idea is that the importance of a word is proportional to its frequency within a category and inversely proportional to how often it occurs across all categories.
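
The TF-IDF weighting idea can be illustrated with a short sketch. It assumes one common textbook variant, computed per document rather than per category: term frequency is the count divided by document length, and inverse document frequency is log(N / df). The function name `tf_idf` and the toy documents are hypothetical, and the extensions cited in [23] may differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized documents (illustrative only).
docs = [["fall", "bed", "fall"], ["bed", "drug"], ["drug", "error", "bed"]]
weights = tf_idf(docs)
```

In this toy example "bed" appears in every document, so its weight is zero, while "fall" is frequent in one document and absent elsewhere, so it receives the largest weight there.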

2.5. Semantic-Based Text Representation

In addition to the vector space model, traditional approaches also include semantic-based text representation methods, such as LDA topic models and LSI/PLSI (latent semantic indexing and probabilistic latent semantic indexing). The text representation obtained by these methods is generally considered a deep representation of the document. Classifiers are basically statistical classification methods, and most machine learning methods have been applied to text classification, such as the Naïve Bayes classification algorithm, KNN, SVM, maximum entropy, and neural networks. Text representation aims to transform the pretreated text into a format that can be recognized and processed by a computer. Common text representation models include the Boolean model, the vector space model, and the probability model. However, these models either ignore semantic relations and text relevance, ignore the relevance and positional relationships between feature words, or easily produce high-dimensional, sparse vectors. This not only causes the loss of classification information but also increases computational overhead.

2.6. Feature Dimensionality Reduction

Feature dimensionality reduction mainly includes feature selection and feature extraction. In the text representation stage, representing text with one-hot models tends to produce high-dimensional vectors, which increases computational complexity and time consumption. Therefore, it is necessary to generate low-dimensional feature vectors while minimizing the loss of classification information. Feature selection refers to constructing a feature vector from the subset of feature words that best represents the text. Commonly used methods include information gain, document frequency, chi-square statistics, and mutual information.

2.7. Feature Extraction

Feature extraction refers to the linear mapping of feature vectors into a low-dimensional space. Commonly used methods include principal component analysis and independent component analysis. The feature reduction process requires human participation. For example, the basic idea of feature selection is to assign a score to each feature word with some evaluation method and then manually set a threshold, so that the feature words scoring above the threshold form the new feature set. This human participation affects the final text classification results. The classification models commonly used in the classifier construction stage include the Naïve Bayes classification algorithm, the K-nearest neighbor algorithm, decision trees, and the Support Vector Machine (SVM). With limited data and computing units, these shallow models have limited ability to fit complex functions and to handle complex problems, and their classification effect depends directly on the quality of feature dimensionality reduction. Therefore, when applying these classification models to text classification, a lot of time and energy must be spent on feature selection and feature extraction. Effect evaluation measures the classification performance of the model on the test set with indicators such as accuracy, recall, and F1 value.

2.8. Deep Learning

Deep learning is a relatively new branch of machine learning. It originated from the study of artificial neural networks and is a collective term for learning methods based on deep neural networks (DNN). Its primary task is to express the objects in the problem at hand through features. The main motivation is to study how to automatically extract multilayer feature representations from data through a data-driven approach and a series of nonlinear transformations. The core idea is to extract features from the original data progressively: from low-level to high-level, from specific to abstract, and from general to task-specific semantics. Traditional machine learning methods rely too much on manual selection of features or representations and cannot automatically extract and organize information. Deep learning, with its capacity for unsupervised feature learning, addresses this limitation. In recent years, the application of deep learning models in natural language processing has achieved notable results and has become a research hotspot. This research focuses mainly on learning representations of words, sentences, and documents and on related applications. At present, the mainstream models in deep learning research include convolutional neural networks, deep belief networks, long short-term memory models, autoencoders, deep Boltzmann machines, and recurrent neural networks. Mikolov et al. proposed a vector representation called the word vector or word embedding, learned with a neural network model, which captures the grammatical and semantic information of words. Compared with the bag-of-words representation, the word vector representation is dense, low-dimensional, and continuous. Socher et al. used a recursive autoencoder model to deal with semantic composition in sentiment analysis. Collobert et al. used word vector methods for natural language processing tasks such as named entity recognition, part-of-speech tagging, semantic role labeling, and phrase recognition. Li et al. applied the multicolumn convolutional neural network to a question answering system to solve question classification based on a knowledge base. Cui et al. used deep learning methods to learn topic representations and address disambiguation in statistical machine translation. In addition, Zhang et al. used deep convolutional belief networks to learn word-level and sentence-level features and to classify the relations between words in sentences. Natural language processing covers a wide range of problems of different levels and properties, so a corresponding deep learning model must be designed according to the characteristics of each type of problem. After the training problem of deep neural networks was solved in 2006, deep learning developed rapidly. It uses multilayer representation learning to transform the original data into abstract representations layer by layer, automatically learns features from the data, uses its computing and learning capabilities to discover complex structures in high-dimensional data, and then uses the extracted feature information for classification and prediction [24, 25]. Traditional text classification methods have the following problems: text representation easily produces high-dimensional vectors or loses semantics, feature dimensionality reduction requires manual participation, and the shallow models of the classifier construction stage have limited capacity to learn from data. 
Based on this, this article intends to use word vectors for text representation and deep neural network models for feature extraction, learning, and classification. The text classification process based on deep learning mainly includes text pretreating, text representation, classification model construction, effect evaluation, and other steps (Figure 4).

Figure 4

Text classification process based on deep learning.

3. Proposed Deep Learning-Based Methodology

This section describes the proposed mechanism and working methodology in detail, including vocabulary construction, text vectorization, and model construction. These are described in the subsections below.

3.1. Text Classification Based on Deep Learning

First, a character-level text vocabulary is constructed from the training set, and the classification categories and text data are vectorized. We then use a convolutional neural network (CNN), as presented in [18], to extract abstract features of the text, and a one-versus-one Support Vector Machine (SVM) multiclass classifier, as presented in [19], to classify the extracted text features.

3.1.1. Build a Text Vocabulary

Due to the relatively high complexity of Chinese texts, representing text information with traditional vector space models leads to excessively high feature vector dimensions and sparse data, which sharply increases computational complexity and computation time. That vectorization method also ignores the position and related semantic information of words in the text, which decreases classification accuracy. Therefore, in this paper, we use text characters to construct the vocabulary. If the size of the vocabulary list W is set to n, the Chinese characters are sorted by their number of occurrences in the training set text. The n−1 most frequent Chinese characters are selected to construct the vocabulary list W, and a special value is reserved to represent characters that do not appear in W. The characters in the vocabulary are represented by c = (0, 1, 2, …, 1999), where 0 is the special value for a character that is not in the vocabulary list; every other character is represented by its position in the vocabulary list.
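
The vocabulary construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `build_char_vocabulary` and `encode` are hypothetical, the three training strings are toy examples rather than records from the dataset, and ties in character frequency are broken arbitrarily.

```python
from collections import Counter

UNKNOWN_ID = 0  # special value for characters outside the vocabulary


def build_char_vocabulary(train_texts, n):
    """Keep the n-1 most frequent characters; ids 1..n-1 follow frequency rank."""
    counts = Counter(ch for text in train_texts for ch in text)
    most_common = [ch for ch, _ in counts.most_common(n - 1)]
    return {ch: idx for idx, ch in enumerate(most_common, start=1)}


def encode(text, vocab):
    """Replace each character by its vocabulary id, or UNKNOWN_ID if absent."""
    return [vocab.get(ch, UNKNOWN_ID) for ch in text]


# Toy training texts: "patient fell", "patient fell out of bed",
# "patient medication error" (illustrative only).
train_texts = ["患者跌倒", "患者坠床", "患者用药错误"]
vocab = build_char_vocabulary(train_texts, n=3)
ids = encode("患者跌伤", vocab)
```

With n = 3 only the two most frequent characters receive ids 1 and 2, and every other character maps to 0. Setting n = 2000 would reproduce the id range 0 to 1999 given in the text.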

3.1.2. Text Vectorization

(1) Classification Category Vectorization. If F represents the classification catalog and K categories are selected for classification, then the labels of the K categories and their corresponding serial numbers are expressed in dictionary form. (2) Vectorization of Text Data. The data in the text dataset is vectorized according to the frequency and position information of the Chinese characters in vocabulary W. The sequence length of each piece of data is uniformly set to j; that is, each sequence holds at most j characters. The number of classification categories is k, and the training, validation, and test set sizes are m1, m2, and m3, respectively. The data format after vectorization is shown in Table 1.

Table 1

Text data vectorization.

Dataset	Shape
X_train	[m1, j]
X_val	[m2, j]
X_test	[m3, j]
Y_train	[m1, k]
Y_val	[m2, k]
Y_test	[m3, k]
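
The shapes in Table 1 can be illustrated with a small sketch: each encoded text becomes a row of exactly j character ids, and each label becomes a length-k indicator vector. The category names, the choice to pad on the right, and the reuse of id 0 for padding are assumptions made for illustration; the paper does not specify them.

```python
def pad_sequence(ids, j, pad_id=0):
    """Truncate or right-pad a list of character ids to exactly j entries."""
    return ids[:j] + [pad_id] * max(0, j - len(ids))


def one_hot(label, label_to_index):
    """Encode a category label as a length-k indicator vector."""
    vector = [0] * len(label_to_index)
    vector[label_to_index[label]] = 1
    return vector


# Hypothetical category names; the paper only states that K categories exist.
label_to_index = {"level_1": 0, "level_2": 1, "level_3": 2, "level_4": 3}
j = 6  # toy sequence length
encoded = [[5, 2, 9], [1, 1, 4, 7, 3, 8, 2, 6]]  # toy character-id sequences
labels = ["level_2", "level_4"]

X = [pad_sequence(ids, j) for ids in encoded]        # shape [m, j]
Y = [one_hot(lbl, label_to_index) for lbl in labels]  # shape [m, k]
```

Here m = 2, j = 6, and k = 4, so X has shape [2, 6] and Y has shape [2, 4], matching the [m, j] and [m, k] pattern of the table.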

3.1.3. Deep Learning Model Construction

The Chinese nursing adverse event text classification model based on deep learning is mainly composed of two parts: feature extraction based on a character-level CNN and text classification using an SVM. The character-level CNN extracts deep features from the vectorized Chinese nursing adverse event text, the extracted features represent each piece of text, and the text is finally classified by the SVM classifier, as shown in Figure 1.

(1) Convolutional Layer. The text processing of Chinese adverse nursing events uses one-dimensional convolution. Each row of the matrix represents one token, and truncating a token's vector has no mathematical meaning; therefore, the length of the convolution filter is always equal to n. One-dimensional convolution requires filters of multiple widths to obtain different receptive fields. The matrix S obtained after text data vectorization is used as input, and the convolutional layer has different types of filters (F). During convolution, a sentence fragment with the same width as the filter is taken as input, and the ith feature vector c_i of a sentence is computed by applying the convolution operator ⊗ between the filter and S[i − m+1 : i, :], a matrix block of width m; k represents the number of convolutional layers. After the input matrix S of each sentence and the convolution kernel are calculated, the feature vector c_i is output. To form a richer feature, each filter has p convolution kernels.

(2) Pooling Layer. After text features are obtained through convolution, max pooling is used to select the strongest feature from the convolution result. Pooling also adapts to the input width, converting inputs of different lengths into an output of uniform length. The max pooling result C_pool is

$$C_{pool} = \max\left(c_1, c_2, \ldots, c_{N-h+1}\right),$$

where c_i is the result of the convolution calculation, N is the length of a sentence in words, and h is the window size.

(3) Fully Connected Layer. The pooled data is concatenated into a vector along the depth direction and fed to the fully connected layer. Soft-Max compares the predicted label with the true label to adjust the parameters. When performance on the training set is stable, the output of the pooling layer is taken as the extracted high-dimensional feature representation. If CNN-Soft-Max is used instead, the output of the pooling layer is passed through the fully connected layer, and Soft-Max is used to calculate the label of the sentence.

(4) SVM Multiclassifier. The deep learning-based Chinese nursing adverse event classification model uses the one-versus-one approach in SVM, which builds one classifier for every pair of classes. Therefore, k classes require k(k−1)/2 classifiers. Generally, this classification method takes less time than the one-versus-rest method and performs better. The high-dimensional feature representation extracted from the pooling layer is fed into the SVM multiclassifier. The SVM model is trained until its accuracy is highest; the test set data is then sent to the trained model for classification, as shown in Figure 5.

Figure 5

CNN-SVM model.
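
The convolution, max pooling, and one-versus-one steps of the model can be sketched in plain Python. This is a toy illustration, not the authors' TensorFlow implementation: it uses a single filter, omits bias and activation (neither is specified in the text), and the names `conv1d`, `max_pool`, and `one_vs_one_pairs` are hypothetical.

```python
from itertools import combinations


def conv1d(S, F):
    """Slide filter F (m rows x d columns) down S (N rows x d columns).

    Returns N - m + 1 values, each the sum of element-wise products
    between F and one m-row block of S.
    """
    m, d = len(F), len(F[0])
    features = []
    for start in range(len(S) - m + 1):
        block = S[start:start + m]
        features.append(
            sum(block[r][c] * F[r][c] for r in range(m) for c in range(d))
        )
    return features


def max_pool(features):
    """Keep only the strongest response, whatever the input length."""
    return max(features)


def one_vs_one_pairs(classes):
    """List every pair of classes; each pair gets its own binary SVM."""
    return list(combinations(classes, 2))


# Toy input: N = 4 characters, each with a d = 2 dimensional vector.
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
F = [[1.0, 0.0], [0.0, 1.0]]  # one filter spanning m = 2 characters
c = conv1d(S, F)
pooled = max_pool(c)
pairs = one_vs_one_pairs(range(4))
```

Because max pooling reduces each filter's output to one number, texts of different lengths yield feature vectors of the same size, which is what allows the pooled features to be passed to a fixed-input classifier such as the SVM. With 4 categories the one-versus-one scheme needs 4 × 3 / 2 = 6 binary classifiers.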

4. Experiment and Result Analysis

Theoretical ideas must be validated in a real environment. Therefore, various experiments were carried out to verify the claims made for the proposed deep learning-based classification scheme. Additionally, the proposed scheme is compared against well-known, field-proven schemes in terms of several performance evaluation metrics.

4.1. Experimental Data and Metrics

The unstructured text data of adverse nursing events comes from the adverse nursing event reporting and registration system of a large tertiary hospital in China. Department 361, which was dedicated to research on the text classification of adverse nursing events, specifically with deep learning-based models, was launched in 2014. For this purpose, 11,751 records in the system, covering the 5 years from 2014 to 2018, were collected. A data sorting team (consisting of 2 people in charge and several nurses) was set up, data sorting rules were defined, and part of the data (records lacking the adverse nursing event level or event history) was removed. The team also reviewed the registered event level in the data and made corrections, after agreement, based on experience. The data uses 4 candidate classification categories, and the dataset distribution is shown in Figure 6.

Figure 6

Data set level classification.

To verify the performance of the proposed classification model, different algorithms were tested on the same dataset. After pretreating the original data, the dataset was randomly divided into three parts: the training set h-train contains 9737 records, the test set h-test contains 1000 records, and the verification set h-value contains 500 records. Recall R, F-value, and accuracy A were selected to evaluate the effect of text classification in nursing adverse events. The recall R is calculated as

$$R = \frac{TP}{TP + FN},$$

where TP is the number of cases in which 2 similar adverse nursing event texts are classified into 1 category, FP is the number of cases in which 2 dissimilar texts are classified into 1 category, and FN is the number of cases in which 2 similar texts are classified into different categories; TN, correspondingly, is the number of cases in which 2 dissimilar texts are classified into different categories. Precision P is the share of texts classified into 1 category that are truly similar, P = TP/(TP + FP). The F-value is the weighted harmonic mean of precision P and recall R:

$$F = \frac{(\beta^2 + 1)\,P\,R}{\beta^2 P + R},$$

where β = 1, so that precision and recall have the same weight in the F-value. Likewise, accuracy A is calculated as the number of correctly classified texts divided by the total number of texts.
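
The evaluation metrics can be computed from confusion-matrix counts as in the sketch below, which assumes the standard definitions of precision, recall, F-value, and accuracy. The function names and the counts in the example are illustrative only and are not results from the paper.

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-value) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_value = (b2 + 1) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_value


def accuracy(tp, tn, fp, fn):
    """Share of all decisions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up counts for illustration.
p, r, f = precision_recall_f(tp=8, fp=2, fn=8)
a = accuracy(tp=8, tn=82, fp=2, fn=8)
```

With β = 1 the F-value reduces to 2PR / (P + R), so a classifier with high precision but low recall, as in this toy example, is penalized relative to its accuracy.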

4.2. Result Analysis

Using the character-level deep learning classification model, the dataset was trained and tested multiple times with different parameter settings; the optimal parameter configuration for character-level CNN feature extraction is shown in Table 2. With this configuration, the feature extraction method based on the character-level CNN effectively resolves the problem of excessively high feature vector dimensions and sparse data that occurs in traditional feature extraction based on Term Frequency-Inverse Document Frequency (TF-IDF).

Table 2

Character-level CNN feature extraction parameter configuration list.

Parameter name	Parameter value
Number of convolution kernels	128
Size of convolution kernels	5 × 64
Number of candidate categories	4
Sequence length	6000
Character vector dimension	100
Pooling layer	1-Max-Pooling (1 ∗ 1)
Hidden layer	0.5
Size of batch processing	64
Iteration cycle	10
Learning rate	10‐x

To prevent overfitting during network learning, cross-validation [26] is used: the model is trained once on the training set and its performance is evaluated once on the validation set. We observed that this procedure is simple and effective, has a short training time, and often produces good results in experiments. A well-chosen learning rate helps the network model converge rapidly to the optimal weights. The AdaMax adaptive learning rate algorithm [27] is used in the proposed model to select an appropriate learning rate; it has a small memory requirement and is suitable for large datasets and high-dimensional spaces. To verify the accuracy of the character-level deep learning model for Chinese text classification of nursing adverse events, multiple sets of experiments were designed and performed for comparison.
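As an illustration of the AdaMax update, the following Python sketch applies the standard rule (an exponential moving average of the gradient, divided by an exponentially decayed running maximum of the gradient magnitude) to a toy quadratic objective. The objective, the step count, the learning rate of 0.05, and the small constant `eps` are assumptions made for this example only; they are not the settings used in the experiments.

```python
def adamax_minimize(grad, theta, steps=2000, lr=0.05,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function with AdaMax, given its gradient function `grad`."""
    theta = list(theta)
    m = [0.0] * len(theta)  # moving average of the gradient
    u = [0.0] * len(theta)  # decayed running maximum of |gradient|
    for t in range(1, steps + 1):
        g = grad(theta)
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1 - beta1) * gi
            u[i] = max(beta2 * u[i], abs(gi))
            # Bias-corrected step; eps (an assumption) guards against u == 0.
            theta[i] -= (lr / (1 - beta1 ** t)) * m[i] / (u[i] + eps)
    return theta


def toy_gradient(theta):
    """Gradient of the toy objective (x - 3)^2 + (y + 1)^2."""
    return [2 * (theta[0] - 3.0), 2 * (theta[1] + 1.0)]


theta = adamax_minimize(toy_gradient, [0.0, 0.0])
print(theta)
```

Because each parameter keeps only two running values (`m` and `u`), the memory overhead is two extra numbers per weight, which is consistent with the small memory requirement noted above.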

4.2.1. Traditional Chinese Classification Based on TF-IDF

For TF-IDF feature extraction and Chinese classification, the experiment uses the jieba Chinese word segmentation module to segment the text, removes stop words, and applies TF-IDF to vectorize the text, along with other preprocessing operations. Logistic regression, random forest, SVM, and other classifiers are then applied; the classification results are shown in Figure 7.

Figure 7

Comparison of traditional classification models based on TF-IDF.
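The TF-IDF vectorization step can be sketched in plain Python as follows. The sketch assumes that the texts have already been segmented into tokens (for example by jieba) and that stop words have been removed. The English tokens are hypothetical placeholders for Chinese words, and the smoothed IDF formula is one common variant chosen for illustration, since the exact weighting used in the experiments is not stated.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into dense TF-IDF vectors.

    Returns (vocabulary, vectors), one vector per document.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(doc).items():
            vec[index[tok]] = (count / len(doc)) * idf[tok]
        vectors.append(vec)
    return vocab, vectors


# Hypothetical, already-segmented reports.
docs = [
    ["patient", "fall", "bedside"],
    ["medication", "dose", "error"],
    ["patient", "fall", "corridor"],
]
vocab, vectors = tfidf_vectors(docs)
print(len(vocab), vectors[0])
```

Each vector has one entry per vocabulary word and most entries are zero, which illustrates the high dimensionality and sparsity attributed to TF-IDF features earlier in this section.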

The experimental results on the adverse event data show that SVM classifies better than the other methods, but the accuracy and F-value of all three methods are below 70%, which is generally unsatisfactory. This is related to the level of the features that TF-IDF extracts from text and to the characteristics of the unstructured text of adverse nursing events.

4.2.2. Character-Level CNN-Based Classification

The experiment is based on the TensorFlow framework. A character-level CNN extracts text features from Chinese nursing adverse event reports, and the level classification of these texts is then performed either by the CNN's own Soft-Max classifier or by a one-versus-one SVM classifier. The experimental results are shown in Figure 8.

Figure 8

Comparison of classification using CNN models.
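To make the character-level pipeline concrete, the following Python sketch runs a single forward pass: character embedding, one-dimensional convolution with kernels of width 5, 1-max pooling, and a Soft-Max output over 4 categories. It is a toy illustration and not the TensorFlow implementation used in the experiments. The weights are random and untrained, the sizes are much smaller than those in Table 2, and the Latin alphabet stands in for a Chinese character vocabulary; all names are hypothetical.

```python
import math
import random


def char_cnn_forward(text, char_index, emb, kernels, dense_w, dense_b, width=5):
    """Return class probabilities for `text` from an (untrained) char-level CNN."""
    # One embedding vector per character; unknown characters map to row 0.
    x = [emb[char_index.get(ch, 0)] for ch in text]
    # Pad short inputs so that at least one convolution window exists.
    while len(x) < width:
        x.append(emb[0])
    emb_dim = len(emb[0])
    pooled = []
    for k in kernels:  # each kernel has shape width x emb_dim
        best = 0.0  # ReLU output is never negative, so 0 is a valid floor
        for start in range(len(x) - width + 1):
            s = sum(k[j][d] * x[start + j][d]
                    for j in range(width) for d in range(emb_dim))
            best = max(best, s)  # ReLU followed by 1-max pooling over positions
        pooled.append(best)
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(dense_w, dense_b)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
char_index = {ch: i + 1 for i, ch in enumerate(alphabet)}  # 0 = unknown
emb_dim, n_kernels, n_classes = 8, 6, 4  # toy sizes, not those of Table 2
emb = [[rng.uniform(-1, 1) for _ in range(emb_dim)]
       for _ in range(len(alphabet) + 1)]
kernels = [[[rng.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(5)]
           for _ in range(n_kernels)]
dense_w = [[rng.uniform(-1, 1) for _ in range(n_kernels)]
           for _ in range(n_classes)]
dense_b = [0.0] * n_classes

probs = char_cnn_forward("patient fell near the bed", char_index,
                         emb, kernels, dense_w, dense_b)
print(probs)
```

Because pooling keeps only the strongest response of each kernel, the pooled feature vector has a fixed length equal to the number of kernels regardless of text length. In the SVM variant, this pooled vector would be passed to the SVM classifier instead of the Soft-Max layer.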

Using the same dataset to compare the results of Experiments 1 and 2, Figure 9 shows a line chart of the average recall, average F-value, and average accuracy of each algorithm.

Figure 9

Comparison of various classification models.

The proposed character-level deep learning model for Chinese nursing adverse event classification and the character-level CNN-Soft-Max-based model were both evaluated on the test set in terms of F-value and accuracy, and both give better results than the traditional classification models based on TF-IDF feature extraction. This is because the CNN extracts deeper features from the Chinese nursing adverse event text. The proposed character-level deep learning classification model has the highest accuracy, averaging 78%. The proposed model therefore outperforms both the traditional TF-IDF-based classification model and the character-level CNN-Soft-Max-based classification model in the text classification of Chinese nursing adverse events.

5. Conclusion and Future Work

Automatic analysis of unstructured data, particularly through natural language processing, is the basis for big data analysis of hospital nursing adverse events. When such an event occurs, all details of the incident must be recorded. Unstructured text information describes hospital care, and, with respect to adverse events, structured data plays an irreplaceable role. Natural language processing has therefore become an auxiliary tool for identifying unstructured text data on adverse events in hospital care. This paper proposes a character-level deep learning model for classifying Chinese hospital nursing adverse event texts. Compared with the traditional classification model based on TF-IDF and the classification model based on character-level CNN-Soft-Max, it greatly improves accuracy and achieves effective classification of unstructured information in nursing adverse events [28]. In future work, we will build on this model toward an intelligent analysis and early warning system for nursing adverse events that assists clinical nurses in decision-making.
