An application for classifying perceptions on my health bank in Taiwan using convolutional neural networks and web-based computerized adaptive testing: A development and usability study.

Chen-Fang Hsu^1,2,3,4, Tsair-Wei Chien⁵, Yu-Hua Yan^6,7.

Abstract

BACKGROUND: The classification of a respondent's opinions online into positive and negative classes using a minimal number of questions is gradually changing and helps turn techniques into practices. A survey incorporating convolutional neural networks (CNNs) into web-based computerized adaptive testing (CAT) was used to collect perceptions on My Health Bank (MHB) from users in Taiwan. This study designed an online module to accurately and efficiently turn a respondent's perceptions into positive and negative classes using CNNs and web-based CAT.
METHODS: In all, 640 patients, family members, and caregivers with ages ranging from 20 to 70 years who were registered MHB users were invited to complete a 3-domain, 26-item, 5-category questionnaire asking about their perceptions on MHB (PMHB26) in 2019. The CNN algorithm and k-means clustering were used for dividing respondents into 2 classes of unsatisfied and satisfied classes and building a PMHB26 predictive model to estimate parameters. Exploratory factor analysis, the Rasch model, and descriptive statistics were used to examine the demographic characteristics and PMHB26 factors that were suitable for use in CNNs and Rasch multidimensional CAT (MCAT). An application was then designed to classify MHB perceptions.
RESULTS: Three construct factors were extracted from PMHB26. The reliability of each PMHB26 subscale was at least 0.94, indicating internal consistency and stability in the data. We further found the following: for accuracy, the PMHB26 CNN model yielded a high accuracy rate (0.98) with an area under the curve of 0.98 (95% confidence interval, 0.97-0.99) based on the 391 returned questionnaires; and for efficiency, approximately one-third of the items did not need to be answered under Rasch MCAT, which reduces respondents' burden.
CONCLUSIONS: The PMHB26 CNN model, combined with the Rasch online MCAT, is recommended for improving the accuracy and efficiency of classifying patients' perceptions of MHB utility. An application developed for helping respondents self-assess the MHB cocreation of value can be applied to other surveys in the future.

Year: 2021 PMID: 34967385 PMCID: PMC8718177 DOI: 10.1097/MD.0000000000028457

Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.889

We introduced the My Health Bank (MHB) used in Taiwan by observing the perception of users on the matter. The CNN model was used to classify clusters derived from a 26-item 5-category questionnaire asking about their perceptions of MHB (PMHB26), which is novel and innovative. Rasch multidimensional CAT (MCAT) applied to the CNN model can save respondents’ time in answering items in a survey.

Introduction

As health information technology advances, numerous innovations have been applied to engage people with their own health data in self-management (SM) behaviors to improve their health outcomes, including Blue Button adopted in the United States,[] My Health Record, or Personally Controlled Electronic Health Record in Australia, and Summary Care Record under the United Kingdom's National Health Service.[] Such a system is also under development in Taiwan, called My Health Bank (MHB), by its universal, comprehensive, and single-payer program of the National Health Insurance (NHI).

The need to understand perceptions of MHB

The MHB is iconographic to cater to users who have become used to pictorial electronic devices. The new MHB version also displays individualized notes, reminding the user to make an appointment for the next dental scaling or physical check-up, or to refill a prescription for a chronic illness. The Taiwanese government-run National Health Insurance Administration set up the MHB system in September 2014 to narrow the gap between medical service providers and patients with respect to medical information, allowing the insured to retrieve personal health data on the Internet and making medical information more transparent and convenient for both patients and health care providers[] (Supplemental Digital Content 1). By the end of 2018, 1.03 million people had registered on this platform. The basic concept behind this platform is to match the needs of two or more parties to achieve “interaction” and “exchange” of medical resources through either an actual or a virtual platform.[] Accordingly, the MHB system, which has the potential to become a guardian of people's health, needs a survey that classifies MHB satisfaction using an appropriately scientific approach. This was the first challenge we encountered in this study.

Many constructs surveyed in a study

Another major challenge we faced was multidimensional questions in a questionnaire. In the context of survey research, a construct is an abstract idea, underlying theme, or subject matter that one wishes to measure using survey questions. Some constructs are relatively simple (such as burnout, being bullied, or mental illness at the workplace[]) and can be measured using only one or a few questions, whereas other constructs are more complex (ie, a satisfaction or perception survey in this study) and may require a whole battery of questions to fully elucidate the construct to meet the research goals. Complex constructs contain multiple dimensions that are bound together by some commonalities that compose the construct as a whole, such as a questionnaire regarding the perceptions on the MHB (named PMHB26) containing 26 five-point items. It is thus necessary to develop an application that can deal with such questionnaires involving multidimensional scales, particularly using multidimensional computer adaptive testing (MCAT)[] for efficiency (with less item length to reduce respondent burden) and accuracy (without compromising the precision of assessment).

The convolutional neural networks might be helpful

Convolutional neural networks (CNNs) have been successfully used in healthcare settings in several forms,[] with their greatest impact being in the field of health informatics. The CNN, a well-known deep learning method, can improve the prediction accuracy (up to 7.14%). The CNN architecture has been described as an interleaved set of feedforward layers implementing convolutional filters followed by reduction, rectification, or pooling layers[] and can be implemented in R, Python, MATLAB, Octave, Fortran, C#, or Java to write code quickly, or in C or C++ to write fast code.

CNN and MCAT used in MS Excel

A few studies have used CNNs on a small number of cases (eg, <1000) for supervised training in MS Excel, imitating 2-dimensional images with pixels to predict 2 to 4 classes from survey data.[] Missing data in the questionnaires are a major problem and another challenge that should be overcome, because the CNN requires that all questions be answered by all respondents. Although applications have been designed for users to assess their health status, including burnout, being bullied, or mental illness at the workplace,[] the need remains to shorten instruments to reduce respondents’ burdens (ie, efficiency) without compromising the assessment purpose (ie, accuracy). Unfortunately, according to the Spearman-Brown prophecy formula, the reliability of a test or questionnaire decreases if the number of items in the measurement instrument is reduced.[] Nonetheless, CAT based on item response theory adapts to an examinee's ability level[] and provides the examinee with the next most suitable item to answer. Accordingly, each patient under the CAT scenario can answer the fewest possible items with less burden while reaching equally or more accurate outcomes.[] We were motivated to conduct this study incorporating CNNs into CAT to resolve the problem (ie, all questions must be answered) in previous studies.[] It is thus necessary to develop an MCAT application incorporated with CNN for (1) efficiently collecting data regarding users’ perceptions of MHB satisfaction and (2) effectively classifying MHB satisfaction into 2 classes: unsatisfied (−) and satisfied (+).
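The reliability penalty for shortening a scale can be made concrete with the Spearman-Brown prophecy formula. The sketch below is illustrative only: the function name and the choice of a starting reliability of 0.94 with the test length halved are assumptions made for the example, not values taken from the authors' computation.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor.

    length_factor < 1 shortens the test; length_factor > 1 lengthens it.
    """
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical example: a scale with reliability 0.94 cut to half its length.
shortened = spearman_brown(0.94, 0.5)
print(round(shortened, 3))
```

Under these assumed inputs the predicted reliability drops below the starting value, which is the loss that adaptive item selection is meant to offset.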

An online MHB-APP using CNN and MCAT

As with all forms of web-based technology development, no online MCAT assessment incorporating a CNN to assess individual perceptions of satisfaction has been available until now. After solving the problem of missing responses in MCAT affecting the CNN computation, the next challenge is to develop an app using CNN and MCAT to collect perceptions of MHB based on patients’ experiences.

Study aims

Sentiment classifications play an important role in the age of artificial intelligence. Classifying a client's opinions online into positive and negative classes needs to be studied. The aims of this study are to design an online app that could accurately and efficiently classify a respondent's perceptions into positive and negative classes using CNNs and web-based MCAT.

Methods

Data source

Examinees invited to answer the MHB perception questions were selected from the outpatient department in a hospital in southern Taiwan. Before delivering the questionnaire to examinees, they were asked whether they had registered to MHB to meet the preliminary criteria used in this study. If so, the questionnaire could be filled anonymously. In all, 401 questionnaires were delivered, of which 391 were valid after removing those with incomplete or inaccurate data. The sample size was determined and recommended by Dillman's instruction based on the 1.03 million registered users in MHB and the 95% confidence intervals required for computation. As a result, the minimal sample size was set at 384, which is less than the final 391 eligible respondents in this study. This study was approved by the institutional review board of Show Chwan Memorial Hospital (IRB no. 1061107).
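The minimal sample size of 384 can be reproduced with the finite-population formula commonly attributed to Dillman. The sketch below assumes a 5% margin of error, maximum variability (p = 0.5), and z = 1.96 for 95% confidence; these settings are assumptions for illustration, since the text states only the population size and the confidence level.

```python
def dillman_sample_size(population: int, margin: float = 0.05,
                        z: float = 1.96, p: float = 0.5) -> float:
    """Completed sample size needed for a finite population.

    margin, z, and p are assumed defaults (5% error, 95% confidence,
    maximum variability); they are not stated in the source text.
    """
    variability = p * (1.0 - p)
    numerator = population * variability
    denominator = (population - 1) * (margin / z) ** 2 + variability
    return numerator / denominator


n_required = dillman_sample_size(1_030_000)
print(round(n_required))
```

With 1.03 million registered users, the result rounds to 384, below the 391 valid questionnaires.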

Contents of the questionnaire

The questionnaire contained the following 3 parts: satisfaction of MHB and its platform (12 items), agreement with the MHB functionality of resource exchange and integration (6 items), and satisfaction of the doctor-patient interaction and cocreation values existing in MHB (8 items) (Supplemental Digital Content 2). The 26-item 5-point Likert-type questionnaire was analyzed to examine the construct validity of the instrument. Higher scores mean a higher level of satisfaction with MHB.

Application displaying the classification on smartphones

A cell phone application was designed to predict the classification of MHB perception satisfaction using the CNN algorithm and model parameters. The resulting classification appears on participants’ smartphones. The visual representation with binary (satisfied + and unsatisfied −) category probabilities would be shown on a dashboard using Google Maps. The forest plot was applied to present the comparison of 3 domain scores yielded in the CNN PMHB26 model.

Exploratory factor analysis to extract factors

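Exploratory factor analysis (EFA) was performed to determine the number of factors retained in the survey data. The criterion for factor loadings was set at 0.50. Furthermore, Fornell and Larcker suggested the use of convergent validity (CV), measured by the average variance extracted (AVE > 0.5; Eq. 1), together with construct reliability (CR > 0.60; Eq. 2) or Cronbach reliability > 0.70 (Eq. 3).

$$\mathrm{AVE} = \frac{\sum_{i=1}^{k} \lambda_i^{2}}{\sum_{i=1}^{k} \lambda_i^{2} + \sum_{i=1}^{k} \varepsilon_i} \quad (1)$$

where $\lambda_i$ is the loading of item $i$ on the construct domain, $\lambda_i^{2}$ indicates the communality with the factor, and $\varepsilon_i$ denotes the measurement error.

$$\mathrm{CR} = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}{\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \varepsilon_i} \quad (2)$$

where $\lambda_i$ and $\varepsilon_i$ are defined as in Eq. 1.

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^{2}}{\sigma_X^{2}}\right) \quad (3)$$

where $k$ is the item length, $\sigma_i^{2}$ denotes the variance for item $i$, and $\sigma_X^{2}$ represents the variance of examinees on the summation score.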
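As a rough illustration of the three indices named above (AVE, CR, and Cronbach reliability), the sketch below computes them from standardized loadings and raw item scores. It assumes the usual convention that the error variance of a standardized item is 1 − λ²; the function names and the toy inputs are hypothetical and are not the study data.

```python
import statistics


def average_variance_extracted(loadings):
    """AVE = sum(lambda^2) / (sum(lambda^2) + sum(error)), error = 1 - lambda^2 (assumed)."""
    explained = sum(l * l for l in loadings)
    error = sum(1.0 - l * l for l in loadings)
    return explained / (explained + error)


def construct_reliability(loadings):
    """CR = (sum(lambda))^2 / ((sum(lambda))^2 + sum(error))."""
    squared_sum = sum(loadings) ** 2
    error = sum(1.0 - l * l for l in loadings)
    return squared_sum / (squared_sum + error)


def cronbach_alpha(item_scores):
    """item_scores: one row per respondent, one column per item."""
    k = len(item_scores[0])
    item_variances = [statistics.variance(column) for column in zip(*item_scores)]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical three-item subscale with equal loadings of 0.8.
toy_loadings = [0.8, 0.8, 0.8]
print(round(average_variance_extracted(toy_loadings), 2))
print(round(construct_reliability(toy_loadings), 2))
```

With the toy loadings, AVE exceeds 0.5 and CR exceeds 0.60, so this hypothetical subscale would pass both thresholds.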

Rasch analysis

Construct validity

Rasch analysis was performed using Winsteps software to transform the ordinal responses into interval logit (ie, log odds) scores. The fit of the construct to the Rasch model's expectation was examined using the Infit and Outfit mean square (MNSQ) statistics, with an acceptable range of 0.5 to 1.5, in comparison with the classical test theory approach based on the EFA factor loadings described in the previous section.

Rasch CAT

Efficient and accurate assessment can be simultaneously achieved through CAT.[] The online MCAT was designed based on the MHB perception instrument consisting of several constructs mentioned in Section 2.2. Not all items were answered in CAT (deemed missing responses on CNNs); therefore, we used the Rasch rating scale model to generate the expected responses and overcame the drawback of not having all the items answered in CAT.[] As such, all the feature variables were applied to the CNN PMHB26 model.
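The idea of replacing unanswered CAT items with model-expected responses can be sketched as follows under the Rasch rating scale model. The parameter values, the function names, and the decision to keep the expected score as a fraction rather than rounding it are assumptions for illustration; the authors' implementation may differ.

```python
import math


def rsm_probabilities(theta, delta, taus):
    """Category probabilities (0..m) under the Rasch rating scale model.

    theta: person measure, delta: item difficulty, taus: step thresholds (logits).
    """
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + theta - delta - tau)
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_response(theta, delta, taus):
    """Model-expected score for one item."""
    return sum(x * p for x, p in enumerate(rsm_probabilities(theta, delta, taus)))


def fill_missing(responses, theta, deltas, taus):
    """Replace None (items skipped by CAT) with the expected response."""
    return [
        r if r is not None else expected_response(theta, d, taus)
        for r, d in zip(responses, deltas)
    ]


# Hypothetical person at 0 logits, two items of difficulty 0, symmetric thresholds.
hypothetical_taus = [-2.0, -1.0, 1.0, 2.0]
print(fill_missing([3, None], 0.0, [0.0, 0.0], hypothetical_taus))
```

Because every item then has a value, the full feature vector can be passed to the classifier even when only some items were administered.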

Rasch KIDMAP

Rasch KIDMAP has been used in educational fields.[] All responses were compared with those expected by the Rasch model and examined by misfit statistics (ie, Z score = [observed score − expected value]/standard deviation). Through KIDMAP, any aberrant behaviors on responses to the questionnaire were displayed and interpreted. The overall fit statistics were observed by examining the Outfit MNSQ ($= \sum_{i=1}^{L} Z_i^{2} / L$, where $L$ is the item length) for an examinee on the assessment.
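The misfit statistics described above reduce to a few lines of arithmetic. The sketch below takes the expected values and model variances as given inputs; the numbers are hypothetical, and flagging on the absolute Z score (rather than only large positive values) is an assumption.

```python
import math


def z_scores(observed, expected, variances):
    """Standardized residual per item: (observed - expected) / standard deviation."""
    return [(o - e) / math.sqrt(v) for o, e, v in zip(observed, expected, variances)]


def outfit_mnsq(z):
    """Person Outfit MNSQ: mean of the squared standardized residuals."""
    return sum(value * value for value in z) / len(z)


def flag_aberrant(z, cutoff=2.0):
    """Indices of items whose |Z| exceeds the cutoff (absolute value is an assumption)."""
    return [i for i, value in enumerate(z) if abs(value) > cutoff]


# Hypothetical three-item example.
z = z_scores([4, 2, 1], [2.0, 2.0, 1.5], [0.64, 1.0, 0.25])
print(outfit_mnsq(z), flag_aberrant(z))
```

In this toy case only the first response is flagged, and it alone pushes the person Outfit MNSQ above 2.0.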

CNN model created in MS Excel

Feature variables

Feature variables were determined by observing the extracted items using the EFA and Rasch analysis mentioned in the previous sections.

Unsupervised and supervised learning

Unsupervised learning[] was performed using k-means clustering, a traditional, simple machine learning algorithm that is fitted to a dataset and can then classify a new dataset using a number of clusters (ie, k) defined a priori. In contrast, supervised learning proceeded using the CNN model with the feature variables to predict the labels that were defined by k-means clustering.
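The labeling step can be illustrated with a minimal k-means (Lloyd's algorithm) on one-dimensional data. Clustering on the summed questionnaire score, the quantile-based starting centers, and the toy scores are all assumptions for this sketch; the study used a statistical package and may have clustered on different inputs.

```python
def kmeans_1d(scores, k=2, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (labels, centers).

    Centers start at evenly spaced quantiles, so label 0 is the lowest cluster.
    """
    ordered = sorted(scores)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center for every score.
        labels = [min(range(k), key=lambda j: abs(s - centers[j])) for s in scores]
        # Update step: mean of each cluster (keep the old center if a cluster is empty).
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical summed scores: label 0 = unsatisfied, label 1 = satisfied.
toy_scores = [60, 62, 65, 70, 100, 105, 110, 120]
labels, centers = kmeans_1d(toy_scores)
print(labels, centers)
```

The resulting 0/1 labels play the role of the class labels that the supervised model is then trained to predict.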

CNN model used in MS Excel

Similar to previous studies, we implemented the CNN PMHB26 model in MS Excel (MP4 video in Supplemental Digital Content 3). The feature variables were transformed into the CNN model. Model parameters were estimated using the Solver add-in algorithms in MS Excel. The accuracy rate, sensitivity, specificity, and receiver-operating characteristic (ROC) curve (area under the curve, AUC) were then computed.

Statistical tools and data analysis

MedCalc 9.5.0.0 for Windows (MedCalc Software) and Statistical Package for the Social Sciences for Windows (version 18.0; IBM Corp., Armonk, NY) were used to calculate the sensitivity (Sens), specificity (Spec), and corresponding AUC (Eqs. 4 to 7), and to perform k-means clustering to yield the class labels (ie, 0 for unsatisfied and 1 for satisfied). With reference to the frequency and percentage of the variables related to sex, a χ² test was performed to examine the difference in count distributions. The significance level was set at 0.05. Visual representations were used to display the classification effect using 2 curves based on the Rasch category characteristic curve (CCC)[] (ie, the curve from the bottom left corner to the top right corner denotes the success feature [the satisfied +], and the curve from the top left corner to the bottom right corner denotes the failure attribute [the unsatisfied −]). The study flowchart is shown in Figure 1.

Figure 1

Study flowchart.

Results

Demographic data of the 391 cases

Of the 391 participants, 60.3% were women, 87% were 54 years old or younger, and 54% were single (Table 1). The mean monthly income of 40.4% of the participants was between US $930 and $1560. More than 60% had graduated from a 4-year university or above, and 58.8% had purchased additional commercial insurance. Regarding occupation, 19.4% worked in the private sector, 19.2% worked in the service sector, and 20.2% endorsed others. The χ² test indicated that 4 variables differed significantly by sex: age (P < .001), marital status (P < .001), mean income per month (P < .001), and occupation (P < .001) (Table 1).
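The χ² statistics can be checked directly from the counts in Table 1. The sketch below computes a Pearson χ² without continuity correction (an assumption about how the published values were obtained) for the sex by marital status counts; the function name is illustrative.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Rows: single, married; columns: female, male (counts from Table 1).
marital_by_sex = [[144, 67], [92, 88]]
statistic, dof = pearson_chi_square(marital_by_sex)
print(round(statistic, 3), dof)
```

The statistic agrees with the value reported for marital status in Table 1 to within rounding, with 1 degree of freedom.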

Table 1

Descriptive statistics of personal characteristics (N = 391).

Variable	Female	%	Male	%	All	%	χ² (df)	P
Age, y							23.758 (2)	.001
≦54	221	56.5	119	30.4	340	87.0
55∼64	11	2.8	29	7.4	40	10.2
≧65	4	1.0	7	1.8	11	2.8
Marital status							11.920 (1)	.001
Single	144	36.8	67	17.1	211	54.0
Married	92	23.5	88	22.5	180	46.0
Mean income per month (USD)							18.307 (3)	.001
No income	35	9.0	13	3.3	48	12.3
≦930	82	21.0	49	12.5	131	33.5
930∼1560	100	25.6	58	14.8	158	40.4
≧1560	19	4.9	35	9.0	54	13.8
Educational level							9.493 (4)	.050
Junior high school	11	2.8	9	2.3	20	5.1
Senior high school	18	4.6	24	6.1	42	10.7
2-y College	38	9.7	29	7.4	67	17.1
4-y University	145	37.1	74	18.9	219	56.0
Graduate School	24	6.1	19	4.9	43	11.0
Business life insurance coverage							1.497 (1)	.132
Yes	133	34.0	97	24.8	230	58.8
No	103	26.3	58	14.8	161	41.2
Occupation							21.381 (5)	.001
Medical sector	48	12.3	15	3.8	63	16.1
Private sector	32	8.2	44	11.3	76	19.4
Government employee	21	5.4	16	4.1	37	9.5
Financial sector	33	8.4	28	7.2	61	15.6
Service sector	47	12.0	28	7.2	75	19.2
Others	55	14.1	24	6.1	79	20.2

Analyses of reliability and validity of the subscales

Three construct factors were extracted from PMHB26 (Table 2 and Supplemental Digital Content 2). All 26 items with Cronbach α ≥0.94, CR ≥0.80, and AVE ≥0.78 for each subscale were observed and showed internal consistency and stability in the study data (Table 2).

Table 2

Validity and average variance extracted.

Construct	Max. Infit	Max. Outfit	Cronbach α	CR	AVE	Threshold difficulty 1	Threshold difficulty 2	Threshold difficulty 3	Threshold difficulty 4
Platform operation (12 items)	1.64	1.62	0.94	0.92	0.80	−7.80	−1.49	3.02	6.27
Resource exchange and integration (6 items)	1.38	0.73	0.96	0.80	0.96	−15.05	−3.13	5.48	12.71
Values co-creation (8 items)	1.20	1.48	0.98	0.96	0.78	−14.65	−2.77	5.65	11.77

Rasch analysis indicated that all items but one had acceptable fit statistics (ie, MNSQ ≤1.5); the exception was item #1 (equipment required [eg, card reader]), with an Infit MNSQ of 1.64 and an Outfit MNSQ of 1.62. Overall, the data fit the Rasch model rather well. All threshold difficulties for each subscale increase monotonically from lower to higher, with a gap between adjacent threshold difficulties of >1.5 logits (Table 2).

Comparison of prediction accuracies in the CNN PMHB26 model

Two classes (ie, n1 = 175 and n2 = 216) divided by k-means clustering were labeled 0 (unsatisfied) and 1 (satisfied). The cutoff point was set at 85 points using ROC analysis. The 26-item CNN PMHB26 model yielded a high accuracy of 98%, an AUC of 0.98 (0.97–0.99), and small Type I and Type II errors (≤3%) based on the 391 cases (Table 3).

Table 3

CNN applied to prediction of the MHB utility (PMHB26) (n = 391).

	True condition		Statistics
CNN classifications and ACC	Satis (+)	Unsatis (−)	PPV/FOR	FDR/NPV
Positive	214	5	0.98	0.02
Negative	2	170	0.01	0.99
Sensitivity	0.99	—	—	—
FPR	0.03	—	—	—
FNR (Miss rate)	0.01	—	—	—
Specificity	0.97	—	—	—
AUC (95% CI)	0.98 (0.97–0.99)	—	—	—
Accuracy (ACC)	0.98

Another visual representation that displays the classification effect is depicted using box plots, AUCs, and bimodal distributions in Figure 2. The AUC and its 95% CI were computed by Eqs. 4 to 7: AUC = (1 − Spec) × Sens/2 + (Sens + 1) × Spec/2 = (1 − 0.97) × 0.99/2 + (0.99 + 1) × 0.97/2 = 0.98, and 95% CI = 0.98 ± 1.96 × SE = 0.97–0.99, where SE = 0.0069.
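The figures in Table 3 and the AUC calculation can be retraced from the four cell counts. In the sketch below, the single-cutoff AUC is the trapezoid formula quoted in the text. The standard error formula did not survive in the text, so the Hanley and McNeil approximation is used here purely as an assumption; applied to the rounded AUC of 0.98 with 216 satisfied and 175 unsatisfied cases, it gives a value close to the reported 0.0069.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard rates from a 2 x 2 confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1.0 - specificity,
        "fnr": 1.0 - sensitivity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def single_cutoff_auc(sens, spec):
    """Trapezoid area under the ROC through (0,0), (1-spec, sens), (1,1)."""
    return (1.0 - spec) * sens / 2.0 + (sens + 1.0) * spec / 2.0


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate SE of an AUC (Hanley-McNeil); its use here is an assumption."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(variance)


metrics = binary_metrics(tp=214, fp=5, fn=2, tn=170)   # counts from Table 3
auc = single_cutoff_auc(0.99, 0.97)                    # rounded Sens and Spec
se = hanley_mcneil_se(0.98, n_pos=216, n_neg=175)
ci = (0.98 - 1.96 * se, 0.98 + 1.96 * se)
print(round(metrics["accuracy"], 2), round(auc, 2), round(se, 4), ci)
```

With a single cutoff, the trapezoid AUC simplifies to the mean of sensitivity and specificity, which is why it lands on 0.98 here.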

Figure 2

Two classes grouped using the convolutional neural networks approach.

Interested readers are encouraged to see the study process in Supplemental Digital Content 3 using the parameters based on the 391 cases to predict the accuracy of the CNN PMHB26 model.

Application detecting satisfactory classes on a web-based assessment

A CNN PMHB26 application for examinees predicting satisfactory classes was developed and is demonstrated at the top of Figure 3. One resulting example of the satisfied class is presented at the bottom of Figure 3 on the CCC (ie, category 0 from the top left corner to the bottom right corner, category 1 from the bottom left side to the top right side) based on the Rasch rating scale model, which is novel and innovative when using a visual display shown on Google Maps. A comparison of 3 domain scores for the respondent was made using the forest plot to display in Figure 4.

Figure 3

Snapshot of the application and the assessment output.

Figure 4

Forest plot comparing the 3 domain scores.

Readers are invited to click on the link in reference to practice the CNN PMHB26 application on their own or see the operational process in Supplemental Digital Content 3. Note that all 38 model parameters for classifying individual binary levels are involved in the Rasch online MCAT module.

Rasch KIDMAP for detecting aberrant responses in an assessment

Once the bubble of satisfied or unsatisfied was clicked (Fig. 5), 3 icons for the KIDMAPs of the 3 constructs appeared.[] One, the resulting KIDMAP (ie, the 12 items regarding satisfaction of the MHB operation and the platform), is shown in Figure 5.

Figure 5

KIDMAP presented for examining the aberrant responses for an examinee.

Z scores and Rasch logit scores are on the x- and y-axes, respectively. Bubbles are sized according to the standard errors (SEs) of the item difficulties. Higher bubbles mean that the items (or the examinee shown in yellow) are more difficult (or higher) on the y-axis. The red color denotes an aberrant response due to a Z score of >2.0, indicating that the response misfits the model's expectation. That is, the response was endorsed as too high (ie, much more satisfied) given the person's estimated measure (2.55 logits). The overall person Outfit MNSQ (2.4) is greater than the cutoff point of 2.0.

Online dashboards shown on Google maps

All dashboards in the figures appear once the QR code is scanned or the links[] are clicked. Readers are advised to examine the details about the information for each entity.

Discussion

Principal findings

In this study, we observed 2 aspects: the accuracy of the CNN–MCAT PMHB26 model for the 2 categories having a higher accuracy rate (0.98) with an AUC of 0.98 (95% CI, 0.97–0.99) and the efficiency of not having to answer all 26 items without compromising the accuracy of the measurement. All 26 items were successfully used to design an application for the respondents to evaluate MHB utility in Taiwan. As such, this study aims to design an online app that could accurately and efficiently classify a respondent's perceptions into positive and negative classes using CNNs and web-based MCAT.

Review of research findings

CNN can improve the prediction accuracy (up to 7.14%). Although >1408 articles were found using “convolutional neural network” searched in the title, we have not seen any study incorporating CNN with MCAT to solve the problem of missing responses in questionnaires. The perception of MHB[] (Supplemental Digital Content 1) was illustrated in this study to verify that the MCAT combined with CNN could be viable and feasible in the application.

MCAT rule and calculation

We set one of the stop rules for the measurement of SEs smaller than a criterion, that is, <0.51 (SD × SQRT [1 – alpha] = 2.09 × SQRT[1 – 0.94]), where alpha is Cronbach reliability and SD denotes the person standard deviation of an assessment so that the minimum number of questions required for completion could be efficiently satisfied because CAT achieves a similar measurement precision to that with only approximately half the test length compared with traditional paper-based examinations.[] In addition, we set at least 5 items in CAT to be completed as another stop rule, which might inflate the test length. The test length of CAT might be slightly higher than that of CAT in previous studies.[] The initial question was randomly selected from a subscale similar to previous studies[] using an item selection strategy. The provisional person measure was estimated using the iterative Newton–Raphson procedure[] after 3 items were answered, avoiding all item answers corresponding to either 0 or 4 as the extreme category in PMHB26, limiting the effective estimation in CAT or MCAT. The next question selected was the one with the highest information value among the remaining unanswered questions weighted against the provisional person measure.[] Details of the CAT procedures are shown in Supplemental Digital Contents 3 to 5.
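The CAT procedure described above (Newton-Raphson person estimation, maximum-information item selection, and the two stop rules) can be sketched as follows under the Rasch rating scale model. This is a simplified unidimensional illustration, not the authors' module: the function names, the step clamp, and the toy parameters are assumptions, and extreme response strings (all 0 or all 4) have no finite estimate, so the loop is capped.

```python
import math


def rsm_moments(theta, delta, taus):
    """Expected score and variance (item information) under the rating scale model."""
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + theta - delta - tau)
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(x * p for x, p in enumerate(probs))
    variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, variance


def se_cutoff(person_sd, alpha):
    """Stop-rule criterion from the text: SD x sqrt(1 - alpha)."""
    return person_sd * math.sqrt(1.0 - alpha)


def estimate_theta(responses, deltas, taus, theta=0.0, tol=1e-6, max_iter=50):
    """Newton-Raphson maximum likelihood estimate; returns (theta, standard error)."""
    for _ in range(max_iter):
        stats = [rsm_moments(theta, d, taus) for d in deltas]
        residual = sum(x - mean for x, (mean, _) in zip(responses, stats))
        information = sum(var for _, var in stats)
        step = max(-1.0, min(1.0, residual / information))  # clamp is an assumption
        theta += step
        if abs(step) < tol:
            break
    information = sum(rsm_moments(theta, d, taus)[1] for d in deltas)
    return theta, 1.0 / math.sqrt(information)


def next_item(theta, remaining, taus):
    """remaining: dict of item id -> difficulty; picks the most informative item."""
    return max(remaining, key=lambda item: rsm_moments(theta, remaining[item], taus)[1])


def should_stop(se, n_answered, cutoff, min_items=5):
    """Stop once at least min_items are answered and the SE is below the cutoff."""
    return n_answered >= min_items and se < cutoff


cutoff = se_cutoff(2.09, 0.94)  # values quoted in the text
print(round(cutoff, 2))
```

A caller would alternate `estimate_theta` and `next_item` until `should_stop` returns true, then fill the remaining items with model-expected responses before classification.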

The CNN–MCAT PMHB model created in this study

We illustrated an example of perceptions about MHB utility in Taiwan (Supplemental digital Content 1). Patients can use the MHB platform to obtain knowledge about their disease and treatment history, which can motivate them to actively participate in their healthcare. This survey on PMHB26 showed that if patients properly use MHB information, the MHB platform will help them immediately understand their own health and medical conditions. The information can be used as a reference when seeking medical treatment, avoiding unnecessary physical checkups, reducing duplicated prescriptions, and promoting doctor-patient interaction and value cocreation.[] The CNN–MCAT PMHB model can be used in other survey-based questionnaires,[] not just limited to perception surveys of MHB.

Implications and applications in the future

Improvement made in accuracy further in the future

Although the accuracy of the CNN–MCAT PMHB26 model could reach 98%, it can be further improved by adjusting a few known responses that cause the CNN to misclassify, for example, with a matching personal response scheme to adapt for the correct classification in the model (MPRSA), designed to drive the model's accuracy toward 100%. Only 7 misclassified cases were found in the CNN PMHB26 model (Table 3 and Fig. 2). All of their summation scores were close to the cutoff point of 85. The MPRSA could achieve the 100% accuracy goal by automatically assigning the designated class whenever the same response string is encountered in the model.

The calculation of AUC and its 95% CI in Eqs. from 4 to 7

CNN can improve the prediction accuracy (up to 7.14%). More than 1408 articles were found using CNN in PubMed Central. Thus far, we have not seen any study, except for the 3 aforementioned studies,[] using Microsoft Excel to perform the CNN in the literature, which is novel and useful in studies with small sample sizes or a survey-based questionnaire, as we did in this study. Furthermore, we did not see any previous 1408 articles clearly and completely describing how to compute the AUC and its 95% CI, as we did in Eqs. from 4 to 7. The curves of category probabilities based on the Rasch rating scale model are sophisticated in Figure 5. Binary categories (eg, success and failure on an assessment in the psychometric field) have been broadly used in health-related outcomes.[] However, the animation-type dashboard with the KIDMAP (Fig. 5)[] shown on Google Maps is particularly designed for the CNN–MCAT PMHB26 model used in this study.

Missing responses in MCAT can be solved

Finally, not all questions were answered in the CAT. Different from those using the mean value over the entire dataset to fill the missing values and those applying CAT[] instead of MCAT,[] we used the expected value in the model for each unanswered item to fill the missing data, as done in previous studies.[] By doing so, the expected responses and CNN parameters can be incorporated to predict the classes of interest.

Strengths of this study

This study has 4 strengths. First, we demonstrated a CNN module incorporated with the Rasch MCAT that classifies online opinions into positive and negative levels, which has not been found in traditional Rasch analysis in the literature. Second, all 26 person responses and 38 model parameters were embedded into the CNN–MCAT PMHB26 model, which can be used for predicting the underlying classes; to our knowledge, no such CNN incorporating MCAT into a prediction model has been reported before. Third, Microsoft Excel was used to execute the CNN prediction model; it is familiar and convenient to ordinary researchers, who can understand this easy-to-use CNN more readily than CNN models in software other than MS Excel. Fourth, an application was designed to display results using the visual dashboard on Google Maps (see the MP4 video in Supplemental Digital Content 3). As with all forms of web-based technology, advances in health communication technology are rapidly emerging. Mobile online CAT assessment is promising and worth considering in several fields of health assessment. It is recommended that readers click on links at references[] to practice the application on their own for more details about the online app.

Limitations and future studies

Although the study aim was achieved with a CNN–MCAT PMHB model that classifies respondents’ perceptions into positive and negative classes accurately (with smaller standard errors) and efficiently (with fewer items), several limitations exist. First, although the psychometric properties of PMHB26 have been validated for measuring perceptions on MHB (Table 2), there is no evidence that the instrument is fully suitable for MCAT assessment, because one item (ie, #1 equipment required [eg, card reader]) had an Infit MNSQ of 1.64 and an Outfit MNSQ of 1.62, both exceeding the criterion (<1.5).[] Additional studies are needed. Second, although the 2 classes were determined with a high accuracy rate (98%) using k-means clustering and the CNN algorithm (Table 3), we cannot guarantee that the CNN is the only approach that improves classification accuracy. Future studies are encouraged to examine other prediction methods, such as logistic regression, naive Bayes, decision trees, random forests, gradient tree boosting,[] and other artificial neural networks. Third, item difficulties were calibrated using Winsteps software under the Rasch rating scale model. Because a Rasch multidimensional model was used in this study, ConQuest software would have been the ideal tool for estimating item difficulties. Fourth, all the model parameters (ie, item difficulties, step-threshold difficulties, and the 38 CNN model parameters) were derived from this study. If the setting or conditions change (eg, the survey subjects or contents), the results (eg, the model's parameters) will differ from those of the present study and will need further verification. Fifth, missing responses were filled with expected scores, an approach that holds only under the Rasch model when each subscale has been verified as unidimensional (eg, Infit and Outfit MNSQs less than 1.5). Otherwise, expected responses cannot be obtained for the items left unanswered in CAT and therefore cannot be used in the CNN algorithm. Future studies should be cautious about this matter. Sixth, the application accessed via references[] should be further developed for Android and iOS users. Whether a shortened PMHB26 achieves measurement (or classification) precision similar to that reported with only about half the test length[] must be investigated using simulation in the future. Finally, the study sample was taken from a patient-perception survey. The model parameters estimated for PMHB26 are suitable only for the PMHB26 questionnaire. Generalizing these parameters to other surveys might be biased because the sample consisted only of patients at one specific study hospital. Additional studies are needed to reexamine whether the psychometric properties of PMHB26 are similar to those found in our study.
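
The fifth limitation relies on expected scores under the Rasch rating scale model. The following Python code is a minimal sketch of how such an expected score could be computed and used to fill items left unanswered in CAT. It is an illustration, not the authors' implementation: the function names, the 0 to 4 scoring of the 5 categories, and all parameter values are assumptions made for this example.

```python
import math


def rsm_category_probs(theta, delta, taus):
    """Category probabilities under the Rasch rating scale model.

    theta: person measure (logits); delta: item difficulty (logits);
    taus: step-threshold difficulties, one per step (4 for 5 categories).
    Categories are assumed to be scored 0..len(taus).
    """
    # Cumulative logit for category k is the sum of (theta - delta - tau_j)
    # over the first k steps; category 0 has an empty sum of 0.
    logits = [0.0]
    running = 0.0
    for tau in taus:
        running += theta - delta - tau
        logits.append(running)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def rsm_expected_score(theta, delta, taus):
    """Expected item score: sum over categories of k * P(X = k)."""
    probs = rsm_category_probs(theta, delta, taus)
    return sum(k * p for k, p in enumerate(probs))


def fill_unanswered(responses, theta, deltas, taus):
    """Replace None (unanswered) entries with model-expected scores."""
    return [
        r if r is not None else rsm_expected_score(theta, d, taus)
        for r, d in zip(responses, deltas)
    ]


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    taus = [-1.5, -0.5, 0.5, 1.5]
    deltas = [0.2, 0.0, -0.4]
    responses = [3, None, 1]  # the second item was not administered
    print(fill_unanswered(responses, theta=0.0, deltas=deltas, taus=taus))
```

In this sketch the filled values are generally not integers. Whether they are rounded before being passed to the CNN is a design choice that the text does not specify.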

Conclusions

The contributions of this study include the following: overcoming the problem of missing responses that affect CNN computation; making the CNN available in Microsoft Excel and incorporating MCAT in classification; and demonstrating an application that combines MCAT with the numerous parameters of a CNN. The CNN–MCAT PMHB26 model is recommended as a template for similar studies that aim to improve the accuracy of classifying individuals. Future work should develop an application that helps patients self-assess their satisfaction across several domains.

Acknowledgments

We thank the Ministry of Science and Technology for the grant funding (MOST 106-2410-H-217-001-SSS, MOST 107-2410-H-217-001-SSS) in 2017 and 2018 that supported this research.

Author contributions

CF conceived and designed the study; YH and TW interpreted the data; YH monitored the process and the manuscript; and TW drafted the manuscript. All authors read and approved the final manuscript. Conceptualization: Chen-Fang Hsu. Data curation: Chen-Fang Hsu, Yu-Hua Yan. Methodology: Tsair-Wei Chien.

44 in total

1. Predicting Hospital Readmission via Cost-Sensitive Deep Learning.

Authors: Haishuai Wang; Zhicheng Cui; Yixin Chen; Michael Avidan; Arbi Ben Abdallah; Alexander Kronzer
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2018-04-16 Impact factor: 3.710

2. A personally controlled electronic health record for Australia.

Authors: Christopher Pearce; Michael Bainbridge
Journal: J Am Med Inform Assoc Date: 2014-03-20 Impact factor: 4.497

3. Technology-assisted patient access to clinical information: an evaluation framework for blue button.

Authors: Timothy P Hogan; Kim M Nazi; Tana M Luger; Daniel J Amante; Bridget M Smith; Anna Barker; Stephanie L Shimada; Julie E Volkman; Lynn Garvin; Steven R Simon; Thomas K Houston
Journal: JMIR Res Protoc Date: 2014-03-27

4. A new technique to measure online bullying: online computerized adaptive testing.

Authors: Shu-Ching Ma; Hsiu-Hung Wang; Tsair-Wei Chien
Journal: Ann Gen Psychiatry Date: 2017-07-03 Impact factor: 3.455

5. Deep Learning Approaches to Detect Atrial Fibrillation Using Photoplethysmographic Signals: Algorithms Development Study.

Authors: Soonil Kwon; Joonki Hong; Eue-Keun Choi; Euijae Lee; David Earl Hostallero; Wan Ju Kang; Byunghwan Lee; Eui-Rim Jeong; Bon-Kwon Koo; Seil Oh; Yung Yi
Journal: JMIR Mhealth Uhealth Date: 2019-06-06 Impact factor: 4.773

6. Application of a multidimensional computerized adaptive test for a Clinical Dementia Rating Scale through computer-aided techniques.

Authors: Yi-Lien Lee; Kao-Chang Lin; Tsair-Wei Chien
Journal: Ann Gen Psychiatry Date: 2019-05-17 Impact factor: 3.455

7. Online assessment of patients' views on hospital performances using Rasch model's KIDMAP diagram.

Authors: Tsair-Wei Chien; Wen-Chung Wang; Hsien-Yi Wang; Hung-Jung Lin
Journal: BMC Health Serv Res Date: 2009-07-31 Impact factor: 2.655

8. Applying computerized adaptive testing to the Negative Acts Questionnaire-Revised: Rasch analysis of workplace bullying.

Authors: Shu-Ching Ma; Tsair-Wei Chien; Hsiu-Hung Wang; Yu-Chi Li; Mei-Shu Yui
Journal: J Med Internet Res Date: 2014-02-17 Impact factor: 5.428

9. Improving Inpatient Surveys: Web-Based Computer Adaptive Testing Accessed via Mobile Phone QR Codes.

Authors: Tsair-Wei Chien; Weir-Sen Lin
Journal: JMIR Med Inform Date: 2016-03-02

10. Deep Learning Intervention for Health Care Challenges: Some Biomedical Domain Considerations.

Authors: Igbe Tobore; Jingzhen Li; Liu Yuhang; Yousef Al-Handarish; Abhishek Kandwal; Zedong Nie; Lei Wang
Journal: JMIR Mhealth Uhealth Date: 2019-08-02 Impact factor: 4.773