
Probabilistic case detection for disease surveillance using data in electronic medical records.

Fuchiang Tsui, Michael Wagner, Gregory Cooper, Jialan Que, Hendrik Harkema, John Dowling, Thomsun Sriburadej, Qi Li, Jeremy U Espino, Ronald Voorhees.

Abstract

This paper describes a probabilistic case detection system (CDS) that uses a Bayesian network model of medical diagnosis and natural language processing to compute the posterior probability of influenza and influenza-like illness from emergency department dictated notes and laboratory results. The diagnostic accuracy of CDS for these conditions, as measured by the area under the ROC curve, was 0.97, and the overall accuracy for NLP employed in CDS was 0.91.


Keywords: case detection; disease surveillance; electronic medical records; influenza

Year: 2011 PMID: 23569615 PMCID: PMC3615792 DOI: 10.5210/ojphi.v3i3.3793

Source DB: PubMed Journal: Online J Public Health Inform ISSN: 1947-2579

Introduction

In disease surveillance, the objective of case detection is to notice the existence of a single individual with a disease. We say that this individual is a case of the disease. Case detection is important because detection of an outbreak typically depends on detection of individual cases.[1] In current practice, cases are detected in four ways: by clinicians, laboratories, screening programs, and computers. Some of these methods of case detection employ case definitions. A case definition is a written statement of findings that are both necessary and sufficient to classify an individual as having a disease or syndrome. More commonly, however, the determination of whether an individual has a disease (or syndrome) is left to the expert judgment of a clinician.
Clinicians detect cases as a by-product of routine medical and veterinary care. The strength of case detection by clinicians is that sick individuals seek medical care. Further, clinicians are experts at diagnosing illness, which is fundamental to case detection. However, not every sick individual sees a clinician. Also, clinicians may not correctly diagnose every individual they see. Clinicians may forget to report cases or fail to report cases in the time frame required by law.[2, 3] Even when a clinician reports a case, the reporting may occur relatively late in the disease process. With some exceptions (e.g., suspected meningococcal meningitis, suspected measles, suspected anthrax), clinicians report cases only after they are certain (or almost certain) about the diagnosis.
A variant of clinician detection is the sentinel clinician approach.[4–10] A sentinel clinician reports the number of individuals he or she sees who match a case definition, e.g., for Influenza-Like Illness (ILI). The strength of sentinel clinician case detection is its relative completeness of reporting. Its limitations are that some cases may not be reported and that reports may be delayed because the process is manual.[11]
Another variant of clinician detection is drop-in surveillance. Drop-in surveillance refers to the practice of asking physicians in emergency rooms to complete a form for each patient seen during the period surrounding a special event.[12-19] The clinicians record whether the patient meets the case definition for one or more syndromes of interest. The strength of drop-in surveillance (and sentinel clinician surveillance) is that it detects sick individuals on the day that they first present for medical care. A limitation is that it is labor intensive.
Laboratories also detect cases as a by-product of their routine operation. Laboratories often become aware of cases of notifiable diseases either before or at the same time as the clinician who ordered the test. The strength of laboratories as case detectors is that they are process oriented; therefore, they may report cases more reliably than busy clinicians. A weakness is that there is not a definitive diagnostic test for every disease, and there may not be a test with 100% sensitivity for a disease. Additionally, a laboratory cannot detect a case unless a sick individual sees a clinician, who must suspect the disease and order a definitive test. Lag times for the completion of laboratory work can be substantial.
Screening programs detect cases by interviewing and testing people during a known outbreak to identify additional cases (or carriers of the disease). Screening is most often used for contagious diseases in which it is important to find infected individuals to prevent further infections.
Finally, computers detect cases by applying case definitions or other algorithmic approaches to routinely collected clinical data. The earliest use of automatic case detection was for hospital infections,[20-26] followed by notifiable conditions[27-30] and syndromes[31-43].
The case definitions used in case detection may be represented either using Boolean logical statements or probabilistic statements. Boolean approaches include the clinical findings that are both necessary and sufficient to classify a case, such as a case of ILI. The statements include AND and OR operations. On the other hand, a probabilistic case definition states evidence that supports or refutes a diagnosis using conditional probabilities and provides probability thresholds above which the diagnosis is considered either confirmed, likely, or suspected, e.g., a confirmed case of a Disease might be defined as P(Disease|Data) > 0.99. In this paper, we describe and evaluate an automated case detection system (CDS) that uses Bayesian network models of diagnosis to represent case definitions. It first derives the likelihood of a disease given the symptoms, signs, and findings (Data) for a patient, namely, P(Data|Disease). It then combines such a likelihood with a prior probability distribution of disease, P(Disease), to derive the posterior probability of disease given the Data, P(Disease|Data). A Bayesian network is a compact representation of a joint probability distribution among the nodes in the network. When a Bayesian network is used to represent the medical diagnosis of a disease, the variables (nodes) include the diagnosis and findings that a physician would use when diagnosing the disease, including significant negative findings that the physician might count against some disease being present. 
For example, negative lab tests that usually have high sensitivity can help physicians rule out a diagnosis.[1, 44, 45] Similarly, positive results of tests with high specificity can help rule in a diagnosis.[44, 45] The relevance of Bayes’ rule to medical diagnosis was first described theoretically by Ledley and Lusted in 1959[46] and was used early on in a diagnostic expert system by Homer Warner in 1961.[47] Developers of diagnostic expert systems continue to use the same methods as did Warner, as well as more complex Bayesian methods.
Bayesian case detection has several theoretical advantages over Boolean case detection: (1) it can use the prior probability of a disease, (2) it can represent the sensitivity and specificity of tests and findings for a disease, (3) it can represent an expert’s knowledge of disease diagnosis in the form of conditional probabilities, (4) it parallels a physician’s diagnostic reasoning under uncertainty by computing posterior probabilities of diseases, and (5) it assists in decision making when new information becomes available.
The current state-of-the-art automated CDSs are (1) electronic laboratory reporting (ELR) systems that are based on laboratory reports, and (2) syndromic surveillance systems that are based on chief complaints.[43, 48] However, the two systems fall at opposite extremes of the diagnostic accuracy and timeliness spectra. With regard to diagnostic accuracy, electronic lab reporting is generally very accurate, whereas syndromic surveillance is generally less so. With regard to timeliness, syndromic surveillance data can be available immediately at the time of a patient visit, whereas an ELR can be delayed for days from the time a lab specimen was drawn.[49] CDS is a component in the probabilistic, decision-theoretic disease surveillance and control system described in an accompanying paper in this issue of the journal.
During the past decade, Bayesian networks have been used not only for case detection but also for outbreak detection. As a representative example, Mnatsakanyan et al.[50] developed Bayesian information fusion networks that compute the posterior probability of an influenza outbreak by using multiple data sources, such as aggregate counts of emergency department (ED) chief complaints that are indicative of influenza and counts of relevant ICD-9 codes from outpatient clinics. As another example, Cooper et al.[51-53] developed the PANDA system and its extensions, which derive the posterior probabilities of CDC Category A diseases (including anthrax, plague, tularemia, and viral hemorrhagic fevers) using ED chief complaints and patient demographic information as evidence. In this paper, we use the diagnoses of influenza and influenza-like illness as examples, although the approach is general and can be applied to other notifiable conditions or syndromes.
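The probabilistic case definition described in this section can be illustrated with a minimal Python sketch of Bayes’ rule for a single binary diagnosis. It is not the CDS implementation (which uses full Bayesian networks in Java). The function names are hypothetical, and only the 0.99 "confirmed" threshold comes from the text; the "likely" and "suspected" thresholds are illustrative assumptions.

```python
def posterior(prior: float, lik_disease: float, lik_no_disease: float) -> float:
    """Bayes' rule: P(Disease | Data) from P(Disease), P(Data | Disease),
    and P(Data | no Disease)."""
    numerator = lik_disease * prior
    evidence = numerator + lik_no_disease * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("Data has zero probability under both hypotheses")
    return numerator / evidence


def classify(p: float, confirmed: float = 0.99,
             likely: float = 0.80, suspected: float = 0.50) -> str:
    """Map a posterior probability to a case status.
    Only the 0.99 threshold appears in the paper; the others are assumed."""
    if p > confirmed:
        return "confirmed"
    if p > likely:
        return "likely"
    if p > suspected:
        return "suspected"
    return "not a case"


# One positive test with sensitivity 0.90 and specificity 0.95, prior 0.01:
# P(Data | Disease) = 0.90 and P(Data | no Disease) = 1 - 0.95 = 0.05.
p = posterior(prior=0.01, lik_disease=0.90, lik_no_disease=0.05)
print(round(p, 3), classify(p))
```

With a low prior, even a fairly specific positive test leaves the posterior well below the confirmation threshold, which is the kind of graded conclusion that a Boolean definition cannot express.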

Methods

This section describes (1) the Bayesian CDS, and (2) an evaluation of its diagnostic accuracy for the diagnosis of influenza and ILI.

Bayesian CDS

The Bayesian CDS includes (1) a natural language component that processes free-text clinical reports and chief complaints, (2) disease models in the form of diagnostic Bayesian networks, (3) a Bayesian inference engine, and (4) a time-series chart reporting engine (Figure 1). The software components, including the inference engine, are implemented in Java.

Figure 1:

CDS and its relationship to other components in a probabilistic, decision-theoretic system for disease surveillance and control. CDS currently operates on clinical data from the UPMC Healthcare System. The blue boxes represent software components and hexagons represent models.

CDS sits between clinical data and ODS, an outbreak detection and characterization system. A component called Phoenix, described in an accompanying paper, receives data from an electronic medical record (EMR) system via HL7 messaging, converts any proprietary codes to LOINC and SNOMED codes, stores the data, and processes requests from CDS. In general, CDS passes the likelihoods P(Data_j | Disease_i) to ODS for each modeled disease i and each patient j in the monitoring period. For example, for a given patient, CDS would send the probability P(Data | influenza) to ODS, where Data denotes the symptoms, signs, and other findings of that patient. An accompanying paper in this issue describes ODS in more detail. In addition, CDS can output the posterior probabilities of modeled diseases for end users, as shown in Figure 1.
Our design criteria for CDS included portability and computational efficiency sufficient to keep up with the volume of new patient data in a large healthcare system.[54] CDS uses the computationally efficient junction tree algorithm[55, 56] for Bayesian inference, which is also used in popular commercial Bayesian inference engines such as Hugin® and Netica®.
We have operated CDS since 2009.[57] It generates daily reports of influenza and ILI and sends them to the Allegheny County Health Department (ACHD) by email (Figure 2). The daily report includes a graph of the daily count of expected influenza cases, which is derived as the sum of the posterior probabilities P(influenza | Data_j) over the patients j seen that day. The graph also includes daily time-series plots of Boolean-based ILI cases, influenza test orders, and influenza-positive cases. Note that the Boolean ILI counts in the daily chart are based on the Boolean case definition (Fever) AND (Cough OR Sore Throat), where the symptoms or findings are extracted by NLP.
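The two daily quantities plotted in the report can be sketched in a few lines of Python. The Boolean rule is the case definition quoted above. The expected-count function assumes, consistent with the "accumulated influenza posterior probability counts" in the Figure 2 caption, that the expected number of cases is the sum of the per-patient posteriors. The function names and the dictionary keys are hypothetical.

```python
def is_boolean_ili(findings: dict[str, bool]) -> bool:
    """Boolean ILI case definition: (Fever) AND (Cough OR Sore Throat).
    Findings that are not reported are treated as False (an assumption)."""
    fever = findings.get("fever", False)
    cough = findings.get("cough", False)
    sore_throat = findings.get("sore_throat", False)
    return fever and (cough or sore_throat)


def expected_case_count(posteriors: list[float]) -> float:
    """Expected number of influenza cases for one day, taken here as the
    sum of P(influenza | Data_j) over that day's patients j."""
    return sum(posteriors)


# Three hypothetical patients seen on the same day.
patients = [
    {"fever": True, "cough": True},
    {"fever": True, "sore_throat": False},
    {"cough": True, "sore_throat": True},
]
boolean_ili_count = sum(is_boolean_ili(p) for p in patients)
expected_flu_count = expected_case_count([0.9, 0.3, 0.05])
print(boolean_ili_count, expected_flu_count)
```

The Boolean count changes only in whole units, whereas the expected count lets every patient contribute in proportion to the strength of the evidence.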

Figure 2:

Influenza and ILI summary chart for February 15, 2010 (showing data from Aug. 1, 2009 to Feb. 14, 2010) in a daily email report to the Allegheny County Health Department. It comprises daily fever counts (from NLP), accumulated influenza posterior probability counts from Bayesian CDS, ILI counts (from NLP), and influenza (flu) test positive counts.

Public health officials in the ACHD have indicated that they find CDS to be useful. The charts shown in Figure 2 illustrate three areas of impact on practice at ACHD.[57] First, CDS provided ACHD with daily updates instead of weekly reports from sentinel physicians. Second, ACHD could provide the charts to local media on a regular basis.[58] Finally, ACHD reduced staff time since they no longer had to manually compile ILI reports from sentinel ILI reports (2 days of work for each weekly report).

Disease models

One of the core components in CDS is a knowledge base that contains disease models represented as Bayesian diagnostic networks. A disease model can include symptoms, signs, diagnosis, radiology findings, and laboratory test results (which we refer to as all-data), or it may use selected data, such as laboratory results, in which case we refer to the network as lab-only. CDS has one Bayesian diagnostic network (disease model) per disease. CDS uses an existing Bayesian network design tool named GeNIe[59] as a front-end graphical user interface for disease model editing. GeNIe, which was developed at the University of Pittsburgh, can be downloaded from the Web[60] for free. GeNIe can convert the proprietary Bayesian network file formats used by Hugin® and Netica® (two of the most popular commercial Bayesian inference engines) into an XML file that can then be fed into CDS, allowing CDS users to import networks already developed by other groups. Note that the GeNIe tool is needed only when a user wishes to revise or create a Bayesian network. Figure 3 shows the GeNIe graphical user interface that allows a physician expert in clinical infectious disease to construct the influenza diagnostic model shown in the right panel.

Figure 3:

Anthrax lab-only diagnostic network. A Bayesian network model for detection of anthrax cases using only laboratory results.

For portability, CDS disease models use standard terminology. For variables in a disease model representing symptoms and signs, we use concept unique identifiers (CUIs) from the Unified Medical Language System (UMLS). Both NLP tools in CDS, the well-known Medical Language Extraction and Encoding system (MedLEE)[61] and the locally developed Topaz, use CUIs to represent extracted symptoms and findings; note that we used Topaz results in this paper. For laboratory tests, we use Logical Observation Identifiers Names and Codes (LOINC).

Lab-only Diagnostic Bayesian Network

Figure 3 shows a lab-only diagnostic model for B. anthracis. The laboratory tests in this model come from the Reportable Condition Mapping Tables (RCMT).[62] This disease model comprises 33 nodes: one parent node for the diagnosis and 32 child nodes that represent laboratory tests for B. anthracis. The names and results of the tests are represented using the Logical Observation Identifiers Names and Codes (LOINC) and Systematized Nomenclature of Medicine (SNOMED) coding systems. The parent node, labeled Anthrax, denotes whether the diagnosis of anthrax equals True or False. The 32 child nodes denote, for each laboratory test, whether the result was positive, negative, or unknown (because it has not been obtained). The structure of this particular model indicates that we are assuming that the tests are independent, given the diagnosis. Any dependencies among tests can be modeled with hidden nodes or by the inclusion of direct arcs among the nodes that denote tests.
We can apply this network to report cases in a manner similar to current electronic laboratory reporting systems. The conditional probability distributions in the network represent the sensitivity and specificity of each laboratory test for the disease anthrax. Let R denote the results of a set of laboratory tests for a given individual. We can perform inference on the network to derive P(anthrax | R). If that probability is above a threshold T, then the case is reported. If the specificities of the tests are assumed to be 1, then any positive test result will lead to a probability of anthrax of 1, which will result in the reporting of the case if T ≤ 1. More generally, however, the sensitivities and specificities of the tests will not be 1, and in turn the probability of anthrax given test results will not be 0 or 1. Thus, in general, there is a need for case reporting that is based on probabilistic modeling and inference.
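Because the lab-only network assumes that tests are conditionally independent given the diagnosis, P(anthrax | R) can be computed directly as a product of per-test likelihoods. The Python sketch below illustrates that calculation under two assumptions that are not from the paper: test results that are unknown are simply skipped (treated as unobserved), and the test names, sensitivities, specificities, and prior are made-up values.

```python
from typing import Optional


def lab_only_posterior(prior: float,
                       tests: dict[str, tuple[float, float]],
                       results: dict[str, Optional[bool]]) -> float:
    """P(disease | R) for conditionally independent tests.

    tests   maps a test name to (sensitivity, specificity).
    results maps a test name to True (positive), False (negative),
            or None (unknown, i.e., not obtained).
    """
    lik_disease = 1.0      # P(R | disease)
    lik_no_disease = 1.0   # P(R | no disease)
    for name, (sensitivity, specificity) in tests.items():
        result = results.get(name)
        if result is None:
            continue  # unknown result: contributes no evidence (assumption)
        if result:
            lik_disease *= sensitivity
            lik_no_disease *= 1.0 - specificity
        else:
            lik_disease *= 1.0 - sensitivity
            lik_no_disease *= specificity
    numerator = lik_disease * prior
    evidence = numerator + lik_no_disease * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("Results are impossible under both hypotheses")
    return numerator / evidence


# Hypothetical tests and parameters, for illustration only.
tests = {"culture": (0.90, 1.00), "pcr": (0.95, 0.99)}
print(lab_only_posterior(0.001, tests, {"culture": True}))  # specificity 1
print(lab_only_posterior(0.001, tests, {"pcr": True}))      # specificity 0.99
```

The first call reproduces the special case in the text: a positive result on a test with specificity 1 drives the posterior to 1. The second shows that an imperfect test leaves the posterior strictly between 0 and 1.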

All-data Diagnostic Bayesian Network for Influenza

We developed an all-data influenza/ILI diagnostic Bayesian network that comprises flu symptoms, findings, and lab tests defined in the RCMT (Figure 4). The symptom and sign nodes and their corresponding conditional probabilities were initially built by author JD, who is board-certified in infectious diseases. The network comprises a total of 368 nodes including 29 symptom nodes, 337 lab test nodes, one test-order node, and one disease node (influenza), which can take the values “true” or “false”. The 337 lab nodes are those tests defined as reporting conditions in RCMT.[63] Note that an NLP algorithm extracts symptoms and signs from free-text clinical reports, and they are used to set the values of the finding nodes.

Figure 4:

Influenza all-data diagnostic network. A Bayesian network model for diagnosing Influenza. The network utilizes data from free text clinical reports, orders for laboratory tests and the results of laboratory tests.

Parameter Estimation

Each model we built has two sets of parameters: expert-assigned conditional probability tables (CPTs) and CPTs estimated by machine learning. We have access to a large corpus of EMRs through the UPMC Health System. We implemented a variation of the well-known Expectation Maximization-Maximum-A-Posteriori (EM-MAP) algorithm[56] for learning network parameters from data. The EM-MAP algorithm is implemented in Java. The algorithm is able to learn network parameters by combining the data with prior knowledge (e.g., from our infectious disease experts and the literature), while being tolerant of missing values in the data.
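The idea of combining expert knowledge with data can be illustrated for a single CPT entry. The Python sketch below is a deliberate simplification and not the authors’ EM-MAP variant: it treats the expert probability as pseudo-counts of a Beta prior and skips records in which the finding is missing, whereas a full EM procedure would replace missing values with expected counts. The function name and the equivalent-sample-size parameter are assumptions.

```python
from typing import Optional, Sequence


def map_cpt_entry(expert_prob: float,
                  equivalent_sample_size: float,
                  observations: Sequence[Optional[bool]]) -> float:
    """Estimate P(finding = present | disease state) from expert prior + data.

    The expert probability is converted to pseudo-counts a and b, which are
    added to the observed counts. This equals the MAP estimate under a
    Beta(a + 1, b + 1) prior. Missing observations (None) are skipped.
    """
    if equivalent_sample_size <= 0:
        raise ValueError("equivalent_sample_size must be positive")
    a = expert_prob * equivalent_sample_size
    b = (1.0 - expert_prob) * equivalent_sample_size
    observed = [x for x in observations if x is not None]
    n_present = sum(observed)
    return (a + n_present) / (a + b + len(observed))


# Expert says P(fever | influenza) = 0.8, worth 10 virtual cases;
# four training cases, one of which has no recorded value for fever.
print(map_cpt_entry(0.8, 10, [True, False, None, True]))
```

A larger equivalent sample size keeps the estimate closer to the expert’s value; with no usable data the estimate is exactly the expert’s value.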

Natural Language Processing

We developed an NLP application called Topaz that determines whether each of 51 findings (e.g., signs, symptoms, and diagnoses) is present, absent (negated), or missing. The findings are those that are expected in influenza and shigellosis cases or that are significant negative findings. Note that CDS will not assign any value to a variable in a disease model when Topaz reports the value of that variable as missing. Topaz comprises three modules. Module 1 looks for relevant clinical conditions and annotates all instances of those conditions in the report. Module 2 determines which annotations are negated, historical, hypothetical, or non-patient. Module 3 integrates the information from the annotations produced by the first two modules to assign a value of present, absent (negated), or missing to each clinical condition for each patient.
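The integration step performed by Module 3 can be sketched as follows. The paper does not give the integration rules, so the ones used here are assumptions: annotations flagged as historical, hypothetical, or non-patient are ignored, and an asserted mention takes precedence over a negated one. The `Annotation` class and `assign_values` function are hypothetical names, not part of Topaz.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One mention of a clinical condition, with Module 2 style flags."""
    condition: str
    negated: bool = False
    historical: bool = False
    hypothetical: bool = False
    non_patient: bool = False


def assign_values(conditions: list[str],
                  annotations: list[Annotation]) -> dict[str, str]:
    """Assign present / absent / missing to each condition for one patient."""
    values = {c: "missing" for c in conditions}
    for ann in annotations:
        if ann.condition not in values:
            continue
        if ann.historical or ann.hypothetical or ann.non_patient:
            continue  # not about the patient's current state (assumed rule)
        if not ann.negated:
            values[ann.condition] = "present"   # asserted mention wins
        elif values[ann.condition] == "missing":
            values[ann.condition] = "absent"
    return values


report_annotations = [
    Annotation("fever"),
    Annotation("cough", negated=True),
    Annotation("headache", historical=True),
]
print(assign_values(["fever", "cough", "headache"], report_annotations))
```

Conditions left as "missing" would not be entered as evidence in the disease model, matching the behavior described above.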

User Interface/Data Viewer

Figure 5 is a screen capture of a data viewer, which gives a patient care-episode view of the data for internal development purposes and serves as a prototype for a health department end-user interface. It displays all data associated with a patient’s visit, including symptoms and signs extracted from free-text reports, lab findings, and CDS output (posterior probabilities).

Figure 5.

Case review web page. A web page that allows users to review disease posterior probabilities (CDS output) and patient data, including lab reports and free-text reports. The posterior probabilities are displayed in descending order, with the highest disease probability at the top.

Event Driven Process

An event-driven process is a software process that defines how a system reacts to an event.[64] We define an event as data that triggers the execution of CDS, such as a laboratory test report or an ED report for a patient’s visit. When an event becomes available to CDS, CDS computes the posterior probabilities of the modeled diseases for the patient. Because a patient’s visit may have multiple events (such as a chief complaint, ED reports, and laboratory test reports) that become available at different points in time, a disease’s posterior probability may change over time. For example, a lab report followed by a free-text discharge report could raise the influenza posterior probability from 0.5 to very close to 1 when the lab report states that an influenza test is positive. Note that a free-text ED report could be available a few hours after the patient visit, whereas a lab report could take days. To obtain an accurate patient diagnosis, when an event becomes available, CDS uses a data linkage key to retrieve all events of every type that have been recorded for the patient visit up to the current time. In particular, CDS uses the visit number as the data linkage key.
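The linkage step can be sketched as a small in-memory store keyed by visit number. This is an illustration under assumptions, not the CDS or Phoenix implementation: the class name, method name, and event fields are hypothetical, and the inference step that would consume the returned events is omitted.

```python
from collections import defaultdict
from typing import Any


class VisitEventStore:
    """Collects events per visit, using the visit number as the linkage key."""

    def __init__(self) -> None:
        self._events: dict[str, list[tuple[float, str, Any]]] = defaultdict(list)

    def on_event(self, visit_number: str, timestamp: float,
                 event_type: str, payload: Any) -> list[tuple[float, str, Any]]:
        """Record a new event and return all events for the same visit,
        oldest first, so that inference can be rerun on the full evidence."""
        self._events[visit_number].append((timestamp, event_type, payload))
        return sorted(self._events[visit_number], key=lambda e: e[0])


store = VisitEventStore()
store.on_event("V1", 1.0, "chief_complaint", "fever and cough")
store.on_event("V2", 2.0, "ed_report", "no fever")
evidence = store.on_event("V1", 50.0, "lab_report", {"influenza_pcr": "positive"})
print([event_type for _, event_type, _ in evidence])
```

Each call returns the complete evidence for the visit, so the posterior recomputed after a late lab report reflects the earlier free-text findings as well.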

Evaluation of Bayesian CDS

We evaluated Bayesian CDS in two ways: 1) case detection performance for one illness (i.e., influenza) from processing one data type, namely ED reports, and 2) NLP (Topaz) performance for extracting findings from ED reports.

Diagnostic Bayesian Network Study

For the study of case detection performance, we evaluated two influenza (all-data) Bayesian networks: 1) an expert influenza network constructed by a board-certified infectious disease domain expert, who assessed both the structure and the parameters of the Bayesian network, and 2) an EM-MAP trained influenza network. Note that both networks share the same structure but have different parameters.

Training and Testing Data for Case Detection Evaluation

In this study, we used ED reports from the UPMC Health System to measure the performance of CDS for influenza case detection. All the ED reports used for evaluation were de-identified by an honest broker using the De-ID tool.[65] The training data comprised 182 influenza cases and 47,062 non-influenza cases. The test data consisted of 58 influenza-positive cases and 522 non-influenza cases. All cases were selected randomly from EMRs in the UPMC Health System. We considered a patient to have influenza if: 1) a polymerase chain reaction (PCR) test was positive, and 2) the linked ED reports contained the keywords flu, influenza, or H1N1 in the Impression section or Diagnosis section. We considered a patient to not have influenza if: 1) no flu tests were ordered, and 2) the ED visit occurred from July 1, 2010 through August 31, 2010 for the training data, or from July 1, 2011 through July 31, 2011 for the test data.

Evaluation Metrics

The evaluation metrics used in this study include: ROC curves, area under a ROC curve (AUROC), probability of data given each of two diagnostic Bayesian networks as stated in the above paragraph, and the average speed for processing one case.
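For readers unfamiliar with AUROC, the following Python sketch computes it with the standard pairwise (Mann-Whitney) formulation: the fraction of influenza/non-influenza pairs in which the influenza case receives the higher posterior probability, counting ties as one half. The paper does not state how its AUROC values or confidence intervals were computed, so this is an illustration rather than the authors’ procedure; the scores shown are made up.

```python
def auroc(scores_pos: list[float], scores_neg: list[float]) -> float:
    """Area under the ROC curve from posterior probabilities assigned to
    true cases (scores_pos) and non-cases (scores_neg)."""
    if not scores_pos or not scores_neg:
        raise ValueError("Need at least one case and one non-case")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical posteriors for two influenza and two non-influenza patients.
print(auroc([0.9, 0.3], [0.5, 0.1]))
```

The double loop is quadratic in the number of cases, which is adequate for a test set of a few hundred patients.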

Topaz (NLP) Evaluation

We randomly selected 201 ED reports with positive flu PCR tests. The gold standard for evaluating Topaz was expert annotation. Three board-certified physicians annotated the ED reports for a set of 51 signs, symptoms, and other findings that are expected in influenza and shigellosis cases. To ensure reliability, all three annotators first went through training sessions; when the measured kappa value was above 0.8, they started annotating the 201 ED reports. The evaluation metrics used in this study include kappa values, accuracy, recall, and precision.

Results

This section provides the evaluation results.

Diagnostic Bayesian Networks

Figure 6 shows the two ROC curves for the expert model and the EM-MAP trained model for the total 580 test cases. The expert model has AUROC 0.956 (95% CI: 0.936–0.977) and the EM-MAP model has AUROC 0.973 (95% CI: 0.955–0.992).

Figure 6.

ROC curves for two influenza Bayesian networks. The blue line (with dots) represents the influenza model with parameters assigned by a domain expert and the pink line (with dashes) represents the influenza model with parameters learned by EM-MAP algorithm.

We measured the computational speed for computing the posterior probabilities and EM-MAP training. The average run time for computing influenza posterior probability is 15 milliseconds per case. The speed performance was measured on a desktop computer with Intel® Core™ 2 Quad CPU Q9550, 2.83 GHz and 4GB RAM.

Topaz

Table 1 summarizes the performance of Topaz. The kappa value between the gold standard and Topaz was 0.79. The overall accuracy including absent (negated), present, and missing findings was 0.91 and the accuracy for only absent and present was 0.77.

Table 1.

Topaz performance

Finding value	Recall	Precision
Absent (negated)	0.82 (1022/1249)	0.84 (1022/1220)
Present	0.73 (1109/1526)	0.84 (1109/1319)
Missing	0.96 (7205/7476)	0.93 (7205/7712)
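The accuracy figures reported in the text can be recomputed from the counts in Table 1. The Python sketch below assumes that accuracy is the number of correctly assigned findings divided by the number of gold-standard findings in the selected categories; that interpretation is an assumption, although it reproduces the reported values of 0.91 and 0.77.

```python
# (correct, gold-standard total, Topaz total) for each finding value in Table 1
TABLE_1 = {
    "absent": (1022, 1249, 1220),
    "present": (1109, 1526, 1319),
    "missing": (7205, 7476, 7712),
}


def recall_precision(value: str) -> tuple[float, float]:
    """Recall = correct / gold total; precision = correct / Topaz total."""
    correct, gold_total, system_total = TABLE_1[value]
    return correct / gold_total, correct / system_total


def accuracy(values: list[str]) -> float:
    """Correct assignments divided by gold-standard findings (assumed definition)."""
    correct = sum(TABLE_1[v][0] for v in values)
    gold_total = sum(TABLE_1[v][1] for v in values)
    return correct / gold_total


for value in TABLE_1:
    recall, precision = recall_precision(value)
    print(value, round(recall, 2), round(precision, 2))
print("overall accuracy:", round(accuracy(["absent", "present", "missing"]), 2))
print("absent/present accuracy:", round(accuracy(["absent", "present"]), 2))
```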

Discussion

The results of the evaluation of the two influenza Bayesian networks (expert model and EM-MAP model) show high diagnostic accuracy. Additionally, augmenting the expert’s conditional probability distributions with empirical data about those distributions improves the diagnostic accuracy of influenza case detection. The performance of the Topaz natural language processing algorithm for influenza findings approaches that of medical experts, as indicated by the kappa value of 0.79 and the overall accuracy of 91%.
The evaluation of the Bayesian diagnostic models has a limitation. Although we obtained non-influenza cases from patient visits that occurred in the summer and were not associated with an order for an influenza test, it is possible that there are influenza cases in the non-influenza training and testing data. However, any such contamination would be expected to bias the experiment against finding good diagnostic accuracy. We also note that our current influenza model (Figure 4) should be modified to distinguish between Influenza A and B, which we plan as future work.
Of the four types of case detection discussed in the introduction (clinician, laboratory, screening, and computerized), the principal role of Bayesian CDS is in computerized (automatic) case detection. CDS can also be used to augment laboratory, clinician, and screening case detection systems. To assist clinical diagnosis, the differential diagnoses output by CDS can be fed back directly to clinicians, or to other computer systems that provide decision support to clinicians at the point of care by reminding them of diagnoses, notification requirements, vaccination, and history items to obtain or laboratory tests to order.
For laboratory-based case detection, the lab-only approach to Bayesian case detection discussed in this paper is a superset of current ELR approaches and has the advantage of being able to represent the uncertainty associated with tests of lower sensitivity or specificity. For screening, the ability of the Bayesian CDS to represent a probabilistic case definition could be a significant advantage for emerging diseases whose case definitions may be evolving or depend on a constellation of symptoms and signs.

Conclusion

We developed an automatic case detection system that uses Bayesian networks as disease models and NLP to extract patient information from free-text clinical reports. The system computes disease probabilities given data from electronic medical records. The system is in use for influenza monitoring in Allegheny County, PA, automatically reporting daily summary charts to public health officials.[57] The Bayesian CDS can function as a probabilistic ELR system or an all-data case-detection system. CDS is capable of integrating diagnostic information about a patient with prior probabilities of diseases to compute a probabilistic differential diagnosis that can be used in clinical decision support. The case probabilities derived by CDS can also be used as a key component for a system that detects and characterizes outbreak diseases in the population; a companion paper in this issue discusses a system called ODS that does just that.

References (10 of 49 listed)

1. Syndromic surveillance for bioterrorism following the attacks on the World Trade Center--New York City, 2001.

Authors:
Journal: MMWR Morb Mortal Wkly Rep Date: 2002-09-11 Impact factor: 17.586

2. Reasoning foundations of medical diagnosis; symbolic logic, probability, and value theory aid our understanding of how physicians reason.

Authors: R S LEDLEY; L B LUSTED
Journal: Science Date: 1959-07-03 Impact factor: 47.728

3. Automated encoding of clinical documents based on natural language processing.

Authors: Carol Friedman; Lyudmila Shagina; Yves Lussier; George Hripcsak
Journal: J Am Med Inform Assoc Date: 2004-06-07 Impact factor: 4.497

4. A Bayesian spatio-temporal method for disease outbreak detection.

Authors: Xia Jiang; Gregory F Cooper
Journal: J Am Med Inform Assoc Date: 2010 Jul-Aug Impact factor: 4.497

5. A systems overview of the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE II).

Authors: Joseph Lombardo; Howard Burkom; Eugene Elbert; Steven Magruder; Sheryl Happel Lewis; Wayne Loschen; James Sari; Carol Sniegoski; Richard Wojcik; Julie Pavlin
Journal: J Urban Health Date: 2003-06 Impact factor: 3.671

6. Computer surveillance of hospital-acquired infections and antibiotic use.

Authors: R S Evans; R A Larsen; J P Burke; R M Gardner; F A Meier; J A Jacobson; M T Conti; J T Jacobson; R K Hulse
Journal: JAMA Date: 1986 Aug 22-29 Impact factor: 56.272

7. Measles reporting completeness during a community-wide epidemic in inner-city Los Angeles.

Authors: D P Ewert; S Westman; P D Frederick; S H Waterman
Journal: Public Health Rep Date: 1995 Mar-Apr Impact factor: 2.792

8. Syndromic surveillance for measleslike illnesses in a managed care setting.

Authors: James D Nordin; Rafael Harpaz; Peter Harper; William Rush
Journal: J Infect Dis Date: 2004-05-01 Impact factor: 5.226

9. Disease outbreak detection system using syndromic data in the greater Washington DC area.

Authors: Michael D Lewis; Julie A Pavlin; Jay L Mansfield; Sheilah O'Brien; Louis G Boomsma; Yevgeniy Elbert; Patrick W Kelley
Journal: Am J Prev Med Date: 2002-10 Impact factor: 5.043

10. Improvement in user performance following development and routine use of an expert system.

Authors: M G Kahn; S A Steib; E L Spitznagel; D W Claiborne; V J Fraser
Journal: Medinfo Date: 1995
