Alexandra Pomares-Quimbaya, Markus Kreuzthaler, Stefan Schulz.
Abstract
BACKGROUND: Identifying sections in the narrative content of Electronic Health Records (EHR) has been shown to improve the performance of clinical extraction tasks; however, there is not yet a shared understanding of the concept or of the existing methods. The objective is to report the results of a systematic review of approaches that identify sections in the narrative content of EHR using automatic or semi-automatic methods.
Keywords: Clinical narrative; Electronic health record; Free text; Machine learning; Natural language processing; Section identification
Year: 2019 PMID: 31319802 PMCID: PMC6637496 DOI: 10.1186/s12874-019-0792-y
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Selection method
Profiling of the studies
| Reference | Method | Focus | Section | Type | Language | Narrative |
|---|---|---|---|---|---|---|
| Doan et al. [ | R | S | G&S | B | English | D |
| Denny et al. [ | R | P | G&S | B | English | H,Phy |
| Denny et al. [ | R | S | G&S | B | English | D,P |
| Edinger et al. [ | R | P | G | E | English | D,R,N |
| Hsu et al. [ | R | S | G&S | B | English | Pat,R |
| Kropf et al. [ | R | S | G&S | E | German | Pat |
| Lee and Choi [ | R | P | G | B | Korean | D |
| Lin et al. [ | R | S | G | E | English | C,Pat |
| Mehrabi et al. [ | R | S | G&S | E | English | C |
| Melton et al. [ | R | P | G&S | E | English | O |
| Meystre and Haug [ | R | S | G | E | English | NS |
| Taira et al. [ | R | P | G | B | English | R |
| Phuong and Chau [ | R | S | G | E | English | D |
| Rubin and Desser [ | R | S | G | E | English | R |
| Schadow and McDonald [ | R | S | G&S | E | English | Pat |
| Schuemie et al. [ | R | P | G&S | E | English | C |
| Shivade et al. [ | R | S | G&S | B | English | D,A,L |
| Singh et al. [ | R | S | G | E | English | R |
| Suominen et al. [ | R | S | G | E | Finnish | D,A,P,N |
| Tran et al. [ | R | P | G&S | B | English | NS |
| Wang et al. [ | R | S | G | E | English | D |
| Xu et al. [ | R | S | G | E | English | D |
| Bramsen et al. [ | ML | P | G | I | English | D |
| Deléger and Névéol [ | ML | S | G | I | French | NS |
| Haug et al. [ | ML | P | G | B | English | D,O,C,Pat,R,H,Phy |
| Jancsary et al. [ | H | P | G&S | E | German | Dic |
| Li et al. [ | ML | P | G | B | English | D,A,C |
| Lohr et al. [ | ML | P | G&S | B | German | D |
| Mowery et al. [ | ML | P | G&S | B | English | E |
| Tepper et al. [ | ML | P | G&S | B | English | D,R |
| Waranusast et al. [ | ML | P | G | I | Other | C |
| Apostolova et al. [ | H | P | G | B | English | R |
| Cho et al. [ | H | P | G&S | B | English | R,U |
| Dai et al. [ | H | P | G | B | English | D,O |
| Ganesan and Subotin [ | H | P | G | E | English | NS |
| Ni et al. [ | H | P | G&S | E | NS | NS |
| Sadoughi et al. [ | H | P | G | B | English | Dic |
NS=Not Specified; Focus: P=Primary, S=Secondary; Section: G=General sections, G&S=General sections and subsections; Type: E=Explicit, I=Implicit, B=Both; Narrative: D=Discharge Summaries, A=Admission Notes, P=Progress Notes, O=Operative or Procedure Notes, C=Clinic Visit Notes, E=Emergency Reports, Pat=Pathology Reports, R=Radiology Reports, U=Urology Reports, H=History or Family History, Phy=Physical Exam, N=Nursing Notes, Dic=Medical Dictation, L=Letter of Communication; Method: ML=Machine Learning, R=Rule-based, H=Hybrid
Rule based studies
| Reference | Type of rules | Info. required | Generated features | Method |
|---|---|---|---|---|
| Edinger et al. [ | E | Flat | NS | R |
| Ni et al. [ | E | Flat | NS | H |
| Phuong and Chau [ | E | Flat | NS | R |
| Singh et al. [ | E | Flat | NS | R |
| Wang et al. [ | E | Flat | NS | R |
| Apostolova et al. [ | R | Flat | F | H |
| Chen et al. [ | R | Flat | F | H |
| Hsu et al. [ | R | Flat | F | R |
| Kropf et al. [ | R | Flat | NS | R |
| Lin et al. [ | R | Flat | NS | R |
| Melton et al. [ | R | Flat | F | R |
| Meystre and Haug [ | R | Flat | F | R |
| Rubin and Desser [ | R | Flat | NS | R |
| Sadoughi et al. [ | R | Flat | F | H |
| Schadow and McDonald [ | R | Flat | NS | R |
| Schuemie et al. [ | R | Do not require | F | R |
| Taira et al. [ | R | Flat | F | R |
| Ganesan and Subotin [ | E,R | Flat | F | H |
| Jancsary et al. [ | E,R | Dictionary | NS | H |
| Lee and Choi [ | P | Hierarchy | NS | R |
| Tran et al. [ | P | Hierarchy | F,C | R |
| Suominen et al. [ | R,P | Flat | F,C | R |
| Cho et al. [ | E,R,P | Flat | F,H | H |
| Denny et al. [ | E,R,P | Hierarchy | F,H | R |
| Denny et al. [ | E,R,P | Hierarchy | F,H | R |
| Doan et al. [ | E,R,P | Hierarchy | F,H | R |
| Mehrabi et al. [ | E,R,P | Hierarchy | F,H | R |
| Shivade et al. [ | E,R,P | Flat | F,C | R |
| Xu et al. [ | E,R,P | Hierarchy | F,H | R |
NS=Not Specified; Type of rules: E=Exact matching, R=Regular expressions, P=Probabilistic rules; Generated features: F=Formatting features, C=Concepts or terms contained, H=Heading probabilities; Method: R=Rule-based, H=Hybrid
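To make the rule types in the table concrete, the following is a minimal sketch of a rule-based section splitter that combines exact matching (E) against a flat heading list with a regular expression (R) built on formatting features (F). It is not the implementation of any reviewed study; the heading list, the pattern, and the name `split_sections` are assumptions chosen for illustration.

```python
import re

# Hypothetical flat list of known headings, used for exact matching ("E").
KNOWN_HEADINGS = {"history of present illness", "medications", "allergies", "impression"}

# Hypothetical regular-expression rule ("R") based on formatting features ("F"):
# a line that starts with a capital letter, followed by letters, spaces,
# "/", "&" or "-", and then a colon. Text after the colon is kept as content.
HEADING_PATTERN = re.compile(r"^\s*([A-Z][A-Za-z /&-]{1,60}):\s*(.*)$")


def split_sections(text):
    """Split a note into (heading, body) pairs using the two rules above."""
    # Lines seen before the first heading are collected under "preamble".
    sections = [["preamble", []]]
    for line in text.splitlines():
        normalized = line.strip().rstrip(":").lower()
        match = HEADING_PATTERN.match(line)
        if normalized in KNOWN_HEADINGS:
            # Exact match: the whole line is a known heading.
            sections.append([normalized, []])
        elif match:
            # Regex match: heading before the colon, optional content after it.
            rest = match.group(2)
            sections.append([match.group(1).strip().lower(), [rest] if rest else []])
        else:
            # Ordinary line: belongs to the most recent section.
            sections[-1][1].append(line)
    joined = [(heading, "\n".join(body).strip()) for heading, body in sections]
    # Drop the preamble when nothing precedes the first heading.
    return [(h, b) for h, b in joined if not (h == "preamble" and not b)]


note = "MEDICATIONS:\nAspirin 81 mg daily\nAllergies: none known\nImpression\nStable."
for heading, body in split_sections(note):
    print(heading, "->", body)
```

A flat heading list like this one corresponds to the "Flat" value in the "Info. required" column; studies marked "Hierarchy" would additionally need parent and child relations between headings, which this sketch does not model.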
Machine learning studies
| Reference | ML method | Training and test data set source | Training data set size | Test data set size | Method |
|---|---|---|---|---|---|
| Bramsen et al. [ | AdaBoost | M | 60 | CV | ML |
| Haug et al. [ | Bayesian Network | M | 3483 | CV | ML |
| Chen et al. [ | Conditional Random Fields | M, RB, CO | 790 | 514 | H |
| Deléger and Névéol [ | Conditional Random Fields | M | 100 | 600 | ML |
| Ni et al. [ | Conditional Random Fields and Maximum Entropy Classifier | M, AL | NS | NS | H |
| Jancsary et al. [ | Conditional Random Fields and Viterbi | M, RB | 2340 | 1003 | H |
| Cho et al. [ | Expectation Maximization Classifier | M, RB | NS | NS | H |
| Li et al. [ | Hidden Markov Model and Viterbi | M, RB | 7549 | 2130 | ML |
| Lohr et al. [ | Logistic Regression | M | 1106 | CV | ML |
| Ganesan and Subotin [ | Logistic Regression and Viterbi | M, RB | 1800 | 12502 | H |
| Tepper et al. [ | Maximum Entropy Classifier | M, CO | 1365 | 374 | ML |
| Sadoughi et al. [ | Neural Network | M, RB | 25842 | 2000 | H |
| Apostolova et al. [ | Support Vector Machine | M, RB | 3000 | 200 | H |
| Mowery et al. [ | Support Vector Machine | M | 50 | CV | ML |
| Waranusast et al. [ | Support Vector Machine and KNN | M | 10694 | CV | ML |
NS=Not Specified; Training and test data set source: M=Manually created, RB=Using a rule-based approach, CO=Using a data set provided by competition organizers, AL=Using an active learning strategy; Test data set size: CV=Cross validation; Method: ML=Machine Learning, H=Hybrid
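Several studies in the table pair a per-line classifier or a Hidden Markov Model with Viterbi decoding. As a rough illustration of why that pairing helps, the sketch below runs a standard log-space Viterbi pass over per-line label scores so that the chosen section labels are consistent across neighbouring lines. The labels, probabilities, and the function name `viterbi` are invented for the example and are not taken from any reviewed study.

```python
import math


def viterbi(emissions, transition, start):
    """Return the most likely label sequence.

    emissions:  list (one entry per line) of dicts label -> log-score
    transition: dict (previous_label, current_label) -> log-score
    start:      dict label -> log-score for the first line
    """
    labels = list(start)
    # best[t][label] = score of the best path ending in `label` at line t
    best = [{lab: start[lab] + emissions[0][lab] for lab in labels}]
    back = []
    for scores in emissions[1:]:
        column, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transition[(p, cur)])
            column[cur] = best[-1][prev] + transition[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        best.append(column)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical per-line classifier probabilities for a three-line note.
line_probs = [
    {"HISTORY": 0.9, "MEDICATIONS": 0.1},
    {"HISTORY": 0.4, "MEDICATIONS": 0.6},
    {"HISTORY": 0.8, "MEDICATIONS": 0.2},
]
emissions = [{k: math.log(v) for k, v in p.items()} for p in line_probs]
# Assumed transition model: staying in a section is far more likely than switching.
transition = {
    (a, b): math.log(0.9 if a == b else 0.1)
    for a in ("HISTORY", "MEDICATIONS")
    for b in ("HISTORY", "MEDICATIONS")
}
start = {"HISTORY": math.log(0.5), "MEDICATIONS": math.log(0.5)}

print(viterbi(emissions, transition, start))
```

Labelling each line independently would assign the middle line to MEDICATIONS; with the assumed transition scores, the decoded sequence keeps all three lines in HISTORY because a one-line section switch is penalised twice.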
Machine learning features
| Reference | Lexical | Syntactical | Semantic | Contextual | Method |
|---|---|---|---|---|---|
| Bramsen et al. [ | U,N | POS | RT,AT,T | LP | ML |
| Haug et al. [ | N | NS | NS | NS | ML |
| Chen et al. [ | C,A | Pun | ST | WL | H |
| Deléger and Névéol [ | NS | NS | NS | NS | ML |
| Ni et al. [ | NS | NS | NS | NS | H |
| Jancsary et al. [ | N | POS,Pun | ST | LP | H |
| Cho et al. [ | NS | NS | ST | SS,OS | H |
| Li et al. [ | N | NS | NS | SB | ML |
| Lohr et al. [ | U | NS | NS | NS | ML |
| Ganesan and Subotin [ | U,N,C | NS | ST | LP,LL,LC,CC | H |
| Tepper et al. [ | U,C | Nu | NS | LP,WL,SS,SB | ML |
| Sadoughi et al. [ | NS | NS | NS | SB | H |
| Apostolova et al. [ | N,C | Pun | ST | LP,WL,SB | H |
| Mowery et al. [ | U,N | POS,VT | ST,DI,MN | LP,LL,SB | ML |
| Waranusast et al. [ | NS | NS | NS | SB | ML |
NS=Not Specified; Lexical: U=Unigram, N=N-gram, C=Capitalized, A=Affixes; Syntactical: POS=Word part of speech, VT=Verb tense, Pun=Punctuation, Nu=Contains or begins with a number; Semantic: ST=Semantic type (e.g. UMLS, LOINC), DI=De-identification tag, MN=Meaning of the number (e.g. phone, dose), RT=Relative temporal word (e.g. later, next, until), AT=Absolute temporal word (e.g. am, pm), T=Topic of the section; Contextual: LP=Line position in the document, LL=Length of a line, WL=White lines before and after a line, LC=Length change from one line to another, SS=Section size, SB=Previous and following section boundaries, OS=Order of sections, CC=Capital and colon use; Method: ML=Machine Learning, H=Hybrid
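As an illustration of how some of the features in the legend could be computed, the sketch below derives a handful of them for every line of a note. The function name `line_features`, the dictionary keys, and the exact definitions (for example, normalising line position to the range 0 to 1) are assumptions; the reviewed studies may define these features differently.

```python
def line_features(text):
    """Compute simple per-line features for section boundary classification."""
    lines = text.splitlines()
    n = len(lines)
    features = []
    for i, line in enumerate(lines):
        stripped = line.strip()
        prev_len = len(lines[i - 1].strip()) if i > 0 else 0
        features.append({
            # LP: relative position of the line in the document (0.0 to 1.0)
            "line_position": i / (n - 1) if n > 1 else 0.0,
            # LL: length of the line without surrounding whitespace
            "line_length": len(stripped),
            # WL: is the previous / next line blank?
            "blank_before": i > 0 and not lines[i - 1].strip(),
            "blank_after": i < n - 1 and not lines[i + 1].strip(),
            # LC: length change relative to the previous line
            "length_change": len(stripped) - prev_len,
            # CC: starts with a capital letter and ends with a colon
            "capital_and_colon": stripped[:1].isupper() and stripped.endswith(":"),
            # C: every cased character in the line is upper case
            "all_caps": stripped.isupper(),
            # Nu: line begins with a digit
            "starts_with_number": stripped[:1].isdigit(),
        })
    return features


note = "HISTORY:\n\nPatient reports cough.\n2 days of fever"
for row in line_features(note):
    print(row)
```

Feature dictionaries of this kind can be passed to any of the classifiers listed in the machine learning studies table after vectorisation; that step is omitted here.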
Application scenarios
| Application | Studies |
|---|---|
| Building structured data | Rubin and Desser [ |
| Contextualized search | Meystre and Haug [ |
| Coreference resolution | Xu et al. [ |
| Named entity recognition | Lei et al. [ |
| De-identification process | Phuong and Chau [ |
| Temporal analysis | Bramsen et al. [ |
| Education | Denny et al. [ |
| Quality analysis | Hsu et al. [ |
Fig. 2 Distribution of performance results
Fig. 3 Individual performance results