Nora Joseph1, Ida Lindblad2, Sara Zaker2, Sharareh Elfversson2, Maria Albinzon2, Øyvind Ødegård2, Li Hantler2, Per M Hellström1.
Abstract
BACKGROUND: Electronic medical records (EMRs) are adopted for storing patient-related healthcare information. Using data mining techniques, it is possible to make effective use of this massive amount of data. We aimed to evaluate the validity of data extracted by the Customized eXtraction Program (CXP).
Keywords: Big data; data analytics; data extraction; data mining; electronic medical records
Year: 2022 PMID: 35173908 PMCID: PMC8809051 DOI: 10.48101/ujms.v127.8260
Source DB: PubMed Journal: Ups J Med Sci ISSN: 0300-9734 Impact factor: 2.384
The index terms in the form of structured and unstructured data that the CXP software is capable of extracting from electronic medical records of patients.
| Structured data | Unstructured data |
|---|---|
| • Demographics: | • Imaging: |
| Gender | Radiology |
| Birth data | Computerised tomography |
| | Magnetic resonance imaging |
| | Scintigraphy |
| • Diagnosis: | • Laboratory data: |
| Primary | Microbiology |
| Secondary | Pathology |
| • Measurements: | • Case notes |
| Blood pressure | Text body of electronic medical records |
| Heart rate | |
| Breathing rate | |
| Body weight and height | |
| • Medication: | • Referrals: |
| Prescribed | Between health care providers |
| Administered | |
| • Procedures: | |
| Surgical | |
| Medical | |
| • Laboratory data: | |
| Clinical chemistry and pharmacology | |
Figure 1. The CXP computerised process from extraction to end-product.
Note: 1) Initial extraction: the search for K51 (UC) in all EMRs in Uppsala (n = 2,802). 2) Extraction of target objects according to the inclusion and exclusion criteria (n = 332). 3) Cleaning procedure, in which all raw data are pseudonymised. The pseudonymised data are then exported to form a clean, secure database (A, CXP data) and a key code file (B) containing the personal identifiers and the link between the pseudonymised data and the individual EMR (stored separately).
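The split into a pseudonymised database (A) and a separately stored key code file (B) can be illustrated with a minimal Python sketch. This is not the CXP implementation: the function `pseudonymise`, the field names `personal_id` and `pseudonym`, and the sample records are all hypothetical and serve only to show the idea.

```python
import secrets

def pseudonymise(records, id_field="personal_id"):
    """Split raw records into pseudonymised data (A) and a key code file (B).

    Hypothetical helper; field names are assumptions, not CXP's schema.
    """
    key_file = {}        # B: pseudonym -> personal identifier (store separately)
    clean_data = []      # A: records with the identifier replaced by a pseudonym
    pseudonym_for = {}   # reuse one pseudonym for all records of the same patient
    for record in records:
        personal_id = record[id_field]
        if personal_id not in pseudonym_for:
            pseudonym = secrets.token_hex(8)  # random, not derived from the ID
            pseudonym_for[personal_id] = pseudonym
            key_file[pseudonym] = personal_id
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["pseudonym"] = pseudonym_for[personal_id]
        clean_data.append(clean)
    return clean_data, key_file

# Made-up example records; two belong to the same patient.
raw = [
    {"personal_id": "P-001", "icd10": "K51", "heart_rate": 72},
    {"personal_id": "P-002", "icd10": "K51", "heart_rate": 80},
    {"personal_id": "P-001", "icd10": "K51", "heart_rate": 76},
]
cxp_data, key_code_file = pseudonymise(raw)
```

Because the pseudonym is random rather than derived from the identifier, the link back to the individual EMR exists only in the key code file.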
Inclusion and exclusion criteria forming the target population for the research study Ga29103, GARDENIA.
| Inclusion criteria | Exclusion criteria |
|---|---|
| Moderately to severely active UC as determined by the Mayo Clinic Score | A history of, or current, conditions and diseases affecting the digestive tract, including UC, indeterminate colitis, suspicion of ischemic, radiation or microscopic colitis, Crohn’s disease, fistulas or abdominal abscesses, colonic mucosal dysplasia, intestinal obstruction, toxic megacolon or unremoved adenomatous polyps |
| Gender: men and women | Prior or planned surgery for UC |
| Age: 18–80 years | Past or present ileostomy or colostomy |
| Naïve to treatment with any TNF inhibitor therapy (including TNF inhibitor biosimilars) | Have received non-permitted inflammatory bowel disease (IBD) therapies (including infliximab, adalimumab, golimumab, ustekinumab, certolizumab, natalizumab, vedolizumab, efalizumab, or tofacitinib) |
| Inadequate response to or intolerance of prior corticosteroid and/or immunosuppressant treatment | Chronic hepatitis B or C infection, human immunodeficiency virus (HIV) or tuberculosis (active or latent) |
| Background regimen for UC may include oral 5-aminosalicylate, oral corticosteroids, budesonide multi-matrix system, probiotics, azathioprine, 6-mercaptopurine, or methotrexate if doses have been stable during the screening period | History of moderate or severe allergic anaphylactic/anaphylactoid reactions to chimeric, human, or humanised antibodies; fusion proteins, or murine proteins; hypersensitivity to etrolizumab or any of its excipients |
| Use of hormonal contraception during and at least 24 weeks after the last dose of the study drug | |
Note: Patients fulfilling the inclusion but not the exclusion criteria were extracted from the EMR using the structured data in CXP.
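Selecting patients who fulfil the inclusion criteria but none of the exclusion criteria amounts to a filter over structured fields. The Python sketch below is an illustration under stated assumptions, not the CXP logic: the `Patient` fields, the helper names and the subset of drugs in `EXCLUDED_DRUGS` are hypothetical, and only a few criteria from the table (age, K51 diagnosis, stoma, non-permitted therapies) are modelled.

```python
from dataclasses import dataclass, field

# Illustrative subset of the non-permitted therapies listed in the table.
EXCLUDED_DRUGS = {"infliximab", "adalimumab", "golimumab", "vedolizumab"}

@dataclass
class Patient:
    pseudonym: str
    age: int
    diagnoses: set = field(default_factory=set)    # ICD-10 codes
    medications: set = field(default_factory=set)  # prescribed or administered
    has_stoma: bool = False                        # ileostomy or colostomy

def meets_inclusion(p):
    # Age 18-80 years and at least one ulcerative colitis (K51) diagnosis.
    return 18 <= p.age <= 80 and any(c.startswith("K51") for c in p.diagnoses)

def meets_exclusion(p):
    # Past or present stoma, or any non-permitted IBD therapy.
    return p.has_stoma or bool(p.medications & EXCLUDED_DRUGS)

def eligible(patients):
    return [p for p in patients if meets_inclusion(p) and not meets_exclusion(p)]

# Made-up patients for illustration only.
patients = [
    Patient("a1", 45, {"K51.9"}),
    Patient("b2", 17, {"K51.0"}),                  # below the age range
    Patient("c3", 60, {"K51.9"}, {"infliximab"}),  # non-permitted therapy
    Patient("d4", 52, {"K50.0"}),                  # no K51 diagnosis
]
selected = [p.pseudonym for p in eligible(patients)]
```

Criteria that depend on clinical judgement, such as the Mayo Clinic Score, are left out here because they cannot be reduced to a simple field comparison without further assumptions.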
Figure 2. Inclusion and extraction criteria applied under the CXP.
Note: In the initial extraction, CXP identifies patients who meet the inclusion criterion of a diagnosis with ICD code K51 (n = 2,802). A step-by-step exclusion process is then applied to remove individuals with different exclusions (n = 469, 399), yielding a clean base of eligible patients (n = 332). In the manual examination, 12 of the extracted target objects were found to meet the exclusion criteria, resulting in a final outcome of 320 patients who were truly eligible according to the study criteria.
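The counts in this note can be checked with simple arithmetic. In the Python sketch below the variable names are illustrative, and the derived fraction is computed here from the stated counts only; it is not a figure quoted in this section.

```python
initial_extraction = 2802  # EMRs with ICD code K51
after_exclusions = 332     # eligible according to CXP
manual_rejections = 12     # found to meet exclusion criteria on manual examination

confirmed_eligible = after_exclusions - manual_rejections

# Share of CXP-selected patients confirmed as eligible by manual examination.
confirmed_fraction = confirmed_eligible / after_exclusions

print(confirmed_eligible)
print(f"{confirmed_fraction:.1%}")
```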
Figure 3A. Flow diagram for the CXP extraction procedure with all diagnoses found in the EMR.
Note: CXP extracted structured data with all diagnoses found in the EMR of the target population. The numbers of missed, duplicated and non-existing diagnoses have been removed step-by-step to arrive at a final number of correct extractions as compared with the EMR.
Figure 3B. Flow diagram for the CXP extraction procedure of coherent medical procedures.
Note: Extracted CXP data of coherent medical care procedures. The numbers of missed, duplicated and non-existing procedures have been removed step-by-step to arrive at a final number of correct extractions as compared with the EMR.
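The comparison described in the notes to Figures 3A and 3B, where extracted entries are checked against the EMR and sorted into missed, duplicated, non-existing and correct, can be sketched as a multiset comparison. This is an assumption about how such a tally could be made, not the procedure used in the study: the function `compare_extraction`, the exact definitions of the categories and the sample codes are hypothetical.

```python
from collections import Counter

def compare_extraction(extracted, reference):
    """Tally extracted codes against the EMR reference for one patient.

    Assumed definitions:
      correct      - extracted and present in the EMR (counted once per EMR entry)
      missed       - present in the EMR but not extracted
      duplicated   - extracted more often than the EMR records the code
      non_existing - extracted but absent from the EMR
    """
    ext, ref = Counter(extracted), Counter(reference)
    correct = sum((ext & ref).values())  # multiset intersection
    missed = sum((ref - ext).values())
    surplus = ext - ref                  # extracted beyond what the EMR holds
    duplicated = sum(n for code, n in surplus.items() if code in ref)
    non_existing = sum(n for code, n in surplus.items() if code not in ref)
    return {
        "correct": correct,
        "missed": missed,
        "duplicated": duplicated,
        "non_existing": non_existing,
    }

# Made-up ICD-10 codes for illustration only.
emr_diagnoses = ["K51.0", "I10", "E11.9"]
cxp_diagnoses = ["K51.0", "K51.0", "I10", "J45"]
tally = compare_extraction(cxp_diagnoses, emr_diagnoses)
```

The same function applies to procedure codes, since only the labels being compared change.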