Martin Dugas, Susanne Dugas-Breit.
Abstract
The design, execution, and analysis of clinical studies involve several stakeholders with different professional backgrounds. Typically, principal investigators are familiar with standard office tools, data managers use electronic data capture (EDC) systems, and statisticians work with statistics software. Case report forms (CRFs) specify the data model of study subjects, evolve over time, and consist of hundreds to thousands of data items per study. To avoid error-prone manual transformation work, a conversion tool for different representations of study data models was designed. It can convert between office, EDC, and statistics formats. In addition, it supports semantic annotations, which enable precise definitions of data items. A reference implementation is available as the open source package ODMconverter at http://cran.r-project.org.
Year: 2014 PMID: 24587378 PMCID: PMC3938746 DOI: 10.1371/journal.pone.0090492
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Simplified example of data model in office format (spreadsheet).
| | A | B | C | D | E | F |
| 1 | StudyOID | S.0000 | | | | |
| 2 | Sponsor | Testsponsor | | | | |
| 3 | Condition | Testcondition | | | | |
| 4 | StudyName | ODM Test Study | | | | |
| 5 | StudyDescription | Test of ODM tools | | | | |
| 6 | Form | ODM-Test | | | | |
| 7 | FirstName | Test | | | | |
| 8 | LastName | Testname | | | | |
| 9 | Organization | Test organization | | | | |
| 10 | | | | | | |
| 11 | Type | Name | en | UMLS CUI | SNOMED CT 2010_0731 | LOINC |
| 12 | itemgroup | Info | General Information | C0332118 | 106227002 | |
| 13 | boolean | Willingness | Willingness to participate in clinical trials | C1516879 | | |
| 14 | integer | Age | Age | | 102518004 | |
| 15 | date | DOB | Date of Birth | | 152322001 | |
| 16 | integer | Gender | Gender | | 139865004 | |
| 17 | codelistitem | 1 | male | C0024554 | 248153007 | |
| 18 | codelistitem | 2 | female | C0015780 | 248152002 | |
| 19 | string | DiagnosisTx | Diagnosis text | | 439401001 | |
| 20 | string | DiagnosisCd | Diagnosis code | | | |
| 21 | float | Crea | Creatinine | | | 38483-4 |
| 22 | time | labTime | Time of lab value | | | |
The header (lines 1–9) contains general information about the study. Lines 13–22 provide data items of different data types (column A). Column C presents item labels (en = English). Columns D, E, and F contain semantic codes for each data item.
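The layout described above can be read programmatically. The following is a minimal Python sketch, not the ODMconverter implementation: it assumes the sheet has already been loaded as a list of rows (columns A to F), and the function name `parse_office_model`, the sample `ROWS`, and the output structure are hypothetical.

```python
from __future__ import annotations

from typing import Any


def parse_office_model(rows: list[list[str]]) -> dict[str, Any]:
    """Split spreadsheet rows into a study header, item groups, and items.

    Assumption: rows[0:9] are key/value header lines, rows[10] is the
    column header, and rows[11:] hold item groups, items, and codelist items.
    """
    header = {r[0]: r[1] for r in rows[:9] if r and r[0]}
    columns = rows[10]
    model: dict[str, Any] = {"header": header, "itemgroups": []}
    current_item = None
    for raw in rows[11:]:
        # Pad short rows so every column name has a value.
        row = dict(zip(columns, raw + [""] * (len(columns) - len(raw))))
        # Columns D-F carry the semantic codes; keep only non-empty ones.
        codes = {c: row[c] for c in columns[3:] if row[c]}
        entry = {"name": row["Name"], "label": row["en"], "codes": codes}
        kind = row["Type"]
        if kind == "itemgroup":
            model["itemgroups"].append({**entry, "items": []})
        elif kind == "codelistitem":
            # A codelist item belongs to the most recent data item.
            current_item["codelist"].append(entry)
        else:
            current_item = {**entry, "type": kind, "codelist": []}
            model["itemgroups"][-1]["items"].append(current_item)
    return model


# Abridged sample taken from the table. Assumption: codes are assigned to
# columns D-F by their format (UMLS CUIs start with "C", LOINC codes
# contain a hyphen, remaining numeric codes are SNOMED CT).
ROWS = [
    ["StudyOID", "S.0000"],
    ["Sponsor", "Testsponsor"],
    ["Condition", "Testcondition"],
    ["StudyName", "ODM Test Study"],
    ["StudyDescription", "Test of ODM tools"],
    ["Form", "ODM-Test"],
    ["FirstName", "Test"],
    ["LastName", "Testname"],
    ["Organization", "Test organization"],
    [],
    ["Type", "Name", "en", "UMLS CUI", "SNOMED CT 2010_0731", "LOINC"],
    ["itemgroup", "Info", "General Information", "C0332118", "106227002", ""],
    ["integer", "Gender", "Gender", "", "139865004", ""],
    ["codelistitem", "1", "male", "C0024554", "248153007", ""],
    ["codelistitem", "2", "female", "C0015780", "248152002", ""],
    ["float", "Crea", "Creatinine", "", "", "38483-4"],
]

model = parse_office_model(ROWS)
```

Keeping codelist items nested under their parent item mirrors the row order of the sheet, where permissible values directly follow the item they belong to.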
Figure 1. Example of data model in CDISC ODM format.
It consists of one form (“ODM-Test”) with one itemgroup (“Info”) and 8 data items. Details for item I.001 are displayed, including the item name, a detailed description in English, and its associated UMLS code.
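To make the caption concrete, the sketch below builds a single ODM-style item definition with Python's standard library. It is an illustration under assumptions, not the output of ODMconverter: the element and attribute names (`ItemDef`, `Question`, `TranslatedText`, `Alias`) follow CDISC ODM 1.3, while the helper `build_item_def`, the `Alias` context string, and the mapping of I.001 to the "Willingness" item are assumed here. The enclosing `ODM`, `Study`, and `MetaDataVersion` elements and the ODM namespace declaration are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for the predefined xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_item_def(oid, name, data_type, label_en, umls_cui=None):
    """Return an ItemDef element with an English label and optional UMLS code."""
    item = ET.Element("ItemDef", {"OID": oid, "Name": name, "DataType": data_type})
    question = ET.SubElement(item, "Question")
    text = ET.SubElement(question, "TranslatedText", {XML_LANG: "en"})
    text.text = label_en
    if umls_cui:
        # Assumption: semantic codes are stored as Alias elements; the
        # Context value used here is illustrative.
        ET.SubElement(item, "Alias", {"Context": "UMLS CUI", "Name": umls_cui})
    return item


item = build_item_def(
    "I.001",
    "Willingness",
    "boolean",
    "Willingness to participate in clinical trials",
    umls_cui="C1516879",
)
xml_text = ET.tostring(item, encoding="unicode")
```

The resulting fragment would still need to be placed inside a complete ODM document and validated against the ODM schema before use in an EDC system.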
Figure 2. Example of data model in statistics format.
An R data frame is provided with 8 variables (I.001 … I.008). Labels for variables and permissible values are defined, for instance “male” and “female” for item I.004 (Gender). General information about the study, such as “StudyName”, is provided as an attribute of this data frame.
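The same idea (coded values plus variable labels, value labels, and study-level attributes) can be mimicked outside R. The sketch below is a plain-Python analogue, not the R data frame produced by ODMconverter: the class `LabeledTable` and the sample values are hypothetical, and the Gender item is written as `I.004` on the assumption that OIDs follow the I.001 … I.008 pattern.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LabeledTable:
    """Column-oriented table with labels, loosely mirroring a labeled data frame."""

    columns: dict[str, list]        # variable name -> stored (coded) values
    labels: dict[str, str]          # variable name -> variable label
    value_labels: dict[str, dict]   # variable name -> {code: label}
    attributes: dict[str, str] = field(default_factory=dict)  # study-level info

    def decoded(self, variable: str) -> list:
        """Replace codes by their labels; values without a label pass through."""
        mapping = self.value_labels.get(variable, {})
        return [mapping.get(value, value) for value in self.columns[variable]]


# Hypothetical sample data for illustration only.
table = LabeledTable(
    columns={"I.004": [1, 2, 2, None]},
    labels={"I.004": "Gender"},
    value_labels={"I.004": {1: "male", 2: "female"}},
    attributes={"StudyName": "ODM Test Study"},
)
gender = table.decoded("I.004")
```

Storing codes and labels separately keeps the raw values intact for analysis while still allowing readable output.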
Trial IDs, medical conditions, numbers of items, and numbers of annotation codes for the 10 forms in ODM format that were used for the evaluation (randomly selected from www.medical-data-models.org).
| Trial ID | Medical Condition | Number of items | Number of annotation codes |
| NCT00824083 | Ewing-Sarcoma | 5 | 44 |
| NCT00980135 | Atopic Dermatitis | 12 | 135 |
| NCT01104584 | Breast Cancer | 17 | 219 |
| NCT01147939 | Acute Myeloid Leukemia | 27 | 306 |
| NCT01179620 | Renal Dialysis | 8 | 53 |
| NCT01283724 | Endometriosis | 11 | 172 |
| NCT01324947 | Multiple Myeloma | 33 | 333 |
| NCT01361334 | Acute Myeloid Leukemia | 28 | 376 |
| NCT01403376 | Multiple Sclerosis | 16 | 163 |
| NCT01408095 | Diabetes Mellitus, Type 2 | 27 | 355 |
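As a quick summary of the table above, the following Python sketch totals the items and annotation codes and derives the number of codes per item for each form. The figures are copied from the table; the variable names are illustrative, and the ratio is a derived quantity that the source does not report.

```python
# (number of items, number of annotation codes) per trial, from the table.
FORMS = {
    "NCT00824083": (5, 44),
    "NCT00980135": (12, 135),
    "NCT01104584": (17, 219),
    "NCT01147939": (27, 306),
    "NCT01179620": (8, 53),
    "NCT01283724": (11, 172),
    "NCT01324947": (33, 333),
    "NCT01361334": (28, 376),
    "NCT01403376": (16, 163),
    "NCT01408095": (27, 355),
}

total_items = sum(items for items, _ in FORMS.values())   # 184
total_codes = sum(codes for _, codes in FORMS.values())   # 2156

# Annotation codes per item, per form and overall.
codes_per_item = {trial: codes / items for trial, (items, codes) in FORMS.items()}
overall_codes_per_item = total_codes / total_items
```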