Tina Hernandez-Boussard, Selen Bozkurt, John P A Ioannidis, Nigam H Shah.
Abstract
The rise of digital data and computing power has contributed to significant advancements in artificial intelligence (AI), leading to the use of classification and prediction models in health care to enhance clinical decision-making for diagnosis, treatment and prognosis. However, such advances are limited by the lack of reporting standards for the data used to develop those models, the model architecture, and the model evaluation and validation processes. Here, we present MINIMAR (MINimum Information for Medical AI Reporting), a proposal describing the minimum information necessary to understand intended predictions, target populations, and hidden biases, and the ability to generalize these emerging technologies. We call for a standard to accurately and responsibly report on AI in health care. This will facilitate the design and implementation of these models and promote the development and use of associated clinical decision support tools, as well as manage concerns regarding accuracy and bias.
Keywords: artificial intelligence; clinical decision support; electronic health records; reporting standards
Year: 2020 PMID: 32594179 PMCID: PMC7727333 DOI: 10.1093/jamia/ocaa088
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Reporting standards for 4 essential components of artificial intelligence solutions in health care
| Features | Description | Example 1 | Example 2 |
|---|---|---|---|
| 1. Study population and setting | |||
| Population | Population from which study sample was drawn | Patients undergoing elective surgery | All patients |
| Study setting | The setting in which the study was conducted (eg, academic medical center, community healthcare system, rural healthcare clinic) | U.S. academic, tertiary care hospital | 2 U.S. academic medical centers |
| Data source | The source from which data were collected | EHRs | EHRs |
| Cohort selection | Exclusion/inclusion criteria | Adult patients; Patients were excluded if they died during hospitalization. | All admissions for adult patients. Hospitalizations of 24 h or longer. |
| 2. Patient demographic characteristics | |||
| Age | Age of patients included in the study | Mean 58.34 y | Median ∼56 y |
| Sex | Sex breakdown of study cohort | Female: 73.0%; Male: 27.0% | Female: 55.0% |
| Race | Race characteristics of patients included in the study | White: 69.0%; Black: 3.1%; Asian: 5.9% | Not provided |
| Ethnicity | Ethnicity breakdown of patients included in the study | Hispanic: 13.2% | Not provided |
| Socioeconomic status | A measure or proxy measure of the socioeconomic status of patients included in the study | Private: 31.9%; Medicare: 47.8%; Medicaid: 11.7% | Not provided |
| 3. Model architecture | |||
| Model output | The computed result of the model | Postoperative pain scores | In-hospital deaths, 30-day unplanned readmission, length of stay, discharge status |
| Target user | The intended user of the model output (eg, clinician, hospital management team, insurance company) | Risk scores produced by the model will be used by the hospital team for pain management | Predictions produced by the model will be used by hospitals for care management |
| Data splitting | How data were split for training, testing, and validation | 10-fold cross-validation | 80%/10%/10% (train/validation/test) |
| Gold standard | Labeled data used to train and test the model | 100 manually annotated clinical notes and pain scores recorded in EHR | Death, readmission and ICD codes in EHRs |
| Model task | Classification or prediction | Prediction | Prediction |
| Model architecture | Algorithm type (eg, machine learning, deep learning, etc.) | ElasticNet regularized regression | Recurrent neural networks, attention-based time-aware neural network model, and neural network |
| Features | List of variables used in the model and how they were used in the model in terms of categories or transformation | 65 predictive features including age, race, ethnicity, sex, insurance type (as public and private) and preoperative pain (log transformation was applied) | Provided in detail for all models |
| Missingness | How missingness was addressed: reported, imputed, or corrected | Missing data were imputed using median of the variable distribution | Not provided |
| 4. Model evaluation | |||
| Optimization | Model or parameter tuning applied | Generated vectors with a dimension of 300 and a window size of 5 | Documented and provided for all models in detail |
| Internal model validation | Study internal validation | Internal 10-fold cross-validation | Hold-out validation set |
| External model validation | External validation using data from another setting | Not performed | Not performed |
| Transparency | How code and data are shared with the community. | Code and sample data available via GitHub | Data is not available; code is available via GitHub |
EHR: electronic health record; ICD: International Classification of Diseases.
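
The table above can be read as a checklist. As a minimal sketch, not part of the MINIMAR proposal itself, the Python below encodes the four components and their items as listed in the table and flags any item a study leaves unreported. The names `MINIMAR_ITEMS`, `UNREPORTED`, `missing_items`, and `example_report` are hypothetical, as is the convention that an empty string or "Not provided" marks an unreported item. The sample values are transcribed from the table's second example column.

```python
# The four MINIMAR components and their items, as listed in the table.
MINIMAR_ITEMS = {
    "Study population and setting": [
        "Population", "Study setting", "Data source", "Cohort selection",
    ],
    "Patient demographic characteristics": [
        "Age", "Sex", "Race", "Ethnicity", "Socioeconomic status",
    ],
    "Model architecture": [
        "Model output", "Target user", "Data splitting", "Gold standard",
        "Model task", "Model architecture", "Features", "Missingness",
    ],
    "Model evaluation": [
        "Optimization", "Internal model validation",
        "External model validation", "Transparency",
    ],
}

# Assumed convention (not from the source): these values mean "unreported".
UNREPORTED = {"", "not provided"}


def missing_items(report):
    """Return (component, item) pairs that the report leaves unreported."""
    missing = []
    for component, items in MINIMAR_ITEMS.items():
        reported = report.get(component, {})
        for item in items:
            value = reported.get(item, "")
            if value.strip().lower() in UNREPORTED:
                missing.append((component, item))
    return missing


# Values transcribed from the second example column of the table.
example_report = {
    "Study population and setting": {
        "Population": "All patients",
        "Study setting": "2 U.S. academic medical centers",
        "Data source": "EHRs",
        "Cohort selection": "All admissions for adult patients; "
                            "hospitalizations of 24 h or longer",
    },
    "Patient demographic characteristics": {
        "Age": "Median ~56 y",
        "Sex": "Female: 55.0%",
        "Race": "Not provided",
        "Ethnicity": "Not provided",
        "Socioeconomic status": "Not provided",
    },
    "Model architecture": {
        "Model output": "In-hospital deaths, 30-day unplanned readmission, "
                        "length of stay, discharge status",
        "Target user": "Hospitals, for care management",
        "Data splitting": "80%/10%/10% (train/validation/test)",
        "Gold standard": "Death, readmission and ICD codes in EHRs",
        "Model task": "Prediction",
        "Model architecture": "Recurrent neural networks, attention-based "
                              "time-aware neural network model, and "
                              "neural network",
        "Features": "Provided in detail for all models",
        "Missingness": "Not provided",
    },
    "Model evaluation": {
        "Optimization": "Documented and provided for all models in detail",
        "Internal model validation": "Hold-out validation set",
        "External model validation": "Not performed",
        "Transparency": "Data is not available; code is available via GitHub",
    },
}

for component, item in missing_items(example_report):
    print(f"{component}: {item}")
```

For this example the check flags race, ethnicity, socioeconomic status, and missingness. "Not performed" under external validation counts as reported, because the study states explicitly that no external validation was done.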