Benchmarking Predictive Risk Models for Emergency Departments with Large Public Electronic Health Records

by Feng Xie, et al.

Demand for emergency department (ED) services continues to grow across the world, particularly during the COVID-19 pandemic. Risk triaging plays a crucial role in prioritizing limited medical resources for the patients who need them most. The widespread adoption of Electronic Health Records (EHR) has generated a large volume of stored data, and with it opportunities to develop predictive models that could improve emergency care. However, there are no widely accepted ED benchmarks based on large-scale public EHR data that new researchers can easily access. Filling this gap would let researchers begin ED studies more quickly, without lengthy data preprocessing, and would facilitate comparisons among studies and methodologies. In this paper, we propose a public ED benchmark suite based on the Medical Information Mart for Intensive Care IV Emergency Department (MIMIC-IV-ED) database and derive a benchmark dataset containing over 500,000 ED visit episodes from 2011 to 2019. We introduce three ED-based prediction tasks (hospitalization, critical outcomes, and 72-hour ED revisit) and implement a range of popular methods for them, from machine learning models to clinical scoring systems, then evaluate and compare their performance. Our code is open source, so anyone with access to MIMIC-IV-ED can follow the same data processing steps, build the benchmarks, and reproduce the experiments. This study provides insights, suggestions, and protocols for future researchers to process the raw data and quickly build models for emergency care.
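As an illustration of how an outcome such as the 72-hour ED revisit could be derived from visit records, the sketch below labels each visit by whether the same patient arrives again within 72 hours of leaving. This is not the authors' released code: the field names (`patient_id`, `visit_id`, `arrival`, `departure`), the function name, and the choice to measure the window from departure to the next arrival are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed window: next arrival within 72 hours of the current departure.
REVISIT_WINDOW = timedelta(hours=72)


def label_72h_revisits(visits):
    """Return {visit_id: bool}, True when the patient's next ED visit
    starts within REVISIT_WINDOW of this visit's departure.

    `visits` is a list of dicts with the hypothetical keys
    patient_id, visit_id, arrival, departure (datetime objects).
    """
    # Group visits per patient so gaps are only computed within a patient.
    by_patient = defaultdict(list)
    for visit in visits:
        by_patient[visit["patient_id"]].append(visit)

    labels = {}
    for patient_visits in by_patient.values():
        # Order chronologically, then compare each visit with the next one.
        patient_visits.sort(key=lambda v: v["arrival"])
        for current, following in zip(patient_visits, patient_visits[1:]):
            gap = following["arrival"] - current["departure"]
            labels[current["visit_id"]] = gap <= REVISIT_WINDOW
        # The last recorded visit has no later visit to count as a revisit.
        labels[patient_visits[-1]["visit_id"]] = False
    return labels
```

Grouping by patient before sorting keeps the comparison local to each patient's timeline. A real pipeline would also need to decide how to treat visits that end in admission or death, which this sketch ignores.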


