Introduction: Bacteremia is common but difficult to diagnose early, and without prompt treatment it may result in significant morbidity and mortality. We aimed to develop machine-learning (ML) algorithms to identify bacteremia among febrile patients presenting to the emergency department (ED) using data readily available at triage.
Methods: We included all adult patients (≥18 years of age) who presented to the ED of National Taiwan University Hospital (NTUH), a tertiary teaching hospital in Taiwan, with a chief complaint of fever or a measured body temperature above 38°C, and who received at least one blood culture during the ED encounter. We extracted data from the Integrated Medical Database of NTUH from 2009 to 2018. The dataset included patient demographics, triage details, symptoms, and medical history. A positive blood culture for at least one potential pathogen was defined as bacteremia and used as the binary classification label. We split the dataset into training/validation and testing sets (60-to-40 ratio) and trained five supervised ML models using k-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) in the testing set.
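The split-and-validate procedure described above can be sketched as follows. This is a minimal illustration, not the study's code: the abstract does not state whether the split was stratified or what value of k was used, so stratification by label, k = 5, the random seeds, and the function names `stratified_split` and `k_fold` are all assumptions, and the labels are synthetic.

```python
import random

def stratified_split(labels, test_fraction=0.4, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both parts.

    Stratification is an assumption; the abstract only gives the 60-to-40 ratio.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def k_fold(indices, k=5, seed=0):
    """Yield (fit_idx, val_idx) pairs; each index is held out for validation exactly once."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        fit_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit_idx, val_idx

# Synthetic labels with roughly 12% positives (illustrative only, not study data).
rng = random.Random(42)
labels = [1 if rng.random() < 0.12 else 0 for _ in range(1000)]

train_idx, test_idx = stratified_split(labels, test_fraction=0.4)
folds = list(k_fold(train_idx, k=5))  # k = 5 is an assumed value
```

In such a setup, the folds would be used for model selection within the training/validation set only, and the testing set would be scored once at the end.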
Results: We included 80,201 cases in this study. Of these, 48,120 cases were assigned to the training/validation set and 32,081 to the testing set. Bacteremia was identified in 5,831 (12.1%) and 3,824 (11.9%) cases of the training/validation set and testing set, respectively. All ML models performed well, with CatBoost achieving the highest AUC (.844, 95% confidence interval [CI] .837-.850), followed by extreme gradient boosting (.843, 95% CI .836-.849), gradient boosting (.842, 95% CI .836-.849), light gradient boosting machine (.841, 95% CI .834-.847), and random forest (.828, 95% CI .821-.834).
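For readers unfamiliar with how an AUC and its 95% CI can be obtained, the sketch below computes the AUC from ranks and a percentile bootstrap interval. The abstract does not say how the reported intervals were derived, so the bootstrap is an assumed method chosen for illustration; the function names `auc` and `bootstrap_ci`, the resample count, and the synthetic scores are likewise hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Computed from average ranks (Mann-Whitney U), so tied scores count as half.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    n_pos = sum(labels)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to be defined
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(n_boot * alpha / 2)]
    upper = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Synthetic scores in which positives tend to score higher (not study data).
rng = random.Random(1)
y_true = [1 if rng.random() < 0.12 else 0 for _ in range(500)]
y_score = [rng.random() + 0.5 * y for y in y_true]

point_estimate = auc(y_true, y_score)
ci_low, ci_high = bootstrap_ci(y_true, y_score)
```

With a testing set as large as the one reported here, intervals of this kind are narrow, which is consistent with the overlapping CIs of the four boosting models.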
Conclusion: Our ML model showed excellent discriminatory performance in predicting bacteremia using only clinical features available at ED triage. If successfully implemented in the ED, it has the potential to improve care quality and save lives.