An interpretable machine learning model for predicting 28-day mortality in patients with sepsis-associated liver injury

by Chengli Wen, Xu Zhang, Yong Li, Wanmeng Xiao, Qinxue Hu, Xianying Lei, Tao Xu, Sicheng Liang, Xiaolan Gao, Chao Zhang, Zehui Yu, Muhan Lü

Sepsis-associated liver injury (SALI) is an independent risk factor for death from sepsis. The aim of this study was to develop an interpretable machine learning model for early prediction of 28-day mortality in patients with SALI. Data were drawn from the Medical Information Mart for Intensive Care databases (MIMIC-IV v2.2 and MIMIC-III v1.4). The MIMIC-IV cohort was randomly split into a training set (70%) and an internal validation set (30%), with MIMIC-III (2001 to 2008) serving as external validation. Features with more than 20% missing values were removed, and the remaining missing values were filled by multiple imputation. Lasso-CV, a lasso linear model fitted iteratively along a regularization path with the best model selected by cross-validation, was used to select important features for model development. Eight machine learning models were developed: Random Forest (RF), Logistic Regression, Decision Tree, Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors, Support Vector Machine, a generalized linear model with the best model selected by cross-validation (CV_glmnet), and Linear Discriminant Analysis (LDA). SHapley Additive exPlanations (SHAP) was used to improve the interpretability of the optimal model. In total, 1043 patients were included, of whom 710 were from MIMIC-IV and 333 from MIMIC-III. Twenty-four clinically relevant parameters were selected for model construction. For predicting 28-day mortality in SALI in the internal validation set, RF performed best, with an area under the curve (AUC) of 0.79 (95% CI: 0.73–0.86). RF also outperformed the traditional disease severity scores, including the Oxford Acute Severity of Illness Score (OASIS), Sequential Organ Failure Assessment (SOFA), Simplified Acute Physiology Score II (SAPS II), Logistic Organ Dysfunction Score (LODS), Systemic Inflammatory Response Syndrome (SIRS), and Acute Physiology Score III (APS III).
SHAP analysis identified urine output, Charlson Comorbidity Index (CCI), minimum Glasgow Coma Scale score (GCS_min), blood urea nitrogen (BUN), and age at admission (admission_age) as the five most important features in the RF model. RF therefore has good predictive ability for 28-day mortality in SALI. Urine output, CCI, GCS_min, BUN, and admission_age, measured within 24 h after intensive care unit (ICU) admission, contribute significantly to the model's predictions.
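
Three of the mechanical steps named in the abstract (dropping features with more than 20% missing values, the 70/30 random split, and the AUC metric) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy records, the `sparse_lab` feature, and the fixed random seed are all hypothetical, and the imputation, Lasso-CV, model fitting, and SHAP steps are deliberately left out.

```python
import random


def drop_sparse_features(rows, max_missing=0.20):
    """Keep features whose share of missing (None) values is <= max_missing."""
    n = len(rows)
    keep = [
        name for name in rows[0]
        if sum(r[name] is None for r in rows) / n <= max_missing
    ]
    return [{name: r[name] for name in keep} for r in rows]


def split_train_validation(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: 10 patients, values invented for illustration only.
rows = [
    {"urine_output": 1200 + 50 * i, "bun": 20 + i, "sparse_lab": 1.0}
    for i in range(10)
]
rows[0]["urine_output"] = None           # 10% missing: feature is kept
for i in range(3):
    rows[i]["sparse_lab"] = None         # 30% missing: feature is dropped

cleaned = drop_sparse_features(rows)
train, valid = split_train_validation(cleaned)
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC is quadratic in the number of patients, which is fine for a cohort of about a thousand; a rank-based formula would be the usual choice for larger data.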
