The study systematically reviews and meta-analyzes the predictive performance of various machine learning (ML) models for kidney complications, specifically acute kidney injury (AKI) and contrast-induced nephropathy (CIN), following percutaneous coronary intervention (PCI) or coronary angiography (CAG). The statistical interpretation and professional implications are as follows:
Statistical Interpretation
Data Collection and Analysis:
Following PRISMA guidelines, the researchers searched PubMed, Scopus, and Embase from database inception to June 11, 2024.
Of 431 records initially identified, fourteen studies met the inclusion criteria. The review extracted ML performance metrics from each study, including AUC (area under the receiver operating characteristic curve), accuracy, sensitivity, specificity, and precision.
The primary effect size was the AUC, with a random-effects model used to pool AUC values and heterogeneity assessed via the I² statistic.
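The pooling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's actual analysis code: the summary does not say which random-effects estimator was used, so the DerSimonian-Laird method is an assumption here, as is back-calculating each study's standard error from a symmetric 95% CI. The function names and the three example studies are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Back-calculate a standard error from a symmetric 95% CI (assumed normal)."""
    return (upper - lower) / (2 * z)

def pool_random_effects(aucs, ses, z=1.96):
    """DerSimonian-Laird random-effects pooling of per-study AUCs.

    Returns the pooled AUC, its 95% CI, the between-study variance tau^2,
    and the I^2 statistic as a percentage.
    """
    k = len(aucs)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching AUCs and SEs")

    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * a for wi, a in zip(w, aucs)) / sum(w)

    # Cochran's Q measures the weighted spread of studies around the fixed estimate.
    q = sum(wi * (a - fixed) ** 2 for wi, a in zip(w, aucs))
    df = k - 1

    # Between-study variance (truncated at zero) and I^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's own variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * a for wi, a in zip(w_star, aucs)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    return {
        "pooled": pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical (AUC, CI lower, CI upper) triples, for illustration only.
# These are NOT data from the reviewed studies.
studies = [(0.86, 0.81, 0.91), (0.88, 0.84, 0.92), (0.85, 0.78, 0.92)]
aucs = [s[0] for s in studies]
ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
result = pool_random_effects(aucs, ses)
print(result)
```

An I² near 0% means the spread between studies is about what sampling error alone would produce, while values above roughly 75% are conventionally read as considerable heterogeneity. Pooling on the raw AUC scale is a simplification; some analyses transform the AUC (for example with a logit) before pooling.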
Performance Metrics:
Gradient Boosting Machine (GBM) and Support Vector Machine (SVM) models showed the highest pooled AUCs of 0.87 (95% CI: 0.82-0.92) and 0.85 (95% CI: 0.80-0.90) respectively, with low heterogeneity (I² < 30%).
Random Forest (RF) achieved a comparable pooled AUC of 0.85 (95% CI: 0.78-0.92) but exhibited substantial heterogeneity (I² > 90%).
Multilayer Perceptron (MLP) and XGBoost had moderate pooled AUCs of 0.79 (95% CI: 0.74-0.84) with high heterogeneity.
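For readers less familiar with the metrics the included studies report, the sketch below shows one way each could be computed for a single model from its predicted risks. It is an illustrative assumption rather than any study's method: the function names, the 0.5 decision threshold, and the eight example patients are all hypothetical. AUC is computed from its rank interpretation, the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it.

```python
def safe_div(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and precision at a fixed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),  # true positive rate (recall)
        "specificity": safe_div(tn, tn + fp),  # true negative rate
        "precision": safe_div(tp, tp + fp),    # positive predictive value
    }

def auc_rank(y_true, y_score):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = developed AKI/CIN) and predicted risks,
# for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

metrics = classification_metrics(y_true, y_score)
auc = auc_rank(y_true, y_score)
print(metrics, auc)
```

Unlike the other four metrics, AUC does not depend on the chosen threshold, which is one reason it suits cross-study pooling as the primary effect size.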
Predictors:
Age, serum creatinine, left ventricular ejection fraction, and hemoglobin were the predictors that most consistently influenced model performance across the included studies.
Model Efficacy:
The GBM and SVM models, which combined high pooled AUCs with low heterogeneity, appear to be the most reliable of the models evaluated for predicting AKI and CIN post-PCI/CAG.
RF matched SVM's pooled AUC, and MLP and XGBoost reached moderate AUCs, but all three showed considerable heterogeneity. This indicates that their performance varies across patient populations and highlights the need for further validation.
Professional Implications
Clinical Relevance:
ML models, particularly GBM and SVM, show promise for early detection of, and timely intervention for, kidney complications following PCI/CAG, potentially improving patient outcomes.
The ability of these models to handle complex, non-linear relationships among clinical and biochemical predictors surpasses that of traditional logistic regression models.
Future Research:
The study underscores the necessity for further validation of RF, MLP, and XGBoost models due to their high heterogeneity.
Future research should aim to standardize the predictors and patient populations used in ML models to enhance their generalizability and reliability.
Implementation in Clinical Practice:
The integration of ML models into clinical decision-making processes could facilitate targeted preventive measures for high-risk patients, ultimately reducing the incidence and severity of AKI and CIN post-PCI/CAG.
Collaboration between data scientists and clinicians is essential to refine these models and ensure their practical applicability in diverse clinical settings.