A Hybrid Ensemble Boosting Model for Enhanced Blood Donor Retention
Abstract
Blood donor retention is critical for maintaining a stable and reliable blood supply, yet
predicting donor retention remains a complex challenge. Previous attempts to develop blood
donor retention models relied on single algorithms and achieved relatively low prediction
accuracy limiting their practical application for donor retention. The Light Gradient Boosting
Machine (Light GBM) algorithm employs leaf-wise growth strategy, excels in loss reduction
and hence improves accuracy. However, this may lead to potential overfitting, on the other
hand, the Extreme Gradient Boosting(XGBoost) algorithm incorporates a robust mechanism
for combating overfitting, such as the regularization parameter, column sampling, and weight
reduction on new trees but employs a level-wise growth strategy, which is sometimes
computationally intensive. This study developed a hybrid ensemble gradient boosting model
based on XGBoost and Light GBM. The ensemble model leverages on the high accuracy of
Light GBM while mitigating overfitting through and the overfitting prevention strategies of
XGBoost. The data was obtained from the Kenya blood banks with 5000 records and nine
features. The base models were trained in parallel, a weighted ensemble model was created
by assigning weights to the respective prediction results of each model, the ensemble model
was then evaluated and the accuracy compared with the accuracy achieved by the individual
algorithms. Bayesian hyperparameter optimization was implemented on the base learners in
order to find the best combination of hyperparameters and further improve the performance
of the model. The ensemble model achieved a performance accuracy of 99.00% and F1 score
of 99.00%. This study enables blood agencies to accurately predict blood donor retention,
thereby reducing the need for constant donor recruitment efforts and saving both time and
costs. Additionally, it will provide insights for targeted retention strategies, ensuring a steady
blood supply, ultimately saving lives and improving healthcare systems.
