Data-Driven Decision-Making for Bank Target Marketing Using Supervised Learning Classifiers on Imbalanced Big Data

dc.contributor.author: Nasir, Fahim
dc.contributor.author: Ahmed, Abdulghani Ali
dc.contributor.author: Kiraz, Mehmet Sabir
dc.contributor.author: Yevseyeva, Iryna
dc.contributor.author: Saif, Mubarak
dc.date.acceptance: 2024-08-26
dc.date.accessioned: 2024-10-21T14:22:09Z
dc.date.available: 2024-10-21T14:22:09Z
dc.date.issued: 2024-10-15
dc.description: open access article
dc.description.abstract: Integrating machine learning and data mining is crucial for processing big data and extracting valuable insights to enhance decision-making. However, imbalanced target variables within big data present technical challenges that hinder the performance of supervised learning classifiers on key evaluation metrics, limiting their overall effectiveness. This study presents a comprehensive review of both common and recently developed Supervised Learning Classifiers (SLCs) and evaluates their performance in data-driven decision-making. The evaluation uses various metrics, with a particular focus on the Harmonic Mean Score (F-1 score), on an imbalanced real-world bank target marketing dataset. The findings indicate that grid-search random forest and random-search random forest excel in precision and area under the curve, while Extreme Gradient Boosting (XGBoost) outperforms other traditional classifiers in terms of F-1 score. Employing oversampling methods to address the imbalanced data significantly improves the performance of XGBoost, delivering superior results across all metrics, particularly when using the SMOTE variant known as the BorderlineSMOTE2 technique. The study identifies several key factors for effectively addressing the challenges of supervised learning with imbalanced datasets. These factors include selecting appropriate datasets for training and testing, choosing the right classifiers, employing effective techniques for processing and handling imbalanced datasets, and identifying suitable metrics for performance evaluation. They also include using effective exploratory data analysis together with visualisation techniques to yield insights that support data-driven decision-making.
dc.funder: Other external funder (please detail below)
dc.funder.other: Financial assistance from Universiti Tun Hussein Onn Malaysia and the UTHM Publisher’s office through publication fund E15216.
dc.identifier.citation: Nasir, F., Ahmed, A.A., Kiraz, M.S., Yevseyeva, I., Saif, M. (2024). Data-driven decision-making for bank target marketing using supervised learning classifiers on imbalanced big data. Computers, Materials & Continua, 81(1), pp. 1703-1728.
dc.identifier.doi: https://doi.org/10.32604/cmc.2024.055192
dc.identifier.issn: 1546-2226
dc.identifier.uri: https://hdl.handle.net/2086/24338
dc.language.iso: en
dc.peerreviewed: Yes
dc.publisher: Tech Science Press
dc.relation.ispartof: Computers, Materials & Continua
dc.researchinstitute.institute: Institute of Digital Research, Communication and Responsible Innovation
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Big data
dc.subject: Machine learning
dc.subject: Data mining
dc.subject: Data visualization
dc.subject: Label encoding
dc.subject: Imbalanced dataset
dc.subject: Sampling techniques
dc.title: Data-Driven Decision-Making for Bank Target Marketing Using Supervised Learning Classifiers on Imbalanced Big Data
dc.type: Article
oaire.citation.issue: 1
oaire.citation.volume: 81
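
Note on the F-1 score: the abstract emphasises the F-1 score, the harmonic mean of precision and recall, as the key metric for imbalanced data. The sketch below is only an illustration of why that choice matters; it is not the authors' code. The function names and the confusion-matrix counts are hypothetical and do not come from the article or its dataset.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, from confusion-matrix counts."""
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so F-1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced sample: 1000 customers, only 100 positives.
# Classifier A always predicts the majority (negative) class.
acc_a = accuracy(tp=0, fp=0, fn=100, tn=900)
f1_a = f1_score(tp=0, fp=0, fn=100)

# Classifier B finds 60 of the 100 positives, with 40 false alarms.
acc_b = accuracy(tp=60, fp=40, fn=40, tn=860)
f1_b = f1_score(tp=60, fp=40, fn=40)

print(f"A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

With these made-up counts the two classifiers have nearly the same accuracy, yet only the F-1 score shows that classifier A never identifies a positive case.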
