Journal of Applied Data Sciences
Vol 4, No 4: DECEMBER 2023

Predictive and Analytics using Data Mining and Machine Learning for Customer Churn Prediction

Chandra Lukita (Catur Insan Cendekia University, Indonesia)
Lalu Darmawan Bakti (Mataram University of Technology, Indonesia)
Umi Rusilowati (University of Pamulang, Indonesia)
Asep Sutarman (University of Muhammadiyah Prof. Dr. HAMKA, Indonesia)
Untung Rahardja (University of Raharja, Indonesia)



Article Info

Publish Date
07 Dec 2023

Abstract

This research predicts and analyzes customer churn using Data Mining and Machine Learning methods. It is motivated by the importance of understanding the factors that influence customers' decisions to churn and of improving the effectiveness of customer retention strategies in a business context. The study uses a customer bank dataset that includes information about customers who left within the past month, the services each customer registered for, customer account information, and customer demographics. Heatmap analysis identified the factors most influential to churn, including MonthlyCharges, PaperlessBilling, SeniorCitizen, PaymentMethod, MultipleLines, and PhoneService. The research compares the performance of several machine learning algorithms for predicting customer churn: Random Forest, Logistic Regression, AdaBoost, and Extreme Gradient Boosting (XGBoost). Accuracy and confusion matrix results are used to evaluate these algorithms. The results show that XGBoost was the best algorithm for predicting customer churn, with high accuracy. The correctly identified factors did not cause a loss of precision, indicating that they have a significant influence on customer churn decisions. The novelty of this research lies in its focus on the factors that most influence customer churn and in its comparison of machine learning algorithms. It provides more specific and relevant insights for companies developing effective customer retention strategies. The research has some limitations, however. The dataset is limited to a customer bank, so the findings may generalize only to that business context. In addition, factors outside the focus of this research may also contribute to the prediction of customer churn.
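
The evaluation steps named in the abstract (ranking factors by their relationship to churn, then comparing models by accuracy and confusion matrix) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the heatmap is assumed to show Pearson correlations, the toy values are made up for illustration, and `model_a` and `model_b` are hypothetical stand-ins for the predictions of any two of the compared classifiers rather than actual Random Forest or XGBoost outputs.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treat as 0
    return cov / sqrt(var_x * var_y)


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means churned."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


# Toy data, invented for illustration only (not from the paper's dataset).
churn = [1, 0, 1, 0, 1, 0, 0, 1]
features = {
    "MonthlyCharges": [90.0, 30.0, 85.0, 40.0, 95.0, 35.0, 50.0, 80.0],
    "PaperlessBilling": [1, 0, 1, 1, 1, 0, 0, 1],
}

# Rank features by absolute correlation with the churn label,
# which is what a correlation heatmap row for churn would display.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], churn)),
    reverse=True,
)

# Hypothetical predictions from two models on the same eight customers.
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 0, 1],
    "model_b": [1, 1, 0, 0, 1, 0, 1, 1],
}
scores = {name: accuracy(churn, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)

print("Feature ranking:", ranking)
for name, pred in predictions.items():
    print(name, "confusion (tn, fp, fn, tp):", confusion_matrix(churn, pred),
          "accuracy:", scores[name])
print("Best model by accuracy:", best)
```

Accuracy alone can be misleading when churners are a small minority of customers, which is one reason to read it alongside the confusion matrix, as the abstract does.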

Copyrights © 2023






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...