A Novel HEOMGA Approach for the Class Imbalance Problem in the Application of Customer Churn Prediction
Balancing the classes is essential when learning from highly skewed datasets; otherwise, a learner may assign every instance to the negative class, resulting in a high false-negative rate. A precise balancing strategy is therefore required. Many researchers have addressed class imbalance in the preprocessing phase with Machine Learning (ML) methods rather than random sampling techniques, because of their strong generalization performance and interpretability, in order to facilitate the learning process and improve learner performance. In this research, an effective method called HEOMGA is presented, which combines the Heterogeneous Euclidean-Overlap Metric (HEOM) with a Genetic Algorithm (GA) to oversample the minority class. The HEOM is used to define the fitness function of the GA. To assess the proposed HEOMGA method, three benchmark customer churn prediction datasets from the UCI repository are examined using three different ML learners and evaluated with three performance metrics. The experimental results show the effectiveness of the proposed method compared with popular oversampling methods such as SMOTE, ADASYN, G-SMOTE, and Gaussian oversampling. According to the Wilcoxon signed-rank test, HEOMGA significantly outperformed the other oversampling methods in terms of recall, G-mean, and AUC.
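To make the distance underlying the method concrete, the following is a minimal Python sketch of the standard HEOM definition: an overlap distance (0 or 1) for nominal attributes, a range-normalized absolute difference for numeric attributes, a distance of 1 whenever a value is missing, and the square root of the summed squared per-attribute distances overall. The function name `heom`, its parameters, and the sample records are illustrative assumptions and are not taken from this work. The sketch does not show how HEOM enters the GA fitness function, since that is not specified here.

```python
import math


def heom(x, y, categorical, ranges):
    """Standard HEOM distance between two records of equal length.

    x, y        -- sequences of attribute values; None marks a missing value
    categorical -- set of attribute indices treated as nominal (assumed input)
    ranges      -- dict mapping numeric attribute index to (max - min),
                   computed beforehand on the training data (assumed input)
    """
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                          # missing value: maximal distance
        elif a in categorical:
            d = 0.0 if xa == ya else 1.0     # overlap metric for nominal values
        else:
            r = ranges[a]
            d = abs(xa - ya) / r if r > 0 else 0.0  # range-normalized difference
        total += d * d
    return math.sqrt(total)


# Hypothetical churn-style records: (tenure in months, contract type, monthly charge)
a = (30, "monthly", 70.0)
b = (40, "yearly", None)
dist = heom(a, b, categorical={1}, ranges={0: 50.0, 2: 100.0})
```

In this hypothetical example the per-attribute distances are 0.2 (tenure), 1 (different contract type), and 1 (missing charge), so the distance is the square root of 2.04. Because each per-attribute term lies in [0, 1], mixed numeric and nominal attributes contribute on a comparable scale.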