Abstract: Churn prediction in credit cards, fraud detection in insurance, and loan default prediction are important analytical customer relationship management (ACRM) problems. Because frauds, churns, and defaults occur infrequently, the datasets for these problems are naturally highly unbalanced. Consequently, all supervised machine learning classifiers tend to yield substantial false-positive rates when trained on such unbalanced datasets. We propose two methods of data balancing. In the first, we oversample the minority class by generating synthetic samples with a Generative Adversarial Network (GAN). We employ Vanilla GAN [1], Wasserstein GAN [2], and CTGAN [3] separately for this purpose. To assess the efficacy of the approach, we train a host of machine learning classifiers, including Random Forest, Decision Tree, support vector machine (SVM), and Logistic Regression, on the data balanced by the GANs. In the second, we introduce a hybrid method that combines undersampling and oversampling: the synthetic minority class data generated by the GAN is augmented with majority class data undersampled by a one-class support vector machine (OCSVM) [4], and the resulting data is passed to the classifiers. Compared with the results of Farquad et al. [5] and Sundarkumar, Ravi, and Siddeshwar [6], our proposed methods achieve a higher area under the ROC curve (AUC) on all datasets.
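The hybrid balancing step described in this abstract can be illustrated with a minimal Python sketch, under several assumptions that are not taken from the paper. The function names, the toy Gaussian data, the `keep_fraction` parameter, and the rule of keeping the majority rows with the highest OCSVM decision scores are all hypothetical. In particular, `placeholder_generator` is not a GAN: it only jitters existing minority rows and stands in for sampling from a trained Vanilla GAN, Wasserstein GAN, or CTGAN generator.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def undersample_majority_ocsvm(X_major, keep_fraction=0.5):
    """Keep the majority rows that a one-class SVM scores as most typical.

    Assumption: ranking by decision score is one possible selection rule;
    the abstract does not state which rule the authors use.
    """
    ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.5)
    ocsvm.fit(X_major)
    scores = ocsvm.decision_function(X_major)
    n_keep = max(1, int(keep_fraction * len(X_major)))
    keep_idx = np.argsort(scores)[-n_keep:]  # highest scores last
    return X_major[keep_idx]


def placeholder_generator(X_minor, n_new, rng):
    """Stand-in for a trained GAN generator (NOT a GAN): jittered resampling."""
    idx = rng.integers(0, len(X_minor), size=n_new)
    noise = rng.normal(scale=0.05, size=(n_new, X_minor.shape[1]))
    return X_minor[idx] + noise


def build_hybrid_training_set(X, y, keep_fraction=0.5, seed=0):
    """Undersample class 0, oversample class 1, and return a balanced set."""
    rng = np.random.default_rng(seed)
    X_major, X_minor = X[y == 0], X[y == 1]
    X_major_kept = undersample_majority_ocsvm(X_major, keep_fraction)
    n_new = max(0, len(X_major_kept) - len(X_minor))
    X_synth = placeholder_generator(X_minor, n_new, rng)
    X_bal = np.vstack([X_major_kept, X_minor, X_synth])
    y_bal = np.concatenate(
        [np.zeros(len(X_major_kept)), np.ones(len(X_minor) + n_new)]
    )
    return X_bal, y_bal


def make_toy_data(n_major, n_minor, rng):
    """Hypothetical unbalanced data: two Gaussian blobs in four dimensions."""
    X = np.vstack([
        rng.normal(0.0, 1.0, size=(n_major, 4)),
        rng.normal(1.5, 1.0, size=(n_minor, 4)),
    ])
    y = np.concatenate([np.zeros(n_major), np.ones(n_minor)])
    return X, y


rng = np.random.default_rng(42)
X_train, y_train = make_toy_data(950, 50, rng)
X_test, y_test = make_toy_data(950, 50, rng)

# Balance only the training data; evaluate on the untouched test split.
X_bal, y_bal = build_hybrid_training_set(X_train, y_train, keep_fraction=0.5)
clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"balanced training shape: {X_bal.shape}, test AUC: {auc:.3f}")
```

Only the training split is rebalanced here, so the AUC is measured on data that keeps the original class ratio. Any of the classifiers named in the abstract could replace `LogisticRegression` in this sketch.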
Abstract: With the widespread use of social media, companies now have access to a wealth of customer feedback data, which has valuable applications in Customer Relationship Management (CRM). Analyzing customer grievance data is paramount, because failure to redress grievances quickly leads to customer churn and lower profitability. In this paper, we propose a descriptive analytics framework that uses the self-organizing feature map (SOM) for visual sentiment analysis of customer complaints. The network automatically learns the inherent grouping of the complaints, which can then be visualized using various techniques. Analytical Customer Relationship Management (ACRM) executives can draw useful business insights from the maps and take timely remedial action. We also propose CUDASOM (CUDA-based Self-Organizing feature Map), a high-performance version of the algorithm implemented on NVIDIA's parallel computing platform, CUDA, which speeds up the processing of high-dimensional text data. The efficacy of the proposed model is demonstrated on customer complaints data regarding the products and services of four leading Indian banks. CUDASOM achieved an average speedup of 44 times. Our approach can extend research into intelligent grievance redressal systems that provide rapid solutions to complaining customers.
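To make the SOM idea concrete, the following is a minimal CPU sketch in Python with NumPy. It is not CUDASOM and does not reproduce the authors' implementation. The grid size, the linear decay schedules, the function names, and the toy three-dimensional vectors (standing in for high-dimensional complaint text features) are all assumptions chosen for illustration.

```python
import numpy as np


def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with online (one sample per step) updates."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # learning rate decays to 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]
        # Best matching unit (BMU): the node whose weight vector is closest to x.
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood on the grid, centred on the BMU.
        grid_d2 = (grid_r - bmu[0]) ** 2 + (grid_c - bmu[1]) ** 2
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights


def map_to_bmu(weights, data):
    """Return the (row, col) grid position of the BMU for every sample."""
    d = np.linalg.norm(weights[None] - data[:, None, None, :], axis=3)
    flat = d.reshape(len(data), -1).argmin(axis=1)
    return np.column_stack(np.unravel_index(flat, weights.shape[:2]))


# Hypothetical toy data: two well-separated groups of feature vectors.
rng = np.random.default_rng(1)
cluster_a = rng.normal(0.1, 0.02, size=(40, 3))
cluster_b = rng.normal(0.9, 0.02, size=(40, 3))
data = np.vstack([cluster_a, cluster_b])

weights = train_som(data)
bmus = map_to_bmu(weights, data)
print("cells used by group A:", sorted({tuple(p) for p in bmus[:40].tolist()}))
print("cells used by group B:", sorted({tuple(p) for p in bmus[40:].tolist()}))
```

The BMU search computes one distance per map node and per sample, with no dependence between nodes, so it is the kind of step that lends itself to GPU parallelization. How CUDASOM actually distributes this work is not stated in the abstract.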