Fixed Line Customer Segmentation Using PCA Based on Customer Lifetime Value

  • Thee Houw Liong, Telkom University
  • Ratna Prihandini, Telkom University
Keywords: Principal Component Analysis, Customer Lifetime Value, K-Means, Logistic Regression, Support Vector Machine


One of the challenges faced by PT Telkom Indonesia in its transition to FTTH services is how
to retain current customers and encourage them to migrate to the Indihome product. Telkom
currently has about 8.3 million home customers, and only about 26% of them are well profiled.
This makes it difficult for the company to identify customers with the potential to generate
future profit. The company therefore needs a model that segments customers in order to
prioritize their migration to the FTTH product and thereby maximize profit from existing
customers. According to the Pareto 80/20 rule, a company that can identify the most
profitable 20% of its customers will be able to sustain its business revenue.
In this research, the proposed method is a clustering process using principal component
analysis (PCA) combined with K-Means, followed by a classification process using customer
lifetime value combined with a support vector machine and logistic regression. The primary
data sources are the call data records and payment records of existing customers. Two
classification algorithms are used to build a model that maps new customers to segments
without repeating the clustering process, and to examine the influence of principal component
analysis on model accuracy.
The results show that the best model uses principal component analysis as the feature
selection technique. The Calinski-Harabasz index is used as the clustering measure to
determine the optimal number of clusters. For the classification model, logistic regression
gives better accuracy and improved performance, with an F-score of 99.16%, accuracy of about
99.93%, and precision of 98.84%. Using principal component analysis for feature selection
improves the performance of logistic regression by up to 2% compared with the non-PCA
implementation and the RFM model. The list of top profitable customers inferred by this model
covers 33.62% of the total population.
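
The cluster-count selection step described above can be illustrated with a minimal sketch. It
is not the authors' implementation: it uses only the Python standard library, synthetic 2-D
points standing in for PCA-projected customer features, a plain Lloyd's K-Means, and a
deterministic farthest-first seeding chosen here only for reproducibility. All function names
(`centroid`, `farthest_first`, `kmeans`, `calinski_harabasz`, `make_blobs`) are hypothetical.
The Calinski-Harabasz index is the ratio of between-cluster to within-cluster dispersion, each
divided by its degrees of freedom; the candidate number of clusters with the highest index is
kept.

```python
import random
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Recompute each center; keep the old one if a cluster became empty.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(centroid(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels


def calinski_harabasz(points, labels):
    """CH = (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k))."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * dist(c, overall) ** 2
        within += sum(dist(p, c) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))


def make_blobs(seed=42, per_blob=30, spread=0.5):
    """Synthetic stand-in data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(rng.gauss(mx, spread), rng.gauss(my, spread))
            for mx, my in means for _ in range(per_blob)]


points = make_blobs()
# Score each candidate number of clusters and keep the highest index.
scores = {k: calinski_harabasz(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, {k: round(s, 1) for k, s in scores.items()})
```

On this synthetic data the index should peak at three clusters, matching how the points were
generated. In practice a library routine such as scikit-learn's `calinski_harabasz_score`
would normally be used instead of a hand-written index, and the resulting cluster labels would
then serve as training targets for the classifier.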