Testing the ability of Generative AI to predict Customer Churn

Australia's superannuation industry is a robust and essential part of the national economy, with total assets exceeding $3.9 trillion as of March 2024.
Customer churn is a problem in many industries, none more so than in superannuation, due to its compulsory nature and low barriers to switching. A superannuation customer's value is proportional to their super balance, so it follows that the longer someone has been a customer (and the larger their super balance becomes), the higher their value. Put another way, customer churn close to retirement is costly.
Active Super is a prominent player within this thriving sector, known for its strong performance and member-focused approach. With a substantial share of the market, Active Super has consistently delivered competitive returns and innovative investment options. Deepend have been working with Active Super on their technology and platforms for many years, and we wanted to see if we could move the needle on the customer churn issue by applying a new approach that hasn’t been tried before.
The current crop of LLM-based AI systems has a unique ability to understand human language. This insight formed the core of an idea that we developed into an experimental program. At a high level, the questions we wanted to test were:
- Could AI leverage new and existing data sets to understand customer sentiment?
- Would this sentiment over time have predictive value for customer churn?
- Could generative AI then be leveraged to generate personalised retention communication strategies?

Methodology
The methodology we designed for this experiment consisted of 6 stages:
Step 1: Data analysis & preparation
Active Super had no shortage of data, but it was siloed and not readily ingestible. Additionally, a significant portion of their customer data was unstructured—comprising free-text interaction records and notes from their CRM team. This valuable information was difficult to analyze at scale, limiting the fund's ability to identify emerging churn patterns and respond in a timely manner.
Step 2: Data Pipeline construction
A dedicated data-processing pipeline was engineered to cleanse the data: erroneous records were removed, acronyms were expanded, and feature engineering was used to create new data points from existing information to generate deeper insights.
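A minimal sketch of the kind of cleansing and feature-engineering steps described above. The field names and the acronym map are illustrative assumptions, not the fund's actual data dictionary:

```python
from datetime import date

# Hypothetical acronym map - real mappings would come from the CRM team's glossary
ACRONYMS = {"SG": "superannuation guarantee", "TTR": "transition to retirement"}

def expand_acronyms(note: str) -> str:
    """Swap acronyms for their full forms so the LLM sees plain language."""
    for short, full in ACRONYMS.items():
        note = note.replace(short, full)
    return note

def engineer_features(record: dict, as_of: date = date(2024, 3, 31)) -> dict:
    """Derive new data points (tenure, balance growth) from existing fields."""
    tenure_years = (as_of - record["join_date"]).days / 365.25
    return {
        **record,
        "note": expand_acronyms(record.get("note", "")),
        "tenure_years": round(tenure_years, 2),
        # Illustrative derived feature: how quickly the balance has accumulated
        "balance_growth_per_year": record["balance"] / max(tenure_years, 1.0),
    }
```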
Step 3: Development of a custom AI application using Langchain
A custom AI application was constructed using the LangChain framework and OpenAI's APIs, allowing us to experiment safely with full control over data privacy and security.
Step 4: Identifying churn signals in data
By supplying the app with data where the outcomes were known in advance (i.e. customers who had already churned), we were able to progressively identify signals in the data that indicated churn propensity. These included indicators like tenure and balance, but also sentiment over time as derived from customer service interactions.
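To illustrate how "sentiment over time" can become a churn signal, here is a hedged sketch: a least-squares slope over a member's sentiment scores, combined with the tenure and balance indicators mentioned above. The combination rule and thresholds are illustrative only, not the signals actually identified in the experiment:

```python
from statistics import mean

def sentiment_trend(scores: list[float]) -> float:
    """Least-squares slope of sentiment scores over successive interactions."""
    n = len(scores)
    if n < 2:
        return 0.0
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(scores)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def churn_signal(tenure_years: float, balance: float,
                 scores: list[float], threshold: float = -0.1) -> bool:
    """Toy combination of indicators; thresholds are hypothetical."""
    declining = sentiment_trend(scores) < threshold
    low_engagement = tenure_years < 2 or balance < 10_000
    return declining and low_engagement
```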
Step 5: Testing customer churn prediction accuracy
Once we had assembled our churn indicators, the next step was to supply the app with new curated data sets and test whether it could successfully predict which customers would go on to churn.
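Scoring a prediction round can be sketched as follows. This is a generic accuracy calculation over a held-out population, not the project's actual evaluation harness:

```python
def churn_accuracy(predicted_ids: set, actual_ids: set, population: list) -> float:
    """Fraction of members whose churn/no-churn outcome was predicted correctly."""
    correct = sum((m in predicted_ids) == (m in actual_ids) for m in population)
    return correct / len(population)
```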
Step 6: Generating personalised retention communication
Finally, we tested the app's ability to use the insights derived (why someone may be unhappy) to generate custom retention communications that could be used to address the issue and prevent churn.
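The retention step turns an insight (the member's likely concern) into a prompt for the model. A hedged sketch of what such a prompt builder might look like; the wording and parameters are hypothetical, not the prompts used in the experiment:

```python
def retention_prompt(name: str, concern: str, channel: str = "email") -> str:
    """Build a prompt asking the model for a personalised retention message."""
    return (
        "You are a member-retention specialist for a superannuation fund.\n"
        f"Member {name} appears unhappy about: {concern}.\n"
        f"Draft a short, empathetic {channel} that addresses this concern "
        "and explains how we can help, without making financial advice claims."
    )
```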

Custom Python Data Pipeline
We built a custom data pipeline in Python to merge, clean and prepare the data sets for consumption by the AI model.
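A condensed sketch of the merge/clean/prepare pattern using pandas. Table and column names are assumptions for illustration, not the actual schema:

```python
import pandas as pd

def build_dataset(members: pd.DataFrame, interactions: pd.DataFrame) -> pd.DataFrame:
    """Merge member records with their free-text interaction notes."""
    members = members.drop_duplicates(subset="member_id")  # remove duplicate records
    # Collapse each member's CRM notes into one text field for the LLM
    notes = (interactions.groupby("member_id")["note"]
             .agg(lambda s: " | ".join(s.dropna()))
             .rename("all_notes")
             .reset_index())
    df = members.merge(notes, on="member_id", how="left")
    df["all_notes"] = df["all_notes"].fillna("")  # members with no interactions
    return df
```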

Custom AI application powered by LangChain
LangChain is an open-source framework designed to simplify the development of applications powered by large language models (LLMs) by enabling them to interact with external data, tools, and memory.
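The core pattern LangChain provides is composing a prompt template with a model call. The stdlib-only sketch below mimics that chain pattern with a stubbed model so it runs without API keys or the library itself; the real application used LangChain's own components against OpenAI:

```python
class PromptTemplate:
    """Minimal stand-in for a LangChain-style prompt template."""
    def __init__(self, template: str):
        self.template = template

    def format(self, **kwargs) -> str:
        return self.template.format(**kwargs)

class StubLLM:
    """Placeholder for an OpenAI chat model; returns a canned reply."""
    def invoke(self, prompt: str) -> str:
        return f"[model reply to {len(prompt)}-char prompt]"

class Chain:
    """Pipe a formatted prompt into the model, as a chain would."""
    def __init__(self, prompt: PromptTemplate, llm: StubLLM):
        self.prompt, self.llm = prompt, llm

    def invoke(self, **kwargs) -> str:
        return self.llm.invoke(self.prompt.format(**kwargs))

chain = Chain(
    PromptTemplate("Summarise the churn risk for member {member_id}: {notes}"),
    StubLLM(),
)
```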

Leveraging Open AI LLMs
Connecting directly to OpenAI's APIs allowed us to experiment securely and safely over the data while protecting sensitive information.
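One common way to protect sensitive information before text reaches an external API is to redact identifiers. A hedged sketch: the patterns below (AU-style mobile numbers, a hypothetical member-ID format) are illustrative, not the project's actual redaction rules:

```python
import re

# Illustrative PII patterns - a production system would use a vetted rule set
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b04\d{8}\b"),          # AU mobile, digits only
    "MEMBER_ID": re.compile(r"\bM\d{6}\b"),        # hypothetical ID format
}

def redact(text: str) -> str:
    """Replace each matched identifier with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```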
Results
The early results from experimentation have been promising. We have been able to isolate churn indicators in the data sets, both in demographic data (age, sex, location, tenure) and in customer interaction data (sentiment, keyword usage).
In round 1 of testing, churn prediction accuracy exceeded targets by 5%. Subsequent fine-tuning increased accuracy in later rounds to 75%.
Hallucinations have been observed, where the model predicts churn for customer IDs that do not exist. Investigation into how to eliminate these is ongoing; one promising method is to limit the size of the prediction set.
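A simple guard against this class of hallucination is to validate every predicted ID against the known member set before acting on it. A minimal sketch, assuming predictions arrive as a list of IDs:

```python
def constrain_predictions(predicted_ids: list, known_ids: set) -> tuple[list, list]:
    """Split predictions into valid members and hallucinated (non-existent) IDs."""
    valid = [m for m in predicted_ids if m in known_ids]
    dropped = [m for m in predicted_ids if m not in known_ids]
    return valid, dropped
```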
Related Work

Using artificial intelligence to better guide users and lighten the load on a call centre under pressure
