Transfer Learning in Customer Service Automation

A woman peeking out from behind a book.

How to use transfer learning to build state-of-the-art customer service AI! Hint: combining deep learning and transfer learning powers our language-agnostic approach.

Nowadays, we regularly hear about advances in the field of Artificial Intelligence (AI). At Ultimate, we aim to be at the forefront of this development by using and advancing AI techniques that improve our AI customer service tools and enable companies to be more efficient in their interactions with customers.

The beauty of deep learning…

A lot of advances in the field of Artificial Intelligence stem from the development of deep learning. Deep learning is a machine learning technique that typically uses artificial neural networks with multiple layers. A neuron can be thought of as a function “approximator” that allows us to learn how to map from an input to the desired output using training data alone. For example, deep learning enables AI to learn how to convert between Celsius and Fahrenheit by analysing examples of conversion pairs (0°C = 32°F, 10°C = 50°F, etc.). This is useful in complex cases where we do not know the function (formula) that links input and output. In customer service AI, these cases can be anything from understanding a new language to simulating a conversation.
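The Celsius-to-Fahrenheit example can be sketched in a few lines of Python. This is a minimal illustration, not Ultimate's production code: it assumes a single linear neuron (one weight, one bias) trained with plain gradient descent. The names `pairs` and `train_neuron`, the learning rate, the number of epochs, and the three training pairs beyond the two quoted above are all choices made for this sketch.

```python
# Training pairs (Celsius, Fahrenheit). The first two come from the text;
# the remaining three are extra examples added for this sketch.
pairs = [(0.0, 32.0), (10.0, 50.0), (20.0, 68.0), (30.0, 86.0), (40.0, 104.0)]


def train_neuron(pairs, learning_rate=0.001, epochs=30000):
    """Fit f = w * c + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the neuron starts out knowing nothing
    n = len(pairs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * c + b - f) * c for c, f in pairs) / n
        grad_b = sum(2 * (w * c + b - f) for c, f in pairs) / n
        # Take a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train_neuron(pairs)
print(f"learned: F = {w:.3f} * C + {b:.3f}")
```

With these settings the learned weight and bias should land close to 1.8 and 32, the constants of the familiar conversion formula, even though the formula itself never appears in the code.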

In essence, a deep learning neural network is a stack of many neurons that aims to learn an approximation of more complex functions, such as the relationship between your facial expression and your mood. In cases like this, where building the rules manually is near-impossible, it allows machines to learn from data alone.

A man making an angry face, a smiling face, and a surprised face.

…and its shortcomings

By enabling AI to handle complex functions, deep learning has significantly expanded the range of possible AI applications. However, current systems require a lot of fuel: the more complex the problem (say, language vs. temperature conversion), the more neurons the neural network architecture needs and, more importantly, the more data the network requires to learn a satisfactory approximation of how the rules operate.

To illustrate, a 5-year-old needs to see only a few examples to learn to tell cats from dogs in photographs. For the same task, a deep learning algorithm could require hundreds of thousands or even millions of labelled data samples.

This is a problem. Data is rare. Data is expensive.

In the case of Ultimate, to build state-of-the-art customer service AI we want to learn a mapping between what customers say and what they want, so that we can answer them according to their needs. This is not a trivial task, and conversational data is limited, which is a big issue. It’s why major customer service AI providers (e.g. IBM, IPsoft) require huge 6+ month onboarding projects (for, among other things, humans to manually build rules for the AI) and ongoing in-house maintenance teams (customer service agents spending all day training the AI rather than helping customers). We didn’t think this was the future of customer service.

So what to do then, when data is limited, but data quantity is so highly tied to solution quality?

On the importance of transfer learning

To deal with the lack of data, we make use of a technique called transfer learning. Transfer learning is a method that allows us to use the knowledge gained from other tasks in order to tackle new but similar problems quickly and effectively. For instance, in the case of humans, using knowledge of cycling makes it easier to ride a motorbike.

At Ultimate, we leverage transfer learning in two key ways:

1) Leading the market in languages: the first language-agnostic solution

Ultimate is a Finnish startup and so had the pleasure of starting with one of the world’s toughest languages. For deep learning AI, Finnish poses two problems: 1) it’s really, really complicated (lots of rules), and 2) with only approx. 5m Finnish speakers worldwide, there isn’t much data on it.

Ever heard that learning more languages gets easier as you go? Maybe from a friend who has boasted that speaking Italian and French has made Spanish a breeze? This phenomenon isn’t just true for humans. In Google Translate’s Zero-Shot Translation algorithm, probably the most famous beneficiary of transfer learning for scaling languages, Google uses a single model to translate between multiple languages, scaling the learnings from one translation pair (English → Japanese) to another (English → Korean).

At Ultimate, we use a similar technique for customer service. Like Google, one of our applications of it is understanding languages. Solving the Finnish problem with transfer learning prompted us to develop our architecture to use a single model across all clients and regions. Today, this development story has made us the only truly language-agnostic provider of customer service AI. We can scale easily with our clients to serve their customers across all their markets. And the more we scale, the better we get.

2) Learning from limited data: cracking the data problem in customer service AI

More interestingly, by being able to apply ways of thinking from one task to another, transfer learning unlocks deep learning potential from smaller datasets. This is called fine-tuning from a warm restart.

Let’s go through an example together: as we said earlier, humans develop basic knowledge (how to sit, how to balance) of riding a motorcycle from riding a bike. It’s a similar case with AI customer service. Each time our AI is trained on a new customer service use case, e.g. adding Ecommerce to a Telco and Travel portfolio, its previous experience in identifying customer problems in other industries elevates its accuracy with the new case. As such, as we scale at Ultimate, we get better at using less and less data to achieve higher and higher accuracy levels. In fact, we’ve been able to deploy with clients at near-launch of their chat/email customer service. To give a sense of this impact, we’ve been able to build trained AI on <500 conversations, which has gone on to support agents with real-time reply suggestions and automation of repetitive questions at the rate of 10,000+ conversations a month.
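The idea of fine-tuning from a warm start can be shown with a toy sketch. It is not Ultimate's model: it assumes a one-weight, one-bias linear model instead of a deep network, and the tasks, data, and names (`source`, `target_train`, `target_test`, `train`, `mse`) are hypothetical. The model is first trained on a data-rich source task, then fine-tuned for just 10 steps on three examples from a related target task, once starting from the pretrained weights and once starting from zeros.

```python
def mse(w, b, data):
    """Mean squared error of the line y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, w=0.0, b=0.0, learning_rate=0.05, steps=200):
    """Gradient descent on the MSE, starting from the given w and b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical source task with plenty of data: y = 2x + 1.
source = [(k / 10, 2 * (k / 10) + 1) for k in range(-20, 21)]
# Hypothetical related target task, y = 2.2x + 0.8, with only three examples.
target_train = [(x, 2.2 * x + 0.8) for x in (-1.0, 0.5, 1.5)]
# Held-out target data used only to compare the two starting points.
target_test = [(k / 10, 2.2 * (k / 10) + 0.8) for k in range(-20, 21)]

# Step 1: pretrain on the data-rich source task.
w_src, b_src = train(source)

# Step 2: fine-tune on the small target set for only 10 steps,
# once from the pretrained weights (warm) and once from zeros (cold).
w_warm, b_warm = train(target_train, w=w_src, b=b_src, steps=10)
w_cold, b_cold = train(target_train, steps=10)

warm_loss = mse(w_warm, b_warm, target_test)
cold_loss = mse(w_cold, b_cold, target_test)
print(f"warm start: {warm_loss:.4f}  cold start: {cold_loss:.4f}")
```

Under the same small budget of data and training steps, the warm-started model ends up with the lower held-out error, because it begins close to a good solution for the related task. In a real deep learning setting the same pattern would typically mean initialising a network from pretrained weights rather than from random values.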
