The Future of Multilingual Bots for CX Automation


In the latest session of our Future-gazing Series, Ultimate’s Staff AI researcher, Meysam Asgari-Chenaghlu, discusses the challenges and innovative solutions that enable our bots to speak over 100 languages.

Welcome to the latest instalment of our Future-gazing AI series, which features video Q&As with experts on the future of AI.

If you’re familiar with the world of conversational AI, then you know that honing a bot’s multilingual capabilities is really hard. In this session, our Staff AI researcher, Meysam Asgari-Chenaghlu, shares the solution we’ve come to at Ultimate – and it may surprise you. In conversation with our Senior Product Marketing Manager, Aaron Wichman, Meysam talks shop on the secrets behind successful multilingual automation.

They delve into the differences between monolingual, multilingual, and cross-lingual bots – and why we have ditched translations in order to vastly expand our bots’ capacity to speak to customers in 109 languages and counting. They also discuss the importance of choosing the right LLMs and why Ultimate has always prioritized conducting its AI research in-house.

Read the full interview or check out the video recording above.  


Aaron Wichman: Hey, everyone. Welcome. My name is Aaron. I'm a product marketer at Ultimate, and I'd like to welcome you to our Future-gazing Series on AI. I'm here with our staff AI researcher, Meysam. Welcome Meysam.  

Meysam Asgari-Chenaghlu: Hey, my name is Meysam. I'm very glad to be invited. 

AW: Let's start off by doing a little bit of an intro. Maybe you could talk about your role at Ultimate.  

MA: Well, I work at Ultimate as a staff AI researcher. My day-to-day work is researching and developing new AI features and rolling them out to our customers.

AW: Amazing. I'd like to dial in a bit and talk more about the multilingual side of our bots. What kind of work is being done right now in the field of multilingual automation, and how has it developed or changed since the era of ChatGPT? 

MA: Multilingual automation has been quite a challenge. Not just for us, but for everybody. And just to add a bit of clarification, we have monolingual bots, multilingual bots, and cross-lingual bots. 

By monolingual, we mean that the bot is set up to speak a single language. With multilingual bots, they speak multiple languages, but they differentiate between the languages. 

"And then, there’s the amazing cross-lingual bots, which you set up with a monolingual knowledge base and then you enable your customers to interact with your bot in different languages."

So imagine your knowledge base is in German and then your customers can speak with this knowledge-base bot in Chinese, Japanese, Arabic, or any other language, and it just works.  

AW: Yeah, that makes sense. I know that it can also be kind of tricky when you have languages with formal and informal pronouns, like German or French. What are some of these challenges, and how do you approach trying to make the bot sound a little more natural when you interact with it?

MA: Yeah, to be honest, that was challenging. Different languages have different ways to address people – in a very polite way, in a casual way, an informal way, and even specific tones of voice.

So we created a set of tools in our settings where you can set up the tone of voice that you want your bot to use. But of course, for a professional tone of voice, you don't want your bots to address people as “Du” in German. You want it to be “Sie,” and these pronouns are handled by our bots. 
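The pronoun handling Meysam describes could be sketched as a simple lookup from the configured tone of voice to the right second-person register. This is a toy illustration – the language codes, tone names, and function are hypothetical, not Ultimate's actual settings schema:

```python
# Hypothetical sketch: mapping a bot's tone-of-voice setting to the
# second-person pronoun used in replies. All names here are illustrative.

TONE_PRONOUNS = {
    "de": {"professional": "Sie", "casual": "du"},
    "fr": {"professional": "vous", "casual": "tu"},
}

def address_pronoun(language: str, tone: str) -> str:
    """Return the pronoun matching the configured tone; English has no split."""
    return TONE_PRONOUNS.get(language, {}).get(tone, "you")

print(address_pronoun("de", "professional"))  # Sie
```

In practice this choice would be fed into the generation step as an instruction, rather than substituted into templates.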

AW: That's great. It's really important to have that functionality. So as AI continues to evolve, how can researchers like yourself ensure that automated multilingual support systems are more culturally sensitive and capable of understanding the nuances that come with various languages?

MA: To start, it depends on the AI that you are trying to use. For example, there are a variety of different LLMs, the technology used as the foundation of generative AI. Some of them carry biases – gender bias, for example – and they carry them to different degrees.

"We tried to pick the LLM with the fewest biases. You also need to work with an LLM that is well-suited for customer support automation – this is very important."

Another thing is that sometimes you need the bot to follow certain instructions, and not all LLMs reliably do. And the other factor is, of course, domain. If you want a large language model for a specific domain – say, abstractive summarization of technical text – that will probably require a different LLM.

But for customer support automation, it has to be an LLM that is fine-tuned for chat. That makes the choice of model very important.


AW: Yeah, I hear you. But what key technologies or LLMs that we're using hold the most promise for enhancing the efficiency and effectiveness of multilingual support? 

MA: In order to make UltimateGPT even better, we adopted different ways of generating the output. One of them is function calling, which was very important for us. Another is building a set of prompts and inputs through manual testing, and creating benchmarks for evaluation.

And again, I just want to emphasize this part, which is very important: it is not just a single step. We get the query from the user, use a RAG model (retrieval-augmented generation) to pull the relevant data from the knowledge base, give that data to an LLM, and then ask it to answer the question. For us, it's a more complex pipeline, and the reason is that we want it to be more accurate, with fewer hallucinations.
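The retrieve-then-generate flow described above can be sketched in a few lines. Everything here is a toy stand-in: the retriever is naive word overlap and the "LLM" is a template function, whereas a production system would use a vector search engine and a real language model.

```python
# Toy sketch of a retrieval-augmented generation (RAG) pipeline:
# retrieve relevant knowledge-base chunks first, then answer only
# from that retrieved context to limit hallucination.
import re

def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, knowledge_base: list[str], top_k: int = 1) -> list[str]:
    """Rank knowledge-base chunks by naive word overlap with the query."""
    q = tokenize(query)
    scored = sorted(knowledge_base,
                    key=lambda chunk: len(q & tokenize(chunk)),
                    reverse=True)
    return scored[:top_k]

def generate_answer(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: the reply is grounded only in retrieved context."""
    if not context:
        return "Sorry, I could not find an answer."
    return f"Based on our help center: {context[0]}"

kb = [
    "You can cancel an order within 24 hours from your account page.",
    "Shipping to the EU takes 3 to 5 business days.",
]
print(generate_answer("How do I cancel an order?",
                      retrieve("How do I cancel an order?", kb)))
```

The key design point is that the model is asked to answer from the retrieved passage, not from its own open-ended knowledge.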

The second thing is tone of voice. It is very important to match the specific tone of voice our customers require their bots to use.

"A person can find text online that probably also answers the query, but without mastering the tone of voice function, a customer will not get the feeling that they are talking to a support agent."

AW: Yeah, so it's less about simply just passing the query to the LLM. There's multiple steps in between. 

MA: On the subject of these multiple steps, we wanted the bot to be more creative in some situations, such as when engaging in small talk with a customer. But sometimes we don't want it to be creative. In such cases, we want the bot to be grounded in the facts that we have in our knowledge base.

From the technical perspective, there is a very fine line between the two. It's important to detect, in a given interaction, when you want the bot to be creative and when you want it to just stick to the facts. Either way, you also have to maintain the same consistent tone of voice.

AW: Yeah, that makes sense. I'd like to dive a little bit deeper into the product now. We talk about Ultimate serving 109 different languages. Can you discuss more about how Ultimate can handle that natively?

MA: Handling more than 100 languages is very challenging. If you are shipping a feature to customers, it has to be tested for all of these languages, and it has to work in all of them to a certain standard of performance.

There are, of course, some languages which have fewer resources online. For example, some languages don’t have that many speakers, so there is less online content because there are fewer people creating it in the first place. And beyond a certain point, whether the language model is large or small doesn't really matter if the model isn’t proficient in that specific language. So this is very challenging.

The other part is that the bot has to detect the language and engage only in the languages you choose in your bot settings. Some of our customers, for example, don't want their bot to respond in certain languages – even if a user writes in one of them – because they don't provide services for those speakers.

So you also need a very good language detection module to determine which languages the bot should engage in. That is also challenging. There are many open-source language detection services out there, and you can pay for a commercial one, but in the end the domain is different. They may work quite well in general, but our subject is customer support and our domain is chat – and people write differently in chat than they do in, say, a Wikipedia article.
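The gating logic described here – detect the language, then only engage if it is on the bot's allow-list – can be sketched as follows. The detector below is a deliberately crude keyword heuristic; a real system would use a detection model trained on chat-style text:

```python
# Illustrative sketch: gate the bot's replies on an allow-list of
# enabled languages. The keyword-based detector is a toy stand-in.

HINT_WORDS = {
    "de": {"und", "nicht", "ich", "bestellung"},
    "en": {"and", "not", "the", "order"},
}

def detect_language(message: str) -> str:
    """Pick the language whose hint words overlap the message most."""
    words = set(message.lower().split())
    return max(HINT_WORDS, key=lambda lang: len(words & HINT_WORDS[lang]))

def should_engage(message: str, enabled_languages: set[str]) -> bool:
    """Only engage when the detected language is enabled in the bot settings."""
    return detect_language(message) in enabled_languages

print(should_engage("ich möchte meine bestellung stornieren", {"en"}))
```

A German message to an English-only bot is declined even though the model could, in principle, answer it.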

AW: Yeah, that makes sense. And then the language detection is like the first part of that pipeline that you were talking about. How does an AI tool like our UltimateGPT compare or differ from deep learning SNGP (Spectral-normalized Neural Gaussian Process) models?

MA: Well, SNGP models are classifiers. If I want to explain it very briefly, you have a set of classes, which in our domain, we call intents.

So your customer wants to solve a specific problem; for example, they want to cancel an order. The way we address this with SNGP is that you create a set of annotated data, which we call expressions, perhaps from your historical data. You have these intents, and under each intent, a set of expressions. We have a very good feature for training SNGP models.

But there is a difference between this intent classifier and other intent classifiers that may be out there. The difference is that an SNGP model is able to understand what is within the scope of an intent and what is not. For example, if I set up a bot for customer support in healthcare, then my bot should not be concerned with the query, “Where did I park my car?” – because there is no intent which can address this. So it should understand the scope of messages as well, and it should not mistake this message for one of the existing intents, because it is completely out of domain and out of scope.
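The out-of-scope behaviour Meysam attributes to SNGP-style models can be illustrated with a confidence threshold: if no known intent is close enough, the message is rejected rather than forced into a class. The Jaccard-overlap scoring below is a toy stand-in for a real distance-aware classifier, and the intent names are invented for the example:

```python
# Hedged sketch of intent classification with out-of-scope rejection.
# Real SNGP models estimate distance from the training data; here a
# simple similarity threshold plays that role.

INTENT_EXAMPLES = {
    "cancel_order": ["cancel my order", "stop my order"],
    "book_appointment": ["book a doctor appointment", "schedule a visit"],
}

def classify(message: str, threshold: float = 0.3) -> str:
    """Return the best-matching intent, or 'out_of_scope' if nothing is close."""
    msg = set(message.lower().split())
    best_intent, best_score = "out_of_scope", 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for ex in examples:
            ex_words = set(ex.split())
            score = len(msg & ex_words) / len(msg | ex_words)  # Jaccard similarity
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else "out_of_scope"

print(classify("cancel my order please"))   # cancel_order
print(classify("where did i park my car"))  # out_of_scope
```

The healthcare example from the interview maps directly: "where did I park my car" is far from every intent, so the threshold rejects it.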

"This is the difference between SNGP and UltimateGPT, which is a RAG-based system. It has a search engine, it has a large language model, and whenever you ask a query, it doesn't classify it to some categories. It gets your query and then tries to find relevant information from your knowledge base or CSV import." 

We have a web scraper that will crawl whichever URL you give it. The bot then tries to answer the question, and the answer is not something that you have provided beforehand. You simply put the data and information into the knowledge base; the bot reads it and then creates an answer to address the issue.


AW: That makes sense. As I know we just did an upgrade to this recently, what about multilingual embeddings? Where does this fit in? 

MA: So first of all, why use embeddings at all? Because we have to perform search. And why do we have to perform search? Because UltimateGPT uses search as a tool to get the information and then create the answer.

So, then what is an embedding? It means you take a text and represent it with numbers, such that texts with related meaning end up numerically close to one another. So all texts which have similar context will be semantically similar to each other. But as I said before, we are not creating monolingual bots. We are not even creating multilingual bots – we are creating cross-lingual bots.

"It is necessary for us to also have a very good embedding service, which can represent text in all of these languages that we offer, and also to have this cross-lingual functionality. So if I ask a question in German, and in my knowledge base there is information which answers this in Japanese, it should still be able to find the answer and respond in German." 
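The cross-lingual idea above can be shown with cosine similarity: equivalent questions in different languages land near the same point in the embedding space, while an unrelated question does not. The three-dimensional vectors below are hand-made toys, not the output of a real multilingual embedding model:

```python
# Sketch of cross-lingual retrieval via embeddings: a German query
# matches a Japanese knowledge-base entry about the same topic because
# their (toy) vectors point in nearly the same direction.
import math

TOY_EMBEDDINGS = {
    "Wie storniere ich meine Bestellung?": [0.90, 0.10, 0.00],  # German: cancel order
    "注文をキャンセルするには？": [0.88, 0.12, 0.02],             # Japanese: cancel order
    "What are your shipping times?": [0.05, 0.20, 0.95],        # unrelated topic
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def best_match(query: str, candidates: list[str]) -> str:
    """Return the candidate whose embedding is closest to the query's."""
    q = TOY_EMBEDDINGS[query]
    return max(candidates, key=lambda c: cosine(q, TOY_EMBEDDINGS[c]))
```

With a real multilingual embedding service, no translation step is needed: search operates on meaning, not on surface language.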

AW: So there's no translation going on at all? 

MA: No. We don't do any translations.  

AW: Okay, got it. Thank you. Yeah. And what about the latest improvements that you and your team have been working on for UltimateGPT and how it handles languages?  

MA: We have been working on very different aspects of it, and we have been improving it a lot. One of the improvement points has to do with the very topic we are discussing: the embedding models. We are now using bigger embedding models, which are better at finding accurate information. The second part is that we focus more on the factuality of the answers, so the answers are now more factual.

But just to clarify, this is a very tricky part of the entire pipeline. If you push for more factual answers, you should also keep in mind not to reduce the answer rate of your bots. The bot should answer more factually, and your analytics should be able to verify that.

Another part is re-ranking. Re-ranking means that after the embedding-based search – which is already very accurate – we score the retrieved information again, and if any part of it is not actually related to the query, we eliminate it. This makes the model more accurate.

This helps us in many aspects. The other part we have been working on is latency. We’ve progressed a lot so far and I think it's in good shape: latency has improved, and of course the quality of the answers and the tone-of-voice capability have improved a lot as well. There's a lot that we are working on!
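The re-ranking step Meysam describes – score the retrieved chunks a second time and drop the ones that are not actually relevant – can be sketched like this. The word-overlap scorer is a toy; production re-rankers are typically cross-encoder models:

```python
# Rough sketch of re-ranking after retrieval: re-score each retrieved
# chunk against the query and keep only the ones above a relevance floor.
import re

def relevance(query: str, chunk: str) -> float:
    """Toy second-pass score: fraction of query words present in the chunk."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", chunk.lower()))
    return len(q & c) / max(len(q), 1)

def rerank(query: str, retrieved: list[str], min_score: float = 0.2) -> list[str]:
    """Keep only chunks that stay relevant on the stricter second pass."""
    scored = [(relevance(query, ch), ch) for ch in retrieved]
    return [ch for score, ch in sorted(scored, reverse=True) if score >= min_score]
```

Filtering out near-miss chunks before generation gives the LLM less irrelevant context to be misled by, which is the accuracy gain described above.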

AW: Amazing. I appreciate this kind of constant innovation. From what I know it can be kind of tricky. How is the way that Ultimate handles the different multilingual bots different from, say, some of the other bots that are out there?

MA: Well, there are different approaches to handling multilingual bots. One approach that some are using is taking the information from one specific language and translating it into different languages, then expanding the dataset that will be used to train an intent classifier through paraphrasing and translation.

This has proven not to be a very good approach, I would say. There are many downsides to it, and many issues may arise. For starters, translation – even with the best translation tools – is not at its best right now. You can usually tell that a text was translated, and the process strips out a lot of other information. For example, if you put text with typos into Google Translate, the translation removes the typos. So if you train a model on this translated text, it will only understand very well-written messages.

That's because everything comes from the same source: Google Translate. It is better to have a representation of these messages, as we were saying before, which applies not to one language but to many. And whenever your classifier, such as SNGP, learns this, you can speak to it in another language.

And it will understand, because you already gave it some examples, let's say in English. So the bot will understand whenever you're speaking in German. Even though it was not trained in German, it understands, because it was trained on very similar information and very similar text.

AW: But what if it's a dialect or subset of a language? 

MA: Well, dialects are different forms of writing the same language. And it depends on the dialect itself. As an analogy, if you imagine a language like a tree, then the dialects are the leaves of that same tree.

But across dialects, a word can mean something very different. It's tricky because you will give the bot an example and say this means “cancellation,” and then someone with a different dialect will write it in a very different way, which may look similar to other intents.

So in order to approach this, I personally think that SNGP does quite a good job, because it creates a general understanding of the concept instead of acting as just a classifier.

AW: I appreciate your metaphor of the trees and leaves, that's a very helpful way to approach it. So what's the benefit of having your team here in house at Ultimate versus say, maybe outsourcing some of that?  

MA: That's a good question. Ultimate is an AI company because our core foundation and all of the features that we provide are based on AI. And it is not as if this was the case only since generative AI came out – I was hired, and my colleagues were hired, at the very beginning.

In fact, Jakko, one of the co-founders and chief science officer, has been there from the very beginning because the entire product is based on core AI functionality.

"It totally makes sense that you should have an in-house AI team – not just AI researchers but also machine learning engineers. If you are creating some kind of innovation, you should be able to scale it, track issues, and handle huge volumes of requests. So it's not just a few people – a noticeable portion of Ultimate is dedicated to AI."

AW: Yeah, that's amazing. I appreciate this kind of constant innovation. Maybe next time we chat, I'd love to delve a little more into how the ML and AI teams prioritize their work. But I know we're running out of time here, so maybe you could just talk about what's next for you and the team.

MA: Well, the next is going to be groundbreaking. 

AW: So exciting. Thank you so much, Meysam. I’m really happy to have you here with us today and we'll speak to you soon. 

MA: Thank you very much. Bye. 
