Synopsis: In this AI Leadership Insights video interview, Amanda Razani speaks with Alexey Khitrov, founder of ID R&D, a Mitek company, about IDVoice technology.

Amanda Razani: Hello, I’m Amanda Razani, and I’m excited to be here today with Alexey Khitrov. He is the founder of ID R&D, a Mitek company. How are you doing today?

Alexey Khitrov: Fantastic.

Amanda Razani: Glad you can be on our show. So can you explain, what is ID R&D and what services do you provide?

Alexey Khitrov: ID R&D is a company that develops best-in-class voice biometric and liveness capabilities. So we build AI-driven products that can identify people by their voices, they can provide remote authentication and personalization services through voice, and we can also distinguish between a human voice and a replayed recording or a voice clone. So we can not only identify people, but we can protect these channels from various types of fraud.

Amanda Razani: Wow, that is really neat. So I believe we were first introduced at the Voice and AI conference a little while back, and you unveiled the IDVoice there that works with ChatGPT, correct?

Alexey Khitrov: Absolutely. This is a very exciting time for voice, in general, and voice biometrics, in particular, because the quality of the conversational content, the quality of the conversational AI is improving drastically, and that means that voice is becoming a new interface. Because when we are communicating with things like ChatGPT, when we are communicating with various voice-enabled devices, we increasingly use voice to input the information there, because voice is much quicker than typing or some other type of input. We generally see that people tend to like to mimic the type of the user experience, the type of the interface that we have between people. So when we talk to somebody, we know we don’t need to go through the lengthy process of establishing each other’s identities. We know who we are, we know who we talk to by the way that our voice sounds, by the way we look and so on and so forth.
And we are excited that the current generation of the voice biometrics and voice liveness capabilities can deliver the same security for interactions between people and ChatGPT-like capabilities in apps. What does it mean in practice? It means that the companies can build chatbots, the companies can build virtual assistants that can talk to people and that accept input only from the right people. So if I’m talking to my bank about my bank account, if you were to try to get into this conversation, you would be rejected. I’m really sorry. Or if somebody will get a recording of my voice and then try to play that, that would be rejected. Or if somebody would create a clone of my voice and try to play that, that attempt would be rejected as well.
So it’s just a great natural user experience. Where it comes to security, there’s no friction in the user experience if it’s the right person, but the friction arises when it’s a fraudulent attempt. And that’s the key to success here. We want to create a trusted environment where we can talk to various incarnations of conversational interfaces and we can be sure that our information, our identities are protected, but we don’t need to do a lot to get there. And we as a company, ID R&D, we strongly believe that the data that’s collected through those interactions is enough right now to deploy the latest generation of AI and establish that security, establish that voice biometric match, establish that voice biometric liveness.

Amanda Razani: So would this technology then be an enhancement with chatbots, or can you explain a little bit about how this can be used in the customer service industry?

Alexey Khitrov: Absolutely. So these can be deployed both at the virtual assistant level, when we talk to the virtual assistants, and the virtual assistants would be able to automatically establish and confirm the identity of a speaker and provide us access to our accounts or allow us to conduct transactions like buying something or transferring something and so on and so forth, and doing that through the means of conversation. And that’s a revolutionary thing, if you think about it, because right now, in order to do it, we need to jump through hoops, but this type of technology can greatly simplify the user experience while enhancing the security. And the same experience, we’re already using some of these elements right now in call centers, but it can be enhanced further for the virtual assistant and the conversational interface type of applications.

Amanda Razani: So do you foresee this technology eliminating some of those in-person jobs that it would take over?

Alexey Khitrov: I think that it’s just going to change. I was recently reading a book about the thriving horse industry in New York in the 1910s and 1920s, right before the car came up and became popular. I think about 10 or 20% of the population in New York were busy doing something with horses as transportation: feeding them or cleaning them or driving them. And then the car came about and a lot of those people just had to change and adapt to a materially different technological environment. But I think we all can agree that we’re greatly benefiting from the fact that cars were invented. So I think we’ll see a similar transition, that we as consumers, we as the people who use this type of technology will enjoy the convenience of it, will enjoy the fact that we can get to the right information, to the right transaction much quicker in a much more natural way. And then a lot of the people’s jobs are going to be transitioned to various different functions. And the world, I think, in general, is changing for the better because of it.

Amanda Razani: It’ll free up those positions for other roles.

Alexey Khitrov: Exactly.

Amanda Razani: So what are some of the concerns or key issues that you see business leaders having when it comes to implementing this technology?

Alexey Khitrov: I think one of the key concerns that we hear from the market is that GenAI and the large language models that allowed this leap in conversational interfaces to be so successful and so natural can also be used for fraudulent purposes. Just a few days ago, Joe Biden in the White House was talking about the dangers of voice cloning. So it’s not often that we hear presidents address issues like this, but it is top of mind for a lot of consumers and a lot of business leaders. So now the voice cloning capabilities can allow someone to recreate your voice based on just a few seconds of your speech. And the technology itself is actually a great technology.
So voice cloning, I think, is actually a wonderful piece of technology that can enhance our communications as humans, and I can just give you one example of where it can be used. One of my favorite applications is a plugin to one of those, I think it’s Zoom that did it, where you can speak to people in different languages, which means, let’s say I’m talking to somebody in Mexico and I don’t speak Spanish, but I speak into Zoom or a Zoom-type platform, but I speak in English, they translate it into Spanish, and then the listeners on the other end will hear my voice speaking Spanish.

Amanda Razani: Wow, that’s amazing. That would be very helpful.

Alexey Khitrov: Think about all the barriers that are breaking with the availability of the technology like this. So the technology is great, but of course, there’s a certain number of bad people that are always trying and are looking for ways to use the advancements in technology for fraudulent activities. And of course, if you’re using voice as one of the security measures, that availability of technology that can create voices that sound exactly like somebody can be exploited by the criminals for some nefarious purposes. So that’s a concern that we hear a lot. And the good news is that the current generation of technology can not only create those voices, but the companies like ID R&D have already been working on clone detection capabilities as well.
So when IDVoice is analyzing the incoming signal, we can not only confirm that this is Alexey’s voice, we can also make sure that it is a voice that came out of a human body versus a clone of a voice, even if, to a human, the voice would sound exactly the same. So the bad news is that humans can no longer, in a lot of cases, make the distinction. The good news is that the technology can.
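The two checks described above, speaker match and human-versus-clone detection, can be pictured as a simple two-gate decision. The Python sketch below is illustrative only and is not ID R&D's API: the `VoiceCheck` type, the `decide` function, the 0-to-1 score scale, and both thresholds are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MATCH_THRESHOLD = 0.80
LIVENESS_THRESHOLD = 0.90


@dataclass
class VoiceCheck:
    """Hypothetical scores a voice engine might return for one utterance."""
    match_score: float     # similarity to the enrolled voiceprint, 0..1
    liveness_score: float  # confidence the audio is live human speech, 0..1


def decide(check: VoiceCheck) -> str:
    """Accept only if the audio is both live and from the enrolled speaker."""
    # Gate 1: a replay or clone can match the voiceprint well,
    # so liveness is checked before the match score is trusted.
    if check.liveness_score < LIVENESS_THRESHOLD:
        return "reject: possible replay or clone"
    # Gate 2: live speech from a different person is still rejected.
    if check.match_score < MATCH_THRESHOLD:
        return "reject: speaker mismatch"
    return "accept"


if __name__ == "__main__":
    print(decide(VoiceCheck(match_score=0.95, liveness_score=0.97)))
    print(decide(VoiceCheck(match_score=0.99, liveness_score=0.20)))
    print(decide(VoiceCheck(match_score=0.30, liveness_score=0.99)))
```

Checking liveness first reflects the point made in the interview: a cloned voice may sound identical to the real speaker, so a high match score alone is not treated as sufficient.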

Amanda Razani: That is good. It’s funny how the technology has all these amazing pros and yet it can be turned around and used against us too with all the cons. But luckily that technology is there to detect the fakes and the frauds, so that’s good. Do you have any advice for the business leaders that are trying to go through a change management phase and implement this technology, any steps they can take during this journey?

Alexey Khitrov: I think that the key is to make sure that when you’re making the step, when you’re making these transitions, that you have both the user experience and security in mind. And this is something that we see quite a bit in the field, where some of the security concerns just come up at the end of the journey. And of course, we are happy to help all the customers, but our advice is to plan for this from the start.

Amanda Razani: What is a good length of time that business leaders can expect, from beginning to end stages, implementing this type of technology? What is a good time span to expect?

Alexey Khitrov: I think one of the most exciting things about ChatGPT and the huge shift that we’re seeing in the technology landscape is that speed to deployment, speed to production is increasing drastically. And things that would take a year from a technology perspective can now be done in weeks. So this is a significant paradigm shift. So I think technology is less of a barrier and integration is less of a barrier than some of the other organizational steps that need to happen within the organizations to allow these capabilities. So what we’re seeing is a drastic increase in the speed at which these capabilities are coming to market. And again, we think that this is a great benefit to customers because what we’re seeing is a new channel which can be utilized to get the information quickly, to get your transactions done quickly, to be done with those things that you need from your bank, your insurance company, whoever you’re dealing with, in a much faster and more secure way.

Amanda Razani: Amazing. Well, if there is one key takeaway you can leave our audience with today, what would that be?

Alexey Khitrov: I think there are a lot of concerns about the rise of AI and capabilities that come with it. We hear a lot of concerns about deepfakes, we hear a lot of concerns around voice clones. And I think that although it is very important to raise these issues, it is very important to talk about it, to be aware of these types of threats, I think it’s also important to know that the biometric industry, that the voice industry is well-equipped to address these concerns and these issues. And we believe that these concerns should not be showstoppers when it comes to the deployment of these very exciting capabilities.

Amanda Razani: All right. Thank you so much for coming on our show today and sharing your insights.

Alexey Khitrov: Absolutely. Thank you so much for having me.
