Synopsis: In this episode, Mike Vizard speaks with Himanshu Shukla, CEO of LightBeam.ai, and PD Prasad, its chief product officer, about raising $17.8 million in funding for a platform to manage data security, data privacy and all the things that come along with AI and data.
Mike Vizard: Hello, and welcome to the latest edition of the TechStrong AI Leadership video series. I’m your host, Mike Vizard. Today we’re with Himanshu Shukla, CEO of LightBeam.ai, and we also have PD Prasad, who’s chief product officer, and we’re talking about them raising $17.8 million in funding for a platform to manage data security, data privacy, and all the challenges that come along with AI and data these days. Gentlemen, welcome to the show.
PD Prasad: Pleasure to be here, Michael.
Himanshu Shukla: Pleasure to be here, Mike. Thank you for having us here.
Mike Vizard: All right. Himanshu, what exactly is the challenge we’re looking at? We’ve been talking about data security and data privacy for as long as anybody can remember, so how is AI changing the equation?
Himanshu Shukla: Very good question, Mike. In this era of generative AI, everyone is trying to extract information, or actual knowledge, out of the customer data they have gathered. And in doing so, the guardrails around privacy have been totally omitted. You just take customer data and start extracting information out of it, and a lot of sensitive data may bias the model being trained. That requires new thinking and a new platform for looking at the data. And that’s where LightBeam kicks in: it not only secures the sensitive data you’re using, but also ensures that your model is not getting biased by age, gender, or other sensitive information. When you ask questions of the AI model, it still gives meaningful answers, but it is not biased because the data was never cleaned up.
Mike Vizard: PD, what’s at risk here? Because I think people have it in their head that somehow or other an AI model is just another software artifact. If we run into an issue, we’ll find some way to patch it, but the reality is you kind of have to retrain the whole thing anytime there’s an issue, and that sounds very expensive. So do people get that? Do they understand what’s really going on here?
PD Prasad: Well, I guess they understand that there’s a lot of opportunity and potential. We are talking to customers, and in fact they are having board conversations about how they can leverage AI to increase their market share and extend their software capabilities. I think they have some understanding that this is different from plain, simple software. I’ll give you a simple example. Traditional software gets built, it learns in the laboratory, for example, gets deployed, and doesn’t change at all. AI, on the other hand, actually learns in the real world, which means it might have left the factory, the software factory so to speak, in some shape, but as it starts to work on your data, it starts learning from your data. Which means that if your data is biased, well, guess what, that bias is going to get perpetuated, which is obviously not good. And that’s how you might’ve heard this phrase that ChatGPT is getting dumber over time.
Well, that’s because it’s actually learning on data. It left the software factory nicely, but it’s now getting dumber. The same holds true for other AI models as well. And that’s where what we have been doing comes in. We have been saying, “Hey, use whatever model you want to use, but let us take care of your sensitive data. What kind of data do you have? Do you know what kind of data you are training your model on? That way you can customize and continuously train your model on data that’s relevant, that’s representative of your customer segment, and not have it biased.”
Mike Vizard: Himanshu, who’s in charge of this aspect of the whole thing? Is it the data science team, or the security people showing up, or is it the GRC people? Who’s taking the lead on making sure that the data is correct?
Himanshu Shukla: Today, organizations haven’t really figured out who’s ultimately responsible for this, so the ball gets thrown from one place to another. There is the chief data officer, a new role that has been evolving for the past couple of years, who ensures that data gets democratized across the organization. Then there is the security officer, who is also looking into the security aspects of the sensitive data. So the way organizations have been functioning today is that the CDO and the CISO coordinate amongst themselves. The CISO is the one primarily responsible for data not leaking across the organization, while the CDO is the one ensuring that things are shared within the organization, but the CDO is not really aware of the privacy aspects. So the CISO, or the legal team, typically educates them on how this data has to be handled, and the legal team also gets involved in the whole conversation. Organizations still have to figure out who the one person responsible for the end result is; that is not yet sorted out cleanly.
Mike Vizard: PD, to that point, are people coming full circle here? I mean, they’re kind of aware that there’s a data quality issue and then there’s a data security issue. And what they’re discovering is, well, we never had good data management practices in the first place. So is this all coming home to roost? And at the core of this whole mission is figuring out how to retrofit those best practices on top of what we’re doing.
PD Prasad: I guess it is, in a lot of ways. A lot of organizations, for the last 15 years I would say, have been struggling with how to handle, or even get a sense of, what sensitive data they have. And frankly, the technologies that have been there in the past, and actually Himanshu, myself and our other co-founder all worked on those technologies, the tech stack was just not mature enough, not capable enough. With AI, we are seeing an amazing capability where we can actually do things that were just not possible even five years back.
And so it is in some sense coming full circle. It’s the same starting question: what data do I have? Where is that data? Who am I sharing that data with? Do I have biased data? Is the data I’m training my AI model on representative of my customers? I guess the last question is a new one, but the first four are pretty much the same questions people have been grappling with for the last 15 years. It’s just that now, with the newer AI technologies we are leveraging, we are able to provide reasonably good answers for customers to get started with.
Mike Vizard: So Himanshu, am I in effect going to need to use AI to manage the data that I’m using to build the AI models? And so this kind of process becomes a little bit full circle, I mean, what is the mechanism by which I apply some level of adult supervision over the data that’s being used in the AI model?
Himanshu Shukla: So this is an evolving thing. I don’t think there is any system that has been defined; I haven’t seen an organization handling the data in a systematic, well-defined process. Ideally, it all starts with which model you pick. Does the model itself have holes where it is going to get biased automatically because of the way you designed it? Then the second layer that kicks in is the data you’re going to train the model on. You should have visibility into that data. Does it have private information? Does it have sensitive information, or is it biased? For example, if you train your model only on white males, the answers you get to the questions you ask will be very, very biased, because you’re only picking up data that is representative of one slice of the overall scheme of things.
So that is the second problem. And once you have put the model in, the data keeps evolving continuously. What you typically end up doing is feeding more and more data to this model, and the model keeps evolving. The model is never stationary, and that’s what PD mentioned: in this new world, once things have been shipped out of the factory, it’s not like the model stops evolving. You keep feeding more and more data to this particular model and it keeps evolving. So you need a continuous loop: every time the model has evolved, you need to see whether things were biased or not biased and what kind of data you were managing. You need a good catalog of the models as well as a good catalog of the data you’re feeding to those models.
That’s the prerequisite for making it happen. And before you feed in those things, there’s another piece, the part you mentioned about data scientists, who are getting information from different places. They also need to be in that loop, because you don’t just feed the data sitting in a SQL database to these models. You have to extract some information and then feed it to the models in some form factor. That whole end-to-end lifecycle management, as I would call it, has to be built in for every organization going forward. This was not a challenge until 2022 or 2023, when organizations started looking into it, and in 2024 and 2025 it has become a mandate for every organization to have some AI story as they build the organization. So this will become a much bigger problem in 2024 and 2025.
Mike Vizard: PD, to Himanshu’s point, how long will it be before the compliance people and the regulators catch up to all this? And should I keep an eye on that? Because I feel like a lot of folks are going down a path that they may wake up one morning, two years down the road and somebody’s going to be knocking on the door saying, “Buddy, you got to pull that AI [inaudible 00:10:29].”
PD Prasad: It is going to take a little bit of time, I think; organizations do have a little bit of time. But based on what we saw with privacy regulations like GDPR and the California Privacy Rights Act, legislators are aware of this. There is already President Biden’s executive order on AI, which says how AI systems should be used and when you cannot use an AI system to make certain automated decisions. Some of them are very, very clear: for example, you cannot decline someone a mortgage based on completely automated processing by an AI system. So the regulators are keeping an eye on that.
It is a double-edged sword, though, because the more regulation comes up in any area, and it is certainly true of AI, the more likely it is that only the larger players will be able to meet those regulations, which means you are in some way throttling innovation from the smaller players. Europe, as you very well know, follows a super regulatory approach, and some of that is really good. But here in the US, I believe we try to follow a middle path, where AI needs to be regulated but regulation should not come in the way of innovation, particularly from startups and small companies.
So from my understanding, from what we are seeing, California is already looking at ADMT, automated decision-making technology. President Biden’s executive order is already out there. New York is looking at something similar around the use of AI and what you may or may not use AI for. And then disclosure requirements are another piece we are looking at: if you are using AI, can you start with disclosure? What kind of model are you using? What kind of data have you been training your AI on? Start with disclosure. I think sunlight, as our name LightBeam suggests, takes care of a lot of challenges, I would imagine.
Mike Vizard: Otherwise known as the best disinfectant. Hey, Himanshu, what’s your sense of where we are going to be this year versus next? And is this the year that we spend more time trying to figure out how to operationalize AI versus benefiting from it? Is it going to be more of a let’s-get-our-collective-acts-together kind of year, and then we see all the ROI in 2025, or are we moving full steam ahead?
Himanshu Shukla: Yeah, the way I see it, this has been very typical of the way the US has operated. First you try to get the benefit out of things, and then you think backwards about what could have been done to make things better. But just adding to the last point, I think there will also be a lot of education needed for the people who are making the laws, because at times they might go totally off track. As PD was mentioning, somewhere I was reading that they are going to an extreme where you have to get the model certified even before using it, which is a big, big overhead for any organization, Mike. So that’s an extreme.
Although Europe has traditionally been really conservative, the lawmakers there are much better educated, and that’s what they showed with their GDPR leadership too. But here I think there is a big opportunity for US legislators to define things in a way that is reasonable and strikes the best balance. But coming back to your question, I think this year people will move forward with trying to use AI, and then they will realize, look, it can harm us in different ways, and what are the ways we can control it? Organizations like us can start showing them, and that’s where LightBeam is geared: why don’t we start defining the fundamental pieces with which you can use the technology in a responsible way? That’s where we have been putting in a lot of effort, to help organizations build these smaller building blocks so that they are not caught off guard in 2025 when all those regulations kick in.
Mike Vizard: All right. So PD, you’ve been at this for a while. What’s the one thing you see organizations doing, as it relates to data management, security and privacy, that just makes you shake your head and go, “Folks, we need to be better than this”?
PD Prasad: Picking one thing is hard, because there’s quite a list of things, obviously. Perhaps the one thing that I’ve seen, and that we’ve all seen at LightBeam and with related technologies, is that people are sometimes focused on the wrong problem statements. I’ll give an example. Obviously, as a company we have a lot of sensitive data; every company has a lot of sensitive data. And the question is, are you going to be focused on internal exposure of that sensitive data? That is a problem, and I’m absolutely on the same page about that. But the bigger problem is external exposure, because, hey, if you hired an employee through all the interviews and they’ve gone through your governance trainings and so on and so forth, not that they can’t be bad apples, but they’re still your employees and they’re still protected by your network. Yes, they can do some insider stuff, so it’s a risk.
But I routinely see that while organizations are aware of insider risk, they completely give short shrift to outsider risk: what sensitive data is actually leaking from your organization, what sensitive data is going out of the organization. Maybe it’s a personal thing, but that boggles me a lot. We are focused on helping organizations understand their outsider risk. What sensitive data are they sharing outside, with whom are they sharing it, whose data are they sharing? And the same goes for AI models. If you’re using the public version of ChatGPT, well, guess what, a lot of your sensitive data is probably going out to ChatGPT; case in point, Samsung. So that outsider risk with AI is actually becoming even more pertinent, I would say. So that’s the one thing from my side.
Mike Vizard: All right, folks. Well, you heard it here. On the one hand, it’s important to look before you leap. On the other hand, you don’t have forever and you’ve got to get started someday. So better to understand your exposure and what’s happening now than to try to catch up later. Gentlemen, thank you for being on the show.
PD Prasad: Been a pleasure, Mike. Thank you.
Himanshu Shukla: Thank you.
Mike Vizard: And thank you all for watching the latest episode of the TechStrong.ai video series. You can find this episode and others on our website. Check them all out. Until then, we’ll see you next time.