Synopsis: In this AI Leadership Insights series Mike Vizard speaks with Stuart McClure, CEO of Qwiet, about how to secure large language models.
Mike Vizard: Hello and welcome to the latest edition of the Techstrong.ai video series. I’m your host, Mike Vizard. Today we’re with Stuart McClure, who is the CEO of Qwiet. And we’re talking about how to secure all these large language models, and other types of AI models for that matter, in a world where, well, the bad guys want to target these things and do everything from poisoning them to injecting malware, and it’s a little more complicated than anybody might think. Stuart, welcome to the show.
Stuart McClure: Thank you so much, Michael. Appreciate it.
Mike Vizard: We are, again, talking about something a little bit after the fact when it comes to security; it doesn’t seem like we ever learn that lesson. But can you outline some of the challenges that folks are going to encounter from a security perspective as we go to build and deploy LLMs?
Stuart McClure: Well, obviously AI is both LLMs and predictive models, so I think you’ve got both sides of that problem with AI. But yeah, the hype right now is around large language models, so we can hit that first. I think OWASP most recently came out with a pretty good list of the top 10 machine learning security risks, which I think is a great place to start to understand what the big problems are. But let me just hit the top three, so that people understand where the real vectors of attack are in this space around LLMs.
So first off, and this comes from an attack that actually happened against my company Cylance. Cylance was a company that was acquired by BlackBerry back in 2019, and we had a full online cloud model for predicting maliciousness in executables, but we also had a local model. And what the attackers out in Australia did was a brute force attack. What they said was, well, we’re going to modify the file just slightly and then see if Cylance keeps calling it bad. And then they modified it bit by bit by bit, sending a request every single time. That’s called a brute force attack, and it’s a pretty common one. If you look at the security-for-AI companies that are coming out, they’re often anchoring around that particular threat vector, which is brute forcing of models. But you’ve also got a lot of other LLM-specific attacks, like data exfiltration.
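The brute-force pattern Stuart describes boils down to a simple query loop: perturb the file slightly, resubmit it, and keep any change that flips the verdict. Below is a minimal, purely illustrative sketch in Python; the `is_flagged_malicious` call is a hypothetical placeholder, not Cylance’s or any real product’s API.

```python
import random
from typing import Optional

def is_flagged_malicious(sample: bytes) -> bool:
    """Hypothetical placeholder for querying the target detection model."""
    raise NotImplementedError("stand-in for the classifier's verdict")

def brute_force_evasion(sample: bytes, max_queries: int = 10_000) -> Optional[bytes]:
    """Flip one byte at a time and re-query the model, keeping the first
    mutation the classifier no longer flags. Each iteration is one request,
    which is why rate limiting and query monitoring are common defenses."""
    candidate = bytearray(sample)
    for _ in range(max_queries):
        index = random.randrange(len(candidate))
        original_byte = candidate[index]
        candidate[index] ^= random.randrange(1, 256)  # small perturbation
        if not is_flagged_malicious(bytes(candidate)):
            return bytes(candidate)  # verdict flipped; evasive variant found
        candidate[index] = original_byte  # revert and try another byte
    return None  # no evasive variant found within the query budget
```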
So as you allow your employees to gain access to OpenAI or Bard, or anything else, and submit information around the question they’re trying to ask, all of that information is stored and can sometimes be retrieved by other people who shouldn’t have access to it. So that’s a data disclosure or data privacy issue. But you can also do things like poisoning that data. You can get insider information out of that data. There were a bunch of attacks that occurred around looking for internal IP addresses of a specific company or an allowed domain. So a lot of what a customer provides to OpenAI can now be pulled out by others. Now, OpenAI and others have done a lot of work to try to prevent that kind of thing, and I feel like they’re getting better at it, but I think the biggest problem that we have with all of these LLMs is the AI hallucination problem.
And what that means is really simple. My simple example: when the OpenAI model first came out, when GPT-3 came out, I asked it, hey, who is Stuart McClure, the cybersecurity expert, tell me a little bit about the person? And it came back with: Stuart McClure died in May of 2021, my condolences to you and those who knew Stuart. He was a great cybersecurity expert. And so I said to it, well, I don’t think that’s possible. I’m right here and this is me, and I’m Stuart McClure. And it said, oh, sorry, we’ll learn from this. Okay, so that’s fine. Month after month after month, I kept asking the same question, who is Stuart McClure, cybersecurity expert? Same answer every time, until about three weeks ago in September, when it finally said, Stuart McClure is CEO of Qwiet, running AI and machine learning models inside of cybersecurity, et cetera, et cetera. And so it got it correct.
The truth is, they built that model, I think, in September of last year, and it had obviously been trained on data about either another Stuart McClure or simply couldn’t make the connection, and it gave false information. And this happens a lot. I would venture to say it happens maybe 1 out of 10 times, 2 out of 10 times. But because it’s so human-like in its response, we sort of accept the imperfections of the answer and we take it as truth, because it’s in a position of authority. So I think this AI hallucination problem is a big problem, and we see it right now in, for example, code generation. Source code generation is part of Qwiet. All these LLMs are built, but they’re built on insecure code. So if you’ve built a machine learning model on insecure code and you ask it to code for you, what do you think you’re going to get, secure code or insecure code? You get insecure code. So that’s why we have to really be careful around a lot of this work.
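To make the insecure-training-data point concrete, here is the kind of suggestion a model trained on typical public code might plausibly produce for a user lookup, next to the parameterized version a reviewer would want instead. The snippet is illustrative and not taken from any particular model’s output.

```python
import sqlite3

# What a model trained on insecure examples often suggests: user input is
# concatenated straight into the SQL string, which is a SQL injection flaw.
def find_user_insecure(conn: sqlite3.Connection, username: str):
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

# What the same request should produce: a parameterized query that treats
# the input as data rather than as SQL.
def find_user_safe(conn: sqlite3.Connection, username: str):
    return conn.execute("SELECT * FROM users WHERE name = ?", (username,)).fetchall()
```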
And of course there are countless other types of attacks with large language models that we could talk about, but the bottom line is there are big ones and small ones to worry about across the board.
Mike Vizard: Well, I’m glad to see that rumors of your demise have been greatly exaggerated.
Stuart McClure: Thank you.
Mike Vizard: I also think part of the issue is the answers you sometimes get. I mean, you were getting a consistent set of answers regarding your living status, but a lot of times around code, I’ve noticed that when you ask for a piece of code, you’ll get different instances and different types of code back, and it’s not consistent, and some of it has vulnerabilities and some of it doesn’t. So I feel like this consistency issue is being overlooked, because what you get for one answer and what I get for another, or for that matter what I get tomorrow for the same question, can be radically different and have different levels of security issues.
Stuart McClure: I think that’s a big problem as well. It’s ubiquitous and I don’t see it going away anytime soon. For one, based on the timeline, you could ask the same exact question and prompt and get a different answer depending on the model build. That’s why all of these big companies building LLMs have to freeze the model and can’t update it any further, and that’s why you saw that report of my demise for a year, because they hadn’t updated the model in a year. So you’re going to see a lot of that, but you’re also right that, based on the prompt, you make one change in one word and it could give you an entirely different set of code to use in your software. So it’s very fragile. It’s very brittle, as we say. So I don’t know how much we can really use, at least en masse, these LLM code generation solutions without a lot of human intervention and oversight, and making sure that what you’re getting is actually what you asked for.
So I would say it’s not super helpful for those junior developers, but it’s probably super helpful for the more advanced developers who know what they need to build before they ask a GPT model.
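On the consistency point, about the only levers a team has today are pinning a dated model snapshot and turning down sampling randomness. Here is a minimal sketch, assuming the OpenAI Python SDK (v1.x); the model name is just an example value, and none of this fully eliminates run-to-run variation.

```python
from openai import OpenAI  # assumes the openai v1.x SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_code(prompt: str) -> str:
    """Request code while pinning what we can: a dated model snapshot
    rather than a moving alias, and temperature 0 to reduce sampling
    randomness. Output should still be reviewed for security issues."""
    response = client.chat.completions.create(
        model="gpt-4-0613",  # example dated snapshot, not a moving alias
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```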
Mike Vizard: One of the things we have seen people do is try to use a vector database to teach the model something new and different. And so they’ll show it some code and say, this is good code, and they’re hoping it’ll generate some code back that’s more, shall we say, viable. But it seems to me that that process is also prone to abuse, like any other workflow process. So how do we make sure that the stuff going into the vector database is relevant and doesn’t wind up making the model worse?
Stuart McClure: And this is the problem with a community-based model, in my opinion. What you’re learning from in a community-based model is the community, and that’s an incredibly diverse set of data you’re trying to build these models on, and it can obviously be polluted in many different ways. That’s why I believe some of the most successful implementations of these vector databases feeding into the models are going to be on-premises or customer-specific. So let me give you the example of Qwiet today. The technology we use is based on a graphing technology called code property graphs. What we do is take code and basically create layers of data around that code, so we understand all of the contextual elements of the code. We’ve been doing this for six-plus years now, but what it now allows us to do is actually train a model on the very first code property graph that you created, which is obviously going to have vulnerabilities all along the way, and on the final code graph that you pushed, which is going to have all of those vulnerabilities gone or mitigated.
And it’s the difference between those two, the first, vulnerable one and the last, fixed one, that we can learn from: how did you make the changes you made to make it less vulnerable? And it’s that sort of learning model that we’re going to be able to offer to customers so they can fix the problem the way they fix it, not the way the rest of the world fixes it, because the rest of the world doesn’t know the nuances of your code; they don’t know the multiple levels of elements, and contingencies, and libraries, and dependencies, and everything else it requires. So I think there is a near-term application for a lot of this stuff in and around specific tenants, customer tenants for example. And I think we can focus there for now and see how that builds out into a broader, more generalized model. But I think a customer-specific model is absolutely the perfect place to start with a lot of this stuff.
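Qwiet’s code property graph pipeline isn’t public, so what follows is only a loose conceptual sketch of the idea of learning from the delta between the first, vulnerable version and the final, fixed one: pairing each finding’s before and after snippets so a model can be trained on how that particular team remediates issues. All names and fields here are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RemediationExample:
    """One training pair: how a team actually fixed a finding in its own code."""
    vulnerability_id: str  # e.g. a CWE identifier for the finding
    before: str            # snippet from the first (vulnerable) version
    after: str             # snippet from the final (fixed) version

def build_training_pairs(findings: list[dict]) -> list[RemediationExample]:
    """Turn findings that carry both a before and an after snapshot into
    supervised examples a fix-suggestion model could learn from."""
    pairs = []
    for finding in findings:
        if finding.get("before_snippet") and finding.get("after_snippet"):
            pairs.append(RemediationExample(
                vulnerability_id=finding.get("cwe", "unknown"),
                before=finding["before_snippet"],
                after=finding["after_snippet"],
            ))
    return pairs
```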
Mike Vizard: Do you think the bad guys are onto this? Are they kind of figuring out how to exploit this stuff or is it still early days?
Stuart McClure: I think it’s still early days, but just like anything, it’ll probably take a year or two. The bad guys are fully funded and they can absolutely start going after a lot of these attacks, but it’s not mainstream yet, not by any stretch. Why? Well, because it’s too easy the old-school way. You don’t even need to get into large language models and hack them up. You just hack up an iPhone with the simple zero-days that come out every single week, or you hack up an Android or a Windows computer. I mean, it’s still like shooting fish in a barrel out there.
Mike Vizard: Who’s going to be in charge of this security issue? Because I feel like if I look at the models today, they’re built by data scientists, and we’ve had this issue with developers and security for more years than anybody cares to admit. But the data scientists know even less about security. So who’s going to step up?
Stuart McClure: That’s a great question. I mean, the government is always going to stomp its feet and raise the flags, but it’ll never enforce. It’s too scared of private-industry fallout. The data scientists have never been trained on cybersecurity, or very few have been. So they’re not aware of the threat vectors they need to consider as they build their models and the infrastructure around them. So really it’s going to fall on private industry, and I think that’s why the cybersecurity marketplace and the industry itself are not going to go away anytime soon. You have a lot of people trying to solve these problems in small and big ways, and the burden falls on them to come up with innovative technologies that can be broadly adopted.
Mike Vizard: Is there going to be regulation in this space, and will it be meaningful, to your earlier point? Sometimes I feel like the people who write the regulations have no idea how the thing actually works.
Stuart McClure: Exactly. I mean, the entities are pretty much all well-intentioned and good people, for sure, but the rules often have no teeth to them. It’s maybe one step above guidance: well, this is how we think you should build models, or this is how we think you should implement them or make them accessible. But there are no specific details around it. That’s one of the big challenges. I’ve been up on the Hill twice and presented to Congress, in the House, around cybersecurity and embedded systems, for example. And everybody’s excited about preventing the next cyberattack on a 10-ton truck, but no one wants to put anything in detail, because, well, you’re going to offend some company that depends on that truck not having the security control. And so it’s just a very tricky problem set to solve for when it comes to prescriptive guidance.
So you’re going to get this general guidance: yeah, you should do the CIA model, and you should do this model, and things like that. But in regulated industries, for example finance and healthcare, you’ve got well-defined, structured processes, but a lot of those are qualitative. They’re me asking you a question: well, do you have a firewall, Mike? Oh, great. Do you have filtering rules on the inbound interface of your firewall? Yes. Oh, okay, great. Check, check. You’re done. You pass. But very few of those regulations actually go in and say, well, show me the filter rules, dump the filter rules for me, and how often do these change, and who controls them? That kind of thing. These are the prescriptive requirements that any cybersecurity practitioner worth their salt knows to go look for and ask for, to understand the current state of cybersecurity for that particular customer. And it’s just going to be incredibly difficult to get regulators to be that prescriptive, but they should be, in my opinion.
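To illustrate the gap between a qualitative checklist answer and the prescriptive “dump the filter rules for me” check Stuart is describing, here is a small sketch that inspects an actual Linux iptables ruleset rather than taking “yes, we have a firewall” on faith. It assumes a host with iptables installed and enough privilege to read the rules.

```python
import subprocess

def input_chain_defaults_to_drop() -> bool:
    """Prescriptive check: read the real INPUT chain rules and verify the
    default policy is DROP, instead of asking whether a firewall exists."""
    result = subprocess.run(
        ["iptables", "-S", "INPUT"],  # dump the INPUT chain ruleset
        capture_output=True, text=True, check=True,
    )
    first_line = result.stdout.splitlines()[0].strip()
    return first_line == "-P INPUT DROP"  # default-deny inbound policy
```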
Mike Vizard: So what’s your best advice going forward, and who ultimately will be in charge? And how do we get at this in a way that’s maybe a little more sophisticated than just locking everybody in a room and hoping for the right answer?
Stuart McClure: I think the leaders in the space, the Googles, the OpenAIs, the Microsofts, these are the folks that really need to set the standard: the standard for how you build secure models, how you protect those models, and how you prevent them from being manipulated to malicious ends. Because there are defined processes you can take with all of it, but it takes the big guys to really come forward with a set of standards that all of us can follow and be guided by. Now, of course, only those that have the capacity and the checkbook can follow those sorts of standards and guidelines, but at a minimum you’re going to take a chunk out of the problem, and we’ll get to a better place around these models being more secure.
But ultimately, I think we all have to think about the prevention angle. You learn as a kid that an ounce of prevention is worth a pound of cure, and in the cybersecurity industry, 98% of every hour that goes into our work is all cure. It’s all cure. It’s detecting that an attacker got in, and then it’s responding. How do we clean up? And then how do we fix it and prevent it going forward? Prevention is the only game of any value in this space, but 2% of our energies are focused on it. We have to take a more preventative approach and ask the question, why. Go back to your kids’ three- and four-year-old days, when they’d ask you, well, why do I need to go to lunch? Why do I need to go to swim class? Why do I need to do this? Why? Why? Well, ask the question: why did this happen? Tell me why. Not just at the surface level, but why did the original problem come to be, and how could we have prevented that original root cause of the problem? Because I am telling you, 100% of the time you can find it, and 99.99% of the time, you can put something in place that’ll prevent it.
Mike Vizard: All right, folks, you heard it here. No matter what it is, including AI, there’s always a root cause; you just have to figure out what that is, what it’s about, and how to prevent it. Because I will tell you the one thing that is certain: the only thing more challenging than building AI models is fixing them after they’ve already been deployed in production environments. Hey, Stuart, thanks for being on the show.
Stuart McClure: Thank you, Mike. Take care.
Mike Vizard: All right, and thank you all for watching the latest episode of Techstrong.ai. You can find this and other episodes on our website. We invite you to check them all out. Until then, we’ll see you next time.