Synopsis: In this Techstrong AI Leadership interview, Originality.ai CEO Jonathan Gillham explains how tools capable of identifying content created by a GenAI platform will be incorporated into workflows.

Mike Vizard: Hello and welcome to the latest edition of the Techstrong.ai Leadership Series. I’m your host, Mike Vizard, and today we’re with Jonathan Gillham, CEO of Originality.ai, and we’re talking about, well, how to detect content that’s been created using a generative AI engine. John, welcome to the show.

Jonathan Gillham: Yeah, thanks for having me. Excited to be here.

Mike Vizard: There have been a lot of claims in this space. Some have proven to be not so good and others are still being tested, but ultimately what distinguishes the players in this category, and how does the whole thing work?

Jonathan Gillham: Yeah, so you’re correct that there’s been a lot of noise in this space around how well AI detectors work, with some people saying they don’t work at all. The question becomes: what level of accuracy is necessary for them to be effective? The use case within academia, where you need a 0% or near-0% false positive rate, is a tricky one, because AI detectors are themselves AI: predictive engines that estimate whether content was human generated or AI generated. Like all AI predictive engines, they’re not a hundred percent accurate; they have false positive rates. In our case, in our testing across hundreds of thousands of documents, we’re in that one and a half to three and a half percent range, which is very useful for content marketplaces, content agencies and people hiring writers who want to know the work wasn’t just copied and pasted out of ChatGPT.
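
To put that false positive figure in concrete terms: a detector is effectively a binary classifier, and its false positive rate is the share of genuinely human-written documents it wrongly flags as AI-generated. The Python sketch below uses invented toy data, not Originality.ai’s tooling or test results; it only illustrates how that percentage is computed on a labeled set.

```python
# Toy illustration of how a detector's false positive rate is measured.
# Each record pairs the ground-truth label with the detector's verdict.
# The data is invented; real evaluations run across far larger labeled corpora.
records = [
    ("human", "human"), ("human", "human"), ("human", "human"), ("human", "ai"),  # 1 false positive
    ("ai", "ai"), ("ai", "ai"), ("ai", "human"),                                  # 1 missed AI document
]

human_docs = [r for r in records if r[0] == "human"]
false_positives = [r for r in human_docs if r[1] == "ai"]

fpr = len(false_positives) / len(human_docs)
print(f"False positive rate: {fpr:.1%}")  # 25.0% on this toy set; Gillham cites 1.5%-3.5% at scale
```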

Mike Vizard: To what degree is it okay to use content created by these platforms versus requiring people to have written it themselves? Is it just a question of validating what was created by the machine in the first place before I incorporate it? Or are people just perhaps being overly aggressive in their copying and pasting?


Jonathan Gillham: I think it depends on the use case. I use ChatGPT all the time for my own writing. It’s great. It helps. I prefer to communicate in spreadsheets, and ChatGPT really helps me communicate in words that make sense. It’s a phenomenal tool. It’s about the transparency and honesty of being able to publish with integrity, and to know whether this content was generated with human oversight or just copied and pasted out of ChatGPT with no fact checking. One of those is great; the other is a concern. And if you were to ask people whether they would rather read an online review generated by an AI or one written by a human, most would choose the human review, not the AI version.

Mike Vizard: And that transparency extends to other use cases. Let’s say I’m a lawyer and I’m using ChatGPT to summarize case law or something like that. I might want to note that in the presentation to the judge, because then the judge is going to ask, well, did this thing hallucinate or not? And that matters to the case. Then there are other instances where, to your point, I might have a bunch of spreadsheets that I’m just summarizing, and I just need to get the gist of the thing; I don’t necessarily need to cite chapter and verse and be a hundred percent accurate. Are we as a society going to be able to make those distinctions as we go along? How long will it take, and where are we on this adventure?

Jonathan Gillham: Yeah, I think you’re exactly right that it’s a societal question about what the accepted norms are going to become, and I think it’s going to be nuanced. In the medical field, if your doctor is using AI to summarize notes and providing those to you, that could be incredibly valuable, because you’re getting the right information communicated to you in layman’s terms. But you would also want to know that you need to dial up your BS detector, because AI was used in the creation of this document.
And so I think it’s going to become a real societal question that many different industries and professional organizations are going to have to address: communicating when and where AI is and isn’t allowed to be used, and requiring disclosure of that use. Are we going to have tools that will police whatever societal rules get put in place? I don’t think we ever will. Watermarking is going to be challenging to enforce, and it’s going to be very hard to achieve proof of the use of AI in text creation. With image creation we might have a better shot at it, and video creation is similar, but for text that genie is out of the bottle, and it’s going to be very hard to have absolute proof of when it has and hasn’t been used at academic-disciplinary thresholds of certainty.

Mike Vizard: Do organizations need to come to some decision about this themselves? Even in academia, to your point, I’ve interviewed folks over the past few months, and one was appalled by the use of AI and was going to eliminate essays and tests altogether because they felt they couldn’t police it. They were like, well, I have other ways to test the knowledge of these students, so I don’t need to rely on the essay, and if I don’t know who wrote it, I don’t want it.
On the other extreme, another teacher was over the moon about generative AI. She was teaching in, I think it was Baltimore, and she said, I have a lot of students who can’t articulate their ideas because they don’t have the writing skills, and generative AI makes them more productive in the sense that they’re able to share an idea, because they’re not natural-born writers, and there are many people who are not. So along that spectrum, do we as corporate cultures, and even in academia or wherever it’s going to be, need to come to some sort of agreement about what we want?

Jonathan Gillham: And I wonder where we’re going to land. To your question, yes, each organization is going to need to figure out what the right fit is for them, and it’s not necessarily always the same. For example, in our organization, our AI research team is based overseas and they are non-native English writers. They submit a lot of their reports having been AI generated, and that makes it onto our website: they’ve written it, it’s been translated, we review it. Whereas for a lot of our other marketing writers, who are native English speakers, we expect no use of AI; that’s our policy from a Google-risk standpoint. So I think some social norms and some professional norms will emerge, but ultimately it’s going to be up to every organization to determine what their policy will be. And then that policy will have some component of enforcement, even if it doesn’t produce a perfect outcome for every test.

Mike Vizard: And the quality of the experience varies greatly. I bring this up because I’ve also seen people use a series of prompts, sometimes 10 deep, to come up with an outcome they’re happy with, and that’s great, but somebody who actually knew something about the topic would be able to create that same piece of content in about a minute, versus going through the whole exercise of crafting all the prompts to get the machine to create it and then spending time fine-tuning whatever the machine came up with. So I think we as individuals will need to decide: when is it simpler to just do it ourselves versus rely on the machine and then spend all the time and effort tweaking whatever the machine came out with?

Jonathan Gillham: Yeah. There’s an interesting study that Boston Consulting Group and Harvard did. They gave a group of their management consultants access to generative AI for completing tasks that reflected what they generally do in management consulting, and they split the group into low-experience and high-experience consultants, or low domain knowledge and high domain knowledge. In the control group, the performance was significantly different between the high-domain-knowledge and low-domain-knowledge consultants. When both groups were given access to generative AI to complete those tasks, the gap greatly, greatly narrowed. The high-domain-knowledge consultants still outperformed the low-domain-knowledge ones, but the gap in performance was close to eliminated. I think that speaks to how some people will use AI to level the playing field, like the teacher in Baltimore and her students, while the experts will still continue to outperform, just now with access to AI.

Mike Vizard: John, we’re talking about this in the context of where AI is today, but the machines keep getting smarter. So what’s your expectation of where we’re going to be down the road? If I look at something like ChatGPT, it’s a general-purpose model built on a lot of data that was, shall we say, casually vetted and thrown up into the platform. If I have LLMs that are carefully vetted with the right data, will I not get better outcomes? And as such, my experience with these platforms should improve.

Jonathan Gillham: Yeah. From an AI detection standpoint, when GPT-10 comes out, will we still be capable of detecting it, or will it be too powerful? It’s a pretty tough area in which to make predictions. I think there’s a case that there will be no chance of detection, and I think there’s also a case that detection will continue to be usefully effective.

Mike Vizard: Is there a danger to the way we learn? I’m asking because if the machines keep spitting back what we put into them, doesn’t it eventually become somewhat circular, and don’t we miss the opportunity for some original thinking? Do we need to be careful about how we use these machines in the context of innovation and actual original thought?

Jonathan Gillham: Yeah. My view is that AI will do a phenomenal job of steepening the learning curve up to the level of knowledge that currently exists. But if you’re doing frontier-level research, generative AI is an aid that improves your capabilities by a smaller percentage than if you’re new to the field. So my bullish case is that it’s going to bring people up to the frontier level of knowledge in their domain far quicker, and we’ll then have far more people working at the frontiers of human knowledge rather than spending their time ramping up to that level of understanding. As a second-order effect, I think it’s going to increase humanity’s ability to generate new knowledge by having more people working on the frontier of that knowledge.

Mike Vizard: Will it raise the bar for what we might previously have thought of as an entry-level job? The things we consider entry level today, whether it’s in law or whatever field it may be, will not look anything like they look today, and everybody will have to up their game just to join an industry, because the common base of knowledge will be in the machines.

Jonathan Gillham: Yeah, I don’t know. I think it’s going to carve out the middle. Because AI has to learn from what already exists, I think there’ll still be space for the people working at the frontier. And because AI is a tool that advances people’s learning curves so rapidly, I think new people can join and be very productive. The people who are going to be most challenged are what you might call the journeyman knowledge workers, who rely on their 10 years of experience to do tasks they’re quite good at, and who can justify their salary off of that 10- or 15-year learning curve.
I think the value of those 10 or 15 years is going to shrink with new entrants. I don’t know, any prediction here is going to age poorly, but I think it’s going to be attractive for companies and organizations to hire junior employees, give them the power of the AI, and then have a few, call them frontier-level or expert-level people, overseeing that, as opposed to a large number of mid-career, experienced individuals.

Mike Vizard: We’ve been talking mainly in the context of text, but we’re also seeing that the machines can now be given text and they’ll create video, and we’re seeing the same on the audio side. Can we detect that, and what are the implications?

Jonathan Gillham: Yeah, so our tool does not cover those; we have not researched the capabilities of image, audio and video detection. We have tested some detectors in the marketplace, and with the dataset we used they were 75 to 90% accurate with a pretty high false positive rate, 10 to 15%, so less effective than text detection. The societal harm that could be caused by AI-generated images, text and audio being passed off as factual could be significant, and so I think there will be a large regulatory push, from a legal standpoint, from the laws of the country, to ensure that labeling of AI-generated image, audio and video is enforced. We’re seeing the start of it with some letters out of the White House. Detectors will not be up to the task, and the societal harm from those formats is greater than from text alone.

Mike Vizard: So what’s your best advice to folks, based on where we are and at least what we can see a little bit over the horizon? Should they be embedding some sort of detection capability all across their workflows, or are there smaller measures to be taken? Where do we begin, and where are we going?

Jonathan Gillham: Yeah, I think it’s nuanced for every company. A lot of our customers are happy to pay a writer a hundred dollars, a thousand dollars for a piece of content; they’re not very happy to find out that piece of content was copied and pasted out of ChatGPT in 15 seconds. A large percentage of our use is in that context of somebody hiring writers. So we think transparency should be required throughout an organization on when individuals have or haven’t used ChatGPT, and then the enforcement side, the checking side, should be implemented at whatever threshold makes sense. We try to provide a lot of open source tools that let potential users test their own dataset: they can load it, run our tool and get a full statistical analysis to understand the efficacy of our detection on their data, so they can make the right decision on where detection should be implemented as part of their overall AI strategy.
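
As a rough sketch of the self-testing Gillham describes, and without assuming anything about Originality.ai’s actual API: given a labeled dataset and any scoring function that returns an AI-likelihood between 0 and 1 (the `score_text` callable below is a hypothetical stand-in), the efficacy statistics come down to accuracy, false positive rate and false negative rate at a chosen threshold.

```python
# Hedged sketch: benchmarking an AI-content detector on your own labeled dataset.
# score_text is a hypothetical stand-in for whichever detector you are evaluating;
# it is assumed to return an AI-likelihood between 0.0 and 1.0.
from typing import Callable, Iterable, Tuple

def evaluate(samples: Iterable[Tuple[str, str]],
             score_text: Callable[[str], float],
             threshold: float = 0.5) -> dict:
    """Compute accuracy, false positive rate and false negative rate.

    samples: (text, label) pairs, where label is "human" or "ai".
    threshold: scores at or above this value are treated as AI-generated.
    """
    tp = fp = tn = fn = 0
    for text, label in samples:
        flagged = score_text(text) >= threshold
        if label == "ai":
            tp += flagged          # AI text correctly flagged
            fn += not flagged      # AI text missed
        else:
            fp += flagged          # human text wrongly flagged
            tn += not flagged      # human text correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

Raising the threshold trades a lower false positive rate for more missed AI text, which is exactly the kind of organization-specific trade-off described above.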

Mike Vizard: All right, well folks, you heard it here. Obfuscation and lying are wrong in any age, and AI is no different. We just need to figure out how to shine a light on it, because with any kind of issue where we don’t know what’s going on, a little sunlight goes a long way. Hey John, thanks for being on the show.

Jonathan Gillham: Great. Thanks for having me.

Mike Vizard: All right. And thank you all for watching the latest episode of the Techstrong.ai video series. You can find this episode and others on our website. We invite you to check them all out. Until then, we’ll see you next time.