AI will never be a lights-out, hands-off process. Even the most advanced AI systems underpinned by powerful LLMs will produce outputs that aren't quite correct if knowledgeable engineers aren't fine-tuning them and users aren't providing quality feedback. In that respect, they are not unlike humans. Beyond pure hallucinations, when humans aren't in the loop, AI chatbots and search engines can still return outputs that are technically aligned to the query but miss vital context.
For example, a user may search for cigarettes but instead be shown nicotine patches. The product they were shown was related to what they were looking for, yet it still wasn't what they hoped to see, and, although unintentional, it reflected a bias against smoking. Another example is searching "Chromebooks" on Google. The intent of the query is unclear. Is the user asking where to buy Chromebooks? How Chromebooks are made? Looking for reviews of Chromebooks? Or an analysis of the pros and cons of Chromebooks versus traditional laptops? The list of possible intents is long, and none of them is obvious from the query alone.
There is incredible potential in AI, but we have to see through the hype to determine what is realistically possible. Satisfying the user's intent as quickly as possible is an extremely difficult challenge. The gap between understanding intent and satisfying it will keep narrowing, especially with the rise of LLMs and agentic AI. An equally important consideration is ensuring that gap is closed both ethically and practically. Keeping humans in the loop, with an override option, will be critical to getting the most useful, accurate results from AI.
Keeping Humans in the Loop
Without the right checks and balances, we’ve seen AI run wild, hallucinate and produce unhinged results. On a less extreme but equally important note, businesses without the right safeguards in place could be putting themselves in a poor position to take advantage of the benefits AI can provide. This is why inputs are critical.
Take seasonality. AI search models aimed at retailers need people training them with the right inputs so they understand cultural nuances and know when to present certain items based on the season. In a search function, for instance, Christmas decorations don't need to be pulled and presented all year round. There will always be specific cases and situations the model doesn't know about, so humans must be there to supply context and correct it when it's wrong. A rough sketch of what such a human-authored rule might look like follows.
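To make the idea concrete, here is a minimal, hypothetical sketch of the kind of human-authored seasonal rule a retail search team might layer on top of a model's ranking. The rule names, date windows, and field names are illustrative assumptions, not any particular vendor's API.

```python
from datetime import date

# Hypothetical human-authored seasonal rules: each maps a product category
# to the date window in which it should be surfaced in search results.
SEASONAL_RULES = {
    "christmas_decorations": ((11, 1), (12, 26)),  # Nov 1 - Dec 26
    "halloween_costumes": ((9, 15), (10, 31)),     # Sep 15 - Oct 31
}

def in_season(category: str, today: date | None = None) -> bool:
    """Return True if the category has no rule or today falls inside its window."""
    today = today or date.today()
    rule = SEASONAL_RULES.get(category)
    if rule is None:
        return True  # no human guidance for this category -> defer to the model
    (start_m, start_d), (end_m, end_d) = rule
    start = date(today.year, start_m, start_d)
    end = date(today.year, end_m, end_d)
    return start <= today <= end

def rerank(results: list[dict], today: date | None = None) -> list[dict]:
    """Demote out-of-season items instead of letting the model show them year-round."""
    return sorted(
        results,
        key=lambda r: (not in_season(r["category"], today), -r["score"]),
    )
```

The point of the sketch is not the specific dates but the division of labor: the model still ranks by relevance, while a person encodes the cultural context the model has no way of knowing on its own.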
In the world of AI and generative AI output, the person receiving the results needs the option to share feedback in real time: to note whether the answers they were given were correct and met the mark. That feedback can then inform inputs and improve results on an ongoing basis.
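As a rough illustration, and not any specific product's feedback API, the mechanism can be as simple as logging a helpful/not-helpful signal alongside the exact query and answer it refers to, so the record can later feed evaluation or retraining:

```python
import json
import time

# Hypothetical feedback log: each record ties a user's real-time rating to the
# query and answer it refers to, so engineers can use it for evaluation later.
def record_feedback(query: str, answer: str, helpful: bool, note: str = "",
                    path: str = "feedback.jsonl") -> None:
    record = {
        "timestamp": time.time(),
        "query": query,
        "answer": answer,
        "helpful": helpful,   # did the answer meet the mark?
        "note": note,         # optional free-text correction from the user
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: a user flags an out-of-context result and explains why.
record_feedback(
    query="cigarettes",
    answer="Showing nicotine patches near you",
    helpful=False,
    note="I searched for cigarettes, not smoking-cessation products.",
)
```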
Overriding an AI-Driven Decision
For certain industries, it's critical that outputs from AI models are precise. Professions such as law and medicine depend on factual information, so lawyers and doctors are hesitant to trust the outputs coming from these models. Even in the consumer world, we have seen Google forced to overrule its own AI. After Reddit made its data available to Google for AI Overviews, multiple false answers were reported, including one suggesting that people jump off the Golden Gate Bridge. Since then, Google has been rolling back when and how its AI Overviews are presented.
Because this is not an uncommon occurrence, it should be extremely easy to overrule or reverse an AI-driven process so things don't get out of control. Years in the future, AI agents, meaning AI systems that can carry out tasks from start to finish without human involvement, will become a reality. At some point, those agents will start to talk to other agents. And if they start to develop their own languages that humans can't understand, it will be time to abort. Though this is far off on the horizon, it's essential we lay the groundwork now by developing and standardizing evaluation frameworks that ensure models are working properly and not drifting in a rogue direction.
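One way to keep that override close at hand, sketched below purely as an illustration, is to wrap every consequential agent action in a human approval gate. The function names and risk threshold are assumptions made for the example, not a standard framework.

```python
# Hypothetical human-in-the-loop gate: no agent action above a risk threshold
# executes without explicit human approval, and any action can be vetoed.
RISK_THRESHOLD = 0.3

def requires_review(action: dict) -> bool:
    """Low-risk actions pass through; anything else waits for a person."""
    return action.get("risk", 1.0) >= RISK_THRESHOLD

def human_approves(action: dict) -> bool:
    """Stand-in for a real review UI: a person confirms or vetoes the action."""
    reply = input(f"Agent wants to: {action['description']} (y/N) ")
    return reply.strip().lower() == "y"

def execute(action: dict) -> str:
    if requires_review(action) and not human_approves(action):
        return f"VETOED: {action['description']}"
    return f"EXECUTED: {action['description']}"

# Example: a routine lookup runs on its own; a purchase waits for approval.
print(execute({"description": "look up store hours", "risk": 0.05}))
print(execute({"description": "place a $500 order", "risk": 0.8}))
```

The design choice that matters is that the veto sits outside the agent: reversing a decision should never depend on the same model that made it.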
When it comes to who overrides AI decisions, there are more questions than answers. AI outputs rely on a corpus of data and material, and whoever owns that data bears responsibility for how it is used. But who owns user-generated content? If it's the users creating the content, as on Instagram, TikTok, or Reddit, are they to be held responsible if their data is used the wrong way? Or is the platform that hosts the data responsible? Once the data starts to get pulled out and becomes valuable, who should take responsibility for making sure it isn't repackaged in a way that misleads users?
What about the company that owns the model? The company using the data should have the right to overrule, but there's a gray area when we think about the internet. The difference between a search engine and an answer engine is that the search engine is not the arbiter of truth. All it does is provide "10 blue links" and leave it to the consumer to decide whether to trust the data. By moving toward providing answers in addition to "10 blue links," Google is now in the position of having to arbitrate whether an answer is true or false. Who is responsible for the accuracy of the underlying data? This is the gray area we are entering, and it's critical that the question is answered before we actually enter the era of agentic AI models, an era that some claim has already arrived.
Increasing Trust in AI
We are still a long way from complete precision and full, hands-off trust. That said, we can improve trust through AI governance and frameworks, and by keeping humans in the loop to make sure models are trained properly and reinforcement learning is applied.
In a world of agents, this kind of governance framework will be even more necessary. While we work toward it, individual responsibility, safeguards, hands-on human intervention, and overrides will be required for us to use AI responsibly to its current potential.