Operationalizing AI

After the recent Operationalizing AI event, the working groups that emerged from it are continuing their work to connect DevOps with AI projects. I spoke with a few of the attendees to see what they’re doing. But first, let’s recap why the world of AI needs DevOps; that is, why AI needs to be operationalized.

The majority of developers using AI are not going to be building AI systems like the GPT models behind ChatGPT; instead, they’ll likely be making use of such systems as part of a larger project. An example might be a chatbot that is aware of all the online documentation for a product. Over the years, DevOps has settled into the entire software development process. Its tools can, for example, automatically build the software as changes are made, run the tests and, if all of them pass, deploy the software to the cloud. To make this happen, developers need to write scripts that automate these processes. That’s one part of DevOps.
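To make the build, test and deploy sequence concrete, here is a minimal sketch of such an automation script in Python. The function name run_pipeline and the idea of passing steps as (name, command) pairs are assumptions made for illustration; real teams would more likely describe these steps in their CI system’s own configuration.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order and stop at the first failure.

    Returns the names of the steps that completed successfully, so a
    deploy step is only ever reached if everything before it passed.
    """
    completed = []
    for name, command in steps:
        # Each command is a list of arguments, run without a shell.
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"step '{name}' failed with exit code {result.returncode}")
            break
        completed.append(name)
    return completed
```

You would call it with your own commands, for instance a build command, a test runner and a deploy script, and treat the run as successful only if every step name comes back in the returned list.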

But when your software development project includes AI, it’s likely to encounter some additional difficulties. For example, you may be using a pre-packaged large language model (such as those found at Hugging Face). You might need to download a model and deploy it into a running instance or container. But consider what happens to that model afterward: depending on the software you’re building, you may train it further on your own data (a step called fine-tuning). The information that is added to the model through fine-tuning needs to be stored in a place such that if a new version of the model is released, you can safely download the new model without losing your additional data. In other words, your DevOps process needs to be expanded to handle the AI infrastructure.
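One way to picture that separation is a directory layout in which the downloaded base model and your fine-tuning output never share a folder. The sketch below assumes such a layout (base/ and finetune/ under a common root) and uses hypothetical function names; it is an illustration of the idea, not a description of what the working groups are building.

```python
import shutil
from pathlib import Path


def save_finetune_data(root: Path, name: str, payload: bytes) -> Path:
    """Store fine-tuning output (for example, adapter weights) under root/finetune."""
    target = root / "finetune" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def install_base_model(root: Path, downloaded: Path, version: str) -> Path:
    """Copy a downloaded base model into root/base/<version>.

    Only root/base is written to, so anything under root/finetune
    survives the upgrade.
    """
    target = root / "base" / version
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(downloaded, target)
    # Record which version is current so the application knows what to load.
    (root / "base" / "CURRENT").write_text(version)
    return target
```

Because an upgrade only touches the base/ folder, installing a newer model version leaves the fine-tuning files where they were. Whether those files still work with the new base model is a separate question that this sketch does not answer.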

Now, on to the working groups that have formed as a result of the Operationalizing AI event: The attendees originally had four groups, but they reorganized into three. (The fourth, governance, is being put on hold until they complete a first round of work with the other three, at which point they can move on to issues regarding governance.) Here are the three groups:

Security: This working group is focusing on security, including authorization, secrets management and more. This matters because many AI projects call external APIs, and the keys for those APIs, for example, need to be guarded.

Data Strategy: This team is covering multiple issues, including, for example, the demand for on-premises solutions. AI models owned by private entities could easily include massive amounts of proprietary and private information owned by the organization, and it could be risky to place such data in a cloud. Another area of study here is what’s called observability, which means keeping watch over every stage the data passes through. This includes quality control over the data going into the AI model and coming out of it, as well as the data inside the model, along with detecting anomalies and preventing incorrect information from propagating. Another aspect of observability is logging errors and producing alerts and notifications when there are problems.

Tools and Pipelines: As the name implies, this could be a wide range of technologies; however, the focus for this working group is on building a Docker image that will serve as a starting point for teams. The goal is that such an image will implement best practices such as code quality and testing.
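Two of the concerns raised by the Security and Data Strategy groups, guarding API keys and watching what comes out of a model, can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the environment variable MODEL_API_KEY, the logger name, and the very simple checks in check_response are stand-ins for whatever secrets manager and quality rules a real project would use.

```python
import logging
import os

logger = logging.getLogger("ai_observability")


def load_api_key(var_name: str = "MODEL_API_KEY") -> str:
    """Read an API key from the environment instead of hard-coding it."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return key


def check_response(text: str, max_chars: int = 2000) -> bool:
    """Return False and log a warning when model output looks suspicious."""
    if not text.strip():
        logger.warning("model returned an empty response")
        return False
    if len(text) > max_chars:
        logger.warning("model response too long: %d characters", len(text))
        return False
    return True
```

Keeping the key out of the source code means it never lands in a repository, and routing every model response through one checking function gives a single place to attach logging and alerts later.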


The work is not yet done, but the teams are plowing ahead. As they progress, the working groups will publish GitHub repos and provide access to the Docker images, and they’ll share their findings so that DevOps can become as important to AI projects as it is to current software development.