Snowflake today announced it has allied with Microsoft to integrate its cloud data services with Azure ML, Azure OpenAI and Microsoft Cognitive Services as part of an effort to make it simpler to bring the data needed to train artificial intelligence (AI) models to the cloud platforms where those models are created.
Announced at the Snowflake Summit 2023 conference, the alliance also covers other Microsoft offerings that Snowflake will integrate with, including Purview for data governance, Power Apps and Power Automate for low-code/no-code application development, Azure Data Factory for ELT and Power BI for data visualization.
Christian Kleinerman, senior vice president of product for Snowflake, said this alliance is part of a larger Snowflake effort to become one of the dominant sources of data for training AI models. In fact, it’s unambiguously the company’s goal to be the platform of choice for building generative AI models, he said.
Snowflake is obviously not the only provider of a data management platform with similar ambitions, but it has become a center of data gravity in the cloud that can’t be ignored. In addition, the company has made available Snowpark, a framework for building Java, Scala, SQL or Python applications, along with the open source Streamlit framework for building Python applications that it acquired in 2022.
It’s not clear to what degree the Snowflake platform will be employed to build, deploy and host applications, but the company is clearly betting that the volume of data residing in its cloud platform will attract a wide range of application developers and, now, data scientists.
The one thing that is clear is that a convergence of data and machine learning operations (MLOps) with the best DevOps practices used to build modern applications is underway. The challenge and opportunity now is to harness the respective engineering talent those individual teams possess without creating higher levels of organizational friction. After all, mastering the cultural issues that arise when trying to infuse applications with AI is as big a challenge as any technical consideration. Each IT group has developed its own vernacular for processes that, while dependent on each other, are distinctly different.
Of course, the AI models developed are only going to be as good as the data relied on to construct them. Organizations that have previously not paid much attention to data management are now racing to bring some order to a chaotic process that in many cases has been allowed to spin out of control for decades. Much of that data is not only strewn across the enterprise but is also conflicting and, in some instances, plainly erroneous.
Regardless of the approach, organizations are running out of time as many of the data management sins of the past come home to roost. With the race now on to apply enterprise data to generative AI platforms before rivals do, a lot more organizations have finally gained a new appreciation for managing data that goes beyond merely checking a box to achieve a compliance mandate.