Synopsis: Open AI models are central to debates over the definition of open source, licensing, and model transparency. Concerns include unclear data sources and dependence on proprietary systems. Peter Wang of Anaconda stresses stakeholder collaboration and clear development processes for effective AI.

Wang argues that the phrase “open-source AI” has gotten sloppy: if you can’t inspect the training data or reproduce the training recipe, the model isn’t truly open, even if its weights are downloadable. Only a handful of projects meet that higher bar today, he says, pointing to research-focused releases from IBM and the Allen Institute as rare examples.

Why does transparency matter? Beyond ideals, Wang warns that hidden data can smuggle biases—or worse, legal liabilities—into downstream applications. He predicts that once the novelty fades, enterprises will demand audit-ready provenance just as they once demanded readable source code for critical software.


The conversation shifts to cost and scale. Not every task needs a “kitchen-sink” model with a 128K-token window, Wang notes. Smaller, specialized models—quantized to run on CPUs or even a Mac mini—can deliver 90% of the value at a fraction of the bill. Expect future AI stacks to chain together many lightweight experts rather than leaning on a single, expensive behemoth.
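To make that concrete, here is a minimal, hypothetical Python sketch of the “many lightweight experts” pattern: requests are dispatched to small task-specific models rather than one large general model. The expert functions and keyword router below are illustrative stand-ins, not any real framework or Wang’s own design.

```python
from typing import Callable, Dict

def summarize_expert(text: str) -> str:
    # Stand-in for a small quantized summarization model running on a CPU.
    return f"[summary of {len(text)} chars]"

def sql_expert(text: str) -> str:
    # Stand-in for a small text-to-SQL model.
    return f"[SQL generated for: {text!r}]"

# Catalog of lightweight experts, keyed by the task they specialize in.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "summarize": summarize_expert,
    "sql": sql_expert,
}

def route(task: str, text: str) -> str:
    """Dispatch a request to a specialized expert instead of one big model."""
    expert = EXPERTS.get(task)
    if expert is None:
        raise ValueError(f"no expert registered for task {task!r}")
    return expert(text)

if __name__ == "__main__":
    print(route("summarize", "Quarterly revenue rose 12% on cloud growth."))
    print(route("sql", "total sales by region for 2024"))
```

In a real stack, each stand-in would be a small quantized model served locally; the routing layer is what keeps the expensive general-purpose model out of the hot path.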

Operationalizing those stacks, however, still “takes a village.” Data engineers, security teams and domain experts all need clear “swim lanes” so one misconfigured prompt doesn’t land the company in tomorrow’s headline. Wang envisions platforms that enforce guardrails, catalog vetted models and track lineage from raw data to production endpoint—so frontline users can innovate without learning the guts of GPU clusters.
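A rough sketch of what such a platform’s catalog might record, assuming a simple registry keyed by model name and version; the schema and the provenance check are illustrative assumptions, not any vendor’s actual API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class ModelRecord:
    name: str
    version: str
    data_sources: List[str]   # lineage: raw data the model was trained on
    approved_by: str          # team that vetted the release
    endpoint: str             # production serving location

REGISTRY: Dict[Tuple[str, str], ModelRecord] = {}

def register(record: ModelRecord) -> None:
    """Refuse to catalog a model whose training-data lineage is missing."""
    if not record.data_sources:
        raise ValueError(f"{record.name}: no data lineage recorded")
    REGISTRY[(record.name, record.version)] = record

# Hypothetical entry: every field from raw data to endpoint is auditable.
register(ModelRecord(
    name="support-summarizer",
    version="1.2.0",
    data_sources=["internal-tickets-2023", "public-docs-corpus"],
    approved_by="security-review",
    endpoint="https://models.example.internal/support-summarizer/1.2.0",
))
print(f"cataloged {len(REGISTRY)} vetted model(s)")
```

The point of a gate like `register` is exactly the audit-ready provenance Wang predicts enterprises will demand: nothing reaches production without a recorded trail.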