The intersection of open source and artificial intelligence (AI) has become a pivotal force propelling us into uncharted territories of innovation. As the COO of the OpenInfra Foundation, a global open source organization with 110,000 members spanning 187 countries, I’ve witnessed the transformative power of collaboration and open source. Today, I want to delve into the profound impact of open source on the development of AI and vice versa, exploring a symbiotic relationship that has reached a tipping point.
Open Source: The Birthplace of AI
At the risk of stating the obvious: there would be no AI without open source. Over the past decade, open source projects have been the lifeblood of AI development. From foundational projects like Linux and PyTorch to scientific computing libraries like NumPy and SciPy, the open source ecosystem has laid the foundation for the AI revolution.
Even proprietary offerings such as OpenAI’s models, Google Bard and GitHub Copilot owe their existence to open source. Despite the irony in the name, OpenAI acknowledges the pivotal role of open source in its models’ training processes. NVIDIA, whose AI storage architecture is powered by OpenStack Swift, exemplifies how open source becomes integral to the very fabric of AI systems.
The Rise of Open Source AI
The AI market’s recent surge, notably catalyzed by innovations like ChatGPT, underscores the accelerating pace of AI development.
Google Trends for “AI” from 2018-Present
Google Trends data reveals a dramatic uptick in interest, reflecting a pivotal moment in the field. The infusion of open source principles into AI has not only accelerated progress but has also fostered healthy competition, innovation, and democratized access to AI capabilities.
Hugging Face, a key player in the open source AI realm, showcases staggering growth in datasets, spaces, and models contributed.
Source: Hugging Face
The numbers tell a story of real-world engagement, dispelling notions of mere hype. Open source AI is not just a buzzword; it’s a dynamic force changing the game—increasing competition, fostering a faster rate of innovation, and providing access to a global community.
AI’s Impact on Open Source Communities
As AI continues its meteoric rise, it’s not just influencing how we develop technology but also challenging the norms within open source communities. AI tools, including GitHub Copilot, are already shaping how code is produced. Stack Overflow’s recent developer survey found that 70% of developers are either using or planning to use AI tools in their development process. The same survey found that the most widely used AI assistant is GitHub’s Copilot. Thomas Dohmke, GitHub CEO, recently said he thinks 80% of code will be produced by Copilot sooner rather than later.
The landscape of software development is evolving, with challenges arising from AI-generated code that demands new cultural shifts. Some of the questions that we as open source communities must grapple with in this Brave New World include:
- What happens to the alignment of incentives between organizations and open source communities when you have robots at the table?
- When we have more and more code development without humans, some completely built without humans, what does that mean for our norms (such as code review) and our collaborations?
- And the legal question—How will AI license its code? This is a massive quandary that is completely unsolved at this point.
- How are we going to accept and review the code from AI systems?
- How are we going to accept AI assistants as members of our communities?
These questions cut across incentive alignment, collaboration norms and the legal frameworks governing AI-generated code. As AI assistants produce more code, we confront new challenges in maintaining transparency, understanding authorship, and preserving the collaborative essence of open source.
The Regulatory Dilemma: Balancing Innovation and Security
The surge in AI’s capabilities has triggered a predictable response: Calls for regulation. Drawing parallels with historical debates, such as the encryption battle of the 1990s, the AI regulation discourse is gaining momentum globally. However, this time, the conversation takes a nuanced turn with implications for open source.
Source: Ernst & Young
Countries are adopting various approaches, from stringent legislation to flexible guidelines. Ernst & Young’s analysis of these varying approaches reveals the complex regulatory landscape, indicating a growing awareness of the need to balance transparency, security, privacy and non-discrimination.
Preserving the Open Source Ethos
Amid the regulatory debates, the open source ethos faces challenges. The fear of AI risks has led to calls for licensing authorities and new agencies overseeing AI development. Sam Altman, CEO of OpenAI, testified before Congress, “I would form a new agency that licenses any effort above a certain scale of capabilities and can take that license away and ensure compliance with safety standards.”
Although there are real risks in AI, regulatory capture of the kind Altman proposes is a greater risk than a world in which free and open source software is allowed to thrive. We must avoid overregulation that could stifle innovation and potentially sideline open source contributions.
The concern about regulatory strangling of open source is real, as regulatory discussions increasingly veer toward anti-open source sentiments. For example, U.S. Senator Mark Warner recently issued a statement that said, “I fear that our export-control laws are not equipped to deal with the challenge of open-source software — whether in advanced semiconductor designs like RISC-V or in the area of AI — and a dramatic paradigm shift is needed.”
This statement concerned RISC-V, an open source architecture for CPUs. But even in a statement that is ostensibly about a hardware-oriented restriction (which I would also oppose), Warner specifically mentions open source designs and AI.
Fortunately, industry leaders and advocates are beginning to speak out against ill-advised attempts to restrict open source in AI and other domains.
For instance, Kevin Xu, Senior Director at GitHub, posted on X, “All the attempts to restrict open source, first in AI, then in chip design (i.e. RISC-V), are: ill-advised, short-sighted, bad for innovation, and impossible to enforce.”
Yann LeCun, a VP and Chief AI Scientist with Meta, called for vocal advocacy in his X post, “The heretofore silent majority of AI scientists and engineers who do not believe in AI extinction scenarios; or believe we have agency in making AI powerful, reliable, and safe; and think the best way to do so is through open source AI platforms NEED TO SPEAK UP!”
Francois Fleuret, machine learning guru at the University of Geneva, echoed Yann’s sentiments: “I am with @ylecun on that. Legislation against open development of AI would not mitigate any risk, but would create tons of others by hiding development away from scrutiny by the collective intelligence. An AI ‘growing in the dark’ is the last thing we need.”
And Anima Anandkumar, Caltech lecturer and NVIDIA researcher, added, “I agree with @ylecun that open source has been the primary reason for AI innovation and growth. Do not let misinformation and hysteria kill this. Open-source means we democratize and allow everyone to explore new ways to make AI reliable and safe, and allow for peer review.”
I’m inspired by this social discourse and encourage all open source advocates to add their voices. Let me suggest two practical ways to get involved:
- The Open Source Initiative (OSI), steward of the Open Source Definition, has formed a new group to define “open source AI.” I am an active participant in that effort, and you can participate too. Check out the current draft of the Open Source AI Definition at https://opensource.org/deepdive/drafts/, add your comments, and keep a lookout for future open meetings around the world that you could join in person.
- The OpenInfra Foundation has established two regional hubs, OpenInfra Europe and OpenInfra Asia, to give the Foundation a better mechanism to support our local communities and focus on regional policy issues such as AI regulations that could impact open source. Visit your region’s website, join the mailing list, and become a participant.
Conclusion: Navigating the Future Together
As we stand at the crossroads of AI and open source, the path forward is intertwined. Open source has been the catalyst for AI’s unprecedented growth, and AI, in turn, is reshaping the landscape of open source development. A successful path forward lies in finding a delicate balance—a regulatory framework that ensures ethical AI practices without stifling the collaborative spirit of open source.
In the words of science fiction writer Ramez Naam, “The worst atrocities…maybe half of them arose directly because the powerful had a monopoly or a near-monopoly on some key capability.”
Let us not forget the essence of open source: Collaboration, transparency and collective intelligence. As we navigate the future, let’s ensure that the dance between open source and AI remains a symbiotic one, fostering innovation and progress for the benefit of all.