
Google today added a Model Context Protocol (MCP) server that makes it simpler for artificial intelligence (AI) applications to access Data Commons, the knowledge graph it developed to open up public data repositories.
Originally developed by Anthropic, MCP provides an open interface through which AI applications and agents can access data that resides outside a large language model (LLM).
The overall goal is to make data stored in public repositories more accessible to AI agents, which end users will employ to query that data via the Data Commons platform using natural language, says Prem Ramaswami, head of the Data Commons initiative at Google.
Making that statistical information available to AI agents has the added benefit of reducing hallucinations that might otherwise be generated by the LLMs that drive AI applications, he adds.
Among the first organizations to take advantage of this MCP server is the ONE Campaign, a global non-profit founded by Irish musician Bono and Bobby Shriver that advocates for the investments needed to create economic opportunities and healthier lives in Africa. The charity has created its own AI agent that taps the Data Commons graph to search tens of millions of health financing data points in seconds. The agent can also visualize that data and download clean datasets for use in the reports it creates.
The MCP server for Data Commons is available under an open source license and can be integrated with Google's Agent Development Kit (ADK) and Gemini command line interface (CLI), or with third-party agentic AI platforms.
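For teams wiring this up themselves, the basic flow follows the standard MCP client pattern: launch the server as a subprocess, open a session, and list the tools it exposes. The sketch below uses the official MCP Python SDK; the datacommons-mcp package name and its launch command are assumptions based on Google's announcement, not confirmed details.

```python
# Minimal sketch: connect to the Data Commons MCP server over stdio
# and list the tools it exposes. Uses the official MCP Python SDK
# (pip install mcp). The server package name and launch arguments
# ("datacommons-mcp") are assumptions, not confirmed details.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumed launch command for the Data Commons MCP server; substitute
# whatever command the project's README documents.
server_params = StdioServerParameters(
    command="uvx",
    args=["datacommons-mcp", "serve", "stdio"],
)

async def main() -> None:
    # Spawn the server as a subprocess and speak MCP over stdin/stdout.
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the tools the server advertises, e.g. statistical
            # variable search or observation lookups.
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"{tool.name}: {tool.description}")

asyncio.run(main())
```

The same server entry, expressed as a command plus arguments, is what an agent framework such as the ADK or the Gemini CLI would register in its own MCP configuration.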
Ultimately, the goal is to make it simpler for organizations to make decisions based on actual data, says Ramaswami.
It’s not clear how many organizations are using the Data Commons graph to access public data, but AI agents should make it easier to identify anomalies, notes Ramaswami. Governments around the world make public data available, but validating its accuracy has historically required painstaking analysis. Going forward, analysts should be able to launch a simple natural language query that surfaces any anomalous data warranting further investigation of its veracity. “Organizations can create their own data validation pipelines,” he says.
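To illustrate what such a validation pipeline might look like, the sketch below pulls a statistical time series from Data Commons with the existing datacommons-pandas client (the pre-MCP path) and flags observations that deviate sharply from the series trend. The place, statistical variable, and threshold are illustrative choices, not anything Ramaswami describes.

```python
# Illustrative anomaly check on a Data Commons time series, using the
# datacommons_pandas client (pip install datacommons-pandas). The choice
# of place, statistical variable, and threshold is hypothetical.
import datacommons_pandas as dc

# Fetch annual population counts for the United States as a pandas
# Series indexed by date.
series = dc.build_time_series("country/USA", "Count_Person")

# Flag points whose year-over-year change deviates from the mean change
# by more than three standard deviations -- a crude stand-in for the
# kind of check a real validation pipeline would run.
changes = series.pct_change().dropna()
zscores = (changes - changes.mean()) / changes.std()
anomalies = zscores[zscores.abs() > 3]

for date, z in anomalies.items():
    print(f"{date}: year-over-year change is {z:.1f} sigma from the mean")
```

An MCP-based agent would arrive at a similar result conversationally, translating a natural language question into calls against the same underlying statistical variables.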
Hopefully, that capability will shine a light on instances where governments, for whatever reason, have published erroneous data, and prompt them to correct it. In the meantime, it should become increasingly possible for the average end user to explore public data without a deep background in data science. For example, it will become simpler to understand why different regions of the same country have significantly different medical outcomes, insights that might lead to changes in public policy.
Being able to more easily analyze public data sources is, of course, only one aspect of the revolution in data science that AI will enable. The challenge now is finding ways to combine that public data with private data so that organizations can make better decisions on behalf of their customers, employees and stakeholders.