
In the world of AI, it’s clearly RAG time. The push to build Retrieval Augmented Generation (RAG) into new and existing AI models is proceeding apace. Many software engineers are understandably keen to draw on RAG’s ability to validate large language model (LLM) output against external sources and to align GenAI engines with internal proprietary data where needed.
But RAG is no plug-in. Its implementation requires advanced software and data engineering skills in a marketplace where platform progression often outstrips human comprehension.
Graph database management system company Neo4j wants to address these pain points with its GraphRAG Ecosystem Tools offering. The new suite of tools is designed to simplify GraphRAG development, helping developers new to knowledge graphs and GraphRAG to get started more easily.
An AI Starter Kit for RAG Development
Described by the company itself as ‘something of a starter kit’ for RAG development in GenAI projects, the tools can take unstructured data and turn it into a knowledge graph from scratch. This technology is designed to be a first step on the journey to implementing LLMs supported by knowledge graphs, a heady combination that is heralded as a route to kickstarting GenAI development through the company’s own GraphRAG Ecosystem.
For completeness here, a knowledge graph (sometimes also referred to as a ‘semantic network’) is a collection of interlinked entities, concepts, relationships, situations and events. The descriptions of these ‘things’ and the values used to denote their relationships to each other are stored in a graph database, which represents those relationships as a graph structure. Through the use of linked semantic metadata, knowledge graphs are able to put data into context and therefore provide a basis for that information to be integrated, unified, analyzed and ultimately shared. The nodes (a person, place or object) in a knowledge graph are further defined by their edges (relationships) to other nodes and by the label that the graph stores for any given piece of data.
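To make the node-and-edge description concrete, here is a minimal sketch in plain Python. It is not Neo4j’s API: the `KnowledgeGraph` class, its method names and the relationship names (`WORKS_AT`, `OFFERS`) are hypothetical and chosen only for illustration, with sample entities borrowed from this article.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy in-memory knowledge graph (hypothetical, not Neo4j's API)."""

    # node name -> properties, such as a label ("Person", "Company", ...)
    nodes: dict = field(default_factory=dict)
    # source node name -> list of (relationship, target node name) pairs
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, name, **properties):
        self.nodes[name] = properties

    def add_edge(self, source, relationship, target):
        # An edge only makes sense between two nodes that already exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[source].append((relationship, target))

    def neighbors(self, name):
        # Outgoing relationships of a node; empty list if it has none.
        return list(self.edges.get(name, []))


graph = KnowledgeGraph()
graph.add_node("Michael Hunger", label="Person")
graph.add_node("Neo4j", label="Company")
graph.add_node("GraphRAG Ecosystem Tools", label="Product")
graph.add_edge("Michael Hunger", "WORKS_AT", "Neo4j")
graph.add_edge("Neo4j", "OFFERS", "GraphRAG Ecosystem Tools")

print(graph.neighbors("Neo4j"))
```

A real graph database adds persistence, indexing and a query language on top of this idea, but the shape of the data (labelled nodes joined by named relationships) is the same.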
Because software engineering practices that make heavy use of LLMs supported by knowledge graphs aren’t necessarily simple, Neo4j wants to simplify this software application development category.
Neo4j says that the main benefit of the tools on offer here is that they can take unstructured data (like a collection of newspaper articles) and turn it into a knowledge graph to help data teams spot patterns and relationships. The suite also includes a natural language search function, called NeoConverse, designed to make graph databases accessible to non-technical users.
Structured & Semi-Structured Data
“GraphRAG combines retrieval augmented generation with knowledge graphs to solve critical LLM issues like hallucination and lack of domain-specific context. Unlike most RAG solutions, which only offer access to fragments of textual data, GraphRAG integrates structured and semi-structured information into the retrieval process,” said Michael Hunger, head of product innovation & developer strategy at Neo4j. “These new tools will help developers create a knowledge graph from unstructured text and use that graph – or an existing graph database – to retrieve relevant information for generative tasks via both vector and graph search.”
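The idea of retrieving via both vector and graph search can be illustrated with a small sketch in plain Python. This is an assumption-laden toy, not Neo4j’s implementation: the `chunks` and `graph` data, the hand-written three-dimensional vectors (standing in for a real embedding model) and the `retrieve` function are all hypothetical. It first ranks text chunks by cosine similarity to a query vector, then follows graph relationships from the entities mentioned in the best chunk to collect structured facts.

```python
import math

# Hypothetical toy data: tiny hand-written vectors stand in for real embeddings.
chunks = {
    "c1": {
        "text": "Neo4j released GraphRAG tools.",
        "vector": [0.9, 0.1, 0.0],
        "entities": ["Neo4j"],
    },
    "c2": {
        "text": "The weather was mild this week.",
        "vector": [0.0, 0.2, 0.9],
        "entities": [],
    },
}

# entity -> list of (relationship, target entity); relationship names are made up.
graph = {
    "Neo4j": [("OFFERS", "Knowledge Graph Builder"), ("OFFERS", "NeoConverse")],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, top_k=1):
    """Vector search for the closest chunks, then graph expansion of their entities."""
    ranked = sorted(
        chunks.values(),
        key=lambda chunk: cosine(query_vector, chunk["vector"]),
        reverse=True,
    )
    results = []
    for chunk in ranked[:top_k]:
        facts = [
            (entity, relationship, target)
            for entity in chunk["entities"]
            for relationship, target in graph.get(entity, [])
        ]
        results.append({"text": chunk["text"], "facts": facts})
    return results


context = retrieve([1.0, 0.0, 0.0])
print(context)
```

In a RAG pipeline, both the retrieved text and the structured facts would be passed to the LLM as context; the graph step is what supplies relationships that a text fragment alone would not contain.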
Hunger and team further state that knowledge graphs provide the contextual memory that LLMs need to answer questions and serve as trusted agents in complex workflows.
Neo4j promises user (developer) freedom here and says that its toolset can be used to start greenfield GenAI development projects or, equally, as a functional reference template for a team looking to build its own custom implementations. The current implementations use the company’s own LangChain integrations for Python and JavaScript, but users can also build with other languages and frameworks.
Neo4j Knowledge Graph Builder works with PDFs, Word documents, YouTube transcripts, Wikipedia pages and many other kinds of unstructured text.