Diffbot Unveils New AI Model to Enhance Factual Accuracy
Diffbot, a burgeoning tech firm located in Silicon Valley and recognized for maintaining one of the globe’s most comprehensive databases of web information, has announced the launch of an innovative AI model aimed at tackling a significant hurdle in artificial intelligence: ensuring factual accuracy.
An Innovative Approach: GraphRAG
The latest model is a fine-tuned version of Meta’s Llama 3.3 and is the first open-source implementation of a technique known as graph retrieval-augmented generation (GraphRAG).
Unlike traditional AI models that depend exclusively on extensive sets of pre-existing training data, Diffbot’s large language model (LLM) utilizes real-time data sourced from its dynamically updated Knowledge Graph, which houses over a trillion interrelated facts.
In an interview with VentureBeat, founder and CEO Mike Tung remarked, “We hold to the belief that general reasoning capabilities will ultimately be simplified into around 1 billion parameters. Rather than embedding all knowledge within the model itself, our goal is for it to effectively utilize tools that allow for external queries.”
The Mechanics Behind Diffbot’s Knowledge Graph
Diffbot’s expansive Knowledge Graph serves as an automated repository that has been indexing publicly available web content since 2016. It systematically categorizes webpages into various entities like individuals, companies, products, and articles by extracting structured insights through advanced computer vision techniques combined with natural language processing.
The Knowledge Graph is refreshed every four to five days, with millions of new facts added in each cycle. The model takes advantage of this by querying the graph at answer time rather than relying solely on the static knowledge captured in its training data.
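The idea of preferring a freshly queried fact over knowledge frozen at training time can be illustrated with a small sketch. This is not Diffbot’s implementation or API: the `ToyKnowledgeGraph` class, the `Fact` record, the `answer` function, and the sample data are all hypothetical, and a real system would query a remote graph and pass the retrieved facts to a language model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """One (subject, relation, value) statement plus its last update date."""
    subject: str
    relation: str
    value: str
    updated: date


class ToyKnowledgeGraph:
    """Hypothetical in-memory stand-in for a continuously updated graph."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], Fact] = {}

    def upsert(self, fact: Fact) -> None:
        # Keep only the most recently updated fact per (subject, relation).
        key = (fact.subject, fact.relation)
        current = self._facts.get(key)
        if current is None or fact.updated >= current.updated:
            self._facts[key] = fact

    def query(self, subject: str, relation: str) -> Optional[Fact]:
        return self._facts.get((subject, relation))


# Made-up knowledge "frozen" at training time; it may be out of date.
STATIC_KNOWLEDGE = {("ExampleCorp", "ceo"): "Alice"}


def answer(graph: ToyKnowledgeGraph, subject: str, relation: str) -> str:
    """Prefer a fact retrieved from the graph; fall back to static knowledge."""
    fact = graph.query(subject, relation)
    if fact is not None:
        return f"{fact.value} (graph, updated {fact.updated.isoformat()})"
    fallback = STATIC_KNOWLEDGE.get((subject, relation))
    return f"{fallback} (static)" if fallback is not None else "unknown"
```

In this sketch, once a newer fact is upserted for the same subject and relation, `answer` returns it together with its update date, so the source of each answer stays visible.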
A Paradigm Shift in Information Retrieval
Tung illustrated this process by saying, “Consider asking an AI about current weather; instead of relying on outdated training datasets to formulate an answer, our model accesses a live weather API providing timely information.”
Accuracy Beyond Conventional Models
The approach appears to pay off in benchmark evaluations. The company reports that the model scored 81% on FreshQA, a benchmark created by Google to test up-to-date factual knowledge, outperforming both ChatGPT and Gemini. It also scored 70.36% on MMLU-Pro, a more demanding benchmark of academic knowledge.
A Commitment to Openness: Customizable Solutions
Notably, Diffbot plans to make the model fully open-source, so organizations can run it on their own systems and customize it to their specific requirements. This addresses growing concerns about data privacy and dependence on major AI service providers.
The Promising Future for Open-Source Applications in Enterprises