In the race to deploy enterprise AI, one obstacle consistently blocks the path: hallucinations. These fabricated responses from AI systems have caused everything from legal sanctions for attorneys to companies being forced to honor fictitious policies.
Organizations have tried different approaches to solving the hallucination challenge, including fine-tuning with better data, retrieval augmented generation (RAG), and guardrails. Open-source development firm Oumi is now offering a new approach, albeit with a somewhat ‘cheesy’ name.
The company’s name is an acronym for Open Universal Machine Intelligence (Oumi). It is led by ex-Apple and Google engineers on a mission to build an unconditionally open-source AI platform.
On April 2, the company released HallOumi, an open-source claim verification model designed to address the accuracy problem through a novel approach to hallucination detection. Halloumi is, of course, a type of hard cheese, but that has nothing to do with the model’s naming. The name is a combination of Hallucination and Oumi. The release landed close enough to April Fools’ Day that some might have suspected a joke, but HallOumi is a serious attempt to solve a very real problem.
“Hallucinations are frequently cited as one of the most critical challenges in deploying generative models,” Manos Koukoumidis, CEO of Oumi, told VentureBeat. “It ultimately boils down to a matter of trust—generative models are trained to produce outputs which are probabilistically likely, but not necessarily true.”
How HallOumi works to solve enterprise AI hallucinations
HallOumi analyzes AI-generated content on a sentence-by-sentence basis. The system accepts both a source document and an AI response, then determines whether the source material supports each claim in the response.
“What HallOumi does is analyze every single sentence independently,” Koukoumidis explained. “For each sentence it analyzes, it tells you the specific sentences in the input document that you should check, so you don’t need to read the whole document to verify if what the LLM [large language model] said is accurate or not.”
The model provides three key outputs for each analyzed sentence:
A confidence score indicating the likelihood of hallucination.
Specific citations linking claims to supporting evidence.
A human-readable explanation detailing why the claim is supported or unsupported.
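To make the three outputs concrete, here is a minimal sketch of how an application might hold them per sentence. The class name, field names, the 0-to-1 score range and the 0.5 threshold are all illustrative assumptions, not HallOumi's actual output schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SentenceVerdict:
    """One verified sentence. Field names are illustrative, not HallOumi's schema."""
    sentence: str
    hallucination_score: float  # assumed range: 0.0 = likely supported, 1.0 = likely hallucinated
    citations: List[int] = field(default_factory=list)  # indices of source sentences to check
    explanation: str = ""


def flagged(verdicts: List[SentenceVerdict], threshold: float = 0.5) -> List[SentenceVerdict]:
    """Return the sentences whose score is at or above the (assumed) threshold."""
    return [v for v in verdicts if v.hallucination_score >= threshold]


# Hand-written example data, not real model output.
verdicts = [
    SentenceVerdict("The policy allows refunds within 30 days.", 0.08, [2],
                    "Matches source sentence 2."),
    SentenceVerdict("Refunds are issued in cash.", 0.91, [2, 3],
                    "The source does not state a refund method."),
]

for v in flagged(verdicts):
    print(f"{v.hallucination_score:.2f} {v.sentence} -> check source sentences {v.citations}")
```

A reviewer would then only need to open the cited source sentences for the flagged claim rather than reread the whole document.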
“We have trained it to be very nuanced,” said Koukoumidis. “Even for our linguists, when the model flags something as a hallucination, we initially think it looks correct. Then when you look at the rationale, HallOumi points out exactly the nuanced reason why it’s a hallucination—why the model was making some sort of assumption, or why it’s inaccurate in a very nuanced way.”
Integrating HallOumi into enterprise AI workflows
There are several ways that HallOumi can be used and integrated with enterprise AI today.
One option is to try out the model using a somewhat manual process, through the online demo interface.
An API-driven approach is better suited to production and enterprise AI workflows. Koukoumidis explained that the model is fully open-source and can be plugged into existing workflows, run locally or in the cloud and used with any LLM.
The process involves feeding the original context and the LLM’s response to HallOumi, which then verifies the output. Enterprises can integrate HallOumi to add a verification layer to their AI systems, helping to detect and prevent hallucinations in AI-generated content.
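The flow described above, where context and response go in and a verdict comes out, could be wired up roughly as follows. This is a sketch under stated assumptions: the `Verifier` signature, `verify_response` and the 0.5 threshold are hypothetical, and `overlap_verifier` is a crude word-overlap placeholder standing in for a real model call. It is not how HallOumi scores claims.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical signature: (source_document, claim_sentence) -> hallucination score in [0, 1].
Verifier = Callable[[str, str], float]


def split_sentences(text: str) -> List[str]:
    """Naive splitter on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overlap_verifier(source: str, claim: str) -> float:
    """Placeholder scorer: fraction of claim words missing from the source.
    A stand-in for a real verification model, NOT HallOumi's method."""
    source_words = set(re.findall(r"\w+", source.lower()))
    claim_words = re.findall(r"\w+", claim.lower())
    if not claim_words:
        return 0.0
    missing = sum(1 for w in claim_words if w not in source_words)
    return missing / len(claim_words)


def verify_response(source: str, response: str, verifier: Verifier,
                    threshold: float = 0.5) -> Tuple[bool, List[Tuple[str, float]]]:
    """Score each response sentence; pass only if no sentence reaches the threshold."""
    scored = [(s, verifier(source, s)) for s in split_sentences(response)]
    passed = all(score < threshold for _, score in scored)
    return passed, scored


source = "The warranty covers parts for two years. Labor is not covered."
response = "The warranty covers parts for two years. Shipping costs are refunded automatically."

passed, scored = verify_response(source, response, overlap_verifier)
for sentence, score in scored:
    print(f"{score:.2f}  {sentence}")
print("deliver response" if passed else "hold response for review")
```

Because the verifier is passed in as a plain callable, the placeholder can be swapped for a call to whichever model deployment an enterprise actually runs, without changing the surrounding pipeline.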
Oumi has released two versions: the generative 8B model that provides detailed analysis and a classifier model that delivers only a score but with greater computational efficiency.
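One way to take advantage of the two variants, offered here as an assumption rather than anything Oumi recommends, is a tiered check: score every sentence with the cheaper score-only model, and request a detailed explanation only for the sentences it flags. The `Scorer` and `Explainer` callables, `tiered_check` and the threshold are hypothetical, and the toy functions exist only so the sketch runs without a model.

```python
from typing import Callable, Dict, List

# Both callables are hypothetical stand-ins supplied by the caller.
Scorer = Callable[[str, str], float]   # (source, sentence) -> hallucination score
Explainer = Callable[[str, str], str]  # (source, sentence) -> explanation text


def tiered_check(source: str, sentences: List[str], score: Scorer,
                 explain: Explainer, threshold: float = 0.5) -> List[Dict[str, object]]:
    """Score every sentence; call the costlier explainer only for flagged ones."""
    results = []
    for sentence in sentences:
        s = score(source, sentence)
        explanation = explain(source, sentence) if s >= threshold else None
        results.append({"sentence": sentence, "score": s, "explanation": explanation})
    return results


# Toy stand-ins: exact substring match instead of a model.
explainer_calls: List[str] = []


def toy_score(source: str, sentence: str) -> float:
    return 0.0 if sentence in source else 1.0


def toy_explain(source: str, sentence: str) -> str:
    explainer_calls.append(sentence)
    return "Sentence does not appear in the source."


results = tiered_check("A is true. B is true.", ["A is true.", "C is true."],
                       toy_score, toy_explain)
print(results)
print("explainer calls:", len(explainer_calls))
```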
HallOumi vs RAG vs Guardrails for enterprise AI hallucination protection
What sets HallOumi apart from other grounding approaches is how it complements rather than replaces existing techniques like RAG (retrieval augmented generation) while offering more detailed analysis than typical guardrails.
“The input document that you feed through the LLM could be RAG,” Koukoumidis said. “In some other cases, it’s not precisely RAG, because people say, ‘I’m not retrieving anything. I already have the document I care about. I’m telling you, that’s the document I care about. Summarize it for me.’ So HallOumi can apply to RAG but not just RAG scenarios.”
This distinction is important because while RAG aims to improve generation by providing relevant context, HallOumi verifies the output after generation regardless of how that context was obtained.
Compared to guardrails, HallOumi provides more than binary verification. Its sentence-level analysis with confidence scores and explanations gives users a detailed understanding of where and how hallucinations occur.
HallOumi incorporates a specialized form of reasoning in its approach.
“There was definitely a variant of reasoning that we did to synthesize the data,” Koukoumidis explained. “We guided the model to reason step-by-step or claim by sub-claim, to think through how it should classify a bigger claim or a bigger sentence to make the prediction.”
The model can detect not only accidental hallucinations but also intentional misinformation. In one demonstration, Koukoumidis showed how HallOumi identified when DeepSeek’s model ignored provided Wikipedia content and instead generated propaganda-like content about China’s COVID-19 response.
What this means for enterprise AI adoption
For enterprises looking to lead the way in AI adoption, HallOumi offers a potentially crucial tool for safely deploying generative AI systems in production environments.
“I really hope this unblocks many scenarios,” Koukoumidis said. “Many enterprises can’t trust their models because existing implementations weren’t very ergonomic or efficient. I hope HallOumi enables them to trust their LLMs because they now have something to instill the confidence they need.”
For enterprises on a slower AI adoption curve, HallOumi’s open-source nature means they can experiment with the technology now while Oumi offers commercial support options as needed.
“If any companies want to better customize HallOumi to their domain, or have some specific commercial way they should use it, we’re always very happy to help them develop the solution,” Koukoumidis added.
As AI systems continue to advance, tools like HallOumi may become standard components of enterprise AI stacks—essential infrastructure for separating AI fact from fiction.