When "Web Search Cuts Hallucinations by 73-86%" Fails the Data Test: A Case Study of Reasoning Models and Retrieval-Augmented Systems

https://fire2020.org/why-the-facts-benchmark-rated-gemini-3-pro-at-68-8-for-factuality/

How a controlled evaluation exposed conflicting claims about web search and hallucination reduction

Vendors routinely claim that adding web search to a language model “reduces hallucination by 73-86%

Submitted on 2026-03-05 10:03:39