RAG and Grounding: How AI Search Stays Accurate and Trustworthy
By Tharindu Gunawardana | SearchMinistry Media | March 25, 2026
Retrieval-Augmented Generation (RAG) and grounding are the architectural response to the hallucination problem in AI search. This post explains the 7-stage RAG pipeline with reference to 6 publicly available patents from Google, Microsoft, and Citibank.
What Is RAG?
RAG was formally introduced in a landmark 2020 paper by Patrick Lewis et al. at Facebook AI Research. Instead of relying solely on training data (parametric memory), RAG gives the model real-time access to an external knowledge base at query time. A news article published this morning can inform an answer this afternoon, with no model retraining required.
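To make "access to an external knowledge base at query time" concrete, the sketch below ranks documents by vector similarity to a query. Every name here is an illustrative assumption, and `embed` is a toy bag-of-letters stand-in for a real learned embedding model; only the shape of the retrieval step matters:

```python
import math

def embed(text):
    # Toy embedding: 26-dimensional letter-frequency vector. A real system
    # would call an embedding model; this only illustrates the mechanics.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Cosine similarity between two vectors (0 when either is all zeros).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    # Rank the knowledge base by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

Because retrieval happens per query, adding a new document to `documents` changes answers immediately, with no retraining of the model.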
The 7-Stage RAG Pipeline
- Intent Analysis: The system infers what the user wants: comparison, explanation, recent news, or a specific fact. This feeds into query fan-out and retrieval targeting.
- Parallel Retrieval: Multiple knowledge sources are queried simultaneously using vector embeddings. Patent: US20240346256A1 (Microsoft Technology Licensing, 2023).
- Grounding: Retrieved content is chunked, verified, and anchored to real sources via vector similarity matching, source credibility scoring, and recency filtering. Patent: US12131123B2 (Microsoft Technology Licensing). Additional: US9916366B1 (Google).
- Prompt Augmentation: Verified document chunks are injected into the LLM prompt. The model is constrained to answer only from this context. Patent: US20240256582A1 (filed 2024).
- LLM Generation: The model synthesises a coherent answer from the provided context, inserting inline citations and confidence markers.
- The Grounded Answer: Output includes the synthesised response, inline citations, source metadata, confidence qualifications, and follow-up suggestions.
- Hallucination Guard: A final verification pass checks every claim against retrieved sources. Unsupported claims are flagged or removed. Patent: US12536233B1 (Google, granted 2025).
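The seven stages above can be sketched end to end. Everything in this snippet (`Chunk`, `run_rag_pipeline`, the `search`/`llm`/`supports` callables, and the 0.5 credibility threshold) is a hypothetical stand-in for illustration, not the patented implementations:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str
    credibility: float  # assumed 0..1 source-quality score
    is_fresh: bool      # passed the recency filter

def run_rag_pipeline(query, search, llm, supports):
    """Sketch of the seven-stage flow; all callables are assumed stubs."""
    # 1. Intent analysis (stubbed): steer retrieval by query type.
    intent = "comparison" if " vs " in query else "fact"
    # 2. Parallel retrieval: in production, sub-queries run concurrently.
    chunks = search(query, intent)
    # 3. Grounding: keep only chunks from credible, fresh sources.
    grounded = [c for c in chunks if c.credibility >= 0.5 and c.is_fresh]
    # 4. Prompt augmentation: constrain the model to retrieved context.
    context = "\n".join(f"[{c.source}] {c.text}" for c in grounded)
    prompt = f"Answer only from this context, with citations:\n{context}\nQ: {query}"
    # 5-6. Generation: the model synthesises a cited answer.
    answer = llm(prompt)
    # 7. Hallucination guard: drop sentences no grounded chunk supports.
    kept = [s for s in answer.split(". ") if supports(s, grounded)]
    return ". ".join(kept)
```

Note how the low-credibility chunk never reaches the prompt, and the guard strips any generated sentence the verifier cannot tie back to a grounded source.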
The Three Grounding Failure Modes
- Parametric Drift: The model slips from retrieved context back to training data mid-generation. This is the most common failure mode, and the hardest to detect without sentence-level verification.
- Chunk Misalignment: Retrieved chunks are topically relevant but do not actually support the specific claim. Strong grounding systems check at the claim level, not just the document level.
- Stale Context: Retrieved documents were correct at publication but have since been superseded. Recency filtering and source freshness scoring are the primary defences.
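The defences against stale context and chunk misalignment can be approximated in a few lines. The 180-day half-life and the word-overlap threshold below are illustrative assumptions; production systems use learned freshness signals and entailment models rather than token overlap:

```python
from datetime import date, timedelta

def freshness_score(published, today, half_life_days=180):
    # Exponential decay: a chunk loses half its freshness every half-life.
    # The half-life value is an assumption for illustration, not a known default.
    age = (today - published).days
    return 0.5 ** (age / half_life_days)

def claim_supported(claim, chunk_text, min_overlap=0.5):
    # Crude claim-level check: fraction of the claim's content words that
    # appear in the chunk. Checks the specific claim, not mere topical overlap.
    words = {w for w in claim.lower().split() if len(w) > 3}
    if not words:
        return False
    hits = sum(1 for w in words if w in chunk_text.lower())
    return hits / len(words) >= min_overlap
```

A grounding layer would gate each retrieved chunk on `freshness_score` before it enters the prompt, and gate each generated sentence on `claim_supported` before it reaches the user.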
RAG vs Alternatives
- Base LLM: Low accuracy on new information, no citation support
- RAG: High accuracy via real-time sources, full inline citations
- Fine-tuning: Medium accuracy on static domain data, limited citations
- RAG plus Fine-tuning: Highest accuracy, full citations, best for enterprise AI search
What This Means for AI Search Optimisation
- Citations are auditable: every claim links to a source document
- Knowledge stays current: RAG retrieves at query time, not from training memory
- Well-structured, authored, regularly updated pages pass grounding quality filters
- Australian businesses in AI-indexed sectors benefit from schema markup, clear authorship, and, for local operators, consistent NAP (name, address, phone) details
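For the schema markup point above, a minimal JSON-LD `Article` block of the kind retrieval systems can parse for authorship and freshness signals might look like this (values taken from this post's own byline; adapt them to your page):

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "RAG and Grounding: How AI Search Stays Accurate and Trustworthy",
  "author": { "@type": "Person", "name": "Tharindu Gunawardana" },
  "dateModified": "2026-03-25",
  "publisher": { "@type": "Organization", "name": "SearchMinistry Media" }
}
```

The `author` and `dateModified` properties map directly onto the grounding pipeline's source credibility scoring and recency filtering described earlier.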
Patent Reference Summary
- US9916366B1 - Query Augmentation (Google Inc.) - query fan-out
- US20240346256A1 - Response Generation Using RAG (Microsoft) - vector retrieval
- US12131123B2 - Grounded Text Generation (Microsoft) - grounding interface
- US20240256582A1 - Search with Generative AI (pending 2024) - prompt augmentation
- US12536233B1 - AI-Generated Content Page (Google, 2025) - hallucination guard
- US12135740B1 - Unified Metadata Graph via RAG (Citibank, Nov 2024) - enterprise RAG