RAG Over Company Documents — What Actually Works (and What's Hype)

A chatbot answering questions from your company's documents sounds simple, yet most RAG projects that fail do so on data and retrieval quality, not the model. Here's what really works and how to build a RAG system that doesn't make things up.

The idea is tempting: dump all your company docs into AI and, instead of digging through folders, employees just ask a chatbot. That's RAG (Retrieval-Augmented Generation) — and yes, it works. But only if you do it right. In practice most pilots fail not on the language model, but on data and retrieval quality.

This article explains, without the fluff, what really works in RAG and what's just marketing hype.

How RAG works in one paragraph

RAG combines two things: a search engine over your documents and a language model (LLM) that formulates an answer from the retrieved snippets. The key: the model doesn't answer "from memory" — it answers from the context it's given. That grounds responses in your knowledge, not the model's general training, and lets you cite the source.
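
To make "answers from the context it's given" concrete, here is a minimal sketch of the step between retrieval and generation: packing retrieved snippets into a prompt. The `Snippet` type, the `build_prompt` function, the prompt wording and the sample documents are all illustrative assumptions, not a specific library's API; the call to an actual LLM is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    source: str  # hypothetical document identifier, e.g. a file name
    text: str


def build_prompt(question: str, snippets: list[Snippet]) -> str:
    """Assemble a prompt that asks the model to answer only from the snippets."""
    # Number each snippet so the answer can point back to it.
    context = "\n\n".join(
        f"[{i}] ({s.source})\n{s.text}" for i, s in enumerate(snippets, start=1)
    )
    return (
        "Answer the question using only the numbered context below.\n"
        "Cite the snippet numbers you used, e.g. [1].\n"
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up example data, standing in for what a retriever would return.
snippets = [
    Snippet("vacation-policy.md", "Employees get 26 days of paid leave per year."),
    Snippet("onboarding.md", "New hires receive a laptop on their first day."),
]
prompt = build_prompt("How many days of paid leave do we get?", snippets)
print(prompt)
```

The resulting string is what would be sent to the model. Everything the model is allowed to use is visible in it, which is also what makes the answer traceable to a source.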

What actually works

Good data preparation (this is 80% of success)

The most important stage happens before you touch any AI. Documents must be:

  • cleaned (remove duplicates, stale versions, scans without OCR),
  • split into meaningful chunks (chunking) — not blindly every 1000 characters, but by sections and paragraphs,
  • enriched with metadata (department, date, document type) for filtering.
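
As a rough sketch of structure-aware chunking with metadata, the function below splits on headings first and then packs whole paragraphs into chunks. It assumes documents have already been converted to text in which headings start with `#` and blocks are separated by blank lines; the `Chunk` type, `chunk_document`, and the metadata keys are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_document(text: str, metadata: dict, max_chars: int = 1000) -> list[Chunk]:
    """Split on headings first, then pack whole paragraphs up to max_chars."""
    chunks: list[Chunk] = []
    section = "(no heading)"
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered paragraphs as one chunk tagged with its section.
        if buffer:
            body = "\n\n".join(buffer)
            chunks.append(Chunk(body, {**metadata, "section": section}))
            buffer.clear()

    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            # A new heading closes the current chunk, so chunks never span sections.
            flush()
            section = block.lstrip("#").strip()
            continue
        # Start a new chunk rather than cutting a paragraph in half.
        if buffer and sum(len(p) for p in buffer) + len(block) > max_chars:
            flush()
        buffer.append(block)
    flush()
    return chunks


# Made-up example document and metadata.
doc = (
    "# Returns\n\nItems can be returned within 30 days.\n\n"
    "Refunds take 5 days.\n\n# Shipping\n\nWe ship within the EU."
)
chunks = chunk_document(
    doc, {"department": "support", "doc_type": "policy", "date": "2024-01-15"}
)
for c in chunks:
    print(c.metadata["section"], "->", repr(c.text))
```

Note that a single paragraph longer than `max_chars` is kept whole in this sketch; a real pipeline would need a fallback for that case. The metadata travels with every chunk, so it can later be used to filter by department, date or document type.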

Hybrid search

Pure semantic (vector) search gets lost on part numbers, codes and proper names. Combining it with classic keyword search (BM25) matches those exact strings literally while still finding passages that are phrased differently from the question.
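
One common way to combine the two result lists is reciprocal rank fusion (RRF). The article does not prescribe a fusion method, so treat this as one possible sketch: it assumes you already have a ranked list of document ids from a keyword index and another from a vector index, and the ids below are invented. The constant `k = 60` is a conventional default, not a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list get the largest contribution.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Made-up rankings: the keyword index finds the part number, the vector index does not rank it first.
keyword_hits = ["manual-PX-4471", "price-list", "faq"]
vector_hits = ["troubleshooting-guide", "manual-PX-4471", "faq"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF only looks at ranks, so it sidesteps the problem that BM25 scores and vector similarities live on different scales. Documents that both searches agree on rise to the top.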

Source citation

A good RAG always shows which document an answer came from. That builds trust and lets an employee verify the information in one click.
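
A small sketch of how citations can be checked before an answer is shown, assuming the model was asked to cite snippets as `[1]`, `[2]` and so on in the order they were passed in. The `resolve_citations` function and the sample answer are hypothetical.

```python
import re


def resolve_citations(answer: str, sources: list[str]) -> tuple[list[str], list[int]]:
    """Return (cited source names, citation numbers that match no snippet)."""
    cited: list[str] = []
    invalid: list[int] = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        number = int(match.group(1))
        if 1 <= number <= len(sources):
            name = sources[number - 1]
            if name not in cited:
                cited.append(name)
        elif number not in invalid:
            # The model cited a snippet that was never retrieved.
            invalid.append(number)
    return cited, invalid


# Made-up example: two retrieved documents, one citation that points nowhere.
sources = ["vacation-policy.md", "onboarding.md"]
answer = "You get 26 days of paid leave [1]. Unused days carry over [3]."
cited, invalid = resolve_citations(answer, sources)
print("Sources:", cited)
print("Unresolved citations:", invalid)
```

The cited names can be rendered as links to the original documents. An unresolved citation is a useful warning sign that part of the answer may not be backed by the retrieved text.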

What's hype

  • "Just plug in a model and you're done" — without data work you'll get confidently worded nonsense.
  • "The bigger the model, the better" — retrieval quality affects accuracy more than LLM size.
  • "RAG will replace a whole department" — it's a tool that supports people, not a magic employee.

The most common reasons projects fail

  1. Garbage input data — the model is only as good as the documents it gets.
  2. No evaluation — without a test set of questions and answers you don't know whether the system is improving or degrading.
  3. Hallucinations with no guardrails — missing the instruction "if you don't know, say you don't know."
  4. Ignoring permissions — a sales rep shouldn't see HR data through the chatbot.
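
Two of the points above, evaluation and permissions, can be sketched in a few lines. Both functions are illustrative assumptions: `visible_chunks` expects each chunk to carry a hypothetical `allowed_groups` metadata field, and `hit_rate` expects a `retrieve(question, k)` callable that returns document ids. The toy retriever and test questions are made up.

```python
from typing import Callable


def visible_chunks(chunks: list[dict], user_groups: set[str]) -> list[dict]:
    """Keep only chunks whose allowed groups overlap the user's groups."""
    return [c for c in chunks if user_groups & set(c["allowed_groups"])]


def hit_rate(
    test_set: list[tuple[str, str]],
    retrieve: Callable[[str, int], list[str]],
    k: int = 5,
) -> float:
    """Share of test questions whose expected document is in the top k results."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    hits = sum(
        1 for question, expected in test_set if expected in retrieve(question, k)
    )
    return hits / len(test_set)


chunks = [
    {"doc": "salary-bands.pdf", "allowed_groups": ["hr"]},
    {"doc": "price-list.pdf", "allowed_groups": ["sales", "support"]},
]
sales_view = [c["doc"] for c in visible_chunks(chunks, {"sales"})]
print(sales_view)


def toy_retrieve(question: str, k: int) -> list[str]:
    # Stand-in for a real retriever: always returns the same fixed ranking.
    return ["price-list.pdf", "faq.md"][:k]


test_set = [
    ("What does product X cost?", "price-list.pdf"),
    ("How do I reset my password?", "it-handbook.md"),
]
score = hit_rate(test_set, toy_retrieve, k=2)
print(score)
```

The permission filter should run before any text reaches the model, so restricted content never enters the prompt. Re-running `hit_rate` on the same test set after each change to chunking or search shows whether retrieval got better or worse.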

Real cost and time

A sensible RAG pilot (one knowledge area, e.g. technical docs or procedures) takes a few weeks of work. Running costs are mainly model queries and the vector database — at reasonable scale, hundreds of euros a month, not tens of thousands.

FAQ

Will my data go to the model provider?

It doesn't have to. RAG can be built on locally hosted models, in a private cloud, or with a provider that guarantees no training on your data.

Does RAG make mistakes?

Any AI system can. A well-designed RAG system reduces hallucinations and makes the remaining ones easier to catch through source citation and clear communication of uncertainty, which is why we always design it with verification.

From how many documents is it worth it?

Even with just a few dozen frequently used documents, RAG saves real time. The more scattered the knowledge and the more often people search it, the bigger the return.

Summary

RAG over company documents really works — provided you treat it as a data project, not "plugging in AI." The keys are clean data, hybrid search, source citation and continuous evaluation.

At Kajpa Studio we build RAG systems and AI assistants over company knowledge — with a focus on not making things up. Let's talk about your case.
