amberSearch.de

Contextual RAG: Why a context layer outperforms MCP and is the key to production-ready AI in companies

Contextual RAG helps companies make internal knowledge accessible. This article explains how it differs from classic RAG.

Many discussions currently revolve around the best approach to making internal company knowledge quickly and easily accessible. The discussions fluctuate between classic Retrieval Augmented Generation (RAG), providing as much context as possible, and other technical approaches. However, generative AI only delivers real benefits when answers are not only correct but also contextually appropriate and secure. Companies therefore need more than reactive tool queries: they need the necessary context that maps metadata, relationships, and permissions—and intelligently links events between systems. This article explains what contextual RAG is and why the classic RAG approach is not enough to remain competitive in the future.

Classic RAG

Let’s start with a review of classic retrieval-augmented generation (RAG). Classic RAG separates corporate knowledge from the knowledge of an AI model: A language model does not use its training knowledge to answer a query, but first draws the necessary information from available sources and formulates the answer from it. The quality of the retrieval is therefore crucial – it controls which information is included in the answer.

There are essentially two common approaches to retrieval:

1.    Upload-based approach

The approach: Documents are uploaded to an AI platform, indexed, and searched.

The challenge: Important context is lost:

  1. Metadata such as author, project, validity date, or sensitivity is missing or is not consistently transferred.
  2. Access rights (ACLs) are often not indexed or are only roughly replicated.
  3. Changes to documents must always be tracked manually.

Without these signals, the search engine cannot properly weigh relevance, timeliness, and permissions, and responses become inaccurate or insecure. This reduces the quality of the generated answers.
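To illustrate what gets lost, here is a minimal Python sketch of index-time enrichment: each chunk carries metadata and an ACL so that retrieval can filter and rank on them. All field names are illustrative assumptions, not a real amber schema.

```python
from dataclasses import dataclass, field

@dataclass
class IndexedChunk:
    """A document chunk enriched at index time (illustrative fields only)."""
    text: str
    author: str
    project: str
    valid_until: str                       # ISO date; stale chunks can be down-ranked
    acl: set = field(default_factory=set)  # group IDs allowed to read the chunk

def visible_to(chunk: IndexedChunk, user_groups: set) -> bool:
    """A chunk is retrievable only if the user shares a group with its ACL."""
    return bool(chunk.acl & user_groups)

chunk = IndexedChunk(
    text="KT5000 maintenance schedule",
    author="j.doe",
    project="Refrigerated Counter 5000",
    valid_until="2026-01-01",
    acl={"engineering", "service"},
)

print(visible_to(chunk, {"sales"}))        # False: no shared group
print(visible_to(chunk, {"engineering"}))  # True
```

If the upload pipeline drops `author`, `valid_until`, or `acl`, none of these checks can run at query time, which is exactly the gap described above.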

2.    MCP/federated search approach

The approach: The search is delegated to the source system via an MCP or federated search connector to avoid complexity.

The challenge: Knowledge is always used reactively, the context is limited, and there is a dependency on the quality of the external search:

  1. The AI platform relies on the search quality of the connected system (e.g., SharePoint search) and has little influence on ranking, signals, or error handling.
  2. Contextual information (e.g., user profile, history, ongoing projects) can hardly be incorporated in an acceptable time frame, as each query encounters different APIs “on demand.”
  3. Rate limits, latencies, and heterogeneous index pipelines with different search syntaxes make consistent, fast retrieval difficult.

Ultimately, retrieval determines response quality in any RAG approach. If metadata is missing, access rights are not properly enforced, or context is not available in time, even the best language model will produce mediocre results at best, or risk compliance violations. This is where contextual RAG comes in as the necessary next step and alternative:

What is contextual RAG?

Contextual RAG combines classic RAG with an additional context layer: Instead of relying on the search functions of the source systems (e.g., drives, SharePoint, intranet), it creates its own index or knowledge graph that consistently brings together all relevant information. Contextual RAG thus combines the advantages of all approaches and delivers more precise, context-rich answers.

Additional content is modeled in the context layer, including, for example, the following:

  • Metadata (e.g., author, creation date, file type, project assignment)
  • Access rights (based on access control lists (ACLs), role- and object-based, including groups/teams)
  • Relationships and events (e.g., “Ticket A references document B,” “Commit X triggers review Y”)
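As a rough illustration, such a context layer can be thought of as a small graph of nodes and typed edges. The sketch below is a toy model under assumed names, not amber's actual data model:

```python
# Toy context graph: nodes carry metadata and ACLs, edges capture
# cross-system relationships and events (names are illustrative).
graph = {
    "nodes": {
        "doc:B":    {"type": "document", "author": "j.doe", "acl": {"engineering"}},
        "ticket:A": {"type": "ticket",   "status": "open",  "acl": {"support"}},
    },
    "edges": [
        ("ticket:A", "references", "doc:B"),
    ],
}

def related(graph, node_id):
    """Return neighbours of a node, following edges in both directions."""
    out = []
    for src, rel, dst in graph["edges"]:
        if src == node_id:
            out.append((rel, dst))
        elif dst == node_id:
            out.append((rel, src))
    return out

print(related(graph, "doc:B"))  # [('references', 'ticket:A')]
```

Because the relationship "Ticket A references document B" lives in the layer itself, a query about document B can surface the ticket without asking the ticketing system's own search.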

The advantage: Answers are not only drawn from the right sources, but are also authorization- and context-compliant. The terms complement each other as follows:

  • RAG (Retrieval-Augmented Generation): An approach in which a language model retrieves information from a search index/knowledge store while generating responses.
  • Context layer: A comprehensive layer, comparable to a data lake, that links data objects, metadata, permissions, and events across systems to form a consistent context graph.

Why MCP alone is not enough

Many AI platforms started with a classic, upload-based RAG approach and are now realizing that in practice it is at best suitable for a proof of concept. To “simply” connect internal knowledge in the next step, the MCP/federated search approach is then marketed as the fastest and easiest route:

In practice, however, the retrieval quality is insufficient, because the native search of SharePoint & Co. is always used under the hood. Providers therefore resort to an inflationary number of tool calls: instead of a single query with the necessary context, every piece of information and every search “direction” is queried individually. Learn more about the limitations of MCP in this article.

If you would like to learn more about MCP, download our white paper on this topic now. In it, we explain various technical approaches for connecting internal company knowledge in detail:

An example: A mechanical engineering company has developed a product called “Refrigerated Counter 5000.” In some documents, the refrigerated counter is referred to as “Refrigerated Counter 5000,” in others as “KT5000,” and in others as “KT 5000.” Since these different names are not synonyms for the SharePoint search, a provider who uses the MCP approach for retrieval must make at least three queries to search for the refrigerated counter. More complicated contextual information, such as department/user information, etc., is not yet taken into account here.
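The alias problem can be sketched in a few lines: a federated setup must fire one query per spelling, while a context layer that maintains a (here hypothetical) alias table collapses every spelling into a single canonical query:

```python
# Hypothetical alias table; in a context layer, product aliases are
# maintained once instead of being re-queried on every search.
ALIASES = {
    "refrigerated counter 5000": "KT5000",
    "kt 5000": "KT5000",
    "kt5000": "KT5000",
}

def normalize(query: str) -> str:
    """Map any known spelling to its canonical product name."""
    return ALIASES.get(query.lower().strip(), query)

# Federated search: one round trip per spelling.
federated_queries = ["Refrigerated Counter 5000", "KT5000", "KT 5000"]
print(len(federated_queries))                     # 3 round trips

# Context layer: all spellings resolve to one canonical term, one query.
print({normalize(q) for q in federated_queries})  # {'KT5000'}
```

The same normalization step is where user, department, or project context could also be folded into the single query before it hits the index.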

As a result, with an MCP approach the AI does not receive the sections that actually match the required context, but rather “any” documents that the SharePoint search happened to deem relevant for the keyword. To compensate, significantly more context (i.e., more retrieved documents) is loaded into the context window. The AI is thus fed unnecessary information that it must then filter out again on its own, and every piece of unnecessary information costs tokens.

At first, these costs may seem insignificant, but they become uncontrollable when:

  • The solution begins to scale across the workforce
  • Use cases and queries become more complex
  • Processes are automated and the AI platform provides knowledge not only to humans but also to AI agents

Want to try a better approach? Then test amber now. amber relies on contextual RAG and helps you connect your company’s internal knowledge in a scalable way.

Providing knowledge for AI agents via contextual RAG

In the future, knowledge will be prepared not only for humans, but above all for AI agents. Companies must therefore consider how knowledge can be provided in a form that can also be used meaningfully by AI agents. This is exactly where amber’s context layer comes in:

  • amber has all the necessary interfaces to enable Agent2Agent (A2A) communication.
  • Queries to the company’s knowledge base can be triggered via amber as manual input, via API, or via MCP action, while amber answers the query in the background based on the context layer.
  • AI agents can trigger actions in third-party systems via amber.

What makes the amber Context Layer unbeatable

Companies that rely on classic RAG approaches can only act reactively:

  • A ticket is opened → a workflow is triggered
  • Information is sought → a search is launched in the systems

However, the context layer enables companies to recognize connections and patterns and make proactive decisions based on this information. An example:

  • A new document associated with a customer inquiry is indexed and stored in the context layer. Due to semantic proximity, the context layer recognizes that there was a research project on this topic in the past that was discontinued a few months ago. The context layer then proactively informs the responsible employee that there are already relevant activities related to this inquiry.
  • Without this link, the person processing the customer inquiry and the employees from the research project might not know about each other, especially in larger companies, and the inquiry could therefore be rejected incorrectly.
  • In the classic RAG approach, one would have to hope that the correct search query is entered and that the native SharePoint search returns the appropriate results.
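The semantic-proximity step in this example can be sketched with cosine similarity over embeddings. The vectors and the notification threshold below are toy values; a real context layer would use a trained embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings standing in for real document vectors.
inquiry   = [0.9, 0.1, 0.3]   # new customer inquiry
old_study = [0.8, 0.2, 0.35]  # discontinued research project
unrelated = [0.0, 1.0, 0.0]   # some other document

THRESHOLD = 0.9  # assumed cutoff for proactive notification
for name, vec in [("old_study", old_study), ("unrelated", unrelated)]:
    if cosine(inquiry, vec) > THRESHOLD:
        print(f"notify owner: inquiry relates to {name}")
```

Because the check runs when the new document is indexed, the link is found proactively rather than waiting for someone to type the right search query.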

The context-based approach pursued by amber reliably closes such knowledge gaps by automatically recognizing semantic relationships and pointing out synergies to the relevant people in a targeted manner.

Dealing with access rights

One of the most important points here is the restriction that not every employee is allowed to see every piece of information. Different access rights exist within the company, which must be taken into account across requests.

There are two main approaches to taking access rights into account:

  • Pre-retrieval filter: The search is performed on behalf of the user, so only documents approved for the user are taken into account.
  • Post-retrieval checks: All documents are searched on behalf of the AI platform, regardless of access rights. Before the results are processed by the AI, the documents are checked to see whether the user has access, and documents without access are sorted out before a response is generated.
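The two approaches can be contrasted in a short Python sketch; the mini corpus and group names are invented for illustration:

```python
DOCS = [
    {"id": 1, "text": "quarterly numbers", "acl": {"finance"}},
    {"id": 2, "text": "product manual",    "acl": {"everyone"}},
]

def pre_filter_search(query, user_groups):
    """Pre-retrieval: only documents the user may see enter the search at all."""
    candidates = [d for d in DOCS if d["acl"] & user_groups]
    return [d for d in candidates if query in d["text"]]

def post_filter_search(query, user_groups):
    """Post-retrieval: search everything, drop forbidden hits afterwards."""
    hits = [d for d in DOCS if query in d["text"]]
    return [d for d in hits if d["acl"] & user_groups]

print(pre_filter_search("manual", {"everyone"}))    # doc 2 is found
print(post_filter_search("numbers", {"everyone"}))  # [] — hit filtered out
```

Both variants return only permitted documents; the difference is that pre-retrieval filtering never spends ranking or token budget on documents the user cannot see.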

While the classic MCP/federated search approach depends on whatever the connected data source offers, amber maps access rights directly in the context layer (via so-called access control lists, ACLs). With contextual RAG, all access rights can therefore be taken into account from the start, and only documents the user is permitted to see are searched and processed. This is enforced consistently across all connected systems, including on-premise systems such as drives.

If automated queries are triggered, e.g., by an agent or API, they must always specify a profile that determines which information/capabilities may be used for the respective query.
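A profile-scoped automated query could look roughly like this; the profile structure and capability names are assumptions for illustration, not amber's actual API:

```python
# Hypothetical profiles for automated (agent/API) queries. Each profile
# fixes which groups are used for ACL filtering and which capabilities
# the caller may invoke.
PROFILES = {
    "support-bot": {"groups": {"support", "everyone"}, "capabilities": {"search"}},
}

def automated_query(profile_name: str, capability: str) -> set:
    """Resolve a profile and enforce its capability list before any retrieval."""
    profile = PROFILES.get(profile_name)
    if profile is None:
        raise PermissionError("unknown profile")
    if capability not in profile["capabilities"]:
        raise PermissionError(f"profile may not use {capability!r}")
    return profile["groups"]  # the groups used for ACL filtering downstream

print(automated_query("support-bot", "search"))
```

The key point is that an agent never queries "as the platform": every automated request is pinned to a profile whose groups feed the same ACL filtering as a human user's query.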

Conclusion

Many companies are just beginning to introduce AI or have not yet started. Long-term success and scalability will depend largely on the architecture chosen. The context layer developed by amber provides the perfect foundation for making knowledge available not only to humans but also to AI agents in the long term.

If you are wondering how you can use AI yourself, then let’s get in touch now. You can find our contact form here: