Multi-hop question answering (multi-hop Q&A) is a method for generating higher-quality answers in generative AI applications. As a further expansion stage of a Retrieval Augmented Generation (RAG) system, it aims to consider as many aspects of the prompt as possible before an answer is generated. This blog article deals with advanced techniques in the field of information retrieval. If you would like to gain a basic understanding of this area first, we recommend reading our blog articles “What is an enterprise search?”, “What is retrieval augmented generation?” and “Technical basics for the introduction of generative AI”.

Retrieval Augmented Generation as a precursor to Multi-Hop Q&A

Companies that use generative AI today typically rely on a retrieval augmented generation system. Such a system usually consists of three steps:

  1. All information is brought into a vector-based index so that it can be processed for a large language model.
  2. As soon as the user enters a question, the system searches for the most relevant information.
  3. The search results are passed to the language model as context, and the answer is generated from them.

Figure 1 Simplified process of Retrieval Augmented Generation
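The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the bag-of-words `embed` function stands in for a real embedding model, the sample documents are invented, and the final call to a language model is left out because it depends on the provider you use.

```python
import math
import re
from collections import Counter

# Hypothetical demo documents; a real index would hold company data.
DOCUMENTS = [
    "Acme GmbH supplies high-precision ball bearings.",
    "The cafeteria menu changes every Monday.",
    "Bolt AG is a supplier of ball bearings and seals.",
]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Step 1: store every document together with its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, question, top_k=2):
    """Step 2: return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def build_prompt(question, context):
    """Step 3 (first half): restrict the model to the retrieved context."""
    passages = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{passages}\n"
        f"Question: {question}"
    )


index = build_index(DOCUMENTS)
question = "Who are our suppliers for high-precision ball bearings?"
context = retrieve(index, question)
prompt = build_prompt(question, context)
# The prompt would now be sent to a language model of your choice.
print(prompt)
```

In this toy setup the two supplier documents are retrieved and the unrelated cafeteria note is left out of the prompt.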

This method already has the advantage that the search can take into account factors such as how up to date a document is or who has access rights to it. The generative AI then does not need to be trained with company-specific expertise; it can simply use its general capabilities to generate the answer. However, it may only rely on the context provided (i.e. the previously found search results). This massively reduces the risk of hallucinations, which is a problem when, for example, an AI model is trained with your own data.
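How a search step might take access rights and topicality into account can be illustrated as follows. The `Hit` fields, the group names and the exponential recency decay with a one-year half-life are all assumptions made for this sketch; real systems may weight these signals quite differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hit:
    title: str
    score: float               # relevance from the search step
    modified: date             # last modification date
    allowed_groups: frozenset  # groups that may read the document


def filter_and_rank(hits, user_groups, today, half_life_days=365):
    """Drop hits the user may not see, then rank by recency-adjusted score."""
    visible = [h for h in hits if h.allowed_groups & user_groups]

    def adjusted(hit):
        # Assumed decay: the score halves every half_life_days.
        age_days = (today - hit.modified).days
        return hit.score * 0.5 ** (age_days / half_life_days)

    return sorted(visible, key=adjusted, reverse=True)


# Hypothetical search hits for illustration only.
HITS = [
    Hit("Old supplier list", 0.90, date(2020, 1, 1), frozenset({"sales"})),
    Hit("Current supplier list", 0.80, date(2024, 1, 1), frozenset({"sales"})),
    Hit("Salary overview", 0.95, date(2024, 1, 1), frozenset({"hr"})),
]

ranked = filter_and_rank(HITS, user_groups={"sales"}, today=date(2024, 7, 1))
print([hit.title for hit in ranked])
```

Only the filtered and re-ranked hits would be handed to the language model as context, so a user never receives an answer based on documents they are not allowed to read.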

The weakness of this method, on the other hand, is that the upstream search can only focus on one topic. The results are therefore one-dimensional and are not sufficient for more complex prompts.

When is Multi-Hop Q&A used and how does it work?

Multi-hop Q&A is used when a user submits a multi-dimensional prompt to a generative AI system. To illustrate this, let’s take the following example:

A one-dimensional question would be, for example: “Who are our suppliers for high-precision ball bearings?”

In this case, a simple retrieval augmented generation system is sufficient, as the semantic focus is on “suppliers of high-precision ball bearings”.

A multi-dimensional prompt, by contrast, would be:

“Send me a cold call email to our buyer persona CTO and state our USPs.” To answer this prompt satisfactorily, the system must be able to answer three key questions:

  • What does a cold call email look like?
  • What is known about the buyer persona CTO?
  • What are our USPs?

Only when it has the answers to these questions can it generate an answer that takes all dimensions of the prompt into account.
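The idea of answering each key question with its own search can be sketched like this. The mini knowledge base, its document ids and the word-overlap `search` function are invented for illustration, and the sub-questions are hard-coded here, whereas in a real system a language model would typically derive them from the prompt.

```python
import re

# Hypothetical mini knowledge base; ids and texts are invented for illustration.
CORPUS = {
    "email_template": "A cold call email has a short subject line, a hook and one call to action.",
    "persona_cto": "Buyer persona CTO: cares about security, integration effort and scalability.",
    "usps": "Our USPs: fast setup, respects access rights, hosted in Germany.",
    "canteen": "The canteen is closed on Fridays.",
}

# In a real system a language model would derive these from the prompt.
SUB_QUESTIONS = [
    "What does a cold call email look like?",
    "What is known about the buyer persona CTO?",
    "What are our USPs?",
]


def tokens(text):
    """Lower-cased set of words and numbers in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(corpus, query):
    """Return the id of the document sharing the most words with the query, or None."""
    query_tokens = tokens(query)
    best_id, best_overlap = None, 0
    for doc_id, text in corpus.items():
        overlap = len(query_tokens & tokens(text))
        if overlap > best_overlap:
            best_id, best_overlap = doc_id, overlap
    return best_id


def gather_context(corpus, sub_questions):
    """Run one search hop per sub-question and collect the hits in order."""
    hops = []
    for sub_question in sub_questions:
        doc_id = search(corpus, sub_question)
        if doc_id is not None:
            hops.append((sub_question, doc_id))
    return hops


hops = gather_context(CORPUS, SUB_QUESTIONS)
for sub_question, doc_id in hops:
    print(f"{sub_question} -> {doc_id}")
```

Each hop retrieves a different document, so the context passed to the language model covers the email format, the persona and the USPs rather than a single topic.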


Figure 2 Comparison of single-hop vs. multi-hop Q&A

Unlike the first example, which was a single-hop Q&A, this prompt requires several intermediate steps (multi-hop) to answer (Figure 2).

Autonomous agents in multi-hop Q&A

What distinguishes a multi-hop Q&A system is that it is based on an autonomous agent approach. Based on the results of the first search, it decides for itself whether it already has enough results to generate an answer, and therefore whether it needs two, three or more intermediate steps to provide a sufficient answer.
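The control flow of such an agent can be sketched as a simple loop. Everything here is a hypothetical stand-in: `decide_next_query` would normally be a language model judging the evidence collected so far, `fake_search` replaces a real search backend, and `max_hops` is an assumed safety limit so the loop always terminates.

```python
def run_agent(prompt, search, decide_next_query, max_hops=5):
    """Search repeatedly until the policy judges the evidence sufficient."""
    evidence = []
    for _ in range(max_hops):
        query = decide_next_query(prompt, evidence)
        if query is None:  # the agent considers the evidence sufficient
            break
        evidence.append((query, search(query)))
    return evidence


def make_checklist_policy(required_queries):
    """Stand-in for an LLM decision: keep searching until every query has a result."""

    def decide_next_query(prompt, evidence):
        answered = {query for query, result in evidence if result}
        for query in required_queries:
            if query not in answered:
                return query
        return None

    return decide_next_query


def fake_search(query):
    """Placeholder for a real search backend."""
    return f"results for: {query}"


policy = make_checklist_policy(
    ["cold call email format", "buyer persona CTO", "our USPs"]
)
user_prompt = "Send me a cold call email to our buyer persona CTO and state our USPs."
evidence = run_agent(user_prompt, fake_search, policy)
for query, result in evidence:
    print(query, "->", result)
```

The number of hops is not fixed in advance: the loop ends as soon as the policy returns `None`, or when the hop limit is reached.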

Use cases of multi-hop Q&A

This technology makes it possible to cover use cases where a single-hop retrieval augmented generation system would fail. These include, for example:

  • Answering customer enquiries, which often contain several questions or requests in one query.
  • Creating a strong “ChatGPT-like” system that can fulfil more complex tasks.
  • Dividing more complex search queries into several intermediate steps, even in pure search systems, and thus generating better answers.

The added value of multi-hop Q&A therefore lies in the significantly higher quality of the answers, especially for complex questions, and in the more flexible prompts that such a set-up makes possible.


Figure 3 An example of how multi-hop Q&A is used in amberSearch

Figure 3 shows another example from practice. The task is to generate a short knowledge article. First, a search for US import standards for CNC milling machines is started. The software then searches for comparisons between the home country and the USA, and after that for knowledge articles that may already exist. In the last step, the results are aggregated. In the references below, you can see how information from different sources such as network drives, Confluence, Outlook or Haiilo is taken into account to answer this question. These examples and others can be reproduced using this link.

Applications of multi-hop Q&A within amberSearch and amberAI

Back in 2020, we at amberSearch trained our first large language models and published what was, for a time, the most efficient German-language LLM. We now offer information retrieval systems for companies so that employees can access internal company data quickly and efficiently.

In order to remain the German market leader in information retrieval systems in the field of generative AI/LLMs, we are constantly developing our solutions further. This is why we use a multi-hop approach at various points in amberAI, e.g. to supplement complex search queries or to solve demanding prompts with internal company expertise. In our online demo, we have stored a medium-sized demo data set of six-digit size across more than 11 systems so that anyone can try out our solution without any problems: