Hard coal mining in Germany is history, but the expert knowledge built up over 150 years should be preserved. But how can information hidden in millions of documents on numerous servers be made efficiently usable? To solve this problem, RAG is relying on technical innovation and artificial intelligence (AI) to unlock valuable knowledge and make it easily accessible to employees. The company is receiving support from Aachen-based start-up amberSearch (formerly ambeRoad), whose software is designed to take the company-wide search engine to the next level.
After a long development and test phase, implementation of the new system in RAG’s IT architecture is scheduled for the second half of 2022. During the extensive pilot phase, the first users from different areas of the company were able to access the AI-supported search. In addition to helpful suggestions for further optimization, the development team received consistently positive feedback: “My colleagues were amazed at how I could suddenly get the information I needed so quickly,” was one comment. “The speed at which the search results are displayed is unbelievable,” said another.
At the end of 2018, RAG had to finally cease active coal mining and is still in the midst of an extensive change process characterized by staff reductions. Effective ways of preserving and accessing expert knowledge are thus in great demand. The collected knowledge lies dormant in various data silos such as network drives, the Microsoft SharePoint platform or the company’s internal search engine DSA (Digital Service File). Step by step, RAG is digitizing historical analog data – and several terabytes are added every week. Mining plans and pit outlines from the days of industrial mining now provide the basis for day-to-day operations. While the workforce is shrinking, the mountain of data is growing steadily. To ensure that valuable information is not lost in the flood of legacy data and that the search for it becomes faster and more efficient, RAG looked for new solutions through the DataHub program. The initiative of Gründerallianz Ruhr brings together established companies from the Ruhr region with the best start-ups worldwide to work together on an innovative solution to a problem in a three-month cooperation. The DataHub has set itself the task of utilizing and promoting the data potential of the Ruhr region. Participating start-ups are given the chance of a pilot project as well as up to 20,000 euros. RAG’s specific mission was to provide high-quality answers to users’ questions when searching for data with the help of AI.
As clever as an experienced colleague
Since the previous search via RAG’s internal “Digital Service File” (DSA) platform was not designed to answer complex questions, the aim was to make the results in future as clever, complete and precise as if you were asking an experienced colleague sitting next door in the office. The DSA collects tags from multiple systems and thousands of documents. Users searching for specific keywords receive a list of results that can be filtered by source system or time period. Using intelligent search algorithms, startup amberSearch made the vast store of knowledge that RAG has built up over its corporate history quickly and efficiently searchable by individual users, so that valuable content is not lost when employees leave the company.
The problem is widespread. In companies and organizations around the world, valuable time is wasted every day on tedious searches for information, documents and forms. Larger companies with a long history are particularly familiar with the challenge of making information easily available, especially as the ever-growing volume of data is pushing conventional systems to their limits. A survey by SearchYourCloud shows just how much the search for information can affect productivity in the workplace. According to the survey, one third of the employees questioned need between 5 and 25 minutes each time to find a specific document. What is more, in 80 percent of cases, up to eight attempts were required to get to the right result at all. This adds up to a large amount of time that managers often underestimate. Here is an example: If each employee spends an average of 30 minutes a day searching, a company with several hundred employees is already “wasting” six-figure personnel costs per month. The economic benefit of an effective search function can therefore hardly be overestimated.
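The example above can be made concrete with a short back-of-the-envelope calculation. The headcount (300), the fully loaded hourly cost (40 euros) and the number of working days per month (21) are illustrative assumptions, not figures from RAG or from the survey.

```python
def monthly_search_cost(employees, minutes_per_day, hourly_cost_eur, working_days=21):
    """Estimate the monthly personnel cost of time spent searching.

    All inputs are assumptions chosen for illustration.
    """
    hours_per_day = minutes_per_day / 60
    return employees * hours_per_day * working_days * hourly_cost_eur


# Hypothetical company: 300 employees, 30 minutes of searching per day,
# 40 EUR fully loaded cost per hour, 21 working days per month.
cost = monthly_search_cost(employees=300, minutes_per_day=30, hourly_cost_eur=40.0)
print(f"{cost:,.0f} EUR per month")
```

Under these assumptions the result is already in the six-figure range; halving the daily search time would halve the cost.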
By working with amberSearch, RAG wants to streamline and optimize its business processes in a targeted manner. After the DataHub tender, an interdisciplinary team reviewed the submitted solution approaches and finally decided in favor of the start-up from Aachen. The young founders enable their customers to access their company’s internal knowledge easily and intuitively, helping them save time, money and nerves and improve their workflow. To do this, they use the latest developments in various areas of artificial intelligence, such as natural language processing, deep learning and computer vision. In the joint project KISS42, the search software amberSearch was developed and adapted to the requirements of the post-mining company. In KISS42, by the way, KISS stands for Artificial Intelligent Search System and 42 for the answer to the “question of all questions” from Douglas Adams’ novel “The Hitchhiker’s Guide to the Galaxy.”
AI first had to learn the miners’ language
For the software to provide correct answers to specific mining-related questions at all, it first had to learn that in the language of mining, with all its technical terms and peculiarities, words such as Rauben, Walsum or West and East can have meanings other than the conventional ones. “Training our algorithms to use these special terms was a huge challenge at first, but ultimately led to users being efficient and, above all, satisfied with our search,” says Philipp Reißel, strategic product developer and co-founder of amberSearch. In addition, during the trial phase, the search engine learned which types of data it should be able to read in order to deliver them to users in a way that fits their query: geodata, office documents from Excel to PowerPoint, PDFs, but also maps, graphics, images and much more. The sources pose another challenge, as the data is often stored in different places such as SharePoint team rooms or databases, and comes from Staffbase, ELO or the company-wide intranet. “Nothing less than software that can answer questions like an expert should be at the end of the project,” says Steffen Bechert from the Location and Geoservices division, describing what was expected of the intelligent search engine.
To ensure that amberSearch understands the problems and wishes of users thoroughly, the focus was on a constant exchange with them. This is intended to ensure that the newly developed solution is fully accepted. RAG is convinced that the extra work will pay off in the long run. It was clear to the team early on that simply optimizing the normal search function would not be enough to guarantee access to the ever-growing body of knowledge for all employees in the long term. Instead, AI-based features were to ensure that the advanced search engine far surpassed previous software in terms of speed and quality of results. As is common with well-known Internet search services, users were also to receive correct answers to fully formulated questions rather than being limited to searching for individual keywords.
The collaboration was characterized by agile project management, which allowed for many small steps and adjustments. While in classic project management the result is defined conclusively right at the beginning, here an initial concept draft, a prototype and a pilot phase were followed by many extensive and necessary feedback rounds in order to optimize the AI search step by step and to fix any bugs, user problems and the like that arose. This created a dynamic that led to an even better understanding of each other. The variable project result was thus able to grow with each new requirement. In the prototype, for example, the initial focus was on intelligent text search across multiple data sources. Later, the desire arose to also integrate other media such as images or scanned documents, which the start-up immediately incorporated into the next prototype. “For us, the project was a successful start to the use of AI methods in the field of geoinformation and search technology. The know-how and innovative spirit of amberSearch opened up completely new possibilities for the intelligent use of our databases,” emphasizes Peter Vosen, geodata expert at RAG and head of the KISS42 project.
Analysis within fractions of a second
The end product should correspond to what employees are used to from the Internet. This is not an easy task, considering that searching the Web and searching within a company are fundamentally different. Information on the Internet is generally available to everyone, so the provider hardly needs to take access rights into account. In addition, data on websites is available in more uniform file formats in the cloud. Links between different websites make it easy to classify and evaluate results, and in case of doubt, huge amounts of user data can compensate for inaccuracies in the software. Within a company, these advantages largely do not apply. Since amberSearch’s software therefore has to cover many different tasks, the start-up developed it in separate sub-areas, so-called containers, which together make up the overall solution. Before a query is routed to the various data silos, the software first analyzes the question using an intelligent, customer-specific language model, interpreting it at the semantic level and selectively expanding it with synonyms. For the search itself, the start-up uses a so-called vector-based index, which not only compares keywords with each other but also analyzes relationships between sentences and paragraphs on a more abstract level, within fractions of a second. An intelligent re-ranking model then sorts multiple text snippets by relevance, and the results are combined with additional content from various file formats, such as images. In the final step, the results are displayed to the user in a clear format.
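The stages described above (query expansion with synonyms, vector-based retrieval, re-ranking) can be illustrated with a deliberately simplified sketch. It is not amberSearch’s implementation: the synonym table, the function names and the sample documents are hypothetical, plain term-count vectors stand in for the embeddings of a trained language model, and the re-ranking step is a simple coverage heuristic rather than a learned model.

```python
import math
from collections import Counter

# Hypothetical synonym table. A real system would derive such expansions
# from a customer-specific language model, not a hand-written dict.
SYNONYMS = {"shaft": ["pit"], "plan": ["map"]}


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def expand_query(query):
    """Append known synonyms to the query tokens."""
    tokens = tokenize(query)
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(SYNONYMS.get(tok, []))
    return expanded


def embed(tokens):
    """Toy stand-in for an embedding: a sparse term-count vector."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, top_k=3):
    # Stage 1: vector retrieval with the expanded query.
    q_vec = embed(expand_query(query))
    scored = [(cosine(q_vec, embed(tokenize(doc))), doc) for doc in documents]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    shortlist = scored[: top_k * 2]

    # Stage 2: re-rank the shortlist. Here: prefer documents that contain
    # more of the user's original (unexpanded) query terms.
    original = set(tokenize(query))

    def rerank_key(item):
        score, doc = item
        coverage = len(original & set(tokenize(doc))) / max(len(original), 1)
        return (coverage, score)

    shortlist.sort(key=rerank_key, reverse=True)
    return [doc for _, doc in shortlist[:top_k]]


# Hypothetical sample documents.
docs = [
    "Pit map of the northern field",
    "Canteen menu for this week",
    "Shaft plan and water drainage report",
]
results = search("shaft plan", docs)
print(results)
```

In this toy example, the synonym expansion lets the query also retrieve the "pit map" document, which shares no word with the original query, while the re-ranking step places the document that matches the user's own wording first.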
The amberSearch development team places great emphasis on high user-friendliness and therefore largely does without filter options. “We realized that our users rarely if ever used the filtering features on web search engines. Therefore, we decided to use this only to a very limited extent in amberSearch as well and to replace filter functions with a combination of different AI-based models,” said Igli Manaj, co-founder and technical product developer of amberSearch. In various pilot projects, this has proven to be the right way to go.
There is no adequate provider on the market
“Digital search and find” is not a new subject area. However, the rapid progress of various AI disciplines has created entirely new ways of dealing with large amounts of data. Existing vendors often offer large, comprehensive solutions, but these are costly to maintain and significantly less efficient. RAG was looking for new, measurable and less expensive solutions. In addition, the requirement was to be able to search across the boundaries of different data silos and file formats. Since RAG was unable to find an adequate provider for this specific task on the market, the company decided to issue an international challenge for start-ups through Gründerallianz.
“It’s always exciting when startups collaborate with corporations; two cultures sometimes collide,” says Julian Reinauer of amberSearch. “The cooperation with RAG has shown that both sides can learn from each other. On the one hand, corporations benefit from the speed and agility of startups, while on the other hand, startups gain insights into the processes and structures of large corporations.”
Further file formats and features planned
In addition to RAG, the Aachen-based founders have since acquired other major customers and are conducting test phases with them or planning further cooperation. In order to accelerate growth and further develop the product, the team around the four founders Igli Manaj, Julian Johannes Reinauer, Philipp Reißel and Bastian Maiworm has already closed a funding round. The solutions developed so far are to be made available to the majority of employees in the near future. Further features for which several customers have expressed a need are already planned, such as the ability to search additional file formats like 3D models or technical drawings. In addition, interfaces to communication tools such as Teams, Slack or e-mail have been developed for the AI-based search, taking access rights into account. In the future, internal company data could also be enriched with external data. The possibilities seem endless.
Info box: RAG
For 150 years, industrial coal mining shaped the coalfields in North Rhine-Westphalia and Saarland. RAG Aktiengesellschaft is assuming long-term responsibility for post-mining operations following the phase-out of coal mining in 2018. By handling the so-called eternity tasks, the company is helping to regulate the water balance in the mining regions, both underground and above ground. The top priority is to protect drinking water and the environment. The tasks include mine water drainage, polder measures above ground as well as groundwater purification and groundwater monitoring at contaminated sites. RAG’s other tasks include the settlement of mining-related damage to buildings, land or infrastructure, as well as the rehabilitation of old shafts and the dismantling of operating facilities.
Info box: DSA
RAG’s company-wide search engine is called the Digital Service File (DSA) and provides access to around 350 different topics. Users can use it to access geodata, information and the digital mine plan archive. The company-wide platform was launched in 2011 and has been continuously developed since then. As an integration and information service on the intranet, it provides uniform access to all important data and collects keywords from several other systems, such as GIS, Staffbase or ELO, and from thousands of documents (for example, in SharePoint, shared folders and file systems). Furthermore, about 160,000 mine plans are available via the integrated mine plan archive. In addition, there is an integrated aerial photo archive with RAG’s own aerial photos from past decades. Mine plans that employees previously had to view in the archive or obtain via the previous system, which was not particularly user-friendly, are now displayed at the click of a mouse.
Users can search by tags and get a list of search results that can be filtered in various ways, such as by time period or source system. Another feature is the creation of individual profiles within the DSA and the ability to save individual data queries. The system also provides templates for individual departments. It is possible for employees to export profiles and send them to colleagues, which facilitates collaboration and project-related documentation. Taking into account the respective access rights, DSA also allows extensive search functions in connected source systems such as team rooms and file systems as well as Staffbase, ELO and GIS. It is also possible to jump to the corresponding documents or source systems. RAG thus offers its employees a company-specific version of Google Search and Google Maps.
Info box: amberSearch
amberSearch is a deep-tech start-up from Aachen, Germany, that focuses on finding information within unstructured data. The founding team around Philipp Reißel, Julian Reinauer, Igli Manaj and Bastian Maiworm originally started with a slightly different idea. Over time, the vision matured of a company-internal search engine modeled on Internet search engines, so that employees could also quickly access relevant information within the company. After several success stories, the start-up raised growth financing in order to grow further. With its search engine, amberSearch prevents the frustration that employees feel when they cannot find knowledge that is available in the company. To do this, the team relies on different types of artificial intelligence and thus enables a search that takes company-specific circumstances into account.