Successful Integration of Enterprise Data Requires a Thorough Edge Presence


By Leo Gergs | 2Q 2024 | IN-7336

Currently, up to several Petabytes (PB) of data are generated on a factory floor every year, but due to high fragmentation and rigid data silos, only a small fraction is actually used by manufacturers. With Artificial Intelligence (AI) use cases and Large Language Models (LLMs) requiring a vast amount of data to be trained properly, the value of harmonizing, standardizing, and integrating data from various sources is becoming apparent to a growing number of manufacturers. Consequently, data integration was a pivotal topic at this year’s Hannover Messe in April. This ABI Insight compares different approaches to integrating enterprise Operational Technology (OT) data, as presented by software vendors on the show floor in Hannover, and highlights the importance of the on-premises edge in this context. Based on this analysis, this ABI Insight offers actionable recommendations for how different technology providers can increase their presence at the on-premises edge to embrace this new challenge.



Data Management and Data Integration as Key Topics at This Year's Hannover Messe


In April 2024, Hannover Messe once again set the stage for a remarkable display of technological innovations, highlighting how rapid advancements are shaping the future of industry. As technologies like Artificial Intelligence (AI), which require copious amounts of data to unfold their true transformative potential for enterprise verticals, gain traction, manufacturers around the world show increased interest in aggregating and harmonizing the data they already gather in order to feed Large Language Models (LLMs) for AI training. At the same time, soaring energy costs put a price tag on transferring and processing data, pushing manufacturers to appreciate the cost of data collection and to ensure that these data are used as effectively as possible. After all, depending on the exact industry, several Petabytes (PB) of data are gathered from the factory floor each year. Consequently, the normalization, contextualization, and integration of enterprise Operational Technology (OT) data for further treatment were important subjects of discussion at Hannover Messe this year.

A range of exhibitors presented their approaches to, and products for, this integration challenge. Microsoft presented its Data Fabric software, which relies heavily on contextualizing, harmonizing, and normalizing data in the cloud. AWS’ Industrial Data Fabric (IDF) solution, on the other hand, relies on edge deployments on enterprise premises for data integration, while tasks such as LLM training are performed in the cloud. In a similar fashion, Hewlett Packard Enterprise (HPE) showcased its Ezmeral software offering for data integration at its booth in Hannover.

Different Approaches to Data Integration


At present, however, the data gathered by enterprises’ OT teams are fragmented and domain specific. An average manufacturing floor, for example, might gather data on the number of pieces of production equipment (a purely numeric value), the condition of a certain machine (qualitative data), or the date of the last maintenance (a temporal value). As these heterogeneous data are exceedingly difficult to use for further processing and for training LLMs for AI use cases, only a fraction of the enterprise data are structured and used for further processing. There are, broadly speaking, three main approaches to data integration:

  1. Data Fabric at the Edge: This architecture is particularly suited for industries such as manufacturing and healthcare, where immediate data analysis can lead to improved operational efficiency and outcomes. By leveraging data fabric at the edge, organizations streamline data accessibility and analytics, enabling a more agile and informed decision-making process directly at the source of data generation. In this approach (adopted, for example, by AWS through its Industrial Data Fabric), no raw enterprise data have to leave the enterprise site: all raw data are interpreted and contextualized on-site, and only aggregated data are sent to the cloud.
  2. Data Fabric in the Cloud: Cloud-based data fabrics, such as Microsoft’s Data Fabric, are particularly effective in supporting complex data ecosystems that include real-time data streaming, large-scale analytics, and Machine Learning Operations (MLOps), allowing organizations to harness the full potential of their data assets. This strategy not only enhances operational agility, but also drives innovation by enabling more complex and data-intensive applications. However, as this requires highly sensitive enterprise data to leave the enterprise premises to be integrated at the cloud level, it will only be applicable to a very select group of verticals with less critical use cases.
  3. Data Integration through Integration Middleware: A third approach was presented at Hannover Messe by a range of specialized software vendors, such as soffico, which deploy their software as middleware. While such a solution provides seamless integration for certain use cases with centralized data management, its proprietary nature introduces a single-vendor dependency. Furthermore, these middleware solutions can induce performance bottlenecks and do not scale easily.
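
To make the edge-based approach more concrete, the following minimal Python sketch shows how heterogeneous readings (numeric, qualitative, and temporal) might be normalized on-site, with only an aggregated summary prepared for the cloud. It is an illustration under stated assumptions, not any vendor's implementation: the `Reading` type, the `normalize` and `aggregate_for_cloud` functions, and all field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Union

# Hypothetical value types mirroring the three kinds of OT data named above.
Value = Union[int, float, str, date]


@dataclass(frozen=True)
class Reading:
    """One raw OT data point as it might arrive at an on-premises edge node."""
    machine_id: str
    metric: str
    value: Value


def normalize(reading: Reading) -> dict:
    """Map a heterogeneous raw value to a common record with an explicit kind."""
    v = reading.value
    if isinstance(v, date):
        kind, value = "temporal", v.isoformat()
    elif isinstance(v, (int, float)):
        kind, value = "numeric", float(v)
    else:
        kind, value = "qualitative", str(v).strip().lower()
    return {
        "machine_id": reading.machine_id,
        "metric": reading.metric,
        "kind": kind,
        "value": value,
    }


def aggregate_for_cloud(readings: list) -> list:
    """Summarize numeric metrics per machine; raw records are never returned."""
    groups = {}
    for record in map(normalize, readings):
        if record["kind"] == "numeric":
            key = (record["machine_id"], record["metric"])
            groups.setdefault(key, []).append(record["value"])
    return [
        {
            "machine_id": machine_id,
            "metric": metric,
            "count": len(values),
            "mean": mean(values),
            "max": max(values),
        }
        for (machine_id, metric), values in sorted(groups.items())
    ]


if __name__ == "__main__":
    raw = [
        Reading("press-1", "temperature_c", 70),
        Reading("press-1", "temperature_c", 80.0),
        Reading("press-1", "condition", " Worn "),
        Reading("press-1", "last_maintenance", date(2024, 4, 1)),
    ]
    # Only this summary would be transmitted; the raw readings stay on-site.
    print(aggregate_for_cloud(raw))
```

In this sketch, the qualitative and temporal values are normalized but kept local; which fields may leave the premises is a policy decision that a real deployment would need to make explicitly.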

Extensive Edge Presence and Partnership Networks Are Key to Success


ABI Research is convinced that, for providers of data integration solutions to industrial verticals, on-premises edge deployments will be an important determinant of success for a number of reasons:

  • Industrial enterprises place a lot of emphasis on network integrity, as their OT data are highly critical and confidential. No industrial enterprise would want to see data about the condition of its production machines leave its own premises, as this could give competitors access to highly sensitive information. Similarly, enterprises will not be prepared to see any safety-critical data leave their premises, because they want to retain maximum control at any given time.
  • A range of industrial applications, particularly safety-critical ones, often require real-time processing capabilities. An on-premises edge deployment can minimize latencies, as it ensures that OT data are processed as close to the data source as possible.
  • An on-premises edge deployment reduces the need to transmit large volumes of data to a central cloud for processing, which lowers bandwidth costs and the load on the network. It also ensures that transmissions to the cloud are reserved for data that benefit applications residing in a central cloud, such as the training of LLMs.

There are several ways in which software vendors, hyperscalers, and telco providers can strengthen their footprint at the on-premises edge for industrial environments:

  • Partner with Edge Hardware Vendors for End-to-End Data Integration Solutions: Software developers and service providers should look at edge/Internet of Things (IoT) hardware vendors for potential partnerships/co-creation initiatives. Such a partnership can ensure that the data fabric software is optimized for performance on the preferred hardware used by manufacturers, providing a seamless and efficient solution that is easier to adopt.
  • Develop Tailored Integration Software That Embraces the On-Premises Edge: In line with this, software vendors and service providers should develop their products around the on-premises edge and the unique features it offers for industrial deployments. Only then can software vendors develop applications that are directly relevant to the industrial world, such as real-time analytics, Machine Learning (ML) capabilities for predictive maintenance, and integration with existing industrial control systems.
  • Look at Specialized Data Integration & Contextualization Software Vendors for Possible Collaboration: To drive their data integration ambitions even further, software vendors should look at smaller players that are highly specialized in contextualizing and harmonizing industrial data. Establishing this expertise in-house would not only be resource-intensive, but also take considerable time and, therefore, increase the Time to Market (TTM) unnecessarily. Instead, the large hyperscalers should engage with players like HighByte to determine the best possible way of collaborating.


