
Machine Data Integration: OPC UA, MQTT & Cloud Architecture

By Mark Kobbert · Last updated: April 2026

What machine data integration actually is — and why it is a fundamentally different discipline from machine data acquisition, even though most vendors conflate the two

Machine data integration is the architectural discipline of turning raw, heterogeneous signals from thousands of controllers — PLCs, CNCs, robots, test benches, energy meters — into a coherent, canonical, governable data layer that every downstream application in the enterprise can consume without custom per-machine work. The distinction that matters, and that is almost never drawn clearly in the industry's marketing materials, is this: acquisition is getting the signal out of one machine; integration is making ten thousand signals from five hundred different machines look like one consistent system. Acquisition is solved by a competent electrician with a 24-volt tap and an OPC UA client. Integration is solved only by an architecture — a namespace, a contract, a topology, a security posture, a governance model — that survives a decade of plant expansions, acquisitions, protocol drift, and vendor changes.

I have spent the last twelve years building that integration layer for the SYMESTIC Cloud-MES platform — today running on Microsoft Azure with microservice architecture, ingesting telemetry from 15,000+ machines across 18 countries on four continents, with availability measured in nines rather than percentages. This article is the architecture view of the problem. For the commissioning view — how to actually extract the signal from a specific machine in a specific plant — I recommend the companion article on production data acquisition by my colleague Martin, who has been solving that problem at the controller level since 1991. The two articles are deliberately complementary: he solves the signal-at-the-plant problem; this article solves the signal-at-the-platform problem. Both are required. Neither is sufficient on its own.

Acquisition versus integration — the disciplinary distinction that changes how the problem is solved

The clearest way to see the difference is to ask what each discipline is optimising for. They look similar at the surface; they are entirely different problems underneath.

Dimension	Acquisition (per machine)	Integration (per platform)
Scope	One machine, one controller, one protocol.	Ten thousand machines, fifty protocols, twenty sites.
Unit of work	Tag, cycle, signal.	Namespace, schema, contract, topology.
Primary optimisation	Correctness at the source (right value, right timestamp).	Consistency at scale (same meaning, same model, same SLA everywhere).
Typical actor	Automation engineer, commissioning engineer, plant electrician.	Platform architect, data engineer, security architect, SRE.
Time horizon	Hours to days per machine.	Years. An integration architecture lives longer than any individual machine on it.
Failure mode when done badly	Signal is wrong, missing, or mis-timestamped.	The signal is right at every individual machine — and the cross-site dashboard still tells you nothing, because no two machines name the same thing the same way.

The symptomatic distinction in real factories: an organisation that has solved acquisition but not integration can tell you, with microsecond precision, the cycle time of press number three. It cannot tell you the average cycle time of all presses of the same type across all sites — because each site's PLC engineer named the tag differently, modelled the setup state differently, and counted cycles against a slightly different trigger condition. Every cross-site question becomes a multi-week data-reconciliation project. The data is captured; the data is not integrated. The symptom scales with the number of sites, not with the number of machines — which is why the pain usually becomes visible only when the second or third plant comes online.

ISA-95 as the integration contract — why the standard is a data model, not just a hierarchy

The ISA-95 standard (IEC 62264 in its international form) is often reduced in popular writing to its five-level equipment-hierarchy diagram — Enterprise / Site / Area / Work Centre / Work Unit. That hierarchy matters, but it is less than half of what the standard actually delivers. The other half — Part 2 (object models) and the derived B2MML XML Schemas — is the part that makes enterprise machine-data integration tractable. Without it, every customer builds their own ad-hoc JSON shape and invents their own definitions of "production order", "material lot", "equipment element", "personnel class", "process segment". With it, you get a ready-made, internationally reviewed contract for the objects that cross the factory-to-enterprise boundary.

A mature integration architecture uses ISA-95 in three distinct ways:

ISA-95 element	Role in the integration architecture
Equipment Hierarchy (Part 1)	The canonical asset path every signal is tagged against. No tag exists in the platform without a full ISA-95 path from Enterprise down to the emitting Work Unit.
Object Models (Part 2)	The canonical shape of Production Orders, Material Lots, Personnel, Process Segments — consumed by every downstream service. A new MES application does not design its own order schema; it consumes the ISA-95 one.
B2MML (Part 5 + schemas)	The transaction format on the ERP↔MES boundary. SAP IDocs, Dynamics events, Infor envelopes are all wrapped into B2MML-equivalent shapes — so the integration is standard-shaped even when the counterparties are not.

The practical consequence: when a platform treats ISA-95 as a contract, a new machine onboarded in the Hungarian plant on Tuesday is semantically indistinguishable from a machine of the same type onboarded in the Mexican plant two years ago. Same fields, same units, same state model, same lineage. Every analytic, every dashboard, every AI service written once works across all of them. When a platform does not, and every integration is a bespoke JSON mapping, the same company discovers five years later that its fleet of "1,200 connected machines" is actually forty-two separate data silos that happen to share a login screen.

The Unified Namespace — the canonical addressing model that makes cross-site integration feasible

The Unified Namespace (UNS) is an architectural pattern that has matured in the IIoT community over the last decade and is, in my experience, the single most consequential design choice a cloud-MES platform can make. Its core idea: every piece of state in the enterprise — every machine, every production order, every material lot, every sensor reading, every alarm — is addressable through a single, hierarchically structured namespace that every client can subscribe to. The namespace is the contract. The transport (MQTT, Kafka, AMQP, in our case a combination) is implementation detail.

A SYMESTIC-shaped UNS path looks like this:

Segment	Example	Contract
`tenant`	`meleghy`	Customer identifier; enforces multi-tenant isolation at the message boundary.
`enterprise / site`	`emea / wilnsdorf`	ISA-95 Part 1, Levels 1–2. Data residency is routed against this.
`area / work-centre`	`pressing / line-03`	ISA-95 Part 1, Levels 3–4.
`work-unit`	`press-sc-03`	Equipment element — survives machine replacement.
`signal`	`cycle-counter`, `state`, `temperature`	Canonical signal name from the platform's signal catalogue — not the PLC's tag name.

The full path — meleghy/emea/wilnsdorf/pressing/line-03/press-sc-03/cycle-counter — is the address. Any consumer in the platform (OEE engine, alarm service, AI assistant, dashboard) subscribes to this address or a wildcard pattern above it. No consumer knows or cares that behind that address sits a Siemens S7-1500 published through OPC UA via an Azure IoT Edge gateway with a Sparkplug-B bridge; another instance of the same signal could be a Modbus register polled from a frequency converter, and the consumer would not notice. That abstraction is the integration. Building it is the work.
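The addressing scheme above can be sketched in a few lines. This is a minimal illustration, assuming the seven-segment path shown in the table and MQTT-style wildcards (`+` for one level, `#` for the remainder); the class and function names are hypothetical and not part of any SYMESTIC API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnsAddress:
    """One fully qualified UNS address (hypothetical helper type)."""
    tenant: str
    enterprise: str
    site: str
    area: str
    work_centre: str
    work_unit: str
    signal: str

    def topic(self) -> str:
        # Join the segments in the order given in the table above.
        return "/".join((self.tenant, self.enterprise, self.site, self.area,
                         self.work_centre, self.work_unit, self.signal))

    @classmethod
    def parse(cls, topic: str) -> "UnsAddress":
        parts = topic.split("/")
        if len(parts) != 7 or not all(parts):
            raise ValueError(f"not a full UNS path: {topic!r}")
        return cls(*parts)


def matches(pattern: str, topic: str) -> bool:
    """MQTT-style match: '+' matches one level, a trailing '#' matches the rest."""
    p, t = pattern.split("/"), topic.split("/")
    for i, seg in enumerate(p):
        if seg == "#":
            return i == len(p) - 1  # '#' is only valid as the last segment
        if i >= len(t):
            return False
        if seg != "+" and seg != t[i]:
            return False
    return len(p) == len(t)


addr = UnsAddress.parse(
    "meleghy/emea/wilnsdorf/pressing/line-03/press-sc-03/cycle-counter")
# An OEE engine interested in every signal of the pressing area:
print(matches("meleghy/emea/wilnsdorf/pressing/#", addr.topic()))
```

A consumer written against the pattern never sees the protocol behind the address, which is the point of the abstraction.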

The positive pattern I call The Canonical Contract: every gateway, no matter what it reads from the machine side, publishes into the UNS using canonical signal names, canonical units, canonical state codes, canonical timestamps — as defined in a platform-wide signal catalogue maintained as a first-class artefact. The negative pattern it replaces, The Namespace Babel, is the default in organisations that grew by acquisition: every site has its own tag conventions; every conversion between sites is a per-pair mapping; cross-site analytics are impossible without a data-engineering project.
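As a rough sketch of what The Canonical Contract means at the gateway, the following maps source-side tags onto catalogue entries and normalises units before anything is published. The tag addresses (`DB10.DBD4`, `40012`), the catalogue contents, and the payload field names are all invented for illustration; a real catalogue would be a managed artefact rather than a dictionary in code.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CatalogueEntry:
    canonical_name: str
    unit: str


@dataclass(frozen=True)
class TagMapping:
    entry: CatalogueEntry
    convert: Callable[[float], float]  # source unit -> canonical unit


def fahrenheit_to_celsius(value: float) -> float:
    return (value - 32.0) * 5.0 / 9.0


# Hypothetical platform-wide signal catalogue.
CATALOGUE: Dict[str, CatalogueEntry] = {
    "temperature": CatalogueEntry("temperature", "degC"),
}

# Hypothetical per-gateway mappings: two machines, two protocols, one signal.
MAPPINGS: Dict[str, TagMapping] = {
    "DB10.DBD4": TagMapping(CATALOGUE["temperature"], lambda v: v),        # S7, already degC
    "40012": TagMapping(CATALOGUE["temperature"], fahrenheit_to_celsius),  # Modbus, degF
}


def to_canonical(source_tag: str, raw_value: float, timestamp_ms: int) -> dict:
    """Translate one raw reading into the canonical payload shape."""
    mapping = MAPPINGS.get(source_tag)
    if mapping is None:
        # Unmapped tags are rejected rather than published under the PLC's name.
        raise KeyError(f"no catalogue mapping for source tag {source_tag!r}")
    return {
        "signal": mapping.entry.canonical_name,
        "unit": mapping.entry.unit,
        "value": mapping.convert(raw_value),
        "ts": timestamp_ms,
    }
```

Both readings leave the gateway as `temperature` in `degC`; the consumer cannot tell which controller or protocol produced them.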

The hybrid OPC UA plus MQTT topology — a role-based architecture, not a religious war

OPC UA versus MQTT is the most tediously recurring debate in the IIoT space, and it is almost always framed incorrectly as "which is better." The two protocols are optimised for different roles in the architecture, and a competent platform uses both, deliberately. The allocation that has survived in the SYMESTIC architecture and in every other serious cloud-MES platform I have examined:

Layer	Protocol	Why
Machine ↔ Gateway	OPC UA (where available), Siemens S7 / Modbus / digital I/O (where not)	OPC UA carries information models — the machine describes itself with types, units, and structure. Critical for semantic integration. Legacy protocols used as fallback, then enriched at the gateway.
Gateway ↔ Cloud	MQTT with Sparkplug B, TLS 1.3, outbound-only	Pub/sub at scale. Lightweight. Firewall-friendly (outbound TCP 8883). Sparkplug B adds the industrial semantics missing from plain MQTT: birth/death certificates, publisher state, sequence numbers, and a typed metric payload.
Cloud internal (event bus)	Kafka-class event streaming (Azure Event Hubs in our case), AMQP for transactional paths	Durable event log, consumer groups, replay. The internal event bus is where services decouple.
Cloud ↔ ERP / 3rd-party	REST + JSON (OAuth 2), B2MML for ISA-95 transactions, SAP IDoc / OData where legacy	Enterprise boundary; request/response plus event webhooks. Version-managed APIs with semantic versioning.
Cloud ↔ Web / mobile clients	HTTPS + WebSocket for live dashboards; REST for CRUD	Browser-compatible; CDN-cachable; plays well with identity providers (Azure AD).

The mental model that makes this work: OPC UA is the semantic layer; MQTT is the distribution layer; Kafka is the integration backbone; REST is the enterprise-facing boundary. Each serves a role that the others would do badly. A platform that tries to push OPC UA client/server sessions all the way into the cloud (as some vendors do) breaks at ~500 machines, because the client/server model was never designed for fan-out messaging at internet scale. A platform that tries to bypass OPC UA and use MQTT directly at the machine loses the information model and inherits every vendor's ad-hoc tag soup. The hybrid is not a compromise; it is the architecture.

Sparkplug B on cloud scale — why birth, death, and state are the difference between reliable and unreliable integration

Plain MQTT is a thin wire protocol with no industrial semantics: it transports arbitrary payloads between publishers and subscribers. That is insufficient for integration at scale, because the subscriber cannot answer three questions that matter for every operational use case:

Is the absence of a message "the machine is idle" or "the gateway is dead"?
Is a newly-arriving payload a full state refresh or a delta from a state the subscriber never saw?
Has the publisher restarted since the last message, and if so, are its sequence numbers still meaningful?

Eclipse Sparkplug B solves all three through an explicit lifecycle protocol on top of MQTT: every publisher registers a Death certificate as its MQTT last-will message when it connects, then sends a Birth message (declaring the set of metrics it will publish, with types and aliases), and can subscribe to the STATE topic of the primary host application to learn whether the consuming side is online. Subscribers can distinguish "no update because value unchanged" from "publisher offline" from "publisher rebooted" in a fully deterministic way. Sequence numbers, timestamps, and QoS-1 delivery of the Death certificate complete the picture.

At SYMESTIC scale — ten-thousand-plus gateways publishing concurrently — Sparkplug B is the difference between an integration that degrades gracefully and one that silently corrupts the central data model. The antipattern this prevents I call The Silent Outage: a gateway whose network dies, whose last-will was never configured, whose absence the cloud interprets as "all machines idle"; the OEE dashboard shows green for four hours; a supervisor takes decisions against data that has not been fresh since lunchtime. Sparkplug B's Death certificate turns this failure mode from undetectable into an explicit, timestamped, alerted event.
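The subscriber-side logic that turns a Silent Outage into an explicit event can be sketched as a small state machine. This is a simplified model of the Sparkplug B lifecycle, assuming already-decoded messages (no MQTT client, no protobuf decoding); the class and method names are hypothetical. It relies on two facts from the specification: the per-message sequence number wraps from 255 to 0, and a Death certificate carries the `bdSeq` of the Birth it belongs to.

```python
from enum import Enum
from typing import Optional


class NodeState(Enum):
    UNKNOWN = "unknown"  # no Birth seen yet; nothing may be assumed
    ONLINE = "online"    # Birth seen, sequence unbroken
    OFFLINE = "offline"  # matching Death certificate received
    STALE = "stale"      # sequence gap; data untrusted until the next Birth


class NodeTracker:
    """Tracks one edge node as seen by a cloud-side subscriber."""

    def __init__(self) -> None:
        self.state = NodeState.UNKNOWN
        self.bd_seq: Optional[int] = None
        self.last_seq: Optional[int] = None

    def on_birth(self, bd_seq: int, seq: int) -> None:
        # A Birth always resets the session: full metric set, fresh sequence.
        self.state = NodeState.ONLINE
        self.bd_seq = bd_seq
        self.last_seq = seq

    def on_data(self, seq: int) -> bool:
        """Return True if the data message may be applied to the model."""
        if self.state is not NodeState.ONLINE:
            return False
        if seq != (self.last_seq + 1) % 256:
            # A message was lost: mark the node stale instead of guessing.
            self.state = NodeState.STALE
            return False
        self.last_seq = seq
        return True

    def on_death(self, bd_seq: int) -> None:
        # Ignore a Death that belongs to an older session than the current Birth.
        if bd_seq == self.bd_seq:
            self.state = NodeState.OFFLINE
```

A dashboard built on this tracker shows "offline" or "stale" rather than "idle" when the gateway disappears; in a real deployment a STALE node would additionally be asked to re-send its Birth.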

Store-and-forward plus event sourcing — the architecture that makes integration survivable

Machine data integration runs over networks. Networks fail. Any architecture that does not assume network failure as a first-class operating condition will lose data the first time a plant's fibre link is cut by a construction crew. Two architectural patterns, layered, make integration survivable:

Store-and-forward at the gateway. Every edge gateway maintains a local persistent queue (in our implementation: a write-ahead log on the gateway's local SSD, with configurable retention up to 72 hours at full signal volume). When the cloud link goes down, the gateway continues capturing at full rate; when the link returns, it replays from the durable log in sequence, deduplicated by the cloud. No data loss across ordinary network events. The gateway is, by design, a small but complete MQTT broker plus durable store.
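A minimal sketch of the store-and-forward idea, using SQLite from the Python standard library as a stand-in for the write-ahead log described above. The table layout, class name, and `send` callback are assumptions made for the example; retention limits and batching are left out.

```python
import sqlite3
from typing import Callable, List, Tuple


class StoreAndForwardQueue:
    """Durable outbox: append always succeeds locally, delete only after ack."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path on the gateway's local disk to survive restarts.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "topic TEXT NOT NULL, payload TEXT NOT NULL)")
        self._db.commit()

    def append(self, topic: str, payload: str) -> int:
        cur = self._db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)", (topic, payload))
        self._db.commit()
        return cur.lastrowid

    def pending(self, limit: int = 100) -> List[Tuple[int, str, str]]:
        return self._db.execute(
            "SELECT seq, topic, payload FROM outbox ORDER BY seq LIMIT ?",
            (limit,)).fetchall()

    def ack(self, up_to_seq: int) -> None:
        self._db.execute("DELETE FROM outbox WHERE seq <= ?", (up_to_seq,))
        self._db.commit()


def drain(queue: StoreAndForwardQueue,
          send: Callable[[int, str, str], bool]) -> int:
    """Replay pending messages in order; stop at the first failed send."""
    sent = 0
    for seq, topic, payload in queue.pending():
        if not send(seq, topic, payload):
            break  # link is down again; keep the rest for the next attempt
        queue.ack(seq)
        sent += 1
    return sent
```

Because a message is deleted only after the send callback confirms it, a crash between send and ack produces a duplicate rather than a loss, which is why the cloud side has to deduplicate.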

Event sourcing in the cloud backbone. Every incoming telemetry event is written once, immutably, to the event log (Azure Event Hubs / equivalent Kafka-class store) before any consumer processes it. The log, not the consumers, is the system of record. A new consumer — a new OEE algorithm, a new alarm rule, a new AI service — can rewind the log and compute against historical events as if it had been subscribed from the beginning. Bug fixes replay cleanly. Schema migrations replay cleanly. This is the same pattern that powers modern banking core systems and large-scale recommendation engines, and it is the correct pattern for manufacturing telemetry for the same reasons: the events are the truth, and the projections are just views over the truth.

The combination of a durable log at the edge, a durable log in the cloud, and idempotent consumers between them produces an integration that is crash-consistent: once the gateways have drained their backlog, the cloud state equals the sequence of events that actually occurred on the shop floor, regardless of the sequence of network or service failures in between. The antipattern this prevents I call The Signal Flood: a consumer crashes, backs up, restarts, re-subscribes to a live stream, misses the 48,000 events that accumulated during its outage, and quietly carries an incorrect state forever. With event sourcing, the restarted consumer resumes from its last acknowledged offset in the log, so the gap cannot occur as long as the log's retention covers the outage.
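The replay-and-idempotence idea can be illustrated with an in-memory log. This is a toy model, not Azure Event Hubs or Kafka: `EventLog`, `CycleProjection`, and the `cycles` signal name are hypothetical, and a real consumer would persist its offset and deduplication state.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Event:
    gateway_id: str
    seq: int        # per-gateway sequence number, used for deduplication
    signal: str
    value: float


class EventLog:
    """Append-only log; consumers read by offset and never mutate it."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int) -> List[Tuple[int, Event]]:
        return [(i, e) for i, e in enumerate(self._events) if i >= offset]


class CycleProjection:
    """A view over the log: total cycles, computed idempotently."""

    def __init__(self) -> None:
        self.total = 0.0
        self.next_offset = 0
        self._seen: Set[Tuple[str, int]] = set()

    def catch_up(self, log: EventLog) -> None:
        # Resume from the last processed offset, never from "now".
        for offset, event in log.read_from(self.next_offset):
            key = (event.gateway_id, event.seq)
            if key not in self._seen:  # a replayed duplicate is a no-op
                self._seen.add(key)
                if event.signal == "cycles":
                    self.total += event.value
            self.next_offset = offset + 1
```

A projection created a year after the first event calls `catch_up` once and arrives at the same total as one that has been running all along, because both are computed from the same log.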

Multi-tenant isolation and data residency — the boundary that enterprise integration crosses every day

A serious cloud-MES platform is multi-tenant by architecture and multi-region by regulation. Every machine signal crosses three boundaries before it becomes a KPI on a dashboard:

Boundary	Enforcement mechanism
Tenant isolation (customer A cannot see customer B)	Tenant ID is a mandatory first segment in every UNS path, enforced at the MQTT broker ACL layer, at every cloud service authorization check, and at the storage partition level. Not a convention — a broker rule.
Data residency (EU data stays in EU, China data stays in China)	Regional Azure deployments; tenant assignment to a specific region at onboarding; no automatic cross-region replication of telemetry; replication of configuration only with explicit customer opt-in.
Role-based access (a shift-leader in site A does not see site B)	Azure AD-backed RBAC with claims tied to ISA-95 scopes: a user's token declares "enterprise=x, sites=[a,b]" and every query is filtered through that claim set at the data layer.

These are not optional features. For German mid-market manufacturers subject to GDPR, for Chinese subsidiaries subject to PIPL, for US defence suppliers subject to ITAR, the integration architecture must enforce residency at the topology level, not at the application level. A platform that stores all telemetry in a single region and tries to handle residency with database column filters is a platform that will lose its German customers to a GDPR audit finding. The correct answer is structural: tenant-to-region binding, enforced at the broker, from the first message onwards.
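The first and third boundaries in the table reduce to checks on the leading segments of the UNS path. The sketch below assumes the `tenant/enterprise/site/...` layout used earlier; the type names and claim structure are illustrative and do not reproduce any specific broker's ACL syntax or Azure AD token format.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class GatewayIdentity:
    gateway_id: str
    tenant: str  # bound to the gateway's certificate at provisioning time


def may_publish(identity: GatewayIdentity, topic: str) -> bool:
    """Broker-side rule: a gateway publishes only under its own tenant."""
    segments = topic.split("/")
    if "+" in segments or "#" in segments:
        return False  # wildcards are never valid in a publish topic
    return len(segments) > 1 and segments[0] == identity.tenant


@dataclass(frozen=True)
class UserClaims:
    tenant: str
    enterprise: str
    sites: FrozenSet[str]  # ISA-95 scopes carried in the user's token


def may_read(claims: UserClaims, topic: str) -> bool:
    """Data-layer rule: filter every query through the ISA-95 claim set."""
    s = topic.split("/")
    return (len(s) >= 3
            and s[0] == claims.tenant
            and s[1] == claims.enterprise
            and s[2] in claims.sites)
```

The design point is that both checks run on the address itself, so they hold for every service that touches the message, not only for the ones that remember to apply a filter.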

Zero-trust between OT and cloud — the security posture inverted from classical plant networks

The traditional OT security model is the castle-and-moat: a plant network isolated behind a firewall, with no inbound or outbound connections except for tightly controlled VPN tunnels. This model is incompatible with cloud integration, and attempting to preserve it by punching holes through the firewall for inbound cloud access is how the worst security incidents in industrial IoT have occurred. The correct model, which I call The Trust Boundary Inversion when I teach it to plant IT teams, flips the assumption: the plant is no longer a trusted zone that the cloud reaches into; the plant is an untrusted zone that initiates outbound connections to a trusted, well-defended cloud endpoint.

The architectural rules that follow from this inversion:

Gateways initiate only outbound connections — typically MQTT over TLS 1.3 on TCP 8883, with certificate-pinned endpoints. No inbound ports open anywhere on the plant network. No VPN tunnels into the factory.
Mutual TLS (mTLS) everywhere between gateway and cloud — every gateway has its own X.509 certificate, issued by the platform CA, with a short validity window (90 days), rotated automatically. A compromised gateway cert can be revoked within minutes.
Scoped credentials — each gateway authenticates only to its tenant's namespace; broker ACLs prevent it from publishing or subscribing to any other tenant's topics even if its credentials are stolen.
No cloud-to-plant control paths by default — control operations (order dispatch, recipe download, alarm acknowledgement) happen via the gateway's outbound session, not via inbound cloud-to-gateway connections. The cloud requests; the gateway fetches and acts; no firewall changes required at the plant.
Audit trail of every cross-boundary event — which gateway connected, from which IP, with which cert, to publish which topic, at which timestamp. The audit trail, as covered in the companion article, is not optional infrastructure; it is the evidential backbone of integration security.
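On the gateway side, the first two rules translate into a client TLS configuration along these lines. This is a sketch using only the Python standard library `ssl` module; the function name and file-path parameters are placeholders, and certificate pinning, rotation, and the MQTT client itself are deliberately out of scope.

```python
import ssl
from typing import Optional


def gateway_tls_context(ca_file: Optional[str] = None,
                        cert_file: Optional[str] = None,
                        key_file: Optional[str] = None) -> ssl.SSLContext:
    """Client-side context for an outbound-only, mutually authenticated link."""
    # SERVER_AUTH means "we are the client and will verify the server";
    # this default context requires a valid certificate and checks the hostname.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)

    # Refuse anything older than TLS 1.3.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3

    # Present the gateway's own X.509 certificate so the broker can
    # authenticate it (the "mutual" half of mTLS).
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

An MQTT client library would then be handed this context and connect outbound to the broker on TCP 8883; nothing in the configuration opens a listening socket on the plant network.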

The practical test of a cloud-MES vendor's security posture: ask them to describe the architecture without using the word "firewall" as the primary defence. If they cannot — if their security model is "put the gateway in a DMZ and hope" — they have not made the trust-boundary inversion. If they describe outbound-only, mTLS, cert rotation, and ACL-scoped topics as the defaults, they have built for the cloud era rather than retrofitted on-premise patterns into it.

The integration tax — the quantifiable cost of the wrong architectural choice

The single most useful question to ask when evaluating the architectural maturity of any machine-data integration is: how long does it take to onboard the next machine of a type you already support, on a site you already operate, for a customer you already serve? The answer decomposes the entire architecture into one number.

Architecture pattern	Time-to-onboard next known machine type	Integration tax
Per-machine custom mapping	3–10 days of engineer time per machine, forever.	High. Grows linearly with fleet. Typical of classical on-premise MES deployments.
Per-site custom mapping with shared templates	½–2 days per machine.	Medium. Better than per-machine, still scales poorly across tenants.
Canonical UNS with typed signal catalogue	15 minutes to 2 hours for a known machine type; fully self-service for the customer's own team.	Near-zero. The architectural investment has retired the recurring cost.

The antipattern I call The Integration Tax is the first row of this table institutionalised — an organisation that has accumulated hundreds of per-machine custom mappings, where every new machine requires a two-week consulting engagement, where the rate of new integrations is bounded by the number of senior integration engineers on staff. This is, in practice, the state of most legacy on-premise MES deployments; it is what makes them so expensive to scale and so resistant to expansion. A canonical UNS eliminates this tax structurally, by paying its cost once, up-front, in platform engineering — and collecting the dividend for the next fifteen years.
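To make the table concrete, the per-machine ranges can be multiplied out for a fleet. The calculation below uses the ranges from the table as given and assumes an eight-hour engineering day for converting hours to days; that conversion factor is an assumption of the example, not a figure from the table.

```python
from typing import Dict, Tuple

HOURS_PER_DAY = 8  # assumed working day, used only to convert hours to days

# (low, high) onboarding effort per machine, in hours, from the table above.
EFFORT_HOURS: Dict[str, Tuple[float, float]] = {
    "per-machine custom mapping": (3 * HOURS_PER_DAY, 10 * HOURS_PER_DAY),
    "per-site shared templates": (0.5 * HOURS_PER_DAY, 2 * HOURS_PER_DAY),
    "canonical UNS": (0.25, 2.0),
}


def onboarding_days(machines: int) -> Dict[str, Tuple[float, float]]:
    """Total engineer-days (low, high) to onboard a fleet under each pattern."""
    return {
        name: (machines * low / HOURS_PER_DAY, machines * high / HOURS_PER_DAY)
        for name, (low, high) in EFFORT_HOURS.items()
    }


for pattern, (low, high) in onboarding_days(500).items():
    print(f"{pattern}: {low:g} to {high:g} engineer-days")
```

For a 500-machine fleet this works out to 1,500 to 5,000 engineer-days with per-machine mapping, 250 to 1,000 with per-site templates, and roughly 16 to 125 with a canonical UNS.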

From the SYMESTIC cloud-native rebuild in the mid-2010s: the decision that set the architectural direction for the next decade came from a customer conversation I had early in the platform redesign. We had, at that point, several mid-sized customers each running a version of the on-premise product, each with its own ad-hoc mapping layer between their machines and the MES database. One of them, a Tier-1 automotive supplier expanding aggressively through acquisition, came to us with a simple request: they had just bought two new plants in Eastern Europe and one in Mexico; they wanted the same dashboards they had in their German headquarters, live, within six weeks. The honest answer, given the architecture we had at the time, was "that is a twelve-month integration project for each plant, and the dashboards will need to be rebuilt per plant because the signal names will not match." Their reaction was polite and devastating: "then we cannot work with a platform like yours, because our strategy is to buy two new plants every year for the next five years."

That conversation crystallised something I had felt for eighteen months but had not yet articulated. The on-premise MES industry had built itself around a business model where integration was a billable service (two weeks per machine, eight weeks per plant, six months per rollout) and the margin on that service was a meaningful part of the revenue. The cloud-native business model could not work that way. If onboarding cost two weeks per machine, our unit economics fell apart before we had a hundred customers. If onboarding cost twenty minutes per machine, we could scale to the tens of thousands of machines and dozens of countries that today's platform actually serves. The choice was not between "slightly better" and "current". The choice was between an architecture that scaled and one that did not, full stop.

What we built from that conversation (the canonical signal catalogue, the unified namespace, the typed gateway contract, the ISA-95-aligned asset graph, the self-service onboarding portal, the Sparkplug-B bridge, the outbound-only security posture) is the architecture I have just described in this article. Twelve years later, the proof is the operational data. A new Carcoustics plant that joined the platform last year had 500+ machines onboarded in six months, the first 50 of them in the first three weeks, with the customer's own team doing most of the work by week four. Twenty years ago that same integration would have been a five-year programme for a blue-chip systems integrator. The customer is the same shape; the software generation is what changed. The companies that understand this shift, that integration at scale is architecture, not labour, are the ones building global manufacturing intelligence. The companies that do not are still paying The Integration Tax, one machine at a time, and wondering why their digital-transformation programme is underperforming three years in.

The six antipatterns of machine data integration — and the architecture decisions that prevent them

Antipattern	What it looks like	Architectural antidote
The Namespace Babel	Every site names signals differently; cross-site analytics require multi-week reconciliation.	Unified Namespace with platform-enforced canonical signal catalogue.
The Integration Tax	Per-machine custom mappings; 3–10 days of engineer time per onboarding; fleet growth bounded by integration staff.	Typed gateway contract with self-service onboarding against the canonical catalogue; the exception is the integration project, not the default.
The Silent Outage	Gateway network dies without announcing it; cloud interprets absence as idle machines; dashboards show green while nothing is happening.	Sparkplug-B birth/death protocol; last-will messages; heartbeat monitoring; explicit state transitions on gateway lifecycle events.
The Signal Flood	Consumer crashes, backs up, restarts, re-subscribes to live stream only; misses thousands of events from the outage window; state silently diverges.	Event-sourced ingestion; durable log as system of record; idempotent consumers; replay from the last acknowledged offset on restart.
The Trust Boundary Inversion (failed)	Cloud integration implemented by punching inbound firewall holes into the plant; first-class security incident in waiting.	Outbound-only gateway connections; mTLS with short-lived certs; no inbound plant-facing ports; control paths via outbound-initiated sessions.
The Cloud-Lift Pretence	Legacy on-premise MES lifted into a hyperscaler VM and marketed as "cloud"; inherits every scaling limit of the original architecture.	Cloud-native rebuild: microservices, managed services, horizontal auto-scaling, durable event bus, tenant-by-topology isolation. Not a migration; a re-architecture.

Decision matrix — which integration architecture for which situation

Situation	Right architecture
Single site, < 50 machines, moderate growth plans.	An MES with native OPC UA connectivity is sufficient. A canonical UNS matters less at this scale, but will matter later; pick a platform that has one even if you do not need it on day one.
Multi-site, > 100 machines, cross-site analytics required.	Non-negotiable: canonical UNS, typed signal catalogue, hybrid OPC UA + MQTT, event-sourced cloud backbone. Anything less becomes The Namespace Babel within two years.
Multi-country, GDPR / PIPL / data residency requirements.	Tenant-to-region binding enforced at the broker and storage layer; no cross-region telemetry replication; explicit residency-selection UI at onboarding.
Regulated industry (pharma packaging, automotive safety) requiring full audit trail.	Event-sourced backbone (replayable truth), immutable audit log of configuration changes, audit trail architecture as a first-class platform feature — not a log file.
Heavy brownfield, machines 1990s–2010s without OPC UA.	Edge gateways with multi-protocol drivers (S7, Modbus, digital I/O) that publish into the canonical UNS on the gateway side. See the companion PDA article for the plant-side approach.
Aggressive acquisition-driven growth (new plants every 6–12 months).	Self-service onboarding portal against a stable, well-documented canonical catalogue; customer-team-first rollout model; target: first 50 machines in three weeks without vendor-led engineering.

FAQ

What is machine data integration in one sentence?
Machine data integration is the architectural discipline of turning heterogeneous signals from thousands of machines into a coherent, canonical, governable data layer that every downstream application can consume without custom per-machine work — distinct from, and sitting above, the per-machine acquisition problem.

What is the difference between machine data integration and machine data acquisition?
Acquisition is the signal-at-the-machine problem: how do you extract a correct, timestamped value from one specific controller. Integration is the signal-at-the-platform problem: how do you make ten thousand signals from five hundred different machines look like one consistent, queryable system. Acquisition is solved by a commissioning engineer in hours per machine; integration is solved by an architecture that lives for a decade.

Is OPC UA enough on its own?
For the machine-to-gateway leg, OPC UA is the correct first choice where available — because it carries information models, not just raw tags. For the gateway-to-cloud leg at scale, OPC UA is the wrong choice; pub/sub-oriented transports (MQTT Sparkplug B, Kafka) handle fan-out and cloud-friendly ingestion far better. A mature architecture uses OPC UA for semantics at the edge and MQTT/Kafka for distribution and storage in the cloud. The "OPC UA vs. MQTT" debate is usually a symptom of misunderstanding both.

What is a Unified Namespace (UNS)?
The Unified Namespace is an architectural pattern in which every piece of state in the enterprise — every machine, order, material lot, sensor reading — is addressable through a single, hierarchically structured namespace. Any application subscribes to the namespace (or a wildcard pattern within it) rather than integrating against individual systems. The namespace is the contract; the underlying transports (MQTT, Kafka, REST) are implementation details. A UNS is the single most consequential architectural choice a cloud-MES platform can make for multi-site integration.

What is The Integration Tax?
The Integration Tax is the recurring per-machine integration cost that accumulates when a platform lacks a canonical signal catalogue — every machine is a custom mapping, every new onboarding is an engineering project, and fleet growth is bounded by the headcount of available integration engineers. Eliminated structurally by a canonical UNS plus typed gateway contract; the cost is paid once, in platform engineering, and retired permanently.

What is The Silent Outage?
The failure mode in which a gateway's network connection dies without the cloud being told, so the cloud interprets the absence of messages as "all machines idle." Dashboards show green while nothing is happening; supervisors take decisions against data that has not been fresh for hours. Prevented by Sparkplug B's explicit birth-and-death protocol: every gateway's last-will message is registered at connect time, so cloud-side consumers are notified deterministically on disconnect.

What is The Trust Boundary Inversion?
The architectural shift from treating the plant as a trusted zone that the cloud reaches into (the legacy castle-and-moat model) to treating the plant as an untrusted zone that initiates outbound connections to a trusted, well-defended cloud endpoint. All cloud-native MES integration operates this way: outbound-only gateway connections, mutual TLS, certificate-pinned endpoints, short-lived credentials, no inbound plant-facing ports. The inversion is mandatory for cloud integration; architectures that try to preserve the castle-and-moat model by punching inbound holes in the firewall are the source of the worst industrial IoT security incidents.

Does machine data integration require rebuilding my on-premise MES?
In strict terms, no — a well-architected cloud-MES can ingest data from an existing on-premise MES through its APIs or database replication. In practice, the recurring answer is that legacy on-premise MES platforms were built without a canonical namespace and without an event-sourced backbone, and extracting their data into a modern integration layer is itself a major project. The honest question is whether the on-premise MES is producing enough value to justify the parallel integration effort, or whether the modern cloud-MES should take over both roles. For mid-market manufacturers, the answer is usually the latter.

How does machine data integration relate to ISA-95?
ISA-95 provides the canonical data model for machine data integration — not just the equipment hierarchy, but the full object models for Production Orders, Material Lots, Personnel, and Process Segments, plus B2MML as the canonical transaction shape on the ERP boundary. A platform that treats ISA-95 as a contract produces semantically consistent data across all sites and machines; a platform that treats it as a diagram produces The Namespace Babel.

How does machine data integration relate to the audit trail?
The integration architecture generates the stream of events that the audit trail records. Every cross-boundary event — which gateway published what signal, from which certificate, with which timestamp, against which tenant — is captured in the audit log as first-class evidence. For regulated manufacturing (pharma packaging, automotive safety), the integration architecture and the audit architecture are two facets of the same platform-integrity problem, and both are required for regulatory sign-off.

About the author

Mark Kobbert

CTO of SYMESTIC GmbH. Responsible for the cloud-MES architecture since 2014. B.Sc. Wirtschaftsinformatik, SRH Hochschule Heidelberg (dual programme alongside SYMESTIC employment). Joined SYMESTIC as a software engineer directly after graduation; led the mid-2010s rebuild from on-premise product to cloud-native platform on Microsoft Azure; CTO since 2020. Technical accountability for the platform that today connects 15,000+ machines across 18 countries on four continents: microservice architecture, IoT-gateway integration, real-time data processing, OPC UA and MQTT Sparkplug-B connectivity, Azure-backed event sourcing and durable telemetry ingestion, 99.9 % platform availability, zero customer churn in 2024. Expertise: Cloud-native MES architecture, Microsoft Azure, microservice architecture, OPC UA, MQTT Sparkplug B, IoT-gateway development, edge computing, ISA-95 integration architecture, industrial connectivity, brownfield machine integration, REST APIs, C#/.NET, SQL, Docker/Kubernetes, real-time data processing, IT-OT convergence, platform security, multi-tenant SaaS.
