MES Software: Vendors, Features & Costs Compared 2026
MES software compared: vendors, functions per VDI 5600, costs (cloud vs. on-premise) and implementation. Honest market overview 2026.
Performance measurement in manufacturing is the systematic capture and analysis of production performance against a defined reference — typically OEE, throughput, cycle time, first-pass yield, and the cost and energy figures that sit behind them. That textbook answer is correct and it is also the least interesting part of the topic. The interesting question, the one that actually decides whether a plant improves or plateaus, is a different one: when you measure production, is the number honest?
I have spent 25 years working with manufacturing performance numbers on four continents — as a Six Sigma Black Belt on Johnson Controls headliner lines, as global lead for MES and traceability across 900+ machines in seven countries, as MES practice lead at Visteon, as a sales leader at iTAC and Dürr. I wrote a book in 2025 called OEE: Eine Zahl, viele Lügen — "OEE: One Number, Many Lies" — because after two and a half decades I had seen the same pattern repeat in every region, every industry, every company size: the measurement itself is rarely the problem. The honesty of the measurement is where the whole thing lives or dies. This article is about what it takes to measure performance in a way that actually drives improvement instead of producing comfortable numbers that justify the status quo.
Before the honesty question, the architectural one: not every performance measure belongs at every level of the organisation, and forcing them all through the same dashboard is one of the most common and most damaging mistakes in manufacturing reporting. The measurement hierarchy that works — across automotive, food, metals, consumer goods — has four levels, each with its own cadence, its own audience, and its own set of metrics.
| Level | Cadence | Primary metrics | Audience |
|---|---|---|---|
| Machine | Real-time / cycle | Cycle time, state, micro-stops, alarm events | Operator, line lead |
| Line / cell | Shift | OEE, first-pass yield, stop reason Pareto | Shift supervisor, production manager |
| Plant | Day / week | Throughput, schedule adherence, scrap cost | Plant manager, operations director |
| Enterprise | Month / quarter | Margin, cost per unit, delivery performance | COO, CFO, board |
The rule that follows from this split: do not aggregate machine-level metrics into board reports without the intermediate layers, and do not push board-level KPIs down to the shop floor. A CFO does not need to see every micro-stop; an operator does not need to see cost per unit. Dashboards that violate this rule either overwhelm the operator or mislead the executive, and usually both.
The hard part of performance measurement is not building the dashboard. It is making sure the dashboard reflects reality. In 25 years I have seen four systematic patterns by which performance numbers get distorted — not by malicious actors, but by local optimisation pressure, politically reasonable shortcuts, and definitions that nobody audits. Each is common, each is quantifiable once you know what to look for, and each produces a number that looks good on a slide and is worthless for decision-making.
| Pattern | How it works | How to detect it |
|---|---|---|
| Denominator shrinking | "Planned production time" is defined down — breaks, changeovers, meetings excluded — so the ratio looks better | Compare planned production time to calendar time; ratios below 60 % are a red flag |
| Definition gaming | "Nameplate speed" is quietly reduced to match actual speed, so performance = 100 % | Compare current nameplate against original OEM specification or historical records |
| Exclusion lists | Certain stop categories are flagged as "non-countable" (maintenance, material, tooling) and removed from availability | Audit stop-reason codes against a reference definition (the OEE Industry Standard is a good baseline) |
| Retroactive reclassification | Stops initially logged as unplanned get re-tagged as planned during end-of-shift review | Compare raw event data at the machine against the approved shift report |
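The first two detection rules in the table reduce to simple ratio checks. The sketch below is one possible way to express them, not a reference implementation: the record layout, the field names and the example figures are all assumptions made for illustration, and only the 60 % threshold comes from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftRecord:
    """Illustrative inputs for one reporting period (all names are assumptions)."""
    calendar_min: float        # total calendar time in the period
    planned_min: float         # planned production time as reported
    run_min: float             # time the machine actually ran
    total_count: int           # parts produced, good and bad
    good_count: int            # parts that passed first time
    nameplate_per_min: float   # rate currently used in the OEE calculation
    oem_rate_per_min: float    # rate from the original OEM specification


def oee(r: ShiftRecord) -> float:
    """OEE = availability x performance x quality."""
    availability = r.run_min / r.planned_min
    performance = r.total_count / (r.run_min * r.nameplate_per_min)
    quality = r.good_count / r.total_count
    return availability * performance * quality


def audit_flags(r: ShiftRecord, min_planned_ratio: float = 0.60) -> list[str]:
    """Return warnings for denominator shrinking and definition gaming."""
    flags = []
    if r.planned_min / r.calendar_min < min_planned_ratio:
        flags.append("planned time below 60 % of calendar time")
    if r.nameplate_per_min < r.oem_rate_per_min:
        flags.append("nameplate speed below OEM specification")
    return flags


# Made-up example: a flattering OEE that trips both checks.
reported = ShiftRecord(calendar_min=1440, planned_min=800, run_min=720,
                       total_count=6480, good_count=6350,
                       nameplate_per_min=9.0, oem_rate_per_min=12.0)
print(round(oee(reported), 3))   # 0.882
print(audit_flags(reported))     # both warnings are raised
```

The point of the example is that the OEE figure alone looks healthy; the distortion only shows up when the inputs are compared against calendar time and the OEM specification.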
The central thesis of the book and of this article: a low OEE that is honest is strictly more valuable than a high OEE that lies. A plant reporting 62 % OEE and meaning it has a real improvement roadmap. A plant reporting 85 % OEE through the combined effects of denominator shrinking, definition gaming and exclusion lists has no roadmap — it has a narrative. The moment a leadership team starts managing to the high number is the moment the plant stops improving, because every action that would reveal the real number becomes politically uncomfortable. I have watched this happen in plants in Germany, the US, China, Tunisia, France and Russia. The pattern is universal.
The counter-pattern — how plants that actually improve do their measurement — is built on five non-negotiable principles. Each one closes one of the common distortion channels.
Automated capture at the machine, not manual entry. Every stop, every cycle, every alarm captured at the event source in real time. Manual logs are where retroactive reclassification lives. Event-level automated capture removes the temptation and makes the raw signal auditable.
Standardised stop-reason taxonomy across the plant. A fixed set of reason codes aligned to a reference standard (the OEE Industry Standard, Nakajima's Six Big Losses, or an internal equivalent), with no local variants. When every plant in a multi-site organisation uses the same taxonomy, comparisons become possible and gaming becomes visible.
Immutable raw data layer. Machine events stored in a way that cannot be edited after the fact. Operator classifications and supervisor reviews become a separate layer on top. If the number ever needs to be questioned, the raw data answers the question — not the narrative.
Cadence matched to audience. The machine dashboard updates every cycle; the shift report runs every shift; the plant report every day; the board report every month. Each audience sees the metrics that apply to their decision horizon. Mixing cadences is how the wrong questions get asked at the wrong level.
Nameplate speed locked. The theoretical maximum rate is defined once, documented, and not adjusted to match actual performance. If the machine cannot reach it, that is information — it says the gap is real. Letting nameplate speed drift downward is how performance = 100 % becomes a lie everybody accepts.
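The separation between an immutable raw layer and a reviewable classification layer can be sketched in a few lines. This is an in-memory illustration under stated assumptions: `RawStop`, `StopLog` and their fields are hypothetical names, and a frozen dataclass only demonstrates the idea. In a real system the guarantee would have to come from the storage layer, not from Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawStop:
    """A machine stop as captured at the source; frozen, so never edited."""
    event_id: int
    duration_min: float
    planned_at_source: bool   # what the control system reported


class StopLog:
    """Append-only raw events plus a separate, editable classification layer."""

    def __init__(self) -> None:
        self._raw: list[RawStop] = []
        self._reviewed: dict[int, bool] = {}   # event_id -> "planned?" after review

    def append(self, stop: RawStop) -> None:
        self._raw.append(stop)

    def review(self, event_id: int, planned: bool) -> None:
        # A review never touches the raw event; it is stored alongside it.
        self._reviewed[event_id] = planned

    def reclassified(self) -> list[RawStop]:
        """Stops logged as unplanned at the machine but reported as planned."""
        return [s for s in self._raw
                if not s.planned_at_source
                and self._reviewed.get(s.event_id) is True]


log = StopLog()
log.append(RawStop(event_id=1, duration_min=12.0, planned_at_source=False))
log.append(RawStop(event_id=2, duration_min=30.0, planned_at_source=True))
log.review(1, planned=True)   # end-of-shift review re-tags stop 1
print([s.event_id for s in log.reclassified()])   # [1]
```

Because the review is kept in its own layer, any difference between the raw signal and the approved report can be listed at any time instead of being lost at the moment of editing.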
Neoperl is an international manufacturer with headquarters in Müllheim, Germany, and additional sites in Bulgaria, the UK and Italy, specialising in precision water-flow products — backflow preventers, aerators, flow regulators. The engagement with SYMESTIC is the kind of textbook performance-measurement story that every plant can replicate, and it is worth describing in detail because the sequence matters.
The starting point was a four-week proof of concept on a single fully automated assembly machine. The objective was narrow and honest: validate that the connectivity and data capture worked under real production conditions, and calculate a defensible ROI on that single-machine basis. PLC-based alarm capture was configured for automatic stop detection; the assembly line itself now classifies its own technical stops through its control system, without requiring operator intervention. PLC alarms were correlated directly against the stop-reason taxonomy and against quality defects detected on the same cycle. There was no manual classification, no retroactive reclassification and no nameplate adjustment: raw event data from the control system flowed into a measurement layer that operators could see in real time and supervisors could audit after the fact.
After the PoC validated the numbers, the contract was signed and three machines were integrated in the first rollout wave. Since then the connected base has expanded continuously — the modular SYMESTIC catalogue means Neoperl's own team can onboard additional machines without returning to the vendor for every incremental step. The results are the numbers honest measurement produces when it is acted on:
| Metric | Improvement | What drove the change |
|---|---|---|
| Stoppages | −10 % | Automatic capture and automated classification made the top reasons visible and actionable |
| Availability | +8 % | Structured analysis of alarm-stop correlations drove targeted engineering actions |
| Scrap | −15 % | Quality-defect data correlated with alarm patterns exposed systematic root causes |
| Productivity | +15 % | Targeted actions based on honest data — no single big project, a sequence of small ones |
The pattern in the right-hand column is the point. None of these numbers came from a single dramatic intervention. They came from a measurement system that told the truth, a stop-reason taxonomy that everybody trusted, and a CI culture that acted on what it saw — Neoperl uses SYMESTIC explicitly as a CI tool within the organisation. The 15 % productivity gain is the downstream consequence of the other three numbers moving, which is the downstream consequence of the measurement itself being believable. Take away the integrity at the base of the stack and the whole thing becomes another dashboard nobody trusts.
What are the most important manufacturing performance metrics?
For the shop floor: cycle time, machine state, micro-stops, alarm events. For line and cell level: OEE (availability × performance × quality), first-pass yield, stop-reason Pareto. For plant level: throughput, schedule adherence, scrap cost. For enterprise level: margin, cost per unit, delivery performance. The mistake is rarely picking the wrong metrics; it is pushing the same metrics across all four levels. Each level has its own cadence and its own audience, and mixing them produces dashboards that overwhelm operators and mislead executives.
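Of the line-level metrics, the stop-reason Pareto is the one that is easiest to show in code. A minimal sketch, assuming stops arrive as `(reason, minutes)` pairs; the function name, the input shape and the example reasons are illustrative choices, not part of any standard.

```python
from collections import defaultdict


def stop_pareto(stops):
    """Rank stop reasons by total downtime, with cumulative share.

    `stops` is an iterable of (reason, minutes) pairs.
    Returns a list of (reason, total_minutes, cumulative_share) tuples.
    """
    totals = defaultdict(float)
    for reason, minutes in stops:
        totals[reason] += minutes
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for reason, minutes in ranked:
        running += minutes
        rows.append((reason, minutes, running / grand_total))
    return rows


# Made-up shift data for illustration.
shift_stops = [("jam", 30), ("material", 10), ("jam", 20), ("tooling", 40)]
for row in stop_pareto(shift_stops):
    print(row)
# ('jam', 50.0, 0.5)
# ('tooling', 40.0, 0.9)
# ('material', 10.0, 1.0)
```

The cumulative share column is what makes the Pareto useful in a shift review: it shows how few reasons account for most of the lost time.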
Why do reported performance numbers so often differ from reality?
Four systematic patterns account for almost all the distortion I have seen in 25 years across seven countries. Denominator shrinking: planned production time is defined down until the ratio looks better. Definition gaming: nameplate speed is quietly reduced to match actual performance. Exclusion lists: stop categories are flagged as non-countable and removed from availability. Retroactive reclassification: unplanned stops are re-tagged as planned during end-of-shift review. Each is detectable with an audit, and none is rare. A plant reporting 85 % OEE with these patterns in place is usually running at 62–70 % in reality.
Is a low OEE bad?
Not necessarily. A low OEE that is honest is more valuable than a high OEE that lies, because the honest low number identifies where the improvement potential is. A plant reporting 60 % honest OEE has a real roadmap — the 40 % gap is visible, categorised and actionable. A plant reporting 85 % OEE through definitional gymnastics has no roadmap, because acting on the real number would expose the gymnastics. The question is never "is the number high" — it is "is the number true."
How often should performance data be reviewed?
At four cadences simultaneously. Real-time at the machine for operators (cycle-by-cycle). Per shift for line leaders and supervisors. Daily or weekly for plant management and production leadership. Monthly or quarterly for enterprise and board level. Each audience needs the metrics that match their decision horizon. A monthly review at the machine level is useless because the stops have already been fixed or forgotten; a real-time feed to the CFO is noise without context.
What is the difference between OEE and TEEP?
OEE measures equipment effectiveness during planned production time; it excludes the time the equipment was scheduled off. TEEP (Total Effective Equipment Performance) measures the same effectiveness against calendar time: 24 hours a day, 365 days a year. TEEP can never exceed OEE for the same asset, and the two are equal only when the asset is scheduled around the clock. The useful insight is the gap: if OEE is 75 % and TEEP is 45 %, the asset is scheduled for only 60 % of calendar time, so the plant has a scheduling problem as much as a performance problem. Most plants only track OEE and never see the scheduling dimension.
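The relationship between the two figures is a single multiplication: TEEP equals OEE times the share of calendar time that was planned for production. A short sketch using the 75 % and 45 % figures from the answer above; the function names and the 168-hour week are illustrative assumptions.

```python
def teep(oee: float, planned_hours: float, calendar_hours: float) -> float:
    """TEEP = OEE x (planned production time / calendar time)."""
    return oee * planned_hours / calendar_hours


def implied_utilisation(oee: float, teep_value: float) -> float:
    """Share of calendar time that was scheduled, implied by OEE and TEEP."""
    return teep_value / oee


# OEE 75 % and TEEP 45 % imply the asset is scheduled 60 % of the time.
print(round(implied_utilisation(0.75, 0.45), 2))                      # 0.6

# Same figures from the other direction: 100.8 planned hours in a 168-hour week.
print(round(teep(0.75, planned_hours=100.8, calendar_hours=168), 2))  # 0.45
```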
Can performance measurement be automated end-to-end?
The data capture should be fully automated: machine events at cycle level, alarms from the control system, good-count from inline sensors. The classification of stop reasons is partially automatable (the control system can classify technical stops by alarm code, as Neoperl does) and partially requires operator input for non-technical reasons (material, operator, external). The analysis and decision-making should always involve humans. The failure modes sit at the two extremes: fully manual logs where nothing can be trusted, and fully automated reports where nothing is acted on because nobody owns them.
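The split between automatic and operator classification can be sketched as a lookup with a fallback. The alarm codes, reason names and function below are hypothetical and stand in for whatever the control system and the plant taxonomy actually define; this is not a description of any vendor's implementation.

```python
from typing import Optional

# Hypothetical alarm codes and reason names, for illustration only.
ALARM_TO_REASON = {
    "A101": "feeder jam",
    "A205": "gripper fault",
    "A310": "vision system error",
}


def classify_stop(alarm_code: Optional[str]) -> tuple[str, bool]:
    """Return (reason, needs_operator_input) for one stop."""
    if alarm_code in ALARM_TO_REASON:
        # Technical stop: the control system's alarm code decides the reason.
        return ALARM_TO_REASON[alarm_code], False
    # No alarm, or an unmapped one: likely non-technical, so ask the operator.
    return "unclassified", True


print(classify_stop("A101"))   # ('feeder jam', False)
print(classify_stop(None))     # ('unclassified', True)
```

Keeping the fallback explicit means every stop without a mapped alarm is visibly waiting for a human decision instead of silently landing in a default category.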
What role does culture play in performance measurement?
More than any technical factor. A measurement system in a culture that punishes bad numbers will produce good numbers that lie. A measurement system in a culture that treats bad numbers as information will produce honest numbers and improvement. I have seen the same software deployed in two plants of the same company, and within six months one was improving and the other was back to where it started — because one management team asked "what do we do about it" and the other asked "can we report it differently." The technology enables; it does not decide. The decision is always management behaviour.
How does SYMESTIC support honest performance measurement?
Event-level automated capture at the machine — OPC UA for modern controls, MQTT via IoT gateways, digital I/O for brownfield assets — with no manual-entry dependency on the critical path. Immutable raw-data layer: operator classifications sit as a separate layer on top of raw machine events, and the raw events cannot be retroactively edited. Standardised stop-reason taxonomies aligned to the OEE Industry Standard, with no local-plant variants. Multi-cadence dashboarding: real-time for operators, shift reports for supervisors, daily plant reports, monthly enterprise rollups — one data layer, four representations. Bidirectional ERP integration so cost and margin context can be attached to operational performance without double-entry. 15,000+ machines connected across 18 countries with this architecture. See SYMESTIC Production Metrics.