MES Software: Vendors, Features & Costs Compared 2026
MES software compared: vendors, functions per VDI 5600, costs (cloud vs. on-premise) and implementation. Honest market overview 2026.
A downtime reason catalog is the controlled, hierarchically structured list of coded reasons used by operators, line leaders, and maintenance to classify every production stoppage above a defined minimum duration. It is the vocabulary that turns raw stop events, captured automatically by the machine, into attributable, analysable, Pareto-sortable root-cause categories that a team can actually act on. Without a reason catalog, you do not have OEE; you have an availability percentage with no explanatory power. In my experience after twenty-five years of MES implementations, that is functionally worse than having no OEE number at all, because a number without a why invites management to demand improvement without telling anyone which lever to pull.
I wrote a book in 2025 called "OEE: Eine Zahl, viele Lügen" — "OEE: One Number, Many Lies" — and the central argument of that book is that the OEE number itself is almost never the problem. The problem is always the downtime reason catalog that sits underneath it. A plant with a technically correct OEE calculation running against a badly designed, badly governed, or politically captured reason catalog will produce numbers that look like data and behave like fiction. The Pareto that falls out of that catalog will point the improvement team at the wrong causes, the improvements will produce no measurable impact, management will lose faith in the data, and within eighteen months the plant will either stop looking at OEE entirely or start manipulating the inputs to hit a target. I have watched this cycle happen in enough plants across four continents that I stopped treating it as a random failure and started treating it as a predictable consequence of a specific design decision — namely, treating the reason catalog as an afterthought during MES implementation rather than as the central artefact it actually is.
Seiichi Nakajima defined the Six Big Losses in 1988 as part of the original TPM framework at the Japan Institute of Plant Maintenance. This taxonomy is still the single most important reference for any downtime reason catalog design, because it maps directly onto the three factors of the OEE calculation (availability, performance, quality). A catalog that cannot be cleanly mapped onto the Six Big Losses will produce Pareto charts that do not agree with the OEE decomposition, which is the single fastest way to lose the trust of the operations team, because they will see that the numbers do not add up. The table below shows the canonical mapping.
| Nakajima Big Loss | OEE factor | Reason catalog category it drives |
|---|---|---|
| 1. Equipment failure / breakdown | Availability | Technical fault — subdivided by system (mechanical, electrical, sensor, drive, pneumatic, hydraulic, control). |
| 2. Setup & adjustment | Availability | Changeover — subdivided by type (product, format, tool, material), and by phase (teardown, clean, rebuild, qualification). |
| 3. Idling & minor stops | Performance | Micro-stop — stops below a threshold (typically 2–5 min) that do not trigger an operator entry. Almost always the most underreported category. |
| 4. Reduced speed | Performance | Running slower than rated speed — captured as a performance loss, not typically as a discrete reason entry (but may be linked to a reason if operator deliberately slowed the line). |
| 5. Process defects (in-run) | Quality | Quality event — NOK batch, inspection hold, parameter deviation — often causes a stop to investigate. |
| 6. Startup / yield losses | Quality | Start-up scrap, warm-up scrap, first-off rejects — typically linked to changeover completion. |
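The mapping in the table can be held as a small lookup so that every stop category rolls up to exactly one OEE factor. The following is a minimal sketch, not a reference implementation: the dictionary keys, the `minutes_by_factor` helper, and the sample category "Legacy bucket" are illustrative assumptions, and only the loss-to-factor pairs come from the table.

```python
from collections import defaultdict

# Six Big Losses -> OEE factor, as in the table above.
BIG_LOSS_TO_FACTOR = {
    "equipment_failure": "availability",
    "setup_adjustment": "availability",
    "idling_minor_stops": "performance",
    "reduced_speed": "performance",
    "process_defects": "quality",
    "startup_yield": "quality",
}

# Illustrative L1 category names (assumed, not a standard).
L1_TO_BIG_LOSS = {
    "Technical fault": "equipment_failure",
    "Changeover & setup": "setup_adjustment",
    "Micro-stop": "idling_minor_stops",
    "Quality & rework": "process_defects",
}


def minutes_by_factor(entries):
    """Sum lost minutes per OEE factor.

    entries: iterable of (l1_category, minutes).
    Categories without a mapping land in 'unmapped' so the
    reconciliation gap stays visible instead of disappearing.
    """
    totals = defaultdict(float)
    for category, minutes in entries:
        big_loss = L1_TO_BIG_LOSS.get(category)
        factor = BIG_LOSS_TO_FACTOR.get(big_loss, "unmapped")
        totals[factor] += minutes
    return dict(totals)


sample = [
    ("Technical fault", 42.0),
    ("Changeover & setup", 30.0),
    ("Micro-stop", 18.5),
    ("Legacy bucket", 12.0),  # hypothetical category with no mapping
]
result = minutes_by_factor(sample)
```

A non-zero "unmapped" total is the signal that the Pareto and the OEE decomposition will not reconcile.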
Three other external reference frameworks matter for a serious reason-catalog design. ISO 22400 (the international manufacturing-KPI standard) formalises the OEE calculation and the stop-time categories that feed it — any catalog not alignable with ISO 22400 will eventually cause problems in audits and cross-plant benchmarking. SEMI E10 (the semiconductor-industry equipment-state standard) is the most mature downtime taxonomy in existence, because semiconductor fabs have been using it for three decades; it defines six top-level equipment states (productive, standby, engineering, scheduled downtime, unscheduled downtime, non-scheduled) and is worth studying even outside semiconductor because it shows what a thirty-year-refined reason taxonomy looks like. VDMA 66412-1 is the German-speaking equivalent, covering MES-KPI definitions including availability and downtime categorisation, and is frequently referenced in German procurement specifications. A reason catalog built without awareness of these four frameworks — Nakajima, ISO 22400, SEMI E10, VDMA 66412-1 — is almost guaranteed to reinvent problems that these frameworks have already solved.
This is the observation from my Six Sigma Black Belt years at Johnson Controls (2001–2003) and the core of my 2025 book, and I have not yet seen a plant implementation where it was not the dominant failure mode: operators do not pick the reason code that best describes what happened. They pick the reason code that produces the least organisational friction. I call this pattern The Political Reason Code, and it is the single most destructive force in downtime data quality that exists. The mechanics are straightforward and predictable.
| Organisational consequence | Reason code avoided | Reason code picked instead |
|---|---|---|
| Triggers a maintenance ticket and a conversation with the maintenance supervisor | Technical fault — equipment breakdown | "Process adjustment" or "minor technical issue" — ambiguous enough to avoid the ticket |
| Starts an argument with logistics about why material was not at the workstation | Material missing | "Organisational" or "waiting for instruction" |
| Exposes that changeover took 90 min instead of the standard 45 min | Changeover overrun | Changeover (no subcategory) — duration absorbed into the generic bucket |
| Triggers a quality deviation report and possibly a customer notification | Quality hold / NOK investigation | "Line check" or "process adjustment" |
| Any organisational friction at all | Any specific code | "Other" — the universal escape valve |
The Political Reason Code is why the "Other" bucket in almost every plant I have walked into grows from its design target (typically < 5 %) to 25–40 % within the first six to twelve months of catalog operation. I call this secondary pattern The Other-Bucket Drift, and it is the most reliable leading indicator of reason-catalog failure that I know of. When "Other" crosses 15 %, the catalog is already ceremonial; when it crosses 30 %, you are measuring operator risk-aversion, not production reality. The fix is never "train operators better" — I have tried that discipline and it fails within a quarter because the underlying incentive structure has not changed. The fix is to make the specific reason codes politically safer to pick than "Other" — by separating data collection from performance evaluation, by making it explicit to operators that accurate classification is a protection, not a risk, and by never using reason-code data to assign individual blame.
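The thresholds above (under 5 % design target, 15 % ceremonial, 30 % measuring risk-aversion) can be turned into a simple health check. This is a sketch under stated assumptions: the share is computed on downtime minutes rather than on entry counts, and the function names and status labels are made up for illustration.

```python
def other_share(minutes_by_reason, other_key="Other"):
    """Fraction of classified downtime minutes booked on the 'Other' code."""
    total = sum(minutes_by_reason.values())
    if total == 0:
        return 0.0
    return minutes_by_reason.get(other_key, 0.0) / total


def catalog_status(share):
    """Translate the 'Other' share into the status bands named in the text."""
    if share > 0.30:
        return "measuring operator risk-aversion"
    if share > 0.15:
        return "ceremonial"
    if share >= 0.05:
        return "above design target"
    return "within design target"


# Hypothetical month of data for one line, in minutes.
month = {"Other": 31.0, "Sensor fault": 40.0, "Material missing": 29.0}
share = other_share(month)
status = catalog_status(share)
```

Tracking this value per line and per shift, rather than only per plant, tends to show where the drift starts.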
Three hierarchy levels is the right depth for almost every discrete and batch manufacturing environment. Two levels are too coarse for meaningful Pareto analysis; four levels produce selection fatigue that drives operators into "Other." The target size numbers below come from catalogs I have personally designed and operated across automotive, electronics, plastics, and FMCG plants over the past fifteen years.
| Level | Purpose | Target size | Example |
|---|---|---|---|
| L1 — Category | High-level classification for management dashboards and cross-plant roll-up. Maps 1:1 to Six Big Losses + organisational categories. | 6–8 categories | Technical fault · Material & logistics · Changeover & setup · Quality & rework · Organisation & personnel · Planning · Micro-stop · Breaks (planned) |
| L2 — Subgroup | Mid-level classification for department-level Pareto (maintenance, logistics, quality). | 15–25 subgroups (3–4 per category on average) | Under Technical fault: Mechanical · Electrical · Sensor · Drive · Pneumatic · Control system |
| L3 — Specific reason | Operator-level selection; the actual pick during a stop event. | 20–40 total at go-live, expanded to 60–80 once data quality is stable (after 6–12 months) | Under Sensor: Light barrier contaminated · Photocell misaligned · Pressure sensor drift · Temperature sensor fault |
Two additional design principles that matter more than they look: every Level-3 reason needs a one-sentence operational definition that produces identical selection across two different shifts — if "material missing" is ambiguous between "material is not at the workstation" and "material exists but cannot be found," the catalog will produce noisy data indefinitely. And the catalog must be built with line-specific top-10 quick picks as the default operator view — putting the ten most frequent reasons for that specific line at the top of the selection list reduces misclicks by roughly half based on what I have measured at multiple rollouts, because operators are not scrolling through 40 options to find the three they actually need.
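Both principles can be expressed in a few lines. The sketch below assumes a flat record per Level-3 reason that carries its operational definition, and derives the line-specific quick picks from historical selections. The `Reason` fields, the code "S01", and the `quick_picks` helper are hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Reason:
    code: str        # plant-local L3 code
    l1: str          # category
    l2: str          # subgroup
    label: str       # text shown to the operator
    definition: str  # one-sentence operational definition


def quick_picks(history, line_id, n=10):
    """Return the n most frequently selected reason codes for one line.

    history: iterable of (line_id, reason_code) tuples.
    """
    counts = Counter(code for line, code in history if line == line_id)
    return [code for code, _ in counts.most_common(n)]


light_barrier = Reason(
    code="S01",
    l1="Technical fault",
    l2="Sensor",
    label="Light barrier contaminated",
    definition="Light barrier blocked by dirt; cleared by cleaning, no part replaced.",
)

history = (
    [("LINE-1", "S01")] * 3
    + [("LINE-1", "M02")] * 2
    + [("LINE-2", "S01")] * 5
    + [("LINE-1", "C03")]
)
top_two = quick_picks(history, "LINE-1", n=2)
```

Recomputing the quick picks on a fixed schedule (for example monthly) keeps the default view stable enough for operators to build habits.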
Every catalog has a minimum duration threshold below which events are not classified by the operator but aggregated automatically as "minor stops" or "micro-stops." The threshold is a tradeoff: too low produces operator fatigue (classifying 20-second stops all shift); too high hides losses. The practical range is 2–5 minutes, and the choice depends on the cycle time of the line — a plant with 12-second cycles cannot afford a 5-minute threshold because a 4-minute stop is 20 cycles, which is already a substantial loss.
The pattern I have encountered in nearly every plant I have looked at: micro-stops account for 30–50 % of total stop time, and operators are usually surprised by this number, because the stops are individually short and invisible to them. The line "feels" like it is running. The machine data says otherwise. Because these stops fall below the classification threshold, they show up as performance loss rather than as reason-coded downtime. The operational implication is that a catalog which captures only stops above the threshold will miss a third to a half of the loss, and the Pareto will point improvement teams at large stops that matter less than the cumulative micro-stop burden. The correction is a separate performance-loss analysis fed by machine data alone (cycle-time deviation, expected-vs-actual output), which substitutes for operator classification in the below-threshold range. An industrial data historian or MES with time-series capability does this natively; a reason catalog alone does not. The two must be designed as complements, not as substitutes.
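How much of the stop time sits below the threshold can be checked directly from machine-recorded stop durations. This is a minimal sketch; the function names, the 120-second default, and the sample shift are assumptions for illustration.

```python
def split_stops(stop_durations_s, threshold_s=120):
    """Split machine-recorded stops into operator-classified and micro-stops."""
    classified = [d for d in stop_durations_s if d >= threshold_s]
    micro = [d for d in stop_durations_s if d < threshold_s]
    return classified, micro


def micro_stop_share(stop_durations_s, threshold_s=120):
    """Fraction of total stop time that falls below the threshold."""
    total = sum(stop_durations_s)
    if total == 0:
        return 0.0
    _, micro = split_stops(stop_durations_s, threshold_s)
    return sum(micro) / total


# Hypothetical shift: twenty 30-second stops plus two long stops.
shift_stops = [30] * 20 + [600, 300]
classified, micro = split_stops(shift_stops)
share = micro_stop_share(shift_stops)
```

In this made-up shift the operator sees two stops, while the twenty short ones carry 40 % of the stop time.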
Modern PLCs produce hundreds of fault codes per line per shift. Most of these are self-clearing and operationally irrelevant; a subset indicates actual production-affecting conditions. The catalog-to-PLC bridge — mapping specific PLC fault codes automatically to specific Level-3 reasons — is the single largest data-quality improvement available in modern MES architecture, and the difference between a plant that still has the Political Reason Code problem and one that has substantially neutralised it.
| Stop type | Classification source | Operator role |
|---|---|---|
| PLC fault with unambiguous mapping (e.g. sensor fault, drive fault, emergency stop) | Auto-assigned Level-3 reason from fault-code-to-reason table | Confirm or refine in optional free-text context field |
| PLC fault with ambiguous mapping (e.g. line stop — multiple possible causes) | Auto-narrowed to category or subgroup; operator picks specific reason | Select specific Level-3 reason from filtered subset |
| Operator-initiated stop (e.g. material replenishment, quality check, planned adjustment) | Full operator classification | Pick from line-specific top-10 or full Level-3 tree |
| Stop during changeover | Auto-assigned to "Changeover" category; operator confirms phase | Confirm phase (teardown, clean, rebuild, qualification) |
| Micro-stop (< threshold) | Aggregated automatically as micro-stop; no operator entry | None |
A good fault-code-to-reason mapping table takes 2–3 days of engineering work per line type, removes roughly 60–70 % of the classification work operators would otherwise do by hand, and largely defuses The Political Reason Code problem for the events it covers, because a PLC does not care about organisational friction. What the operator then classifies is the residual set that genuinely requires human judgement, which is both a smaller set and a cognitively simpler one.
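The first three rows of the table reduce to a lookup with three outcomes: fully auto-assigned, narrowed to a subgroup, or left to the operator. The sketch below assumes string fault codes; the codes "F1001" and "F2040", the `FAULT_MAP` contents, and the `classify` function are hypothetical and not taken from any PLC vendor.

```python
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    source: str            # "auto", "narrowed" or "operator"
    l2: Optional[str]      # subgroup, if known
    l3: Optional[str]      # specific reason, if known


# Fault code -> (L2 subgroup, L3 reason or None when ambiguous).
FAULT_MAP = {
    "F1001": ("Sensor", "Light barrier contaminated"),  # unambiguous
    "F2040": ("Drive", None),                            # ambiguous
}


def classify(fault_code):
    """Decide how much of the classification the system can do itself."""
    if fault_code is None or fault_code not in FAULT_MAP:
        # Operator-initiated stop or unmapped fault: full operator pick.
        return Classification("operator", None, None)
    l2, l3 = FAULT_MAP[fault_code]
    if l3 is None:
        # Narrow the pick list to the subgroup; operator selects the L3 reason.
        return Classification("narrowed", l2, None)
    return Classification("auto", l2, l3)


auto = classify("F1001")
narrowed = classify("F2040")
manual = classify(None)
```

Keeping the map as data rather than as logic makes the quarterly mapping refresh a table edit instead of a code change.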
Beyond The Political Reason Code and The Other-Bucket Drift, a small set of recurring failure modes appears in almost every plant I have audited. Recognising them early prevents years of low-quality data.
| Antipattern | What it looks like | Corrective discipline |
|---|---|---|
| The Political Reason Code | Operators pick codes to minimise friction rather than to describe reality. | Separate data collection from individual performance evaluation. Make specific codes politically safer than "Other." |
| The Other-Bucket Drift | "Other" grows from <5 % at launch to 25–40 % within 6–12 months. | Monthly "Other" review; convert recurring text comments into specific reasons; enforce 10 % ceiling. |
| The Shift-End Batching Antipattern | Operators leave reason fields empty during the shift and classify all stops in a 10-minute burst before handover. Resulting data is guessed, not observed. | Enforce real-time classification at the event (within 2 min of stop end). Block production restart until a reason is entered. |
| The Category-Creep Problem | Catalog grows from 30 to 200+ reasons as every stakeholder adds "one more specific code." Operators give up and pick "Other." | Governance ownership (one named person). No additions without removing equivalents. Periodic consolidation reviews. |
| The Phantom Reason | Reasons that are theoretically in the catalog but never selected in practice — typically because they are too obscurely named, or duplicated by a more convenient adjacent reason. | Quarterly Phantom review: reasons with <5 selections per quarter are candidates for retirement. |
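The Phantom review in the last row is straightforward to automate. This sketch assumes the quarter's selections are available as a flat list of reason codes; the function name and the sample codes are illustrative.

```python
from collections import Counter


def phantom_reasons(catalog_codes, quarter_selections, min_selections=5):
    """Return catalog codes selected fewer than min_selections times in the quarter."""
    counts = Counter(quarter_selections)
    return sorted(code for code in catalog_codes if counts[code] < min_selections)


catalog = ["S01", "S02", "M01", "E07"]
# Hypothetical selections logged during one quarter.
selections = ["S01"] * 40 + ["M01"] * 12 + ["S02"] * 3
candidates = phantom_reasons(catalog, selections)
```

The result is a list of retirement candidates for the review, not an automatic deletion: a rarely selected code can still describe a rare but important event.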
From twenty-five years of MES rollouts across Johnson Controls, Visteon, iTAC, Dürr, and now SYMESTIC, and from the material that drove me to write "OEE: Eine Zahl, viele Lügen" in 2025: the story I tell most often in customer workshops comes from my Six Sigma Black Belt period at Johnson Controls Automotive Electronics in Rastatt around 2001–2002, on a headliner assembly line that was supposed to be running at 78 % OEE according to the reporting system. My DMAIC project was scoped to push it to 85 %. I spent the first two weeks doing what Six Sigma teaches you to do, which is to go and see the actual work, and what I discovered within five shifts on the floor was that the 78 % OEE number was arithmetically correct but operationally meaningless.
The downtime reason catalog had thirty-two reasons in it; "Other" was at 31 %; three of the technical-fault codes had not been selected in eighteen months; and the top reason by minutes was called "Process Adjustment." When I sat down with operators and asked them directly, in the break room, off the record, it turned out to mean whatever an operator did not want to admit publicly. "Process Adjustment" was the operator code for: I don't know, I don't want to write a maintenance ticket, the supervisor will be unhappy if I write sensor fault, and I cannot leave the field blank because the system will not let me start the next cycle.
When we reclassified three months of historical "Process Adjustment" entries based on operator interviews and machine-data correlation, the true OEE was 61 %, not 78 %. The improvement team had been chasing the wrong causes for two years. This is what my book is about. The OEE number was never lying; the reason catalog was. And this was not a failure of any single person's integrity; it was a failure of catalog design. The catalog did not make the right answer the easy answer, so operators picked the easy answer instead, and the data became political.
The correction at that plant took six months. We rebuilt the catalog from 32 reasons down to 22 specific Level-3 reasons plus one explicit "Unable to determine" code that carried no organisational consequence, automated PLC-to-reason mapping for the top seven fault types, killed the "Process Adjustment" bucket entirely, and instituted a monthly "Other" review where recurring patterns got promoted to named reasons. Most importantly, the plant manager made it explicit that nobody would be individually evaluated based on reason-code distribution. Within three months, "Other" dropped from 31 % to 6 %. Within six months, the true OEE rose from 61 % to 72 %. Over the same period the reported figure fell, because we were now measuring reality, and the plant manager had to have a difficult conversation with corporate about why our "OEE dropped from 78 % to 72 %" when in fact we had just started telling the truth.
My single strongest recommendation after twenty-five years of this work, and the one I repeat in every customer workshop: a plant's true OEE is almost always 10–15 percentage points lower than its reported OEE, not because anyone is lying but because the reason catalog has been politically captured. The week your "Other" category drops below 10 % and your reported OEE drops by 10 points is the week your operation actually starts improving. Before that week, everything you are doing is improvement theatre.
Every reason catalog starts as a single-plant artefact. It becomes a multi-plant problem the moment the organisation tries to benchmark or consolidate across sites. Between 2006 and 2013 at Johnson Controls Automotive Electronics, I led the global harmonisation of downtime catalogs across seven countries — China, Mexico, USA, Tunisia, Macedonia, France, Russia — and 30+ manufacturing processes in soldering, assembly, and injection moulding. The pattern at the start was exactly what I have seen at every multi-plant customer since: each plant had its own catalog with its own vocabulary, its own depth, its own political codes, and its own "Other" definition, and corporate-level Pareto charts were mathematical fiction because the same physical event was classified under different names in different plants.
The harmonisation approach that actually works has three rules:
| Rule | What it does |
|---|---|
| L1 and L2 are corporate-mandatory; L3 is plant-local | Guarantees cross-plant Pareto integrity at the category and subgroup level without forcing every plant to speak identical Level-3 vocabulary. Injection moulders and SMT lines can keep their own specific reasons as long as they roll up to the same L1/L2. |
| L3 codes must declare their L2 parent | Prevents the most common harmonisation failure: plants inventing L3 codes that map ambiguously to the corporate L2 structure. |
| Governance cadence: quarterly L1/L2, plant-owned L3 | Corporate governance committee reviews L1/L2 structure once a quarter. L3 changes are plant-level with monthly review. Prevents both The Category-Creep Problem centrally and local ossification. |
The consequence of this three-rule structure is that corporate dashboards can safely aggregate across plants at the category and subgroup level (where data is comparable), while plant teams retain ownership of the Level-3 vocabulary they actually need for local improvement work. This is the only harmonisation approach I have seen work at scale; attempts to enforce identical Level-3 catalogs across plants produce either low-quality data (plants pick the nearest available code rather than the accurate one) or political standoffs between central IT and plant operations.
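The second rule (every L3 code declares its L2 parent) is easy to enforce mechanically before a plant catalog goes live. This is a sketch under assumptions: the corporate L2 names, the plant codes, and both helper functions are hypothetical.

```python
# Corporate-mandatory structure: L2 subgroup -> L1 category (illustrative names).
CORPORATE_L2 = {
    "Mechanical": "Technical fault",
    "Sensor": "Technical fault",
    "Material supply": "Material & logistics",
}


def validate_plant_catalog(plant_l3, corporate_l2=CORPORATE_L2):
    """Check that every plant-local L3 code declares a known corporate L2 parent.

    plant_l3: dict mapping L3 code -> declared L2 parent (or None).
    Returns a list of human-readable problems; empty means valid.
    """
    problems = []
    for code, parent in sorted(plant_l3.items()):
        if parent is None:
            problems.append(f"{code}: no L2 parent declared")
        elif parent not in corporate_l2:
            problems.append(f"{code}: unknown L2 parent '{parent}'")
    return problems


def roll_up_to_l1(plant_l3, code, corporate_l2=CORPORATE_L2):
    """Resolve a plant-local L3 code to its corporate L1 category."""
    return corporate_l2[plant_l3[code]]


plant_catalog = {
    "SMT-017": "Sensor",
    "INJ-004": "Tooling",  # not a corporate L2 in this sketch
    "ASM-221": None,
}
problems = validate_plant_catalog(plant_catalog)
```

Running the check whenever an L3 code is added catches ambiguous mappings before they reach a corporate dashboard.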
Every catalog degrades. The question is not whether it will drift, but how fast, and whether the governance structure catches the drift before it becomes operationally binding. A serious governance cadence has four layers:
| Cadence | Activity | Owner |
|---|---|---|
| Weekly | Line-level "Other" review during shift-handover or A3. Any recurring pattern in "Other" text comments flagged for monthly review. | Line leader + CI manager |
| Monthly | Plant-level catalog review. Promote recurring "Other" patterns into named L3 reasons. Retire Phantom Reasons (<5 selections/quarter). Audit "Other" ceiling (<10 %). | Plant CI manager |
| Quarterly | Cross-plant L1/L2 review (if multi-plant). Catalog-to-Six-Big-Losses mapping validation. Fault-code-to-reason mapping refresh. | Corporate OE / Operations governance |
| Annual | Full catalog health audit. Political-Reason-Code diagnostic (operator interviews, correlation with fault data). Re-baseline against ISO 22400, VDMA 66412-1, SEMI E10 where relevant. | Corporate OE + external review |
Reason catalogs exist for three primary use cases — OEE explanation, Pareto-driven improvement, and performance communication. In regulated or certified environments there is a fourth, equally important use case that gets less attention: audit evidence. For IATF 16949 audits, downtime reason history is part of the quality record demonstrating process control. For FDA-regulated environments under 21 CFR Part 11, reason-code entries require audit trail, electronic signatures, and validated change control. For ISO 9001 quality-management audits, reason classifications feed into nonconformity and corrective-action evidence. Catalogs that are designed only for improvement use cases and ignore the audit dimension tend to fail audit scrutiny for one of three reasons: ambiguous reason definitions that cannot be traced deterministically, missing audit trail on catalog changes (who modified a reason definition, when, why), or inability to export historical data in a format that auditors can work with. Designing the catalog with audit exportability as a first-class requirement adds perhaps 10–15 % to the initial specification effort and eliminates a substantial class of audit risk downstream. See also end-to-end traceability for how reason-catalog data integrates with broader production-record architectures.
How many downtime reasons should I start with?
20–40 Level-3 reasons at go-live is the range I have seen work reliably across automotive, electronics, plastics, and FMCG. This is deliberately conservative: fewer reasons means easier operator selection, lower "Other" drift, and cleaner Pareto analysis in the first six months. Expansion to 60–80 happens only after data quality is stable ("Other" consistently below 10 %) and only in response to recurring patterns that emerge from the monthly "Other" review. Catalogs that start with 100+ reasons almost always end up with 40 % "Other" within a year.
What is the difference between a downtime reason catalog and a fault catalog?
Terminology varies by vendor and industry. In most usage, a "downtime reason catalog" is the operator-facing vocabulary covering both planned and unplanned stops with organisational attribution; a "fault catalog" is typically the PLC-level technical fault-code list covering unplanned machine-level events only. The two should be explicitly mapped to each other (PLC fault codes auto-assign to reason catalog entries where possible). What matters for data quality is the internal definitional consistency, not the label.
How does the reason catalog integrate with a MES?
Four integration points: (1) the catalog is the structured metadata powering operator-side downtime capture at the shopfloor client; (2) PLC fault codes are auto-mapped to reason codes through a maintained fault-to-reason table; (3) the MES links each reason entry to the active production order, shift, operator, and machine, producing a multi-dimensional Pareto; (4) reason-code history feeds back into shift-log auto-pull, closing the operational loop between data collection and action management.
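Integration point (3) is what makes a multi-dimensional Pareto possible: once each reason entry carries shift, machine, and order context, the same data can be ranked along any combination of those keys. The sketch below assumes entries are plain dictionaries with a "minutes" field; the field names and the `pareto` helper are illustrative.

```python
from collections import defaultdict


def pareto(entries, dimension):
    """Rank downtime minutes by one or more context keys.

    entries: list of dicts with a 'minutes' value plus context fields.
    dimension: tuple of field names to group by.
    Returns (key, minutes, cumulative_share) rows, largest first.
    """
    totals = defaultdict(float)
    for entry in entries:
        key = tuple(entry[field] for field in dimension)
        totals[key] += entry["minutes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for key, minutes in ranked:
        running += minutes
        rows.append((key, minutes, running / grand_total))
    return rows


# Hypothetical reason entries with MES context attached.
entries = [
    {"reason": "Sensor fault", "shift": "night", "machine": "M1", "minutes": 30},
    {"reason": "Sensor fault", "shift": "night", "machine": "M2", "minutes": 20},
    {"reason": "Material missing", "shift": "day", "machine": "M1", "minutes": 40},
    {"reason": "Sensor fault", "shift": "day", "machine": "M1", "minutes": 10},
]
by_reason = pareto(entries, ("reason",))
by_reason_shift = pareto(entries, ("reason", "shift"))
```

Grouping by reason alone ranks "Sensor fault" first; adding the shift dimension shows that most of it occurs at night, which is the kind of detail a single-axis Pareto hides.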
Who owns the catalog?
Single named owner at the plant level — typically the CI manager or production engineering lead. The worst outcome is shared ownership, which reliably produces Category Creep and governance gaps. The named owner runs the monthly review, coordinates with maintenance and quality on subgroup changes, and signs off on L3 additions/retirements. For multi-plant organisations, a corporate-level governance committee owns L1/L2 structure while plant owners retain L3.
What is The Political Reason Code?
The pattern in which operators select reason codes to minimise organisational friction rather than to describe what actually happened. Technical-fault codes are avoided because they trigger maintenance tickets; material-missing codes are avoided because they trigger logistics arguments; changeover-overrun codes are avoided because they expose performance gaps; "Other" becomes the universal escape valve. It is the single most destructive failure mode in downtime data quality. The fix is structural, not motivational: separate data collection from individual performance evaluation, and make accurate classification politically safer than "Other."
What is The Other-Bucket Drift?
The predictable pattern in which the "Other" category grows from its design target (< 5 %) to 25–40 % within 6–12 months of catalog operation. "Other" crossing 15 % means the catalog is already ceremonial; crossing 30 % means you are measuring operator risk-aversion, not production reality. The fix is monthly review of "Other" entries (text comments reveal recurring patterns), promotion of patterns into named reasons, and enforcement of a 10 % ceiling as a catalog-health KPI.
What is the right micro-stop threshold?
2–5 minutes depending on cycle time. Rule of thumb for short-cycle lines: the threshold should be roughly 10–20× the cycle time, kept within the 2–5 minute range. A 12-second-cycle line should have a 2-minute threshold (10 cycles); on a 3-minute-cycle line the multiple no longer applies and the 5-minute upper bound governs. Below the threshold, stops are aggregated as micro-stops without operator classification; at or above the threshold, operator classification is mandatory. The micro-stop category typically hides 30–50 % of total stop time and requires machine-data analysis to surface.
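One way to reconcile the cycle-time multiple with the 2–5 minute range is to clamp the result. The sketch below is an interpretation rather than a stated formula: it assumes the lower end of the multiple (10×) and treats 2 and 5 minutes as hard bounds, and the function name and parameters are hypothetical.

```python
def micro_stop_threshold_s(cycle_time_s, multiple=10, floor_s=120, cap_s=300):
    """Suggest a micro-stop threshold in seconds.

    Takes a multiple of the cycle time and clamps it to the
    2-5 minute practical range given in the text.
    """
    return max(floor_s, min(cap_s, multiple * cycle_time_s))


short_cycle = micro_stop_threshold_s(12)    # 12-second cycle
medium_cycle = micro_stop_threshold_s(20)   # 20-second cycle
long_cycle = micro_stop_threshold_s(180)    # 3-minute cycle
```

With these assumptions the 12-second line lands on 2 minutes and the 3-minute line on the 5-minute upper bound, matching the two examples in the answer.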
How do I harmonise reason catalogs across multiple plants?
Three rules that actually work at scale: (1) Level 1 and Level 2 of the hierarchy are corporate-mandatory and identical across all plants; (2) Level 3 is plant-local with the constraint that every L3 code must declare its L2 parent; (3) governance cadence separates corporate-level quarterly L1/L2 review from plant-level monthly L3 review. This preserves cross-plant Pareto comparability at the category and subgroup level while allowing plant teams to own the vocabulary they actually need locally. Attempts to enforce identical L3 across plants produce either low-quality data or political standoffs.
Should the catalog align with Six Big Losses?
Yes — the Level 1 hierarchy should map cleanly onto Nakajima's Six Big Losses (1988), because this is what the OEE calculation itself is based on. A catalog whose Pareto cannot be reconciled with the OEE decomposition will produce persistent discrepancies and lose operational credibility. Equipment failure → Technical fault. Setup & adjustment → Changeover. Idling & minor stops → Micro-stop. Reduced speed → Performance loss (machine-data-driven). Process defects → Quality event. Startup/yield losses → Start-up scrap (linked to changeover).
What about ISO 22400, SEMI E10, and VDMA 66412-1?
ISO 22400 is the international MES-KPI standard and defines the stop-time categories that feed OEE — your catalog should be alignable with it, especially for international benchmarking or audit contexts. SEMI E10 is the semiconductor-industry equipment-state standard and represents thirty years of refinement; it is worth studying as the most mature downtime taxonomy in existence even outside semiconductor. VDMA 66412-1 is the German-language MES-KPI standard frequently referenced in DACH procurement. Serious catalog design should be tested against all three alongside the Six Big Losses.
External references: ISO 22400 (manufacturing KPIs) · SEMI E10 (equipment reliability, availability & maintainability) · VDMA 66412-1 (MES-KPIs) · Nakajima, S. (1988), Introduction to TPM, Productivity Press (the foundational Six Big Losses text) · OEE: Eine Zahl, viele Lügen (Christian Fieg, 2025).