
Downtime in Manufacturing: Definition, Categories & Analysis

By Uwe Kobbert · Last updated: April 2026

TL;DR: Downtime is any period during planned production time when an asset is not producing good parts at target rate. It splits into planned (changeovers, planned maintenance, meetings) and unplanned (breakdowns, material starvation, quality holds, micro-stops), and its honest measurement is the single largest lever on OEE. The arithmetic is trivial. The hard part is that almost every plant that measures downtime for the first time through an MES discovers two things: the real number is two to three times what the old paper-based estimate suggested, and 30 to 50 percent of the losses hide in micro-stops shorter than two minutes that manual logging never captures at all.

What is downtime in manufacturing?

Downtime is any interval within scheduled production time during which a machine, line or asset is not producing good parts at the specified rate. It is the headline loss category in every production system that is measured seriously, and it is the direct enemy of availability — the first of the three OEE factors. ISO 22400 defines the relevant time frames (planned busy time, actual production time, down time) and VDI 3423 provides the German-language vocabulary used across most mid-market plants in the DACH region. Both frameworks agree on the essentials; they disagree, productively, on edge cases that every plant has to resolve for itself.

What makes downtime different from every other production KPI is how unevenly it is distributed. A small number of stop reasons — typically three to five — cause the majority of the lost minutes. The rest is a long tail of one- and two-minute micro-stops that feel insignificant individually and add up to the single largest hidden loss in most discrete manufacturing operations. Anyone who has spent time on a shop floor has seen the pattern. The supervisor knows the two machines that fail every week. Nobody knows about the 14-second feeder jams that happen 80 times per shift.

How is downtime classified?

The classification most plants work with is simpler than the standards suggest and stricter than most operators initially like. Two axes, four cells, and a clear rule for the one category that always causes arguments.

Category | Typical stop reasons | Counted against OEE?
--- | --- | ---
Planned, not scheduled | Weekend shutdowns, holidays, no demand | No, excluded from planned production time
Planned, scheduled | Changeovers, planned maintenance (PM), shift meetings, scheduled breaks | Yes (the honest view)
Unplanned: availability loss | Breakdowns, material starvation, tool changes mid-run, operator absence | Yes
Unplanned: performance loss (micro-stops) | Feeder jams, idle between orders, speed drops, chute blockages < 2 min | Yes, usually against performance, not availability

The category that always causes arguments is "planned, scheduled" — specifically, whether changeovers and planned maintenance should count against OEE. The theoretical answer, and the one that produces numbers useful for continuous improvement, is yes. If changeovers are excluded from planned production time, the OEE number looks better and every SMED improvement becomes invisible. If they are included, the number reflects reality and the improvement potential is legible. I have had this argument in more plant meetings than I can count. The plants that include changeovers in the denominator are the ones that end up reducing them.
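The effect of the denominator choice is easy to see with numbers. The sketch below is illustrative only: the 480-minute shift, the 60 minutes of changeovers and the 48 minutes of breakdowns are assumed values, and `availability` is a hypothetical helper, not a SYMESTIC function.

```python
def availability(planned_min: float, downtime_min: float) -> float:
    """Availability = (planned production time - downtime) / planned production time."""
    if planned_min <= 0:
        raise ValueError("planned production time must be positive")
    return (planned_min - downtime_min) / planned_min


# Assumed example shift (not from the article): all values in minutes.
SHIFT_MIN = 480
CHANGEOVER_MIN = 60
BREAKDOWN_MIN = 48

# Honest view: changeovers stay in planned production time and count as downtime.
honest = availability(SHIFT_MIN, CHANGEOVER_MIN + BREAKDOWN_MIN)

# Cosmetic view: changeovers are removed from planned production time.
cosmetic = availability(SHIFT_MIN - CHANGEOVER_MIN, BREAKDOWN_MIN)

# Same two views after a SMED project halves changeover time to 30 minutes.
honest_after = availability(SHIFT_MIN, 30 + BREAKDOWN_MIN)
cosmetic_after = availability(SHIFT_MIN - 30, BREAKDOWN_MIN)

print(f"honest:   {honest:.1%} -> {honest_after:.1%}")
print(f"cosmetic: {cosmetic:.1%} -> {cosmetic_after:.1%}")
```

With these assumed figures the honest view moves from 77.5 to 83.8 percent, a gain of more than six points, while the cosmetic view moves by less than one point. The improvement is real in both cases; only one denominator makes it legible.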

What are the six big losses?

The classical TPM framework — still the cleanest way to think about where downtime and performance losses come from — identifies six categories. Memorising them is worth the fifteen minutes it takes, because every real-world stop reason maps cleanly to one of them.

  1. Breakdown losses. Equipment failures that stop the machine for extended periods. High visibility, usually already tracked, rarely the biggest loss category in mature plants.
  2. Setup and adjustment losses. Changeover time, first-piece approval, machine warm-up. The SMED methodology was invented for this category specifically.
  3. Idling and minor stops. The feeder jams, sensor faults and product mis-positions that take 30 seconds to fix and happen 50 times per shift. The single largest hidden loss in most lines.
  4. Reduced speed losses. The line runs, but slower than design rate. Not visible on a binary "up/down" counter at all — only visible when you compare actual cycle time to theoretical cycle time.
  5. Defects and rework. Quality losses that consume capacity without producing sellable output.
  6. Start-up losses. Reduced yield during ramp-up after any stop. Often overlooked and often substantial on lines with long warm-up curves.

The losses that show up on a supervisor's radar are almost exclusively categories 1 and 2. The ones that kill OEE in a line that already handles breakdowns well are categories 3, 4 and 6 — precisely the ones that humans cannot reliably count without automated measurement.
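One way to make the mapping concrete is a small lookup table. The assignment of the six losses to OEE factors follows the conventional TPM grouping; the stop codes (`SPINDLE_FAULT`, `FEEDER_JAM` and so on) are hypothetical examples, and every plant would substitute its own list.

```python
from enum import Enum


class OeeFactor(Enum):
    AVAILABILITY = "availability"
    PERFORMANCE = "performance"
    QUALITY = "quality"


# Conventional TPM grouping of the six big losses by OEE factor.
SIX_BIG_LOSSES = {
    1: ("Breakdown losses", OeeFactor.AVAILABILITY),
    2: ("Setup and adjustment losses", OeeFactor.AVAILABILITY),
    3: ("Idling and minor stops", OeeFactor.PERFORMANCE),
    4: ("Reduced speed losses", OeeFactor.PERFORMANCE),
    5: ("Defects and rework", OeeFactor.QUALITY),
    6: ("Start-up losses", OeeFactor.QUALITY),
}

# Hypothetical plant-specific stop codes, each mapped to one loss category.
STOP_CODE_TO_LOSS = {
    "SPINDLE_FAULT": 1,
    "CHANGEOVER": 2,
    "FEEDER_JAM": 3,
    "RAMP_UP_SCRAP": 6,
}


def factor_for(stop_code: str) -> OeeFactor:
    """Return the OEE factor that a given stop code counts against."""
    loss_id = STOP_CODE_TO_LOSS[stop_code]
    return SIX_BIG_LOSSES[loss_id][1]


print(factor_for("FEEDER_JAM").value)
```

Keeping the mapping in one table means a disputed classification is changed in one place, in writing, rather than drifting shift by shift.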

Why does the real downtime number always shock people?

This is the pattern I have watched in several hundred plants across three decades, and it has never failed to appear. A plant estimates its downtime at eight to ten percent of planned production time, based on manual logs filled in at shift end. Automatic measurement goes live. The first honest number comes back at 22 to 28 percent. There is a meeting. Someone accuses the MES of miscounting. The MES is not miscounting. The plant was.

The mechanism is not fraud and it is not incompetence. It is the interaction of three honest phenomena that compound against each other:

  • Micro-stops disappear. A 90-second jam clearance is annoying, not memorable. By the time the shift-end report is filled in, it has been forgotten. Eighty of those per shift add up to two hours of invisible downtime that no manual system will ever capture.
  • Slow running looks like running. A line operating at 80 percent of design speed for two hours shows up on the operator report as "running." The lost 20 percent is a performance loss, not a downtime event — but it consumes the same capacity as the equivalent downtime, and it is only visible when actual cycle time is compared to target cycle time, which no paper-based system does.
  • Classification drifts toward excused. Borderline events get classified as "waiting for material" rather than "internal failure." Over months, the plant's self-image stabilises around a downtime number that flatters the maintenance and logistics teams equally. The number is internally consistent. It just does not describe the machines.
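The first two phenomena can be put into numbers. This is a back-of-the-envelope sketch using the figures from the bullets above (80 jams of 90 seconds, two hours at 80 percent speed); the 480-minute shift length is an assumption added for illustration.

```python
SHIFT_MIN = 480  # assumed 8-hour shift; the article does not state a shift length


def micro_stop_minutes(stops_per_shift: int, seconds_per_stop: float) -> float:
    """Total minutes lost to short stops over one shift."""
    return stops_per_shift * seconds_per_stop / 60


def speed_loss_minutes(run_minutes: float, speed_ratio: float) -> float:
    """Capacity lost to slow running, expressed as equivalent minutes of standstill."""
    return run_minutes * (1.0 - speed_ratio)


micro = micro_stop_minutes(80, 90)    # eighty 90-second jam clearances
slow = speed_loss_minutes(120, 0.80)  # two hours at 80 percent of design speed
hidden_share = (micro + slow) / SHIFT_MIN

print(f"micro-stops:  {micro:.0f} min")
print(f"slow running: {slow:.0f} min")
print(f"hidden share of shift: {hidden_share:.0%}")
```

Under these assumptions the two effects together hide 144 minutes, roughly 30 percent of the shift, before a single breakdown is counted.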

The honest rule we have arrived at after 15,000+ machine connections: the first automated downtime baseline is always 2 to 3 times the number the plant believed. That is not a failure of the new measurement. It is the first honest baseline the plant has ever had, and real improvement begins from there. In the Meleghy rollout — six plants across four countries — the same pattern showed up in every site, and the plants that accepted the new number rather than arguing with it delivered 10 percent fewer stoppages within six months.

How is downtime actually captured?

The capture method matters far more than the analytics layered on top of it. Three patterns dominate, and the quality of every subsequent decision depends on which one a plant is using.

Fully automatic capture. Machine state is read from the PLC via OPC UA, or from digital I/O signals wired to a gateway on older equipment. Every state transition produces a timestamped event in the MES. The operator's only job is to classify the reason, ideally from a short, deliberately constrained list of 8 to 12 codes rather than a 40-item drop-down nobody reads. This is the only capture method that picks up micro-stops at all, and it is the pattern that dominates the SYMESTIC installed base for exactly that reason.
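A minimal sketch of how timestamped state transitions become stop events, assuming a simplified state model in which anything other than `RUNNING` counts as stopped. The state names, the `Stop` class and `stops_from_transitions` are hypothetical; a real MES or OPC UA integration will carry richer state information.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Stop:
    start: datetime
    end: datetime
    reason: Optional[str] = None  # classified later by the operator

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


def stops_from_transitions(transitions):
    """Turn time-sorted (timestamp, state) pairs into closed Stop intervals.

    A stop opens when the state first leaves RUNNING and closes when it
    returns to RUNNING. A stop still open at the end is not emitted.
    """
    stops = []
    stop_start = None
    for ts, state in transitions:
        if state != "RUNNING" and stop_start is None:
            stop_start = ts
        elif state == "RUNNING" and stop_start is not None:
            stops.append(Stop(stop_start, ts))
            stop_start = None
    return stops


t0 = datetime(2026, 4, 1, 6, 0, 0)
transitions = [
    (t0, "RUNNING"),
    (t0 + timedelta(minutes=12), "STOPPED"),
    (t0 + timedelta(minutes=12, seconds=14), "RUNNING"),  # 14-second micro-stop
    (t0 + timedelta(minutes=40), "FAULT"),
    (t0 + timedelta(minutes=40, seconds=30), "STOPPED"),  # state change within one stop
    (t0 + timedelta(minutes=65), "RUNNING"),
]

for stop in stops_from_transitions(transitions):
    print(stop.start.time(), stop.duration)
```

The 14-second event is exactly the kind of stop that never reaches a shift sheet; here it is captured with the same weight as the 25-minute fault.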

Semi-automatic capture. Duration and timing come from the machine; reason codes come from operator input at a shop-floor terminal. Acceptable for most discrete manufacturing, and often the pragmatic compromise on older equipment where a full PLC tap is impractical. The data quality sits roughly 80 percent of the way from paper to fully automatic — good enough for meaningful OEE, not quite good enough for root-cause work on the long tail.

Manual / paper-based capture. Operator writes stop events on a shift sheet; supervisor transcribes into a system the following morning. The resulting data is useful for broad trend awareness and nothing else. Any plant still working this way has an unknown real downtime figure, and the first investment worth making is not better analytics — it is the gateway that turns the machine into a source of truth.

How much downtime is normal?

The useful ranges below reflect what actually shows up in discrete manufacturing when downtime is measured automatically against true planned production time — including changeovers, including micro-stops. Ranges for plants that use cosmetic denominators will always look better and mean less.

Plant maturity | Unplanned downtime share | Corresponding availability
--- | --- | ---
Reactive | 25–40 % | 60–75 %
Transitional | 15–25 % | 75–85 %
Mature | 7–15 % | 85–93 %
World-class | < 7 % | > 93 %

Any figure below 5 percent in a genuinely discrete manufacturing environment should be audited. It is not impossible, but in a population of several hundred plants it is rare enough that the first assumption should be measurement-system flattery rather than genuine excellence.
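The bands and the audit rule can be expressed as a small helper. This is a sketch: the handling of exact boundary values (for example, whether 15 percent is Mature or Transitional) and of shares above 40 percent is an assumption, since the table leaves both open.

```python
def maturity_band(unplanned_share: float) -> str:
    """Map an unplanned downtime share (0.0 to 1.0) to a maturity band.

    Boundary values are assigned to the less mature band, and anything
    above 40 percent is treated as Reactive; both choices are assumptions.
    """
    if not 0.0 <= unplanned_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if unplanned_share < 0.07:
        return "World-class"
    if unplanned_share < 0.15:
        return "Mature"
    if unplanned_share < 0.25:
        return "Transitional"
    return "Reactive"


def needs_audit(unplanned_share: float) -> bool:
    """Flag figures below 5 percent for a measurement-system audit."""
    return unplanned_share < 0.05


for share in (0.04, 0.10, 0.22, 0.30):
    print(f"{share:.0%}: {maturity_band(share)}, audit={needs_audit(share)}")
```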

How do you reduce downtime in practice?

The sequence below is the one that works in the field, not the textbook version that starts with "implement a culture of continuous improvement." Culture matters, but culture on top of bad data produces confidently wrong decisions.

  1. Fix the measurement. Automate the capture. Accept that the new number will be worse than the old one. Resist every suggestion to "adjust" the methodology to make the new number match the old number. This single step is the difference between 15 years of stalled OEE improvement and 12 months of real progress.
  2. Pareto aggressively. Three to five stop reasons will account for 60 to 75 percent of your lost time. Rank them. Attack them in order. Do not chase the long tail until the top of the list is under control.
  3. Split availability and performance losses explicitly. Breakdowns and micro-stops require different fixes. The former is a maintenance and equipment-design problem; the latter is a process-stability and sensor-tuning problem. Treating them together is how plants spend two years on predictive maintenance and watch OEE rise by one percentage point.
  4. Instrument the top failure modes with process data context. The stop event alone is an outcome. The parameters at the moment of the stop — temperature, pressure, cycle number, material batch, operator, shift — are the cause. Modern MES platforms hold both on the same time base, which is where root-cause analysis actually happens.
  5. Lock improvements with alarms and control limits. Every fix erodes within six months unless it is protected by automated monitoring. The discipline of control is where most improvement programmes fail, not the discipline of analysis.
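Step 2 of the sequence above is mechanical once the stop events exist. A minimal Pareto sketch, assuming stops arrive as (reason, minutes) pairs; the sample data and the 75 percent cutoff are illustrative, and `pareto` is a hypothetical helper.

```python
from collections import defaultdict


def pareto(stops, cutoff=0.75):
    """Rank stop reasons by lost minutes and return the top reasons that
    together reach the cutoff share of total lost time.

    Returns a list of (reason, minutes, cumulative_share) tuples.
    """
    totals = defaultdict(float)
    for reason, minutes in stops:
        totals[reason] += minutes

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    result = []
    cumulative = 0.0
    for reason, minutes in ranked:
        cumulative += minutes
        result.append((reason, minutes, cumulative / grand_total))
        if cumulative / grand_total >= cutoff:
            break
    return result


# Invented sample data for one week, in minutes.
stops = [
    ("feeder jam", 120), ("changeover", 95), ("material starvation", 60),
    ("tool change", 35), ("sensor fault", 20), ("feeder jam", 40), ("other", 30),
]

for reason, minutes, share in pareto(stops):
    print(f"{reason:<22}{minutes:>6.0f} min  cumulative {share:.0%}")
```

In this invented data set three reasons out of six cover about 79 percent of the lost time, which is the shape the article describes: rank, attack in order, and leave the long tail for later.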

A realistic expectation for a plant that commits honestly to this sequence: 20 to 35 percent reduction in unplanned downtime within 12 months, with the biggest gains in the first 90 days coming from the Pareto work on the top three stop reasons alone.

Where does downtime measurement fit in the SYMESTIC platform?

In the SYMESTIC deployment pattern, machine states flow into production KPIs directly from the controller or from an I/O gateway on brownfield equipment. Every state transition produces a timestamped event with sub-second resolution, which is what makes micro-stop detection possible at all. The alarms module structures stop-reason codes into a deliberately short, operator-friendly list; the process data module supplies the parameter context at the moment of each stop. The authoritative vocabulary comes from ISO 22400 (manufacturing KPIs), VDI 3423 (availability) and SEMI E10 (equipment reliability and availability) — the documents are easy to find by name and worth reading in full by anyone designing a measurement system from scratch.

FAQ

What is the difference between downtime and idle time?
Downtime is a period during planned production time when an asset should be producing and is not. Idle time, in the precise sense, is time outside planned production time — the asset is not scheduled to run. Idle time does not count against OEE; downtime does. The terminology is used loosely in everyday shop-floor conversation, which is why written stop-reason taxonomies matter.

Do planned maintenance and changeovers count as downtime?
Yes, if they occur during planned production time. The temptation to exclude them — because they are "planned" — is strong and counterproductive. A plant that excludes changeovers from its denominator will never measure SMED improvements and will never have an honest OEE. Include them, measure them, attack them.

What is a micro-stop?
Conventionally, any stop shorter than two to five minutes (definitions vary). Micro-stops are functionally invisible to manual logging and are typically the single largest category of hidden loss in automated lines. ISO 22400 treats them as performance losses rather than availability losses, which changes where they appear in OEE but not how much capacity they consume. Practically, they are where the interesting improvement potential lives once the big breakdowns are under control.

How is downtime connected to OEE?
Directly. Availability — one of the three OEE factors — is operating time divided by planned production time, which is equivalent to (planned production time minus downtime) divided by planned production time. Reducing unplanned downtime raises availability one-for-one. See OEE and Availability for the full treatment.

Why does our OEE drop when we install an MES?
Because the previous number was wrong. Automated capture picks up the micro-stops and speed losses that manual logging missed. A 2–3× rise in reported downtime during the first 90 days of MES-based capture is the norm we see across 15,000+ machine connections. The honest baseline is the starting point of real improvement; the flattering old number was not.

How short is "too short" for a stop to count?
Any stop long enough to interrupt cycle time counts in principle. In practice, most plants set a threshold at 10 to 30 seconds for micro-stop classification, below which events are treated as within-cycle variation rather than stops. The threshold should be written down and stable over time. Moving it is a common way to accidentally make the number look better.
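Written down as code, a threshold rule might look like the sketch below. The 10-second lower bound is one value from the 10 to 30 second range mentioned above, and the 120-second upper bound follows the two-minute convention; both are assumptions to be fixed per plant, and the constant and function names are hypothetical.

```python
# Assumed thresholds; a plant should document its own values and keep them stable.
MIN_STOP_S = 10          # shorter events count as within-cycle variation
MICRO_STOP_MAX_S = 120   # stops shorter than this count as micro-stops


def classify_stop(duration_s: float) -> str:
    """Classify an interruption by its duration in seconds."""
    if duration_s < 0:
        raise ValueError("duration must not be negative")
    if duration_s < MIN_STOP_S:
        return "within-cycle variation"
    if duration_s < MICRO_STOP_MAX_S:
        return "micro-stop"
    return "stop"


for seconds in (5, 14, 90, 600):
    print(seconds, classify_stop(seconds))
```

Keeping the thresholds as named constants under version control makes any later change visible, which is the point of writing the rule down.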

Is zero downtime a realistic target?
No, and the plants that get closest are not the ones that set it as a target. The realistic targets are planned downtime reduced to what is necessary for maintenance and changeover, and unplanned downtime reduced to a level consistent with equipment reliability and supply stability. World-class discrete manufacturing sits below 7 percent unplanned downtime; below that, the marginal cost of each additional percentage point rises steeply, and the investment is often better spent elsewhere.


Related: OEE · Availability · MTBF · MTTR · Machine Data Acquisition · Planned Maintenance Percentage · Preventive Maintenance · SMED · Six Big Losses · Alarms.

About the author
Uwe Kobbert
Founder and CEO of symestic GmbH. 30+ years in manufacturing IT — starting 1989 as consultant at SAS, then division head for industry at STERIA (process control and MES for food and beverage). Founded SYMESTIC in 1995 in Dossenheim near Heidelberg; in the mid-2010s rebuilt the platform from scratch as cloud-native on Microsoft Azure. Today SYMESTIC runs in 18 countries across four continents with 15,000+ connected machines, fully self-funded. Nominated for the Großer Preis des Mittelstandes. Dipl.-Ing. Communications Engineering/Electronics.