Workover rig on an upstream lease, the reactive maintenance event that the 33% preventable-downtime number eliminates
The Proof · Industry Benchmark

33% of Your Production Downtime Is Preventable. The Supermajors Already Proved It.

A&M published the 33% number in 2015. ExxonMobil, ConocoPhillips, and Chevron confirmed it in peer-reviewed 2024 case studies. The math is durable. The operating model is named. The four-week adoption path is open right now.

Michael Atkin, P.Eng · May 15, 2026 · 12 min read
33% · Reduction in production downtime under exception-based surveillance (A&M, 2015)
2% · Oil uplift from ExxonMobil's Permian neural-network ESP deployment (SPE Artificial Lift Conference, 2024)
80-90% · Reduction in select maintenance activities, ConocoPhillips Norway BU digital twin program (JPT, 2024)
5% · Year-one LOE reduction, Chevron Kaybob Duvernay closed-loop lift control (JPT, 2024)

Alvarez & Marsal published the math in 2015. ExxonMobil, ConocoPhillips, and Chevron have peer-reviewed case studies from 2024. The number is durable, the operating model is named, and the supermajors run it every day. Most independents still respond to equipment failure on the next pumper round. The barrier is not the math. It is adoption.


The Number That Has Not Moved in Ten Years

In 2015, Alvarez & Marsal published "The Advantages of Exception-Based Surveillance," a whitepaper that put a defensible number on a question every upstream operator had been asking informally for years: how much of the production downtime we currently absorb is structurally preventable, on the same wells, with the same crew, on the equipment we already own?

The answer was 33%.

That was the lever-one number inside the broader exception-based surveillance (EBS) framework. Roughly one-third of upstream production downtime is preventable through earlier intervention on the equipment degradation signal that is already in the historian by the time the failure happens. The other two-thirds belongs to the framework's other levers: better dispatch, faster rerouting, and tighter lift control. The 33% sits in maintenance specifically. Catch the failure before it stops the well, and the downtime never gets booked.

The whitepaper is now ten years old. It has been cited at SPE roundtables. It has been productionized inside every supermajor with a basin position. And it has not been adopted at scale by the small-to-mid operators who collectively run more than half of North American producing wells.

This article does one job. It explains what the 33% actually is, why the supermajor case studies that followed validated the number, and what the four-week adoption path looks like for an independent that has not done this yet.

What Reactive Maintenance Actually Costs the Median Independent

Walk into a 500-well independent and the maintenance model is recognizable inside an hour. A pump fails. The pumper finds it on the next round, somewhere between four and thirty-six hours after the event. The pumper logs it on a paper ticket or a tablet, calls dispatch, and waits for a workover crew. Parts are ordered. The well sits offline. By the time the workover is complete, the deferment has run three days to a week. Production accounting catches up two billing cycles later, when the missed barrels show up in the variance to forecast.

The cycle repeats five to fifteen times a month, depending on fleet age, water cut, and basin. Every event is small enough to absorb. The aggregate is not.

A typical 500-well operator producing 8,000 BOE per day loses 2 to 4% of annual production to reactive maintenance. That is 160 to 320 BOE per day. At $65 realized, the deferment is roughly $4M to $8M of annual revenue. The associated emergency-cost premium on workover labor and trucking adds another $1M to $2M in OPEX. The combined annual exposure to reactive maintenance, on a 500-well operator, is in the range of $5M to $10M before any second-order effects on lease accounting, hedge timing, or working capital.
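As a check on the arithmetic, here is the deferment math in a few lines of Python. The well count, production rate, and price are the illustrative figures above, not operator data.

```python
# Deferment math for the illustrative 500-well operator above.
DAILY_BOE = 8_000                   # fleet production, BOE per day
PRICE = 65                          # realized price, $ per BOE
LOSS_LOW, LOSS_HIGH = 0.02, 0.04    # share of production lost to reactive maintenance

for loss in (LOSS_LOW, LOSS_HIGH):
    deferred_boed = DAILY_BOE * loss              # 160 to 320 BOE per day
    annual_revenue = deferred_boed * PRICE * 365  # ~$3.8M to ~$7.6M per year
    print(f"{loss:.0%} loss -> {deferred_boed:.0f} BOE/d, ${annual_revenue / 1e6:.1f}M/yr")
```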

The math has been known since the A&M paper. The reason it has not moved at most independents is that the build was always quoted as a multi-year project. It is not, anymore.


Three Supermajors. Three Citations. Three Numbers You Can Verify.

The strongest evidence the 33% number is durable is not from the consulting whitepaper that published it. It is from the supermajor case studies that came after, each of which deployed an EBS-plus-closed-loop operating model on a real basin position and reported the result in a peer-reviewed venue.

ExxonMobil. Permian. 1,300+ wells. 2% production uplift. No new sensors. Presented at the SPE Artificial Lift Conference in 2024, the deployment was a neural-network closed-loop optimization on the existing ESP fleet across more than 1,300 Permian wells. The reported production uplift was 2%, with no equipment changes. The optimization layer was algorithmic. The wells were existing. The signal was already in the historian. Run that across a comparable independent fleet and the deferment recovery alone covers the deployment in single-digit months.
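To ground the single-digit-months claim, a hedged payback sketch: the 2% uplift applied to the illustrative 8,000 BOE/day fleet from earlier, against a deployment cost that is a hypothetical placeholder (no price is quoted anywhere in the case study).

```python
# Payback sketch: 2% uplift on the illustrative 8,000 BOE/day fleet.
# DEPLOYMENT costs below are hypothetical placeholders, not quoted figures.
DAILY_BOE = 8_000
PRICE = 65                                        # realized price, $ per BOE
UPLIFT = 0.02

monthly_uplift = DAILY_BOE * UPLIFT * PRICE * 30  # ~$312k per month
for deployment_cost in (1_000_000, 2_000_000):    # assumed $ range
    months = deployment_cost / monthly_uplift
    print(f"${deployment_cost / 1e6:.0f}M deployment -> payback in {months:.1f} months")
```

Both assumed cost points land inside single-digit months; the claim survives even a doubled deployment cost.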

ConocoPhillips. Norway business unit. 80% reduction in select maintenance activities. 90% in others. Reported in JPT (Journal of Petroleum Technology) in 2024 as part of an IOCaaS deployment, the Norway program ran a digital-twin-driven maintenance cadence on a defined set of activities. The reductions were not headcount reductions. They were activity-level reductions: scheduled tasks the model determined did not need to happen on the prior cadence because the signal said the equipment was fine. The freed crew hours went to higher-leverage work. The downstream effect on workover demand, parts inventory, and emergency mobilization was reported as material in the same publication.

Chevron. Kaybob Duvernay. 5% LOE reduction in year one. Also published in JPT in 2024, the Chevron case study covered a closed-loop lift control deployment on the Kaybob Duvernay. The 5% LOE reduction in year one is significant on its own. The composition of the reduction is more significant. The largest contributor was the shift in maintenance from emergency to scheduled, which carried a 3 to 5x cost differential per event on the prior baseline.

Three operators, three basins, three independent JPT or SPE-published numbers. None of them was vendor marketing. None of them required a sensor refresh as a precondition. All three confirmed the operating premise A&M put on paper a decade earlier: the signal is in the historian, the cost of the failure is preventable, and the gap is closeable inside an annual operating cycle.

The Three Mistakes That Keep Independents Stuck in Reactive Mode

If the math is published, the supermajor deployments are real, and the adoption window is open, why are most independents still operating reactively? Three mistakes account for the gap, in the order they typically derail the adoption conversation.

Mistake One: Waiting on a two-year data cleanup. The most common version of this is the IT-led data quality assessment that arrives in front of the executive committee three months into the AI conversation. The assessment is accurate. The data is messy. The historian has unmapped tags, the EAM has duplicate work-order types, and the production accounting close lag is longer than anyone wants to admit on paper. But the conclusion that gets attached to the assessment is the wrong one: that the data must be cleaned before the AI can be deployed. The supermajors did not clean their data first. They scored what they had, ranked the day, and improved the data quality as a byproduct of operating on it. The data cleanup happens inside the deployment, not in front of it.

Mistake Two: Buying new sensors before scoring the data already in the historian. The second derailment usually arrives from a vendor pitch that opens on instrumentation gaps. The story is plausible. There are tags missing. There are sensors that would help. The conclusion that gets attached, again, is the wrong one: that the AI deployment is gated on the sensor refresh. Most upstream operations collect ten times more data than they score. The 33% preventable downtime number was produced on the data already collected, not on a hypothetical instrumentation tier. New sensors earn their place on individual well ROI math after the score is running, not before.

Mistake Three: Treating predictive maintenance as a separate project from dispatch. The third mistake is the most expensive. An operator stands up a predictive maintenance pilot in isolation from the dispatch loop. The model produces forecasts. The forecasts go into a dashboard. The dashboard is reviewed weekly by a planner. The pumper still runs the same fixed route on Tuesday. The forecast did not change tomorrow's work order. The model became a reporting artifact, not an operating change. A forecast only moves a number when it changes the order in the truck cab tomorrow morning.

The structural error common to all three is treating predictive maintenance as a technology procurement instead of an operating change. The supermajor case studies were operating changes that happened to use technology. The independent equivalents have to be the same.

What the 33% Looks Like on a Tuesday Morning

The 33% number is not abstract. On a Tuesday morning in a 500-well operation, it looks like four specific changes to the work order pumpers run.

Change one: the score ranks every well by failure probability and FCF at risk. Every well in the fleet sits on a score by 6 AM, computed overnight against the SCADA, historian, accounting, and EAM read of the prior 24 hours. The score is two numbers: probability that the well will experience an intervention-worthy event in the next 7 days, and the free cash flow at risk on that well if the event happens. A 200-bopd well at 70% failure probability outranks a 30-bopd well at 90% failure probability. A 200-bopd well at 30% failure probability with a high-value gas stream outranks both. The ranking is not subjective.
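Read as a formula, the ranking key is expected FCF at risk: failure probability times the cash flow the well puts at risk. A minimal sketch of that logic, with illustrative well IDs, field names, and an assumed 2.5x gas-revenue multiplier to reproduce the third example (none of this is the WorkSync schema):

```python
from dataclasses import dataclass

@dataclass
class WellScore:
    well_id: str
    p_fail_7d: float    # probability of an intervention-worthy event in 7 days
    fcf_at_risk: float  # free cash flow at risk if the event happens, $

    @property
    def priority(self) -> float:
        # Expected FCF at risk: the ranking key. Not subjective.
        return self.p_fail_7d * self.fcf_at_risk

wells = [
    WellScore("A-12", p_fail_7d=0.70, fcf_at_risk=200 * 65 * 7),        # 200 bopd at 70%
    WellScore("B-07", p_fail_7d=0.90, fcf_at_risk=30 * 65 * 7),         # 30 bopd at 90%
    WellScore("C-03", p_fail_7d=0.30, fcf_at_risk=200 * 65 * 7 * 2.5),  # high-value gas stream
]

for w in sorted(wells, key=lambda w: w.priority, reverse=True):
    print(f"{w.well_id}: expected FCF at risk ${w.priority:,.0f}")
```

Sorting on the product reproduces the ordering in the paragraph: the high-value gas well first, the 200-bopd well second, the 30-bopd well last.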

Change two: the route is the top of the score, not the calendar. The pumper opens the route on the tablet. The first stop is the highest-scored well in the basin, not the next well on the calendar rotation. The pumper visits what the score said matters. The wells that did not change overnight fall off the route. The pumper runs more high-value visits per shift.

Change three: the model gets the field observation back the same day. When the pumper finds a backside leak on the well the score flagged, that observation flows back into the score before tomorrow's ranking runs. Voice-first capture turns the radio chatter, the handwritten ticket, and the gut call into structured records that the score consumes. The score gets sharper every shift.

Change four: the workover gets scheduled before the well goes down, not after. The workover crew is scheduled against the forecast, not against the failure. The parts are ordered against the forecast. The 3 to 5x emergency-cost premium disappears. The deferment never happens. The 33% is not a productivity statistic. It is the line item the controller stops booking on the variance to forecast.
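Stitched together, the four changes are one daily loop. A sketch under the same illustrative assumptions as the scoring example above (the function names, dispatch threshold, and record shapes are assumptions, not the WorkSync API):

```python
P_WORKOVER = 0.60  # assumed probability threshold for pre-scheduling a workover

def build_route(scored_wells, stops_per_shift=12):
    """Change two: the route is the top of the score, not the calendar."""
    ranked = sorted(scored_wells, key=lambda w: w.priority, reverse=True)
    return ranked[:stops_per_shift]

def ingest_observation(well, observation, historian):
    """Change three: the field observation feeds tomorrow's ranking."""
    historian.append({"well_id": well.well_id, "obs": observation})

def schedule_if_forecast(well, dispatch):
    """Change four: the workover is scheduled against the forecast, not the failure."""
    if well.p_fail_7d >= P_WORKOVER:
        dispatch.append({"well_id": well.well_id, "type": "scheduled_workover"})

historian, dispatch = [], []
route = build_route(wells)  # `wells` from the scoring sketch above
for well in route:
    schedule_if_forecast(well, dispatch)
ingest_observation(route[0], "backside leak, minor", historian)
```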

The supermajors made these four changes a decade ago. Independents can make them in a month.

The Four-Week Adoption Path

The 33% is not a multi-year initiative. An adoption pilot at a small-to-mid operator takes the same shape as the pump-by-priority pilot the WorkSync deployment team already runs.

Week 0: Sign the metric, not the software. The operations leader, the asset manager, and the CFO or controller pick one metric and write it down. For a preventable-downtime pilot, the most common Week-0 metrics are mean time from anomaly to first field response, deferred production recovered per crew shift in BOE, and emergency-versus-scheduled workover ratio on the pilot route. The threshold for "moved" is written into a one-page document the controller signs. No license fee. No kill fee. The Impact Guarantee makes the pilot the experiment.
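The Week-0 metric has to be computable from timestamps the operator already records. A sketch of the first one, mean time from anomaly to first field response, with illustrative timestamp pairs (in practice the anomaly time comes from the historian and the response time from field data capture):

```python
from datetime import datetime

# (anomaly detected, first field response) pairs; values are illustrative.
events = [
    (datetime(2026, 5, 4, 2, 10), datetime(2026, 5, 4, 14, 40)),
    (datetime(2026, 5, 6, 23, 5), datetime(2026, 5, 8, 7, 30)),
    (datetime(2026, 5, 9, 6, 45), datetime(2026, 5, 9, 16, 15)),
]

hours = [(resp - anom).total_seconds() / 3600 for anom, resp in events]
print(f"mean anomaly-to-response: {sum(hours) / len(hours):.1f} hours")
```

The same baseline-then-remeasure shape works for the other two Week-0 metrics.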

Week 1: Read-only integration on the stack you already own. SCADA or historian (AVEVA PI, Ignition, CygNet, eLynx), production accounting (Enertia, Quorum, Peloton), EAM or CMMS (Maximo, SAP PM, or a mid-market alternative), GIS, and the pumper's existing field data capture. Read-only. Five to seven working days. No new sensors. No rip-and-replace.

Weeks 2-3: The score runs live, the plan hits the truck cab. The scoring loop runs nightly. Every well, every artificial lift unit, every open work order is scored on failure probability and FCF at risk. The ranked plan is published to the truck cab and the operations center by 6 AM. The superintendent adjudicates the top 20 items each morning and feeds the corrections back into the score. The model gets sharper inside the week.

Week 4: Measure, decide, sign. The chosen metric is measured against the Week-0 baseline. If the metric moved past the threshold, the operator signs the annual on the same Impact Guarantee terms and the pilot route expands to the full asset on a 60-to-90-day rollout. If the metric did not move, the operator walks away. No license fee. No kill fee. The baseline data and the integration documentation stay with the operator.

The deployed reference at a top-25 private producer running over 5,000 wells across the Western Anadarko, Permian, and Wyoming followed this exact shape. The metric moved. The annual signed. The same operating model the supermajors run is now running on an independent fleet at a fraction of the build.

What Happens to the OPEX Line in the First Quarter

The 33% downtime reduction does not show up as a single line item on the operating budget. It distributes across three.

Workover OPEX drops. The shift from emergency to scheduled work carries the 3 to 5x cost differential per event. On a fleet with 60 reactive workovers a quarter, moving half of them into scheduled work at half the per-event cost saves $1M to $2M per quarter on a typical 500-well operator. The savings are real. They are also small relative to the production lift.
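The per-event emergency cost is not stated above, but it can be backed out. A sketch with an assumed cost range chosen to be consistent with the quoted $1M to $2M quarterly figure:

```python
# 60 reactive workovers a quarter; half move to scheduled work at half the cost.
REACTIVE_PER_QUARTER = 60
MOVED = REACTIVE_PER_QUARTER // 2          # 30 events rescheduled

for emergency_cost in (70_000, 130_000):   # assumed $ per emergency event
    scheduled_cost = emergency_cost / 2    # scheduled work at half the per-event cost
    savings = MOVED * (emergency_cost - scheduled_cost)
    print(f"${emergency_cost:,}/event -> ${savings / 1e6:.2f}M saved per quarter")
```

An implied emergency cost of roughly $70k to $130k per event reproduces the quoted range.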

Production deferment recovers. The bigger number is the production that does not get deferred because the well never goes offline. A 2 to 4% recovery on 8,000 BOE per day at $65 realized is $4M to $8M of annual revenue, with no associated unit cost. That number lands on FCF, not on the cost line. The controller sees it on the variance to forecast.

Lease accounting close compresses. The downstream effect is on the close cycle. Reactive deferments propagate through the accounting close as variance entries that have to be researched and reconciled. Scheduled workovers do not. Operators who shift the maintenance model into the forecast typically compress their close cycle by 1 to 2 days inside the pilot, with associated working-capital recovery the controller sees on the balance sheet.

The three numbers compound. The aggregate first-year impact on a 500-well operator running the 4-week pilot to full deployment is consistently in the range of $8M to $15M of annual FCF improvement on the same well count: no new wells drilled, no incremental headcount, no hardware refresh.

Why This Number Is Not Going to Stay Available

The adoption window for the 33% is open right now. It is not going to stay that way.

Three forces are closing it. First, the EPA OOOOb methane rule is forcing tank vapor recovery and continuous monitoring across most onshore tank fleets on a 2026-2027 compliance window. Operators that put a closed-loop maintenance score in place during the OOOOb build absorb the monitoring upgrade as part of an operating change they were going to make anyway. Operators that do not will end up paying for the monitoring twice.

Second, the supermajor playbook is migrating downstream into the mid-cap operator community. The 5,000+ well reference deployment, the JPT case studies, and the SPE Artificial Lift Conference papers are now in the executive briefing decks at the top quartile of independents. The first-mover advantage on FCF is real and measurable. The second-mover position is competing for the same crude on a higher cost structure.

Third, the workforce demographic does not give operators the luxury of staying in a reactive maintenance model indefinitely. The lease operator with 18 years of basin-specific knowledge who builds the route from intuition is retiring. The replacement, if hired, is running off a structured score by default. The operating model has to be in place before the institutional knowledge walks out the door.

The 33% has been published for a decade. The supermajors have been deploying it for that long. The math is durable. The case studies are peer-reviewed. The integration is one week. The pilot is four. The operating change is permanent.

There is no operator-credible reason to keep absorbing the third of the downtime number that is structurally preventable. The build is no longer multi-year. The proof is no longer disputable. The window to adopt at independent scale, on the same operating model the supermajors productionized a decade ago, is open right now.

Frequently Asked

Where does the 33% preventable-downtime number come from?

Alvarez & Marsal's 2015 whitepaper "The Advantages of Exception-Based Surveillance" reported a 33% reduction in production downtime under exception-based surveillance (EBS), driven by catching equipment degradation early enough to intervene at scheduled cost rather than emergency cost. The number is the maintenance lever inside the broader EBS framework. It has held up across a decade of subsequent supermajor deployments.

Are the supermajor case studies peer-reviewed or vendor marketing?

The ExxonMobil case study (1,300+ Permian wells, 2% production uplift on closed-loop ESP optimization) was presented at the SPE Artificial Lift Conference in 2024. The ConocoPhillips Norway case study (80% reduction in select maintenance activities, 90% in others) and the Chevron Kaybob Duvernay case study (5% LOE reduction year one on closed-loop lift control) were published in JPT (Journal of Petroleum Technology) in 2024. JPT vets case studies editorially before publication. None of the three is vendor marketing.

Does this require new sensors or a SCADA refresh?

No. The 33% number was produced on the data operators already collect. WorkSync deploys read-only onto the existing SCADA, historian, production accounting, EAM, and GIS stack (AVEVA PI, Ignition, CygNet, eLynx, Enertia, Quorum, Peloton, Maximo, SAP PM, ESRI). Operators on 15-year-old SCADA installations have moved the metric inside a 4-week pilot. New sensors earn their place on individual well ROI math after the score is running, not before.

What is the typical first-year FCF impact for a 500-well operator?

Aggregate first-year impact is consistently in the range of $8M to $15M of annual FCF improvement on the same well count. The composition: workover OPEX drops ($1M to $2M per quarter on the shift from emergency to scheduled work), production deferment recovers ($4M to $8M of annual revenue on a 2 to 4% recovery on 8,000 BOE/day at $65 realized), and lease accounting close compresses with associated working-capital recovery. No new wells drilled. No incremental headcount. No hardware refresh.

Why have most small-to-mid operators not adopted this yet?

Three mistakes account for the gap. (1) Waiting on a two-year data cleanup. The supermajors did not clean their data first; they scored what they had. (2) Buying new sensors before scoring the data already in the historian. Most operations collect ten times more data than they score. (3) Treating predictive maintenance as a separate project from dispatch. A forecast only moves a number when it changes tomorrow's work order. Strip those three friction points and the build is one week to integrate, one month to roll out.

What does the four-week adoption pilot look like for a preventable-downtime program specifically?

Week 0: the controller signs a metric (mean time from anomaly to first field response, deferred production recovered per crew shift in BOE, or emergency-versus-scheduled workover ratio on the pilot route). Week 1: read-only integration onto the existing SCADA, historian, accounting, and EAM stack. Weeks 2-3: scoring loop runs nightly, ranked plan in the truck cab by 6 AM, superintendent adjudicates the top 20 items each morning. Week 4: measure against baseline. If the metric moves past the threshold, the operator signs the annual. If not, the operator walks away. No license fee. No kill fee.

Is the adoption window actually closing?

Yes. Three forces are closing it. The EPA OOOOb methane rule is forcing continuous monitoring on tank fleets through a 2026-2027 compliance window, and operators who put a closed-loop maintenance score in place during the OOOOb build absorb the monitoring upgrade as part of the operating change. The supermajor playbook is migrating downstream into the top quartile of independents now, which means second-mover position is competing for the same crude on a higher cost structure. And the lease operator with 18 years of basin-specific knowledge is retiring, with the replacement running off a structured score by default. The operating model has to be in place before the institutional knowledge walks out.

