Every system in your stack has its own idea of what a well is. Your SCADA calls it a tag (WELL-007-PRESS). Your ERP calls it a cost center (WELL-7-A), sometimes two. Your CMMS calls it an asset (AST-42-007), with a third ID. Your GIS calls it a feature with coordinates and a fourth ID. Before any agent ranks a day’s work, those four systems have to agree on what is being ranked. This is the AI readiness problem most operators don’t realize they have until the integration discovery call.
The chapter you are reading is the most-bookmarked chapter of this guide. Save it. Read it again before you sign with any vendor that promises to deploy AI in your operation. Most AI programs that failed in oil & gas in 2023, 2024, and 2025 failed not because the AI was bad but because the integration was wishful.
The four authoritative sources (and what they each call “a well”)
SCADA, the live operating data. Pressures, flows, temperatures, runtime, dynacard shape, alarm status. The protocols are OPC-UA, MQTT, Modbus, DNP3, and the historian APIs that sit on top of them. The native unit is the tag. A typical mid-tier operator has 50,000 to 500,000 tags. Each tag is attached to a piece of equipment, but the mapping from tag to equipment is often kept in a spreadsheet that one person maintains.
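The spreadsheet dependency is easy to test for. A minimal sketch, assuming the tag-to-equipment spreadsheet can be exported as CSV with `tag` and `equipment` columns (the column names, the sample rows, and the `unmapped_tags` helper are illustrative assumptions, not any historian's export format):

```python
import csv
import io

# Hypothetical export of the tag-to-equipment spreadsheet.
MAPPING_CSV = """tag,equipment
WELL-007-PRESS,Well 007
WELL-007-FLOW,Well 007
"""

def unmapped_tags(tag_inventory, mapping_csv):
    """Return tags present in the SCADA inventory but absent from the mapping."""
    reader = csv.DictReader(io.StringIO(mapping_csv))
    mapped = {row["tag"].strip() for row in reader}
    return sorted(set(tag_inventory) - mapped)

# Tags in the live inventory that nobody has attached to equipment yet.
orphans = unmapped_tags(["WELL-007-PRESS", "WELL-008-PRESS"], MAPPING_CSV)
```

Here `orphans` is `["WELL-008-PRESS"]`. A long orphan list on day one is an early signal that the tag taxonomy will slow the integration.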
Historians, the durable record of SCADA. AVEVA PI (formerly OSIsoft) is the dominant system; AVEVA Historian (formerly Wonderware), GE Digital Proficy, and AspenTech IP.21 round out the field. Historians compress and archive operating data so you can answer questions like “what was tubing pressure on this well during the failure window six months ago?” The historian is where most ML anomaly models actually live, because it’s where the labeled history of equipment behavior lives.
ERP, the financial truth. Cost centers, accounts payable, royalty owners, working interest, joint-interest billing, AFEs, daily production allocation. Enertia, Quorum, WolfePak, and P2 (now part of IFS following the 2023 acquisition) are the dominant platforms in upstream. The ERP’s native unit is the cost center, which sometimes maps one-to-one with a well, sometimes one-to-many, and sometimes many-to-one when a pad has multiple wells billed under one cost-allocation umbrella.
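The three cardinalities can be made concrete with a small classifier. This is a sketch under stated assumptions: the `(cost_center, well)` pairs, the IDs, and the `classify_cardinality` helper are hypothetical, and a real ERP export will look different.

```python
from collections import defaultdict

# Hypothetical (cost_center, well) pairs as they might be exported from an ERP.
ALLOCATIONS = [
    ("CC-100", "Smith 12-A"),
    ("CC-200", "Jones 3"),
    ("CC-201", "Jones 3"),     # one well billed under two cost centers
    ("CC-300", "Pad 7 #1"),
    ("CC-300", "Pad 7 #2"),    # one cost center covering two wells on a pad
]

def classify_cardinality(pairs):
    """Label each cost center by how it relates to wells."""
    wells_by_cc = defaultdict(set)
    ccs_by_well = defaultdict(set)
    for cc, well in pairs:
        wells_by_cc[cc].add(well)
        ccs_by_well[well].add(cc)

    labels = {}
    for cc, wells in wells_by_cc.items():
        if len(wells) > 1:
            labels[cc] = "one cost center, many wells"
        elif len(ccs_by_well[next(iter(wells))]) > 1:
            labels[cc] = "many cost centers, one well"
        else:
            labels[cc] = "one-to-one"
    return labels

labels = classify_cardinality(ALLOCATIONS)
```

Only the cost centers labeled `"one-to-one"` can be joined to a well without an allocation rule; the other two cases need an engineer or accountant to say how cost is split.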
CMMS, the maintenance and asset hierarchy. IBM Maximo, IFS, eMaint, MaintainX, plus production-specific tools like Peloton WellView. The CMMS’s native unit is the asset, with its own asset ID, its own location hierarchy, and its own work order taxonomy. A well in CMMS may include sub-assets (rod pump, ESP, surface controller) that don’t exist as separate entities in SCADA or ERP.
GIS, the physical topology. ArcGIS dominates upstream and midstream. QGIS and custom geodatabases show up in smaller operators. GIS’s native unit is the feature, with coordinates, a feature class, and a feature ID. Wells, pipelines, batteries, and meters all live as features. GIS is also where regulatory boundaries, surface ownership, and right-of-way data live.
Each of these systems is mature. Each has been refined over decades. Each is the source of truth for its domain. The integration problem is not that any single system is bad. The integration problem is that the four systems have never been forced to reconcile.
The schema reconciliation problem
Take a real well. The pumper knows it as “Smith 12-A.” Across the four systems it shows up as:
Five identifiers for one physical well. Each system updates independently. The naming conventions drift over time, especially after M&A. A well sold from one operator to another may keep its CMMS asset ID but get a new ERP cost center. A well that gets re-completed may get a new SCADA tag prefix. A well that’s renamed for accounting purposes may stay the same physical asset but show up as “new” in three of the five systems.
Without reconciliation, an anomaly that the ML model fires on SCADA tag WELL-SMITH-12A-PRESS does not connect to the costs accruing to the ERP cost center, the CMMS work history showing that the rod pump was changed last quarter, or the GIS record showing that the well sits on a lease with a 60-day workover restriction. The agent has the alarm but not the context, so it can’t rank.
The reconciliation agent is what makes integration work. It ingests the IDs from all four systems, builds a canonical “asset graph” that maps each physical well to all of its system representations, and watches for drift. When a CMMS asset ID changes, the agent flags it and resolves it against SCADA + GIS evidence (location, tag prefix, parent battery). When a new well shows up in SCADA, the agent reconciles it against the ERP authorization-for-expenditure to know which cost center it belongs to. The agent is QA’d by an engineer; the engineer isn’t doing the work.
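The core data structure here is small. What follows is a minimal sketch of an asset graph with drift detection, not a description of any product's internals: the class names, the `detect_drift` method, and the example IDs are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class AssetNode:
    """One physical well and the ID each source system uses for it."""
    canonical_name: str
    ids: dict = field(default_factory=dict)  # e.g. {"scada": "...", "cmms": "..."}

class AssetGraph:
    def __init__(self):
        self._nodes = []
        self._index = {}  # (system, system_id) -> AssetNode

    def add(self, canonical_name, **system_ids):
        """Register a well together with its per-system identifiers."""
        node = AssetNode(canonical_name, dict(system_ids))
        self._nodes.append(node)
        for system, system_id in system_ids.items():
            self._index[(system, system_id)] = node
        return node

    def resolve(self, system, system_id):
        """Return the node for a system-specific ID, or None if unknown."""
        return self._index.get((system, system_id))

    def detect_drift(self, system, current_ids):
        """Compare the graph against a fresh ID export from one system.

        Returns (missing, unknown): IDs the graph expects but the export
        lacks, and IDs in the export the graph has never seen.
        """
        expected = {sid for (s, sid) in self._index if s == system}
        current = set(current_ids)
        return sorted(expected - current), sorted(current - expected)

graph = AssetGraph()
graph.add("Smith 12-A", scada="WELL-SMITH-12A", cmms="AST-42-007")  # illustrative IDs
missing, unknown = graph.detect_drift("cmms", ["AST-42-107"])
```

In this example `missing` is `["AST-42-007"]` and `unknown` is `["AST-42-107"]`, which is the signature of a rename. The sketch only detects the drift; pairing the two IDs back together using location, tag prefix, or parent battery, and getting an engineer's approval, is the part that needs evidence from the other systems.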
Read-only by default: why this matters
OT-grade integration starts read-only. We do not write setpoints to PLCs. We do not modify production records. We do not issue control commands to field equipment. The reasons are operational and regulatory. The OT environment runs on a different threat model from the enterprise IT environment, and IEC 62443 zone-and-conduit is the default architecture: each network zone has tightly defined ingress and egress, and the AI platform sits on the IT side consuming a read-only feed.
The buyer should ask exactly what “read-only” means in the vendor’s deployment. The strong version is enforced at the network layer (allowlisted endpoints, unidirectional data flow), not just at the application layer. The strong version uses an on-premises edge connector for strictly isolated deployments (air-gap support). The weak version is a config flag in the integration server that an admin can flip, which is not a security control.
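One way to make the question concrete is to ask for the conduit's flow rules and audit them. The sketch below checks a declared rule set for the two properties named above; it does not enforce anything, since enforcement belongs in firewalls or data diodes. The `FlowRule` shape, zone names, and hostnames are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRule:
    source_zone: str
    dest_zone: str
    dest_endpoint: str  # "host:port"

# Hypothetical allowlist: the only IT-side endpoint OT data may be sent to.
ALLOWED_ENDPOINTS = {"edge-connector.example.internal:8883"}

def find_violations(rules, ot_zone="OT", it_zone="IT"):
    """Flag rules that break unidirectional flow or bypass the allowlist."""
    problems = []
    for rule in rules:
        if rule.dest_zone == ot_zone and rule.source_zone != ot_zone:
            # Anything flowing into OT from outside breaks read-only.
            problems.append((rule, "traffic into the OT zone"))
        elif (rule.source_zone == ot_zone and rule.dest_zone == it_zone
              and rule.dest_endpoint not in ALLOWED_ENDPOINTS):
            problems.append((rule, "endpoint not on allowlist"))
    return problems

rules = [
    FlowRule("OT", "IT", "edge-connector.example.internal:8883"),
    FlowRule("IT", "OT", "plc-gateway.example.internal:502"),
    FlowRule("OT", "IT", "unknown.example.com:443"),
]
problems = find_violations(rules)
```

With these sample rules, the second and third entries are flagged. A vendor whose "read-only" is only a config flag will have no rule set of this kind to hand over.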
Chapter 5 covers the OT security architecture in depth. For the integration discussion, the relevant fact is that read-only constrains what the agent can do. It can rank work, route crews, generate work orders, and notify supervisors. It cannot close a valve. That is by design.
What “integrate in 1 week” actually means
One number to anchor expectations before the day-by-day. The integration window, the SCADA tap plus schema mapping plus reconciliation, runs about a week. The full GOOD-tier deploy, with success metrics defined and a ranked plan in production for one basin or one module, runs 30 to 60 days. The structured paid pilot framework runs six weeks with a Week-0 success-metric scope and a Week-6 executive readout. The day-by-day below is the integration window, not the full deploy.
- DAY 1–2: SCADA tap (read-only). Connect to the historian or directly to OPC-UA. Validate tag inventory. Sample-pull two weeks of historical data for the pilot well set. The deployment’s most fragile dependency is here: the SCADA tag taxonomy. If your tag naming has been refactored twice in five years, this takes longer.
- DAY 3–4: ERP / CMMS / GIS schema mapping. Pull the asset hierarchy from each system. Run the reconciliation agent against the SCADA tag inventory. An engineer reviews any conflicts. WorkSync ships with 40+ supported integrations across SCADA, ERP, CMMS, GIS, and historian platforms (Enertia, Quorum, Peloton WellView, Oildex, Cygnet, eLynX, AVEVA PI, Inductive Automation Ignition, IFS, Maximo, eMaint, ArcGIS, etc.), so the common stacks are covered.
- DAY 5–6: Reconciliation against historical data. Run the canonical asset graph against 30+ days of historical operating, maintenance, and financial data. Validate that ranked-work output makes sense to the superintendent. Adjust scoring weights for basin-specific behavior.
- DAY 7: Validation and go-live for the first basin. Field crews receive the first ranked plan. Foreman approves. Pumper’s morning is the new morning.
The caveat: this presumes the 40+ supported integrations cover your stack. If you are running a custom homegrown SCADA layer, an uncommon ERP, or a 1990s mainframe production-accounting system, the integration window stretches to 4 to 12 weeks because the connector doesn’t exist yet and we have to build it. We will tell you that on the discovery call.
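The caveat reduces to a simple check you can run before the discovery call. A rough sketch, assuming a connector list built from the platforms named in this chapter (it is a partial list for illustration, not the actual connector catalog) and a hypothetical `integration_window` helper:

```python
# Partial list taken from the platforms named in the text; illustrative only.
NAMED_CONNECTORS = {
    "enertia", "quorum", "peloton wellview", "oildex", "aveva pi",
    "inductive automation ignition", "ifs", "maximo", "emaint", "arcgis",
}

def integration_window(stack):
    """Return the expected window and any systems that need a custom connector."""
    unsupported = sorted(
        name for name in stack if name.strip().lower() not in NAMED_CONNECTORS
    )
    if unsupported:
        return "4 to 12 weeks", unsupported
    return "about 1 week", []

window, gaps = integration_window(["AVEVA PI", "Enertia", "Homegrown SCADA"])
```

For this example stack, `window` is `"4 to 12 weeks"` and `gaps` is `["Homegrown SCADA"]`. A single missing connector is enough to move the estimate, which is why the actual connector list matters more than the headline count.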
Seven questions every CTO should ask any vendor
Including us. These are the questions we expect every operator to ask before granting any integration access. If a vendor can’t answer all seven without hesitation, the integration is not real.
- 01. What does your reconciliation agent do, specifically? If the answer is some version of “we map IDs in a config file,” the integration won’t survive M&A, renaming, or a system migration.
- 02. Which systems do you have native connectors for, and which require custom work? Get the actual list. If your stack isn’t on the native list, the integration estimate is 4–12 weeks of custom work, not a week of standard deployment.
- 03. How is the schema mapping versioned and audited? A schema mapping is data, and data needs a history. If the mapping is “in the system” and there’s no version, the integration is opaque to your audit and impossible to debug when something drifts.
- 04. What happens when an asset gets renamed in the source system? The strong answer: the reconciliation agent flags the change, cross-validates against SCADA tag prefix and GIS location, proposes a remap, and surfaces it to an engineer to approve. The weak answer: nothing automatic, and the operator notices the next time a report breaks.
- 05. Show me the read-only enforcement. Network-layer allowlisting and unidirectional data flow, ideally in writing, ideally with the architecture diagram you give to the security team.
- 06. What’s the failure mode if SCADA goes offline? The agent should degrade gracefully: last-known state, queued writes (none in our case, since read-only), and a clear operator alert. The bad answer is “the platform stops working until SCADA comes back,” which means a routine SCADA maintenance window breaks your morning plan.
- 07. What’s the data residency, and can we BYOK? Single-tenant deployment with bring-your-own-key (BYOK) or hold-your-own-key (HYOK) is the modern default for OT-adjacent AI. If the vendor pools your data with other customers in a multi-tenant store, your operating data is co-mingled and your negotiating leverage on incident response is gone.
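Question 3 is the one most easily answered with a demonstration. As a minimal sketch of what "versioned and audited" can mean, the class below keeps an append-only history in which each version hashes its predecessor. The `MappingHistory` class, its fields, and the example IDs are assumptions for illustration; a real deployment might equally use a database with audit tables or a Git repository.

```python
import hashlib
import json
from datetime import datetime, timezone

class MappingHistory:
    """Append-only history of mapping versions, chained by SHA-256 hashes."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, mapping):
        # The hash covers the previous hash plus the mapping contents only.
        payload = json.dumps(mapping, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def commit(self, mapping, author, reason):
        """Store a new version and return its version number."""
        prev_hash = self._entries[-1]["hash"] if self._entries else ""
        self._entries.append({
            "version": len(self._entries) + 1,
            "mapping": dict(mapping),  # copy so later caller edits don't leak in
            "author": author,
            "reason": reason,
            "committed_at": datetime.now(timezone.utc).isoformat(),
            "prev_hash": prev_hash,
            "hash": self._digest(prev_hash, mapping),
        })
        return len(self._entries)

    def diff(self, old_version, new_version):
        """Return {key: (old_value, new_value)} for every key that changed."""
        old = self._entries[old_version - 1]["mapping"]
        new = self._entries[new_version - 1]["mapping"]
        keys = sorted(set(old) | set(new))
        return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

    def verify(self):
        """Recompute the hash chain; False means a stored version was altered."""
        prev_hash = ""
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["mapping"]):
                return False
            prev_hash = entry["hash"]
        return True

history = MappingHistory()
history.commit({"Smith 12-A": "AST-42-007"}, "engineer@example.com", "initial load")
history.commit({"Smith 12-A": "AST-42-107"}, "engineer@example.com", "CMMS rename approved")
changes = history.diff(1, 2)
```

Here `changes` is `{"Smith 12-A": ("AST-42-007", "AST-42-107")}`, which is the kind of answer an auditor needs: what changed, between which versions, who approved it, and why.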
The integration question is also a security question
The seven questions above are mostly integration questions, but three of them (5, 6, 7) are security questions in disguise. That’s the point. Integration architecture and OT security architecture are joined at the hip. Chapter 5 is the security chapter, and it picks up exactly where this one leaves off.
If you take one thing from this chapter: do not skip the integration discovery call. It is the single most diagnostic conversation you will have with any AI vendor in oil & gas. The vendors that can’t answer the seven questions will dodge them. The vendors that can will welcome them.