Technical Reference · AUG-AI-WO-001

AI Well Optimization
ESP Lift

Data requirements mapped to the Augmentify five-phase lifecycle — from initial scoping through to real-time VFD setpoint optimization and predictive failure detection.

Electric Submersible Pump · VFD Optimization · Predictive Failure Detection · Run Life Extension
Document Info
Reference: AUG-AI-WO-001
Revision: 1.0
Type: Technical Ref
Lift Type: ESP
Phases: 01 – 05
Data Requirements Matrix — By Phase
The matrix maps eight data categories across the five lifecycle phases (PH.01 Initiate, PH.02 Assess, PH.03 Select, PH.04 Define, PH.05 Execute), marking each as either required or partially required at each phase; the per-phase detail appears in the phase sections below.
  • Production & Reservoir Data
  • Fluid Properties & PVT
  • Completion & Well Architecture
  • ESP Equipment & VFD Configuration
  • Real-Time SCADA & Sensor Data
  • IPR & Well Test Data
  • Failure & Intervention History
  • Operational Constraints
Legend: required at this phase · partially required / preliminary

ESP Optimization — Data Requirements

ESP wells are typically the highest-value, highest-risk artificial lift assets in a portfolio. Predictive failure detection requires a minimum of 10–15 documented failure events and two years of hourly sensor data before model training; the data requirements below reflect this reality at each phase.

Phase 01 — Initiate: Scoping the AI optimization opportunity

Establishes whether AI ESP optimization is technically viable and commercially justified. Data requirements are high-level — enough to confirm instrument coverage, characterize production risk, and frame the optimization objective before committing to detailed assessment work.

Production & Reservoir — Preliminary
  • Field-level production rates (oil, gas, water) — historical trend
  • Number of active ESP wells and their operating status
  • Reservoir drive mechanism and depletion stage
  • High-level production decline profile
  • Known production deferrals and ESP downtime history
ESP Equipment — Preliminary
  • ESP inventory — pump make/model, motor rating, number of stages
  • VFD presence and frequency operating range per well
  • Downhole gauge presence and last known calibration
  • Average historical ESP run life across the asset
  • SCADA system type and historian availability
Operations & Constraints — Preliminary
  • Surface facility capacity limits (separator, export, water handling)
  • Current optimization workflow and ESP monitoring practice
  • Existing ESP failure tracking and maintenance logs
  • IT/OT integration landscape and data access restrictions
Key Questions to Answer at Initiate
  • How many ESP wells carry downhole gauges?
  • What is the current average run life, and what is the target improvement?
  • What historian/SCADA system holds the time-series data?
  • How many documented ESP failure events exist in the last 3 years?
Phase Deliverables
Data Availability Assessment · Optimization Opportunity Statement · ESP Well Inventory · Project Charter
Phase 02 — Assess: Data audit, quality assessment & model feasibility

A full audit of the data landscape: every source, quality level, and gap is documented. This phase determines whether sufficient historical data — particularly failure events and sensor continuity — exists to train reliable production optimization and predictive failure models.

Production & Reservoir
  • Per-well production rates — oil, gas, water (daily and hourly)
  • Wellhead and flowing tubing head pressure (FTHP) history
  • Casing head pressure (CHP) history
  • Flowing bottomhole pressure — measured or calculated from intake gauge
  • Static bottomhole pressure and reservoir pressure surveys
  • Production allocation methodology and well test frequency
  • Reservoir depletion history and pressure-production trends
Fluid Properties & PVT
  • Oil API gravity and viscosity by well / zone
  • GOR and WOR histories
  • Bubble point pressure — critical for ESP intake pressure management
  • Full PVT analysis — Bo, Rs, µo, µg
  • H2S and CO2 content (impacts motor insulation and material selection)
  • Produced water chemistry and salinity
ESP Equipment & Completion
  • Pump model, stage count, and manufacturer performance curves
  • Motor rating (hp, voltage, current), cable size and length
  • VFD make/model and full frequency operating range
  • Downhole gauge depth, make, last calibration date
  • Tubing size, packer depth, and completion schematic
  • Wellbore trajectory (MD/TVD) for deviated/horizontal wells
  • Setting depth and pump intake pressure design point
Real-Time & SCADA — Audit
  • Historian system type (OSIsoft PI, Aspen IP21, Wonderware, etc.)
  • Tag inventory — motor current, frequency, intake/discharge pressure, motor temp, vibration
  • Data completeness — % tag uptime over last 3 years per well
  • Known sensor drift events and calibration records
  • Timestamp integrity and timezone consistency
  • OT/IT network architecture and data access pathway
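The tag-uptime figure in the audit above can be computed directly from historian exports. A minimal sketch, assuming hourly-sampled tags exported as (timestamp, value) pairs with `None` marking dropouts; the function name `tag_uptime_pct` is illustrative, not a standard historian API:

```python
from datetime import datetime, timedelta

def tag_uptime_pct(samples, expected_interval_s=3600):
    """Percentage of expected samples that arrived with a valid value.

    `samples` is a time-ordered list of (timestamp, value) tuples; value
    is None for a dropout. The expected count is derived from the span
    covered and the nominal sampling interval (hourly by default).
    """
    if len(samples) < 2:
        return 0.0
    span_s = (samples[-1][0] - samples[0][0]).total_seconds()
    expected = int(span_s / expected_interval_s) + 1
    valid = sum(1 for _, v in samples if v is not None)
    return 100.0 * valid / expected

# Three days of hourly motor-current readings with a 12-hour outage
t0 = datetime(2024, 1, 1)
samples = [(t0 + timedelta(hours=h), None if 24 <= h < 36 else 41.5)
           for h in range(72)]
uptime = tag_uptime_pct(samples)   # 60 valid of 72 expected samples
```

In practice this runs per tag per well over the 3-year audit window, and the result feeds the >85% uptime threshold in the viability criteria below.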
IPR & Well Test Data
  • Well test history — frequency, method, last test date per well
  • Productivity index (PI) per well
  • Skin factor from pressure transient analysis (PTA)
  • Inflow performance relationship (IPR) curves
  • Reservoir permeability from core or well test
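Where well tests exist, the IPR curve for a solution-gas-drive well below bubble point is commonly sketched with the Vogel correlation. A minimal example, assuming a single well test point and field units (psi, STB/d); the values are illustrative only:

```python
def vogel_rate(q_max, p_res, p_wf):
    """Vogel inflow rate below bubble point:
    q = q_max * (1 - 0.2*(pwf/pr) - 0.8*(pwf/pr)**2)."""
    r = p_wf / p_res
    return q_max * (1.0 - 0.2 * r - 0.8 * r * r)

# Anchor the curve on one well test (q_test at flowing pressure p_wf_test)
p_res, q_test, p_wf_test = 3000.0, 800.0, 2000.0   # psi, STB/d, psi
r = p_wf_test / p_res
q_max = q_test / (1.0 - 0.2 * r - 0.8 * r * r)      # absolute open flow

# Tabulate the IPR from reservoir pressure down to atmospheric
curve = [(p, round(vogel_rate(q_max, p_res, p), 1))
         for p in (2500, 2000, 1500, 1000, 500, 0)]
```

The same curve later supplies the inflow side of the nodal analysis inputs listed under Phase 03.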
Failure & Intervention History
  • ESP failure log — date, failure mode, run life, operating conditions at failure
  • Failure mode breakdown: mechanical seal, motor burnout, gas lock, scale, sand ingestion
  • Pre-failure SCADA signatures available for each event
  • Workover and ESP changeout history per well
  • Sand, scale, and corrosion event records
Data Quality Thresholds for Model Viability
  • Minimum 2 years of hourly production and sensor data per well
  • >85% tag uptime on motor current, frequency, and intake pressure
  • Individual well tests at minimum quarterly frequency
  • At least 10–15 ESP failure events with pre-failure SCADA signatures
  • Fewer than 10 failure events requires a synthetic-augmentation or anomaly-detection approach
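The viability thresholds above can be encoded as a simple gate for the feasibility assessment. A sketch under the stated thresholds; the function name and the tiering of the 10–15 event band are illustrative assumptions, not a prescribed workflow:

```python
def model_feasibility(n_failures, years_hourly, tag_uptime_pct, tests_per_year):
    """Classify model feasibility against the viability thresholds above.

    Returns a (recommendation, gaps) pair. The 10-15 failure band is
    treated here as 'supervised with augmentation' -- an assumption.
    """
    gaps = []
    if years_hourly < 2:
        gaps.append("need >=2 years of hourly data")
    if tag_uptime_pct <= 85:
        gaps.append("need >85% tag uptime on key tags")
    if tests_per_year < 4:
        gaps.append("need at least quarterly well tests")
    if gaps:
        return "not viable", gaps
    if n_failures >= 15:
        return "supervised failure prediction", gaps
    if n_failures >= 10:
        return "supervised with augmentation", gaps
    return "anomaly detection", gaps
```

Run per asset, this yields the go/no-go input for the Model Feasibility Assessment deliverable.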
Phase Deliverables
Data Audit Report · Data Quality Assessment · Gap Analysis & Remediation Plan · Instrumentation Requirements · Model Feasibility Assessment · Risk Register
Phase 03 — Select: Model architecture, data pipeline & technology stack

Audit findings drive the model architecture choice, data pipeline design, and technology stack. For ESP optimization, the key architectural decisions are: supervised failure prediction vs. unsupervised anomaly detection (driven by failure event count), and VFD setpoint optimization approach based on available pump curve and downhole gauge coverage.

Data Pipeline & Integration
  • Historian API access and query performance benchmarking
  • Real-time streaming capability — OPC-UA, REST, or MQTT from SCADA
  • Data lake or cloud storage target (Azure, AWS, on-prem)
  • ETL requirements and data transformation specifications
  • Latency requirements for VFD setpoint recommendation loops
  • Cybersecurity and OT segmentation constraints
Training Dataset Definition
  • Feature set — motor current, frequency, intake pressure, temperature, vibration, production rate
  • Label definition — failure event (binary/time-to-failure) and production rate targets
  • Train / validation / test split strategy (temporal, not random)
  • Handling of missing data and known sensor outage periods
  • Time-windowing strategy for the failure prediction horizon (7-, 14-, or 30-day)
  • Cross-well vs. per-well model approach based on equipment homogeneity
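Two of the decisions above — forward-horizon labeling and the temporal (never random) split — are worth making concrete. A minimal sketch with illustrative sample times in hours and a 14-day (336 h) horizon; the helper names are assumptions:

```python
def label_time_to_failure(timestamps, failure_times, horizon_hours=336):
    """Binary label per sample: 1 if any failure occurs within the
    forward horizon (14 days = 336 h here), else 0."""
    return [int(any(0 <= (f - t) <= horizon_hours for f in failure_times))
            for t in timestamps]

def temporal_split(n, train=0.7, val=0.15):
    """Contiguous train/validation/test index ranges.

    Time series must be split temporally -- a random split leaks
    post-failure information backwards into training."""
    i = int(n * train)
    j = int(n * (train + val))
    return range(0, i), range(i, j), range(j, n)

hours = list(range(0, 1000, 24))   # one sample per day, in hours
failures = [500]                   # one failure event at hour 500
y = label_time_to_failure(hours, failures)
tr, va, te = temporal_split(len(hours))
```

Known sensor-outage periods would be masked out of `hours` before labeling rather than imputed.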
ESP Physics Inputs
  • Full pump performance curves from manufacturer (H-Q, efficiency, power)
  • Motor current vs. frequency operating envelope for VFD optimization
  • Cable resistance and voltage drop calculations at operating temperature
  • Nodal analysis model inputs for physics-informed hybrid models
  • Bubble point pressure for intake pressure management constraints
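The physics link between VFD frequency and the pump curve inputs above is the centrifugal pump affinity laws: rate scales with frequency, head with its square, shaft power with its cube. A sketch rescaling an illustrative 60 Hz catalog point; the numbers are not from any real pump curve:

```python
def affinity_scale(q_ref, h_ref, p_ref, f_ref, f_new):
    """Rescale a pump curve point by the affinity laws:
    Q ~ f, H ~ f^2, P ~ f^3 (ratios relative to the reference frequency)."""
    r = f_new / f_ref
    return q_ref * r, h_ref * r**2, p_ref * r**3

# Catalog point at 60 Hz rescaled to a 50 Hz VFD setpoint
q, h, p = affinity_scale(q_ref=2000.0,   # bbl/d at reference
                         h_ref=5500.0,   # ft of head at reference
                         p_ref=150.0,    # hp at reference
                         f_ref=60.0, f_new=50.0)
```

This is what lets a single manufacturer H-Q curve cover the full VFD frequency range in a physics-informed hybrid model.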
Constraint Data for Optimization
  • VFD operating envelope — min/max frequency, ramp rate limits
  • Motor overload protection thresholds
  • Surface facility capacity constraints per stream
  • Intake pressure floor — must stay above bubble point
  • Regulatory production caps or injection limits
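Before any recommended setpoint reaches the VFD, it must be clamped to the constraint envelope above. A minimal sketch; the linear intake-pressure sensitivity `dpdf_est` (psi per Hz) and all parameter names are illustrative assumptions, not a prescribed control law:

```python
def clip_frequency(f_rec, f_now, f_min, f_max, ramp_limit_hz,
                   intake_p_now, intake_p_floor, dpdf_est):
    """Clamp a recommended VFD frequency to the hard constraint envelope.

    dpdf_est: estimated intake-pressure change per Hz of frequency
    increase (typically negative -- speeding up draws intake pressure
    down toward the bubble-point floor)."""
    f = max(f_min, min(f_max, f_rec))            # hard VFD envelope
    f = max(f_now - ramp_limit_hz,               # ramp-rate limit per step
            min(f_now + ramp_limit_hz, f))
    if dpdf_est < 0:                             # intake-pressure floor
        f_cap = f_now + (intake_p_floor - intake_p_now) / dpdf_est
        f = min(f, f_cap)
    return f

f = clip_frequency(f_rec=58.0, f_now=52.0, f_min=40.0, f_max=60.0,
                   ramp_limit_hz=2.0, intake_p_now=900.0,
                   intake_p_floor=850.0, dpdf_est=-30.0)
```

Here the intake-pressure floor, not the ramp limit, ends up binding: the 58 Hz recommendation is cut to about 53.7 Hz.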
Architecture Selection Drivers
  • >15 failure events → supervised failure prediction (LSTM or gradient boosting)
  • <10 failure events → unsupervised anomaly detection approach
  • Full downhole gauge coverage → pump curve-based setpoint optimization
  • OT constraints → edge inference preferred over cloud round-trip
Phase Deliverables
AI Model Architecture Decision · Data Pipeline Design · Feature Engineering Specification · Technology Stack Selection · Vendor / Platform Assessment
Phase 04 — Define: Data preparation, model development & validation baseline

All data is cleaned, structured, and used to build and validate the AI model. This is the most data-intensive phase — every sensor feature must be engineered, every failure event labeled, and model performance validated against held-out field data before any deployment decisions are made.

Cleaned & Labeled Training Dataset
  • Cleaned time-series — outliers removed, dropout periods flagged
  • ESP failure events labeled with full operating conditions at time of failure
  • Production uplift labels from historical VFD frequency changes
  • VFD setpoint change records correlated with production and motor response
  • Synchronized multi-source dataset at consistent timestamp resolution
  • Normalization and scaling applied per feature
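Synchronizing multi-source feeds onto one timestamp grid, as required above, usually means forward-filling each irregularly sampled series and flagging (not imputing) hours with no prior sample. A minimal sketch with integer hour indices standing in for timestamps; `sync_to_hourly` is an illustrative name:

```python
def sync_to_hourly(series, hours):
    """Forward-fill one irregularly sampled series onto a common hourly
    grid. `series` is a list of (hour, value) pairs; grid hours with no
    prior sample stay None so downstream steps can flag, not impute."""
    out = []
    pts = sorted(series)
    i, last = 0, None
    for h in hours:
        while i < len(pts) and pts[i][0] <= h:
            last = pts[i][1]
            i += 1
        out.append(last)
    return out

hours = list(range(0, 6))
amps  = sync_to_hourly([(0, 41.0), (3, 43.5)], hours)   # motor current
p_in  = sync_to_hourly([(1, 910.0)], hours)             # intake pressure
rows  = list(zip(hours, amps, p_in))   # synchronized multi-source rows
```

Normalization and scaling are then applied per feature on the synchronized rows, with `None` gaps excluded from the statistics.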
Validation & Benchmarking Data
  • Held-out well test data for production rate prediction validation
  • Known failure events withheld from training for failure model testing
  • Historical optimization decisions for recommendation engine backtesting
  • Physics-based pump curve outputs for hybrid model calibration
  • Operator log entries correlated with anomaly events
Reservoir & Fluid Updates
  • Latest PVT data — any fluid sample updates since Assess phase
  • Updated reservoir pressure from most recent surveys
  • Revised IPR curves post any stimulation or workover
  • Updated water cut trends for each well
Acceptance Criteria Data
  • Baseline KPIs: current production efficiency, deferral rate, ESP run life
  • Model performance thresholds agreed with operations (accuracy, recall)
  • Operator trust metrics — acceptable human override frequency
  • Integration test data for SCADA / DCS write-back validation
Model Acceptance Gate — Data-Driven Criteria
  • Production rate prediction MAPE <10% on validation wells
  • ESP failure prediction recall >80% at 14-day forward horizon
  • Optimization recommendations validated against >6 months of backtesting
  • Real-time data latency <5 minutes end-to-end
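The two statistical gate criteria above, MAPE and recall, reduce to short metric functions. A sketch with toy numbers (the rates and alert vectors are illustrative, not field data):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, skipping zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def recall(labels, alerts):
    """Fraction of true failure windows (label 1) that raised an alert."""
    hits = sum(1 for y, a in zip(labels, alerts) if y == 1 and a == 1)
    positives = sum(labels)
    return hits / positives if positives else 0.0

rate_mape   = mape([1000, 1200, 900], [950, 1150, 990])   # validation wells
fail_recall = recall([1, 1, 0, 1, 1, 0], [1, 1, 0, 1, 1, 0])
gate_passed = rate_mape < 10.0 and fail_recall > 0.8
```

The same `mape` function is reused post-deployment for the drift-based retrain trigger in Phase 05.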
Phase Deliverables
Cleaned Training Dataset · Trained & Validated AI Model · Model Performance Report · Data Pipeline (Production-ready) · Backtesting Results · Deployment Specification
Phase 05 — Execute: Deployment, live inference & continuous learning

The model deploys into the live production environment. Data requirements shift from historical to real-time — live sensor feeds drive inference, VFD setpoint recommendations are generated continuously, and actioned outcomes feed back into the continuous learning loop to improve performance over time.

Live Real-Time Inference Feeds
  • Streaming motor current, frequency, intake pressure, motor temp, vibration — <5 min latency
  • Live production rates (or well-test-corrected allocation)
  • Wellhead pressure and FTHP per well
  • VFD setpoint commands — actioned vs. recommended tracking
  • Surface facility real-time constraints (separator level, export pressure)
Continuous Learning Data
  • Operator override log — when and why VFD recommendations were rejected
  • Production response data post-setpoint change (actioned recommendations)
  • New failure events labeled in real time as they occur
  • Monthly well test updates to refresh production allocation
  • Model drift monitoring — prediction error trending over time
ESP Change Tracking
  • ESP changeout records — new pump model, stages, motor rating, run date
  • VFD range updates following equipment changes
  • Workover outcomes — new completion parameters and post-workover IPR
  • Downhole gauge replacements and new calibration baseline
Performance & Value Tracking
  • Production uplift attributed to AI recommendations (vs. baseline)
  • Production deferral avoided through predictive failure alerts
  • ESP run life improvement vs. pre-deployment baseline
  • Operator adoption rate and recommendation acceptance trending
  • Model retraining triggers and schedule
Ongoing Data Governance Requirements
  • Monthly well tests mandatory to maintain allocation accuracy
  • Quarterly sensor calibration checks — flag drift to the model ops team
  • Model retrain trigger: prediction MAPE >15% for 2 consecutive weeks
  • All ESP changeouts must be logged within 48 hours of execution
  • Annual full data audit to assess model refresh requirements
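The retrain trigger in the governance rules above ("MAPE >15% for 2 consecutive weeks") is a simple run-length check over the weekly error series. A minimal sketch; the function name is illustrative:

```python
def retrain_due(weekly_mape, threshold=15.0, consecutive=2):
    """True once weekly prediction MAPE has exceeded the threshold for
    the required number of consecutive weeks (the governance rule)."""
    run = 0
    for m in weekly_mape:
        run = run + 1 if m > threshold else 0
        if run >= consecutive:
            return True
    return False

# Weeks 4 and 5 breach the threshold back-to-back -> retrain fires
due = retrain_due([8.2, 16.1, 9.0, 15.4, 17.9])
```

Evaluated weekly against the drift-monitoring feed, this gives model ops an unambiguous, auditable retrain signal.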
Phase Deliverables
Deployed ESP Optimization Model · Live Data Pipeline · Operator Dashboard · Failure Detection Module · Performance Monitoring Report · Continuous Learning Framework · Data Governance Protocol
Augmentify
220 N Green St, 2nd floor
Chicago, IL 60607, USA
USA: 833-INTERSOG INT: +1-833-468-3776
© 2026 Intersog Inc. All rights reserved.
Augmentify™ is a trademark of Intersog Inc.