ORE Catapult
TorqueScope
Levenmouth Demonstrator Turbine

Proposed Validation Protocol

"What TorqueScope would do with access to POD data"

7 MW Samsung SWT-7.0-154 · 574 SCADA channels · Jan 2017 – present
The Levenmouth alarm log provides something none of our existing validation datasets offer: named, timestamped fault events with ORE Catapult attribution. This enables us to measure, for the first time on real offshore data, not just whether TorqueScope detects anomalies — but how many hours before a logged fault it does so.
Comparison

Why Levenmouth is different

TorqueScope has been validated on three open-access wind farms totalling 41 turbines, and on the CARE benchmark's 95 anonymised datasets. Levenmouth adds capabilities that neither source provides.

| Capability | Existing Open Datasets (Kelmarsh · Penmanshiel · Hill of Towie) | CARE Benchmark | Levenmouth LDT |
|---|---|---|---|
| Turbines | 41 (Kelmarsh 6 · Penmanshiel 13 · Hill of Towie 20¹) | 95 datasets, 3 farms | 1 (dedicated R&D asset) |
| Turbine classes | Senvion MM92 · Senvion MM82 · Siemens SWT-2.3-82 | Onshore mixed | Samsung SWT-7.0-154 (offshore-class) |
| Named farms / operators | Cubico · RES/TRIG | Anonymised | ORE Catapult |
| Sensors mapped | 10–13 per turbine (from 86–655 available) | 86–957 | 574 full SCADA + 1 Hz |
| Core code changes required | None across all 3 farms | — | None expected |
| Stop / alarm log | Stopping alarms only (no fault categorisation) | Synthetic events | Categorised fault log, Jan 2017–present |
| Met mast | — | — | 11 sensors, multi-height |
| Ground truth for earliness | (stop events ≠ fault events) | Synthetic only | Named faults, timestamped |
| Institutional endorsement | CC-BY-4.0 open data | Academic benchmark | ORE Catapult |

¹ T06 excluded — insufficient training data. Penmanshiel T03 also excluded for the same reason.

Protocol

The five-step validation protocol

Each step feeds the next. The pipeline requires no bespoke code for Levenmouth — the same codebase validated on Kelmarsh, Penmanshiel, and Hill of Towie processes the LDT data without modification.

01
Signal Mapping
SCADA 10-min aggregates

From 574 available SCADA channels, TorqueScope identifies the 15–20 signals that map to its standard sensor roles: power output, wind speed, nacelle ambient temperature, main bearing temperatures (front and rear), gearbox oil temperature, generator winding temperatures, converter temperatures, and hydraulic system temperature.

This mapping requires no bespoke code. TorqueScope's signal mapper ingests the channel description file and applies keyword matching against its internal sensor taxonomy — the same process used for Kelmarsh, Penmanshiel, and Hill of Towie. The 7 MW Samsung drivetrain uses a different bearing and gearbox configuration from the 2 MW Senvion fleet, so this step also validates that the sensor taxonomy generalises across turbine classes.
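The keyword-matching approach can be sketched in miniature as follows. The taxonomy entries, role names, and function names here are illustrative assumptions, not TorqueScope's actual taxonomy or API:

```python
# Hypothetical sketch of keyword-based signal mapping. The taxonomy entries
# and function names are illustrative, not TorqueScope's actual API.

# Minimal sensor taxonomy: TorqueScope role -> keywords expected in the
# channel description file.
SENSOR_TAXONOMY = {
    "power_output": ["active power", "power output"],
    "wind_speed": ["wind speed", "anemometer"],
    "nacelle_ambient_temp": ["nacelle ambient", "ambient temperature"],
    "main_bearing_temp_front": ["main bearing front"],
    "gearbox_oil_temp": ["gearbox oil"],
}

def map_channel(description):
    """Return the first taxonomy role whose keywords appear in the description."""
    desc = description.lower()
    for role, keywords in SENSOR_TAXONOMY.items():
        if any(kw in desc for kw in keywords):
            return role
    return None  # channel left unmapped

def build_mapping(channels):
    """channels: {channel ID: description}. Returns {channel ID: role}."""
    mapping = {}
    for channel_id, description in channels.items():
        role = map_channel(description)
        if role is not None:
            mapping[channel_id] = role
    return mapping
```

Applied to the real channel description file, the same pass would also produce the coverage report listed under expected outputs; here, unmapped channels are simply skipped.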

Expected output: A signal mapping table (channel ID → TorqueScope role) and a coverage report showing which sensor categories are populated.
02
Normal Behaviour Model Training
SCADA 10-min aggregates (healthy baseline)

Using a 12-month rolling window from a period with no logged alarm events, TorqueScope trains its Normal Behaviour Model (NBM): a 200-bin operational grid (20 power bins × 10 ambient temperature bins) storing median and standard deviation of each temperature sensor under normal conditions.

The met mast data provides a significant advantage over the onshore validation datasets: dedicated multi-height wind speed measurements (rather than reliance on the nacelle anemometer) produce a cleaner power–wind curve and more accurate operational binning. The standard deviation within each bin is computed directly from the 10-min stdev aggregate field — no re-processing is needed.
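The binned-statistics training step can be illustrated as below. The bin edges, the minimum-samples-per-bin threshold, and the use of a per-bin median of the 10-min stdev field are assumptions for this sketch, not the production configuration:

```python
# Illustrative sketch of the 200-bin Normal Behaviour Model (20 power bins
# x 10 ambient-temperature bins). Thresholds and aggregation choices here
# are assumptions, not TorqueScope's actual implementation.
import numpy as np

def train_nbm(power, ambient, sensor, sensor_std, power_edges, temp_edges):
    """Return per-bin (median, std) of a temperature sensor under normal ops.

    sensor_std is the 10-min stdev aggregate field, used directly as the
    in-bin variability estimate (per-bin median of that field, assumed).
    """
    n_p, n_t = len(power_edges) - 1, len(temp_edges) - 1
    median = np.full((n_p, n_t), np.nan)
    std = np.full((n_p, n_t), np.nan)
    # Assign each 10-min sample to an operational bin.
    p_idx = np.digitize(power, power_edges) - 1
    t_idx = np.digitize(ambient, temp_edges) - 1
    for i in range(n_p):
        for j in range(n_t):
            mask = (p_idx == i) & (t_idx == j)
            if mask.sum() >= 30:  # minimum samples per bin (assumed threshold)
                median[i, j] = np.median(sensor[mask])
                std[i, j] = np.median(sensor_std[mask])
    return median, std
```

Bins left at NaN (too few healthy samples) show up directly in the coverage maps listed as the expected output of this step.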

Expected output: Per-sensor NBM coverage maps showing bin population density, and training residual distributions (target: < 1 °C mean residual per sensor).
03
Lomb–Scargle Periodic Baseline
SCADA 10-min aggregates (training period)

TorqueScope computes Lomb–Scargle periodograms on each mapped temperature sensor over the 12-month training window. Dominant periodic components are identified (expected: ~24 h diurnal thermal cycle, ~annual seasonal envelope) and the coefficient of variation (CV) is measured across 4 temporal sub-windows. Sensors with CV < 5 % qualify as having a stable periodic baseline that feeds the heuristic scoring stage.
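A minimal sketch of the stability check, assuming SciPy's `lombscargle` and taking the CV over the dominant-peak power in each sub-window (what exactly the CV is measured over is an assumption here):

```python
# Sketch of the periodic-baseline stability check: a Lomb-Scargle periodogram
# per temporal sub-window, then the coefficient of variation of the dominant
# peak across the 4 sub-windows. Period grid and CV target are illustrative.
import numpy as np
from scipy.signal import lombscargle

def baseline_cv(t_hours, y, n_windows=4):
    """CV (%) of the dominant Lomb-Scargle peak across temporal sub-windows."""
    # Candidate periods from 6 h up to 30 days, as angular frequencies.
    periods = np.linspace(6, 720, 2000)
    omega = 2 * np.pi / periods
    peaks = []
    for chunk_t, chunk_y in zip(np.array_split(t_hours, n_windows),
                                np.array_split(y, n_windows)):
        y0 = chunk_y - chunk_y.mean()  # remove mean before the periodogram
        pgram = lombscargle(chunk_t, y0, omega)
        peaks.append(pgram.max())
    peaks = np.array(peaks)
    return 100.0 * peaks.std() / peaks.mean()
```

A sensor dominated by a clean ~24 h diurnal cycle would score well under the CV < 5 % criterion; a drifting or intermittent periodicity would not.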

The offshore environment introduces periodic signals absent from onshore datasets: tidal loading on the foundation (approximately 12.4 h M2 tidal period) may be detectable in structural vibration sensors, and sea-state seasonality will modulate thermal gradients differently from land. This makes Levenmouth the first dataset where the Ab Astris cross-domain methodology — validated independently on oceanographic tidal data and wind turbine SCADA — may find its two domains simultaneously present in a single asset.

Expected output: Per-sensor periodogram plots, CV table, periodic baseline coefficients for deployment.
04
Hybrid Detection Forward Pass
SCADA 10-min aggregates (full dataset)

The trained v5 hybrid pipeline runs forward across the full dataset from Jan 2017 to present: heuristic scoring (LS periodic analysis) + NBM residual scoring + criticality counter (alarm threshold: 72). Every timestamp where the criticality counter reaches threshold is logged as a detection event, recording the timestamp, the dominant sensor signal, and the hybrid score trajectory leading up to alarm.
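The counter mechanics can be sketched as follows. Only the alarm threshold (72) is taken from the protocol; the score-combination rule, per-step threshold, decay, and reset behaviour are assumptions for illustration:

```python
# Minimal sketch of the criticality-counter forward pass. Increment/decay and
# reset behaviour are assumed; the alarm threshold of 72 is from the protocol.
from dataclasses import dataclass

@dataclass
class Detection:
    index: int    # timestamp index where the counter crossed threshold
    sensor: str   # dominant (highest-scoring) sensor at that moment

def forward_pass(scores_by_sensor, step_threshold=3.0, alarm_threshold=72):
    """scores_by_sensor: {sensor: [hybrid score per 10-min timestamp]}."""
    sensors = list(scores_by_sensor)
    n = len(scores_by_sensor[sensors[0]])
    counter = 0
    detections = []
    for i in range(n):
        step = {s: scores_by_sensor[s][i] for s in sensors}
        worst = max(step, key=step.get)
        if step[worst] > step_threshold:
            counter += 1                   # anomalous step: accumulate
        else:
            counter = max(0, counter - 1)  # healthy step: decay
        if counter == alarm_threshold:
            detections.append(Detection(index=i, sensor=worst))
            counter = 0                    # reset after alarm (assumed)
    return detections
```

With 10-min timestamps, a threshold of 72 corresponds to roughly 12 hours of sustained anomalous scoring before an alarm is raised.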

The 1 Hz data is used selectively for validation rather than routine processing — specifically to examine signal morphology in the 48–72 hour window before each alarm, checking whether faster-timescale precursors are visible that the 10-min aggregates might smooth over. This is an exploratory analysis rather than a core pipeline step.

Expected output: Full detection timeline (criticality trace), list of detection events with timestamps and triggering sensors, anomaly rate by month.
05
Alarm Log Cross-Validation
Alarm log + Step 4 detection events

TorqueScope cross-references its detection events against the ORE Catapult alarm log. For each logged alarm event, the following metrics are computed:

Lead time — hours between TorqueScope's first criticality threshold crossing and the alarm log entry.
Detection rate — fraction of logged alarm events that had a TorqueScope detection within the preceding 72 hours.
False positive rate — fraction of detection events with no corresponding alarm log entry within ±24 hours.
Fault type breakdown — detection performance segmented by alarm category (electrical, mechanical, thermal, control system).
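The metric definitions above can be written out directly. The 72-hour lead window and ±24-hour false-positive window are from the protocol; the function shape is illustrative:

```python
# Sketch of the Step 5 cross-validation metrics. Matching windows (72 h lead,
# +/- 24 h for false positives) follow the protocol; names are illustrative.
def crossvalidate(detections_h, alarms_h, lead_window=72.0, fp_window=24.0):
    """detections_h / alarms_h: event times in hours since dataset start."""
    lead_times = []
    detected = 0
    for alarm in alarms_h:
        # Detections inside the 72 h window preceding this alarm.
        prior = [d for d in detections_h if 0.0 <= alarm - d <= lead_window]
        if prior:
            detected += 1
            # Lead time from the FIRST criticality threshold crossing.
            lead_times.append(alarm - min(prior))
    false_pos = sum(
        1 for d in detections_h
        if not any(abs(d - a) <= fp_window for a in alarms_h)
    )
    return {
        "detection_rate": detected / len(alarms_h) if alarms_h else 0.0,
        "lead_times_h": lead_times,
        "false_positive_rate": (false_pos / len(detections_h)
                                if detections_h else 0.0),
    }
```

The fault-type breakdown would simply run this per alarm category rather than over the pooled event list.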

This produces the first ground-truth earliness measurement on real offshore SCADA data outside a competition benchmark. The result is directly comparable to TorqueScope's CARE benchmark performance (current CARE Earliness score: 0.636) but on a named, independently attributable asset.

One acknowledged limitation: as a single turbine, the alarm event count will be small, giving the detection rate and false positive rate lower statistical power than a fleet-scale evaluation. The per-fault-category breakdown mitigates this by grounding each result in a specific, reproducible event.

Expected output: Lead time distribution histogram, detection/false-positive table by fault category, CARE-equivalent earliness score for LDT.
Outputs

What TorqueScope would produce from LDT data

Deliverables upon completion
Fleet Health View

A Levenmouth panel in the TorqueScope Fleet Health mode — weekly anomaly heatmap for the LDT alongside Kelmarsh, Penmanshiel, and Hill of Towie. The first offshore turbine in the fleet view. Shareable with ORE Catapult for their own communications.

Earliness Report

Per-fault-category lead time distribution, with direct comparison to CARE benchmark results. The first published earliness measurement on real offshore fault data from a named UK turbine.

Cross-OEM Validation Note

A technical note documenting the signal mapping from Samsung SWT-7.0-154 SCADA taxonomy to TorqueScope's sensor roles. With Levenmouth integrated, TorqueScope's validated OEM coverage spans four turbine classes: Senvion MM82 (Penmanshiel), Senvion MM92 (Kelmarsh), Siemens SWT-2.3-82 (Hill of Towie, 14/20 turbines meeting NBM calibration criteria with zero core code changes), and Samsung SWT-7.0-154 (Levenmouth) — ranging from 2 MW onshore to 7 MW offshore-class.

Data Requirements

What we need from POD

Three datasets. Approximately 206 MB total. No proprietary software access required.

| POD Dataset | Resolution | Period | Size (est.) |
|---|---|---|---|
| SCADA 10-min aggregates | 10 minutes | Jan 2017 – Dec 2024 | ~200 MB |
| Alarm log | Per-event | Jan 2017 – Dec 2024 | < 1 MB |
| Met mast 10-min | 10 minutes | Jan 2017 – Dec 2024 | ~5 MB |

1 Hz SCADA is used only for exploratory analysis of pre-fault signal morphology; it is not required for core pipeline validation.
All processing performed on Modal Foundation infrastructure.
No data leaves the analysis environment.