How It Works

Our Methodology

Transparent, deterministic, and explainable. Every insight we deliver is traceable to a specific data source and a documented analytical step.

The Four-Step Intelligence Pipeline

From raw regulatory filings to actionable exploration insights.

01

Data Ingestion

idempotent

We pull data from Texas RRC and EIA sources on a scheduled cadence. Every ingestion job is idempotent — re-running it never creates duplicate records. Each dataset carries a freshness timestamp, and our system tracks the provenance of every field from source to output. The pipeline is built to support additional state regulators as we expand nationally.
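As a rough illustration of what "idempotent" means here, the sketch below upserts records into a SQLite table keyed on a stable (source, record_id) pair and stamps each row with a freshness timestamp. The table, column names, and sample records are hypothetical and chosen only for this example; they are not the production schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table: one row per (source, record_id).
SCHEMA = """
CREATE TABLE IF NOT EXISTS well_permits (
    source      TEXT NOT NULL,
    record_id   TEXT NOT NULL,
    operator    TEXT,
    county      TEXT,
    ingested_at TEXT NOT NULL,
    PRIMARY KEY (source, record_id)
)
"""

# Upsert on the primary key (requires SQLite 3.24 or newer).
UPSERT = """
INSERT INTO well_permits (source, record_id, operator, county, ingested_at)
VALUES (:source, :record_id, :operator, :county, :ingested_at)
ON CONFLICT (source, record_id) DO UPDATE SET
    operator    = excluded.operator,
    county      = excluded.county,
    ingested_at = excluded.ingested_at
"""


def ingest(conn, source, records):
    """Upsert a batch; re-running with the same records adds no rows."""
    stamp = datetime.now(timezone.utc).isoformat()  # freshness timestamp
    with conn:  # commit on success, roll back on error
        conn.execute(SCHEMA)
        conn.executemany(
            UPSERT,
            [{**r, "source": source, "ingested_at": stamp} for r in records],
        )


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM well_permits").fetchone()[0]


conn = sqlite3.connect(":memory:")
batch = [
    {"record_id": "permit-001", "operator": "EXAMPLE OIL CO", "county": "REEVES"},
    {"record_id": "permit-002", "operator": "SAMPLE ENERGY LLC", "county": "WARD"},
]
ingest(conn, "RRC", batch)
ingest(conn, "RRC", batch)  # second run updates rows in place
```

Running the same batch twice leaves two rows in the table, because the second run hits the conflict clause and refreshes the existing rows instead of inserting new ones.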

Texas RRC EDI
RRC Well Permits
RRC Production Reports
EIA Basin Statistics
Historical Scout Archives

02

Canonical Normalization

deterministic

Raw regulatory data is messy. Operator names appear in dozens of variations; well IDs change across filings; formation names differ by era and county. Our normalization pipeline resolves these inconsistencies to a canonical entity graph — one operator, one well, one formation — so analysis is reliable across decades of data.
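To make the idea concrete, here is a small sketch of two deterministic normalization steps: reducing operator name variants to a comparison key, and trimming an API well number to its 10-digit state-county-well core. The suffix list, function names, and sample values are assumptions made for illustration; a real resolver would also need an alias table for renames and mergers.

```python
import re

# Hypothetical list of corporate suffixes to ignore when comparing names.
CORPORATE_SUFFIXES = {
    "INC", "LLC", "LP", "LTD", "CO", "CORP", "COMPANY", "CORPORATION",
}


def operator_key(raw_name):
    """Reduce an operator name variant to a comparison key."""
    upper = raw_name.upper().replace(".", "")   # "L.L.C." -> "LLC"
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", upper)  # other punctuation -> space
    tokens = [t for t in cleaned.split() if t not in CORPORATE_SUFFIXES]
    return " ".join(tokens)


def canonical_api(raw_api):
    """Keep the 10-digit core of an API number (state, county, well)."""
    digits = re.sub(r"\D", "", raw_api)
    if len(digits) < 10:
        raise ValueError(f"API number too short: {raw_api!r}")
    return digits[:10]


variants = ["Example Oil Co., Inc.", "EXAMPLE OIL COMPANY", "example oil, l.l.c."]
keys = {operator_key(v) for v in variants}
```

All three variants collapse to the single key `EXAMPLE OIL`, and `canonical_api("42-301-12345-00-00")` and `canonical_api("4230112345")` both give `4230112345`. Because the functions are pure string transformations, the same input always maps to the same key.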

Operator Resolution: name variants → canonical entity
Well Canonicalization: API number + alias resolution
Formation Unification: era- and county-specific harmonization

03

Scoring Engine

explainable

Each prospect area receives a composite score from 1 to 100, derived from weighted sub-scores across five dimensions. Every score is fully decomposable: you can see exactly which factors drove the result and by how much. No black boxes.
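One way to picture a decomposable score is as a weighted sum in which each dimension's contribution is reported alongside the total. The sketch below uses the five weights listed for this step and assumes, for illustration only, that every sub-score is already on the same 1 to 100 scale and that the composite is a plain weighted average. The dictionary keys and function name are hypothetical.

```python
# Weights from the five scoring dimensions; they sum to 1.0.
WEIGHTS = {
    "historical_production_density": 0.25,
    "permit_activity_velocity": 0.20,
    "formation_quality": 0.25,
    "operator_concentration": 0.15,
    "scout_ticket_signal": 0.15,
}


def composite_score(sub_scores):
    """Return (composite, contributions) for sub-scores on a 1-100 scale."""
    if set(sub_scores) != set(WEIGHTS):
        raise ValueError("sub_scores must cover exactly the five dimensions")
    # Each contribution is weight * sub-score, so the parts sum to the total.
    contributions = {name: WEIGHTS[name] * sub_scores[name] for name in WEIGHTS}
    return round(sum(contributions.values()), 2), contributions


score, parts = composite_score({
    "historical_production_density": 80,
    "permit_activity_velocity": 60,
    "formation_quality": 70,
    "operator_concentration": 50,
    "scout_ticket_signal": 40,
})
```

With these sample inputs the contributions are 20, 12, 17.5, 7.5, and 6 points, for a composite of 63. Returning the per-dimension contributions with the total is what lets a reader see which factors drove the result and by how much.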

Historical Production Density: 25%
Permit Activity Velocity: 20%
Formation Quality Score: 25%
Operator Concentration: 15%
Scout Ticket Signal Strength: 15%

04

Intelligence Delivery

cited

Scored data feeds our AI summary layer, which generates plain-language narratives for each prospect. Every AI-generated statement is grounded in specific data points — citations are embedded inline, not as afterthoughts. You can follow any claim back to its source record.

Example: "This area has seen 14 permitted horizontal wells in the last 18 months [RRC W-1, 2023–2024], with three operators active in the Wolfcamp A [Scout Tickets #TX-4821, #TX-4822, #TX-5001]. Historical initial production (IP) rates average 412 BOE/day in the target zone [RRC Production, 2018–2022]."
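A simple automated check can enforce this kind of grounding before a narrative is shown. The sketch below assumes citations appear in square brackets and that scout ticket IDs look like `#TX-4821`, as in the example; the function names, the sample narrative, and the set of known IDs are hypothetical.

```python
import re

CITATION = re.compile(r"\[([^\[\]]+)\]")   # text inside square brackets
TICKET_ID = re.compile(r"#([A-Z]{2}-\d+)")  # e.g. "#TX-4821"


def uncited_sentences(narrative):
    """Return sentences that carry no bracketed citation."""
    sentences = re.split(r"(?<=[.!?])\s+", narrative.strip())
    return [s for s in sentences if s and not CITATION.search(s)]


def unknown_ticket_ids(narrative, known_ids):
    """Return cited scout ticket IDs that match no known source record."""
    cited = {
        ticket
        for citation in CITATION.findall(narrative)
        for ticket in TICKET_ID.findall(citation)
    }
    return sorted(cited - set(known_ids))


narrative = (
    "Three operators are active in the target zone "
    "[Scout Tickets #TX-4821, #TX-4822]. "
    "Activity is expected to increase."
)
missing_citation = uncited_sentences(narrative)
missing_record = unknown_ticket_ids(narrative, {"TX-4821"})
```

Here the second sentence is flagged because it has no citation, and `TX-4822` is flagged because it is not in the known set. A narrative would pass only when both lists come back empty.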

Primary Data Sources

We integrate authoritative public and proprietary data — always attributed.

Texas RRC

Railroad Commission of Texas: the authoritative source for well permits, completions, production, and operator records in Texas. Currently the primary source for our Texas launch. Additional state regulatory sources (New Mexico OCD, Oklahoma OCC, Colorado COGCC) are on the integration roadmap.

EIA

U.S. Energy Information Administration — basin-level production statistics, formation reports, and national drilling metrics.

USGS + Scout Archives

USGS geologic maps and stratigraphy data combined with digitized historical scout ticket archives from Texas county repositories.

Our Commitment

No Black Boxes. Ever.

In an industry where capital decisions run into the millions, you deserve to know why a prospect scored the way it did. Every score, every insight, and every AI summary we produce is fully traceable to the underlying data. If you can't audit it, we won't ship it.

Every score factor is documented and weighted
AI summaries cite specific source records
Data provenance tracked from source to output
Methodology versioned and change-logged
Scores are reproducible and auditable
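One possible way to make reproducibility checkable is to store a fingerprint of everything that went into a score. The sketch below hashes the methodology version, weights, and sub-scores as canonical JSON, so identical inputs always yield the same digest. The field names and version string are hypothetical and stand in for whatever an audit record would actually contain.

```python
import hashlib
import json


def score_fingerprint(methodology_version, weights, sub_scores):
    """Return a SHA-256 digest of the inputs that determine a score."""
    payload = {
        "methodology_version": methodology_version,
        "weights": weights,
        "sub_scores": sub_scores,
    }
    # sort_keys and fixed separators make the JSON text order-independent.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = score_fingerprint(
    "v1.0", {"a": 0.6, "b": 0.4}, {"a": 70, "b": 55}
)
same_inputs_reordered = score_fingerprint(
    "v1.0", {"b": 0.4, "a": 0.6}, {"b": 55, "a": 70}
)
new_version = score_fingerprint(
    "v1.1", {"a": 0.6, "b": 0.4}, {"a": 70, "b": 55}
)
```

The first two digests match even though the dictionaries were built in a different order, while a change to the methodology version produces a different digest. An auditor could recompute the fingerprint from the stored inputs and compare it with the one recorded at scoring time.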