brk_oracle
Version 2
Pure on-chain BTC/USD price oracle. No exchange feeds, no external APIs. Derives the bitcoin price from transaction data alone. Tracks block by block from height 525,000 (May 2018) onward.
Inspired by UTXOracle by @SteveSimple, which proved the concept. brk_oracle takes the same core insight and redesigns the algorithm for per-block resolution and rolling operation. See comparison below.
The signal
People buy bitcoin in round dollar amounts. Each purchase creates a transaction output whose satoshi value depends on the current price:
$100 at $50,000/BTC → 200,000 sats
$100 at $100,000/BTC → 100,000 sats
Thousands of these round-dollar purchases happen every day: $10, $20, $50, $100, $200, $500. Plot every transaction output in a block on a log-scale histogram and clear spikes emerge at each round-dollar amount:
$5 $10 $20 $50 $100 $200 $500 $1k $5k $10k
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
│ ▌ ▌ ▌ ▌
│ ▌ █ █ ▌ █ ▌ █ █ ▌ █
│ ▐█▌ ▐█▌ █▌ █ ▐█▌ ▐█ ▐█▌ ▐█▌ █ ▐█ ▐█▌
│▄▄████▄███████▄▐█▌▄█████▄██▌▄█████▄███▄▐█▌▄▄▄███▄████▄▄
└─────────────────────────────────────────────────────────→
log₁₀(satoshis)
On a log scale, when the price changes all spikes shift together by the same number of bins. A 2x price move always shifts the pattern by ~60 bins, whether bitcoin moves from $1k to $2k or from $50k to $100k:
price × 2 → sats ÷ 2 → shift left by log₁₀(2) × 200 ≈ 60 bins
$50k: ···· █ ···· █ ···· █ ···· █ ····
$100k: ·· █ ···· █ ···· █ ···· █ ······
◄── 60 bins ──►
The spacing between spikes is constant (set by the ratios between dollar amounts). Only the position changes. The oracle detects this pattern and reads the price from where it lands.
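The arithmetic behind the shift can be checked with a small sketch. The function names here are illustrative and are not part of the crate; only the 200-bins-per-decade resolution comes from this README.

```rust
/// Satoshis per bitcoin.
const SATS_PER_BTC: f64 = 100_000_000.0;
/// Histogram resolution stated in this README.
const BINS_PER_DECADE: f64 = 200.0;

/// Satoshi value of a `usd` purchase at `price` dollars per BTC.
fn usd_to_sats(usd: f64, price: f64) -> f64 {
    usd / price * SATS_PER_BTC
}

/// Continuous (unrounded) position of a satoshi amount on the log axis, in bins.
fn log_position(sats: f64) -> f64 {
    sats.log10() * BINS_PER_DECADE
}

/// How far (in bins) the spike for a `usd` purchase moves when the price goes
/// from `from` to `to`. Negative means a shift left, toward smaller amounts.
fn spike_shift(usd: f64, from: f64, to: f64) -> f64 {
    log_position(usd_to_sats(usd, to)) - log_position(usd_to_sats(usd, from))
}
```

For example, `spike_shift(100.0, 1_000.0, 2_000.0)` and `spike_shift(500.0, 50_000.0, 100_000.0)` both come out at about −60.2 bins: the shift depends only on the price ratio, not on the dollar amount or the price level.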
How it works
The oracle tracks the price incrementally, block by block, starting from a known seed price. Each new block nudges the estimate. The search window is narrow (about ±10 bins, or ±12%), so the oracle can only follow gradual movement — it cannot jump to an arbitrary price from scratch. This is by design: it makes the algorithm resistant to noise.
For each new block:
1. Filter outputs
Skip the coinbase transaction, and skip every output of a transaction carrying an OP_RETURN: that transaction is protocol machinery, not a dollar-denominated payment, so its payout amounts are not price signal. Then exclude noisy outputs: script types dominated by protocol activity (P2TR by default), dust below 1,000 sats, and round BTC amounts (0.01, 0.1, 1.0 BTC, etc.) that create false spikes unrelated to dollar purchases.
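A minimal sketch of the per-output rules, under stated assumptions: the `ScriptType` enum and both functions are hypothetical stand-ins rather than the crate's API, and the round-value test assumes an exact match on d × 10ⁿ with d ∈ {1, 2, 3, 5, 6} (the set listed under Configuration). The crate may use a different matching rule.

```rust
/// Dust threshold from the Configuration table (`min_sats`).
const MIN_SATS: u64 = 1_000;

/// Hypothetical script-type tag; the crate's own output type may differ.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ScriptType {
    P2pkh,
    P2sh,
    P2wpkh,
    P2wsh,
    P2tr,
}

/// True when `sats` is exactly d × 10ⁿ for d in {1, 2, 3, 5, 6}.
/// Assumption: exact equality, no tolerance band.
fn is_round_value(sats: u64) -> bool {
    if sats == 0 {
        return false;
    }
    // Strip trailing zeros, then inspect the remaining leading factor.
    let mut v = sats;
    while v % 10 == 0 {
        v /= 10;
    }
    matches!(v, 1 | 2 | 3 | 5 | 6)
}

/// Per-output filter: excluded script type (P2TR by default), dust, round values.
fn keep_output(sats: u64, script: ScriptType) -> bool {
    script != ScriptType::P2tr && sats >= MIN_SATS && !is_round_value(sats)
}
```

Under these assumptions a 187,432-sat P2WPKH output is kept, while the same value in a P2TR output, a 999-sat output, and a 0.01 BTC (1,000,000-sat) output are all dropped.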
2. Build a log-scale histogram
Each remaining output becomes a bin index in a 2,400-bin histogram spanning 12 decades (1 sat to 10¹² sats):
bin = round(log₁₀(sats) × 200)    (200 bins per decade)
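The binning formula translates directly into code. This is a sketch, not the crate's implementation: `sats_to_bin` is a hypothetical name, and returning `None` for zero or for values that round past the last bin is an assumption about edge handling.

```rust
const BINS_PER_DECADE: f64 = 200.0;
const NUM_BINS: usize = 2_400;

/// Map a satoshi amount to its histogram bin: round(log10(sats) × 200).
/// Returns None for 0 sats or for amounts that fall past the last bin.
fn sats_to_bin(sats: u64) -> Option<usize> {
    if sats == 0 {
        return None;
    }
    let bin = ((sats as f64).log10() * BINS_PER_DECADE).round() as usize;
    if bin < NUM_BINS {
        Some(bin)
    } else {
        None
    }
}
```

With this mapping, 200,000 sats ($100 at $50,000/BTC) lands in bin 1060 and 100,000 sats ($100 at $100,000/BTC) in bin 1000, which is the 60-bin shift described earlier.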
3. Smooth over recent blocks
A single block has too few outputs for a clean signal. The oracle keeps a ring buffer of the last 12 block histograms and combines them into an exponential moving average (EMA) that weights recent blocks more heavily:
EMA[bin] = Σ (age = 0..11) weight(age) × histogram[age][bin]
weight(age) = α × (1 − α)^age, default α = 2/7 (~6-block span)
The EMA is recomputed from the ring buffer each block. This makes the oracle deterministic: since only the last 12 histograms matter, any oracle started from a known price converges to the exact same state after 12 blocks, regardless of prior history. This is what makes checkpointing and restoring possible.
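The smoothing step could be sketched as follows. The types and function names are assumptions made for illustration; only the window depth (12), the default α (2/7), and the weight formula come from this README.

```rust
use std::collections::VecDeque;

const NUM_BINS: usize = 2_400;
const WINDOW: usize = 12;
const ALPHA: f64 = 2.0 / 7.0;

type Histogram = [u32; NUM_BINS];

/// Push the newest block histogram to the front, evicting the oldest
/// once the ring already holds WINDOW entries.
fn push_block(ring: &mut VecDeque<Box<Histogram>>, hist: Histogram) {
    if ring.len() == WINDOW {
        ring.pop_back();
    }
    ring.push_front(Box::new(hist));
}

/// Recompute the EMA from scratch: age 0 is the newest block.
fn compute_ema(ring: &VecDeque<Box<Histogram>>) -> Vec<f64> {
    let mut ema = vec![0.0; NUM_BINS];
    for (age, hist) in ring.iter().enumerate() {
        let weight = ALPHA * (1.0 - ALPHA).powi(age as i32);
        for (acc, &count) in ema.iter_mut().zip(hist.iter()) {
            *acc += weight * f64::from(count);
        }
    }
    ema
}
```

Because `compute_ema` reads nothing but the ring buffer, two instances that have seen the same last 12 histograms produce identical output, which is the property that checkpointing relies on.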
4. Score with a 19-point stencil
The fixed ratios between round-dollar amounts ($1, $2, $3, $5, ... $10,000) create a fingerprint: a pattern of 19 spikes with known spacing on the log scale. A stencil encodes this spacing as bin offsets from a $100 reference point:
| Amount | $1 | $5 | $10 | $50 | $100 | $200 | $1k | $10k |
|---|---|---|---|---|---|---|---|---|
| Bin offset | −400 | −260 | −200 | −60 | 0 | +60 | +200 | +400 |

Bin offsets from the $100 reference point (19 offsets total; 8 shown).
The oracle slides this stencil across the EMA histogram within the search window. At each candidate position:
- Read the EMA value at all 19 expected spike locations
- Normalize each value by dividing by that offset's peak within the search window — this gives rare amounts like $3 equal voting weight to common amounts like $100
- Sum the 19 normalized values into a single score
The position with the highest score is where the fingerprint best matches the histogram.
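The scoring loop could look roughly like this. It is a sketch under assumptions: it uses only the eight offsets labelled above rather than the full 19, treats out-of-range bins as empty, and skips offsets whose peak in the window is zero. None of the names come from the crate.

```rust
/// Bin offsets from the $100 reference for $1, $5, $10, $50, $100, $200,
/// $1k and $10k. Illustrative subset; the full stencil has 19 offsets.
const OFFSETS: [i64; 8] = [-400, -260, -200, -60, 0, 60, 200, 400];

/// Read `ema[pos + offset]`, treating out-of-range bins as empty.
fn at(ema: &[f64], pos: i64, offset: i64) -> f64 {
    let i = pos + offset;
    if i < 0 {
        return 0.0;
    }
    ema.get(i as usize).copied().unwrap_or(0.0)
}

/// Slide the stencil over `lo..=hi` (candidate bins for the $100 spike) and
/// return the best-scoring position, or None when the window is empty.
fn best_position(ema: &[f64], lo: i64, hi: i64) -> Option<i64> {
    // Peak of each offset within the search window, used as its normalizer.
    let peaks: Vec<f64> = OFFSETS
        .iter()
        .map(|&off| (lo..=hi).map(|p| at(ema, p, off)).fold(0.0, f64::max))
        .collect();

    // Sum of normalized values; offsets with no signal contribute nothing.
    let score = |pos: i64| -> f64 {
        OFFSETS
            .iter()
            .zip(&peaks)
            .map(|(&off, &peak)| if peak > 0.0 { at(ema, pos, off) / peak } else { 0.0 })
            .sum()
    };

    (lo..=hi).max_by(|&a, &b| score(a).total_cmp(&score(b)))
}
```

Normalizing by the per-offset peak means each offset contributes at most 1.0 to the score, so a tall $100 spike cannot outvote several smaller spikes that line up elsewhere.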
5. Convert bin to price
A $100 purchase at price P produces $100 / P × 10⁸ sats, which lands in bin:
bin = log₁₀($100 / P × 10⁸) × 200
= (2 + 8 − log₁₀(P)) × 200
= (10 − log₁₀(P)) × 200
So the stencil's winning position — the bin where $100 purchases land — directly encodes the price:
price = 10^(10 − bin / 200) dollars
Parabolic interpolation between the best bin and its two neighbors refines the estimate to sub-bin precision.
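A sketch of the conversion and the sub-bin refinement. The vertex formula is the standard three-point parabola fit; whether the crate uses exactly this form, or how it treats a flat neighborhood, is an assumption.

```rust
const BINS_PER_DECADE: f64 = 200.0;

/// price = 10^(10 − bin / 200), with `bin` the (possibly fractional)
/// position of the $100 spike.
fn bin_to_price(bin: f64) -> f64 {
    10f64.powf(10.0 - bin / BINS_PER_DECADE)
}

/// Vertex of the parabola through (−1, left), (0, center), (1, right),
/// returned as an offset from the center bin. Falls back to 0 when the
/// three scores are collinear.
fn parabolic_offset(left: f64, center: f64, right: f64) -> f64 {
    let denom = left - 2.0 * center + right;
    if denom == 0.0 {
        0.0
    } else {
        0.5 * (left - right) / denom
    }
}

/// Price for the winning bin, refined using the scores of its two neighbors.
fn refined_price(best_bin: usize, left: f64, center: f64, right: f64) -> f64 {
    bin_to_price(best_bin as f64 + parabolic_offset(left, center, right))
}
```

Bin 1000 maps to $100,000 and bin 1060.2 to about $50,000. When the center score is the maximum of the three, the offset stays within half a bin either side.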
Pipeline
block ──→ filter (outputs) ──→ histogram (2,400 log-scale bins) ──→ ring buffer (×12) ──→ EMA ──→ stencil (19-point scoring) ──→ best bin (parabolic interpolation) ──→ $
Input
The oracle consumes one pre-built histogram per block: process_histogram(&hist) takes a [u32; 2400] bin-count array and returns the updated reference bin.
The caller does the filtering when it builds the histogram. For each block it skips the coinbase, drops every output of a transaction carrying an OP_RETURN, then bins the rest. default_eligible_bin(sats, output_type) (or Oracle::output_to_bin for a non-default Config) applies the per-output rules: excluded script types, dust, and round-BTC values. It returns the bin index, or None for a filtered output.
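The caller-side loop might be sketched like this. The `Tx` struct is a hypothetical, simplified view of a transaction, and the `eligible_bin` closure stands in for `default_eligible_bin`, whose exact signature is not reproduced here.

```rust
const NUM_BINS: usize = 2_400;

/// Hypothetical, simplified transaction view; not a type from the crate.
struct Tx {
    is_coinbase: bool,
    has_op_return: bool,
    /// Output values in satoshis.
    outputs: Vec<u64>,
}

/// Build one block's histogram. `eligible_bin` is a stand-in for the
/// per-output rule: it returns the bin index, or None for a filtered output.
fn build_histogram<F>(txs: &[Tx], eligible_bin: F) -> [u32; NUM_BINS]
where
    F: Fn(u64) -> Option<usize>,
{
    let mut hist = [0u32; NUM_BINS];
    for tx in txs {
        // Transaction-level rules: drop the coinbase and anything with OP_RETURN.
        if tx.is_coinbase || tx.has_op_return {
            continue;
        }
        for &sats in &tx.outputs {
            if let Some(bin) = eligible_bin(sats) {
                if bin < NUM_BINS {
                    hist[bin] += 1;
                }
            }
        }
    }
    hist
}
```

The resulting array is what would then be handed to `process_histogram`.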
The initial seed must be close to the real price at the starting height. The crate includes a PRICES constant with exchange prices for every height up to 630,000 to derive a seed from.
Configuration
All parameters are set through Config, which provides sensible defaults:
| Parameter | Default | Purpose |
|---|---|---|
| alpha | 2/7 | EMA decay rate (~6-block span) |
| window_size | 12 | Ring buffer depth in blocks |
| search_below / search_above | 9 / 11 | Search window around previous estimate (bins) |
| min_sats | 1,000 | Dust threshold |
| exclude_common_round_values | true | Filter d × 10ⁿ (d ∈ {1,2,3,5,6}) to prevent false stencil matches |
| excluded_output_types | P2TR | Script types dominated by protocol activity |
Comparison with UTXOracle
UTXOracle by @SteveSimple proved that BTC/USD can be derived purely from on-chain data. Both projects share the same core insight (round-dollar detection via log-scale histogram) but make different engineering choices:
| | brk_oracle | UTXOracle |
|---|---|---|
| Resolution | Per-block (~10 min) + daily candles | Per-run consensus price + per-output intraday scatter |
| Operation | Rolling: EMA over ring buffer, updates each block | Batch: processes a full day from scratch, stateless |
| Algorithm | Single-pass stencil scoring with per-offset normalization | Multi-step: dual stencil → rough estimate → output-to-USD mapping → iterative convergence |
| Stencil | 19 round-USD offsets ($1 to $10k), each normalized to its own peak | 803-point Gaussian + weighted spike template targeting 17 round-USD amounts |
| Round BTC handling | Excluded from histogram entirely | Histogram bins smoothed by averaging neighbors |
| Output filtering | Per-tx OP_RETURN drop, then per-output: script type, dust threshold, round BTC | Per-tx: exactly 2 outputs, ≤5 inputs, no same-day inputs, ≤500-byte witness |
| Validated from | Height 525,000 (May 2018) | December 2023 |
| Language | Rust | Python |
| Dependencies | None (pure computation, caller provides block data) | Bitcoin Core RPC |
| Bins per decade | 200 | 200 |
Accuracy
Tested over 411,251 blocks (heights 525,000 to 949,800, as of May 2026) against exchange OHLC data. Error is measured per block as distance from the oracle estimate to the exchange high/low range at that height. If the oracle falls within the range, the error is zero.
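The per-block error metric can be sketched as below. Expressing the distance as a percentage of the nearest range bound is an assumption; the text above does not state the denominator.

```rust
/// Distance from `estimate` to the exchange [low, high] range, in percent.
/// Zero when the estimate falls inside the range.
/// Assumption: the distance is taken relative to the nearest bound.
fn range_error_pct(estimate: f64, low: f64, high: f64) -> f64 {
    if estimate < low {
        (low - estimate) / low * 100.0
    } else if estimate > high {
        (estimate - high) / high * 100.0
    } else {
        0.0
    }
}
```

With a range of $90 to $100, an estimate of $95 scores 0%, $101 scores 1%, and $81 scores 10%.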
Per-block
| Metric | Value |
|---|---|
| Median error | 0.11% |
| 95th percentile | 0.67% |
| 99th percentile | 1.7% |
| 99.9th percentile | 5.4% |
| RMSE | 0.50% |
| Max error | 33.4% |
| Bias | +0.00 bins (essentially zero) |
| Blocks > 5% error | 472 (0.11%) |
| Blocks > 10% error | 177 |
| Blocks > 20% error | 3 |
Daily candles
Oracle daily OHLC built from per-block prices vs exchange daily OHLC:
| | Median | RMSE | Max |
|---|---|---|---|
| Open | 0.21% | 0.65% | 15.3% |
| High | 0.53% | 1.12% | 28.0% |
| Low | 0.51% | 1.38% | 19.7% |
| Close | 0.24% | 0.73% | 15.4% |
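Building a daily candle from per-block prices is a simple fold. This sketch assumes the caller has already grouped one day's block prices in chain order; how day boundaries are drawn is not specified here, and the `Candle` type is hypothetical.

```rust
/// One day's open, high, low and close.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Candle {
    open: f64,
    high: f64,
    low: f64,
    close: f64,
}

/// Aggregate one day's per-block prices (in chain order) into a candle.
/// Returns None for a day with no blocks.
fn daily_candle(prices: &[f64]) -> Option<Candle> {
    let (&open, &close) = (prices.first()?, prices.last()?);
    let high = prices.iter().copied().fold(f64::MIN, f64::max);
    let low = prices.iter().copied().fold(f64::MAX, f64::min);
    Some(Candle { open, high, low, close })
}
```

Because the high and low are extremes over roughly 144 block estimates, a single outlier block moves them directly, which is consistent with the High and Low rows showing larger errors than Open and Close.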
By year
| Year | Blocks | Median | RMSE | Max | >5% | >10% | >20% | Price range |
|---|---|---|---|---|---|---|---|---|
| 2018 | 31,492 | 0.21% | 1.11% | 33.4% | 169 | 109 | 3 | $3,129–$8,488 |
| 2019 | 54,272 | 0.16% | 0.69% | 17.4% | 165 | 53 | 0 | $3,338–$13,868 |
| 2020 | 53,102 | 0.10% | 0.44% | 12.6% | 70 | 6 | 0 | $3,858–$29,322 |
| 2021 | 52,733 | 0.07% | 0.47% | 14.4% | 42 | 9 | 0 | $27,678–$69,000 |
| 2022 | 53,230 | 0.07% | 0.32% | 6.8% | 10 | 0 | 0 | $15,460–$48,240 |
| 2023 | 54,032 | 0.10% | 0.25% | 6.6% | 5 | 0 | 0 | $16,490–$44,700 |
| 2024 | 53,367 | 0.10% | 0.28% | 6.7% | 7 | 0 | 0 | $38,555–$108,298 |
| 2025 | 53,113 | 0.11% | 0.25% | 5.8% | 4 | 0 | 0 | $74,409–$126,198 |
| 2026 | 5,910 | 0.11% | 0.27% | 3.2% | 0 | 0 | 0 | $60,000–$97,900 |
The oracle is only as good as the signal it reads. The largest errors cluster in late 2018: the November price crash fell faster than the narrow search window could follow (33% max error), and on-chain volume was lower then, so the round-dollar pattern was weaker (1.1% RMSE for the year). By 2020 the signal is strong enough for 0.1% median accuracy, and since 2022 no block exceeds 10% error.
Why no outlier smoothing?
Post-hoc smoothing — for example, correcting any block whose price deviates more than 5% from both its neighbors — would improve the aggregate numbers. This is deliberately not done, for two reasons:
- Simplicity: The oracle is a single forward pass with no lookback corrections. Adding smoothing means defining thresholds, neighbor windows, and replacement strategies, all of which add complexity for marginal gain.
- Finality: Each block's price is produced once and never revised (unless the block itself is reorged). Downstream consumers can treat the oracle output as append-only. Smoothing would require retroactively changing already-published prices, breaking that property.
Changelog
v2
Changes from v1:
- OP_RETURN filter: every output of a transaction carrying an OP_RETURN is now dropped from the histogram. Such transactions are protocol machinery (cross-chain swaps, anchoring) whose payout amounts can form false round-dollar patterns. This was the trigger for the worst price glitches in v1.
- P2WSH reactivated: once the OP_RETURN filter removes the protocol noise, P2WSH outputs are usable round-dollar signal again, so they are no longer excluded. P2TR stays excluded.
- Earlier start: on-chain tracking begins at height 525,000 (May 2018) instead of 550,000, adding about 25,000 blocks of history.
VERSION is exposed as a crate constant so downstream consumers can invalidate prices computed by an earlier algorithm.