mirror of
https://github.com/bitcoinresearchkit/brk.git
synced 2026-04-24 06:39:58 -07:00
brk_indexer
Full Bitcoin blockchain indexer for fast analytics queries.
What It Enables
Transform raw Bitcoin blockchain data into indexed vectors and key-value stores optimized for analytics. Query any block, transaction, address, or UTXO without scanning the chain.
Key Features
- Multi-phase block processing: Parallel TXID computation, input/output processing, sequential finalization
- Address indexing: Maps addresses to their transaction history and UTXOs per address type
- UTXO tracking: Live outpoint→value lookups, address→unspent outputs
- Reorg handling: Automatic rollback to valid chain state on reorganization
- Collision detection: Validates rapidhash-based prefix lookups against known duplicate TXIDs
- Incremental snapshots: Periodic checkpoints for crash recovery
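The reorg-handling bullet can be pictured with a small sketch: compare the stored block hashes with the node's current best chain, then truncate everything after the last height where both agree. `ChainIndex` and `rollback_to_common_ancestor` are hypothetical names made up for this illustration, not brk_indexer's API, and the real rollback also has to undo store and UTXO changes.

```rust
/// Hypothetical, simplified stand-in for an append-only block index.
/// The position in the vector is the block height.
pub struct ChainIndex {
    pub height_to_blockhash: Vec<[u8; 32]>,
}

impl ChainIndex {
    /// Compares stored hashes with the node's current best chain and
    /// truncates everything after the last height where both agree.
    /// Returns the height at which indexing should resume.
    pub fn rollback_to_common_ancestor(&mut self, best_chain: &[[u8; 32]]) -> usize {
        let common = self
            .height_to_blockhash
            .iter()
            .zip(best_chain)
            .take_while(|(ours, theirs)| ours == theirs)
            .count();
        // Drop every block that is no longer part of the best chain.
        self.height_to_blockhash.truncate(common);
        common
    }
}
```

Because the vectors are append-only and keyed by position, a rollback in this model is a single truncate followed by re-indexing from the returned height.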
Core API
let mut indexer = Indexer::forced_import(&outputs_dir)?;
// Index new blocks
let starting_indexes = indexer.index(&blocks, &client, &exit)?;
// Access indexed data: the TXID-prefix store resolves to a transaction index
let txindex = indexer.stores.txidprefix_to_txindex.get(&txid_prefix)?;
// Block vectors are keyed by height (here `height` is a block height obtained elsewhere)
let blockhash = indexer.vecs.block.height_to_blockhash.get(height)?;
Data Structures
Vecs (append-only vectors):
- Block: height_to_blockhash, height_to_timestamp, height_to_difficulty
- Transaction: txindex_to_txid, txindex_to_height, txindex_to_base_size
- Input/Output: txinindex_to_outpoint, txoutindex_to_value, txoutindex_to_outputtype
- Address: Per-type typeindex_to_addressbytes
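As a rough mental model of the append-only vectors, the index (height or txindex) is simply the position in a vector, so appending a block assigns the next height and each transaction gets the next global transaction index. The `Vecs` struct and `push_block` method below are a hypothetical in-memory simplification, not the crate's on-disk types.

```rust
/// Hypothetical in-memory model of a few of the vectors listed above.
#[derive(Default)]
pub struct Vecs {
    pub height_to_blockhash: Vec<[u8; 32]>,
    pub height_to_timestamp: Vec<u32>,
    pub txindex_to_height: Vec<u32>,
}

impl Vecs {
    /// Appends one block and its transactions; returns the new block's height.
    pub fn push_block(&mut self, blockhash: [u8; 32], timestamp: u32, tx_count: usize) -> u32 {
        // The next height is the current length of any height-keyed vector.
        let height = self.height_to_blockhash.len() as u32;
        self.height_to_blockhash.push(blockhash);
        self.height_to_timestamp.push(timestamp);
        // Transaction indexes are global and assigned in block order.
        self.txindex_to_height
            .extend(std::iter::repeat(height).take(tx_count));
        height
    }
}
```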
Stores (key-value lookups):
- txidprefix_to_txindex: TXID lookup via 10-byte prefix
- blockhashprefix_to_height: Block lookup via 4-byte prefix
- addresshash_to_addressindex: Address lookup per type
- addressindex_to_unspent_outpoints: Live UTXO set per address
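Keying a store by a short prefix saves space but means two different TXIDs can map to the same key, so a hit has to be checked against the full TXID. The sketch below shows one way that check could look. It assumes the prefix is the first 10 bytes of the TXID (a simplification; the feature list mentions rapidhash-based prefixes), and `PrefixStore` and `Insert` are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical prefix store: maps a 10-byte TXID prefix to a transaction
/// index. Plain truncation is an assumption made for this sketch.
#[derive(Default)]
pub struct PrefixStore {
    prefix_to_txindex: HashMap<[u8; 10], u32>,
    txindex_to_txid: Vec<[u8; 32]>,
}

#[derive(Debug, PartialEq)]
pub enum Insert {
    /// The TXID was not indexed before and received this index.
    New(u32),
    /// The same full TXID was already indexed at this position.
    Duplicate(u32),
    /// A different TXID shares the prefix stored at this position.
    Collision(u32),
}

impl PrefixStore {
    fn prefix(txid: &[u8; 32]) -> [u8; 10] {
        let mut p = [0u8; 10];
        p.copy_from_slice(&txid[..10]);
        p
    }

    pub fn insert(&mut self, txid: [u8; 32]) -> Insert {
        let prefix = Self::prefix(&txid);
        if let Some(&existing) = self.prefix_to_txindex.get(&prefix) {
            // A prefix hit is only a candidate: confirm with the full TXID.
            return if self.txindex_to_txid[existing as usize] == txid {
                Insert::Duplicate(existing)
            } else {
                Insert::Collision(existing)
            };
        }
        let txindex = self.txindex_to_txid.len() as u32;
        self.txindex_to_txid.push(txid);
        self.prefix_to_txindex.insert(prefix, txindex);
        Insert::New(txindex)
    }

    /// Returns the index only if the full TXID matches the stored one.
    pub fn get(&self, txid: &[u8; 32]) -> Option<u32> {
        let idx = *self.prefix_to_txindex.get(&Self::prefix(txid))?;
        (self.txindex_to_txid[idx as usize] == *txid).then_some(idx)
    }
}
```

Separating `Duplicate` from `Collision` matters because a repeated full TXID and a prefix clash between two different TXIDs call for different handling.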
Processing Pipeline
- Block metadata: Store blockhash, difficulty, timestamp
- Compute TXIDs: Parallel SHA256d across transactions
- Process inputs: Lookup spent outpoints, resolve address info
- Process outputs: Extract addresses, assign type indexes
- Finalize: Sequential store updates, UTXO set mutations
- Commit: Periodic flush to disk
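The split between parallel and sequential phases can be sketched with the standard library alone. In this illustration the ID computation fans out across scoped threads because each transaction is independent, while UTXO mutations are applied in block order on one thread. Everything here is an assumption for demonstration: `Tx`, `fake_txid`, `compute_txids`, and `finalize` are hypothetical, and `fake_txid` uses a 64-bit standard-library hash as a stand-in for SHA256d.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::thread;

/// Hypothetical, heavily simplified transaction for this sketch.
pub struct Tx {
    pub raw: Vec<u8>,
    /// Outpoints spent by this transaction: (funding tx id, output position).
    pub inputs: Vec<(u64, u32)>,
    /// Values of the outputs this transaction creates.
    pub outputs: Vec<u64>,
}

/// Stand-in for SHA256d so the sketch needs only the standard library.
pub fn fake_txid(raw: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    raw.hash(&mut h);
    h.finish()
}

/// Parallel phase: each worker hashes one chunk; results keep block order.
pub fn compute_txids(txs: &[Tx], workers: usize) -> Vec<u64> {
    // Chunk size must be at least 1, since `chunks(0)` panics.
    let chunk = txs.len().div_ceil(workers.max(1)).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = txs
            .chunks(chunk)
            .map(|part| {
                s.spawn(move || part.iter().map(|tx| fake_txid(&tx.raw)).collect::<Vec<_>>())
            })
            .collect();
        // Joining in spawn order preserves the original transaction order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}

/// Sequential phase: apply spends and new outputs to the UTXO set in order.
pub fn finalize(utxos: &mut HashMap<(u64, u32), u64>, txs: &[Tx], txids: &[u64]) {
    for (tx, &txid) in txs.iter().zip(txids) {
        for outpoint in &tx.inputs {
            utxos.remove(outpoint);
        }
        for (vout, &value) in tx.outputs.iter().enumerate() {
            utxos.insert((txid, vout as u32), value);
        }
    }
}
```

Keeping the mutation phase sequential means a transaction that spends an output created earlier in the same block sees that output already inserted.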
Performance
| Machine | Time | Disk | Peak Disk | Memory | Peak Memory |
|---|---|---|---|---|---|
| MBP M3 Pro (36GB, internal SSD) | 3.1h | 233 GB | 307 GB | 5.5 GB | 11 GB |
| Mac Mini M4 (16GB, external SSD) | 4.9h | 233 GB | 303 GB | 5.4 GB | 11 GB |
Full benchmark data: /benches/brk_indexer
Recommended: mimalloc v3
Use mimalloc v3 as the global allocator to reduce memory usage.
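In Rust, the global allocator is selected with the `#[global_allocator]` attribute on a static. With the `mimalloc` crate this is typically `static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;` (how the v3 backend is selected depends on the crate version, so check its documentation). The sketch below shows the same mechanism using only the standard library: a hypothetical `Counting` wrapper around the system allocator that tracks live heap bytes, which can also help compare memory usage before and after switching allocators.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards to the system allocator and tracks live bytes.
pub struct Counting;

static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Every heap allocation in the program now goes through `Counting`.
// With the mimalloc crate, this static would be a `mimalloc::MiMalloc` instead.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Number of heap bytes currently allocated and not yet freed.
pub fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}
```

Only one `#[global_allocator]` may exist in a final binary, so the attribute belongs in the application crate rather than in a library.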
Built On
- brk_iterator for block iteration
- brk_store for key-value storage
- brk_grouper for address type handling
- brk_types for domain types