# brk_parser **High-performance Bitcoin block parser for raw Bitcoin Core block files** `brk_parser` provides efficient sequential access to Bitcoin Core's raw block files (`blkXXXXX.dat`), delivering blocks in height order with automatic fork filtering and XOR encryption support. Built for blockchain analysis and indexing applications that need complete Bitcoin data access. ## What it provides - **Sequential block access**: Blocks delivered in height order (0, 1, 2, ...) regardless of physical file storage - **Fork filtering**: Automatically excludes orphaned blocks using Bitcoin Core RPC verification - **XOR encryption support**: Transparently handles XOR-encrypted block files - **High performance**: Multi-threaded parsing with ~500MB peak memory usage - **State persistence**: Caches parsing state for fast restarts ## Key Features ### Performance Optimization - **Multi-threaded pipeline**: 3-stage processing (file reading, decoding, ordering) - **Parallel decoding**: Uses rayon for concurrent block deserialization - **Memory efficient**: Bounded channels prevent memory bloat - **State caching**: Saves parsing state to avoid re-scanning unchanged files ### Bitcoin Integration - **RPC verification**: Uses Bitcoin Core RPC to filter orphaned blocks - **Confirmation checks**: Only processes blocks with positive confirmations - **Height ordering**: Ensures sequential delivery regardless of storage order ### XOR Encryption Support - **Transparent decryption**: Automatically handles XOR-encrypted block files - **Streaming processing**: Applies XOR decryption on-the-fly during parsing ## Usage ### Basic Block Parsing ```rust use brk_parser::Parser; use brk_structs::Height; use bitcoincore_rpc::{Auth, Client}; // Setup RPC client (must have static lifetime) let rpc = Box::leak(Box::new(Client::new( "http://localhost:8332", Auth::CookieFile(Path::new("~/.bitcoin/.cookie")), )?)); // Create parser let parser = Parser::new( Path::new("~/.bitcoin/blocks").to_path_buf(), Some(Path::new("./output").to_path_buf()), rpc, ); // Parse all blocks sequentially parser.parse(None, None) .iter() .for_each(|(height, block, hash)| { println!("Block {}: {} ({} txs)", height, hash, block.txdata.len()); }); ``` ### Range Parsing ```rust // Parse specific height range let start = Some(Height::new(800_000)); let end = Some(Height::new(800_100)); parser.parse(start, end) .iter() .for_each(|(height, block, hash)| { // Process blocks 800,000 to 800,100 }); ``` ### Single Block Access ```rust // Get single block by height let genesis = parser.get(Height::new(0)); println!("Genesis has {} transactions", genesis.txdata.len()); ``` ### Real-world Usage Example ```rust use brk_parser::Parser; use bitcoin::Block; fn analyze_blockchain(parser: &Parser) { let mut total_transactions = 0; let mut total_outputs = 0; parser.parse(None, None) .iter() .for_each(|(height, block, _hash)| { total_transactions += block.txdata.len(); total_outputs += block.txdata.iter() .map(|tx| tx.output.len()) .sum::(); if height.0 % 10000 == 0 { println!("Processed {} blocks", height); } }); println!("Total transactions: {}", total_transactions); println!("Total outputs: {}", total_outputs); } ``` ## Output Format The parser returns tuples for each block: - `Height`: Block height (sequential: 0, 1, 2, ...) - `Block`: Complete block data from the `bitcoin` crate - `BlockHash`: Block's cryptographic hash ## Performance Characteristics Benchmarked on MacBook Pro M3 Pro: - **Full blockchain** (0 to 855,000): ~4 minutes - **Recent blocks** (800,000 to 855,000): ~52 seconds - **Peak memory usage**: ~500MB - **Restart performance**: Subsequent runs much faster due to state caching ## Requirements - Running Bitcoin Core node with RPC enabled - Access to Bitcoin Core's `blocks/` directory - Bitcoin Core versions v25.0 through v29.0 supported - RPC authentication (cookie file or username/password) ## State Management The parser saves parsing state in `{output_dir}/blk_index_to_blk_recap.json` containing: - Block file indices and maximum heights - File modification times for change detection - Restart optimization metadata **Note**: Only one parser instance should run at a time as the state file doesn't support concurrent access. ## Dependencies - `bitcoin` - Bitcoin protocol types and block parsing - `bitcoincore_rpc` - RPC communication with Bitcoin Core - `crossbeam` - Multi-producer, multi-consumer channels - `rayon` - Data parallelism for block decoding - `serde` - State serialization and persistence --- *This README was generated by Claude Code*