mirror of
https://github.com/bitcoinresearchkit/brk.git
synced 2026-04-25 07:09:59 -07:00
153 lines
4.7 KiB
Markdown
# brk_parser

**High-performance Bitcoin block parser for raw Bitcoin Core block files**

`brk_parser` provides efficient sequential access to Bitcoin Core's raw block files (`blkXXXXX.dat`), delivering blocks in height order with automatic fork filtering and XOR encryption support. Built for blockchain analysis and indexing applications that need complete Bitcoin data access.

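For orientation, the raw files mentioned above store each block as a record: a 4-byte network magic, a 4-byte little-endian length, then the serialized block. The sketch below only locates those records in an in-memory buffer. It is not taken from `brk_parser`; `block_ranges` is a hypothetical helper, and it assumes mainnet data that is not XOR-obfuscated (or has already been de-obfuscated).

```rust
/// Mainnet network magic that precedes every block record.
const MAINNET_MAGIC: [u8; 4] = [0xF9, 0xBE, 0xB4, 0xD9];

/// Returns the byte range of every block payload found in `data`.
/// Hypothetical helper for illustration only.
fn block_ranges(data: &[u8]) -> Vec<std::ops::Range<usize>> {
    let mut ranges = Vec::new();
    let mut pos = 0;

    // Each record needs at least 8 header bytes (magic + length).
    while pos + 8 <= data.len() {
        if !data[pos..].starts_with(&MAINNET_MAGIC) {
            // Not at a record boundary (for example zero padding): skip a byte.
            pos += 1;
            continue;
        }

        let len = u32::from_le_bytes([
            data[pos + 4],
            data[pos + 5],
            data[pos + 6],
            data[pos + 7],
        ]) as usize;
        let start = pos + 8;

        // Stop at a truncated record instead of reading past the buffer.
        if len > data.len() - start {
            break;
        }

        ranges.push(start..start + len);
        pos = start + len;
    }

    ranges
}
```

Each returned range can then be handed to a block deserializer; skipping byte by byte over non-matching data is a simple way to tolerate padding between records.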
## What it provides

- **Sequential block access**: Blocks delivered in height order (0, 1, 2, ...) regardless of physical file storage
- **Fork filtering**: Automatically excludes orphaned blocks using Bitcoin Core RPC verification
- **XOR encryption support**: Transparently handles XOR-encrypted block files
- **High performance**: Multi-threaded parsing with ~500MB peak memory usage
- **State persistence**: Caches parsing state for fast restarts

## Key Features

### Performance Optimization
- **Multi-threaded pipeline**: 3-stage processing (file reading, decoding, ordering)
- **Parallel decoding**: Uses rayon for concurrent block deserialization
- **Memory efficient**: Bounded channels prevent memory bloat
- **State caching**: Saves parsing state to avoid re-scanning unchanged files

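The three stages and the bounded channels listed above can be sketched with the standard library alone. This is an illustration of the pattern, not the crate's implementation (which depends on `crossbeam` and `rayon`); `run_pipeline`, the toy `decode` function, and the `(height, payload_len)` output are hypothetical stand-ins.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Toy stand-in for block deserialization (hypothetical): the first byte is
/// treated as the height, the remaining bytes as the payload.
fn decode(raw: &[u8]) -> Option<(u32, usize)> {
    let (&height, payload) = raw.split_first()?;
    Some((u32::from(height), payload.len()))
}

/// Runs reader -> decoder -> orderer over `raw_blocks`.
/// `capacity` bounds each channel, so a fast stage blocks instead of
/// buffering an unbounded amount of data.
fn run_pipeline(raw_blocks: Vec<Vec<u8>>, capacity: usize) -> Vec<(u32, usize)> {
    let (raw_tx, raw_rx) = sync_channel::<Vec<u8>>(capacity);
    let (decoded_tx, decoded_rx) = sync_channel::<(u32, usize)>(capacity);

    // Stage 1: "file reading" (here: iterating an in-memory list).
    let reader = thread::spawn(move || {
        for raw in raw_blocks {
            if raw_tx.send(raw).is_err() {
                break; // downstream stage has gone away
            }
        }
    });

    // Stage 2: decoding.
    let decoder = thread::spawn(move || {
        for raw in raw_rx {
            if let Some(item) = decode(&raw) {
                if decoded_tx.send(item).is_err() {
                    break;
                }
            }
        }
    });

    // Stage 3: ordering, simplified here to collect-then-sort on this thread.
    let mut ordered: Vec<(u32, usize)> = decoded_rx.iter().collect();
    ordered.sort_by_key(|&(height, _)| height);

    reader.join().expect("reader thread panicked");
    decoder.join().expect("decoder thread panicked");
    ordered
}
```

Calling `run_pipeline(blocks, 2)` keeps at most two undelivered items in each channel. A streaming implementation would release blocks as soon as the next expected height arrives rather than sorting at the end.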
### Bitcoin Integration
- **RPC verification**: Uses Bitcoin Core RPC to filter orphaned blocks
- **Confirmation checks**: Only processes blocks with positive confirmations
- **Height ordering**: Ensures sequential delivery regardless of storage order

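The confirmation check and height ordering described above can be combined in a small reorder buffer. The sketch below is an assumption about how such a buffer might look, not the crate's code: `HeightOrderer` is a hypothetical type, and the `confirmations` argument stands for the value Bitcoin Core reports for a block header over RPC (`-1` when the block is not on the active chain).

```rust
use std::collections::BTreeMap;

/// Releases items strictly in height order, buffering any that arrive early.
/// Hypothetical type for illustration only.
struct HeightOrderer<T> {
    next_height: u32,
    pending: BTreeMap<u32, T>,
}

impl<T> HeightOrderer<T> {
    fn new(start_height: u32) -> Self {
        Self {
            next_height: start_height,
            pending: BTreeMap::new(),
        }
    }

    /// Offers one block together with its confirmation count.
    /// Returns every block that is now ready, in ascending height order.
    fn push(&mut self, height: u32, confirmations: i32, item: T) -> Vec<(u32, T)> {
        // Blocks without positive confirmations are treated as orphaned.
        if confirmations <= 0 {
            return Vec::new();
        }

        self.pending.insert(height, item);

        // Drain the buffer for as long as the next expected height is present.
        let mut ready = Vec::new();
        while let Some(item) = self.pending.remove(&self.next_height) {
            ready.push((self.next_height, item));
            self.next_height += 1;
        }
        ready
    }
}
```

Because block files store blocks in arrival order, a block at height 1 may be read before the block at height 0; the buffer holds it back until the gap is filled.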
### XOR Encryption Support
- **Transparent decryption**: Automatically handles XOR-encrypted block files, i.e. block files that Bitcoin Core has obfuscated with a repeating XOR key
- **Streaming processing**: Applies XOR decryption on the fly during parsing

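Streaming XOR decryption works chunk by chunk as long as the key stays aligned with each byte's position in the file. The following is a minimal sketch under that assumption; `xor_chunk` is a hypothetical helper and not part of the `brk_parser` API, and how the key is obtained is left out.

```rust
/// XORs `chunk` in place with a repeating `key`.
///
/// `file_offset` is the position of the chunk's first byte within the file.
/// Passing it keeps the key aligned when a file is processed in pieces.
/// Hypothetical helper for illustration only.
fn xor_chunk(chunk: &mut [u8], key: &[u8], file_offset: usize) {
    // An empty key means "no obfuscation": leave the data untouched.
    if key.is_empty() {
        return;
    }

    for (i, byte) in chunk.iter_mut().enumerate() {
        *byte ^= key[(file_offset + i) % key.len()];
    }
}
```

Since XOR is its own inverse, the same function both obfuscates and restores data, and processing a file in several chunks gives the same result as processing it in one pass.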
## Usage

### Basic Block Parsing

```rust
use std::path::PathBuf;

use bitcoincore_rpc::{Auth, Client};
use brk_parser::Parser;
use brk_structs::Height; // used by the range and single-block examples

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Rust does not expand `~`, so build the data directory from $HOME.
    // Adjust this if your node uses a different data directory.
    let bitcoin_dir = PathBuf::from(std::env::var("HOME")?).join(".bitcoin");

    // Set up the RPC client (the parser needs a reference with a static lifetime)
    let rpc = Box::leak(Box::new(Client::new(
        "http://localhost:8332",
        Auth::CookieFile(bitcoin_dir.join(".cookie")),
    )?));

    // Create the parser: blocks directory, output directory, RPC client
    let parser = Parser::new(
        bitcoin_dir.join("blocks"),
        PathBuf::from("./output"),
        rpc,
    );

    // Parse all blocks sequentially
    parser.parse(None, None)
        .iter()
        .for_each(|(height, block, hash)| {
            println!("Block {}: {} ({} txs)", height, hash, block.txdata.len());
        });

    Ok(())
}
```

### Range Parsing

```rust
// Parse specific height range
let start = Some(Height::new(800_000));
let end = Some(Height::new(800_100));

parser.parse(start, end)
    .iter()
    .for_each(|(height, block, hash)| {
        // Process blocks 800,000 to 800,100
    });
```

### Single Block Access

```rust
// Get single block by height
let genesis = parser.get(Height::new(0));
println!("Genesis has {} transactions", genesis.txdata.len());
```

### Real-world Usage Example

```rust
use brk_parser::Parser;

fn analyze_blockchain(parser: &Parser) {
    let mut total_transactions = 0;
    let mut total_outputs = 0;

    parser.parse(None, None)
        .iter()
        .for_each(|(height, block, _hash)| {
            // Accumulate per-block transaction and output counts
            total_transactions += block.txdata.len();
            total_outputs += block.txdata.iter()
                .map(|tx| tx.output.len())
                .sum::<usize>();

            // Log progress every 10,000 blocks
            if height.0 % 10_000 == 0 {
                println!("Reached height {}", height);
            }
        });

    println!("Total transactions: {}", total_transactions);
    println!("Total outputs: {}", total_outputs);
}
```

## Output Format

The parser yields one `(Height, Block, BlockHash)` tuple per block:
- `Height`: Block height (sequential: 0, 1, 2, ...)
- `Block`: Complete block data from the `bitcoin` crate
- `BlockHash`: Block's cryptographic hash

## Performance Characteristics

Benchmarked on a MacBook Pro (M3 Pro):
- **Full blockchain** (0 to 855,000): ~4 minutes
- **Recent blocks** (800,000 to 855,000): ~52 seconds
- **Peak memory usage**: ~500MB
- **Restart performance**: Subsequent runs are much faster thanks to state caching

## Requirements

- A running Bitcoin Core node with RPC enabled
- Read access to Bitcoin Core's `blocks/` directory
- Bitcoin Core v25.0 through v29.0
- RPC authentication (cookie file or username/password)

## State Management

The parser saves parsing state in `{output_dir}/blk_index_to_blk_recap.json` containing:
- Block file indices and maximum heights
- File modification times for change detection
- Restart optimization metadata

**Note**: Only one parser instance should run at a time as the state file doesn't support concurrent access.

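Change detection based on modification times, as described above, could look roughly like the sketch below. The `BlkRecap` struct, its field names, and `first_stale_index` are hypothetical and only loosely modelled on the listed contents of the state file; the actual JSON layout is not shown in this README.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-file summary kept between runs.
#[derive(Debug, Clone, PartialEq)]
struct BlkRecap {
    /// Highest block height seen in the file (carried along, unused here).
    max_height: u32,
    /// File modification time, in seconds since the Unix epoch.
    modified_secs: u64,
}

/// Returns the index of the first block file that needs re-scanning:
/// the first file that is missing from the cache or whose modification
/// time differs from the cached one. `None` means nothing changed.
fn first_stale_index(
    cached: &BTreeMap<u16, BlkRecap>,
    current_mtimes: &BTreeMap<u16, u64>,
) -> Option<u16> {
    // BTreeMap iterates in ascending key order, i.e. by file index.
    for (index, mtime) in current_mtimes {
        match cached.get(index) {
            Some(recap) if recap.modified_secs == *mtime => continue,
            _ => return Some(*index),
        }
    }
    None
}
```

A restart would then skip every file before the returned index and resume scanning from there, which is one plausible reason subsequent runs are faster.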
## Dependencies

- `bitcoin` - Bitcoin protocol types and block parsing
- `bitcoincore_rpc` - RPC communication with Bitcoin Core
- `crossbeam` - Multi-producer, multi-consumer channels
- `rayon` - Data parallelism for block decoding
- `serde` - State serialization and persistence

---

*This README was generated by Claude Code*