readmes: simplified

nym21
2025-12-18 17:10:23 +01:00
parent 549e2da05b
commit c5657b9c31
23 changed files with 812 additions and 2968 deletions


@@ -1,199 +1,46 @@
# brk_reader
High-performance Bitcoin block parser for raw Bitcoin Core block files with XOR encryption support.
High-performance Bitcoin block reader from raw blk files.
[![Crates.io](https://img.shields.io/crates/v/brk_reader.svg)](https://crates.io/crates/brk_reader)
[![Documentation](https://docs.rs/brk_reader/badge.svg)](https://docs.rs/brk_reader)
## What It Enables
## Overview
Stream blocks directly from Bitcoin Core's `blk*.dat` files with parallel parsing, automatic XOR decoding, and chain-order delivery. For full-chain scans, this is much faster than fetching blocks over RPC.
This crate provides a multi-threaded Bitcoin block parser that processes raw Bitcoin Core `.dat` files from the blockchain directory. It supports XOR-encoded block data, parallel processing with `rayon`, and maintains chronological ordering through crossbeam channels. The parser integrates with Bitcoin Core RPC to validate block confirmations and handles file metadata tracking for incremental processing.
## Key Features
**Key Features:**
- **Direct blk file access**: Bypasses RPC overhead entirely
- **XOR decoding**: Handles Bitcoin Core's obfuscated block storage
- **Parallel parsing**: Multi-threaded block deserialization
- **Chain ordering**: Reorders out-of-sequence blocks before delivery
- **Smart start finding**: Binary search to locate starting height across blk files
- **Reorg detection**: Stops iteration on chain discontinuity
- Multi-threaded pipeline architecture with crossbeam channels
- XOR decryption support for encrypted block files
- Parallel block decoding with rayon thread pools
- Chronological block ordering with height-based validation
- Bitcoin Core RPC integration for confirmation checking
- File metadata tracking and incremental processing
- Magic byte detection for block boundary identification
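
The "smart start finding" feature listed above can be pictured as a binary search over per-file starting heights. What follows is only a minimal sketch: `find_start_file` and the `first_heights` slice (the height of the first block in each blk file, assumed non-decreasing in file order) are hypothetical names, not part of the crate's API. Blocks are not strictly height-ordered on disk, so the real lookup may need extra slack.

```rust
/// Returns the index of the blk file to start reading from for `target`.
/// Assumption: `first_heights[i]` is the height of the first block in file `i`
/// and the slice is sorted in non-decreasing order.
fn find_start_file(first_heights: &[u32], target: u32) -> usize {
    // Count the files whose first block is at or below the target height.
    let n = first_heights.partition_point(|&h| h <= target);
    // Start from the last such file (or file 0 if there is none).
    n.saturating_sub(1)
}
```

For example, with first heights `[0, 1000, 2500, 4000]`, a target height of 2600 selects file index 2.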
**Target Use Cases:**
- Bitcoin blockchain analysis tools requiring raw block access
- Historical data processing applications
- Block explorers and analytics platforms
- Research tools needing ordered block iteration
## Installation
```bash
cargo add brk_reader
```
## Quick Start
## Core API
```rust
use brk_reader::Parser;
use bitcoincore_rpc::{Client, Auth, RpcApi};
use brk_types::Height;
use std::path::PathBuf;
let reader = Reader::new(blocks_dir, &rpc_client);
// Initialize Bitcoin Core RPC client
let rpc = Box::leak(Box::new(Client::new(
    "http://localhost:8332",
    Auth::None
).unwrap()));
// Stream blocks from height 800,000 to 850,000
let receiver = reader.read(Some(Height::new(800_000)), Some(Height::new(850_000)));
// Create parser with blocks directory
let blocks_dir = PathBuf::from("/path/to/bitcoin/blocks");
let outputs_dir = Some(PathBuf::from("./parser_output"));
let parser = Parser::new(blocks_dir, outputs_dir, rpc);
// Parse blocks in height range
let start_height = Some(Height::new(700000));
let end_height = Some(Height::new(700100));
let receiver = parser.parse(start_height, end_height);
// Process blocks as they arrive
for (height, block, block_hash) in receiver.iter() {
    println!("Block {}: {} transactions", height, block.txdata.len());
    println!("Block hash: {}", block_hash);
}
```
## API Overview
### Core Types
- **`Parser`**: Main parser coordinating multi-threaded block processing
- **`AnyBlock`**: Enum representing different block states (Raw, Decoded, Skipped)
- **`XORBytes`**: XOR key bytes for decrypting block data
- **`XORIndex`**: Circular index for XOR byte application
- **`BlkMetadata`**: Block file metadata including index and modification time
### Key Methods
**`Parser::new(blocks_dir: PathBuf, outputs_dir: Option<PathBuf>, rpc: &'static Client) -> Self`**
Creates a new parser instance with blockchain directory and RPC client.
**`parse(&self, start: Option<Height>, end: Option<Height>) -> Receiver<(Height, Block, BlockHash)>`**
Returns a channel receiver that yields blocks in chronological order for the specified height range.
### Processing Pipeline
The parser implements a three-stage pipeline:
1. **File Reading Stage**: Scans `.dat` files, identifies magic bytes, extracts raw block data
2. **Decoding Stage**: Parallel XOR decryption and Bitcoin block deserialization
3. **Ordering Stage**: RPC validation and chronological ordering by block height
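
The file reading stage above relies on the on-disk record layout: each block in a `blk*.dat` file is preceded by the 4-byte network magic and a 4-byte little-endian length. Here is a minimal sketch of that segmentation, assuming the buffer has already been XOR-decoded and uses the mainnet magic. `split_blocks` is a hypothetical helper, not the crate's implementation.

```rust
/// Mainnet network magic that precedes every block record in a blk file.
const MAGIC: [u8; 4] = [0xF9, 0xBE, 0xB4, 0xD9];

/// Splits an already-decoded blk buffer into raw block payloads.
fn split_blocks(data: &[u8]) -> Vec<&[u8]> {
    let mut blocks = Vec::new();
    let mut i = 0;
    // A record needs at least 8 bytes: magic + length prefix.
    while i + 8 <= data.len() {
        if data[i..i + 4] != MAGIC {
            i += 1; // skip padding bytes until the next magic
            continue;
        }
        let len =
            u32::from_le_bytes([data[i + 4], data[i + 5], data[i + 6], data[i + 7]]) as usize;
        let start = i + 8;
        let end = start + len;
        if end > data.len() {
            break; // truncated record, e.g. a file that is still being written
        }
        blocks.push(&data[start..end]);
        i = end;
    }
    blocks
}
```

Each returned slice would then be handed to the decoding stage for deserialization.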
## Examples
### Basic Block Iteration
```rust
use brk_reader::Parser;
let parser = Parser::new(blocks_dir, Some(output_dir), rpc);
// Parse all blocks from height 650000 onwards
let receiver = parser.parse(Some(Height::new(650000)), None);
for (height, block, hash) in receiver.iter() {
    println!("Processing block {} with {} transactions",
        height, block.txdata.len());
    // Process block transactions
    for (idx, tx) in block.txdata.iter().enumerate() {
        println!(" Tx {}: {}", idx, tx.txid());
    }
}
```
### Range-Based Processing
```rust
use brk_reader::Parser;
let parser = Parser::new(blocks_dir, Some(output_dir), rpc);
// Process specific block range
let start = Height::new(600000);
let end = Height::new(600999);
let receiver = parser.parse(Some(start), Some(end));
let mut total_tx_count = 0;
for (height, block, _hash) in receiver.iter() {
    total_tx_count += block.txdata.len();
    if height == end {
        break; // End of range reached
    }
}
println!("Processed 1000 blocks with {} total transactions", total_tx_count);
```
### Incremental Processing with Metadata
```rust
use brk_reader::Parser;
let parser = Parser::new(blocks_dir, Some(output_dir), rpc);
// Parser automatically handles file metadata tracking
// Only processes blocks that have been modified since last run
let receiver = parser.parse(None, None); // Process all available blocks
for (height, block, hash) in receiver.iter() {
    // Parser ensures blocks are delivered in chronological order
    // even when processing multiple .dat files in parallel
    if height.as_u32() % 10000 == 0 {
        println!("Reached block height {}", height);
    }
for block in receiver {
    // Process block in chain order
}
```
## Architecture
### Multi-Threading Design
1. **File scanner**: Maps `blk*.dat` files to indices
2. **Byte reader**: Streams raw bytes, finds magic bytes, segments blocks
3. **Parser pool**: Parallel deserialization with rayon
4. **Orderer**: Buffers and emits blocks in height order
The parser uses a sophisticated multi-threaded architecture:
## Performance
- **File Scanner Thread**: Reads raw bytes from `.dat` files and identifies block boundaries
- **Decoder Thread Pool**: Parallel XOR decryption and block deserialization using rayon
- **Ordering Thread**: RPC validation and chronological ordering with future block buffering
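
The ordering step, which buffers blocks that arrive ahead of their turn and emits them in height order, can be sketched as follows. This is an illustration under assumptions: `Orderer` is a hypothetical type keyed by a plain `u32` height, and the crate's own buffering and channel wiring may look different.

```rust
use std::collections::BTreeMap;

/// Buffers items that arrive out of order and releases them in height order.
struct Orderer<T> {
    next: u32,                 // next height to deliver
    pending: BTreeMap<u32, T>, // "future" items waiting for their turn
}

impl<T> Orderer<T> {
    fn new(start: u32) -> Self {
        Self { next: start, pending: BTreeMap::new() }
    }

    /// Accepts one item and returns every item that is now deliverable, in order.
    fn push(&mut self, height: u32, item: T) -> Vec<(u32, T)> {
        if height < self.next {
            return Vec::new(); // already delivered, ignore
        }
        self.pending.insert(height, item);
        let mut ready = Vec::new();
        // Drain the buffer for as long as the next expected height is present.
        while let Some(item) = self.pending.remove(&self.next) {
            ready.push((self.next, item));
            self.next += 1;
        }
        ready
    }
}
```

Pushing heights 102, 100, 101 into an `Orderer` started at 100 yields nothing, then block 100, then blocks 101 and 102 together.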
The parallel pipeline can saturate disk I/O while parsing on multiple cores. For recent blocks, the reader falls back to RPC for lower latency.
### XOR Encryption Support
## Built On
Bitcoin Core optionally XOR-encrypts block files using an 8-byte key stored in `xor.dat`. The parser:
- Automatically detects XOR encryption presence
- Implements circular XOR index for efficient decryption
- Supports both encrypted and unencrypted block files
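
The circular XOR index mentioned above amounts to applying the 8-byte key repeatedly, aligned to the byte's position in the file. Below is a minimal sketch, assuming the key has already been read from `xor.dat`. `xor_in_place` is a hypothetical helper and does not reflect the signatures of the crate's `XORBytes` and `XORIndex` types.

```rust
/// Applies an 8-byte XOR key to `data` in place.
/// `offset` is the position of `data[0]` within the file, so the key stays
/// aligned when a file is processed in several chunks.
fn xor_in_place(data: &mut [u8], key: &[u8; 8], offset: usize) {
    for (i, byte) in data.iter_mut().enumerate() {
        // The key index wraps around every 8 bytes (the "circular" index).
        *byte ^= key[(offset + i) % key.len()];
    }
}
```

Because XOR is its own inverse, the same function both encodes and decodes, and an all-zero key leaves the data unchanged, which covers block files written without a key.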
### Block File Management
The parser handles Bitcoin Core's block file structure:
- Scans directory for `blk*.dat` files
- Tracks file modification times for incremental processing
- Maintains block height mappings with RPC validation
- Exports processing metadata for resumable operations
## Code Analysis Summary
**Main Type**: `Parser` struct coordinating multi-threaded block processing pipeline \
**Threading**: Three-stage pipeline using crossbeam channels with bounded capacity (50) \
**Parallelization**: rayon-based parallel block decoding with configurable batch sizes \
**XOR Handling**: Custom XORBytes and XORIndex types for efficient encryption/decryption \
**RPC Integration**: Bitcoin Core RPC validation for block confirmation and height mapping \
**File Processing**: Automatic `.dat` file discovery and magic byte boundary detection \
**Architecture**: Producer-consumer pattern with ordered delivery despite parallel processing
---
_This README was generated by Claude Code_
- `brk_error` for error handling
- `brk_rpc` for RPC client (height lookups, recent blocks)
- `brk_types` for `Height`, `BlockHash`, `BlkPosition`, `BlkMetadata`