Rob The Writer

Irys: Storage, Consensus, and Execution


Permanent storage and smart-contract execution usually belong to different parts of the crypto stack.

Storage networks are concerned with data durability: who keeps the bytes, how retrieval works, what economic promise keeps the data available, and how the network verifies that storage is actually happening. Execution chains are concerned with deterministic state transitions: which transactions run, how contract state changes, what fees are paid, and whether every node computes the same result.

Irys is interesting because it tries to make those two concerns part of the same protocol. Data is not only stored; it can become an input to EVM execution. Contracts should be able to read protocol-stored data without forcing large payloads through calldata or relying on an external service to bridge the gap.

That idea sounds simple at the product level. At the protocol level, it touches almost everything: consensus, execution, fork choice, block validation, pricing, mempool admission, P2P chunk movement, hardfork activation, and test infrastructure.

This post builds the context first. It starts with the distinction between consensus and execution, then covers the EVM standards that make compatibility valuable, then compares IPFS, Filecoin, and Arweave as storage models. After that, Irys becomes easier to place, and the engineering work I did on the project becomes less like a list of subsystems and more like one boundary problem: making storage, consensus, and execution agree.

[Figure: Three pillars (storage, consensus, and execution) with a highlighted boundary band where they must agree, and stored data flowing toward execution input]

This is also Part 3 of a broader blockchain retrospective. Part 1 was the Builder Era, Part 2 was Axelar-Solana, and Part 4 is about the patterns I kept seeing.

Consensus and execution

Most blockchain applications make the system feel like one machine. A wallet signs a transaction, a block includes it, and some state changes.

Internally, it is useful to separate two responsibilities:

| Layer | Responsibility | Plain version |
| --- | --- | --- |
| Consensus layer | Select and validate canonical history. | Which blocks count, and in what order? |
| Execution layer | Apply transactions to state. | Given those blocks, what changed? |

The consensus layer is responsible for agreement over history. It handles fork choice, block validity, timing, finality, validator or miner rules, and any protocol-specific evidence that a block must carry.

The execution layer is responsible for the state transition. It takes a block, executes its transactions, updates balances and contract storage, emits logs, produces receipts, and computes the resulting state root.

Ethereum made this split especially visible after The Merge. A modern Ethereum node runs a consensus client, such as Lighthouse or Prysm, together with an execution client, such as Geth, Nethermind, Besu, or Reth. The consensus client decides what the head of the chain is. The execution client determines whether the transactions in that head produce a valid state transition.

[Figure: Consensus layer and execution layer as separate responsibilities, with ordered blocks flowing down and valid payloads flowing up]

When these two layers drift, the failure mode is subtle. The node may not crash. Instead, consensus and execution may each continue with a different view of reality. Protocol engineering spends a lot of time preventing exactly that class of quiet divergence.

What lives inside the EVM

The EVM is often described as the virtual machine that runs Solidity contracts. That is correct, but incomplete.

At the runtime level, the EVM is a deterministic state machine. Every node starts with the same prior state, executes the same transaction bytes, and must compute the same next state. A block is valid only if this transition matches the rules.
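As a toy illustration of that determinism, the sketch below applies simple transfers to a balance map and hashes the result. It is not the EVM's real transaction or state format: `apply_block`, `state_root`, and the SHA-256-over-JSON "root" are stand-ins invented for this example.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Hash a canonical serialization of the state (stand-in for a real state root)."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_block(state: dict, txs: list) -> dict:
    """Apply transfers in order; a single invalid transfer invalidates the block."""
    new_state = dict(state)
    for sender, recipient, amount in txs:
        if amount < 0 or new_state.get(sender, 0) < amount:
            raise ValueError(f"invalid transfer from {sender}")
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


genesis = {"alice": 10, "bob": 0}
block = [("alice", "bob", 4), ("bob", "alice", 1)]

# Two independent "nodes" start from the same state and run the same block.
root_a = state_root(apply_block(genesis, block))
root_b = state_root(apply_block(genesis, block))
```

Both roots match because the inputs and the rules match. Reordering the two transfers makes the block invalid, which is the sense in which ordering belongs to consensus and the outcome belongs to execution.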

At the lowest level, the EVM works with:

| Component | Role |
| --- | --- |
| Stack | Temporary 256-bit words used by opcodes. |
| Memory | Temporary byte-addressed space during a transaction. |
| Storage | Persistent key-value state owned by contracts. |
| Calldata | Input bytes attached to calls and transactions. |
| Logs | Event output consumed by indexers and applications. |
| Accounts | EOAs and contracts, each with nonce, balance, code, and storage root. |
| Gas | The metering system for computation, memory, storage, and IO. |

But the EVM’s practical value also comes from the standards around it:

| Standard or convention | Why it matters |
| --- | --- |
| ABI | Defines how function calls, arguments, and return values are encoded as bytes. |
| JSON-RPC | Gives wallets, scripts, explorers, and applications a common node interface. |
| Precompiles | Expose native client code at fixed contract-like addresses. |
| EIP-1559 | Defines the modern base-fee and priority-fee model. |
| EIP-2930 access lists | Let transactions declare accounts and storage slots up front. |
| Transaction types | Allow new transaction formats without breaking older assumptions. |

[Figure: Layered EVM diagram: deterministic state transition strip, ephemeral vs persistent resources inside one transaction, and outer tooling standards ring]

This is why EVM compatibility matters. It is not just bytecode compatibility. It is compatibility with wallets, libraries, explorers, indexers, signing flows, calldata conventions, gas estimation, and developer habits.

That ecosystem is powerful, but it is also constraining. If a new chain wants EVM tooling to work, it has to respect the shapes that tooling already understands.

The Engine API

Ethereum consensus clients and execution clients communicate through the Engine API. This is not the public JSON-RPC API used by wallets. It is the local, authenticated interface through which the consensus client drives the execution client.

The flow is roughly:

| Method family | Meaning |
| --- | --- |
| engine_forkchoiceUpdated | The consensus client tells execution which block is head, safe, and finalized. |
| engine_getPayload | Consensus asks execution for a payload it has been building. |
| engine_newPayload | Consensus asks execution to validate a received execution payload. |

The important point is directional: consensus drives fork choice; execution validates and applies state transitions.
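To make the shape of that interface concrete, here is a minimal sketch of the JSON-RPC body a consensus client might send for a fork-choice update. The `headBlockHash`, `safeBlockHash`, and `finalizedBlockHash` field names come from the Engine API; the version suffix (`V3` here) depends on the fork in use, the hashes are placeholders, and the JWT authentication that real Engine API calls require is left out.

```python
import json


def placeholder_hash(byte_hex: str) -> str:
    """Build a fake 32-byte hash such as 0xaaaa...aa (illustration only)."""
    return "0x" + byte_hex * 32


def forkchoice_updated_request(head, safe, finalized, payload_attributes=None, request_id=1):
    """Assemble the JSON-RPC request; payload_attributes=None means 'do not build a payload'."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV3",
        "params": [
            {
                "headBlockHash": head,
                "safeBlockHash": safe,
                "finalizedBlockHash": finalized,
            },
            payload_attributes,
        ],
    }


request = forkchoice_updated_request(
    placeholder_hash("aa"), placeholder_hash("bb"), placeholder_hash("cc")
)
body = json.dumps(request)
```

Passing payload attributes instead of `None` is how the consensus side asks the execution client to start building a block on top of the new head.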

[Figure: Sequence diagram of the Engine API: consensus client sends forkchoiceUpdated, requests payloads, and submits newPayload to the execution client]

This distinction matters for Irys because Irys has its own consensus layer but still wants Ethereum-shaped execution. In that architecture, Irys consensus has to provide the coordination role that an Ethereum consensus client normally provides. It must tell Reth which chain head to follow, when to build payloads, and which received payloads are valid.

In short: Irys consensus has to keep Reth aligned with Irys fork choice.

Storage networks

“Storage blockchain” is an overloaded phrase. IPFS, Filecoin, Arweave, and Irys are often discussed near each other, but they make different promises.

| System | Primary promise |
| --- | --- |
| IPFS | Content addressing and peer-to-peer retrieval. |
| Filecoin | Storage markets with proofs and time-bounded deals. |
| Arweave | Pay-once permanent storage with endowment economics. |
| Irys | Storage plus programmable EVM-compatible execution. |

Those differences matter because “stored somewhere” is not precise enough — you need to know how data is addressed, who is paid to keep it, what happens if they stop, and whether contracts can compute over it natively.

IPFS is not a blockchain. It is a peer-to-peer content-addressed network: a CID tells you what bytes you want rather than where to fetch them, so any peer can serve the file and the client can verify correctness. What IPFS does not provide is a persistence guarantee — if no peer is hosting a CID, the CID remains valid but the data is unavailable.
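A minimal sketch of content addressing, assuming a plain SHA-256 hex digest as the identifier. Real CIDs use multihash and multibase encoding, and the `ContentStore` class is invented for this example; it only models the two properties described above, namely that any peer can serve verifiable bytes and that a valid identifier says nothing about availability.

```python
import hashlib


class ContentStore:
    """One peer's local blob store, keyed by the hash of the content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        content_id = hashlib.sha256(data).hexdigest()
        self._blobs[content_id] = data
        return content_id

    def get(self, content_id: str) -> bytes:
        if content_id not in self._blobs:
            raise KeyError(f"no local copy of {content_id}")
        return self._blobs[content_id]


def verify(content_id: str, data: bytes) -> bool:
    """The client re-hashes whatever a peer returned; no trust in the peer is needed."""
    return hashlib.sha256(data).hexdigest() == content_id


hosting_peer = ContentStore()
empty_peer = ContentStore()
cid = hosting_peer.put(b"hello irys")
```

`verify(cid, hosting_peer.get(cid))` succeeds, while `empty_peer.get(cid)` raises: the identifier is still valid, but nobody on that side is hosting the bytes.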

Filecoin adds an economic layer on top of that model. Clients still address data by CID; deals record which provider stores those bytes, for how long, and under what proof and collateral rules. Providers must prove ongoing storage, and failure can be penalized. Deals are time-bounded rather than permanent. The two systems were designed to stack: IPFS handles addressing and retrieval, Filecoin handles economic commitment, over the same CIDs.

Arweave takes a different approach: pay once, store permanently. An upfront fee seeds a storage endowment; storage costs are assumed to decline over time; miners are incentivized to keep historical data because access to that history is tied to block production. “Permanent” here is an economic promise — it holds while incentives and participation hold — which Arweave treats as a central design objective rather than a side assumption.

[Figure: Comparison of IPFS, Filecoin, Arweave, and Irys by persistence promise and native compute]

What Irys adds

Irys began in the Arweave ecosystem. As Bundlr, it provided faster and easier uploads into Arweave by batching transactions and supporting payments from multiple chains. Over time, the project moved toward an independent datachain: storage, consensus, and execution in one protocol.

The storage model has two tiers. Permanent storage works through an endowment: a one-time fee funds miners indefinitely through a reserve that declines as physical storage costs do — the same basic logic as Arweave. Term storage is duration-based: the fee is calculated as bytes × replicas × term length, held in the protocol endowment, and distributed to miners when the partition expires at an epoch boundary. Not every application needs permanent storage; rollup inputs, application caches, and short-lived proofs have defined lifetimes, and paying for permanence when a term suffices is economically wasteful.
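The term-storage formula can be written out as a small function. This is a sketch under stated assumptions: the text gives only the bytes × replicas × term length product, so the `price_per_byte_epoch` unit price and the choice of epochs as the term unit are hypothetical parameters added to turn the product into a fee.

```python
def term_storage_fee(size_bytes: int, replicas: int, term_epochs: int,
                     price_per_byte_epoch: int) -> int:
    """Fee = bytes x replicas x term length x unit price (unit price is an assumption)."""
    if min(size_bytes, replicas, term_epochs, price_per_byte_epoch) < 0:
        raise ValueError("fee inputs must be non-negative")
    # Integer arithmetic only: protocol fees should not depend on float rounding.
    return size_bytes * replicas * term_epochs * price_per_byte_epoch


# 1 MB, 3 replicas, 10 epochs, 2 base units per byte-epoch.
fee = term_storage_fee(1_000_000, 3, 10, 2)
```

Doubling the term or the replica count doubles the fee, which is the property that separates term storage from a flat, pay-once permanent fee.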

Both tiers share a property that cloud storage does not: reads are free at the protocol level. Cloud providers charge egress fees because retrieval flows through infrastructure they own and control; the fee recovers bandwidth costs and makes migration expensive. Decentralized storage is structurally different: data is replicated across independent miners, retrieval is not gated by any single party, and no per-read fee enters the economic model. You pay once to commit data. Reading it back is outside the fee model entirely.

The product thesis is Programmable Data.

The problem is that stored data and smart-contract execution usually live in separate worlds. Storage networks can preserve data, but contracts cannot necessarily compute over that data natively. Execution chains can run contracts, but storing large data directly in calldata or contract storage is prohibitively expensive and architecturally wrong.

Irys tries to bring those worlds together:

| Component | Role |
| --- | --- |
| Storage | Data is a protocol-level object, not an external attachment. |
| Consensus | The network agrees on ordering, validity, storage evidence, and economics. |
| EVM execution | Developers get familiar contracts, wallets, ABI, and JSON-RPC tooling. |
| Programmable Data | Contracts can read stored data through a native execution path. |

The goal is not merely “store data on-chain.” The goal is to let applications use stored data as an execution input without forcing that data through calldata or external services.

[Figure: Irys stack diagram showing permanent storage feeding EVM execution through Programmable Data, with PoW and VDF consensus above]

From a protocol-engineering perspective, this creates a difficult boundary problem. Irys consensus has to understand storage proofs, pricing, forks, VDFs, ledgers, miners, and block validity. The execution layer has to remain close enough to Ethereum that existing EVM tooling works.

That boundary is where most of my work happened.

My role

I worked on Irys as a blockchain protocol engineer for over a year, mostly at the boundary between consensus and execution.

The work touched block tree and fork choice, Reth integration, block production, validation, storage-ledger accounting, pricing, Programmable Data, mempool flow, P2P chunk movement, hardfork machinery, and test infrastructure.

This post is not a changelog. It is a technical map of the main systems I worked on and the problems they were trying to solve.

Block tree

PoW-style fork choice is naturally tree-shaped. Two miners can produce competing blocks at similar times. Different peers can observe them in different orders. The network temporarily has multiple possible histories until cumulative work identifies the heavier branch.

In Irys, that structure is not an abstract metaphor. The block tree is the node’s in-memory view of known blocks—main chain and competing forks—together with the logic that ingests discoveries, schedules validation, updates canonical markers, and tells Reth when execution may advance.

[Figure: Block tree growing left to right from genesis, with competing forks at the same height; each node pairs a consensus-layer (CL) header with an execution-layer (EL) payload]

What the block tree stores

The tree is a bounded cache of recently observed blocks. Each entry carries more than a header:

| Field on each block | Why it matters |
| --- | --- |
| Sealed block (header + body) | Validation and payload building need full block data. |
| Chain state | Whether the block is on the canonical chain, a validated fork, or still awaiting validation. |
| Parent → children links | Competing branches at the same height remain addressable. |
| Commitment / epoch / EMA snapshots | Pricing, storage ledgers, and epoch transitions are derived per parent; children inherit updated snapshots at insert time. |

The tree also maintains indices: cumulative difficulty by block hash, blocks grouped by PoA solution hash (to detect duplicate solutions deterministically), and height → hash sets for observability.

On startup, the node rebuilds the cache from the database and block index so it can resume with the same snapshot state a live node would have had at the restored tip. The block index is the durable store (MDBX): headers, transaction metadata, and height/hash lookups that survive restarts. The tree holds recent blocks in memory; migration is the step that copies a canonical block from the tree into the block index once it is deep enough on the main chain to treat as settled.
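A minimal sketch of the in-memory structure described in this section, assuming string hashes and integer difficulties. The `BlockEntry` and `BlockTree` names and fields are illustrative rather than the actual Irys types, and snapshots, bounded eviction, and migration are left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlockEntry:
    block_hash: str
    parent_hash: Optional[str]
    height: int
    cumulative_difficulty: int
    solution_hash: str
    children: set = field(default_factory=set)


class BlockTree:
    def __init__(self):
        self.blocks = {}                      # hash -> BlockEntry
        self.by_height = defaultdict(set)     # height -> hashes (forks share a height)
        self.by_solution = defaultdict(set)   # solution hash -> hashes

    def insert(self, entry: BlockEntry) -> None:
        # Check the parent before mutating anything, so a failed insert leaves no trace.
        if entry.parent_hash is not None:
            if entry.parent_hash not in self.blocks:
                raise KeyError(f"unknown parent {entry.parent_hash}")
            self.blocks[entry.parent_hash].children.add(entry.block_hash)
        self.blocks[entry.block_hash] = entry
        self.by_height[entry.height].add(entry.block_hash)
        self.by_solution[entry.solution_hash].add(entry.block_hash)

    def duplicate_solutions(self) -> dict:
        """Solution hashes claimed by more than one block."""
        return {s: h for s, h in self.by_solution.items() if len(h) > 1}


tree = BlockTree()
tree.insert(BlockEntry("g", None, 0, 1, "s0"))
tree.insert(BlockEntry("a", "g", 1, 2, "s1"))
tree.insert(BlockEntry("b", "g", 1, 2, "s1"))  # competing fork reusing a solution
```

Keeping the height and solution indices beside the parent-to-children links is what lets competing blocks at the same height stay addressable without walking the whole tree.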

Chain state versus validation state

Irys separates where a block sits in the fork from whether validation has finished:

| Chain state | Meaning |
| --- | --- |
| Onchain | Part of the canonical chain (confirmed by a descendant on the main branch). |
| Validated | Passed validation but may sit on a side fork. |
| Not onchain | Known to the tree; not yet promoted to canonical. |

| Validation state | Meaning |
| --- | --- |
| Unknown | Not yet scheduled for full validation. |
| Scheduled | Validation requested. |
| Valid | Full validation succeeded. |

A block can therefore be fully valid and still not be the tip. That is normal during parallel fork validation.

[Figure: Small block tree showing independent chain-state and validation-state labels on canonical blocks, a validated fork, a scheduled block, and an unknown gossip arrival]

Lifecycle: from discovery to execution head

The block tree is the hub for incoming blocks. The path looks like this:

  1. Pre-validated insert — After lightweight checks, a block is added with snapshots derived from its parent, marked for validation, and handed to the validation pipeline.
  2. Validation finished — On success, the block is marked valid. The tree then decides whether fork choice should move.
  3. Concurrent readers — Block production, pricing, and the migrator query canonical snapshots and fork structure without racing the writer.

[Figure: Block tree lifecycle from discovery through validation, strict cumulative-difficulty tip change, fork-choice markers, and Reth head update]

The important fork-choice rule is strict inequality: a challenger replaces the tip only when cumulative difficulty is greater than the old tip’s value, not merely equal. The tree also refuses to move the tip away from the current heaviest block when another candidate ties on work—an invariant added after a pre-launch bug where nodes could briefly canonize the wrong fork.
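The strict-inequality rule is small enough to state directly. The sketch below assumes cumulative difficulty is a plain integer per validated block; `should_replace_tip` and `select_tip` are names made up for this illustration.

```python
def should_replace_tip(current_difficulty: int, challenger_difficulty: int) -> bool:
    """A challenger wins only with strictly more cumulative work; ties keep the current tip."""
    return challenger_difficulty > current_difficulty


def select_tip(current_tip: str, validated: dict) -> str:
    """validated maps block hash -> cumulative difficulty and must include current_tip."""
    tip = current_tip
    for block_hash, difficulty in validated.items():
        if should_replace_tip(validated[tip], difficulty):
            tip = block_hash
    return tip


tied = {"old": 100, "tie": 100}
heavier = {"old": 100, "tie": 100, "heavier": 101}
```

With `tied`, `select_tip("old", tied)` stays on `"old"`. Using `>=` instead would let the result depend on arrival order, which is the kind of nondeterminism the guard exists to remove.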

Fork-choice markers and Reth

When the tip does change, the block tree derives fork-choice markers and sends engine_forkchoiceUpdated to Reth. Three pointers move together, each tied to a different lifecycle stage on the canonical chain:

| Engine API field | Irys source (conceptually) |
| --- | --- |
| head | Current canonical tip after promotion: still in the tree, may still be pruned if a deeper reorg appears. |
| confirmed (sent in the Engine API's safe slot) | Migration boundary: the newest canonical block written to the block index. Blocks above confirmed still live only in the tree until they reach block_migration_depth behind the tip. |
| finalized | Prune boundary: deep enough that the tree can drop old fork entries without losing data the node still needs. |

The distinction between head and confirmed is the distinction between the live fork-choice view and persisted chain state. Reth’s execution head tracks head; APIs, transaction-status lookups, and chunk-indexing workflows lean on the block index behind confirmed.

Reth applies execution state against that view. Irys consensus does not ask execution to guess which fork won.
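One way to picture how the three markers relate is to derive them from a canonical chain and two depths. This is a sketch, not the Irys implementation: `block_migration_depth` is the name used above, while `prune_depth` and the clamping to genesis on short chains are assumptions made for the example.

```python
def fork_choice_markers(canonical_chain: list, block_migration_depth: int,
                        prune_depth: int) -> dict:
    """canonical_chain is ordered oldest -> newest; returns the three marker hashes."""
    if not canonical_chain:
        raise ValueError("canonical chain is empty")
    tip_index = len(canonical_chain) - 1
    # Assumption: on a chain shorter than the configured depth, clamp to genesis.
    confirmed_index = max(0, tip_index - block_migration_depth)
    finalized_index = max(0, tip_index - prune_depth)
    return {
        "head": canonical_chain[tip_index],
        "confirmed": canonical_chain[confirmed_index],
        "finalized": canonical_chain[finalized_index],
    }


chain = [f"b{i}" for i in range(10)]  # b0 .. b9
markers = fork_choice_markers(chain, block_migration_depth=3, prune_depth=6)
```

Here `head` is `b9`, `confirmed` is `b6`, and `finalized` is `b3`; every new tip slides all three forward together.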

A tip change can trigger a reorg: orphaned blocks on the old branch are collected, trimmed to the fork point, and broadcast so the mempool, migrator, and other services can unwind and replay against the new canonical suffix. If the reorg reaches back past confirmed, it would require rewriting rows already committed to the block index — so reorgs deeper than the configured migration depth are treated as fatal rather than silently repaired.
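The reorg bookkeeping can be sketched as a walk back to the fork point. This is a simplified model: blocks are string hashes, `parents` maps each block to its parent (`None` for genesis), and `plan_reorg` and `FatalReorg` are hypothetical names. The fatal case fires when the fork point sits below the confirmed height, meaning a block already written to the block index would be orphaned.

```python
class FatalReorg(Exception):
    """The reorg would rewrite blocks already committed to the block index."""


def ancestors(parents: dict, tip: str) -> list:
    """Path from tip back to genesis, tip first."""
    path, cursor = [], tip
    while cursor is not None:
        path.append(cursor)
        cursor = parents[cursor]
    return path


def plan_reorg(parents, heights, old_tip, new_tip, confirmed_height):
    new_branch = ancestors(parents, new_tip)
    on_new_branch = set(new_branch)
    orphaned, cursor = [], old_tip
    while cursor is not None and cursor not in on_new_branch:
        orphaned.append(cursor)
        cursor = parents[cursor]
    if cursor is None:
        raise ValueError("branches share no common ancestor")
    fork_point = cursor
    if heights[fork_point] < confirmed_height:
        raise FatalReorg(f"fork point {fork_point} is below the confirmed boundary")
    adopted = new_branch[: new_branch.index(fork_point)]
    # Return both suffixes oldest-first so services can unwind and then replay in order.
    return fork_point, orphaned[::-1], adopted[::-1]


parents = {"g": None, "a": "g", "b1": "a", "b2": "b1", "c1": "a", "c2": "c1", "c3": "c2"}
heights = {"g": 0, "a": 1, "b1": 2, "b2": 3, "c1": 2, "c2": 3, "c3": 4}
plan = plan_reorg(parents, heights, old_tip="b2", new_tip="c3", confirmed_height=1)
```

With the confirmed boundary at height 1 the plan is `("a", ["b1", "b2"], ["c1", "c2", "c3"])`. Raising the boundary to height 2 makes the same tip change fatal.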

Two tips, one tree

Readers should hold two related pointers in mind:

| Pointer | Role |
| --- | --- |
| Heaviest known work | Used when walking the canonical-chain view and when reasoning about which branch has the most PoW behind it. |
| Canonical tip | The validated head: what block production and Reth fork choice treat as current reality. |

They usually align after validation catches up. They can diverge briefly while a heavier block is still being validated on a fork, which is why fork-choice updates are tied to the validation-finished path rather than to gossip alone.

Why this is consensus-adjacent

The block tree is not a passive cache. It is where the node chooses local reality: which blocks are canonical, which snapshots pricing and storage logic should use, when execution may advance, and when a reorg must fan out to the rest of the system.

The diagram below isolates the tie-breaking rule. The validation section covers what must pass before a block is allowed to move that pointer.

[Figure: Block tree fork-choice diagram showing equal cumulative difficulty tips, with a guard that only switches canonical head when the challenger is strictly heavier]

Reth integration

Informing Reth as the tip moves

The block tree section above is not only about caching headers. Each time the tree promotes a new canonical tip, it must tell Reth through engine_forkchoiceUpdated. Until that call lands, Reth is still executing against an older branch.

That coupling matters because most of what users treat as “the chain” is execution state: account balances, contract storage, logs, and anything a wallet or JSON-RPC endpoint reads from the node. Those surfaces do not read the block tree directly. They read Reth. When fork choice moves, the worldview those tools present moves with it.

Reth holds the mutable EVM ledger and applies payloads faithfully, but it does not independently choose between competing forks or decide which history is canonical. Irys consensus does—through validation, cumulative work, migration boundaries, and the head / confirmed / finalized markers the tree emits. Execution materializes a branch; consensus authorizes which branch is the one this node exposes outward.

That is the split the integration work had to preserve: Reth runs the EVM; the block tree decides which execution snapshot stays fixed as the worldview this node presents outward until fork choice moves again.

[Figure: Two aligned block chains: Irys block tree above with head, confirmed, and finalized markers; Reth below with matching Engine API fork-choice fields and arrows from engine_forkchoiceUpdated]

Reth is an Ethereum execution client written in Rust. It was a natural execution-layer foundation for Irys: Rust codebase, Ethereum compatibility, active upstream development, and a real EVM implementation rather than a new runtime invented from scratch.

The challenge was integration surface.

An early approach relied on a heavier fork of Reth. That can move quickly at first, because internal behavior is easy to patch. Over time, however, every upstream upgrade becomes expensive. Local changes conflict with upstream refactors. Internal APIs move. Assumptions that were convenient during initial integration become long-term maintenance costs.

The direction of the rewrite was to reduce that surface area:

| Previous shape | Target shape |
| --- | --- |
| Broad Reth fork surface | Smaller custom node and bridge crates. |
| Consensus reaching into execution internals | Explicit coordination messages and payload attributes. |
| Indirect block-production paths | Direct payload building from the Irys block producer. |
| Scattered fork-choice coordination | Block-tree-owned head, confirmed, and finalized markers. |
| Protocol effects mixed with normal transaction flow | Shadow transactions outside the public mempool. |

The rewrite encoded that split in code. Reth receives fork-choice updates and execution payloads only along paths the block tree has already validated and canonized. The block producer requests execution payloads for specific Irys parents; received blocks pass through consensus validation before their execution payloads are submitted to Reth in ways that can affect local execution state.

The goal was not clever glue code. The goal was a narrow boundary that could survive Reth upgrades and make CL/EL disagreement harder to introduce.

[Figure: Before and after Reth integration: wide fork surface versus narrow bridge with block tree owning fork choice]

Shadow transactions

Shadow transactions only make sense once you see how Irys queues work and how money moves between layers.

What a mempool is

A mempool is a node’s local waiting room for transactions that might be included in a future block. It is not canonical — only blocks on the canonical chain are — but it strongly shapes user experience, gossip, and what a block producer can select.

On Ethereum after The Merge, mempools follow the consensus/execution split described earlier. The consensus client tracks consensus-layer objects. The execution client maintains its own transaction pool for EVM transactions the payload builder may include. Each layer has its own admission rules and its own view of what is pending.

Irys follows the same structural split, but both sides are first-class protocol surfaces:

| Pool | Layer | Typical contents |
| --- | --- | --- |
| Consensus mempool | Irys consensus | Data transactions, commitment transactions (stake, pledge, unpledge, unstake), and related storage ingress |
| EVM mempool | Reth execution | Signed EVM transactions, including Programmable Data contract calls |

A storage upload and a Solidity call therefore do not necessarily enter the same queue. They are validated against different rules, selected on different paths, and only meet again when the block producer assembles an execution payload.

[Figure: Side-by-side Irys consensus mempool and Reth transaction pool with typical transaction types listed for each]

How the consensus mempool works

Pending storage and commitment transactions land here from user submission and peer gossip. The node validates before keeping anything. Pools are capped — when full, cheaper work can be dropped for better-paying work — and some transactions wait on earlier ones (for example, a pledge until the account has staked).

Confirmed blocks update what is still eligible, trigger storage proof work, and prune aged transactions. Reorgs can put orphaned work back through validation. On restart, pending work can be reloaded from disk.

Block production reads this pool directly when choosing what to include, and re-checks fees and balances for the block being built. EVM transactions and protocol balance updates at payload build time use separate paths.
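The capped-pool behaviour described above (drop cheaper work for better-paying work once the pool is full) can be sketched as follows. `CappedPool` is a hypothetical name, fees are plain integers, and the dependency handling (for example a pledge waiting on a stake) is not modeled.

```python
class CappedPool:
    """Fixed-capacity pending pool that evicts the cheapest entry for a better-paying one."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._fees = {}  # tx id -> fee

    def try_insert(self, tx_id: str, fee: int) -> bool:
        if tx_id in self._fees:
            return False  # already pending
        if len(self._fees) < self.capacity:
            self._fees[tx_id] = fee
            return True
        cheapest = min(self._fees, key=self._fees.get)
        if fee <= self._fees[cheapest]:
            return False  # not better than anything we already hold
        del self._fees[cheapest]
        self._fees[tx_id] = fee
        return True

    def pending(self) -> set:
        return set(self._fees)


pool = CappedPool(capacity=2)
results = [
    pool.try_insert("a", 5),
    pool.try_insert("b", 3),
    pool.try_insert("c", 4),  # evicts "b"
    pool.try_insert("d", 1),  # rejected: cheaper than everything pending
]
```

A linear scan for the cheapest entry is fine for a sketch; a real pool would more likely keep a fee-ordered index.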

[Figure: Consensus mempool service and state with user, peer, and chain inputs and block building output]

Shared funds across layers

On many post-Merge chains, consensus and execution keep separate economic ledgers. Validator balances live on the beacon chain; application balances live on the EVM. Moving value between them is explicit, infrequent, and treated as a special case.

Irys does not treat storage economics that way. Stakes, pledges, storage fees, treasury balances, and epoch payouts share one protocol balance model that must stay consistent in consensus accounting and visible in EVM state.

The practical consequence is direct. When you pay for term storage or pledge capacity through a consensus-layer transaction, the same account’s spendable balance on the execution side must reflect that commitment. When epoch boundaries trigger ledger expiry payouts, miner rewards, or fee distribution, wallets and contracts must see the results as EVM balance changes — not only as fields in a consensus-only ledger.

Storage participation is therefore not a sidecar to the EVM. It is part of the same monetary reality contracts and users observe.

Protocol effects outside the pools

Some state transitions are not user-signed transactions at all, but they still need to be reflected in EVM state.

Examples include:

  • block rewards
  • storage fee distribution
  • commitment fees
  • ledger expiry payouts
  • permanent fee refunds
  • treasury updates
  • Programmable Data fee updates

These effects are consequences of consensus block processing. A user does not sign them, but contracts and accounts still need to observe the resulting state.

Routing them through either public mempool would blur two separate domains: user-submitted work waiting for inclusion, and protocol-generated accounting that follows deterministically from the parent block. Instead, Irys uses shadow transactions: protocol-generated execution effects attached during payload construction, outside both the consensus mempool and Reth’s transaction pool.

The critical invariant is that shadow transactions must fail closed.

If consensus expects a protocol payment to happen and execution silently fails to apply it, the system can continue while consensus accounting and EVM state diverge. That is one of the most dangerous failure modes in a blockchain client: not a crash, but a silent split in the node’s internal model.

So the rule became straightforward: if a shadow transaction cannot be applied, block production must fail. In the healthy case, this is invisible. In the unhealthy case, it prevents the node from producing a block whose consensus effects and execution state disagree.
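A minimal sketch of the fail-closed rule, assuming shadow transactions are simple credit or debit tuples against a balance map. The real shadow transaction types and the Reth payload builder are far richer; `build_payload` and `ShadowTxError` are invented names.

```python
class ShadowTxError(Exception):
    """A protocol-generated effect could not be applied to execution state."""


def apply_shadow_tx(balances: dict, tx: tuple) -> None:
    kind, account, amount = tx
    if kind == "credit":
        balances[account] = balances.get(account, 0) + amount
    elif kind == "debit":
        if balances.get(account, 0) < amount:
            raise ShadowTxError(f"{account} cannot cover debit of {amount}")
        balances[account] -= amount
    else:
        raise ShadowTxError(f"unknown shadow tx kind: {kind}")


def build_payload(balances: dict, user_txs: list, shadow_txs: list) -> dict:
    working = dict(balances)  # never mutate the caller's state on a failed build
    for tx in shadow_txs:
        # No try/except here on purpose: any failure aborts the whole build.
        apply_shadow_tx(working, tx)
    return {"shadow_txs": list(shadow_txs), "user_txs": list(user_txs),
            "post_balances": working}


balances = {"treasury": 10}
payload = build_payload(balances, [], [("debit", "treasury", 4), ("credit", "miner", 4)])
```

The important part is what is missing: there is no branch that logs the error and carries on with the remaining transactions.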

[Figure: Irys CL generates unsigned shadow txs from protocol accounting; block N execution payload merges user txs from Reth pool and shadow txs; Reth applies balance changes with fail-closed guard]

Block validation

Block validation is the point where every subsystem has to agree.

A received block is not valid merely because it has a well-formed header. It must satisfy consensus rules, storage rules, economic rules, execution rules, timing rules, and fork-choice constraints.

The validation path has to answer questions like:

| Question | Why it matters |
| --- | --- |
| Is the parent known and valid? | A child block is only meaningful relative to a valid parent. |
| Does the block satisfy the VDF path? | Sequential delay is part of consensus timing. |
| Are storage proofs attached and valid? | Storage commitments cannot be separate from block validity. |
| Are data and commitment transactions ordered correctly? | Economic flows depend on deterministic ordering. |
| Does the Reth payload match the block? | Execution state must correspond to the consensus block. |
| Do shadow transactions match expected protocol effects? | Protocol accounting must agree with EVM state. |
| Are PD chunks available and correctly declared? | Contracts cannot read data the protocol cannot supply. |

One important change was delaying Reth payload submission until after consensus checks had passed. If a block is invalid at the consensus layer, the execution layer should not apply side effects from that block first and discover the rejection later.

Another important area was validation scheduling and cancellation. The ValidationService does not validate blocks one at a time in a single loop. Incoming ValidateBlock messages enter a priority queue for VDF work. Only one VDF task runs at a time on a rayon thread pool, but that slot is preemptible: if a higher-priority block arrives (or a reorg changes priorities), the running task is cancelled and requeued. Priority is CanonicalExtension > Canonical > Fork > Unknown, then lower height and fewer VDF steps within each tier. Reorgs trigger reevaluate_priorities on both the running and pending VDF tasks.
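The single preemptible VDF slot can be modeled with a heap and a comparison key. This is a sketch, not the ValidationService itself: the four tier names are one reading of the priority list above, `VdfScheduler` is a made-up class, and "cancel and requeue" is reduced to pushing the running item back onto the heap.

```python
import heapq
from enum import IntEnum


class Tier(IntEnum):
    # Assumed tier names and order; a lower value means higher priority.
    CANONICAL_EXTENSION = 0
    CANONICAL = 1
    FORK = 2
    UNKNOWN = 3


def priority_key(tier: Tier, height: int, vdf_steps: int) -> tuple:
    """Tier first, then lower height, then fewer VDF steps."""
    return (int(tier), height, vdf_steps)


class VdfScheduler:
    def __init__(self):
        self.pending = []    # heap of (key, block_hash)
        self.running = None  # the single active VDF task, if any

    def submit(self, block_hash: str, tier: Tier, height: int, vdf_steps: int) -> None:
        item = (priority_key(tier, height, vdf_steps), block_hash)
        if self.running is None:
            self.running = item
        elif item < self.running:
            heapq.heappush(self.pending, self.running)  # preempt: cancel and requeue
            self.running = item
        else:
            heapq.heappush(self.pending, item)

    def finish_running(self):
        """Mark the active task done and promote the best pending one."""
        done = self.running
        self.running = heapq.heappop(self.pending) if self.pending else None
        return None if done is None else done[1]


scheduler = VdfScheduler()
scheduler.submit("fork-block", Tier.FORK, height=5, vdf_steps=10)
scheduler.submit("tip-extension", Tier.CANONICAL_EXTENSION, height=6, vdf_steps=10)
scheduler.submit("orphan", Tier.UNKNOWN, height=2, vdf_steps=1)
```

After these three submissions the canonical extension holds the slot even though it arrived second, and the fork block is back in the queue ahead of the orphan.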

After VDF passes, the block moves into a concurrent JoinSet, where many blocks can be in flight at once. Each runs six parallel consensus checks—recall range, proof-of-access, seeds, commitment ordering, data-transaction fees, and shadow-transaction validation—before waiting for its parent to be validated and only then calling submit_payload_to_reth. Invalid blocks never reach Reth; cancelled work can be requeued (with caps on repeated concurrent cancellations).
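The post-VDF stage has a simple shape: run the checks concurrently, stop if any fails, then wait for the parent before touching execution. The sketch below uses Python's asyncio as a stand-in for the Rust JoinSet. The check bodies are stubs, the `failing_checks` field is a test hook, and appending to a `submitted` list stands in for the call to submit_payload_to_reth.

```python
import asyncio

CHECKS = ["recall_range", "proof_of_access", "seeds",
          "commitment_ordering", "data_tx_fees", "shadow_txs"]


async def run_check(name: str, block: dict) -> bool:
    await asyncio.sleep(0)  # yield, as a real check would while doing I/O or CPU work
    return name not in block.get("failing_checks", ())


async def validate_block(block: dict, parent_valid: asyncio.Event, submitted: list) -> bool:
    results = await asyncio.gather(*(run_check(name, block) for name in CHECKS))
    if not all(results):
        return False             # invalid blocks never reach the execution layer
    await parent_valid.wait()    # only submit once the parent is known to be valid
    submitted.append(block["hash"])
    return True


async def demo():
    parent_valid = asyncio.Event()
    submitted = []
    good = asyncio.create_task(validate_block({"hash": "good"}, parent_valid, submitted))
    bad = asyncio.create_task(
        validate_block({"hash": "bad", "failing_checks": {"seeds"}}, parent_valid, submitted))
    await asyncio.sleep(0.01)
    submitted_before_parent = list(submitted)  # nothing may be submitted yet
    parent_valid.set()
    outcomes = await asyncio.gather(good, bad)
    return submitted_before_parent, outcomes, submitted
```

Running `asyncio.run(demo())` shows the ordering guarantee: nothing is submitted while the parent is unvalidated, and only the block that passed all six checks is submitted afterwards.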

Validation therefore became more event-driven, explicit about active work, and careful about reporting results back into the block tree.

[Figure: Shared VDF stage with priority queue, then three parallel copies of the same consensus check pipeline ending in Reth newPayload only when valid]

The wrong-fork bug

The most memorable block-tree issue: some production nodes would occasionally select the wrong fork as canonical.

The damage was worse than an immediate crash. A node could accept a tip that had already diverged from the network, keep mining on that local fork, and migrate those blocks into the block index, the durable store that does not admit reorgs past its confirmed boundary. Peers would continue advancing the real chain while the node committed to the wrong history. Often, roughly ten minutes after the initial mistake, the node could no longer keep pace with the network and would exit. When several nodes made the same wrong choice at once, they could extend the same divergent fork in parallel before anyone saw the split.

Existing tests passed, and local reproductions did not initially show the issue. The useful step was to stop searching only through logs and build a harness that could exercise the conditions around the bug:

  • multiple local nodes
  • restart schedules
  • laggy peers
  • racing forks
  • storage expiry edges
  • startup recovery paths
  • assertions over protocol state

Once those scenarios were cheap to run, the bug became much smaller. The block tree could replace the current tip without checking that the challenger had strictly more cumulative work.

The fix was a guard. The work was the harness that made the missing guard visible.

That pattern repeated often: the implementation bug was small, but the system needed better machinery to make the state space observable.

Programmable Data

Programmable Data is the execution feature that connects Irys storage to the EVM.

The intended flow is:

  1. Data is stored in Irys.
  2. An EVM transaction declares which stored data it intends to read.
  3. A contract calls a precompile.
  4. The precompile returns bytes from stored chunks.
  5. The contract uses those bytes during normal execution.

This avoids putting large data directly into calldata and avoids outsourcing reads to an external oracle or backend service. Stored data becomes an execution input.

[Figure: Programmable Data flow from stored chunks through access-list declaration to precompile read and contract use]

Addresses, contracts, and precompiles

Every account on the chain is keyed by a 20-byte address. When one contract calls another, the EVM does not jump to a function name — it looks up what lives at the callee address and dispatches from there.

A deployed contract account stores bytecode. A CALL runs that bytecode in the interpreter: opcodes, gas metering, memory, and storage updates all follow the usual EVM rules.

A precompile is different. The node registers native implementation code at a fixed 20-byte address, the same width as any other account. From Solidity the call still looks like address.call(...), but execution never enters bytecode. The client handles the request directly, as with ecrecover at 0x01 or SHA-256 at 0x02 on Ethereum. That is why precompiles are sometimes described as built-in contracts: they occupy a slot in the address map and accept calls like one, but they run outside the interpreter layer.

Irys assigns Programmable Data to 0x500, a custom address in that same map. The PD precompile is native node code, not deployed bytecode.
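The dispatch rule can be sketched as a lookup that runs before the interpreter. The following Python is illustrative only: `dispatch_call`, `PRECOMPILES`, and the stubbed PD handler are hypothetical names, and the interpreter path is reduced to a label so the sketch stays self-contained.

```python
import hashlib
from typing import Callable, Dict, Tuple

SHA256_ADDRESS = 0x02  # Ethereum's SHA-256 precompile
PD_ADDRESS = 0x500     # Irys Programmable Data precompile, per the text


def sha256_native(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def pd_native_stub(data: bytes) -> bytes:
    # Placeholder only: the real precompile resolves declared chunk reads.
    return b"pd-bytes"


# Hypothetical registry: address -> native handler living inside the client.
PRECOMPILES: Dict[int, Callable[[bytes], bytes]] = {
    SHA256_ADDRESS: sha256_native,
    PD_ADDRESS: pd_native_stub,
}


def dispatch_call(address: int, data: bytes,
                  deployed_code: Dict[int, bytes]) -> Tuple[str, bytes]:
    """Return which path handled the call and the bytes it produced."""
    native = PRECOMPILES.get(address)
    if native is not None:
        # Precompile: native code runs, no bytecode is interpreted.
        return "native", native(data)
    # Contract account: a real client would interpret deployed_code[address]
    # here. The interpreter is out of scope, so the sketch returns no output.
    _code = deployed_code.get(address, b"")
    return "bytecode", b""
```

Both paths are reached through the same call shape, which is why a Solidity caller does not need to know which one it hit.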

EVM address map showing low precompile slots, Irys PD at 0x500, and a deployed contract address

The initial design direction used a custom transaction prefix for PD metadata and fees. That was convenient at the protocol boundary, but inconvenient for the EVM ecosystem. Wallets, ABI encoders, explorers, and libraries expect calldata to begin with a function selector followed by ABI-encoded arguments. A custom prefix would force every tool to understand an Irys-specific dialect.

The better design was to preserve the normal contract-call shape:

| Need | Design |
| --- | --- |
| Declare chunk reads | EIP-2930 access-list entries. |
| Include fee metadata | Typed access-list keys under the PD precompile address. |
| Read data from Solidity | readData / readBytes. |
| Execute efficiently | Native precompile at 0x500. |
| Keep large reads economically viable | Custom gas handling for PD-originated return data. |

The transaction access list declares the chunks and byte ranges the contract intends to read. The contract then calls the PD precompile at 0x500. The precompile checks that the requested read was declared, resolves the bytes through the PD context, and returns ABI-encoded data.
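A minimal Python sketch of the "declared before read" check follows. The 32-byte key layout, the `CHUNK_READ_TAG` value, and treating the `readData` argument as an index into the declared reads are all assumptions made for illustration; the real Irys encoding is not reproduced here, and ABI encoding of the return value is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

PD_ADDRESS = 0x500
CHUNK_READ_TAG = 0x01  # hypothetical type tag, not the real Irys encoding

# EIP-2930 shape: a list of (address, [32-byte keys]).
AccessList = List[Tuple[int, List[bytes]]]


@dataclass(frozen=True)
class ChunkRead:
    chunk_index: int
    offset: int
    length: int


def encode_read(read: ChunkRead) -> bytes:
    # Hypothetical layout: tag(1) | chunk index(8) | offset(4) | length(4) | padding
    raw = (bytes([CHUNK_READ_TAG])
           + read.chunk_index.to_bytes(8, "big")
           + read.offset.to_bytes(4, "big")
           + read.length.to_bytes(4, "big"))
    return raw.ljust(32, b"\x00")


def declared_reads(access_list: AccessList) -> List[ChunkRead]:
    reads = []
    for address, keys in access_list:
        if address != PD_ADDRESS:
            continue  # only keys under the PD precompile address count
        for key in keys:
            if len(key) == 32 and key[0] == CHUNK_READ_TAG:
                reads.append(ChunkRead(
                    int.from_bytes(key[1:9], "big"),
                    int.from_bytes(key[9:13], "big"),
                    int.from_bytes(key[13:17], "big"),
                ))
    return reads


def read_data(index: int, access_list: AccessList,
              chunks: Dict[int, bytes]) -> bytes:
    reads = declared_reads(access_list)
    if not 0 <= index < len(reads):
        raise ValueError("read was not declared in the access list")
    read = reads[index]
    chunk = chunks[read.chunk_index]
    if read.offset + read.length > len(chunk):
        raise ValueError("declared range exceeds chunk length")
    return chunk[read.offset:read.offset + read.length]
```

Because the declaration travels in the transaction itself, every node can decide which chunks it needs before execution starts.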

From the Solidity side, the intended experience is ordinary:

bytes memory data = IrysPD.readData(0);

The complexity behind that line is substantial: chunk geometry, access-list parsing, fee caps, priority fees, per-block chunk budgets, hardfork gating, P2P chunk fetching, cache lifetimes, precompile gas rules, and EVM memory behavior for large return data.

The purpose of the design is to keep that complexity behind a standard-looking interface.

PD chunk movement

Programmable Data introduces a distributed-systems problem: a validator executing a block may not already have the chunks referenced by a PD transaction.

Rejecting the block simply because the local node does not have the chunk is not viable. At the same time, the node cannot execute a contract read against data it cannot verify.

The PD provisioning path addresses that gap:

  1. A PD transaction enters the EVM mempool.
  2. The node parses the access list and identifies required chunks.
  3. PdService checks local storage and cache.
  4. Missing chunks can be fetched from peers.
  5. Fetched chunks are verified before use.
  6. Validation and block building wait on readiness where required.
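The steps above can be sketched as a small readiness check. This Python is an illustration under stated assumptions: `PdProvisioner` is a hypothetical stand-in for PdService, peers are modelled as plain callables, and chunks are verified by comparing a SHA-256 digest to the requested id, which substitutes for whatever proof the real protocol checks.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Optional

# A peer is modelled as: chunk id -> bytes, or None if it has nothing to offer.
Fetch = Callable[[str], Optional[bytes]]


def chunk_id(data: bytes) -> str:
    # Assumption for the sketch: a chunk is identified by its SHA-256 digest.
    return hashlib.sha256(data).hexdigest()


class PdProvisioner:
    """Hypothetical stand-in for the provisioning service described above."""

    def __init__(self, local: Dict[str, bytes], peers: Iterable[Fetch]):
        self.local = local
        self.cache: Dict[str, bytes] = {}
        self.peers: List[Fetch] = list(peers)

    def ensure(self, wanted: str) -> bool:
        # Step 3: local storage and cache first.
        if wanted in self.local or wanted in self.cache:
            return True
        # Step 4: otherwise ask peers one at a time.
        for fetch in self.peers:
            data = fetch(wanted)
            # Step 5: never cache bytes that fail verification.
            if data is not None and chunk_id(data) == wanted:
                self.cache[wanted] = data
                return True
        return False

    def ready(self, required: Iterable[str]) -> bool:
        # Step 6: a transaction is execution-ready only if every chunk is.
        return all(self.ensure(c) for c in required)
```

A peer that returns the wrong bytes is simply skipped, so a single dishonest responder cannot make the node execute against unverified data.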

A pull-only model can create a thundering-herd problem. When a block containing PD transactions propagates, many validators may request the same chunks at the same time.

To reduce that pressure, later work added optimistic push: nodes can distribute referenced chunks earlier, around mempool time, so peers may already have the needed data before validation reaches the critical path.

This is not merely an optimization. Chunk availability affects validation latency, and validation latency affects consensus health.

PD chunk provisioning at mempool time with local cache, peer fetch, verify, and optional optimistic push before validation

Pricing and fees

Fees are easy to treat as a wallet concern: estimate a price, submit a transaction, pay if it lands. On Irys they were also a consensus concern.

The same storage payment can pass through the public quote API, mempool admission, block production, block validation, treasury accounting, and EVM-visible balance updates. If those boundaries disagree, the failure is not cosmetic. A node can reject a transaction another node would keep, build a block another validator rejects, or update treasury state without the execution layer seeing the same balance movement.

The invariant was simple to state: one fee rule must describe the same transaction at every boundary. The source of truth for that rule is the block tree, not a live pricing service.

Public API and gossip feeding the mempool and block tree, while pricing checks read canonical block history and can replay stored pricing snapshots after a reorg

The product reason for this machinery was stable storage pricing. Most chains quote storage or gas in a native token whose dollar price moves on markets. Irys prices storage in USD — cost per GB, decay, replicas, pledges — and converts to IRYS only at payment time. A user can reason about “about this many dollars for this much data” without mentally repricing storage every time the token moves.

That does not mean every node can ask an oracle whenever it wants. A live price lookup would make fees depend on off-chain timing. Instead, the chain turns market price into chain history: nodes poll external feeds, block producers write oracle-fed IRYS/USD values into block headers, and the protocol derives a smoothed EMA from those headers. If the canonical tip changes during a reorg, the node can replay the new branch with the pricing snapshots stored in that branch. Consensus config defines the USD cost model; oracles only supply the conversion rate.

Public prices intentionally lag. Quotes use the EMA from two intervals ago, so a transaction submitted near an interval boundary still has time to propagate and validate after the rate moves. The smoothing applies to the exchange rate, not to bytes, replicas, term length, or demand.
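A small Python sketch of the smoothing and the two-interval lag is shown here. The smoothing factor, the function names, the `usd_per_irys` direction of the rate, and the fallback for very short histories are assumptions for illustration. Exact fractions stand in for whatever fixed-point arithmetic the chain uses.

```python
from fractions import Fraction
from typing import List


def ema_series(interval_rates: List[Fraction], alpha: Fraction) -> List[Fraction]:
    """One smoothed rate per pricing interval, derived from header values.

    alpha is a hypothetical smoothing factor in (0, 1].
    """
    if not interval_rates:
        return []
    emas = [interval_rates[0]]
    for rate in interval_rates[1:]:
        emas.append(alpha * rate + (1 - alpha) * emas[-1])
    return emas


def quote_rate(emas: List[Fraction], lag_intervals: int = 2) -> Fraction:
    # Public quotes read the EMA from `lag_intervals` intervals ago.
    # Assumption: with too little history, fall back to the oldest value.
    if len(emas) <= lag_intervals:
        return emas[0]
    return emas[-1 - lag_intervals]


def storage_fee_irys(usd_cost: Fraction, usd_per_irys: Fraction) -> Fraction:
    # The USD cost comes from consensus config; only the conversion uses the EMA.
    return usd_cost / usd_per_irys


rates = [Fraction(2), Fraction(4), Fraction(8), Fraction(16)]
emas = ema_series(rates, Fraction(1, 2))
print(emas[-1], quote_rate(emas))  # 43/4 3
```

In the example the newest EMA is 43/4, but the quote still uses 3, the value from two intervals earlier, which is the propagation slack the paragraph describes.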

Timeline showing the tip receiving current oracle data while public fee quotes read the EMA from two intervals earlier

After that, every subsystem reads from the canonical chain view rather than inventing its own price. /v1/price/* reads the canonical tip’s EmaSnapshot, not a live oracle per request. Publish-storage quotes use the same helpers as validation. Commitment quotes also account for pending mempool pledges. PD fee history walks committed blocks and Reth execution state.

The checks get stricter as a transaction moves closer to a block. API ingress can reject data transactions whose fees, balance, or fee split are obviously wrong. Gossip stays lighter and mostly structural, because peers should not reject useful data just because their local price view is slightly behind. Block selection and validation then re-check against the parent block’s EMA, which is the price context the new block actually extends.
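One way to picture this is a single fee function that every stage calls, with each stage choosing how much price context it trusts. The Python below is a sketch only: the `USD_PER_GB` constant, the `DataTx` shape, and the three stage functions are hypothetical and much simpler than the real checks.

```python
from dataclasses import dataclass
from fractions import Fraction

USD_PER_GB = Fraction(5, 2)  # hypothetical cost model constant
GB = 10**9


@dataclass(frozen=True)
class DataTx:
    size_bytes: int
    fee_irys: Fraction


def required_fee(size_bytes: int, usd_per_irys: Fraction) -> Fraction:
    # The one fee rule; every stage below calls this and nothing else.
    return USD_PER_GB * size_bytes / GB / usd_per_irys


def api_accepts(tx: DataTx, quote_rate: Fraction) -> bool:
    # Ingress: reject fees that are clearly wrong against the quoted rate.
    return tx.size_bytes > 0 and tx.fee_irys >= required_fee(tx.size_bytes, quote_rate)


def gossip_accepts(tx: DataTx) -> bool:
    # Gossip: structural only, since a peer's price view may lag slightly.
    return tx.size_bytes > 0 and tx.fee_irys > 0


def validator_accepts(tx: DataTx, parent_ema: Fraction) -> bool:
    # Validation: price context is the parent block the new block extends.
    return tx.fee_irys >= required_fee(tx.size_bytes, parent_ema)
```

The stages differ in which rate they pass and how strictly they reject, not in how the fee is computed.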

Programmable Data followed the same principle through a different path. The Reth pool rejected shadow transactions and zero-fee PD transactions, while minimum cost and base-fee caps were enforced during EVM execution through pseudo-accounts updated by shadow transactions. There was no special “stale price” state. If the canonical fee context moved, an underpriced transaction was skipped or rejected at the boundary that could prove it.

Split columns: consensus data tx stages from API through validation versus PD Reth pool stages through EVM execution

Treasury was the other half of the same problem. Storage fees, commitment locks, expiry payouts, refunds, and PD pricing updates are protocol effects, but users and contracts observe balances through the EVM. Pre-Sprite, treasury was primarily a header field. Post-Sprite, it was also mirrored through TREASURY_ACCOUNT, an EVM pseudo-account. The shadow pipeline made those consensus effects visible to execution, and validation rebuilt the expected shadow list against Reth.
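The consistency check can be sketched as "rebuild, apply, compare". In this Python illustration, the effect tuples, the `ShadowTx` fields, and the string used for `TREASURY_ACCOUNT` are hypothetical; only the idea that validation rebuilds the expected shadow list and compares both treasury views comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

TREASURY_ACCOUNT = "treasury"  # stand-in for the pseudo-account address

Effect = Tuple[str, str, int]  # (kind, user account, amount), hypothetical shape


@dataclass(frozen=True)
class ShadowTx:
    kind: str
    account: str
    amount: int


def build_shadow_txs(effects: Sequence[Effect]) -> List[ShadowTx]:
    # Deterministic: the same protocol effects always yield the same list.
    return [ShadowTx(kind, account, amount) for kind, account, amount in effects]


def apply_shadow_txs(balances: Dict[str, int],
                     txs: Sequence[ShadowTx]) -> Dict[str, int]:
    post = dict(balances)  # never mutate the parent state
    for tx in txs:
        if tx.kind == "refund":
            source, dest = TREASURY_ACCOUNT, tx.account
        else:
            source, dest = tx.account, TREASURY_ACCOUNT
        if post.get(source, 0) < tx.amount:
            raise ValueError("insufficient balance for shadow transaction")
        post[source] = post.get(source, 0) - tx.amount
        post[dest] = post.get(dest, 0) + tx.amount
    return post


def validate_block(header_treasury: int, block_shadow_txs: Sequence[ShadowTx],
                   effects: Sequence[Effect],
                   parent_balances: Dict[str, int]) -> bool:
    expected = build_shadow_txs(effects)
    if list(block_shadow_txs) != expected:
        return False  # producer's shadow list disagrees with consensus effects
    post = apply_shadow_txs(parent_balances, expected)
    # Header field and EVM-visible balance must describe the same treasury.
    return post.get(TREASURY_ACCOUNT, 0) == header_treasury
```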

I built this pricing layer end to end: oracle and EMA plumbing, quote paths, mempool checks, block validation, PD base-fee logic, and treasury consistency across consensus and execution.

Horizontal pipeline from block producer through shadow tx phases to header treasury and EVM treasury account, then validation

Hardfork machinery

Protocol changes need precise activation boundaries. This is what hardforks are for.

A hardfork is a backwards-incompatible consensus change. It changes the rules that define whether a block is valid. After the fork, a node still running the previous rule set cannot fully validate the new chain, because the new chain may contain blocks that are invalid under the old rules.

That makes hardforks different from ordinary software releases. A release can improve performance, fix a bug, or change an API without changing the meaning of the blockchain. A hardfork changes protocol state itself: what transactions are allowed, how fields are interpreted, which fees are charged, which accounts are updated, or which proofs are accepted.

Before an activation point, old rules apply. After it, new rules apply. Every subsystem that cares about the rule must agree on the same boundary: mempool admission, block production, validation, execution behavior, APIs, tests, and restart paths.

Two upgraded nodes building blue pre-fork blocks, crossing an activation boundary, then building green post-fork blocks under the new consensus rules

Irys used hardfork machinery for features such as Programmable Data activation, reward-address behavior, ingress proof changes, and other consensus-parameter updates. I was responsible for building the mechanism that enabled those forks, keeping the activation logic maintainable, and making sure new protocol versions could be introduced without turning every subsystem into its own little fork scheduler.

The difficult part is rarely the conditional itself. The difficult part is ensuring the same activation condition is applied everywhere the rule matters. A transaction that is valid at height N + 1 might be invalid at height N. A block producer, validator, mempool, and execution layer all need to answer that question from the same protocol view.
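A minimal Python sketch of "one activation answer for everyone" follows. The fork name, the activation height, and the two caller functions are hypothetical. The schedule is keyed by height only because the text reasons in heights; a timestamp-based schedule would have the same shape.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ForkSchedule:
    activation_heights: Dict[str, int]

    def is_active(self, fork: str, height: int) -> bool:
        # Unknown forks are never active; known forks activate at their height.
        activation = self.activation_heights.get(fork)
        return activation is not None and height >= activation


# Hypothetical schedule shared by every subsystem.
SCHEDULE = ForkSchedule({"programmable_data": 1_000})


def mempool_admits_pd(next_block_height: int) -> bool:
    return SCHEDULE.is_active("programmable_data", next_block_height)


def validator_accepts_pd(block_height: int) -> bool:
    return SCHEDULE.is_active("programmable_data", block_height)
```

Both callers ask the same question of the same object, so there is no second copy of the boundary to drift out of sync.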

Without that consistency, a hardfork becomes a scheduled disagreement.

P2P and gossip

Blocks and transactions move through gossip under imperfect conditions: peers disconnect, data arrives out of order, duplicate messages appear, restarted nodes re-announce state, and payload sizes vary.

Storage chains make this more complicated because consensus objects and data objects are related but not identical.

A block header is not enough if required storage proof chunks are missing. A PD transaction may be syntactically valid but not yet execution-ready because its referenced chunks are unavailable. Large chunk traffic needs different backpressure than normal transaction gossip.

This work touched bounded fanout, chunk fetching, optimistic chunk push, restart behavior, duplicate proof handling, and the distinction between propagating transaction metadata and propagating data needed for validation.
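Two of those ideas, duplicate suppression and bounded fanout, fit in a few lines. This Python sketch is generic gossip logic under assumed names (`Gossiper`, `on_message`), not the Irys implementation. The seeded random generator is there only to keep the example reproducible.

```python
import random
from typing import Iterable, List, Set


class Gossiper:
    """Hypothetical relay policy: forward each message once, to a few peers."""

    def __init__(self, peers: Iterable[str], max_fanout: int, seed: int = 0):
        self.peers: List[str] = list(peers)
        self.max_fanout = max_fanout
        self.seen: Set[str] = set()
        self.rng = random.Random(seed)

    def on_message(self, message_id: str, sender: str) -> List[str]:
        # Duplicate suppression: re-announcements are dropped, not re-relayed.
        if message_id in self.seen:
            return []
        self.seen.add(message_id)
        # Bounded fanout: never send back to the sender, never exceed the cap.
        candidates = [p for p in self.peers if p != sender]
        k = min(self.max_fanout, len(candidates))
        return self.rng.sample(candidates, k)
```

A cap like this bounds outbound traffic per message, which matters more for chunk payloads than for small transaction headers.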

The recurring requirement was to avoid believing incomplete information too early while still moving data quickly enough for block production and validation.

Testing and observability

The more cross-layer the system becomes, the less sufficient isolated unit tests are.

Unit tests are still valuable for fee math, parsers, ABI behavior, access-list decoding, and small invariants. But fork-choice and validation bugs often span:

  • node startup
  • gossip timing
  • block production
  • validation queues
  • database persistence
  • reorgs
  • restarts
  • missing chunks
  • execution payload timing
  • fork-choice updates

The test harness became important because it made those interactions observable. It allowed local multi-node scenarios, controlled delays, restarts, competing forks, and assertions over protocol state.
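A toy version of such a scenario is sketched below in Python. `ToyNode` and `run_scenario` are hypothetical and far simpler than a real multi-node harness, but they show the shape: deliver blocks to each node in a chosen order, then assert over the resulting tips.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    block_hash: str
    cumulative_difficulty: int


class ToyNode:
    def __init__(self, genesis: Block):
        self.tip = genesis

    def receive(self, block: Block) -> None:
        # Same strict rule on every node: equal work never displaces the tip.
        if block.cumulative_difficulty > self.tip.cumulative_difficulty:
            self.tip = block


def run_scenario(deliveries: List[List[Block]], genesis: Block) -> List[str]:
    """Give each node its own delivery order and return the final tips."""
    nodes = [ToyNode(genesis) for _ in deliveries]
    for node, blocks in zip(nodes, deliveries):
        for block in blocks:
            node.receive(block)
    return [node.tip.block_hash for node in nodes]


genesis = Block("g", 0)
a, b, c = Block("a", 10), Block("b", 10), Block("c", 11)

# Racing forks of equal weight: each node keeps the one it saw first.
print(run_scenario([[a, b], [b, a]], genesis))        # ['a', 'b']
# A strictly heavier block arrives: both nodes converge.
print(run_scenario([[a, b, c], [b, a, c]], genesis))  # ['c', 'c']
```

The useful part is the assertion style: the scenario states which tip each node should hold, instead of leaving the answer to be reconstructed from logs.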

The practical lesson was that many consensus bugs are not hard because the final fix is large. They are hard because the system does not yet make the failure state easy to reproduce.

The common thread

These areas can sound separate:

  • block tree
  • Reth integration
  • shadow transactions
  • block validation
  • Programmable Data
  • pricing and fees
  • mempool flow
  • P2P chunk movement
  • hardfork activation
  • test infrastructure

The common thread is agreement.

Consensus must agree with execution. Storage availability must agree with declared reads. Pricing must agree across quote, mempool, producer, validator, and treasury. Gossip must move enough information for validation without treating incomplete state as final. The database must recover the same view after restart. Tests must exercise the situations where those views can diverge.

The dangerous failures are often not the ones that crash immediately. They are the ones where the node continues running while two subsystems have quietly stopped describing the same chain.

Stop here

Irys was not only a storage project, and my work was not only an EVM integration. The interesting part was the boundary between storage, consensus, and execution.

Once stored data becomes programmable, many systems become consensus-adjacent: fee calculation, chunk movement, mempool readiness, execution payload timing, fork choice, restart recovery, and test infrastructure.

The visible feature is a contract reading stored data.

The protocol work is making every layer agree that the read, the payment, the block, and the resulting state all describe the same world.

The code is open source at Irys-xyz/irys, so if you want to go deeper, you can read the implementation directly, contribute to it, and fix any bugs I may have left behind.

Glossary -- terms used above
Key concepts
Consensus layer

The part of a blockchain responsible for agreeing on canonical history: fork choice, block validity, timing, finality, and who gets to extend the chain.

Execution layer

The part responsible for applying transactions to state: accounts, contracts, balances, logs, receipts, and execution payloads.

EVM

Ethereum Virtual Machine. A deterministic smart-contract runtime used by Ethereum and many EVM-compatible chains. Solidity compiles to EVM bytecode.

ABI

Application Binary Interface. The encoding convention that turns function names and arguments into calldata bytes, and return values back into typed data.

JSON-RPC

The public RPC interface used by wallets, scripts, explorers, and apps to talk to execution nodes.

Precompile

Native client code exposed at a contract-like address. Called like a contract, implemented inside the node.

EIP-1559

Ethereum’s modern fee market: protocol base fee, priority fee, and max fee caps. Relevant because many EVM systems inherit the same mental model.

EIP-2930 access lists

A transaction feature that declares accounts and storage slots up front. Irys uses the access-list shape for Programmable Data chunk declarations and fee metadata.

Engine API

The private API Ethereum consensus clients use to drive execution clients: fork-choice updates, payload building, and payload validation.

Mempool

A node’s local queue of transactions awaiting possible block inclusion. On Irys, the consensus mempool holds storage and commitment transactions; Reth’s pool holds EVM transactions. Neither pool is canonical, but both shape production and user experience.

Block tree

Irys consensus’s in-memory DAG of known blocks (main chain and forks). It stores entries and snapshots, drives validation and tip changes, and sends Reth fork-choice updates.

Cumulative difficulty

Total mining work behind a fork. In PoW-style fork choice, the heavier branch wins. Irys moves the canonical tip only when a validated block’s cumulative difficulty is strictly greater than the current tip’s.

VDF

Verifiable Delay Function. Slow to compute, fast to verify. Useful when a protocol needs a proof that sequential time passed.

PoA chunks

Proof-of-access-style storage proof chunks attached to blocks. If consensus requires storage evidence, a block without the right evidence is not enough.

EMA

Exponential Moving Average. A smoothed value used for pricing so fees do not thrash every block.

Term vs permanent storage

Term storage is time-bounded. Permanent storage is pay-once with a protocol/economic promise to keep data available long-term.

References