Neo SPCC has published a new article detailing how the team has approached the management of blockchain state data in NeoGo. The article acts as a primer on the types of data stored by a Neo node and the challenges in safely deleting old state data, and it includes benchmarking results that show how the tail-cutting measures included in NeoGo affect a node’s performance and DB size.

The team begins with an overview of the different types of data stored by Neo nodes. Some are essential—for example, block headers must always be stored so that the blockchain can be followed and verified, and contract data is required to correctly compute the network’s state transitions between blocks.

Two of the biggest sources of disk usage are block data (the actual transactions sent by users) and Merkle Patricia trie (MPT) data. MPT data mostly mirrors the contract data, but is structured in a way that allows a hash value called the Merkle root to be calculated. The Merkle root of the state data is called the state root, and it is used to check state correctness between nodes and to verify proofs.

Keeping latest state

Unlike the reference Neo node implementation, NeoGo always calculates and stores state using MPT. This allows for the quick detection of inconsistencies between nodes, since the state roots can simply be compared for a given block height. Nodes that are correctly processing block data should determine precisely the same root.
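The comparison described above can be sketched in a few lines of Go. This is only an illustration, not NeoGo code: the function name firstDivergence and the short placeholder strings standing in for state roots are assumptions made for the example.

```go
package main

import "fmt"

// firstDivergence compares the state roots two nodes report for each block
// height (slice index = height) and returns the first height at which they
// differ. found is false when all overlapping heights match.
// Hypothetical helper for illustration; real roots are 256-bit hashes.
func firstDivergence(a, b []string) (height int, found bool) {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for h := 0; h < n; h++ {
		if a[h] != b[h] {
			return h, true
		}
	}
	return 0, false
}

func main() {
	nodeA := []string{"0xaa", "0xbb", "0xcc"}
	nodeB := []string{"0xaa", "0xbb", "0xdd"}
	h, found := firstDivergence(nodeA, nodeB)
	fmt.Println(h, found) // prints: 2 true
}
```

Because each root summarizes the entire state at its height, a single mismatch is enough to show that one of the two nodes processed a block differently.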

Though a net benefit for decentralization (and a sensible default for responsible node operators), MPT data has a habit of ballooning out of control. Neo SPCC explains:

“Given that any block does some state changes, MPT also changes with every block. Each key-value pair change affects all MPT nodes from the leaf changed to the top, for the current tries it’s roughly 10 nodes. This means that many new nodes get created with every block and if you don’t do anything, the DB grows very quickly. In fact, when this functionality was introduced into Neo Legacy (0.76.0 version on NeoGo timeline) the DB size exploded more than 2-fold, and very soon everyone realized that storing all of MPT data is quite expensive.”

The team explored garbage collection schemes, but found them untenable—fully traversing the tree for every cycle made the system almost unusable. The team adopted the approach used in the C# node, reference counting, which led to the implementation of the KeepOnlyLatest setting. With this setting active, the node stores only a single MPT, the most recent one.
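A minimal sketch of the reference-counting idea follows, assuming a simple in-memory store keyed by node hash. The refStore type and its methods are hypothetical and are not taken from NeoGo or the C# node; they only show why a node shared with the newest trie survives while nodes used solely by an older trie are deleted.

```go
package main

import "fmt"

// refStore is a hypothetical stand-in for the node database. It maps a node
// hash to the number of references currently held on that node.
type refStore struct {
	refs map[string]int
}

func newRefStore() *refStore {
	return &refStore{refs: make(map[string]int)}
}

// Add records one more reference to the node with the given hash.
func (s *refStore) Add(hash string) {
	s.refs[hash]++
}

// Release drops one reference and deletes the node once nothing points to
// it. It reports whether the node was physically removed.
func (s *refStore) Release(hash string) bool {
	n, ok := s.refs[hash]
	if !ok {
		return false
	}
	if n <= 1 {
		delete(s.refs, hash)
		return true
	}
	s.refs[hash] = n - 1
	return false
}

// Len returns the number of nodes currently stored.
func (s *refStore) Len() int { return len(s.refs) }

func main() {
	s := newRefStore()
	// Block 1 creates a root and a leaf.
	s.Add("root1")
	s.Add("leaf")
	// Block 2 changes a different key: new root, same leaf is reused.
	s.Add("root2")
	s.Add("leaf")
	// Dropping the trie of block 1 removes its root but keeps the shared leaf.
	s.Release("root1")
	s.Release("leaf")
	fmt.Println(s.Len()) // prints: 2
}
```

Unlike a garbage collection pass, nothing here requires walking the whole tree: each update touches only the counters of the nodes it adds or releases.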

With this setting, the DB size drops almost all the way back to where it was before state tracking was introduced. To test the change, Neo SPCC ran benchmarks with two machines and two database configurations.

With the more powerful machine, a Ryzen 9 5950X with 64GB RAM, synchronizing a MainNet data dump offline required 12.9 GB of space with LevelDB, increasing to 22.2 GB with BoltDB. With KeepOnlyLatest activated, the sizes reduced to 2 GB and 4.5 GB respectively.

Synchronization speed was also affected, highlighting some notable differences between the backend choices. With LevelDB, sync time increased from 37 to 51 minutes as the node handled the overhead of removing MPT data, whereas with BoltDB the duration fell from the default 64 minutes to only 28 minutes.

Removing old block data

Following the migration to Neo N3, new parameters were introduced at the protocol level that help with the management of transaction data. MaxValidUntilBlockIncrement caps how far ahead of the current height a transaction’s expiry block can be set, so every transaction becomes invalid once a certain block is reached. MaxTraceableBlocks sets a network-wide limit on how far back in the chain’s history an invocation can retrieve data from. This limits how many blocks’ worth of data the nodes on the network need to keep available.

In NeoGo, a setting named RemoveUntraceableBlocks is used to trim block data that falls outside the MaxTraceableBlocks limit. On MainNet and TestNet, MaxTraceableBlocks is 2,000,000, so even with the option active, no blocks are removed yet. The garbage collection logic is still activated, however, resulting in a measurable performance overhead.
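The effect of the limit can be illustrated with a short Go sketch. The function name untraceableCount, the chain height used, and the exact boundary condition are assumptions for illustration only; a real node may draw the line a block or more differently.

```go
package main

import "fmt"

// untraceableCount returns how many of the oldest blocks fall outside the
// traceable window at the given chain height. The boundary used here is an
// illustrative assumption, not NeoGo's actual rule.
func untraceableCount(height, maxTraceableBlocks uint32) uint32 {
	if height <= maxTraceableBlocks {
		return 0 // the whole chain is still within the window
	}
	return height - maxTraceableBlocks
}

func main() {
	const height = 1_500_000 // hypothetical chain height below 2 million

	fmt.Println(untraceableCount(height, 2_000_000)) // prints: 0
	fmt.Println(untraceableCount(height, 100_000))   // prints: 1400000
	fmt.Println(untraceableCount(height, 10_000))    // prints: 1490000
}
```

With the 2,000,000 limit and a shorter chain, nothing qualifies for removal, which matches the observation that only the bookkeeping overhead is paid. Smaller limits leave most of the chain eligible for trimming.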

Using the same benchmarking setup as before, the Ryzen 9 machine with LevelDB took 30 minutes to synchronize without RemoveUntraceableBlocks, increasing to 67 minutes with it turned on. With BoltDB, the sync time changed from 40 to 53 minutes.

The team then tested with more aggressive MaxTraceableBlocks settings, which resulted in actual data being removed. With the value set to 100K, sync time with LevelDB decreased to just below 64 minutes, and the blockchain data store dropped from 12.87 to 3.43 GB. With BoltDB, sync time was reduced to 29 minutes, though the DB remained larger, dropping from 22.15 to 7.29 GB.

At 10K, even more significant space savings were demonstrated. With LevelDB, sync time reduced to 38 minutes and DB size to 1.65 GB, while with BoltDB the time decreased to 25 minutes with a DB size of 4.45 GB.

Adjusting the MaxTraceableBlocks setting in this way on public networks is not recommended as it is likely to introduce incompatibilities in state. However, the results serve as a demonstration of the reductions in size that can be expected once MainNet and TestNet move beyond 2 million blocks.

Regarding node usage, Neo SPCC issued its recommendations based on the results. LevelDB emerged as the ideal backend for nodes that decide to maintain a full data archive, while BoltDB performs better when cutting the tail and handling read-intensive loads.

The original article and benchmarking results can be read at the following link:
https://neospcc.medium.com/cutting-blockchain-tail-with-neogo-5256a120f6bb