
Elastic sharding in the IOTA network
How shards are defined is perhaps as important as how the shards themselves prevent double-spends and coordinate in the first place. In this article, I will present a potential solution to this problem. All constructive comments, questions, and criticisms are welcome, as this is by no means a finalized design or formal specification.
We define a Shard Derivation Function (SDF) as a function that, given an object hash and a node identity hash, determines whether the node should store that object. This allows a node's neighbors to decide inexpensively which transactions to broadcast to which neighbors. The definition of such a function is paramount to the functioning of a sharded network.
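As a rough sketch, an SDF can be modeled as a pure predicate over two hashes. The interface below is purely illustrative (the names ShardDerivationFunction and should_store are mine, not part of any existing implementation); hashes are treated as unsigned integers so that arithmetic can be applied to them directly.

```python
from abc import ABC, abstractmethod


class ShardDerivationFunction(ABC):
    """Hypothetical interface for a Shard Derivation Function.

    Hashes are interpreted as unsigned integers so that modulo and
    range arithmetic can be applied to them directly.
    """

    @abstractmethod
    def should_store(self, object_hash: int, node_identity_hash: int) -> bool:
        """Return True if the node identified by node_identity_hash
        should persist the object identified by object_hash."""
```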
A naive example of an SDF is to define a constant number of shards, take the node identity modulo the number of shards to determine a shard number, and store only those objects whose hash, modulo the number of shards, equals the node's shard number. This naive approach has many flaws. For instance, each shard still grows in proportion to the total network size, so nodes that cannot store a full shard still cannot participate in the network. Changing the shard count is also very difficult: it requires a single synchronized event in which nodes, depending on the number of shards before and after, may have to replace most of their stored transactions. Coordinating such a change across a decentralized network is difficult in and of itself.
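A minimal sketch of this naive scheme, assuming a fixed network-wide shard count and hashes interpreted as unsigned integers (the constant and function name are illustrative only):

```python
SHARD_COUNT = 16  # assumed network-wide constant, fixed at deployment time


def naive_should_store(object_hash: int, node_identity_hash: int) -> bool:
    """Naive SDF: a fixed number of shards, assignment by modulo."""
    return object_hash % SHARD_COUNT == node_identity_hash % SHARD_COUNT
```

Note that changing SHARD_COUNT invalidates the assignment of nearly every stored transaction at once, which is exactly why the synchronized resharding event described above is so costly.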
Our proposed solution is to derive a starting hash s from the node identity hash, and to define a hash range size r, initialized to the maximum hash value M (i.e. the hash 0xFF..FF in binary systems or ZZ..ZZ in ternary) on initial node installation and persisted thereafter. Any hash between s and min(M, s + r), or between 0 and max(0, s + r - M), is persisted by the node (the second interval allows the range to wrap around the boundary of the hash space). Additionally, we define a node-specific saturation threshold, such as a number of transactions or a certain database size, that indicates the node does not wish to store any more transactions. When this limit is reached, r is lowered until the transaction database is below the saturation threshold again, and the new hash range size is broadcast to all neighbors. Transactions that now lie outside the persistence range are dropped; because the range only ever shrinks from its upper end, removing transactions is computationally cheap. This can be likened to a deck of cards into which cards are inserted at the center: whenever the deck grows too tall, cards are discarded from the top until it is back at the desired height.
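The following self-contained sketch shows one way the range check and the saturation-driven shrinking could be implemented. It assumes a binary hash space of 256-bit hashes interpreted as unsigned integers and a saturation threshold expressed as a maximum transaction count; the class and method names (HashRangePersistence, shrink_range, and so on) are illustrative only, not a reference implementation.

```python
M = (1 << 256) - 1  # maximum hash value, assuming a 256-bit binary hash space


class HashRangePersistence:
    """Illustrative range-based persistence rule: store hashes in [s, s + r],
    wrapping around the end of the hash space when s + r exceeds M."""

    def __init__(self, node_identity_hash: int, saturation_threshold: int):
        self.s = node_identity_hash % (M + 1)  # starting hash derived from the node identity
        self.r = M                             # range size, initialized to the full hash space
        self.saturation_threshold = saturation_threshold
        self.stored: dict[int, bytes] = {}     # object hash -> object

    def in_range(self, object_hash: int) -> bool:
        upper = self.s + self.r
        if object_hash >= self.s:
            return object_hash <= min(M, upper)        # main interval [s, min(M, s + r)]
        return upper > M and object_hash <= upper - M  # wrapped interval [0, s + r - M]

    def receive(self, object_hash: int, obj: bytes) -> None:
        """Persist an in-range transaction, shrinking the range if saturated."""
        if not self.in_range(object_hash):
            return
        self.stored[object_hash] = obj
        if len(self.stored) > self.saturation_threshold:
            self.shrink_range()

    def shrink_range(self) -> None:
        """Drop the transactions farthest from s and lower r accordingly."""
        ordered = sorted(self.stored, key=lambda h: (h - self.s) % (M + 1))
        keep = ordered[: self.saturation_threshold]
        for dropped in ordered[self.saturation_threshold:]:
            del self.stored[dropped]
        self.r = (keep[-1] - self.s) % (M + 1) if keep else 0
        # A real node would now broadcast the new r to all of its neighbors.
```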
When a node first enters the network, it quickly becomes saturated by the massive number of transactions already in existence, and the hash range size plummets as new transactions are seen. As the hash range size decreases, the number of transactions seen and stored decreases too, which in turn causes fewer transactions to be purged. The hash range size approaches the optimal value and thereafter slowly declines in inverse proportion to the size of the network. The saturation threshold can be raised or lowered arbitrarily at any time, as the hash range size will automatically adapt. This has some interesting implications for the network. For example, the mean number of replications per transaction will be close to the theoretical maximum given the space available, making full use of the network's resources to provide the greatest security possible at any given time, unlike current DLTs, which must cater to the lowest-common-denominator node while leaving larger ones underutilized. It also eliminates the need for permanodes entirely: because a permanode must be powerful enough, with enough space and bandwidth, to receive and store every transaction ever issued, only very few operators can afford to run and maintain one, which could create an enormous bottleneck, centralize permanodes, and possibly lead to monopolies on data.
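To make this convergence behavior concrete, here is a toy, self-contained simulation; the 64-bit hash space, 100,000-transaction stream, and 1,000-transaction threshold are arbitrary illustrative parameters, not part of the proposal. A freshly installed node with r = M is flooded with uniformly random transaction hashes and shrinks its range whenever it saturates:

```python
import random

M = (1 << 64) - 1            # toy 64-bit hash space
THRESHOLD = 1_000            # saturation threshold: store at most this many transactions
s = random.randrange(M + 1)  # starting hash derived from a (random) node identity
r = M                        # range size starts at the full hash space
stored: set[int] = set()

def in_range(h: int) -> bool:
    upper = s + r
    return (s <= h <= min(M, upper)) or (upper > M and h <= upper - M)

for _ in range(100_000):         # stream of uniformly random transaction hashes
    h = random.randrange(M + 1)
    if not in_range(h):
        continue
    stored.add(h)
    if len(stored) > THRESHOLD:
        # purge the transactions farthest from s and shrink r to just cover the rest
        ordered = sorted(stored, key=lambda x: (x - s) % (M + 1))
        stored = set(ordered[:THRESHOLD])
        r = (ordered[THRESHOLD - 1] - s) % (M + 1)

print(f"final range covers {r / M:.2%} of the hash space")
```

With these numbers the final range settles at roughly 1% of the hash space, about the saturation threshold divided by the number of distinct transactions seen, illustrating how r tracks the inverse of the network's size.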
This article omits many implementation details for the sake of simplicity. A few potential enhancements include: using an expensive hash function for the node identity hash to make eclipse attacks more difficult; rebroadcasting dropped transactions to better ensure transactions are not lost; or having each node generate several node identities and discard all but the one farthest from every neighbor's transaction range, to help ensure a uniform distribution. Additionally, this system relies on the altruism of nodes. An incentive system to encourage node operators to allocate more space than the bare minimum would also be a vitally important component (possible solutions include nodes prioritizing neighbors that can prove they store more transactions, but that's a topic for another day!). However, since this design would make running a node viable on nearly any class of device, it would encourage users to run their own nodes, which would in turn lead to a homogenization of the network, eliminating the divide between users and node operators. These are all topics for future work; we reiterate that this is by no means a finished protocol, and invite any questions, comments, and criticisms.