Scaling IOTA Part 1 — A Primer on Sharding
The full article was originally published by Hans Moog on Medium.
The following blog posts are going to introduce some of the fundamental ideas and concepts behind a new sharding solution for the IOTA Tangle. We will discuss how the proposed mechanism works and why certain decisions were made when designing this solution.
While this topic is still being researched and some of the details might change as research progresses, it is nevertheless mature enough to be openly discussed, and we look forward to receiving feedback.
One of the biggest problems of DLT today is the limited throughput of transactions per second that the networks can process. If demand is higher than the processing capabilities, then the dynamics of "supply and demand" lead to increased fees and longer confirmation times. This, in turn, lowers demand by making the network less attractive to its users.
This way of "managing" a network’s resources works pretty well for keeping the network operational, but the dynamic fees are a very real problem for user experience and make it extremely hard to build reliable real-world applications on top of such networks (see the recent collapse of the MakerDAO system that builds on top of Ethereum).
Because of its zero fees, IOTA obviously does not suffer from these fluctuating costs to issue a transaction. Nevertheless, IOTA nodes do have an upper limit of transactions per second (TPS) that they can process. So how is IOTA going to react to these differences in supply and demand without the network breaking down?
Let’s first have a look at the reasons for these limitations and how other projects address this.
Reasons and workarounds for this limitation
The primary reason for this throughput limitation is the fact that every node in the network needs to process every single transaction, and that hardware capabilities of nodes are limited. To optimize throughput, there are only two options:
- Delegate all the computation to a smaller set of very powerful nodes (i.e. Hashgraph, EOS and so on …).
- Split the tasks and make each node only perform a subset of the total amount of work.
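The difference between the two options can be made concrete with a back-of-the-envelope calculation. The sketch below uses purely hypothetical numbers (the per-node limit and shard count are illustrative, not measured values):

```python
# Illustrative comparison of the two options. All numbers are
# hypothetical, chosen only to make the trade-off visible.

def replicated_throughput(node_tps: int) -> int:
    """Every node processes every transaction, so the network as a
    whole can never exceed what a single node can handle."""
    return node_tps

def sharded_throughput(node_tps: int, num_shards: int) -> int:
    """Work is split: each shard processes only its own transactions,
    so total capacity grows with the number of shards."""
    return node_tps * num_shards

NODE_TPS = 1_000  # hypothetical per-node processing limit

print(replicated_throughput(NODE_TPS))   # stays at 1000 no matter how many nodes join
print(sharded_throughput(NODE_TPS, 64))  # 64000: grows with the number of shards
```

Delegating work to more powerful nodes only raises `NODE_TPS` by a constant factor, which is why the first option is a micro-optimization; only the second changes how throughput scales.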
While the first approach might solve the problem for a while by increasing the throughput to levels that are unlikely to be exceeded in the near future, it does not really solve the problem itself. As adoption rises, the networks will inevitably face the same limitations again. I, therefore, tend to call these solutions micro-optimizations.
The second approach is called sharding and is how, for example, Ethereum tries to tackle the problem of scalability. Instead of running a single blockchain, they are planning to run 64 blockchains in parallel, with every shard having its own validators. These chains are completely independent but are able to interact through another chain that connects all of the shards (the beacon chain).
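To illustrate the general idea (this is a generic hash-based assignment sketch, not Ethereum’s actual address-to-shard rule, and the function names are made up), each account can be deterministically mapped to one of a fixed number of shards, and any transfer between accounts on different shards has to be resolved via the connecting chain:

```python
import hashlib

NUM_SHARDS = 64  # fixed at design time, as in Ethereum's proposal

def shard_of(account: str) -> int:
    """Deterministically map an account to one of the shards.
    Illustrative hash-based assignment, not Ethereum's actual rule."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def needs_cross_shard_hop(sender: str, receiver: str) -> bool:
    """A transfer between accounts on different shards cannot be
    settled inside one shard; it has to go through the connecting
    (beacon) chain."""
    return shard_of(sender) != shard_of(receiver)
```

The deterministic mapping is what keeps the shards independent: every node can compute which shard is responsible for a transaction without coordinating with anyone, but any cross-shard transfer pays the price of the extra hop.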
While this indeed increases the scalability, it still has the same problems of having an upper limit of how many transactions each shard can process. This will, therefore, require the same mechanism of dynamic fees to regulate supply and demand for throughput. So even though this approach is closer to an actual solution (because we can simply increase the number of shards after a few years as adoption rises), it still does not really solve the issues of unpredictable fees and transaction times — in fact, it even introduces some new problems.
Additional problems with sharding
Since the validators have to be split between all existing chains (including the beacon chain), each shard will be secured by fewer validators than before. The security of the system will consequently be lower than in an unsharded environment. This problem exists in any sharding solution since the very idea of sharding is to split the work among the validators.
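The dilution is easy to quantify. With hypothetical numbers (and assuming, for illustration, a BFT-style shard that breaks once more than a third of its validators are faulty), the cost of attacking one shard drops in proportion to the shard count:

```python
# Illustrative: splitting a fixed validator set across shards dilutes
# the security of each individual shard. Numbers are hypothetical.

TOTAL_VALIDATORS = 6_500
NUM_SHARDS = 65  # e.g. 64 shards plus the beacon chain

validators_per_shard = TOTAL_VALIDATORS // NUM_SHARDS  # 100 per shard

# Assuming a BFT-style shard fails once >1/3 of its validators are
# faulty, the number of validators an attacker must corrupt shrinks
# from a third of the whole network to a third of a single shard:
attack_threshold_unsharded = TOTAL_VALIDATORS // 3
attack_threshold_per_shard = validators_per_shard // 3

print(attack_threshold_unsharded)  # 2166
print(attack_threshold_per_shard)  # 33
```
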
Blockchains, however, have additional technology-specific problems:
- Running multiple chains in parallel requires consensus on the number of shards, and changing this number is not possible “on-the-fly”. To be future-proof it is necessary to split the network into more shards than the current throughput requires, which results in reduced security from day one. It is impossible for the network to organically react to growing adoption or to different network conditions.
- Since the throughput in every shard is defined by system parameters like block size and block times, every node needs to fulfill certain hardware requirements, which prevents low-powered IoT devices from taking part in the network.
- Since we cannot arbitrarily increase the number of shards (at some point the beacon chain gets overloaded), this solution also does not offer real infinite scalability with the number of nodes.
This traditional way of sharding (simply running multiple instances of the same technology) therefore does not provide an answer to the vision of IOTA.
IOTA’s vision is to provide a DLT platform that is able to automatically keep up with growing adoption by providing an ever-increasing throughput that scales with the number of nodes in the network. At the same time, the mechanism used has to be flexible and fast enough to react to things like shocks in supply and demand of network throughput, so that the network can stay operational without having to use fees as a way for nodes to decide which transactions to process.
Considering how sharding solutions work today, this seems to be a hard problem, and IOTA has consequently received a lot of criticism for publicly committing to these goals without ever revealing how they are supposed to be achieved.
A lot of people still consider this to be an unsolvable problem and accuse IOTA of selling snake oil, and there is even a video that seemingly "proves" that this problem cannot be solved (we will revisit the problem it describes later on).
This blog post aims to finally shed some light on how IOTA plans to achieve this scalability.
Requirements for IOTA’s Vision
Before starting to introduce the new concepts in the later parts of this blog post, I want to take a short detour and discuss some of the requirements as they directly affect certain design decisions of the proposed scaling solution:
- The solution has to incorporate some form of sharding, since micro-optimizations merely delay the problems rather than solve them.
- The solution should prevent double-spending without relying on some complicated form of inter-shard communication (see the video mentioned above).
- The network should only shard if it is really necessary, maintaining as much security of the system as possible in times of low activity but at the same time being able to grow with rising adoption.
- The sharding layout has to be dynamic enough to instantly react to changes in the network without nodes having to “talk” to each other to negotiate a new sharding layout or use fees to determine which transactions to favor (agent-centric approach).
- The sharding should use a “meaningful mapping” of the real world (i.e. geographic mapping) so that nodes that are close to each other are always able to directly communicate without having to use some complicated form of inter-shard communication.
- Nodes should be able to individually decide how much and which data they want to process so that we can have a heterogeneous network with low-powered IoT devices alongside more powerful nodes.
- Nodes should be able to freely move around between shards (mobile nodes like cars) without having to suddenly download and validate huge amounts of data.
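To make the "meaningful mapping" requirement more tangible, here is a toy sketch of a geographic shard assignment: nodes derive a shard identifier from their physical position by recursively bisecting the map, so nearby nodes end up in the same shard without negotiating a layout, and the subdivision can be refined only where load demands it. This is purely illustrative and not the mechanism IOTA proposes (which the later parts of this series introduce):

```python
def geo_shard(lat: float, lon: float, depth: int) -> str:
    """Toy geographic shard id: recursively split the map into
    quadrants. Each extra level of depth splits a shard in four,
    so the layout can refine locally where activity is high."""
    shard_id = ""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    for _ in range(depth):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        shard_id += ("N" if lat >= lat_mid else "S") + ("E" if lon >= lon_mid else "W")
        lat_lo, lat_hi = (lat_mid, lat_hi) if lat >= lat_mid else (lat_lo, lat_mid)
        lon_lo, lon_hi = (lon_mid, lon_hi) if lon >= lon_mid else (lon_lo, lon_mid)
    return shard_id

# Two nearby nodes share a shard; a distant one does not.
berlin = geo_shard(52.5, 13.4, depth=2)
potsdam = geo_shard(52.4, 13.1, depth=2)
sydney = geo_shard(-33.9, 151.2, depth=2)
```

Because the shard id is derived purely from local information (the node’s own position), no inter-node negotiation is needed, which is exactly the agent-centric property listed above; a mobile node crossing a shard boundary simply recomputes its id.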
Due to the feeless nature of IOTA, we have very concrete but complex requirements for our scaling solution that prevent us from using existing concepts, forcing us to come up with a completely different approach that is vastly more flexible than traditional forms of sharding.
The second part of this blog post will give a step-by-step introduction to this new sharding concept (from the very first abstract ideas to an actual, concrete solution).