Explaining the Qubic Computation Model: Part 1
Ever since we first revealed the Qubic protocol we have been asked to go into more detail about the inner workings of the Qubic Computation Model. This is the first in a series of blog posts that will attempt to do exactly that. So fasten your seat belt, it’s going to be an interesting ride!
There is a lot of ground to cover, and we will for now ignore the higher level Qubic protocol concepts like Assemblies, Quorums, or even the IOTA Tangle. Instead, we will first look at how Qubic programs are executed using a structured functional dataflow language specification called Abra, and discuss the basic concepts used in Abra.
In future installments we will examine Qupla, a Qubic Programming Language, which is the first implementation of the Abra specification, and look at Qupla’s basic entities, functions, expressions, operations, and some programming examples. After that we will examine the part of a Qubic-enabled node (Q-node) that initiates and facilitates the actual processing of a qubic task: the Qubic Supervisor. We will show how the Supervisor is closely intertwined with the Abra processing model.
Abra: a functional dataflow language specification
Disclaimer: the Abra specification is not yet 100% complete. We may introduce additional concepts and changes that we think are useful while we are trying to use the specification for real-world tasks. But overall the concepts are clear. We currently have a working Qupla interpreter and compiler that implements the Abra specification.
Abra specifies an extremely minimal general instruction set that is designed to provide natural mapping of dataflow based functional programs to the wildly varying types of hardware run by IoT devices. That means it can be easily mapped to run on any device from CPUs, to GPUs, to FPGAs, to ASICs. However, Abra is primarily geared towards creating FPGA and ASIC circuitry. It is expected that in the future >99% of all IoT devices will be running on such hardware, and general-purpose CPU/GPU devices will mostly be used for PoC solutions and/or server solutions that target humans. Note that in these articles we will talk about CPU when we mean CPU/GPU and about FPGA when we mean FPGA/ASIC.
To summarize before diving in: Abra is unique in that it employs trinary logic and directly supports wave-pipelining. These are two aspects that allow it to maximize the efficiency of the associated code, which is important for IoT devices due to their limited energy and processing resources. Wave-pipelining of combinatorial circuits has been shown to achieve clock rates of 2 to 7 times those possible for the same circuits with conventional pipelining. Additionally, using trinary-based circuits can result in up to 50% more energy-efficiency due to the denser representation of values. Note that such circuits can even be created with traditional binary NAND gates.
Don’t let the techno-babble of that summary scare you. We will take you by the hand through the concepts and slowly build up from the smallest parts. Let’s start with the most visible difference from most language specifications: Abra specifies trinary code and data.
Binary versus trinary code and data
Binary systems use bits to represent code and data, where each bit can assume only one of 2 possible values. Larger values can be represented as binary numbers by using a series of bits.
Similarly, trinary systems use so-called trits to represent code and data, where each trit can assume only one of 3 possible values. Larger values can be represented as trinary numbers by using a series of trits.
Abra is a trinary system. Trinary systems require less wiring because values can be represented about 1.58 times more efficiently, which translates into a reduction in energy usage.
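For the curious, here is a small Python sketch of balanced-ternary encoding, where each trit holds one of the values -1, 0, or 1. The "about 1.58 times" factor is log2(3) ≈ 1.585 bits of information per trit. The helper names and the little-endian trit order are our own choices for illustration, not anything Abra prescribes:

```python
def to_balanced_trits(n, width):
    """Encode an integer as balanced-ternary trits (-1, 0, 1),
    least significant trit first. Illustrative helper, not part of Abra."""
    trits = []
    for _ in range(width):
        r = n % 3
        n //= 3
        if r == 2:       # balanced ternary: digit 2 becomes -1, carry 1
            r = -1
            n += 1
        trits.append(r)
    return trits

def from_balanced_trits(trits):
    """Decode a little-endian balanced-ternary trit list to an integer."""
    value = 0
    for trit in reversed(trits):
        value = value * 3 + trit
    return value
```

A vector of 3 trits covers the 27 values from -13 to +13, where 3 bits would only cover 8 values.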
Abra supports only a single native data type: the trit vector. A trit vector is a consecutive series of trits. That’s it. Trit vectors always have a fixed size. Even a single trit is represented by a trit vector, namely a trit vector that has size 1. There is no way to define variable-sized trit vectors in Abra.
Note that the interpretation of the meaning of each trit in a trit vector is left completely up to the implementation or programmer. The Abra specification is totally agnostic in that regard. All it does is specify how data flows and how to transform it to calculate results in a way that will map to any hardware.
Also note that this means that in Abra there is none of the usual data type mapping to architecture-dependent (multi-)byte boundaries as found in traditional programming languages. The programmer is free to select any trit vector size, and the generated code will reflect this size exactly. Of course it is always possible for a programmer to select trit vector sizes that approximate the usual (multi-)byte sized data types when targeting specific hardware, but that will require intimate knowledge of how trit vectors will be mapped to that specific hardware.
Limiting data type size to match value ranges that are actually used usually achieves a reduction of energy needs by limiting the amount of circuitry actually needed on FPGAs. Imagine a variable that only needs to be able to represent the values 0–10. With most traditional programming languages you would need to use an 8-bit (0 to 255) byte variable at a minimum, where 4 bits (0 to 15) would have sufficed, and the hardware is designed to manipulate bytes at a time anyway. With Abra a trit vector of 3 trits (0 to 26) suffices, and on FPGA you end up with only the circuitry necessary to manipulate 3 trits.
Transformation through look-up tables
At the heart of Abra there is a very special kind of static global construct called a look-up table (LUT). A LUT is what is used to transform input data values into output values.
You can view a LUT as a very general instruction, that can be programmed to do all kinds of operations. Because of this, there is no real need for a more complex set of predefined instructions, the way you will find with most assembly or intermediate level languages. A LUT can either emulate the results of such instructions directly, or multiple LUTs combined with some logic can be used to create an equivalent effect. And LUTs have an advantage that most ‘normal’ instructions don’t have: they can take part in data flow. This means that they don’t need an external clock signal to be able to perform. This is the key to a language that supports wave-pipelining. The only ‘clock tick’ is provided by the Supervisor when it provides input data to process.
Note that again, just as with trit vectors, it is up to the implementation or programmer to define the exact meaning of LUTs and the Abra specification is completely agnostic to that. All it does is specify how data flows through a LUT and how it transforms the data to calculate results in a way that will map to any hardware.
LUTs in Abra can take up to 3 input trits. The programmer can explicitly specify for each unique combination of input trit values what single output trit value will be generated. For any unspecified combination of input values, or when any input trit is null, the LUT will return null. Remember, null is absence of data flow. The importance of this will be discussed in the section about merge operations. But first we will look at the anatomy of Abra code.
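In Python terms, a LUT behaves like a partial mapping from input-trit tuples to a single output trit, with null modelled as None. This is a sketch of the semantics described above, not the actual Abra encoding, and the helper names are ours:

```python
def make_lut(mapping):
    """Build a 3-input LUT: unspecified input combinations, or any
    null (None) input trit, yield null (None)."""
    def lut(a, b, c):
        if a is None or b is None or c is None:
            return None                  # null input: data flow stops
        return mapping.get((a, b, c))    # unspecified combination -> None
    return lut

# A hypothetical LUT that outputs 1 when its first trit is 0, and
# filters out (returns null for) everything else:
is_zero = make_lut({(0, b, c): 1 for b in (-1, 0, 1) for c in (-1, 0, 1)})
```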
The Abra code unit layout
An Abra code unit consists of a sequence of blocks that is divided in 3 sub-sequences:
- A sequence of LUT definition blocks. A LUT definition block contains 27 entries, one for each of the unique possible combinations of 3 input trits. Each entry specifies the output trit value for the corresponding input combination, or a null indicator for undefined combinations. So to be clear, each of the 27 entries in the LUT definition block can assume 4 different values that indicate how the implementation needs to map that input combination to either an output trit or null.
- A sequence of branch definition blocks. A branch definition block is a sequence of sites (instructions) that form a single execution unit called a branch. We will discuss branches in the next section.
- A sequence of external reference blocks. External reference blocks refer to blocks in other Abra code units. Each external reference block specifies the hash of the specific Abra code unit and a sequence of indices of the blocks in that code unit that it refers to. This is how Abra can have library units that contain reusable code.
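The three-part layout above can be sketched as a few Python data classes. This is purely an illustrative model of the structure just described; the field names are our own, and the real code unit is a trinary-encoded data block rather than Python objects:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LutBlock:
    entries: List[Optional[int]]   # 27 entries; None marks a null output

@dataclass
class BranchBlock:
    sites: List[object]            # input, body, output, and latch sites

@dataclass
class ExternBlock:
    unit_hash: str                 # hash of the external code unit
    block_indices: List[int]       # blocks imported from that unit

@dataclass
class CodeUnit:
    luts: List[LutBlock]
    branches: List[BranchBlock]
    externs: List[ExternBlock]
```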
An Abra code unit is completely self-contained and in the context of Qubic it will be stored as a single trinary encoded data block that can be sent as payload in an IOTA message. The code unit can be identified uniquely by its hash value. This makes it impossible to tamper with code units once they are publicly stored. An Abra code unit and any other code units referenced will always be a self-consistent set, starting with the hash of the ‘main’ unit.
Anatomy of a branch
A branch is akin to a function as it is used in most programming languages. But as you will see there are also some important differences.
A branch is a sequence of Abra instructions called sites. Sites describe the outputs of their associated Abra instructions. They are akin to temporary local variables but they take up no storage space. It would be better to see them as labels referencing their associated instruction output. Sites can be used as inputs to one or more other site instructions. On FPGA implementations sites will be wired directly as inputs to one or more other sites. A site can only use sites that already have generated a value as inputs (data flow). The direction of the graph that is created in this way is always forward, in the direction of the data flow. Sites are referenced by their index in the branch’s sequence of sites.
A branch always takes a single fixed size trit vector as input and it always returns a single fixed size trit vector as output. Exactly what these trit vectors represent is up to the implementation or programmer as usual. Again, the Abra specification is completely agnostic to that, because it only specifies the data flow.
The sites of a branch are grouped in 4 sub-sequences:
- Input sites. They define how the input trit vector of the branch will be partitioned into sub-vectors. In fact, they are nothing more than a sequence of size definitions. Each input site starts where the previous one left off within the input trit vector of the branch. You can view the input trit vector as a concatenation of all sub-vectors as defined by the input sites. Note that when the actual input trit vector is too large the unused remainder will be ignored. And when the actual input trit vector is too small Abra will assume the rest of the necessary trits to be null. This allows for some interesting behaviors that we will discuss later.
- Body sites. They define the intermediate operations necessary for the branch to perform its function. Each site is associated with either a knot or a merge instruction. The input trit vector to these instructions is specified by a sequence of site indices. The interpretation of this sequence depends on the type of instruction.
- Output sites. These are exactly the same as body sites, except they define the final operations to complete the branch’s function. The results of each output site will be concatenated in sequence to form the output trit vector of the branch.
- Memory latch sites. These are also exactly the same as body sites, but they serve a special additional function. They are a way for a branch to store state between invocations. They are defined last, because memory latches will only be updated once the branch has performed its function. This means that, while performing the branch function, any time a memory latch site is referenced the value that memory latch held at the moment when the function was invoked is used. The initial value for a memory latch is always zero.
Remember we said that the direction of the graph is always forward w.r.t. data flow? Well, memory latch sites already have a value upon invocation of the branch, which means you can use these sites as inputs anywhere even though they are only defined at the end. So the data flow is still forward. And only after the branch completes execution will their site values be calculated and latched, which again means that data flow is forward. Memory latch sites just have this way of being used at 2 different points of the graph: once at the start as inputs, and once at the end as outputs.
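The read-old-value/latch-new-value ordering can be sketched in Python. This is a behavioural model only (our own names; real latches hold trit vectors, not Python integers), showing that the body sees the latch value from invocation time and the new value is stored only after the branch completes:

```python
def make_counter_branch():
    """Toy stateful branch: a counter kept in a memory latch.
    The initial latch value is zero, as the spec requires."""
    latch = {"count": 0}

    def branch(increment):
        old = latch["count"]       # latch value at invocation time
        result = old + increment   # body/output sites see the OLD value
        latch["count"] = result    # latched only after the branch is done
        return result

    return branch
```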
Abra’s knot instruction
The knot instruction is aptly named, because it ties branches together. It’s akin to a function call mechanism but with several differences. A knot instruction takes a sequence of input site indices and concatenates those sites together into an input trit vector that is passed to the knot’s target block, which is specified by an index in the code unit’s block sequence. There are only two types of blocks the concatenated trit vector can be passed to:
- A LUT block, in which case the input trit vector size has to be 1, 2, or 3 trits and the knot site will return the single trit that is associated with that specific combination of input trits for the referred LUT block.
- A branch block, in which case the input trit vector can be of any non-zero size. The concatenated trit vector will be passed as input trit vector to a new instance of the branch and the knot site will return the output trit vector of the branch after invocation.
Note that external blocks are simply indirect references to either LUT blocks or branch blocks, so there’s no real differentiation necessary there. You can use them in knots as if you were referencing them directly.
A branch knot is not a function call in the traditional sense, with a call stack and a single instance of the function that can be called from anywhere. Instead it literally invokes a fresh instance of the branch it refers to.
It helps if you think of a branch as a circuitry block. It can be wired to only one specific input and one specific output. Therefore, any branch referenced through a knot needs to be instantiated as a separate circuitry block, and then the input sites of the knot must be wired to the input sites of the instantiated branch, and the output sites of the instantiated branch must be wired to all the circuitry that uses the knot site as input.
This instantiation of branches has some important ramifications when the branch is stateful because it uses memory latches, since each branch instance comes with its own separate set of memory latches. This means that to access the same branch state you need to follow the original instantiation path (or ‘call’ path), which requires a special mechanism that we will discuss in more detail once we get to the description of the Qubic Supervisor.
Abra’s merge instruction
The merge instruction takes one or more input sites that all need to be of equal size. Of all these input sites only one can evaluate to a non-null value. The merge site will return the single non-null input site value, or null when all input sites evaluate to null. The return value is always of the same size as each input site. Note that having more than one non-null input site is considered a programming error and will cause an exception to be triggered and branch execution to halt without a result.
You can visualize the merge instruction as a wye connection pipe, where a number of input pipes come together into a single output pipe. Imagine that each input pipe is connected to a different source of volatile chemical fluid. As long as no source tap is open (all null inputs) nothing will come out of the output pipe (null output). If only one source tap is opened (one non-null input), the corresponding chemical fluid will flow unimpeded through the connection into the output pipe without problem (non-null output, same value). But if multiple source taps are opened (multiple non-null inputs), their chemicals will mix in the merger and cause an explosion (exception).
The merge instruction will perform a key function in creating decision logic. Essentially you can set up multiple parallel execution paths (data flows) in a branch. The results of these paths end up as different input sites in the merge instruction, where only the desired result ‘survives’. Because it is essential that at maximum one execution path evaluates to a non-null value you need a way to introduce these null values. That is where LUTs come in handy. In each execution path a LUT can be used as a filter that only allows data to flow through when specific conditions are met. You just need to take care that the conditions that the LUTs are filtering on are mutually exclusive in each execution path.
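Here is a minimal Python sketch of this filter-and-merge pattern (the function names are ours, and the filters stand in for the filtering LUTs described above). Two mutually exclusive paths each pass their value through only for their own condition trit, and the merge keeps whichever path survives:

```python
def merge(*inputs):
    """At most one input may be non-null; more than one is an error."""
    non_null = [v for v in inputs if v is not None]
    if len(non_null) > 1:
        raise RuntimeError("merge: more than one non-null input")
    return non_null[0] if non_null else None

def filter_if(cond, wanted, value):
    """Filter 'LUT': pass the value through only when the condition
    trit matches; yield null (None) otherwise."""
    return value if cond == wanted else None

def choose(cond, then_value, else_value):
    """Decision logic: two mutually exclusive paths, merged."""
    return merge(filter_if(cond, 1, then_value),
                 filter_if(cond, -1, else_value))
```

Note that when the condition trit is 0, neither filter matches, so the merge simply produces null and data flow stops.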
By the way, this is the only mechanism that allows us to make a decision in Abra. Abra has none of the classical control flow statements. That means there are no if-then-else like decision statements, where only a single execution path is taken depending on a condition and the other is skipped over, nor are there any looping statements like for-loops and while-loops. Such constructs would impede data flow. The absence of these constructs also makes it easy to keep the functions pure and allows us to reason about their correctness when looking at it from the viewpoint of a single invocation of a branch.
Even with the limited amount of mechanisms described so far, it is still possible to create a looping construct. You can set up a branch that has two execution paths, one filtered on the loop continuation condition, and one on the loop end condition. The first path would call a branch that executes one iteration and then calls the loop function recursively. The second path would call a continuation function that resumes execution after the loop is done.
Of course recursion is not always the best solution, especially in the face of branches being instantiated each time when they are called. To be able to instantiate a branch everything that is called by that branch needs to be instantiated as well. This could become a prohibitively large amount of circuitry pretty quickly, depending on the depth of the recursion. For this reason, and also to make sure that it is impossible to get into an endless loop, Abra requires you to specify a maximum depth of branch invocations for a code unit. When this depth limit is reached, instead of invoking another branch Abra will return null for that invocation, so data flow will stop.
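The recursive loop shape described above can be sketched in Python, with the invocation-depth limit modelled as an explicit counter. This is a toy model (our own names, and not how Abra actually encodes the limit): one path recurses while the continuation condition holds, the other path ends the loop, and exceeding the depth limit yields null (None) so data flow stops:

```python
def loop_sum(n, acc=0, depth=0, max_depth=100):
    """Sum 1..n via the two-path recursive loop pattern."""
    if depth >= max_depth:
        return None                # depth limit reached: data flow stops
    if n > 0:                      # continuation path: one more iteration
        return loop_sum(n - 1, acc + n, depth + 1, max_depth)
    return acc                     # end path: resume after the loop
```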
Thus far we have only looked at Abra code from the point of view of a single invocation of a branch. But of course it is possible to invoke a branch multiple times. Otherwise it would not make much sense to have the possibility to store state in memory latches. That only makes sense in the face of multiple invocations. So this is where we will start introducing concepts that are meant to facilitate interaction with the Qubic Supervisor.
Reactive dataflow in Abra
The dataflow in Abra is reactive. That means that computational entities respond to discrete changes in input data by recalculating the outputs of all entities that depend on these input data. We call such an event where input data changes an effect. An effect is always sent directly to an environment. An effect can be processed by any computational entity. For an effect to be processed by an entity, the entity first needs to join the corresponding environment. By joining an environment, the entity indicates that it wants to be notified whenever effects happen in that environment, so that it can then process the effect data. We call this mechanism EEE (for Environment, Entity, Effect) and it governs how separate entities interact with each other.
In Abra, you can designate any branch as an entity and have it join one or more environments. An effect which arrives at such a joined environment will be dispatched as input data to the entity by the Supervisor. The effect data will then flow through the entity until an output result is produced. We call such a single flow of data through the entity a wave. A wave is therefore a single iteration through the code that starts by passing the effect data to the entity and ends when an output result is being produced by the entity.
What’s important to note is that a wave is an atomic unit of processing. This means that with Abra we don’t use preemptive task scheduling with its associated context switching overhead. The end of a wave forms a natural context switching point. The atomic nature of a wave also means that there is a much reduced need for synchronization because you don’t have to account for anything that could interrupt a wave.
The output result data produced by an entity can itself be sent as an effect to one or more environments for further processing in the next wave (unless it has been explicitly marked to be delayed). This means that a wave can trigger a cascade of other waves, that will continue until a new stable situation has been reached (i.e. there are no more pending queued effects to be processed in the current cycle). Such a cascade of waves, that is initiated by the arrival of an effect, is what we call a quant. Because the exact amount of waves in a quant and what happens during these waves depends on the implementation, a quant does not have a fixed time period associated with it. But it does embody a complete, consistent set of computations.
Within a quant, the only effects that can trigger a new wave are the effects generated by the computations within the quant itself. Any external effects that occur while processing takes place within a quant are postponed until the start of the next quant. Additionally, internal effects can be delayed any number of quants on purpose.
Note that Abra prevents an endless cascade of waves by requiring entities to specify an invocation limit. When that limit has been reached within a quant, any effect that would cause the invocation limit to be exceeded will be postponed until the start of the next quant. This guarantees that a quant cannot get into an endless invocation loop. At the start of the next quant the invocation counters will be reset and processing of postponed effects may resume.
Applications with time-critical components can use this invocation limit to reduce the length of a quant, or they can specify a specific amount of quants to delay resuming invocation. That allows their time-critical components to service their (external) effects in a timely manner.
Note that we can use EEE to achieve looping in Abra by having the entity generate an effect to an environment that it has joined, which means that the same entity will be triggered again by the effect it has generated. The data associated with the effect can be used as input data for the next iteration. The iteration state is sent as part of the effect data. Looping in this way is much more controlled, because every iteration is done in a single wave. It also is bounded by the invocation limits, which means that execution of loops with large amounts of iterations can be spread out over multiple quants. Note that this also means that infinite loops are definitely possible in this way, but since they are now governed by the Supervisor they will never be able to block the system.
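The EEE looping pattern can be modelled with a toy Python Supervisor. This assumes a drastically simplified scheduling model of our own devising: entities joined to an environment receive each effect's data, any effects they emit are queued for the next wave, and the invocation limit bounds how much work happens before remaining effects would be postponed to the next quant:

```python
from collections import deque

def run_quant(environments, initial_effect, invocation_limit=10):
    """Dispatch effects wave by wave until the queue drains or the
    invocation limit is reached. Returns the number of invocations."""
    queue = deque([initial_effect])
    invocations = 0
    while queue and invocations < invocation_limit:
        env, data = queue.popleft()
        for entity in environments.get(env, []):
            invocations += 1
            for effect in entity(data):    # one wave per invocation
                queue.append(effect)
    return invocations

# An entity that loops by sending an effect back to the environment
# it has joined; the iteration state travels inside the effect data:
def countdown(n):
    if n > 0:
        yield ("tick", n - 1)

envs = {"tick": [countdown]}
```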
Also note that invoking an entity through EEE means that it starts at the top of the branch instantiation path again, which means that through careful programming you can make sure to end up in a branch that previously stored state in memory latches and process the effect in the context of that state.
The interactions between the Qubic Supervisor, Abra branch entities, other entities and the way effects are generated and interact with environments will be discussed in-depth after our discussion of how Qupla implements Abra. To facilitate this discussion we first introduce a human readable pseudo-code for Abra.
Abra pseudo-code
We devised a pseudo-code for Abra so that we can have human readable instructions. Each block type in the code unit has its own specific layout.
```
// External reference block
extern <unit name> hash <unit hash> blocks <blocks>

// <unit name> is the name of the external code unit and only
//             present for human readability.
// <unit hash> is the 81-tryte hash value of the external code unit.
// <blocks> is a comma-separated list of blocks to import from
//          the external code unit.
```
Note that we have not used external blocks yet because everything is pulled into a single code unit for now, and the above will probably not be the final notation. It’s just mentioned here for completeness.
```
// LUT definition block
lut <table> <name>

// <table> is the lookup table string of 27 entries consisting
//         of 0, 1, -, and @ (the latter indicating null).
// <name> is a symbolic name for the lookup table that can be
//        used to reference it by, which is easier to use
//        than the block index it actually represents.
```
```
// Examples:
// -01-01-01-01-01-01-01-01-01   // input trit 0
// ---000111---000111---000111   // input trit 1
// ---------000000000111111111   // input trit 2
lut @10@10@10@10@10@10@10@10@10 not_0
lut 100010001100010001100010001 equal_0
```
The LUT output table is ordered by input combination from ---, --0, --1, and so on, up to 11-, 110, 111 (writing input trit 2 first; this counts from -13 to +13 when the three input trits are read as a balanced ternary number with input trit 0 as the least significant trit). Note that because we always have 27 values in the table, LUTs with 1 input trit will have their 3 values replicated 9 times, and LUTs with 2 input trits will have their 9 values replicated 3 times.
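As a sanity check on this ordering, here is a small Python sketch (the helper names are ours, not part of any Abra tooling) that computes the table index from three input trits and looks up the example tables above:

```python
NOT_0   = "@10@10@10@10@10@10@10@10@10"
EQUAL_0 = "100010001100010001100010001"

def lut_index(t0, t1, t2):
    # Input trit 0 varies fastest, so it acts as the least significant
    # balanced-ternary digit; this maps (-1,-1,-1)..(1,1,1) to 0..26.
    return (t0 + 1) + 3 * (t1 + 1) + 9 * (t2 + 1)

def lut_lookup(table, t0, t1, t2):
    # '@' marks an undefined combination, which yields null (None).
    ch = table[lut_index(t0, t1, t2)]
    return None if ch == "@" else {"-": -1, "0": 0, "1": 1}[ch]
```

For example, `lut_lookup(EQUAL_0, 1, 1, -1)` yields 1 regardless of the third trit, because the 2-input table has its 9 values replicated 3 times.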
```
// Branch definition block
branch <name>
[<site size>] in <site name>                  // 1 or more input sites
[<site size>] <site name> = <operation>       // 0 or more body sites
[<site size>] out <site name> = <operation>   // 1 or more output sites
[<site size>] latch <site name> = <operation> // 0 or more latch sites

// <name> is the name of the branch, often encoding a return
//        length and sometimes an offset or other value in the
//        name to differentiate between different versions.
// <site size> is the size of the site's trit vector.
//             This is only really necessary for input sites, but
//             it increases understanding of the code immensely.
// <site name> is the name of the site, usually P<index> where
//             <index> is the number of the site within the
//             branch, starting at zero. But it can also be
//             a parameter/variable name from the implementation.
// <operation> is a <lut knot>, <branch knot>, or <merge> operation.
//             Note how each operation employs different braces.
```
```
// LUT knot operation
<lut name> [ <inputs> ]

// Branch knot operation
<branch name> ( <inputs> )

// Merge operation
merge { <inputs> }

// <inputs> is a list of site names. We use names instead of indices
//          because that helps understanding the code better.
```
```
// Example:
branch cmp_3                            // compare two 3-trit vectors
[ 3] in lhs                             // 1st 3-trit vector
[ 3] in rhs                             // 2nd 3-trit vector
[ 1] P2 = slice_1_0(lhs)                // take 1st trit of 1st vector
[ 1] P3 = slice_1_0(rhs)                // take 1st trit of 2nd vector
[ 1] val0 = cmp_0[P2, P3]               // compare them
[ 1] P5 = slice_1_1(lhs)                // take 2nd trit of 1st vector
[ 1] P6 = slice_1_1(rhs)                // take 2nd trit of 2nd vector
[ 1] val1 = cmp_0[P5, P6]               // compare them
[ 1] P8 = slice_1_2(lhs)                // take 3rd trit of 1st vector
[ 1] P9 = slice_1_2(rhs)                // take 3rd trit of 2nd vector
[ 1] val2 = cmp_0[P8, P9]               // compare them
[ 1] out P11 = sign_0[val0, val1, val2] // output the comparison sign
```
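To make the example concrete, here is a rough Python rendering of what cmp_3 computes. The behaviour of the cmp_0 and sign_0 LUTs is our assumption, inferred from the comments (trit-wise comparison, with the last trit treated as most significant); the real tables are defined by the programmer:

```python
def cmp_trit(a, b):
    """Assumed cmp_0 LUT: -1 if a < b, 0 if equal, 1 if a > b."""
    return (a > b) - (a < b)

def sign_trit(v0, v1, v2):
    """Assumed sign_0 LUT: the most significant non-zero comparison
    trit wins (trit 2 is most significant)."""
    for v in (v2, v1, v0):
        if v != 0:
            return v
    return 0

def cmp_3(lhs, rhs):
    """Python rendering of the cmp_3 branch: compare trit-wise, then
    combine the three comparison trits into a single sign trit."""
    vals = [cmp_trit(a, b) for a, b in zip(lhs, rhs)]
    return sign_trit(*vals)
```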
The above description is deliberately kept concise. You will find that the Abra instructions are so straightforward that understanding what’s going on is a no-brainer.
Conclusion
The Abra specification defines dataflow while staying largely agnostic of implementation by using a very limited set of building block instructions. Abra supports a cooperative multitasking mechanism. This mechanism works in conjunction with the Qubic Supervisor, which we will discuss in a future article. We described a pseudo-code notation for Abra instructions that will help us get a feel for the Abra specification itself so that all parts will come together nicely. In the next part of this series we will delve deeper into the Qupla programming language, which implements the Abra specification.