Ever since the release of our open source single binary “dfuse for EOSIO”, we have received many requests to explain the foundational architecture that it encompasses. In this video, Alex continues his review of bstream – the block stream, which is the low-level component of dfuse where all information passes through protobuf definitions to build dfuse block objects to be consumed.

 

Transcript:

What’s up everybody, welcome to another episode of the bstream machinery. In this video we’re gonna cover the forkable and the forkdb implementation, which is sort of a crux of the platform. This is low-level. If you want to, feel free to skip ahead to the next, more high-level videos.

This is about the internals of the dfuse architecture, and I’m gonna start right away with looking at the bstream repository, where we see that there are a few handlers, forkable notably, and in there lives the forkdb implementation.

Now, the forkdb implementation is something you also have natively in nodeos, but this one is out-of-band, and it allows us to support all the different forks in the same sort of process, and navigate forks in and out. Let me explain what a forking situation looks like. Ok, so we received blocks 7 and 8. And then we receive 9a and 9b. This is crap. Oh magic! Okay, now all of a sudden we receive 11b, and we don’t know about that one. But 11b says it’s based on 10b, and I will also receive 10b and 9b at the same time.

Okay, this just happened and this is now the longest chain. So what do we do? We need to undo this one. That’s what nodeos does in memory: we need to undo this one and undo this one, and then apply that one and that one. So therefore we’re now on the consensus truth. Right? Everyone now agrees: if this is the last thing you’ve seen, you should be there, because of the longest chain algorithm for consensus. But now in our forkdb implementation, we have all of these, and each one always has a pointer, right, pointing to the previous one.
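
The undo/apply walk described above can be sketched in a few lines of Go. This is only an illustration, not the bstream code: the `links` map, the `chainSwitch` function and the block IDs (loosely following the drawing) are all assumptions made for the example.

```go
package main

import "fmt"

// links maps a block ID to its previous block ID.
// The IDs are hypothetical and loosely follow the drawing in the video.
var links = map[string]string{
	"8":   "7",
	"9a":  "8",
	"9b":  "8",
	"10b": "9b",
	"11b": "10b",
}

// chainSwitch returns the blocks to undo (newest first) and the blocks to
// apply (oldest first) when the head moves from oldHead to newHead.
func chainSwitch(links map[string]string, oldHead, newHead string) (undo, redo []string) {
	// Collect every known ancestor of the new head, including itself.
	onNew := map[string]bool{}
	for id, ok := newHead, true; ok; id, ok = links[id] {
		onNew[id] = true
	}
	// Walk back from the old head until we land on the new chain.
	common := oldHead
	for !onNew[common] {
		undo = append(undo, common)
		prev, ok := links[common]
		if !ok {
			return nil, nil // the two heads share no known ancestor
		}
		common = prev
	}
	// Walk back from the new head to the common ancestor, oldest first.
	for id := newHead; id != common; id = links[id] {
		redo = append([]string{id}, redo...)
	}
	return undo, redo
}

func main() {
	undo, redo := chainSwitch(links, "9a", "11b")
	fmt.Println("undo:", undo) // undo: [9a]
	fmt.Println("redo:", redo) // redo: [9b 10b 11b]
}
```

Keeping only the previous-block pointers is enough here: the common ancestor falls out of walking both heads backwards.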

We have that, and let me show you here. The forkdb has all these links – the block_id and the previous block_id. It has a map of all these objects, which is really a block_id in there. And you can ask to AddLink in there to add a new one: when you receive a new block like that, or this one or that one, you add that link. It’s also a way to do deduplication. When you receive a block you’ve already seen (let’s imagine you have two mindreaders and you’re receiving the same block), if you added it once, you don’t need to add it twice, so forkdb also acts as a deduplicator.
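
As a rough sketch of the deduplication idea, assuming a much simpler structure than the real forkdb: the `miniForkDB` type and the `AddLink` signature below are hypothetical and chosen only for this example.

```go
package main

import "fmt"

// miniForkDB is a hypothetical, simplified stand-in for the forkdb:
// it only stores block_id -> previous block_id links.
type miniForkDB struct {
	links map[string]string
}

func newMiniForkDB() *miniForkDB {
	return &miniForkDB{links: map[string]string{}}
}

// AddLink records a block and its parent. It reports whether the block
// was already known, which is what makes deduplication possible.
func (f *miniForkDB) AddLink(blockID, previousID string) (exists bool) {
	if _, found := f.links[blockID]; found {
		return true
	}
	f.links[blockID] = previousID
	return false
}

func main() {
	db := newMiniForkDB()
	// Two sources (say, two mindreaders) deliver overlapping blocks.
	incoming := [][2]string{{"8", "7"}, {"9a", "8"}, {"8", "7"}, {"9b", "8"}, {"9a", "8"}}
	for _, blk := range incoming {
		if db.AddLink(blk[0], blk[1]) {
			fmt.Println("skip duplicate", blk[0])
			continue
		}
		fmt.Println("new block", blk[0], "on top of", blk[1])
	}
}
```

Because the map is keyed by block ID, both forks at height 9 are kept, while a second copy of the same block is dropped.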

But you can also ask questions, like asking for a segment: “Give me the reversible segment up to block X.” In this case, you’re gonna ask, “Ok, I’m at that block, what’s the reversible segment?” And the forkdb is also aware of the last irreversible block. So when this says “Ok, my irreversible block is now 8,” it’s gonna understand that this segment is now irreversible and can be piped out. So the forkable is gonna do that. But the forkdb allows you to query “What if this is now irreversible?” And it queries that.
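
The two queries above can be sketched as follows. This is a minimal illustration under assumptions: the `forkDB` struct, its `libID` field and the `ReversibleSegment` and `MoveLIB` methods are names picked for this example and are not claimed to match the bstream API.

```go
package main

import "fmt"

// forkDB is a simplified, hypothetical model for this sketch.
type forkDB struct {
	links map[string]string // block_id -> previous block_id
	libID string            // last irreversible block known so far
}

// ReversibleSegment walks back from upToID to the last irreversible block
// and returns the blocks in between, oldest first. It returns nil when
// upToID does not connect to the last irreversible block.
func (f *forkDB) ReversibleSegment(upToID string) []string {
	var seg []string
	for id := upToID; id != f.libID; {
		prev, ok := f.links[id]
		if !ok {
			return nil
		}
		seg = append([]string{id}, seg...)
		id = prev
	}
	return seg
}

// MoveLIB marks newLIB as irreversible and returns the blocks that just
// became irreversible, oldest first.
func (f *forkDB) MoveLIB(newLIB string) []string {
	seg := f.ReversibleSegment(newLIB)
	if seg != nil {
		f.libID = newLIB
	}
	return seg
}

func main() {
	db := &forkDB{
		links: map[string]string{"8": "7", "9a": "8", "9b": "8", "10b": "9b", "11b": "10b"},
		libID: "7",
	}
	fmt.Println(db.ReversibleSegment("11b")) // [8 9b 10b 11b]
	fmt.Println(db.MoveLIB("8"))             // [8]
	fmt.Println(db.ReversibleSegment("11b")) // [9b 10b 11b]
}
```

Once block 8 is declared irreversible, it leaves the reversible segment and is the part that can be piped out.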

So that’s the abstraction of the forkdb. Then there’s the forkable. forkable is a stream processor, right? Remember the interface we’ve shown in the previous video? Ok, so here I’m in forkable.go, and in there we have the forkable, which is a handler. It receives a sub-handler as a parameter, and it is also itself a handler; remember the interface we talked about yesterday. Well, it’s gonna process things, and what it will do is receive all the blocks on one end (imagine it’s receiving that from relayers or mindreaders) and recompute: “What’s the longest chain? What do I need to do from our internal process to undo the things that need to be undone if I suddenly discover, for example, that we’re triggering a new longest chain?” It queries the forkdb underneath. It’s gonna say, “Oh, this block brings me 11b. This triggers the longest chain. This was the longest previously,” so I’m gonna go and check the undo and redo segments.

If it triggers the longest chain, I’m gonna check for the undo/redo segments, and we’re gonna play them. If we’re asked to do the undos here, we’re gonna go and call the sub-handler. This is going to call the sub-handler, p.handler.ProcessBlock, and it’s going to give sufficient information: the step we’re in (whether it’s an undo step), a reference to the forkdb if you need that, the previous object, and how many of those are going to go through in the loop, so that the sub-handler, as you’ll see, will be able to take a decision. Do I remove a thing from the database if it’s an undo signal I’m receiving? If it’s because the block just passed irreversibility, forkable will also send you a copy of the block, a reference to the block, with the signal saying step=irreversible, so that you know that this just passed irreversibility. Let me update this database. Let me do that. Let me revert, whatever, right?
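
To make the sub-handler side concrete, here is a minimal sketch of a handler that reacts to those step signals. Everything in it is an assumption for illustration: the `Step` values, the `Block` struct, the simplified `Handler` interface and the `accountIndex` type are not the bstream types.

```go
package main

import "fmt"

// Step names are assumptions for this sketch.
type Step string

const (
	StepNew          Step = "new"
	StepUndo         Step = "undo"
	StepRedo         Step = "redo"
	StepIrreversible Step = "irreversible"
)

// Block and Handler are simplified stand-ins, not the bstream types.
type Block struct {
	ID      string
	Account string // pretend each block creates one account
}

type Handler interface {
	ProcessBlock(blk Block, step Step) error
}

// accountIndex is a sub-handler keeping a tiny in-memory "database".
type accountIndex struct {
	pending map[string]string // block ID -> account, still reversible
	final   []string          // accounts from irreversible blocks
}

func (a *accountIndex) ProcessBlock(blk Block, step Step) error {
	switch step {
	case StepNew, StepRedo:
		a.pending[blk.ID] = blk.Account
	case StepUndo:
		delete(a.pending, blk.ID) // the block was forked out
	case StepIrreversible:
		if acct, ok := a.pending[blk.ID]; ok {
			a.final = append(a.final, acct)
			delete(a.pending, blk.ID)
		}
	default:
		return fmt.Errorf("unknown step %q", step)
	}
	return nil
}

func main() {
	idx := &accountIndex{pending: map[string]string{}}
	var h Handler = idx
	signals := []struct {
		blk  Block
		step Step
	}{
		{Block{"8", "alice"}, StepNew},
		{Block{"9a", "bob"}, StepNew},
		{Block{"9a", "bob"}, StepUndo},
		{Block{"9b", "carol"}, StepNew},
		{Block{"8", "alice"}, StepIrreversible},
	}
	for _, s := range signals {
		if err := h.ProcessBlock(s.blk, s.step); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("pending:", idx.pending) // pending: map[9b:carol]
	fmt.Println("final:", idx.final)     // final: [alice]
}
```

The sub-handler never computes forks itself; it only reacts to the step attached to each block it is handed.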

So the forkable transforms the incoming block, gives it a lot of knowledge, and navigates things for you, sending the blocks again if they go through those lifecycle events.

So again, a low-level building block, very useful, used throughout for deduplication and for forwarding, and it’s very customizable. That’s how we do it when, say, you ask for IrreversibleOnly: we just configure the forkable, and all of a sudden that forkable does not send undo/redo signals. It waits until blocks pass irreversibility and will send you, like, 12 blocks at a time, so you can have it faster and do whatever you want, index or whatever, downstream.
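
One way to picture that kind of configuration is a step mask applied in front of the sub-handler. The sketch below is an assumption about how such a filter could look, not the forkable's actual options: `StepType`, `HandlerFunc` and `withFilter` are hypothetical names.

```go
package main

import "fmt"

// StepType uses one bit per step so several steps can be combined in a mask.
type StepType int

const (
	StepNew StepType = 1 << iota
	StepUndo
	StepRedo
	StepIrreversible
)

// HandlerFunc is a simplified, hypothetical sub-handler signature.
type HandlerFunc func(blockID string, step StepType)

// withFilter wraps next so that only steps present in mask get through.
func withFilter(mask StepType, next HandlerFunc) HandlerFunc {
	return func(blockID string, step StepType) {
		if step&mask != 0 {
			next(blockID, step)
		}
	}
}

func main() {
	var seen []string
	irreversibleOnly := withFilter(StepIrreversible, func(id string, _ StepType) {
		seen = append(seen, id)
	})
	irreversibleOnly("9a", StepNew)
	irreversibleOnly("9a", StepUndo)
	irreversibleOnly("9b", StepNew)
	irreversibleOnly("8", StepIrreversible)
	fmt.Println(seen) // [8]
}
```

With a mask such as `StepNew|StepUndo|StepRedo`, the same wrapper would instead pass through the fork-aware signals.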

So this is the last one. There’s forkable and forkdb that we have in bstream, a basic element. I hope you enjoyed, and I hope that it’s clear. If you have any questions, go to the Telegram channel and let us know, or ask here in the comments, and don’t forget to star that repo. Hey, see you next time. Thanks for watching.