CIP-124: Recon (tip synchronization protocol)

0xEE3CA4dd4CeB341691 · July 9, 2023, 1:42am

@AaronDGoldman Could you explain your usage of lalrpop-util in GitHub - 3box/rust-ceramic: Implementation of the Ceramic protocol in Rust? Are there any golang implementation you are aware of ?

AaronDGoldman · July 10, 2023, 5:16pm

Lalrpop is not required for implementing recon.
We use lalrpop to parse a DSL for tests.

cat: [b,c,d,e]
dog: [a,e]
-> (b, h(c,d), e) [a,b,e]
<- (a, 0, b, 0, e) [a,b,c,d,e]
-> (a, 0, b, 0, c, 0, d, 0, e) [a,b,c,d,e]
<- (a, h(b,c,d), e) [a,b,c,d,e]

This lets us make test vectors that are a little more human readable.
[a,b,c] for a node with the events a, b, and c
-> (message)[receiving node's post message events] a message going from one node to another
(a, h(b,c), d) a message with EventID a the hash of events from a to d and then EventID d
h(a, b) is the hash of a and b
0 is the hash of nothing aka the identity hash.

You could write a simple parser for this DSL and reuse the test strings or translate the test strings to structs in your langwidge for your tests.

{
  "cat": ["b","c","d","e"],
  "dog": ["a", "e"],
  "messages": [
    {"direction":"->", "keys":["b", "e"], "hashes":[["c","d"]], "state":["a","b","e"]},
    {"direction":"<-", "keys":["a", "b", "e"], "hashes":[0,0], "state":["a","b","c","d","e"]},
    {"direction":"->", "keys":["a","b","c","d","e"], "hashes":[0,0,0,0], "state":["a","b","c","d","e"]},
    {"direction":"<-", "keys":["a","e"], "hashes":[["b","c","d"]], "state":["a","b","c","d","e"]}
  ]
}

AaronDGoldman · July 13, 2023, 9:15pm

In an earlier version of the EventID we had the sort_key before the sort_value.
/network/key/value/ctrl/stream/event but it was cut out in an attempt to get the IDs shorter.
now as @jthor pointed out we could use the sha256(sort-key + sort-value) in the place we have sha256(sort-value) now. This would require the EventID::new() to take in the sort_key but that data should be available any time we are creating an EventID since otherwise we would not know which stream set to put the EventID in to.

This also will make the EventIDs unique as without the sort_key an event with the same value in two different headers would have the same EventID in all stream sets. With the sort_key in the EventID the same event will have a different EventID in each stream set.

I don’t see a drawback to including the sort_key with the sort_value in the hash. It will not change the use of these bytes as a sharding key but will enable checking witch stream set the EventID is from.

As for how do we decide if an EventID refers to a Stream, Controller, or Model; The EventID will be referring to an individual event. When we need to refer to a larger range of events we will do so with a pair of fenceposts. The fenceposts are like EventIDs but instead of a CID they will have a 0 byte to indicate that there is no event there.
For a Model (network/sort_value/00, network/sort_value/FF)
For a Controller (network/sort_value/controller/00, network/sort_value/controller/FF)
For a Stream (network/sort_value/controller/stream/00, network/sort_value/controller/stream/FF)

0xEE3CA4dd4CeB341691 · July 24, 2023, 2:34am

Another question. After syncing eventIDs via Recon, how do we expect sync the actual data? If we still rely on Bitswap, will it be slow?

AaronDGoldman · July 27, 2023, 9:04pm

In the short term we will use Bitswap to fetch the events once we get the CIDs from Recon. In the longer term we plan to add want list to the Recon messages since the node we got the EventID from is likely to have the event bytes also.

expede · April 15, 2024, 9:06pm

Hey folks @cole pointed me at this thread. I see that CAR Mirror is in some of the earlier research, but I’m actually here to shill a new paper that Philipp and I have been combing through recently. Your project may be past the research phase, but the paper is interesting at minimum

https://arxiv.org/html/2402.02668v2

It’s a pretty clever approach and is fairly close to the theoretical minimum communication cost.