One of the most beloved features of git is the ability to create a fork of a repository in which the history of past contributions is preserved. This is great because it allows new project maintainers to easily build alternative products and still give credit to old contributors. It also allows a forked project to merge back changes from the original repository (this is common for forks of the Linux kernel, for example). Forks in Ceramic streams could similarly allow communities to create alternative versions of documents without asking the original author for permission, opening up new ways to collaborate.
Ceramic doesn’t currently have any protocol-level features for forking a stream; instead, forking would need to be handled at the application layer. This approach has several drawbacks. When creating a fork, the application (and maybe even users) would need to keep track of and persist the stream that was forked. There is also no standard way of forking, which could result in several incompatible implementations.
What follows is a proposal for how to represent stream forks in the protocol.
Technical background
There are two main fields that give the protocol information about which stream is being interacted with. Both of these fields are required in Data Events, but not allowed in Init Events.
id
- contains the CID of the InitEvent of the given stream
prev
- contains the CID of the previous event in the given stream
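The current rule for these two fields can be sketched as a small check. This is only an illustration: the Event class and the classify function are hypothetical names, and CIDs are represented as plain strings rather than real CIDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event; CIDs are plain strings here."""
    id: Optional[str] = None    # CID of the InitEvent of the stream
    prev: Optional[str] = None  # CID of the previous event in the stream


def classify(event: Event) -> str:
    """Classify an event under the current rules (before forks exist)."""
    if event.id is None and event.prev is None:
        return "init"  # Init Events carry neither field
    if event.id is not None and event.prev is not None:
        return "data"  # Data Events require both fields
    raise ValueError("id and prev must be both present or both absent")


init = Event()
data = Event(id="cid-init", prev="cid-init")
```

Under these rules an event that carries only one of the two fields is rejected, which is the gap the proposal below makes use of.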
Introducing Fork Events
A ForkEvent is simply an InitEvent (that is, an event that creates a new stream) which contains a prev field. This field must be set to the CID of an event in the stream that is being forked. This could be the most recent event, or a historical event (if the fork is based on some past state of the stream). Note that the ForkEvent does not contain an id field.
The streamId of the newly created forked stream is based on the CID of the ForkEvent. To some extent the ForkEvent can be seen as an InitEvent for the fork. Any DataEvent that is added to the fork would therefore use the CID of the ForkEvent as its id.
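As a rough sketch of how these pieces fit together, the example below builds a ForkEvent and a first DataEvent on the fork. Several things here are assumptions made for illustration: events are plain dictionaries, the data payloads are made up, and fake_cid (a SHA-256 hash over canonical JSON) merely stands in for a real CID computation.

```python
import hashlib
import json


def fake_cid(event: dict) -> str:
    """Stand-in for a real CID: SHA-256 of canonical JSON (assumption)."""
    encoded = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def classify(event: dict) -> str:
    """Classify an event once ForkEvents are allowed."""
    has_id = "id" in event
    has_prev = "prev" in event
    if not has_id and not has_prev:
        return "init"
    if has_prev and not has_id:
        return "fork"  # an InitEvent that also points at a forked event
    if has_id and has_prev:
        return "data"
    raise ValueError("an event with id but no prev is not valid")


# InitEvent of the original stream.
original_init = {"data": {"title": "v1"}}
original_cid = fake_cid(original_init)

# ForkEvent: prev points into the original stream, and there is no id.
fork_event = {"prev": original_cid, "data": {"title": "forked"}}
fork_cid = fake_cid(fork_event)

# A DataEvent on the fork uses the ForkEvent's CID as its id.
fork_update = {"id": fork_cid, "prev": fork_cid, "data": {"title": "forked v2"}}
```

Because the ForkEvent has its own CID, the fork gets a stream identity distinct from the original, while its prev field still records where it branched off.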
Merging streams
If one or more forks of a stream have been created, the original author of the stream might be interested in merging the changes from one of the forks back into the original stream. This could be achieved simply by including the CID of an event from the fork in the prev field of a new DataEvent. Currently this approach would be problematic because it’s only possible to reference a single previous event. However, this is being addressed by CIP-145.
DataEvents currently always need to include some data. It could make sense to also introduce a separate MergeEvent that is the same as a DataEvent, but without a data field. However, retaining the ability to merge streams with a DataEvent probably also makes sense.
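To make the idea concrete, here is a minimal sketch of what a merge could look like. The list-valued prev field is purely an assumption for illustration and is not taken from CIP-145, whose actual format may differ. The make_merge and is_merge helpers are hypothetical as well.

```python
from typing import Any, Optional


def make_merge(stream_id: str, tip: str, fork_tip: str,
               data: Optional[Any] = None) -> dict:
    """Build an event on the original stream that also references the fork.

    Assumes prev may hold several CIDs (hypothetical shape). Without data
    this corresponds to the data-less MergeEvent idea; with data it is a
    DataEvent that merges and updates at the same time.
    """
    event = {"id": stream_id, "prev": [tip, fork_tip]}
    if data is not None:
        event["data"] = data
    return event


def is_merge(event: dict) -> bool:
    """An event is a merge if it references more than one previous event."""
    prev = event.get("prev")
    return isinstance(prev, list) and len(prev) > 1


merge_only = make_merge("cid-init", "cid-tip", "cid-fork-tip")
merge_with_data = make_merge("cid-init", "cid-tip", "cid-fork-tip",
                             data={"title": "merged"})
```

Both variants keep the id of the original stream, so the merge is recorded in the original stream rather than in the fork.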
Synchronizing forks
As with git, synchronizing a forked stream should not stop at the InitEvent of the fork, i.e. the ForkEvent. Rather, it should synchronize the full history back to the InitEvent of the stream that was originally forked. This means that a node that only cares about a specific fork also maintains the history of the stream that was forked, up to the point at which the fork happened.
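The traversal this implies can be sketched as a walk along prev links that does not stop at the ForkEvent. The in-memory store, the short CID strings, and the full_history function are assumptions for illustration, and each event is assumed to have at most one prev.

```python
def full_history(store: dict, tip: str) -> list:
    """Follow prev links from tip until an event without prev is reached.

    A ForkEvent has a prev, so the walk crosses from the fork into the
    original stream and ends at the original InitEvent.
    """
    history = []
    cid = tip
    while cid is not None:
        history.append(cid)
        cid = store[cid].get("prev")
    return history


# Hypothetical event store, keyed by CID.
store = {
    "A0": {},                          # InitEvent of the original stream
    "A1": {"id": "A0", "prev": "A0"},  # DataEvent, the fork point
    "A2": {"id": "A0", "prev": "A1"},  # added to the original after the fork
    "F0": {"prev": "A1"},              # ForkEvent
    "F1": {"id": "F0", "prev": "F0"},  # DataEvent on the fork
}

history = full_history(store, "F1")
print(history)  # ['F1', 'F0', 'A1', 'A0']
```

Note that A2 is not part of the result: events added to the original stream after the fork point are not needed to reconstruct the fork.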