Sharing a common question we (the Ceramic core developers) get asked a lot:
It seems that node operators can choose which Streams they want to maintain. There is no random sampling mechanism or incentives to make data more decentralized. Does this mean that nodes maintaining a particular stream can be malicious or take advantage of the system in some way?
It is true that each node operator chooses which Streams to pin/serve, and that means that in some cases there may be only a single node providing a given Stream to the network. It’s worth noting, however, that even if a given Stream is only being served by a single node, that node operator can’t actually change or lie about the contents of that Stream. This is because any other node that loads the Stream will validate the entire log, and every commit in the log must be signed by the Stream controller to be considered valid. So a node can never get other nodes to accept a state for a Stream that wasn’t at some point a valid state for that Stream, signed by the Stream controller.
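To make the validation argument concrete, here is a minimal sketch of the idea in Python. It is not Ceramic's actual implementation: the `Commit` fields, the helper names, and the hash-linking layout are hypothetical, and HMAC with a shared key stands in for the controller's public-key signature purely so the example runs with the standard library alone.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import List, Optional

# ASSUMPTION: HMAC-SHA256 with a shared key stands in for the Stream
# controller's public-key signature. A real verifier would hold only the
# controller's public key, never a secret.
CONTROLLER_KEY = b"demo-controller-key"


@dataclass(frozen=True)
class Commit:
    prev: Optional[str]  # id of the previous commit, None for the first commit
    data: str            # payload carried by this commit
    sig: str             # stand-in signature over (prev, data)


def payload(prev: Optional[str], data: str) -> bytes:
    return json.dumps({"prev": prev, "data": data}, sort_keys=True).encode()


def sign(prev: Optional[str], data: str, key: bytes = CONTROLLER_KEY) -> str:
    return hmac.new(key, payload(prev, data), hashlib.sha256).hexdigest()


def commit_id(c: Commit) -> str:
    return hashlib.sha256(payload(c.prev, c.data) + c.sig.encode()).hexdigest()


def append(log: List[Commit], data: str, key: bytes = CONTROLLER_KEY) -> List[Commit]:
    """Return a new log with one more commit signed with `key`."""
    prev = commit_id(log[-1]) if log else None
    return log + [Commit(prev, data, sign(prev, data, key))]


def validate_log(log: List[Commit]) -> bool:
    """Check that every commit links to its predecessor and carries a valid signature."""
    expected_prev = None
    for c in log:
        if c.prev != expected_prev:
            return False  # broken link: a commit was dropped, reordered, or swapped
        if not hmac.compare_digest(c.sig, sign(c.prev, c.data)):
            return False  # content not signed by the controller
        expected_prev = commit_id(c)
    return True


log = append(append(append([], "v1"), "v2"), "v3")
stale = log[:2]  # an older prefix of the same log
tampered = [log[0], Commit(log[1].prev, "forged", log[1].sig), log[2]]
forged = append(log[:2], "evil", key=b"node-operator-key")
```

In this sketch `validate_log(log)` and `validate_log(stale)` both pass, while `tampered` and `forged` are rejected. That mirrors the point above: a serving node can hand out an older prefix of the log, but it cannot alter or invent commits.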
What a malicious node operator could do is withhold commits, performing a kind of censorship attack where they return a valid past state of the Stream instead of the current state. This would allow the node to return valid but out-of-date data. When there is only one node serving a given Stream, this type of censorship attack is fairly trivial to carry out. As the number of nodes serving the Stream increases, however, it becomes exponentially more difficult to censor data, since every one of those nodes would have to withhold the same commits. There only needs to be one honest node connected to the network to provide the current state for any given Stream.
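The "one honest node is enough" argument can be sketched as follows. This is an illustrative model, not Ceramic's conflict resolution: it assumes every response has already passed validation and is a prefix of the same log, so the freshest state is simply the longest one. The function names are hypothetical, and the probability helper assumes each node withholds independently with the same probability, which is a simplification.

```python
from typing import List, Sequence


def freshest_log(responses: Sequence[List[str]]) -> List[str]:
    """Pick the longest log among responses.

    ASSUMPTION: every response is already validated and is a prefix of the
    same underlying log, so "longest" means "most up to date".
    """
    return max(responses, key=len, default=[])


def censorship_probability(p_withhold: float, n_nodes: int) -> float:
    """Chance that all n nodes withhold, assuming they act independently."""
    return p_withhold ** n_nodes


true_log = ["commit-1", "commit-2", "commit-3"]
censored = true_log[:2]  # a valid but out-of-date view of the Stream

single_node_view = freshest_log([censored])
multi_node_view = freshest_log([censored, censored, true_log])
all_four_withhold = censorship_probability(0.5, 4)
```

With a single censoring node the reader only sees two commits. As soon as one honest node is among the responders, the full log wins. Under the independence assumption, four nodes that each withhold with probability 0.5 all withhold together with probability 0.5^4 = 0.0625, which is where the exponential drop-off comes from.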
In the future there will be protocol-layer incentives for multiple node operators to store and serve data for Streams even if they don’t need that Stream data for their own application. This incentivized state availability mechanism will increase redundancy and decentralization, which in turn will improve availability and censorship resistance.
It is worth noting, however, that even without those economic incentives, the Streams that provide more popular and useful data are already likely to have multiple nodes pinning them. This is because each application that wants to use that Stream’s data has a natural incentive to pin that data on its own node. That means that even today, without protocol-level cryptoeconomic incentives, there are natural incentives that drive the most useful data to become the most decentralized and censorship-resistant.