How does Compose DB run on a distributed/decentralized system?

Hi Ceramic Team,

This is a research question, and I would just like some high-level answers.

Since Ceramic runs on a blockchain, you would probably save a ComposeDB data model on multiple nodes. Would you shard it, or would each node hold a full copy of the entire data model?

If you do shard, how would you partition a graph database? Can app designers set the sharding rules? How much would it slow down queries?

Thank you,
Monica

Hey Monicaz! Tagging in a few very knowledgeable people to try & answer your question! @spencer @paul think you can weigh in on these questions?

Hey Monica, thanks for the great question.

Our architecture is actually different from a blockchain. Blockchains require global state sync for any new transaction - you need to know everyone’s account balances at each stage - which is great for financial transactions. Ceramic’s protocol is more of a “doc chain” - if a user posts a blog and then edits it, every update to the document is signed by the user’s DID, so it is verifiable and strictly ordered.

So the key scalability difference vs. a blockchain is that edits to one user’s document don’t require syncing globally with another user’s document. This allows for much greater horizontal scalability.
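
To make the “doc chain” idea concrete, here is a minimal sketch in Python. It is not Ceramic’s actual implementation or API: the names `DocumentLog` and `verify` are made up for illustration, and an HMAC with a shared secret stands in for a real DID signature. It only shows that each document carries its own hash-linked, signed history, so checking one document never needs another document’s state.

```python
import hashlib
import hmac
import json


def _digest(body: dict) -> str:
    """Deterministic SHA-256 hash of a commit body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class DocumentLog:
    """Hypothetical append-only log for ONE document.

    It holds no reference to any other document, so there is no global state.
    """

    def __init__(self, controller: str, secret: bytes):
        self.controller = controller  # e.g. the user's DID string
        self._secret = secret         # ASSUMPTION: stand-in for the DID's signing key
        self.commits = []

    def append(self, content: dict) -> dict:
        # Each commit points at the previous one, which gives a strict order.
        prev = self.commits[-1]["id"] if self.commits else None
        body = {"controller": self.controller, "prev": prev, "content": content}
        commit_id = _digest(body)
        # ASSUMPTION: HMAC replaces a real asymmetric DID signature here.
        sig = hmac.new(self._secret, commit_id.encode(), hashlib.sha256).hexdigest()
        commit = {**body, "id": commit_id, "sig": sig}
        self.commits.append(commit)
        return commit


def verify(commits: list, secret: bytes) -> bool:
    """Check hash links and signatures using only this document's own commits."""
    prev = None
    for c in commits:
        body = {"controller": c["controller"], "prev": c["prev"], "content": c["content"]}
        if c["prev"] != prev or _digest(body) != c["id"]:
            return False
        expected = hmac.new(secret, c["id"].encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, c["sig"]):
            return False
        prev = c["id"]
    return True


# Two users edit their own documents; neither log ever looks at the other.
alice_post = DocumentLog("did:example:alice", b"alice-key")
alice_post.append({"title": "My first blog"})
alice_post.append({"title": "My first blog (edited)"})

bob_post = DocumentLog("did:example:bob", b"bob-key")
bob_post.append({"title": "Bob's notes"})
```

Because `verify` takes a single document’s commits, adding more users adds more independent logs instead of more contention on one shared ledger.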

Tactically, you can run your app’s DB on a single node today.

Hi @monicaz,
To elaborate on what Avi said, ComposeDB organizes data into data models (e.g. “profile”, “blog post”, “blog post comment”, etc). An app that uses ComposeDB tells its node which models it cares about, and the node then syncs all the documents within those models, but no other data from the rest of the Ceramic network. That way the node has all the data it needs indexed locally to satisfy sophisticated GraphQL queries over the data set it cares about. The network as a whole stays easy to scale, because each node stores only the data it cares about rather than data for the entire network.
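
As a rough illustration of that model-based syncing, here is a toy sketch in Python. It is not the ComposeDB API: `Node`, `on_network_update`, and `query` are hypothetical names, and the model names are just the examples from above. It only shows a node keeping and indexing documents for the models it was told to care about and ignoring everything else.

```python
from collections import defaultdict


class Node:
    """Hypothetical node that indexes only the models its app asked for."""

    def __init__(self, models):
        self.models = set(models)       # models this node cares about
        self.index = defaultdict(list)  # local index: model name -> documents

    def on_network_update(self, doc: dict) -> bool:
        """Store the document if its model is of interest; otherwise ignore it."""
        if doc["model"] not in self.models:
            return False
        self.index[doc["model"]].append(doc)
        return True

    def query(self, model: str, **filters) -> list:
        """Answer a query from the local index only (a stand-in for GraphQL)."""
        return [
            d for d in self.index.get(model, [])
            if all(d.get(k) == v for k, v in filters.items())
        ]


# A blogging app's node follows two models and skips the rest of the network.
blog_node = Node(["BlogPost", "BlogPostComment"])

network_feed = [
    {"model": "BlogPost", "author": "alice", "title": "Hello"},
    {"model": "Profile", "author": "bob", "name": "Bob"},
    {"model": "BlogPostComment", "author": "bob", "text": "Nice post"},
    {"model": "BlogPost", "author": "bob", "title": "Notes"},
]
for doc in network_feed:
    blog_node.on_network_update(doc)

alice_posts = blog_node.query("BlogPost", author="alice")
```

In this sketch the “Profile” document is never stored, which is the point: storage and indexing cost on a node grows with the models it follows, not with the size of the whole network.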

Hope that helps answer your questions!
