Hi all, I want to start by saying I’m a novice Ceramic and ComposeDB user. I have about 2 days of experience under my belt so please bear with me. My original message is here in the discord channel: Discord
I’m building a metadata logging service aimed at the data science community, and ideally on ComposeDB. I love the GraphQL interface. I have some questions regarding using Ceramic in production that I believe an architecture diagram would clear up for me and be extremely valuable on the “Running in production” guide.
I’m following the “Running in production” guide and I noticed that AWS S3 storage is recommended as the datastore. I’m guessing this is only for disaster recovery, but it raises the question of how often data is actually landing on IPFS if I’m using Ceramic. Say one of my users writes a JSON object via GraphQL — does that data go straight to IPFS, or is there a delay? In keeping with the goal of decentralization, I don’t want to own a storage object with my users’ data, but I understand why it might be needed temporarily.
Thanks for your post, @block-ops.eth!
AWS S3 is optionally used as the IPFS backend (i.e. where IPFS data is actually stored), so data is always landing on IPFS; it’s just housed in S3 rather than on local disk.
You can change this via configuration and choose to store this data on a mounted filesystem instead.
Please note that user data must be stored somewhere. Even if data were stored on the filesystem instead, that filesystem is still a point of failure/control — whether it’s a mounted volume in a cloud provider like AWS or a hard disk on a server in a basement, each of these resources is controlled by someone.
Decentralization is achieved by having multiple interested parties run their own nodes, with more than one node “pinning” important data. That way, even if all but one node with that data pinned were to go down, the data would continue to be accessible.
In fact, your users can themselves run Ceramic nodes to pin their own data so that it is never lost.
Here’s a simplified architecture diagram of what a Ceramic node looks like.
Hope this made sense! Let me know if you have more questions.
Absolutely, thank you for this explanation!
I was considering hosting this service only in AWS and figuring out how to fund it later, but given your explanation, it sounds like it may actually be better if I offer up a docker-compose.yml that includes a Ceramic node, an IPFS daemon, and my application for the user to interact with. I think that serves not only my purpose but also adds more nodes to the Ceramic network, which would be helpful?
This is actually how I’m developing now. Is there any reason you can think of I wouldn’t be able to carry this setup through to mainnet?
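For anyone following along, a minimal sketch of what that docker-compose.yml could look like. The image names are the ones published under the `ceramicnetwork` org on Docker Hub, but the tags, environment variables, and port mappings here are assumptions — check the Ceramic docs for the configuration your versions expect:

```yaml
# Hypothetical docker-compose.yml sketch -- not a production config.
version: "3.8"
services:
  ipfs:
    image: ceramicnetwork/go-ipfs-daemon:latest
    ports:
      - "5001:5001"        # IPFS HTTP API
    volumes:
      - ipfs-data:/data/ipfs

  ceramic:
    image: ceramicnetwork/js-ceramic:latest
    depends_on:
      - ipfs
    ports:
      - "7007:7007"        # Ceramic HTTP API (default port)
    volumes:
      - ceramic-data:/root/.ceramic

  app:
    build: .               # your application image
    depends_on:
      - ceramic
    environment:
      - CERAMIC_URL=http://ceramic:7007   # assumed env var your app reads

volumes:
  ipfs-data:
  ceramic-data:
```

The named volumes are what give the node durable storage between restarts, which is the same role S3 plays in the hosted setup discussed above.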
Please note that eventually there will be an incentivized network and node operators will be required to fund their nodes through the network token.
Another thing to note is that having multiple instances of your application will not automatically sync relevant data across all of them. But if you bake indexing for the models you create into your setup, any running instance of your app should become aware of the same data.
Lastly, I would suggest looking at DIDSession and CACAO so that users will own their data and not the app itself, wherever it is running.
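To make the DIDSession suggestion concrete, here’s a sketch of browser-side auth following the `did-session` package docs. The package names and `DIDSession.authorize` flow are from those docs, but the `compose` client setup, the generated `definition` import path, and the wallet flow are assumptions for illustration:

```typescript
// Sketch: user-owned writes via did-session + CACAO (assumptions noted above).
import { DIDSession } from "did-session";
import { EthereumWebAuth, getAccountId } from "@didtools/pkh-ethereum";
import { ComposeClient } from "@composedb/client";
import { definition } from "./__generated__/definition.js"; // your compiled composite

const compose = new ComposeClient({ ceramic: "http://localhost:7007", definition });

// Ask the user's wallet for an account, then build an auth method from it.
const ethProvider = (window as any).ethereum;
const addresses = await ethProvider.request({ method: "eth_requestAccounts" });
const accountId = await getAccountId(ethProvider, addresses[0]);
const authMethod = await EthereumWebAuth.getAuthMethod(ethProvider, accountId);

// The session is a capability (CACAO) delegated from the user's did:pkh,
// scoped to this composite's resources -- so the user, not the app, owns the data.
const session = await DIDSession.authorize(authMethod, {
  resources: compose.resources,
});
compose.setDID(session.did);
```

The key point is that writes are signed by a session key the user authorized, regardless of where the app itself is running.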
That’s the goal! Do you have any examples or docs on baking in indexing? I’ll admit you lost me there.
Don’t think there are docs specifically about that, but if you look at the ComposeDB docs here, you should be able to set up indexing for your data models. Then, you could publish the model IDs in your README and/or add `composedb` CLI commands to your docker-compose script that automatically set up indexing for those IDs on the Ceramic node at startup.
I’m thinking aloud here, so some of these steps might be slightly different when you try them — don’t hesitate to ask if you run into any issues.
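As a rough sketch, a startup step like that could look something like the following. The `composite:create` and `composite:deploy` commands are from the `composedb` CLI, but the file paths, URL, and key variable here are assumptions — adjust to your project:

```
#!/bin/sh
# Hypothetical startup script for the app container.

export CERAMIC_URL=http://ceramic:7007

# Compile the GraphQL schema into a composite (can also be done at build time).
composedb composite:create schema.graphql \
  --output=composite.json \
  --did-private-key="$DID_PRIVATE_KEY"

# Deploy the composite to the node, which tells the node to index its models.
composedb composite:deploy composite.json \
  --ceramic-url="$CERAMIC_URL" \
  --did-private-key="$DID_PRIVATE_KEY"
```

Deploying the same composite against any node is what makes each instance index (and therefore see) the same models.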
Fantastic — you’ve been a lifesaver here, @mohsin!
You’re welcome, I’m excited to see what you build!!
Here’s another architecture diagram that’s a bit more detailed and shows the ComposeDB stack as well.
Ooo yes!! I forgot about that one! Thanks, @spencer.