Is it best to create ComposeDB models as small as possible?

dysbulic · September 12, 2022, 6:17pm

I was wondering about model definitions, specifically profiles since those are at the heart of MetaGame’s leaderboard.

Does it make the most sense to define a model as narrowly as possible, like just the username in a model and just the description, etc., then combine those parts together in a composite?

That would reduce the number of potentially extraneous fields that a consumer has to index compared to a monolithic model.

spencer · September 12, 2022, 9:52pm

This is a really good question and the answer unfortunately is “it depends”. There are tradeoffs between smaller vs larger Models. The advantage of smaller Models is they are more modular, easier to compose in different combinations, and allow apps to depend on exactly the data they need and no more. The primary disadvantage of smaller Models, however, is that data belonging to two different Models cannot be updated atomically.

For example, consider if you wanted to store a user’s address. One possibility is to have a single Model that stores the entire address, where documents within that Model would look something like:

{ 
  street: '123 main street',
  city: 'boston',
  state: 'MA'
}

Another possibility is to have 3 Models: an address_street, an address_city, and an address_state Model. So the first Model would have the data {street: '123 main street'}, the second would have {city: 'boston'}, and the third would have {state: 'MA'}. This would allow an app to index all the users who live in the state of Massachusetts without needing to also index what city they live in or what their street address is. The problem with this structure, however, would be that if the user were to move to a new address in a new state, the update to each of the 3 address documents in the 3 Models would happen independently - meaning they could be reordered, an app could learn about one but not the other two, or one of the writes could even fail entirely while the other two succeed. This could lead to inconsistent views about where the user actually lives.
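To make the failure mode concrete, here is a minimal in-memory sketch of the two layouts. It does not use the ComposeDB API at all: the `Write` type, the `fail` flag, and the two update functions are hypothetical names invented for illustration, and the second address is made-up example data.

```typescript
// Hypothetical in-memory stand-in for documents; not the ComposeDB API.
type Address = { street: string; city: string; state: string };

// One write per document. `fail` simulates a write that is rejected or lost.
type Write<T> = { value: T; fail?: boolean };

function applyWrite<T>(current: T, write: Write<T>): T {
  return write.fail ? current : write.value;
}

// Single Model: the whole address changes in one write, or not at all.
function updateSingleModel(current: Address, write: Write<Address>): Address {
  return applyWrite(current, write);
}

// Three Models: each field is its own document with its own independent write.
function updateSplitModels(
  current: Address,
  writes: { street: Write<string>; city: Write<string>; state: Write<string> }
): Address {
  return {
    street: applyWrite(current.street, writes.street),
    city: applyWrite(current.city, writes.city),
    state: applyWrite(current.state, writes.state),
  };
}

const before: Address = { street: "123 main street", city: "boston", state: "MA" };
const moved: Address = { street: "9 elm street", city: "portland", state: "ME" };

// A failed single-Model write leaves the old address fully intact.
const single = updateSingleModel(before, { value: moved, fail: true });

// Split Models: the city write fails while the other two succeed,
// so readers see a street and state from the new address with the old city.
const split = updateSplitModels(before, {
  street: { value: moved.street },
  city: { value: moved.city, fail: true },
  state: { value: moved.state },
});

console.log(single);
console.log(split);
```

In the split case the result mixes the new street and state with the old city, which is exactly the inconsistent view described above. The single-Model case can only ever show the old address or the new one.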

So the key tradeoff to consider when thinking about how narrowly or broadly to define a Data Model is around the need to update various pieces of the data atomically. If there is a strong relationship between two fields of the Model such that it’s important to be able to update both of those fields in a way that apps will only ever see the change to both fields or neither, then those fields should be in the same Model. If there are fields that don’t need the ability to be atomically updated in this way, and especially if some apps might care about some of those fields but not others, then splitting them up into multiple smaller Models will provide developers with more flexibility around which pieces of the data they want to index in their application.

dysbulic · September 12, 2022, 10:22pm

That makes sense. I was thinking that combining models into a composition would give me a unified interface for updating fields where the composition figured out where the aspects of my update need to be written.

I was also hoping it was atomic, but alas it is not so.

What are the effects of creating a composition of models? Am I able to write queries that cross model boundaries? It seems like for that I’d need to be able to “mount” models within a larger structure.

Is the definition language GraphQL Schema?

paul · September 13, 2022, 9:03am

A Composite represents a set of Models that the ComposeDB client can interact with.
Essentially, it automatically generates a GraphQL schema with mutations for each individual model, and the ability to query data from multiple models at once, but underneath it will likely result in multiple requests to the Ceramic node.

spencer · December 20, 2022, 9:33pm

One other small downside of smaller Models that I forgot to mention in my original post is that they might lead to slightly more complex application code, as each Model will correspond to a different object in the app code instead of a single unified object. For instance, if your app wanted a single concept of a “profile” but it was represented as 3 underlying ComposeDB models (ProfileName, ProfilePicture, and ProfileAddress, for example), each underlying Model would correspond to a separate object in the corresponding application code.

This could of course be mitigated with a simple profile abstraction that unifies all the smaller profile pieces. Such an abstraction, however, would need to think about how to handle updates to the profile that modify fields belonging to different underlying models, and the fact that those updates are not atomic and could fail independently.
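A rough sketch of what such an abstraction might look like follows. Everything here is an assumption for illustration: `Profile`, `Writers`, `UpdateResult`, and `updateProfile` are hypothetical names, and the per-Model writers are plain callbacks standing in for whatever mutation your app would actually issue, not ComposeDB calls. The point is only that the wrapper reports which pieces succeeded and which failed, rather than pretending the update was atomic.

```typescript
// Hypothetical shapes and names for illustration; none of this is ComposeDB API.
type Profile = { name: string; picture: string; address: string };
type Field = keyof Profile;

// One writer per underlying Model (ProfileName, ProfilePicture, ProfileAddress).
// In a real app each would issue its own mutation; here they are plain callbacks.
type Writers = { [K in Field]: (value: string) => Promise<void> };

type UpdateResult = { succeeded: Field[]; failed: Field[] };

async function updateProfile(
  writers: Writers,
  changes: Partial<Profile>
): Promise<UpdateResult> {
  const result: UpdateResult = { succeeded: [], failed: [] };
  const fields: Field[] = ["name", "picture", "address"];
  for (const field of fields) {
    const value = changes[field];
    if (value === undefined) continue; // field is not part of this update
    try {
      await writers[field](value);
      result.succeeded.push(field);
    } catch (_err) {
      // Earlier writes are not rolled back; report this one so the caller can retry.
      result.failed.push(field);
    }
  }
  return result;
}

// Usage with an in-memory store, where the picture write is made to fail.
const store: Profile = { name: "old", picture: "old.png", address: "old address" };
const writers: Writers = {
  name: async (v) => { store.name = v; },
  picture: async () => { throw new Error("simulated write failure"); },
  address: async (v) => { store.address = v; },
};

const pending = updateProfile(writers, { name: "new", picture: "new.png" });
pending.then((r) => console.log(r, store));
```

Returning the failed fields leaves the retry or user-facing error policy to the caller. Note that nothing is rolled back: after this call the name has changed while the picture has not, so the app still has to tolerate a partially updated profile.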

FYI @dysbulic