RFC: Native support for single-write documents with field locking

rohhan · December 8, 2023, 8:09pm

Treat this as an opinionated draft proposing a solution to a problem. Please let us know your thoughts by providing comments below - any feedback is welcome!

Context and problem

Ceramic users cannot easily make statements about other data in Ceramic that they expect to be static. For example, if a user wanted to write a review (represented as a Ceramic stream) about a product (also represented as a Ceramic stream), they might assume that the product’s details (e.g., name, quantity, condition) would not change. Currently, however, a reviewer who references a stream has no guarantee that the product owner will not change it in the future, which could make the review inaccurate or obsolete.

More specifically, Ceramic is currently not natively optimized to allow application developers to easily query an older version (or the original version) of updated documents, so in the given example, it would be difficult for the Review to be tied to an older version of a Product.

Inefficient workarounds are possible but may be a frustrating experience since they require application developers to write custom logic.

A better solution would be to natively enable developers to allow their users to author document types for which the values should never change, regardless of how those immutable documents are referenced by other data points.

Goals

Provide a way to ensure that the relationship of one document referencing a second document does not ever become invalidated by an update to the second document
E.g., if a user leaves a review for a specific version of an audit, that review should not apply to a newer version of the audit

Solution

The proposed solution is a feature that enables developers to lock field values within a model document. Locked fields cannot be changed once they are defined during the creation of the model instance document.

For example:

Auditors cannot update an Audit version within an existing stream to inherit earlier reviews.
Auditors must publish a new stream with a new Audit which must gather new reviews.
Each stream document represents a version of an Audit

Code sample: Defining a locked field during schema modeling (eventual syntax may vary)

## merely an example below - actual values likely would include a proof

type SoftwareSecurityAudit
  @createModel(
    accountRelation: LIST,
    description: "A software security audit"
  )
  @createIndex(fields: [{ path: "issuanceDate" }])
  @createIndex(fields: [{ path: "approval" }])
  @createIndex(fields: [{ path: "softwareItemLocation" }])
{
  controller: DID! @documentAccount
  # @locking is the proposed directive; like other directives it follows the field type
  issuanceDate: DateTime! @locking
  approval: Boolean! @locking
  softwareItemLocation: CID! @locking
}
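The rule this schema expresses could be enforced at update time roughly as follows. This is a minimal Python sketch under stated assumptions, not Ceramic's validation code: `LOCKED_FIELDS`, `LockedFieldError` and `validate_update` are hypothetical names, documents are modelled as plain dictionaries, and `notes` is a hypothetical unlocked field added only to show an update that is still allowed.

```python
# Hypothetical sketch: reject updates that change a locked field.
# LOCKED_FIELDS mirrors the @locking fields in the sample schema above.
LOCKED_FIELDS = frozenset({"issuanceDate", "approval", "softwareItemLocation"})

_MISSING = object()  # sentinel so "absent" and "None" are told apart


class LockedFieldError(ValueError):
    """Raised when an update tries to change or remove a locked field."""


def validate_update(current: dict, proposed: dict, locked=LOCKED_FIELDS) -> dict:
    """Return `proposed` if every locked field is identical to `current`."""
    for name in sorted(locked):
        before = current.get(name, _MISSING)
        after = proposed.get(name, _MISSING)
        if after != before:
            raise LockedFieldError(f"locked field {name!r} cannot change")
    return proposed


audit = {
    "issuanceDate": "2023-12-08T20:09:00Z",
    "approval": True,
    "softwareItemLocation": "<cid-of-audited-software>",  # placeholder value
    "notes": "initial report",  # hypothetical unlocked field
}

# Changing only the unlocked field is accepted.
validate_update(audit, {**audit, "notes": "patched in follow-up"})

# Changing a locked field is rejected.
try:
    validate_update(audit, {**audit, "approval": False})
except LockedFieldError as err:
    print(err)
```

Comparing against the stored state rather than the genesis commit is enough here, because a locked field can never have diverged from its original value in the first place.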

Risks

This solution does introduce some potential risks. Notably, if a user is tricked into creating a document through phishing, a hack or anything else, they would have no recourse to change it. Similarly, organizations that delegate write access to their members (e.g., using CACAO) are exposed to this risk at a larger scale, since any member’s “mistake” would be unrecoverable for the entire organization.

Requirements

Feature	Details	Priority	Open questions
Ability to lock one or more fields	Ability to (optionally) define new schemas with locked fields, e.g., an Audit schema	Must have	Should a model be locked at the global level or at the individual field level?
By the controller	Only the controller can lock fields and only at the time of schema creation	Must have
Irreversible locking	Locked fields can never be updated once the model instance has been created and the fields are first defined. E.g., the Product that an Audit points to (and possibly also all the values of the Audit) is locked	Must have
Compatible with existing features and other new features (e.g., SET)	Further exploration is required to see how field locking interacts with newly proposed relationship types.	Must have	How does field locking interact with the newly proposed SET feature?

m0ar · December 19, 2023, 11:00am

Thank you for creating this RFC!

This is something I feel very strongly about, because its absence implies mutable relations, which are quite tricky to deal with when building Codex.

This makes sense, though I think about it mostly in the other direction. Going with the example: if there is a product, I should always be able to find the published security audits referring to it. It does not make sense to change the target of the audit retroactively, and removing the relation is endangering the transparency of the audit history.

For me, a big part of Ceramic's appeal is the fact that we can use persistent references to mutable documents, because we can use that to build extremely powerful graphs. In some cases, like a social network, it makes sense for the structural integrity to be mutable (i.e., mutable edges). Example: a user unfollows a page.

In others, like tracking security audits, you need to be able to depend on the edges staying where they were defined (i.e., immutable edges). By indexing audits and products, I should be able to know that I’m seeing all audits that have been created about a particular product. This is easy if the reference is static: just ask the indexer for audit instances with xyz as product reference.

If they are not static, I need to sift through the entire history of every single audit instance to see whether it was, at some earlier point in time, made against this particular product. The search complexity goes from more or less constant to linear at best, over every single MID following the audit schema. This is a practical example of the workaround mentioned in the post: it becomes wildly inefficient, perhaps even intractable at some point, to find these historical relations. I feel that “write custom logic” is underestimating the effort a little bit.
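The cost difference described above can be sketched in a few lines of Python. Everything here is hypothetical (the `Stream` shape, the `product` field, the in-memory index) and stands in for a real indexer. It only illustrates why a locked reference allows a lookup on current state, while a mutable one forces a scan over every snapshot of every audit stream.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical audit stream: an ordered list of content snapshots."""
    stream_id: str
    history: list = field(default_factory=list)  # oldest snapshot first

    @property
    def current(self) -> dict:
        return self.history[-1]


def build_index(streams):
    """Index audits by the product referenced in their *current* state."""
    index = defaultdict(list)
    for stream in streams:
        index[stream.current["product"]].append(stream.stream_id)
    return index


def audits_ever_referencing(streams, product_id):
    """Fallback for mutable references: inspect every snapshot of every stream."""
    return [
        stream.stream_id
        for stream in streams
        if any(snap["product"] == product_id for snap in stream.history)
    ]


streams = [
    # This audit originally pointed at "xyz" and was later re-targeted.
    Stream("audit-1", [{"product": "xyz"}, {"product": "abc"}]),
    Stream("audit-2", [{"product": "xyz"}]),
]

index = build_index(streams)
print(index["xyz"])                             # misses audit-1
print(audits_ever_referencing(streams, "xyz"))  # finds both, at full-scan cost
```

With a locked `product` field the two results would always agree, so the cheap index lookup alone would be sufficient.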

Drilling into this a bit deeper: for this to truly hold, it must also be impossible to nullify the state of a stream if the model has any @locking fields. Otherwise, one can’t depend on the reference. This would be a simple rule in the commit data validation, as the schema is already in scope. Should this non-nullification be a separate requirement here perhaps?
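Such a rule might look like the following Python sketch. It assumes that "nullifying" means replacing the stream's content with an empty or null state (or nulling out a locked field), and that the validator can see which fields the model marks as locked. `validate_commit` and its arguments are hypothetical, not an existing Ceramic API.

```python
def validate_commit(new_content, locked_fields):
    """Reject commits that would null out a stream whose model has locked fields.

    new_content   -- the proposed stream content (dict) or None
    locked_fields -- names of the fields the model marks as locked
    """
    if not locked_fields:
        return  # no locked fields, so this rule has nothing to protect
    if not new_content:
        raise ValueError("a stream with locked fields cannot be nullified")
    missing = [name for name in sorted(locked_fields)
               if new_content.get(name) is None]
    if missing:
        raise ValueError(f"locked fields cannot be removed or nulled: {missing}")


locked = {"softwareItemLocation"}

# A normal commit that keeps the locked reference passes silently.
validate_commit({"softwareItemLocation": "<cid>", "approval": False}, locked)

# An empty state is rejected because the reference would disappear.
try:
    validate_commit({}, locked)
except ValueError as err:
    print(err)
```

The check uses `is None` rather than truthiness so that legitimate falsy values, such as `approval: false`, are not mistaken for a removed field.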

I feel this argument is a bit of a straw man; even without locked fields, the data is still there in the history of the stream. Let’s say you accidentally paste your private key or some other secret, I’d say that is absolutely still compromised even if you push an editing commit. I think any other PoV is soliciting a false sense of security, but maybe that’s just me.

We have clear-cut use cases where we need to lock references (i.e., edges) while still being able to mutate documents (i.e., nodes). Basically, by having static edges, we can welcome the auditor to update the report as new information and patches arrive. They could confirm their concerns have been addressed, for example. This series of events would be irrevocably found from the product given the static edge, hence we reach full transparency of the sequence of events.

rohhan · January 12, 2024, 7:02pm

Thank you for the thoughtful comments, m0ar! I’m excited to hear this is a feature you may find valuable.

Our team has reviewed your feedback, and it sounds like we’re generally aligned on this proposal.

it must also be impossible to nullify the state of a stream if the model has any @locking fields

Agreed that it should be impossible to nullify the state of these streams, at least at the low-level Ceramic network layer (indexers may have their own rules).

I feel this argument is a bit of a straw man

Fair point; I wanted to be exhaustive, but your perspective is valid!

We have clear-cut use cases where we need to lock references (i.e., edges) while still being able to mutate documents (i.e., nodes).

This is great feedback and a good argument for locking at the field level instead of at the document level.

Thanks again for your comments!