Smart account support

jthor · July 29, 2024, 12:57pm

Smart accounts are on the rise. The Ethereum community intends to fully replace EOAs (externally owned accounts) with smart accounts (smart-contract-based wallets) over time, so this trend is likely to continue. Unfortunately, Ceramic doesn’t support smart accounts yet, and adding support is not entirely trivial. This post describes how it can be done.

The problem

Today Ceramic enables users to delegate write permissions to a session key using SIWE. The protocol can simply verify the SIWE signature (an ECDSA signature over the secp256k1 curve), which is straightforward. Smart contracts, however, have no private key and so cannot create signatures the way an EOA can. This problem has been largely mitigated in the Ethereum community through ERC-1271: Standard Signature Validation Method for Contracts, which defines a standard function on contracts that accepts a message hash and proof bytes. It is then up to each individual smart account implementation to choose how to represent a signature.

Validating these signatures can be done off-chain using the eth_call RPC method, which emulates an EVM function call without sending an on-chain transaction. In theory we can sign a SIWE message, store the ERC-1271 proof in a CACAO, then just do the eth_call every time we need to validate it. There are, however, a few problems with this:

1. The state of the contract might change over time. This means that a proof that was valid at one point in time could become invalid later.
2. Making a call to an external resource (an Ethereum RPC endpoint) on every signature validation is expensive and likely slow.
3. How do we support multiple chains? Currently Ceramic only assumes that you have an Ethereum RPC endpoint configured, but what if we want to support smart accounts across multiple different L2s and chains?
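
To make the eth_call approach concrete, here is a rough sketch of how the request for an ERC-1271 check could be built by hand. The selector and magic value 0x1626ba7e come from ERC-1271; the helper names (encodeIsValidSignature, buildEthCall, isMagicValue) are made up for illustration and are not part of Ceramic or any library.

```typescript
// ERC-1271: isValidSignature(bytes32 hash, bytes signature) returns (bytes4).
// The function selector and the "valid" magic value are both 0x1626ba7e.
const ERC1271_MAGIC = "1626ba7e";

function strip0x(hex: string): string {
  return hex.slice(0, 2) === "0x" ? hex.slice(2) : hex;
}

// Left-pad a hex string with zeros up to `len` characters.
function padLeft(hex: string, len: number): string {
  let out = hex;
  while (out.length < len) out = "0" + out;
  return out;
}

// Right-pad a hex string with zeros up to `len` characters.
function padRight(hex: string, len: number): string {
  let out = hex;
  while (out.length < len) out = out + "0";
  return out;
}

// ABI-encode the calldata for isValidSignature(hash, signature).
function encodeIsValidSignature(hash: string, signature: string): string {
  const h = strip0x(hash);
  const sig = strip0x(signature);
  if (h.length !== 64) throw new Error("hash must be 32 bytes");
  if (sig.length % 2 !== 0) throw new Error("signature must be whole bytes");
  const byteLength = sig.length / 2;
  return (
    "0x" +
    ERC1271_MAGIC +
    h + // bytes32 hash
    padLeft("40", 64) + // offset of the dynamic `bytes` argument (two head words)
    padLeft(byteLength.toString(16), 64) + // length of the signature in bytes
    padRight(sig, Math.ceil(sig.length / 64) * 64) // data, padded to 32-byte words
  );
}

// Build the JSON-RPC request body for an eth_call at a given block tag.
function buildEthCall(account: string, hash: string, signature: string, block: string = "latest") {
  return {
    jsonrpc: "2.0",
    id: 1,
    method: "eth_call",
    params: [{ to: account, data: encodeIsValidSignature(hash, signature) }, block],
  };
}

// The contract returns a bytes4, left-aligned in a 32-byte word.
function isMagicValue(result: string): boolean {
  return strip0x(result).slice(0, 8).toLowerCase() === ERC1271_MAGIC;
}
```

The body returned by buildEthCall would be POSTed to whichever RPC endpoint the node has configured, and the result field of the response passed to isMagicValue. Passing a block number instead of "latest" pins the check to a specific state, which is what the first problem in the list is about.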

A potential solution

Unchanging proofs

The first problem (1) is largely solved. It involves generating something called a “chainproof”, which makes it possible to verify an eth_call at a particular block height. This works by generating a Merkle proof of all the EVM storage slots required by the EVM function call to the ERC-1271 verification. You can use this data to verify locally that, given a particular blockhash, the ERC-1271 signature is valid. However, you still need to call the Ethereum RPC to ensure that the blockhash is part of the blockchain.
We can then simply encode all of the chainproof data inside a CACAO.

Async eth rpc calling

Since chainproofs can already be validated locally, half of problem (2) is solved. We just need a way to check that the blockhash of the chainproof is part of the blockchain and to find the timestamp of that block. Fortunately we can do this asynchronously. We can add a background process that loads the Ethereum header chain into a database, and whenever we need to validate a blockhash while verifying a signature, we can do so against the local DB.
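
As a minimal sketch of the local header database idea, assuming an in-memory store and made-up names (Header, HeaderStore, append, timestampOf), the two halves could look like this. It deliberately skips reorg handling and does not recompute the header hash, both of which a real implementation would need.

```typescript
// Only the header fields needed here; the field names are assumptions.
interface Header {
  hash: string;
  parentHash: string;
  number: number;
  timestamp: number; // seconds since the Unix epoch
}

class HeaderStore {
  // Keys are 0x-prefixed block hashes.
  private byHash: Record<string, Header | undefined> = {};
  private head: Header | null = null;

  // Called by the background sync process for each new header, in order.
  // The very first header is accepted as a trusted starting point.
  append(header: Header): void {
    if (this.head !== null) {
      const extendsHead =
        header.parentHash === this.head.hash && header.number === this.head.number + 1;
      if (!extendsHead) throw new Error("header does not extend the stored chain");
    }
    this.byHash[header.hash] = header;
    this.head = header;
  }

  // Called during signature verification; needs no network access.
  // Returns null when the block hash is not in the stored chain.
  timestampOf(blockHash: string): number | null {
    const header = this.byHash[blockHash];
    return header === undefined ? null : header.timestamp;
  }
}
```

Verification then only reads from the store, so a slow or unavailable RPC endpoint delays the sync process rather than every signature check.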

It’s worth noting that we already have this problem today in the protocol when we are verifying TimeEvents. Unfortunately it can’t be solved as easily there because, unlike chainproofs, TimeEvents don’t encode information about which block they are in. See the appendix for a discussion of how this could be resolved.

Multi-chain support

This (3) is the most challenging aspect of smart account support. The naive solution would be to require nodes to have one RPC endpoint for every supported chain. However, this puts an unreasonable burden on node operators, as they would need to cover the cost of all of these RPC endpoints as well as the overhead of storing the header chain of every one of these chains.

A better idea would be to let node operators choose which chains they’d like to support. The problem with this is that it becomes hard to communicate with other nodes that may support other chains. In theory this could be solved in Recon by allowing nodes to specify which chains they support as part of the interest registration, but it’s unclear exactly how efficient this would be.

Another way to approach this issue is to only support Ethereum L1 and its L2s. In this case it should be possible to use only the L1 Ethereum RPC and to store only the header chain of the L1. L2s periodically submit data and state roots to the L1. L2Beat has stats on this. Both Arbitrum- and Optimism-based chains submit data within 1 to 5 minutes. State roots (which we would need) are submitted with some delay (it looks like L2Beat is about to add this data). We could in theory extend chainproofs to include the state root proof of an L2 in addition to the ERC-1271 proof.

These two solutions are both viable but offer different tradeoffs. The first (letting node operators choose which chains to support) is much easier to implement but requires extra work on the Recon layer. The second (relying only on the L1) is more architecturally pure but also quite complex to implement. It would also restrict us to supporting only Ethereum L2s.

Appendix: Async TimeEvent validation

Currently the caip168 proof of a TimeEvent contains the following information:

{
  root: CID(bafyreiaxdhr5jbabn7enmbb5wpmnm3dwpncbqjcnoneoj22f3sii5vfg74)
  chainID: "eip155:1"
  txHash: CID(bagjqcgzanbud4sqdsywfp2mckuj57qsffsovgyjhh7sxebkqwr335hzy2zbq)
  txType: 'raw'
}

In order to validate it we need to first check if the root CID is part of the transaction payload. This can be done async because we have the CID of the tx, which could be used to address the tx payload (in principle; we don’t do this today). The main problem is however that we don’t have any information about which block this tx is included in. We thus have to make the RPC call eth_getTransactionByHash to get the hash/number of the block in which the tx was included. Then we have to make another call to eth_getBlockByHash to get the block timestamp.

If we have a DB with the header chain, retrieved asynchronously as described above, we could remove the need for the second RPC call. It is however possible to do better. Every block header contains a transactionsRoot. This is the root of a Merkle tree whose leaves contain all transactions included in that block. We can therefore extend the caip168 proof as follows:

{
  blockhash: CID(bafy...),
  txProof: Uint8Array(...),
  root: CID(bafyreiaxdhr5jbabn7enmbb5wpmnm3dwpncbqjcnoneoj22f3sii5vfg74)
  chainID: "eip155:1"
  txHash: CID(bagjqcgzanbud4sqdsywfp2mckuj57qsffsovgyjhh7sxebkqwr335hzy2zbq)
  txType: 'raw'
}

With this additional information we can validate the proof without needing a call to the eth_getTransactionByHash endpoint.
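
A rough sketch of what validation against the local header DB could look like with the extended proof. The Merkle proof check itself is passed in as a function because it needs keccak256 and RLP decoding; the names ExtendedTimeProof, StoredHeader, InclusionVerifier and validateTimeProof are made up for illustration, and CIDs are represented as plain strings.

```typescript
// Mirrors the extended proof above; CIDs are plain strings in this sketch.
interface ExtendedTimeProof {
  blockhash: string;
  txProof: Uint8Array;
  root: string;
  chainID: string;
  txHash: string;
  txType: string;
}

// What the local header DB is assumed to hold for each block.
interface StoredHeader {
  transactionsRoot: string;
  timestamp: number; // seconds since the Unix epoch
}

// Hypothetical: checks a Merkle proof that the tx identified by `txHash`
// is in the tree with root `transactionsRoot`.
type InclusionVerifier = (
  transactionsRoot: string,
  txHash: string,
  txProof: Uint8Array
) => boolean;

// Returns the block timestamp if the proof checks out, using only local data.
// Checking that `root` appears in the tx payload is a separate step, not shown.
function validateTimeProof(
  proof: ExtendedTimeProof,
  headers: Record<string, StoredHeader | undefined>,
  verifyInclusion: InclusionVerifier
): number {
  const header = headers[proof.blockhash];
  if (header === undefined) {
    throw new Error("block hash is not in the local header chain");
  }
  if (!verifyInclusion(header.transactionsRoot, proof.txHash, proof.txProof)) {
    throw new Error("transaction is not included in the claimed block");
  }
  return header.timestamp;
}
```

Neither eth_getTransactionByHash nor eth_getBlockByHash appears anywhere in this path; the only network dependency is the background header sync.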

spencer · July 29, 2024, 3:26pm

The main problem is however that we don’t have any information about which block this tx is included in

We used to have the blockhash and blocknumber in Time Events. We removed it because sometimes txns would get reorged into a different block than where they were initially included, which would cause the Time Event and the actual state on the blockchain to diverge. Since the blockchain is the real source of truth on what happened, we removed the block information from TimeEvents and decided to just always go to the chain for the authoritative record of what happened.

jthor · July 30, 2024, 8:38am

Correct, and this decision had the unintended consequence that we can no longer verify TimeEvents without an explicit call to an Ethereum RPC. It might be worth waiting a few extra blocks before creating TimeEvents so that we can confidently include this information.
We would need to do some research on how many blocks we should wait before we can expect the tx not to be reorged into a different block.
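
The waiting rule itself would be a simple depth check along these lines. This is only a sketch: the function name is made up, the convention that a tx in the head block has one confirmation is an assumption, and the right threshold is exactly the open research question.

```typescript
// A tx in the current head block counts as having 1 confirmation.
function hasEnoughConfirmations(
  txBlockNumber: number,
  headBlockNumber: number,
  requiredConfirmations: number
): boolean {
  // A tx block ahead of our known head cannot be confirmed yet.
  if (txBlockNumber > headBlockNumber) return false;
  return headBlockNumber - txBlockNumber + 1 >= requiredConfirmations;
}
```

The TimeEvent would only be created, with the block information included, once this returns true for whatever threshold we settle on.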