Hi, for the past couple of weeks, with no changes to our app, we’ve been getting an empty result when reading some IDX-related streams by calling a DIDDataStore .get(‘’), like the following one:
Note that the StreamID in the multiquery payload (k2t6wyfsu4pg2dngmumympowrsuz4yt7aw676zdwx33njyn4gat35l6pchuu5c) is different from the StreamID you sent in the Cerscan link (k2t6wyfsu4pfx9q48o3dhegwe722ngmtfmupma7x7b0iwr5ui9v4rnp4z1xjne).
Regardless though, I can load both streams and access valid data from them, so I wouldn’t expect an empty response.
One thing I do notice, though, is that the Cerscan copy of the stream k2t6wyfsu4pfx9q48o3dhegwe722ngmtfmupma7x7b0iwr5ui9v4rnp4z1xjne contains two extra commits that are missing from the copy I am able to load. Investigating further, it seems that the first of those commits (with IPFS CID bagcqcerazhq3ayshls5u2ml2n6unhp6kesdiib73ag5qu53jiqrgwelcbjzq) has a CACAO that was issued at 2022-10-29T22:44:32.982Z. That was during the CAS (Ceramic Anchor Service) outage that occurred over that weekend, which we disclosed on Discord here: Discord. Those two extra commits were never anchored, which means the CACAO expired, rendering those commits invalid.
So it seems possible to me that some of the problems you are seeing stem from data corruption caused by this CAS outage. If you search your Ceramic node logs, do you see messages that contain the text “CACAO expired”?
In any case, I’d strongly recommend upgrading your Ceramic node to the newest version. Last month we rolled out a change that detects these types of data corruption more proactively and throws errors when it finds them (see Discord), which should help you detect and clean up any streams that were affected by the outage.
I want to apologize personally for the issues resulting from this outage. A major focus of the whole team right now is investing in the stability of the anchoring system to avoid any other data loss incidents like this in the future.
Shouldn’t the ceramic network detect the corrupt commits and fall back to the latest valid commit?
That’s effectively what the newer versions of Ceramic do, though they require a manual intervention step, as we didn’t want to automatically discard data in case application devs wanted to try to remember the content and have users re-apply it on the newly repaired streams. New nodes will detect this corrupted state and throw an error with the suggestion to reload the stream with the sync flag set to SYNC_ALWAYS, which will force the node to discard the invalid commits and reset the stream state back to its last valid state.
Hi @spencer
We can’t manage to retrieve a list of all of the corrupted streams from the node logs. Can you help us check exactly how this should be done?
Thanks
We can’t really find all the streams that have this corrupted state if some of those streams haven’t been referenced in a while. We can, however, identify the ones that have been loaded or updated since the CACAO expired, as there will be log messages about the CACAO timeouts in the Ceramic node’s log. @mohsin has a script he wrote to do that - Mohsin, can you share that script so they can run it against their node’s logs?
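In the meantime, here is a rough sketch of what such a log scan could look like. It is not Mohsin’s script, just an illustration: `findAffectedStreamIds` is a made-up name, and the assumptions that affected log lines contain the text “CACAO expired” and mention the StreamID as a base36 string starting with `k` would need to be checked against your node’s actual log format.

```typescript
// Sketch only: the log line layout is an assumption, so adjust the
// patterns below to whatever your node's logs actually look like.

// Marker text mentioned earlier in this thread; the exact wording in
// your logs may differ.
const CACAO_MARKER = "CACAO expired";

// Assumption: StreamIDs show up in the line as long base36 strings
// starting with "k" (like the k2t6... IDs quoted above).
const STREAM_ID_PATTERN = /\bk[0-9a-z]{40,}\b/g;

/** Collect the unique StreamIDs found on log lines that contain the marker. */
function findAffectedStreamIds(logText: string): string[] {
  const found = new Set<string>();
  for (const line of logText.split(/\r?\n/)) {
    // Skip lines that are not about CACAO expiry.
    if (!line.includes(CACAO_MARKER)) continue;
    for (const match of line.match(STREAM_ID_PATTERN) ?? []) {
      found.add(match);
    }
  }
  return Array.from(found);
}
```

You would feed it the contents of the node’s log file as a single string (for example, read with Node’s `fs.readFileSync(path, "utf8")`). Keep in mind the limitation above: it can only surface streams that were loaded or updated after the CACAO expired.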
Also, if you update to the newest Ceramic version, an error will be thrown any time a corrupted stream is loaded or updated. You could have your application catch errors and check the error message to see if it’s due to this issue, and if so, it could automatically reload the stream with the SYNC_ALWAYS flag, resetting it to a valid state. That way, whenever a user comes to your app and tries to interact with a corrupted stream, it will be automatically repaired on demand.
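A minimal sketch of that catch-and-reload pattern follows. To keep it self-contained it does not call the Ceramic client directly: `StreamLoader`, `SyncMode`, `loadWithRepair` and `looksLikeCacaoCorruption` are hypothetical names, and the message check is a guess, since the exact error text is not given here. In a real app the loader would wrap your Ceramic client’s stream load, passing the SYNC_ALWAYS sync option for the `"sync-always"` case.

```typescript
// Stand-in for the part of a Ceramic client this sketch needs.
// Assumption: your real loader maps "sync-always" to the SYNC_ALWAYS flag.
type SyncMode = "default" | "sync-always";

interface StreamLoader<T> {
  load(streamId: string, sync: SyncMode): Promise<T>;
}

// Assumption: the corruption error can be recognised from its message.
// Replace this pattern with the text your node version actually reports.
function looksLikeCacaoCorruption(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /CACAO expired|SYNC_ALWAYS/i.test(message);
}

/**
 * Load a stream normally; if the load fails with what looks like the
 * corruption error, retry once with forced sync so the node resets the
 * stream to its last valid state. Any other error is rethrown untouched.
 */
async function loadWithRepair<T>(
  loader: StreamLoader<T>,
  streamId: string,
  isCorruption: (err: unknown) => boolean = looksLikeCacaoCorruption
): Promise<T> {
  try {
    return await loader.load(streamId, "default");
  } catch (err) {
    if (!isCorruption(err)) throw err;
    // Note: this discards the invalid commits, so save any content you
    // want users to re-apply before taking this path.
    return loader.load(streamId, "sync-always");
  }
}
```

The retry happens only once and only for errors the predicate recognises, so unrelated failures (network errors, bad StreamIDs) still reach your normal error handling.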