Indexing to Elasticsearch using Apache Kafka

Hello!

We are currently indexing ComposeDB into Elasticsearch using the following pipeline: ComposeDB → PostgreSQL → Kafka Connect → Apache Kafka → Consumer → Elasticsearch.
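To make the last hop (Consumer → Elasticsearch) more concrete, here is a minimal sketch of how a consumer might turn change events into an Elasticsearch bulk request body. The event shape (`op`, `before`, `after`), the `stream_id` key, and the index name `composedb_documents` are assumptions for illustration, not our actual schema; only the NDJSON layout of the bulk API is standard.

```python
import json


def to_bulk_lines(change_event, index_name="composedb_documents"):
    """Translate one change event into Elasticsearch bulk API NDJSON lines.

    Assumed (hypothetical) event shape:
      {"op": "c" | "u" | "d", "before": {...} or None, "after": {...} or None}
    Each row is assumed to carry a "stream_id" field used as the document id.
    """
    if change_event["op"] == "d":
        # Deletes need only an action line, no document source.
        doc_id = change_event["before"]["stream_id"]
        return [json.dumps({"delete": {"_index": index_name, "_id": doc_id}})]

    # Creates and updates both become an "index" action (full overwrite).
    row = change_event["after"]
    action = {"index": {"_index": index_name, "_id": row["stream_id"]}}
    return [json.dumps(action), json.dumps(row)]


def build_bulk_body(change_events, index_name="composedb_documents"):
    """Join the lines for a batch of events into one bulk request body."""
    lines = []
    for event in change_events:
        lines.extend(to_bulk_lines(event, index_name))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string would be sent as the body of a `_bulk` request for each batch of messages read from Kafka. Using the stream ID as `_id` keeps replays idempotent, since a re-delivered event just overwrites the same document.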

Ideally, we would like to remove PostgreSQL and Kafka Connect from the pipeline and instead stream updates directly to Kafka. What are your thoughts or plans regarding a native Apache Kafka indexer? As a fan of Kafka and Ceramic, I’m very excited about this possibility.

Also, here is the architecture of index.as. Any feedback would be greatly appreciated.

1 Like

Hey @seref, thanks for sharing this! It’s a very interesting implementation. At the moment we don’t have plans to directly support Kafka for indexing. We are discussing it and looking into potential solutions, but it’s very early and we cannot guarantee that it will result in an actual implementation. We will definitely keep the community up to date if we end up having more concrete plans for it.

1 Like

@seref

This is currently the best approach for consuming the information exposed by Ceramic. We are looking at exposing a more streaming-friendly API, which would notify consumers of metadata (e.g. new models and documents) and of events on documents. Would this be something that would solve your use case better?

2 Likes

Thank you @Justina and @dbcfd for the context.

A streaming-friendly API with events would be great. With that in place, it would be much easier to connect to Kafka or any other third-party tool.

Just to be sure, by the “Index to postgres” block in your diagram, do you mean the Postgres database that ComposeDB automatically indexes to?

Yes, it’s ComposeDB’s native PostgreSQL indexer. The diagram was a bit inaccurate in that part. Also, it works smoothly. :slightly_smiling_face:

2 Likes