Chicken Soup for the Event-Driven Soul

The title of the blog had been so appealing: “No More Silos: How to Integrate Your Databases with Apache Kafka and CDC.” But then I got hit with a power question:

“Why introduce a database into an architecture if we could use a streaming platform such as Kafka instead?”

As a Solutions Architect at MarkLogic, I have to say I’m quite partial to adding databases into architectures. And the blog had already raised some very good points, like this one:

“It’s important to challenge assumptions about how systems are built.”

“Yes,” I agreed. But surely not the we-need-a-database-here assumption.

But it’s a good question, isn’t it? In an event-driven architecture, why should systems get data out of databases when they can get it straight from Kafka? Wouldn’t it be easier, quicker and cheaper to cut out the middleman? Do you really need a database if you’re already streaming data into Kafka?

This question posed a bit of a problem for me. And as they say, a problem shared is a problem two people have, so I decided to share that problem with some fellow MarkLogic folks.

David Gorbet, Engineering SVP at MarkLogic, wasn’t ruffled by it. Although he agreed and stated that “a message-/event-based system is a smart way to go for many problems, and I think Kafka is a good technology to use for this,” he made it clear that for many architectures, a database is essential. That’s because if a database is used to harmonize siloed data (including your event messages) then:

“If there’s ever a question about the data, you can use persistence and indexing of messages to enable traceability for operational issues. You may not need to keep all messages indefinitely, but you should be thinking about keeping them around and queryable for long enough to trace errors.”

And it’s not just spotting errors that a database within an event-driven architecture can help with:

“It’s also going to prevent data inconsistencies that are inevitable with individual microservices having separate, overlapping data stores, significantly simplify the security architecture, and provide one place to secure and apply policy to data for things like it being fit-for-purpose for GDPR reporting, anonymization, etc., as well as being a way to track sources and uses of data.”

David left me with something to ponder:

“It’s not that an architecture without a database for persistence is wrong, it’s just incomplete; it’s basically an application integration architecture, not a data integration architecture.”

Now, it turned out that Ken Krupa, MarkLogic’s VP of Global Solutions Engineering, had heard that question (“Do we need a database in an event-based architecture?”) before. He explained how he’d found that customers had been unable to get an agreed-to, trusted, comprehensive view of things from messages alone. In fact, one customer referred to it as being:

“… like trying to reconstruct the chicken from the chicken soup.”

As Kafka becomes ever more popular, and more enterprise-wide architectures are built using event-based patterns, a lot of people are going to be asking why databases should be introduced at all.

However, I think the answer to this really boils down to one key factor: If you’ll ever need to get an answer about an entity as it exists across the business as a whole (and in a hurry), you’ll need a persistent, harmonized representation of it that you can easily and quickly retrieve (for example, via indexes).
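To make that concrete, here is a minimal Python sketch of the difference, not a MarkLogic or Kafka implementation. The event shape, the `rebuild_from_log` function and the `HarmonizedStore` class are all hypothetical, and a plain in-memory dictionary stands in for a real indexed database. It contrasts replaying every message to answer one question with keeping a merged, indexed record per entity as events arrive.

```python
from typing import Any

# Hypothetical events: each source system publishes a partial view of a customer.
EVENTS: list[dict[str, Any]] = [
    {"source": "crm", "customer_id": "c1", "name": "Ada Lovelace"},
    {"source": "billing", "customer_id": "c2", "balance": 10},
    {"source": "billing", "customer_id": "c1", "balance": 250},
    {"source": "support", "customer_id": "c1", "open_tickets": 2},
]


def rebuild_from_log(events: list[dict[str, Any]], customer_id: str) -> dict[str, Any]:
    """Reconstruct one entity by scanning every message (chicken from the soup)."""
    entity: dict[str, Any] = {}
    for event in events:
        if event["customer_id"] == customer_id:
            # Later events overwrite earlier values for the same field.
            entity.update({k: v for k, v in event.items() if k != "source"})
    return entity


class HarmonizedStore:
    """Stand-in for a database: one merged record per entity, indexed by id."""

    def __init__(self) -> None:
        self._by_id: dict[str, dict[str, Any]] = {}

    def apply(self, event: dict[str, Any]) -> None:
        """Merge an incoming event into the entity's record and note its source."""
        record = self._by_id.setdefault(event["customer_id"], {"sources": []})
        record["sources"].append(event["source"])
        record.update({k: v for k, v in event.items() if k != "source"})

    def get(self, customer_id: str) -> dict[str, Any]:
        """Direct lookup by key; no replay of the message history is needed."""
        return self._by_id.get(customer_id, {})


store = HarmonizedStore()
for e in EVENTS:
    store.apply(e)

print(rebuild_from_log(EVENTS, "c1"))  # scans the whole log on every question
print(store.get("c1"))                 # single indexed lookup, with source tracking
```

Both calls return the same customer fields, but the first cost grows with the length of the log, while the second stays a single lookup and also records which systems contributed to the record.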

And if you’re not sure if you’ll need a database? Just remember the chicken.

Stuart Moorhouse, Solutions Architect | MarkLogic

Stuart is a Solutions Architect with MarkLogic, having joined in 2018. Prior to working at MarkLogic, he worked as a Content Architect for LexisNexis, one of MarkLogic’s first customers.
