
Integrating MarkLogic with Kafka


Apache Kafka is an open-source distributed event streaming platform used by the majority of Fortune 100 companies. As an increasing number of organizations have adopted Kafka for their data streaming needs, the need to seamlessly move data in and out of various data sources has become increasingly important.

The MarkLogic data platform, which customers choose for its data agility, is exactly the kind of data source that needs to both receive data from Kafka and send data to it.

In this post, we will explore the benefits of the MarkLogic Kafka Connector and highlight just how easy it is to get data from a Kafka topic into MarkLogic and vice versa. If you are already using Kafka and considering MarkLogic as a data platform for achieving true data agility, this post is for you!

Getting Data from Kafka to MarkLogic

The MarkLogic Kafka Connector uses standard Kafka APIs to publish and subscribe to Kafka topics. The connector reads records from a topic and writes them as new documents to a MarkLogic database. As messages stream onto the Kafka topic, the connector bundles them and pushes them into the database based on a configured batch size and time-out threshold.

The connector supports configuring collections, permissions, and URI construction to fit specific use cases. This allows users to efficiently and effectively store their data from Kafka into MarkLogic while maintaining full control over the process.
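As a rough sketch of what this looks like in practice, the following Kafka Connect configuration registers the connector as a sink. The connection details, topic, and property names shown here (collections, permissions, URI prefix, batch size) are illustrative; check the connector's documentation for the exact names supported by your version:

```properties
name=marklogic-purchases-sink
connector.class=com.marklogic.kafka.connect.sink.MarkLogicSinkConnector
topics=purchases

# Connection to the target MarkLogic cluster (example values)
ml.connection.host=localhost
ml.connection.port=8000
ml.connection.securityContextType=DIGEST
ml.connection.username=kafka-writer
ml.connection.password=changeme

# Each record becomes a document; collections, permissions, and
# URI construction are all configurable
ml.document.collections=kafka-data,purchases
ml.document.permissions=rest-reader,read,rest-writer,update
ml.document.uriPrefix=/purchases/
ml.document.format=JSON
```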

Combined with MarkLogic’s ACID transactions, this makes the pipeline highly reliable. Both the connector and MarkLogic also scale easily: as resources approach their limits, new server nodes can be added quickly and dynamically to meet data flow requirements.

Getting Data from MarkLogic to Kafka

As data within MarkLogic is refined and connected, organizations will seek to share this high-quality, contextualized data across their enterprise. To facilitate this, the MarkLogic Kafka connector also acts as a source connector that retrieves data from MarkLogic as rows of either JSON, XML, or CSV. This ensures that your data is flexible and can be used in a variety of systems, making it easily accessible to everyone who needs it.

One of the key components of the MarkLogic Kafka Connector is its use of the Optic API. MarkLogic’s Optic API is a SQL-like API that can query data from any model in a MarkLogic database. Users can easily construct Optic queries to declare the data they wish to send to a Kafka topic.
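For example, an Optic DSL query along these lines selects the rows to be published to a topic. Note that the query is evaluated inside MarkLogic, and the schema, view, and column names used here are hypothetical:

```
op.fromView('demo', 'purchases')
  .where(op.gt(op.col('total'), 100))
  .select(['id', 'customer', 'total'])
```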

The MarkLogic Kafka Connector can also be configured to retrieve only new and modified data, based on a user’s definition of that concept. This way, your data is always up-to-date, and you don’t have to worry about duplicates or outdated information.
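In connector terms, this typically means pointing the connector at a column whose value grows with each insert or update, such as a last-modified timestamp. The property names in this sketch are illustrative and should be verified against the connector documentation:

```properties
# Serialize retrieved rows as JSON (XML and CSV are also supported)
ml.source.outputFormat=JSON
kafka.topic=purchases-out

# On each poll, retrieve only rows whose value in this column is
# greater than the highest value seen previously, avoiding duplicates
ml.source.query.constraintColumn.name=lastModified
```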

Start Using the Connector Today

Integrating MarkLogic with Kafka has never been simpler. The connector provides an efficient way to move data between the two systems, without the need for custom code. It can be installed and configured easily, making it perfect for organizations to start using it right away.

Download from GitHub

Mitch Shepherd

Mitch joined the Product Management team at MarkLogic in 2021. He is responsible for developer tools, including client APIs, connectors, and integrators.

Mitch holds a Master’s Degree in Information Systems and an MBA from the University of Utah.


