
Integrating MarkLogic with Kafka


Apache Kafka is an open-source distributed event streaming platform used by the majority of Fortune 100 companies. As an increasing number of organizations have adopted Kafka for their data streaming needs, the need to seamlessly move data in and out of various data sources has become increasingly important.

The MarkLogic data platform, which customers choose for its data agility, is exactly the kind of data source that needs to both receive data from Kafka and send data to it.

In this post, we will explore the benefits of the MarkLogic Kafka Connector and highlight just how easy it is to get data from a Kafka topic into MarkLogic and vice versa. If you are already using Kafka and considering MarkLogic as a data platform for achieving true data agility, this post is for you!

Getting Data from Kafka to MarkLogic

The MarkLogic Kafka Connector uses standard Kafka APIs to publish and subscribe to Kafka topics. The connector reads records from a topic and writes them as new documents to a MarkLogic database. As messages stream onto the Kafka topic, the connector bundles them and pushes them into the database based on a configured batch size and time-out threshold.

The connector supports configuring collections, permissions, and URI construction to fit specific use cases. This allows users to efficiently and effectively store their data from Kafka into MarkLogic while maintaining full control over the process.
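As a rough sketch of what this looks like in practice, the following Kafka Connect configuration registers the connector as a sink. The connection details, topic, and property names shown here (collections, permissions, URI prefix, batch size) are illustrative; check the connector's documentation for the exact names supported by your version:

```properties
name=marklogic-purchases-sink
connector.class=com.marklogic.kafka.connect.sink.MarkLogicSinkConnector
topics=purchases

# Connection to the target MarkLogic cluster (example values)
ml.connection.host=localhost
ml.connection.port=8000
ml.connection.securityContextType=DIGEST
ml.connection.username=kafka-writer
ml.connection.password=changeme

# Each record becomes a document; collections, permissions, and
# URI construction are all configurable
ml.document.collections=kafka-data,purchases
ml.document.permissions=rest-reader,read,rest-writer,update
ml.document.uriPrefix=/purchases/
ml.document.format=JSON
```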

Combined with MarkLogic’s ACID transactions, this makes the pipeline highly reliable. Both the connector and MarkLogic also scale easily: as resources approach their limits, new server nodes can be added quickly and dynamically to meet data flow requirements.

Getting Data from MarkLogic to Kafka

As data within MarkLogic is refined and connected, organizations will seek to share this high-quality, contextualized data across their enterprise. To facilitate this, the MarkLogic Kafka connector also acts as a source connector that retrieves data from MarkLogic as rows of either JSON, XML, or CSV. This ensures that your data is flexible and can be used in a variety of systems, making it easily accessible to everyone who needs it.

One of the key components of the MarkLogic Kafka Connector is its use of the Optic API. MarkLogic’s Optic API is a SQL-like API that can query data from any model in a MarkLogic database. Users can easily construct Optic queries to declare the data they wish to send to a Kafka topic.
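For example, an Optic DSL query along these lines selects the rows to be published to a topic. Note that the query is evaluated inside MarkLogic, and the schema, view, and column names used here are hypothetical:

```
op.fromView('demo', 'purchases')
  .where(op.gt(op.col('total'), 100))
  .select(['id', 'customer', 'total'])
```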

The MarkLogic Kafka Connector can also be configured to retrieve only new and modified data, based on a user’s definition of that concept. This way, your data is always up-to-date, and you don’t have to worry about duplicates or outdated information.
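In connector terms, this typically means pointing the connector at a column whose value grows with each insert or update, such as a last-modified timestamp. The property names in this sketch are illustrative and should be verified against the connector documentation:

```properties
# Serialize retrieved rows as JSON (XML and CSV are also supported)
ml.source.outputFormat=JSON
kafka.topic=purchases-out

# On each poll, retrieve only rows whose value in this column is
# greater than the highest value seen previously, avoiding duplicates
ml.source.query.constraintColumn.name=lastModified
```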

Start Using the Connector Today

Integrating MarkLogic with Kafka has never been simpler. The connector provides an efficient way to move data between the two systems, without the need for custom code. It can be installed and configured easily, making it perfect for organizations to start using it right away.

Download from GitHub

Mitch Shepherd

Mitch joined the Product Management team at MarkLogic in 2021. He is responsible for developer tools, including client APIs, connectors, and integrators.

Mitch holds a Master’s Degree in Information Systems and an MBA from the University of Utah.


