When you need to integrate massive volumes of data, you need a database that scales quickly, easily, and at low cost. Elasticity matters too: the database should also scale back down as demand fluctuates.

MarkLogic is a massively scalable Enterprise NoSQL database that scales horizontally in clusters on commodity hardware to hundreds of nodes, petabytes of data, and billions of documents—and still processes tens of thousands of transactions per second.

When demand dissipates, you can scale MarkLogic back down without worrying about complex sharding. With these features, organizations can handle very large volumes of data and run large-scale web applications, all without breaking the bank.

MarkLogic’s agility, integrated search, and enterprise features allow us to deliver on a global scale to the most demanding customers. We have dramatically improved the customer experience and our clients’ ability to optimize their risk profiles.

Key Benefits

Hassle-Free, Elastic Scaling with MarkLogic

With traditional databases, scaling is extremely complex and often too expensive. With other NoSQL databases, scalability is more achievable, but you sacrifice transactional consistency, and those databases are a pain to scale back down. MarkLogic is a NoSQL database that scales like a NoSQL database should, but without all the compromises.

Scalability

From three nodes to hundreds of nodes, or from 10,000 documents to 1 billion documents, MarkLogic clusters scale horizontally as your data or access demand grows and shrinks.

Elasticity

Add or remove nodes in minutes and take advantage of automatic cluster rebalancing, which helps you keep the database in line with performance needs without over-provisioning.

Run on Commodity Hardware

MarkLogic doesn’t need “big iron.” You can run MarkLogic on cost-effective commodity hardware in any environment: in the cloud, virtualized, on-premises, or a combination.

Shared Nothing Architecture

MarkLogic uses a shared-nothing architecture with no master-slave relationships, which means there is no risk of data loss if a node fails. If one node fails, another node automatically picks up the workload.

No Performance Degradation

MarkLogic was designed from the start to run large enterprise applications, and it does not hit steep performance cliffs as it scales.

Fewer Nodes and Licenses

MarkLogic datasets and indexes do not have to fit in memory, which means you can scale without the expense of dozens of boxes and licenses.

“MarkLogic is a cutting edge technology…”

“There aren’t many options when you’re looking for a commercial solution that can help store, search, analyze, and transform more than a billion documents. MarkLogic is a cutting edge technology which enables the LexisNexis development team to spend more resources building products and features and less time and money maintaining a technology platform. This focus on the applications is driving our customer satisfaction and is a critical component of our continued growth at LexisNexis.”

How Clustering Works in MarkLogic

MarkLogic is designed for extremely large data volumes, and scales to clusters of hundreds of machines, each of which runs MarkLogic. Each machine in a MarkLogic cluster is called a host, or node. Some hosts (Data Managers, or D-nodes) manage a subset of data in what are called forests (also known as shards). Other hosts (Evaluators, or E-nodes) handle incoming user queries and internally distribute queries across D-nodes to access the data. As you load more data, you add more D-nodes. As the user load increases, you add more E-nodes.
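
The division of labor described above can be sketched as a toy model. This is only an illustration of the fan-out idea, not MarkLogic's API: the class names `Forest`, `DNode`, and `ENode`, the `build_demo_cluster` helper, and the substring "search" are all hypothetical and assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Forest:
    """A subset of the database's documents (hypothetical toy model)."""
    name: str
    docs: dict = field(default_factory=dict)  # document URI -> text

    def search(self, term):
        # Naive substring match stands in for a real index lookup.
        return [uri for uri, text in self.docs.items() if term in text]


@dataclass
class DNode:
    """Data-manager host: owns one or more forests."""
    name: str
    forests: list = field(default_factory=list)

    def search(self, term):
        hits = []
        for forest in self.forests:
            hits.extend(forest.search(term))
        return hits


@dataclass
class ENode:
    """Evaluator host: accepts a query and fans it out to every D-node."""
    name: str
    d_nodes: list

    def query(self, term):
        hits = []
        for d_node in self.d_nodes:
            hits.extend(d_node.search(term))
        return sorted(hits)  # merge the partial results into one answer


def build_demo_cluster():
    """Assumed demo data: two D-nodes with one forest each, one E-node."""
    f1 = Forest("forest-1", {"/a.xml": "risk report", "/b.xml": "trade data"})
    f2 = Forest("forest-2", {"/c.xml": "risk model"})
    d1 = DNode("d1", [f1])
    d2 = DNode("d2", [f2])
    return ENode("e1", [d1, d2]), [d1, d2]
```

In this sketch, `build_demo_cluster()[0].query("risk")` returns `["/a.xml", "/c.xml"]`, with one hit coming from each D-node. Growing the data means appending another `DNode` to the list; growing the user load means creating more `ENode` objects over the same D-nodes.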

High Availability

Clustering enables high availability. In the event that an E-node should fail, there is no host-specific state to lose—just the in-process requests (which can be retried)—and a load balancer can route traffic to the remaining E-nodes. Should a D-node fail, that subset of the data can be brought online by another D-node.
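
Both failure cases can be illustrated with a small sketch. It assumes a simplified model: `Evaluator`, `HostDown`, `send_with_retry`, and `fail_over_forests` are hypothetical names invented for this example, and how the real server chooses the host that takes over a forest is not described here.

```python
class HostDown(Exception):
    """Raised when a request reaches a host that has failed."""


class Evaluator:
    """Stand-in for an E-node that may be up or down (hypothetical)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def handle(self, request):
        if not self.up:
            raise HostDown(self.name)
        return f"{self.name} handled {request}"


def send_with_retry(evaluators, request):
    """Try each evaluator in turn, as a load balancer plus client retry might."""
    for evaluator in evaluators:
        try:
            return evaluator.handle(request)
        except HostDown:
            continue  # no host-specific state was lost, so retry elsewhere
    raise HostDown("no evaluators available")


def fail_over_forests(assignments, failed_host, standby_host):
    """Return a new forest -> host map with the failed host's forests moved."""
    return {
        forest: (standby_host if host == failed_host else host)
        for forest, host in assignments.items()
    }
```

For example, with `e1` down and `e2` up, `send_with_retry` returns the response from `e2`; and `fail_over_forests({"f1": "d1", "f2": "d2"}, "d1", "d2")` yields a map in which `d2` serves both forests.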

Commodity Hardware

MarkLogic clusters across commodity hardware connected on a LAN. A commodity server can be anything from a laptop, to a simple virtualized instance, all the way up to a high-end box with two 12-core CPUs, 512 gigabytes of RAM, and either a large local disk array or access to a SAN. A high-end box like this can store terabytes of data.

Learn More

Blog Posts

Relational databases are designed to run on a single server in order to maintain the integrity of the table mappings and avoid the problems of distributed computing. We’re at a tipping point with data volume. In my last post, I showed the stat from EMC about how the digital universe is expected to grow from […]
Gone are the days of single app databases. As MarkLogic product manager Justin Makeig says, "Applications are ephemeral—data is forever."

Become a Master of Scale

Want to learn more about how to scale MarkLogic? Here are resources for Architects and Administrators.

Blog
Scaling Your Database Doesn’t Have to Be Hard

Is it true that databases don’t scale? Is it easier to scale services than the database? As with many things, “it depends”…

Technical White Paper
Scalability, Failover, and High Availability Guide

This guide describes some of the features and characteristics that make MarkLogic Server scale to extremely large amounts of content.

Documentation
Fundamentals of Resource Consumption

The whitepaper introduces basic MarkLogic terms for those readers who might be new to the product and concepts. This guide views MarkLogic through the lens of resource consumption and infrastructure planning.

Powered By MarkLogic

Customers Scaling Big with MarkLogic


To support its growing user base and multi-platform distribution, the BBC built its iPlayer TV-streaming service using MarkLogic. After iPlayer launched, the system handled three billion requests in its first year of production, all in the cloud.

Hannover Re runs its next-generation automated underwriting solutions with hr | ReFlex, an innovative app that combines point-of-sale and risk assessment systems. The system handles more than a decade of data integrated from hundreds of offices.

Autoliv’s MarkLogic-built Centralized Safety Data Hub ingests data from all of its 80 manufacturing facilities in 28 different countries. It scales for new data and handles changing queries so that Autoliv can conduct traceability studies in minutes, not days.

The bank chose MarkLogic to build its operational Trade Store for regulatory compliance. The Trade Store has elastic provisioning for 40+ million records and growing. By moving off a relational database, the bank gained the flexibility to meet regulatory deadlines.

MarkLogic serves as a trade data hub processing data for a multi-trillion-dollar derivative business. Moving off Sybase, they gained the ability to scale quickly and efficiently. According to their CTO, “We needed to get away from relational… MarkLogic offered us horizontal scalability, the ability to just add more and not have to do a big infrastructure replacement.”

DHL Parcel Benelux used MarkLogic to launch a new, rapid-response, consumer-facing Track and Trace system. “The Proof of Concept showed such incredibly fast response times, even at peak loads. It was also immediately apparent that the technology is fully scalable and that response times will not be unduly affected as we grow to meet the rising demand for online shopping.”

Feature-Rich and Built for the Enterprise
