TDWI Hadoop Readiness Assessment and Guide

11.16.2015

Research organization TDWI recently published an online survey and assessment guide to help organizations determine how ready they are to adopt and implement Hadoop.

In the past few years, the number of organizations using Hadoop, or contemplating using it, has grown astronomically. Most of them share the same questions: are they really ready to implement Hadoop, and which best practices lead to success?

For these reasons, TDWI, an organization that provides research and advice on everything data related, developed an online Hadoop Readiness Assessment and Guide to help organizations as they start working with Hadoop. The assessment is free, and it provides a great way to analyze each dimension of readiness: organizational readiness, Big Data readiness, data management readiness, analytics readiness, and IT readiness.

Figure: Example of how the TDWI Hadoop Assessment scores results


Our Take on the Evolving Hadoop Ecosystem

One of the initial challenges people face when getting started with Hadoop is simply navigating the myriad components that have popped up in recent years. I was at the Strata Hadoop conference in New York a month ago, and based on what I saw, I can understand the confusion around Hadoop given all of the crazy names being advertised: Mahout, Ambari, Avro, Datafu, Oozie, Tez, Chukwa, Trafodion, etc.

Figure: A few of the more popular Hadoop projects

The quickly changing landscape of the Hadoop ecosystem is what makes Hadoop planning more critical than ever. Hadoop is no longer just HDFS and MapReduce (MapReduce actually seems to be falling quite a bit in popularity), but a family of tools that all fall under the broad Hadoop umbrella and sit at various levels of maturity, ranging from “University lab side-project” to production use at large companies.

Figure: We need resources to navigate the growing complexity in the Hadoop ecosystem


Why Use MarkLogic and Hadoop?

Many of the customers we talk to are already using Hadoop, so one question comes up quite frequently: “Why do we need MarkLogic if we’re already using Hadoop?”

To put it simply, MarkLogic provides an enterprise-class, operational database and Hadoop does not. Hadoop has many benefits, but it currently lacks some enterprise features that organizations require for production environments (e.g., Hadoop does not have robust security, and it does not carry the necessary integrity constraints for ACID transactions).

Typically, customers rely on MarkLogic to provide a persistent, operational database for low-latency transactions, and they use Hadoop as a low-cost place to store data and do batch analytics. Integrating the two systems is quite easy because there is a MarkLogic connector for Hadoop. There is also a lot of parity in how MarkLogic and Hadoop handle data, and both systems rely on MapReduce for loading data and doing analytics.

MarkLogic and Hadoop Architecture

Customers such as KPMG, McGraw Financial, and a top investment bank have all found this division of labor between MarkLogic and Hadoop to work quite well. Below is a graphic that shows at a high level how these customers are using MarkLogic and Hadoop. Actual production systems vary greatly due to the number of different Hadoop components, but the general architectural pattern is shown here: MarkLogic is the database, and Hadoop provides a low-cost storage option for structured and unstructured data. More info on MarkLogic and Hadoop can be found here.

Figure: The MarkLogic Connector for Hadoop provides seamless integration


Start Taking the Online Assessment

So, with that introduction, we encourage you to try out the online TDWI Assessment Tool, download the Guide, and see how ready your organization is for Hadoop.


Matt Allen

Matt Allen is a VP of Product Marketing Manager responsible for marketing all the features and benefits of MarkLogic across all verticals. In this role, Matt interfaces with the product and engineering team and with sales and marketing to create content and events that educate and inspire adoption of the technology. Matt is based at MarkLogic headquarters in San Carlos, CA and in his free time he is an artist who specializes in large oil paintings.
