June 23, 2021

Understanding Metadata

At MarkLogic, we talk a lot about metadata: information about information. It’s what our customers use to make connections and gain new insights about things they care about.

That being said, I’ve been in meetings where someone starts talking about metadata, and others are clearly puzzled. What is this all about?

I think we could do a better job explaining what metadata is, how it works, and why it’s so darn important. Metadata is not a new concept; it’s been around in various forms for a very long time.

Metadata is what helps you find things of interest.

Libraries Are a Great Example of Metadata

Those of us of a certain era likely spent some serious time inside libraries. If you’re not of that era, libraries are big reading rooms, all well-organized and well-indexed for the most part.

A library without an index is much less useful.

You might even be familiar with the Dewey Decimal system, the different types of library index cards, and so on.  

A library index card cabinet is a physical instance of collected metadata. You look for books by title, subject, or author. The card gives you a number that you use to minimize time spent traipsing around vast bookshelves.

Simply put, you use it to find relevant things quickly. But wait, there’s more.

The metadata collection itself, even without the actual publications it references, can be used to answer all sorts of really interesting questions.

Who are the most prolific authors by subject? When were they most prolific? How likely is it that they go on to write on other topics? These sorts of questions have absolutely nothing to do with specific book locations.
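To make that concrete, here is a minimal sketch in Python of a card catalog treated as plain data. Everything in it is made up for illustration: the IndexCard fields, the authors, the titles, and the call numbers are assumptions, not a real catalog or any particular product's API. It shows the two uses described above: looking up a shelf location, and answering questions that never touch a shelf.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexCard:
    """One hypothetical catalog card: metadata about a book, not the book."""
    title: str
    author: str
    subject: str
    year: int
    call_number: str


# Illustrative sample data only; none of these are real records.
CATALOG = [
    IndexCard("Stars, Volume 1", "A. Author", "Astronomy", 1950, "520 AUT v1"),
    IndexCard("Stars, Volume 2", "A. Author", "Astronomy", 1950, "520 AUT v2"),
    IndexCard("Planets", "B. Writer", "Astronomy", 1962, "520 WRI"),
    IndexCard("Gardens", "A. Author", "Botany", 1971, "580 AUT"),
    IndexCard("Trees", "B. Writer", "Botany", 1962, "580 WRI t"),
    IndexCard("Ferns", "B. Writer", "Botany", 1964, "580 WRI f"),
]


def find_call_numbers(catalog, *, author=None, subject=None):
    """The original intent: turn a search into shelf locations."""
    return [
        card.call_number
        for card in catalog
        if (author is None or card.author == author)
        and (subject is None or card.subject == subject)
    ]


def most_prolific_by_subject(catalog):
    """Who wrote the most cards' worth of books in each subject?"""
    counts = defaultdict(Counter)
    for card in catalog:
        counts[card.subject][card.author] += 1
    return {
        subject: by_author.most_common(1)[0][0]
        for subject, by_author in counts.items()
    }


def peak_year(catalog, author):
    """In which year does this author have the most catalog entries?"""
    years = Counter(card.year for card in catalog if card.author == author)
    return years.most_common(1)[0][0]


def multi_subject_authors(catalog):
    """Which authors have written on more than one subject?"""
    subjects = defaultdict(set)
    for card in catalog:
        subjects[card.author].add(card.subject)
    return sorted(a for a, s in subjects.items() if len(s) > 1)
```

Only the first function cares where a book sits. The other three work from the cards alone, which is the point: the same metadata answers questions its designers never had in mind.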

My point? Rich, searchable metadata — like library index cards — can have value far beyond any original intent.

The Example Extended

Now, let’s say you are in charge of the library index card catalog.

There’s a standardized format for index cards that’s been around for a very long time. Don’t want to change that much, right? But now you want to index and catalog all sorts of entities that didn’t exist in 1876: digital media, websites, and so on.

If it were me, my primary concern would be that, in a fully digital world, a library cataloging system designed to organize physical objects might lose its rationale entirely. What’s the new mission?

In the meantime, not only do your metadata users want to ask deeper and deeper questions around your indexes, they want to make connections across other metadata stores.

Who uses the library, what are they interested in, and what do we know about them so we can serve them better?  

If you’re in charge of the library itself, that is a very important question.

A key part of the answer lies in the index card catalog, and measuring how it’s used.  

But you’d need other searchable metadata about your library users, and you’d want to connect that with maybe some demographic and location information, and so on.  

You’d want to search, discover, and make connections across very different kinds of data — and metadata.  
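As a rough sketch of what "making connections" can look like, the Python below joins three small metadata stores: a catalog, a patron list, and a checkout log. All of the names, IDs, branches, and records are hypothetical and exist only for illustration; a real library system would have far richer data and its own storage.

```python
from collections import Counter

# Hypothetical metadata stores, kept deliberately tiny.
CATALOG = {"520 WRI": "Astronomy", "580 AUT": "Botany"}   # call number -> subject
PATRONS = {"p1": "Downtown", "p2": "Downtown", "p3": "Riverside"}  # patron -> branch
CHECKOUTS = [                                             # (patron, call number)
    ("p1", "520 WRI"),
    ("p2", "520 WRI"),
    ("p2", "580 AUT"),
    ("p3", "580 AUT"),
    ("p3", "580 AUT"),
]


def subject_interest_by_branch(checkouts, patrons, catalog):
    """Count checkouts per (branch, subject) by joining the three stores."""
    interest = Counter()
    for patron_id, call_number in checkouts:
        branch = patrons.get(patron_id)
        subject = catalog.get(call_number)
        # Skip log entries that cannot be matched in both stores.
        if branch is None or subject is None:
            continue
        interest[(branch, subject)] += 1
    return interest


interest = subject_interest_by_branch(CHECKOUTS, PATRONS, CATALOG)
for (branch, subject), count in sorted(interest.items()):
    print(f"{branch}: {subject} x{count}")
```

None of the three stores can say which branch is most interested in which subject. The answer only appears once they are connected.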

What you’re really doing is building a model. You want to use it to inform all sorts of decisions about the library you are responsible for, and how it functions.

Why Did I Drag You Through This Example?

Just about everyone is aware that information — raw data — can be powerful stuff. Just like vast, untapped mineral reserves — there’s a lot of value there, if only we could get to it.

Metadata, information about information, can be far more powerful. It can be used to build logical models that describe how things behave, and it can inform a wide range of important decisions.

Human beings tag, search, and connect their experiences intuitively. It’s part of how we think and reason. Organizations are in the process of learning to do basically the same thing via software.

If you follow all the popular IT business leadership memes — things like digital transformation, knowledge-centric organizations, next-gen industry models, pervasive machine learning, and the like — they all will depend utterly and entirely on rich, searchable metadata layers.

So it’s pretty clear that the incentive is there to take the whole topic quite seriously, regardless of what words you use to describe your motivation.

We all know that good things aren’t easy, and easy things aren’t always good things. There is documented heavy lifting along the way.

That being said, there is ample evidence that MarkLogic is uniquely qualified to help organizations that want to build rich, active metadata layers to inform — and ultimately drive — key aspects of their operation.

The good news is that — now that we all have a baseline around metadata and why so many people are interested in it — we can move on to the journey and the process, and how MarkLogic fits.
