Extracting Hidden Value from Your Content

The deciding factor between the success and failure of a project is not the volume of knowledge at your disposal, but rather its discoverability.

U.S. National Aeronautics and Space Administration

NASA clearly articulates why discoverability matters: it is how an organization unlocks the hidden value in its content.

Why is discoverability so important? I’m glad you asked!

Discoverability enables:

  • Re-Use – Lets you use your content in different contexts quickly and for different use cases
  • Audit – Allows you to stay ahead of your competition and FULLY access the volume of knowledge in your content
  • Create – Helps you focus resources on what is missing

If you are looking to get value from your content, it is all about discoverability. Markets are shifting fast, and expectations from customers are always growing. For this reason, organizations need to stand out from the crowd. If not, they will eventually be replaced.

I hear someone, in the distance, asking, “How do we get to the discoverability phase?” Before we get to that, allow me to set the stage.

Usually when people hear of “content creators” they think of the publishing and media industries. These are not the only areas where content creators exist! A content creator is anyone who makes and publishes digital content — whether it is created from scratch or re-used from existing content. Anyone in any industry can be a content creator. This means anyone in any industry who creates and publishes digital content should be thinking about new ways to use their content.

The question remains, how successful is your organization at extracting and using the hidden value in your content?

MarkLogic has customers in Manufacturing, Publishing, Media & Entertainment, Finance, Healthcare, and other industries who are all content creators.

  • The unified MarkLogic data platform allows our customers’ content creators to build their mission-critical applications for their content, as the platform allows easy storage, indexing, and validation of content
  • MarkLogic can manage and deliver data in a variety of different formats due to its multi-model approach to data
  • Semaphore by MarkLogic auto-classifies, tags your content, and extracts information to create new use cases, applications, and user experiences

Customers typically start getting control over their content with a structured authoring process. Structured authoring is an authoring workflow that lets you define and enforce consistent organization of information in documents. It is also flexible because it is often done online and can be understood by software as well as humans. In structured authoring, content can be broken down into smaller bite-sized pieces and re-used elsewhere.

Smarter Content Roadmap

While content creators are on a journey to smarter content, structure is not enough. They want content that delivers more. Through the power of enriched metadata and semantic relationships, content creators can make their content more useful to humans and computers – and, in turn, power new applications. This means you can transition from machine-readable data to machine-interpretable data. Unfortunately, many are not accessing these tools – which means their content isn’t fully discoverable.

In addition, creators are often poor at tagging their content and often wind up re-creating it rather than re-using it. The beauty of MarkLogic’s unified platform is that it manages, indexes, stores, and searches content with ease AND builds upon that content with powerful new use cases.

Many content creators in enterprise organizations have already:

  • Created structured or semi-structured content
  • Stored their content in powerful repositories that can index and search
  • Published their content online to allow their customers to access, purchase, or use some of their content

While these steps are beneficial, they are simply not enough to fully leverage their content.

As we noted earlier, getting the most value out of content means making it discoverable, which requires a number of steps that many organizations have not yet taken. In our work with content creators in enterprise organizations, we’ve found those additional steps to be:

  • Create or adopt models (ontologies, taxonomies, vocabularies) or entities to define content
  • Use those models to drive Classification, Fact Extraction, and Entity Extraction to enrich your content with metadata and understand it in different contexts. This… is… Key!
  • Combine models, metadata, and data to build a knowledge graph
  • Use that knowledge graph to find new ways to create, recommend, consume, and monetize your content

What Is Semantics?

Semantic technology is a set of methods and tools that provide advanced means for categorizing and processing data, as well as for discovering relationships within data sets.

Semantic searches can use an alignment to a term to navigate relationships in a hierarchy, returning more relevant results.
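To make the idea of navigating a hierarchy concrete, here is a minimal sketch in plain Python. It is not MarkLogic or Semaphore code; the taxonomy, the document tags, and the function names are all hypothetical and chosen only for illustration. A search for a broad term is expanded to include every narrower term beneath it, so documents tagged only with a specific term are still found.

```python
from collections import deque

# Hypothetical toy taxonomy: each term maps to its narrower terms.
NARROWER = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "hatchback"],
    "truck": [],
    "sedan": [],
    "hatchback": [],
}

# Hypothetical documents, each already tagged with taxonomy terms.
DOCS = {
    "doc1": {"sedan"},
    "doc2": {"truck"},
    "doc3": {"bicycle"},
}


def expand(term):
    """Return the term plus every narrower term below it (breadth-first)."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def semantic_search(term):
    """Return IDs of documents tagged with the term or any narrower term."""
    wanted = expand(term)
    return sorted(doc_id for doc_id, tags in DOCS.items() if tags & wanted)
```

A plain keyword search for "vehicle" would miss a document tagged only "sedan". With the expansion step, `semantic_search("vehicle")` picks it up because "sedan" sits below "vehicle" in the hierarchy.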

Ultimately, semantics provides context, meaning, and insight to your data. Once again — making your data machine-interpretable.

Knowledge Graphs

Knowledge Graphs model content that is important to your organization – the entities, concepts, topics, and relationships between them.

The knowledge graph creates a complete picture of the information that organizations can use to gain new insights, drive business processes, govern information, and create new revenue streams.
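One common way to picture a knowledge graph is as a set of subject, predicate, object triples. The sketch below is plain Python rather than any MarkLogic API, and it borrows the Sally O’Malley example from the SNL story later in this post. The skit identifiers and predicate names are hypothetical placeholders. It shows how following two relationships in a row connects a remembered catch phrase to the skits a user is looking for.

```python
# Each fact is a (subject, predicate, object) triple.
# "skit:1" and "skit:2" are hypothetical identifiers, not real skit titles.
TRIPLES = [
    ("skit:1", "features", "Sally O'Malley"),
    ("skit:2", "features", "Sally O'Malley"),
    ("Sally O'Malley", "playedBy", "Molly Shannon"),
    ("Sally O'Malley", "catchPhrase", "I'm fifty"),
]


def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def skits_for_phrase(phrase):
    """Follow phrase -> character -> skits across two relationships."""
    characters = [s for (s, _, _) in match(predicate="catchPhrase", obj=phrase)]
    skits = []
    for character in characters:
        skits.extend(s for (s, _, _) in match(predicate="features", obj=character))
    return sorted(skits)
```

The same graph answers other questions without any new data: `match(subject="Sally O'Malley", predicate="playedBy")` returns the actress for the character.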


We have discussed the importance of metadata, semantics, and knowledge graphs for content enrichment. Now let’s discuss Classification as a key element in providing the metadata that makes content machine-interpretable. Since content creators usually have a lot of content, we want to make sure we are semantically serving the right content. Doing this manually would take too much time. A classifier reads and interprets each document for you and proposes the appropriate classification.

Semaphore by MarkLogic does exactly this!

  • Within the MarkLogic data platform, Semaphore will classify your documents against the model and MarkLogic Server indexes the content for search.
  • The taxonomies and ontologies can be expanded quickly to support more detailed classifications and extractions. These can be internally produced or externally sourced.
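As a rough illustration of classifying a document against a model, here is a simple keyword-matching stand-in written in plain Python. It is not how Semaphore works internally and it uses no Semaphore or MarkLogic API. The model contents, the `classify` function, and the `threshold` parameter are assumptions made for this sketch.

```python
import re

# Hypothetical model: each concept lists the labels that count as evidence for it.
MODEL = {
    "Cybersecurity": ["cyber threat", "malware", "phishing"],
    "Healthcare": ["patient", "clinical", "medicaid"],
}


def classify(text, threshold=1):
    """Return {concept: hit_count} for concepts with at least `threshold` label hits."""
    lowered = text.lower()
    scores = {}
    for concept, labels in MODEL.items():
        # Count whole-word (or whole-phrase) occurrences of every label.
        hits = sum(
            len(re.findall(r"\b" + re.escape(label) + r"\b", lowered))
            for label in labels
        )
        if hits >= threshold:
            scores[concept] = hits
    return scores
```

Adding a label to `MODEL` changes what gets tagged without touching the classifier code, which is the practical appeal of driving classification from a model.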

Entity & Fact Extraction

Beyond classifying content, you can extract the entities and facts that matter in each document.

Entity Extraction lets you design applications around real-world concepts, or entities, such as Customers and Orders, or Trades and Counter Parties, or Providers and Outcomes. The entity extraction process in Semaphore aligns the business analysts who define the entities and the developers who combine them in application code.

When a user understands their content — the documents, their structure, and the data and facts they contain — they can model specific aspects of that content to allow them to extract relevant facts. The Fact Extraction capability in Semaphore generates units of information (“facts”) contained within content that have a specific meaning (paragraphs, sentences, or terms) and can have a simple or complex structure.

This is important because the linked entities can provide additional information. Again, by classifying, you are ensuring you are matching the correct documents to concepts.

Entity extraction and fact extraction are key capabilities to improve search and discovery — so you can provide a better service to your customers. They allow end users to use search terms THEY understand – this is HUGE!
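To show the difference between the two in miniature, here is a sketch in plain Python. It uses the Customers and Orders entities mentioned above, but the customer names, the sentence pattern, and both function names are hypothetical, and none of this is Semaphore’s API. Entity extraction finds mentions of known things. Fact extraction pulls out a structured statement about them.

```python
import re

# Hypothetical entity model: known customer names (a tiny gazetteer).
CUSTOMERS = ["Acme Corp", "Globex"]


def extract_entities(text):
    """Return the known customers mentioned in the text, in model order."""
    return [
        name
        for name in CUSTOMERS
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]


def extract_facts(text):
    """Return (customer, order_number) pairs for '<customer> placed order <n>'."""
    names = "|".join(re.escape(name) for name in CUSTOMERS)
    pattern = re.compile(r"\b(" + names + r") placed order (\d+)")
    return [(m.group(1), int(m.group(2))) for m in pattern.finditer(text)]
```

For a sentence that mentions two customers but only one order, `extract_entities` returns both names while `extract_facts` returns just the one customer and order pair.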

“I’m fifty! Fifty years old!” — A Use Case Example

Several years ago, NBCUniversal decided they wanted to do something special for Saturday Night Live’s 40th anniversary by highlighting its content. The challenge was that they had so much wonderful content from different eras and weren’t sure which content to highlight. So they decided to have an application built to connect fans to all their content and provide an entirely new experience of consuming SNL content.

When I heard about the application, I was so excited and wanted to try it immediately. There was a set of skits that I so enjoyed watching, but I couldn’t remember the actress’s name or her character’s name. But I had the skit’s catch phrase engraved in my memory. Have you ever had a phrase stuck in your head that you couldn’t stop thinking about? This is exactly what I was experiencing! In fact, every time I thought of it, I would walk around my home, kick my leg in the air and say, “I’m Fifty! Fifty years old!” And yes, it would bring me great joy! (Don’t judge me.)

On this day, I would find out if I could benefit from this phrase that was stuck in my head. I typed “I’m Fifty” in the search bar and, to my delight, ALL of Sally O’Malley’s skits that were played by Molly Shannon popped right up! I spent multiple hours discovering skits I never even knew existed. All because it provided maximum discoverability.

NBCUniversal - SNL40 app success story

THIS, my friends, is the power of fully leveraging your content!

The question remains, how successful is your organization at extracting and using the hidden value in your content?

If your organization has content that needs to be accessed by internal or external people, are you confident they are discovering the content they need as easily as possible? If not, you should consider if your organization has:

  • Created or adopted models (ontologies, taxonomies, vocabularies) or entities to define content
  • Used those models to drive Classification, Fact Extraction and Entity Extraction to create metadata that helps you understand your content in different contexts
  • Combined models, metadata, and data to build a knowledge graph
  • Used that knowledge graph to find new ways to create, recommend, consume, and monetize your content

If not, I assure you that your organization is not getting the most value from your content — and may be leaving money on the table or creating frustrated consumers who might be looking at other solutions!

Explore Your Options

To learn more about smarter content, read our eBook, Content Evolution: 4 Digital Innovations for Content Creators

To speak with someone about getting more value from your content, email the MarkLogic Customer Success team


La-Verne Chambers

La-Verne is a Customer Development Representative in the Customer Success Management team. She serves as a trusted advisor to our customers, supporting their continued use of MarkLogic by ensuring customers have current knowledge of new developments, assisting our customers to be successful and streamline their businesses by using our products and sharing new use cases, and being a liaison between our customers and MarkLogic resources.

When she is not working, La-Verne loves deep sea fishing and usually averages about 25 – 30 catches per fishing trip. She also loves organizing events and spending time outdoors with her dog, Brooklyn.
