Making the Case for Semantic Metadata
I did a webinar a couple of weeks ago called Metadata is the New Gold! It was a great conversation with Brian Cross from Condé Nast, whose team is pushing the boundaries of tagging using crowdsourcing; Mike Green from Avalon, whose comprehensive work on metadata in the entertainment space has left no stone unturned in detailing how critical metadata is; and our host, the entertainment industry’s ringmaster, MESA Alliance’s Guy Finley himself.
The conversation covered a lot of ground – starting with just how critical metadata has become. Every part of the media industry is creating and relying on metadata. Production teams and (in Brian’s case) photographers are generating metadata on the cameras; editors and product creation teams are sifting through mountains of content looking for the ideal clip or image; and distribution teams are tailoring metadata to reach specific audiences for their products on the digital shelf. It all adds up to make metadata one of the vital resources for media companies. You know: people, capital, real estate . . . and metadata!
As the conversation turned to what is next and we got into how the Semantic Web is going to impact metadata, you could just feel the excitement about what’s possible – because semantics really is cracking open one of the toughest problems with metadata: how to record categories and relationships.
Enterprise NoSQL brought schema flexibility without giving up security and data integrity, and it solved many of the nasty metadata problems: storing multiple types of metadata, not having to pre-define every element, and adjusting as data changes without having to rebuild the entire system.
But there are still challenges that we have been working on:
- Categories and lookup lists are still very complicated to manage and don’t handle complex hierarchies
- Because metadata matters to every part of the business, purpose-specific categories are now needed, making it even more complicated
- Every item of metadata needs to have all the attributes and lookup items tagged in that record, pushing metadata attributes into the thousands (from the already unwieldy hundreds)
These are big problems because this is the part of metadata that is becoming most valuable. These very precise categories and relationships let you quickly find a specific character or designer that is used for a specific type of asset in a specific genre. And when you find it, you then need to deliver it with specialized metadata so that it can be tailored to your specific audience. Without them you are stuck looking for that needle in the metadata haystack.
These challenges are why we’re so excited about Semantic Metadata. Semantic technology models information in a simple format called RDF triples. An RDF triple is a subject, predicate, and object, such as ‘John lives_in London.’ This simple fact can be combined with another triple such as ‘London is_in England,’ and, given a rule that someone who lives in a place also lives in whatever that place is in, a new fact can be inferred: ‘John lives_in England,’ which was not in the original data set. By storing many triples, even hundreds of billions of them, you have a sophisticated graph structure through which you can see and discover new information!
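To make the John and London example concrete, here is a minimal sketch in plain Python. It is not RDF tooling and not how any particular triple store works: the triples are ordinary tuples, and the one inference rule (`infer_lives_in`, a name invented for this sketch) is written by hand.

```python
# Each fact is a (subject, predicate, object) triple.
triples = {
    ("John", "lives_in", "London"),
    ("London", "is_in", "England"),
}

def infer_lives_in(facts):
    """Apply one hand-written rule until nothing new appears:
    (x lives_in y) and (y is_in z)  =>  (x lives_in z)."""
    facts = set(facts)
    while True:
        new = {
            (x, "lives_in", z)
            for (x, p1, y) in facts if p1 == "lives_in"
            for (y2, p2, z) in facts if p2 == "is_in" and y2 == y
        }
        if new <= facts:   # the rule produced nothing we did not already know
            return facts
        facts |= new

# Facts that were not in the original data set.
inferred = infer_lives_in(triples) - triples
print(inferred)  # {('John', 'lives_in', 'England')}
```

A real semantic store would typically express the rule declaratively and apply it across the whole graph, but the principle is the same: two stored facts plus a rule yield a third.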
To help manage the metadata you use ontologies. These describe the data you will be managing (places like England and London) and the way they are related (lives in, is in).
Ontologies can also contain key facts . . . and this is what has us so excited. John can now be part of the ontology. And if he is an actor and has played several characters, that can also be part of the ontology. What’s more, we can create ontologies that describe all the categories we need and how they relate to each other . . . and to the characters and works. We can also have ontologies that contain the purpose-specific information for the different audiences for metadata – precise content creation categories for production teams, archival search categories for editors and market-ready categories for delivery.
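As a rough illustration of an ontology that holds a category hierarchy, key facts and purpose-specific categories side by side, here is a small sketch in plain Python. Every name in it (`subcategory_of`, `used_by`, `BritishAbsurdistComedy` and so on) is invented for this example; none of it comes from a standard vocabulary or from a MarkLogic API.

```python
# Hypothetical ontology: every identifier here is made up for illustration.
ontology = {
    # Category hierarchy
    ("BritishAbsurdistComedy", "subcategory_of", "Comedy"),
    ("Comedy", "subcategory_of", "Genre"),
    # Key facts can live in the ontology too
    ("John", "is_a", "Actor"),
    ("John", "played", "Archie"),
    # Purpose-specific categories: which audience uses which category
    ("Comedy", "used_by", "production"),
    ("Comedy", "used_by", "archive_search"),
    ("BritishAbsurdistComedy", "used_by", "delivery"),
}

def objects(facts, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for (s, p, o) in facts if s == subject and p == predicate}

def ancestors(facts, category):
    """Walk `subcategory_of` links upward to collect all broader categories."""
    found, frontier = set(), {category}
    while frontier:
        parents = set()
        for c in frontier:
            parents |= objects(facts, c, "subcategory_of")
        frontier = parents - found   # only follow categories not seen before
        found |= parents
    return found

broader = ancestors(ontology, "BritishAbsurdistComedy")  # Comedy and Genre
roles = objects(ontology, "John", "played")              # Archie
```

Because the hierarchy is just more triples, a category can sit at any depth without the asset records needing to know about it.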
The power comes when you combine these ontologies and semantics with the flexibility of Enterprise NoSQL. MarkLogic now natively stores RDF in addition to documents to make just this type of combination possible, and the results for metadata are fantastic:
- You can fully leverage ontologies not just to create models – but as part of the active operational database!
- You can use several ontologies and mix them together to fully describe the asset and provide the right data to all the audiences
- You can change and add items – because the ontologies are independent of the data, you can add new categories and users can immediately see the benefits without having to retag the entire database
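The last point in the list above, adding a category without retagging, can be sketched like this. It is a toy model in plain Python with invented asset IDs and category names, not a description of how MarkLogic implements it: the asset records carry one tag each, and the category links are kept in a separate structure.

```python
# Asset metadata records (hypothetical data): each carries just one genre tag.
assets = {
    "asset-001": {"genre": "HeistComedy"},
    "asset-002": {"genre": "Documentary"},
}

# The ontology lives apart from the records, as (narrower, broader) pairs.
broader = {("HeistComedy", "Comedy")}

def categories_of(genre):
    """Return `genre` plus every category reachable through `broader` links."""
    seen, stack = set(), [genre]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parent for (child, parent) in broader if child == current)
    return seen

def find(category):
    """List the assets whose genre falls under `category`."""
    return sorted(a for a, rec in assets.items()
                  if category in categories_of(rec["genre"]))

before = find("CrimeFilm")                   # the category is not linked yet
broader.add(("HeistComedy", "CrimeFilm"))    # one new ontology link, no retagging
after = find("CrimeFilm")                    # asset-001 is now found
```

The asset records are never modified; only the ontology gains a link, and every search that goes through it picks up the new category straight away.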
But the real kicker is that, because MarkLogic has combined data management and Semantic Web technology, you don’t need to tag every possible item in the metadata itself. With just a few semantic links, all the data about the categories can be linked to the metadata. For instance, using our example, we can add the fact that John is an actor, and then add to the metadata record for a film that John played the role of Archie, that the film is a comedy and that it is about criminals. Now we can use everything we know about those facts to help people find the asset, give recommendations and tailor it for delivery. For instance, we may know that John used to be in a comedy troupe, and therefore that fans of British absurdist comedy will probably like this film. And then, when the asset is selected, we can use the same data to tailor the distribution metadata for that audience – for instance, the Latin American genres and descriptions instead of those for Europe.
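Here is a small sketch of that film example in plain Python. The film ID, the predicate names and the regional labels are all placeholders invented for illustration; the point is only that the film record holds a handful of links and everything else is reached by following them.

```python
# All identifiers and labels below are hypothetical placeholders.
facts = {
    # What we know about John, stored once
    ("John", "is_a", "Actor"),
    ("John", "was_member_of", "ComedyTroupe"),
    ("ComedyTroupe", "known_for", "BritishAbsurdistComedy"),
    # The film record itself needs only a few links
    ("film:example", "features", "John"),
    ("John", "played", "Archie"),
    ("film:example", "genre", "Comedy"),
    ("film:example", "about", "Criminals"),
    # Region-specific genre labels live with the category, not the film
    ("Comedy", "label_latin_america", "Comedia"),
    ("Comedy", "label_europe", "Comedy"),
}

def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for (s, p, o) in facts if s == subject and p == predicate}

def related_styles(film):
    """Follow film -> person -> troupe -> style to find likely fan bases."""
    styles = set()
    for person in objects(film, "features"):
        for group in objects(person, "was_member_of"):
            styles |= objects(group, "known_for")
    return styles

def delivery_genres(film, region):
    """Pick the genre labels for one distribution region."""
    labels = set()
    for genre in objects(film, "genre"):
        labels |= objects(genre, "label_" + region)
    return labels

styles = related_styles("film:example")                  # BritishAbsurdistComedy
latam = delivery_genres("film:example", "latin_america")
europe = delivery_genres("film:example", "europe")
```

Nothing about the troupe or the regional labels was tagged on the film; both are reached through John and through the Comedy category.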
In almost every corner of the media world, people like me, Brian, Mike and Guy are getting excited about this. It’s like manna for metadata junkies. We can finally take all that work we’ve been doing to model the complex universes we work in (and can you imagine what this means for some of the big media companies – take Catwoman alone …) and use it to create systems for end users that will actually deliver what they are looking for.
The Semantic Web is bringing big things to the world of metadata – and I’m happy to be along for the ride!