Government Data and Knowledge-Based Information Management

Over the last decade, government agencies have accumulated data sets of every kind. Inundated with database schemas, documents, emails, web content and XML, agencies are left with daunting integration challenges. Attempts to use existing tools to systematically harvest and correlate data and turn it into insightful information have not been fruitful. Complex manual transformation operations are draining budgets: IT resources are mobilized to make sense of existing data instead of delivering insights to decision makers.

For instance, a large non-governmental organization (NGO) wanted to expose its institutional knowledge by integrating 27 data sources and stores and by providing advanced discovery and search tools to its staff, partners and global audiences. After years of effort and multiple costly IT projects, the NGO could not unify its data, and it never got to tackling how to capture institutional knowledge in a usable form. A year ago, the NGO decided to try a new approach, one that combined data integration with a semantic layer.


Wrapping Data With a Semantic Layer

My team worked with the NGO team to deliver a knowledge-based information management platform. The main idea was to wrap MarkLogic’s data integration capability with a semantic layer in order to expose advanced knowledge services.

As the illustration shows, it all starts with self-descriptive data tagged with pedigree, provenance, confidence, security and privacy markings. The resulting model represents the organization’s information, which is securely accessible independently of the higher-function services. The organizational knowledge is then modeled as an ontology and mapped to the information model. Semantic concepts, vocabularies, entities and their relationships are captured and can be modified dynamically to respond to new requirements.
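To make the layering concrete, here is a minimal sketch in plain Python of a record wrapped with markings and an ontology mapped onto the record's fields. Everything in it is an assumption for illustration: the names `Markings`, `Envelope`, `ONTOLOGY` and `concepts_for`, the field names, and the sample values are hypothetical, and none of it is MarkLogic's API or the NGO's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Markings:
    """Hypothetical self-descriptive tags carried with every record."""
    pedigree: str      # how the record was derived
    provenance: str    # which source system it came from
    confidence: float  # 0.0 to 1.0, trust in the content
    security: str      # e.g. "public", "internal"
    privacy: str       # e.g. "none", "pii"


@dataclass
class Envelope:
    """A record from the information model plus its markings."""
    content: dict
    markings: Markings


# Hypothetical ontology: each concept lists the information-model
# fields it maps to. It is plain data, so it can change at runtime.
ONTOLOGY = {
    "Person": {"maps_to": ["author", "contact"]},
    "Country": {"maps_to": ["country", "region"]},
}


def concepts_for(envelope: Envelope, ontology: dict) -> list:
    """Return the concepts whose mapped fields appear in the record."""
    return sorted(
        concept
        for concept, spec in ontology.items()
        if any(f in envelope.content for f in spec["maps_to"])
    )


record = Envelope(
    content={"author": "A. Example", "country": "Kenya", "project_id": "P-1"},
    markings=Markings("derived", "reports-db", 0.9, "internal", "none"),
)
print(concepts_for(record, ONTOLOGY))

# Responding to a new requirement: add a concept without touching the data.
ONTOLOGY["Project"] = {"maps_to": ["project_id"]}
print(concepts_for(record, ONTOLOGY))
```

Because the ontology is held as data separate from the records, adding the `Project` concept changes what the same record is recognized as without rewriting the record itself, which is the kind of dynamic modification described above.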

Knowledge services leverage the semantic model to provide high-value cognitive operations such as concept identification, advanced search, knowledge-based entity extraction and cross-asset discovery. This approach provides flexibility and insight that the NGO's earlier integration projects did not deliver.
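As a rough illustration of what knowledge-based entity extraction and facet-driven search can look like, the sketch below matches text against a small controlled vocabulary and then counts the extracted entities per concept. The vocabulary, the function names `extract_entities` and `facet_counts`, and the sample documents are all hypothetical. A production system would use far richer matching than the single-word labels assumed here.

```python
import re
from collections import Counter

# Hypothetical vocabulary: concept -> entity -> single-word labels.
VOCABULARY = {
    "Country": {"Kenya": ["kenya"], "Peru": ["peru"]},
    "Topic": {
        "Water": ["water", "sanitation"],
        "Education": ["education", "schooling"],
    },
}


def extract_entities(text: str, vocabulary: dict) -> set:
    """Return (concept, entity) pairs whose labels occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    found = set()
    for concept, entities in vocabulary.items():
        for entity, labels in entities.items():
            if tokens & set(labels):
                found.add((concept, entity))
    return found


def facet_counts(documents: list, vocabulary: dict, concept: str) -> dict:
    """Count documents per entity of one concept, as a search facet would."""
    counts = Counter()
    for text in documents:
        for found_concept, entity in extract_entities(text, vocabulary):
            if found_concept == concept:
                counts[entity] += 1
    return dict(counts)


documents = [
    "Sanitation projects in Kenya",
    "Schooling and water access in Peru",
    "Kenya education report",
]
print(extract_entities(documents[0], VOCABULARY))
print(facet_counts(documents, VOCABULARY, "Country"))
```

Note that the first document is tagged with the `Water` topic even though it never uses the word "water". The match comes from the vocabulary, which is the sense in which the extraction is knowledge-based instead of purely keyword-based.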

The result was drastically more accurate entity extraction, which in turn enabled powerful, facet-driven searches. After years of trying, the NGO's institutional knowledge is now accessible from any query, no matter your frame of reference. As information and data change in response to a constantly changing world, the system can be updated in real time without being re-engineered. And this was done in months, not years.

Here’s a Government CIO article on how semantic technology is transforming government data and productivity.