
MarkLogic Releases the Healthcare Starter Kit v1.0

3 minute read

MarkLogic is focused on dealing with data complexity, and what better place to build a new framework than in healthcare? The MarkLogic Healthcare Starter Kit (“HSK”) is a runnable, starting-point toolkit to integrate, query, validate, master, and deliver healthcare data.

What is the Starter Kit?

The toolkit is a layer on top of MarkLogic’s industry-leading data processing platform, which pairs a NoSQL database with data integration technology. This additional layer includes many specifics that people need to ingest, curate, and distribute claims, member, provider, and other related data. It comprises models, mappings, query techniques, Fast Healthcare Interoperability Resources (FHIR) conversion services, and more.

We call the HSK a “starter kit” (or sometimes an “accelerator”) because it is not meant to be a generic product that works in every healthcare context, or even for every healthcare payer. Instead, it lays the groundwork for whatever our customers or partners need to build, so that their work goes faster, more easily, and with higher quality. It does this by providing examples and key techniques that we know are critical to healthcare data processing and access.

The broad categories of features or content in the HSK include:

  • Specific data models (claims, providers, member/patient, and location)
  • A general data modeling technique that applies to any data model, is illustrated by the four provided models, and is closely linked to the FHIR specification
  • De-identification rules to create de-identified or “limited” data sets
  • Data cleansing rules to show how to correct known types of dirty data
  • Data mastering rules to identify and merge duplicate or related member records
  • Data flows to run all these mappings, cleanup steps, and transforms
  • Security policies for PHI data, employee data, and sensitive diagnoses
  • Semantic, ontology-based query services using CPT codes and SNOMED
  • Data mapping examples to canonicalize data into the provided starter models
  • A FHIR data service, and examples for real-time data services generally
  • DevOps code and plumbing to wire it all together (the project runs standalone, but is designed to be altered and extended for different projects)
  • Unit tests and other best practice examples
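
To make the de-identification item above concrete, here is a minimal sketch in Python of what a rule set of that kind might do. It is not taken from the HSK: the field names, the salt handling, and the specific rules (drop names, hash the member ID, keep only the birth year and a three-digit ZIP prefix) are all assumptions chosen for illustration, and the sketch makes no claim about what any regulation requires.

```python
import hashlib

# Hypothetical field names; the HSK's real models and rules may differ.
DROP_FIELDS = {"first_name", "last_name", "street_address", "phone"}
HASH_FIELDS = {"member_id"}


def deidentify(record: dict, salt: str) -> dict:
    """Return a copy of `record` with direct identifiers removed or coarsened."""
    out = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            # Direct identifiers are dropped entirely.
            continue
        if key in HASH_FIELDS:
            # Replace the identifier with a salted hash so records can still
            # be linked to each other without exposing the original value.
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            out[key] = digest[:16]
        elif key == "birth_date":
            # Coarsen an ISO date (YYYY-MM-DD) to the year only.
            out["birth_year"] = str(value)[:4]
        elif key == "zip":
            # Keep only the first three digits of the ZIP code.
            out["zip3"] = str(value)[:3]
        else:
            # Everything else (for example, a diagnosis code) passes through.
            out[key] = value
    return out
```

Calling `deidentify(member, salt="...")` on a member record returns a new dictionary and leaves the input untouched, so the same rules can be applied repeatedly as data flows through a pipeline.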

Healthcare Data like a Utility

MarkLogic data hubs generally provide clean, highly available data in challenging, complex environments. Complex, scattered, siloed, or legacy data goes in, like untreated water entering a utility, and cleaner, faster, more highly available data comes out. The output forms include real-time services, batch exports to downstream systems, text search, SQL, semantic/RDF, and API-based access. Our goal is any data format in and any form of access out.

The Healthcare Starter Kit is particularly geared toward Medicaid Enterprise Systems (MES), but can be used in almost any context. In fact, it is a good sample project to get started with MarkLogic data hubs even if your data is not healthcare-related at all.

A Good Place to Start with MarkLogic

We encourage people to download and tweak the HSK to solve their own data problems. If you are a healthcare payer, our provided “starter” models may be used out of the box, but if you are in manufacturing, finance, media, or some other domain, you’ll still be able to ramp up quickly once you write initial models for your own domain. The techniques will still work.
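
As a rough illustration of what “writing initial models for your own domain” and mapping source data into them can look like, here is a small sketch in Python. It does not use the HSK's own model or mapping formats. The `Part` model, the `ERP_MAPPING` table, and the source field names are hypothetical and stand in for a manufacturing data set.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical canonical model for a manufacturing domain."""
    part_id: str
    description: str
    unit_cost: float


# Hypothetical mapping: canonical field -> (source field, converter).
ERP_MAPPING = {
    "part_id": ("PART_NO", str.strip),
    "description": ("DESC", str.strip),
    "unit_cost": ("COST_USD", float),
}


def to_canonical(source: dict, mapping: dict) -> Part:
    """Build a canonical Part from one source record using the mapping table."""
    values = {
        field: convert(source[source_field])
        for field, (source_field, convert) in mapping.items()
    }
    return Part(**values)
```

Keeping the mapping as data rather than code means that a second source system needs only a second mapping table, while the canonical model and everything downstream of it stay unchanged.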

To make extension and change easier, the HSK includes a “cookbook” that breaks out typical tasks such as adding new data sources or elements, or changing outputs, and gives specific instructions on how to perform each task.

The HSK is runnable out of the box so that developers can tweak and experiment rather than build something from scratch. To be runnable, the HSK includes sample data, mappings, and outputs, which people will ultimately replace with their own.

As you use and extend the system, please reach out to us and open GitHub tickets so we can support you as you adopt it. The HSK is a community resource: we are happy to help you get started, and we welcome your contributions back to this open-source project.

Download the HSK on the MarkLogic Healthcare Starter Kit GitHub page.

Damon Feldman

Damon is a passionate “Mark-Logician,” having been with the company for over 7 years as it has evolved into the company it is today. He has worked on or led some of the largest MarkLogic projects for customers ranging from the US Intelligence Community to private insurance companies.

Prior to joining MarkLogic, Damon held positions spanning product development for multiple startups, founding of one startup, consulting for a semantic technology company, and leading the architecture for the IMSMA humanitarian landmine remediation and tracking system.

He holds a BA in Mathematics from the University of Chicago and a Ph.D. in Computer Science from Tulane University.
