Mainframe Copybook to NoSQL

My colleague Ken Krupa recently found himself time-traveling between his early grunt years as a COBOL programmer and his current role as chief architect of an Enterprise NoSQL company. A meeting with a (huge) organization that relies heavily on (massive) amounts of data generated by legacy mainframes had him working with COBOL “copybooks” to bring that data into MarkLogic.

COBOL has been around a few more years than Ken and has been at the heart of many mission-critical systems in virtually every sector, from banking and manufacturing to healthcare and retail. Much of the structure of that data is defined in copybooks, which are sometimes embedded in the COBOL programs themselves. Getting this data into a NoSQL database would take the tarnish off this legacy data and turn it into a valuable commodity.

Legacy data structure from COBOL copybook.

How hard was it to leapfrog from one architecture to another? “With MarkLogic and a set of open source Java libraries it was easy to ingest such mainframe data and also maintain the structure from the copybooks in a self-describing way via XML,” Ken told me. “More important however, having the data fully indexed in MarkLogic, regardless of the shape or structure of the data allows companies to glean new insights into such data like never before.”

Converting mainframe data files, as defined by copybooks, into a document store like MarkLogic is a natural progression; relational technologies were never well suited to this kind of data.

Elegant XML is ingested into MarkLogic for easy analysis of legacy data.

“Copybooks are hierarchical by nature, as is XML,” Ken explained. “Also repeating items that might appear in an OCCURS clause of a copybook are handled easily with XML but are troublesome with relational. Additionally the poly-schema capabilities of NoSQL (e.g. different shapes of things with the same name) map well to the REDEFINES clause of copybooks. And then all of the typical stuff like not having to create tables or pre-define a landing place for the data just highlights the advantages further.”
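To make Ken's point concrete, here is a minimal sketch of how OCCURS and REDEFINES might map to XML. It is not Ken's code and does not use the Java libraries he mentions. The copybook layout, the field names, and the `record_to_xml` helper are all hypothetical, and the record is assumed to be already decoded from EBCDIC into fixed-width text.

```python
import xml.etree.ElementTree as ET

# Hypothetical copybook this sketch assumes (not taken from the article):
#   01 CUSTOMER-RECORD.
#      05 CUST-ID            PIC X(5).
#      05 CONTACT-TYPE       PIC X(1).       'P' = phone, anything else = free text
#      05 CONTACT            PIC X(10).
#      05 CONTACT-PHONE REDEFINES CONTACT.
#         10 AREA-CODE       PIC X(3).
#         10 PHONE-NUMBER    PIC X(7).
#      05 ORDER-ID           PIC X(4) OCCURS 3 TIMES.

def record_to_xml(line: str) -> ET.Element:
    """Turn one fixed-width record (already decoded to text) into an XML element."""
    root = ET.Element("customer-record")
    ET.SubElement(root, "cust-id").text = line[0:5].strip()

    # REDEFINES: the same 10 bytes take a different shape depending on the type flag.
    contact_type = line[5]
    raw_contact = line[6:16]
    contact = ET.SubElement(root, "contact")
    if contact_type == "P":
        ET.SubElement(contact, "area-code").text = raw_contact[0:3]
        ET.SubElement(contact, "phone-number").text = raw_contact[3:10]
    else:
        contact.text = raw_contact.strip()

    # OCCURS: each populated slot becomes a repeated element; blank slots are skipped.
    for i in range(3):
        start = 16 + i * 4
        order_id = line[start:start + 4].strip()
        if order_id:
            ET.SubElement(root, "order-id").text = order_id
    return root

if __name__ == "__main__":
    sample = "C0001" + "P" + "2125551234" + "A001" + "A002" + "    "
    print(ET.tostring(record_to_xml(sample), encoding="unicode"))
```

Two records can both carry a `contact` element with different inner shapes, which is the "different shapes of things with the same name" idea from the quote. Skipping blank OCCURS slots is a design choice of this sketch; a real conversion might keep empty elements to preserve positions.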

An avid blogger, Ken chronicled his journey down memory lane, including finding the right libraries to work his magic (and impress the prospect). In part 2 he shows every step he took: obfuscating actual data by creating a sample data set using custom copybook creators, the configuration of MarkLogic (which anyone can download), and mapping from COBOL to XSD. Questions? Feel free to reach out to Ken via Twitter @kenkrupa.


Chief Content Strategist

Responsible for overall content strategy and developing integrated content delivery systems for MarkLogic. She is a former online executive with Gannett with astute business sense, a metaphorical communication style and no fear of technology. Diane has delivered speeches to global audiences on using technologies to transform business. She believes that regardless of industry or audience, "unless the content is highly relevant -- and perceived to be valuable by the individual or organization -- it is worthless." 
