The term “technical debt” is often used to describe decisions made during software development that result in future rework, typically because shortcuts were taken in place of more thoroughly vetted approaches. The term also applies to unintended consequences that result from “not knowing any better” or from following “accepted wisdom.”
In the case of today’s data integration challenges, it has become clear that much of IT has participated in an accrual of technical debt, simply by using accepted tools and techniques.
If technical debt is not “repaid,” it accumulates “interest,” which makes it hard to make changes later on. This creates a snowball effect. In theory, the debt should be paid off by completing the work, right? But the more debt that accumulates, the more deadlines are missed, because there is more uncompleted work than there is time to finish it.
Let’s examine specific technology categories and assess their respective contributions to technical debt—and how an Operational Data Hub (ODH) can help solve technical debt troubles.
ETL Tools and Processes
Purpose. These automate the transformation of data between differing database models across systems. For instance, data stored in multiple OLTP models might need to be integrated and stored in a separate OLAP model in a data warehouse. In these cases, the ETL processes take care of the data manipulation and movement.
Technical debt. ETL processes incur high costs through the proliferation of data copying and transformation, which has ultimately impeded the enterprise data discovery process. It is estimated that upwards of 60 percent of data warehouse costs are associated with ETL.
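To make the copying and transformation concrete, here is a minimal sketch of an ETL step of the kind described above. The two source systems, their field names, and the fact-table shape are all hypothetical, chosen only for illustration; a real pipeline would read from and write to actual databases.

```python
from collections import defaultdict

# Hypothetical OLTP rows from two source systems with differing models.
orders_system_a = [
    {"order_id": 1, "cust": "C1", "amount_cents": 1250, "day": "2024-01-05"},
    {"order_id": 2, "cust": "C2", "amount_cents": 800, "day": "2024-01-05"},
]
orders_system_b = [
    {"id": "B-7", "customer_id": "C1", "total": "20.00", "date": "2024-01-06"},
]


def extract():
    """Extract: copy the rows out of each source (here, in-memory lists)."""
    return list(orders_system_a), list(orders_system_b)


def transform(rows_a, rows_b):
    """Transform: map both source models onto one shared shape."""
    unified = []
    for r in rows_a:
        unified.append(
            {"customer": r["cust"], "amount": r["amount_cents"] / 100, "date": r["day"]}
        )
    for r in rows_b:
        unified.append(
            {"customer": r["customer_id"], "amount": float(r["total"]), "date": r["date"]}
        )
    return unified


def load(unified):
    """Load: aggregate into an OLAP-style fact table keyed by (customer, date)."""
    facts = defaultdict(float)
    for row in unified:
        facts[(row["customer"], row["date"])] += row["amount"]
    return dict(facts)


# The warehouse now holds a second, transformed copy of the source data.
warehouse = load(transform(*extract()))
print(warehouse)
```

Even in this toy version, the warehouse is a reshaped copy that must be rebuilt whenever a source changes, and every additional source model needs its own mapping code.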
Service-Oriented Architecture (SOA)
Purpose. SOA connects operational business applications to each other to support cross-line-of-business activity as part of enterprise application integration (EAI). Messages are passed between cooperating end-points represented as coarse-grained services.
Technical debt. SOA causes a proliferation of point-to-point application interchange throughout the enterprise. Because the integration is function-focused and often not data-focused (except for a small amount of message interchange), the result has been to accelerate the duplication of data across silos.
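A rough way to see how quickly point-to-point interchange can grow is to count connections. The sketch below assumes the worst case, in which every application exchanges messages with every other one, and compares it with a layout in which each application connects once to a shared integration point. Both function names are illustrative, and real enterprises sit somewhere between these extremes.

```python
def point_to_point_links(n_apps: int) -> int:
    """Links needed if every application talks directly to every other one."""
    return n_apps * (n_apps - 1) // 2


def hub_links(n_apps: int) -> int:
    """Links needed if each application connects once to a shared hub."""
    return n_apps


for n in (5, 10, 20):
    print(f"{n} apps: {point_to_point_links(n)} point-to-point links, "
          f"{hub_links(n)} hub links")
```

Under that assumption the direct links grow quadratically with the number of applications, while the hub links grow linearly.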
Master Data Management (MDM)
Purpose. Achieve a “single source of truth,” or golden copy, for critical business entities through a set of processes and programs.
Technical debt. MDM sets up an ambiguous quest to have one data standard to govern all silos. Many challenges have been compounded by data transformation and data copying resulting from ETL. The irony here is that many solutions call for creating yet another “golden” copy of certain data, also dependent on ETL.
Data Warehouse
Purpose. Provide the ability to do cross-line-of-business analysis, typically quantitative in nature.
Technical debt. Stale information results from the heavy dependency on ETL. Also, because of the difficulty integrating disparate data sets, data warehouses contain a small subset of source data entities and attributes, precipitating the need to duplicate data even more by way of data marts.
Data Marts
Purpose. Store narrowly focused analytical data (unlike the broader data warehouse). Data marts often contain a subset of data warehouse data, combined with other business or domain-specific data not contained in the data warehouse.
Technical debt. Data marts multiply data silos and data copies. As a reaction to the compromises of an enterprise-wide data warehouse (slow pace of delivery, subset of line-of-business data), data marts have proliferated in many enterprises.
Data Distribution and Operational Decision Making
Purpose. Deliver pertinent information to data-dependent stakeholders. Stakeholders include internal decision makers and may also include recipients external to the enterprise such as paying customers, B2B partners and regulators.
Technical debt. Because these functions are the farthest downstream in a traditional data integration architecture, they are the most negatively impacted by the challenges of integrating data in business silos. These negative impacts manifest as issues associated with time-to-delivery, data quality, and data comprehensiveness.
ODH to the Rescue
The net result of all of the above for nearly every large enterprise is an accumulation of technical debt, most of which is related to attempts to integrate data from various business and technical silos. This technical debt manifests itself in many ways, often disrupting day-to-day business operations, hindering much-needed IT innovation, or both. For most large enterprises, the problems seem intractable, leaving many wondering how we got to such a point in the first place.
The good news is, MarkLogic has a solution: the ODH. It’s a new (and proven) enterprise architecture pattern created specifically to address data integration. It is a blueprint type of pattern, meaning that implementations will vary according to context, but it has a set of core principles and guidelines that distinguish it from other architecture patterns.
An ODH is an authoritative multi-model data store and interchange, where data from multiple sources is harmonized as needed to support multiple cross-functional analytical and transactional needs. It functions as an authoritative integration point, as opposed to a “system of record” for all data.
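As a purely illustrative sketch of that definition, and not MarkLogic's implementation or API, the toy class below keeps each source record as-is and applies a per-source mapping only when a query needs a common shape. The class name, method names, and record fields are all hypothetical, and harmonizing at query time is just one possible reading of “as needed.”

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Harmonizer = Callable[[Record], Record]


class OperationalDataHubSketch:
    """Toy hub: stores source records unchanged, harmonizes on demand."""

    def __init__(self) -> None:
        self._documents: List[Dict[str, object]] = []
        self._harmonizers: Dict[str, Harmonizer] = {}

    def register_source(self, source: str, harmonizer: Harmonizer) -> None:
        # One mapping per source onto a shared (canonical) shape.
        self._harmonizers[source] = harmonizer

    def ingest(self, source: str, record: Record) -> None:
        # Keep the original record and remember where it came from.
        self._documents.append({"source": source, "raw": record})

    def query(self, **criteria: object) -> List[Dict[str, object]]:
        # Harmonize each record only when a query asks for it.
        results = []
        for doc in self._documents:
            canonical = self._harmonizers[doc["source"]](doc["raw"])
            if all(canonical.get(k) == v for k, v in criteria.items()):
                results.append({**doc, "canonical": canonical})
        return results


hub = OperationalDataHubSketch()
hub.register_source(
    "crm", lambda r: {"customer_id": r["CustID"], "name": r["FullName"]}
)
hub.register_source(
    "billing", lambda r: {"customer_id": r["cust"], "balance": r["bal_cents"] / 100}
)
hub.ingest("crm", {"CustID": "C1", "FullName": "Ada Lovelace"})
hub.ingest("billing", {"cust": "C1", "bal_cents": 4200})
hub.ingest("billing", {"cust": "C2", "bal_cents": 100})

matches = hub.query(customer_id="C1")
print([m["source"] for m in matches])
```

The point of the sketch is the contrast with an ETL pipeline: the source records are not reshaped into a new copy up front, and each result still carries its source and original form alongside the harmonized view.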
In the next blog in the ODH series, we will cover specifically how the pattern accomplishes the above—and what kind of advantage it gives your data management efforts.
For more on ODH and technical debt, check out our new eBook, “Introducing the Operational Data Hub.” It basically contains all things ODH—what it is, how it works, why you need it, data management problems it solves, and more.