As more organizations have come to rely on big data, the data lake has held the promise of a flexible, scalable integration point. Unfortunately, this very “flexibility” has made it easy to “pollute” the data lake with duplicate, stale, incomplete, and even wrong data. As a result, there is more data overall, but less confidence in its reliability. Add a lack of organization, governance, and security, and the data lake becomes a barrier to the analytics and operations that rely on the data.
Fortunately, your investment in a data lake doesn’t need to be considered a waste. In this on-demand webcast hosted by O’Reilly Media, Damon Feldman, Solutions Director at MarkLogic, discusses how an operational data hub can be combined with the data lake, keeping the agility and flexibility characteristic of a lake while adding fast, secure, governed data access. Watch now and learn how an operational data hub can use a Hadoop-friendly, multi-model database without reducing the data to relational structures and applying ETL transformations.
In this webcast, you will learn how to:
- Facilitate progressive transformation and mastering of data as operational data needs change
- Provide real-time, operational data access through strong indexing
- Make sense of the data lake iteratively
- React to events that occur in the data hub, such as updates to mastered data elements
- Secure a portion of a data lake’s data in a data hub to conform to internal security policies and regulatory requirements