Donald Soares is CTO of Retail and Consumer at MarkLogic and is based out of our Chicago office. With over 20 years of experience working with Retail, Consumer and E-Commerce clients, Donald has a deep understanding of the unique needs of this vertical. That’s why he is so passionate about championing MarkLogic’s value proposition as the only Enterprise NoSQL database – a technology specifically designed to surmount the Big Data obstacles currently hindering the Retail and Consumer industries.
Few industries have greater access to data around consumers, products, and channels than the Retail and Consumer industries. Data and insights are at the heart of what drives this business. Yet beyond the hype, most attempts at using Big Data for business transformation and building a competitive advantage have been dismal failures. The horror stories about the multi-million dollar Big Data projects that didn’t move beyond PowerPoint are true!
Today the industry has made major strides in tracking, reporting, and analyzing data on what’s selling via POS systems and improved syndicated data from Nielsen and IRI. But it’s still hopelessly weak at capturing and analyzing live data on consumers, products and trends that come from emerging data sources. Simply put, today most retailers have a handle on the “What’s selling?” question, but their true challenge lies in understanding “Why” consumers are buying, and in responding to them in a way that influences sales.
Why Do Big Data Initiatives in the Retail Industry Fail?
In Retail today it is estimated that 80% of new data sources stemming from consumers, products, online and in-store are not even considered for analysis. Retailers face an immense challenge with the sheer volume, velocity and variability of the data that they need to manage. To expound, here are a few examples of complex Big Data projects and the problems they attempt to solve:
A Consumer company wants to gain insights from the one billion consumers who buy its products each week and respond to them with the right personalized messages. The challenge is volume.
A Retailer wants to manage its global supply chain and analyze data in real-time from sensors, RFID chips, shipment notifications and receipts. The challenge here is velocity.
Another Retailer wants to manage a loyalty database of 70 million consumers – that combines demographics, purchase history, preferences and location – and leverage it for live consumer purchase transactions and promotions both online and in-store. The challenge to overcome is variability.
Existing retail systems were just not built to solve these problems!
Further, just getting content into a relational database is a daunting task. To fit within a pre-defined row and column structure, users need to first analyze the details of their content to identify the schema and map it into rows and columns. This is a costly, time-consuming first step that many find to be nearly insurmountable. Many a Big Data project has met a premature death at just this stage. It makes sense. Imagine trying to develop the perfect schema for a data model that works for a billion consumers, across multiple countries, brands and channels – and you’ll get the picture.
Finally, there is the question of integrating and analyzing the data to derive insights. In Retail, having a real-time operational and transactional system is critical because you will want to make live promotional decisions online or in-store, check product availability by channel, and respond to consumer queries immediately and accurately. This is not so easy when data is stored across different legacy systems that were never designed to “talk” to each other. This is illustrated by the fact that consumer loyalty data is almost never linked to social media data, and online purchase data is rarely linked to store location or loyalty card data. It amounts to flying blind.
Flexible Data Models End Disconnects in Consumer Information
These data disconnects have created a need for a more flexible and scalable database that can easily operate in today’s modern infrastructure. Traditional mainframes and RDBMSs lack the flexibility and scalability to handle the volume, velocity, and variability issues inherent to Big Data. NoSQL (Not Only Structured Query Language) technology represents a transformational change in perspective. Instead of getting the schema just right before doing anything else, NoSQL advocates loading up the data first and then seeing where the problems lie. This problem-oriented approach focuses on how the data will be used (queried) rather than how the data must be structured to fit within a traditional RDBMS.
In this complex and ever-changing world of modern Retail, this shift means you would not have to spend a year trying to figure out the right data model and perfect schema to analyze and store data on a billion consumers upfront. Instead, you can load the data, have it indexed automatically, and then search and query it for emerging trends and signals. This works because NoSQL models data in a way that’s easier for mere mortals to understand. The NoSQL document model makes it easier to ascertain what the data is about from a human perspective, and reduces the transformation required for moving data between tiers. By the way – the term “document” does not necessarily mean a PDF or Microsoft Word document. The document can also be a single block of XML or JSON.
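To make the "load first, query later" idea concrete, here is a minimal Python sketch of a schema-free document store. It is an illustration only and not MarkLogic's API: the `DocumentStore` class, the `flatten` helper, and the sample consumer records are all hypothetical names and data invented for this example. Each JSON document is indexed on load by every field path and value it happens to contain, so documents with different shapes can sit side by side and still be queried.

```python
import json
from collections import defaultdict


def flatten(doc, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON-like document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(doc, list):
        # List items share the path of the list itself.
        for item in doc:
            yield from flatten(item, prefix)
    else:
        yield prefix, doc


class DocumentStore:
    """Hypothetical in-memory store: no schema is declared before loading."""

    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)  # (path, value) -> ids of matching documents

    def load(self, doc_id, raw_json):
        """Store a document and index every leaf value it contains."""
        doc = json.loads(raw_json)
        self.docs[doc_id] = doc
        for path, value in flatten(doc):
            self.index[(path, value)].add(doc_id)

    def find(self, path, value):
        """Return the sorted ids of documents where `path` holds `value`."""
        return sorted(self.index.get((path, value), set()))


# Three consumer records with different shapes (made-up sample data).
store = DocumentStore()
store.load("c1", '{"name": "Ana", "loyalty": {"tier": "gold"}, "channels": ["online", "store"]}')
store.load("c2", '{"name": "Ben", "channels": ["online"], "social": {"handle": "@ben"}}')
store.load("c3", '{"name": "Cy", "loyalty": {"tier": "gold"}, "location": {"country": "US"}}')

gold = store.find("loyalty.tier", "gold")
online = store.find("channels", "online")
print(gold, online)
```

Note that the second record has no loyalty data and the third has no channel list, yet nothing had to be remodeled to load them. A production document database adds persistence, richer indexes and transactions on top of this basic idea.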
Enterprise NoSQL Opens Up a World of Opportunity in Retail
Over the next few weeks I will publish a series of articles on how Retail and Consumer companies can use Enterprise NoSQL as a solution to their Big Data challenges. The reality is that the Retail and Consumer industries have fallen behind other industries. It’s time to play catch-up!
Please be sure to read my other articles in this series:
1. Big Data, Little Insight: Challenges for the Retail and Consumer Industries