Cybersecurity: Shifting Focus to Insider Threats, Locking Down Data

Far-reaching breaches and hacks have put security at the top of the MUST pile. Conventional wisdom would suggest that outsider attacks are the prime culprit, but according to the SANS Institute, almost half (46%) of security breaches are from the inside. What’s more, the survey of 196 security professionals revealed that “change is slow in happening across the financial services industry while threats continue to get harder to detect and defeat.”

With employees scattered all over the globe, whether in physical offices, working remotely or at clients’ sites, it becomes harder and harder to focus solely on “just” the perimeter. According to the CERT website, “Insiders pose a substantial threat to your organization because they have the knowledge and access to proprietary systems that allow them to bypass security measures through legitimate means.” And perhaps more importantly, as one U.S. state CISO tells me, “Insiders have trust. They have already been vetted and now they are on the inside.”

Further, rogue employees know what is valuable and how to inflict pain, including:

  • Data exfiltration – to copy or move sensitive or valuable data out of the company
  • Sabotage – directed at either your organization or an employee

Former lead of Deutsche Bank’s cyber security, Inego Merino, agrees. In an interview with Digital Guardian, Merino, now CEO and founder of Cienaga Systems, said, “Industries at the forefront of security understand that insiders present a very clear threat because they have legitimate access to company information, and because it is difficult to ascertain their intentions at any point in time.”

Organizations may know that insiders are a threat, but the security systems they employ are largely looking out – guarding the perimeter. The reality is that if you want to protect yourself from threats on the inside, you need access controls and monitoring capabilities similar to those you implement for outside threats. So, how can you run a flexible, data-sharing organization – where people are motivated and empowered – yet still keep data protected? Borrow from what the defense agencies do.


Cyber Situation Today

Cyber situational awareness today is focused on logs and read-only reports, void of any alerts or advisories. Access controls are at the application or the system level rather than at the data level. But understanding internal threats requires establishing baselines of behaviour, including:

  • What systems truly contain valuable information
  • Who has access to which data
  • How often people access that data
  • What counts as “normal” pulling of data (by individual)
  • What the parameters around access are, such as time, day, collection and data type
  • Which facilities, locations, organizations and teams are associated with the individual and the data

Anything that falls outside of established baselines should trigger alerts of anomalous activity.
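
As a rough illustration of the baseline idea above, here is a minimal sketch in Python. It assumes access logs have already been reduced to per-user daily record counts; the function names, the three-standard-deviation threshold and the sample data are all hypothetical choices for this example, not something prescribed by any particular product.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baselines(history):
    """Build a per-user baseline from (user, daily_record_count) pairs."""
    per_user = defaultdict(list)
    for user, count in history:
        per_user[user].append(count)
    # Baseline = (mean, population standard deviation) of each user's daily volume.
    return {user: (mean(counts), pstdev(counts)) for user, counts in per_user.items()}


def is_anomalous(baselines, user, count, threshold=3.0):
    """Flag a daily count far above what is normal for this individual."""
    if user not in baselines:
        return True  # assumption: no baseline yet means the access deserves review
    avg, spread = baselines[user]
    # Floor the spread at 1.0 so very regular users do not alert on tiny changes.
    return count > avg + threshold * max(spread, 1.0)


# Hypothetical history: alice normally pulls about 11 records a day, bob about 210.
history = [("alice", 10), ("alice", 12), ("alice", 11), ("bob", 200), ("bob", 220)]
baselines = build_baselines(history)

for user, todays_count in [("alice", 13), ("alice", 500), ("bob", 230)]:
    if is_anomalous(baselines, user, todays_count):
        print(f"ALERT: {user} accessed {todays_count} records today")
```

The baseline is kept per individual, so 230 records is unremarkable for bob while 500 is a clear outlier for alice. A fuller version would add the other dimensions listed above, such as time of day, collection and location.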

However, these data sources are stored in a wide variety of systems and formats: LDAP directories that manage role-based access controls (RBAC) and attribute-based access controls (ABAC); system documentation, including bug, patch and fix notices from manufacturers; CERT messages; even HR databases. All this data is in both tabular and document form, is highly variable, and is always changing; modeling it all is a nightmare.

No sooner do you get it modelled than a new source system is added, starting the whole process again. Further, bringing in these various data types can require various specialty indexes. Finally, associating an anomaly with a specific user can be quite difficult, as there is no common user ID across all of these different systems.
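
To make the missing common user ID problem concrete, the sketch below links accounts from different systems by a normalized email address. It is an assumption that every source carries an email at all, and the record layout (`system`, `account_id`, `email`) is hypothetical; real identity resolution usually needs several attributes and some manual review.

```python
from collections import defaultdict


def normalize_email(email):
    """Trim whitespace and lowercase so cosmetic differences do not split a person."""
    return email.strip().lower()


def link_accounts(records):
    """Group (system, account_id) pairs under one normalized email key."""
    linked = defaultdict(list)
    for rec in records:
        linked[normalize_email(rec["email"])].append((rec["system"], rec["account_id"]))
    return dict(linked)


# Hypothetical exports from three systems that each use their own account ID.
records = [
    {"system": "ldap", "account_id": "jdoe", "email": "J.Doe@example.com"},
    {"system": "hr", "account_id": "E-1042", "email": "j.doe@example.com "},
    {"system": "vpn", "account_id": "asmith7", "email": "a.smith@example.com"},
]

identities = link_accounts(records)
print(identities["j.doe@example.com"])
```

With the accounts linked, an anomaly seen under the LDAP account `jdoe` can be tied to the same person as HR record `E-1042`.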

The armed forces are no strangers to building threat management systems that handle poly-structured, highly variable and volatile data. They call it All-Source Intelligence (https://definitions.uslegal.com/a/all-source-intelligence/). To accomplish this, they chose a multi-model database precisely for its ability to handle volumes of disparate data – including geospatial, documents, binaries and CSV files.

Using a multi-model database platform as an Operational Data Hub (ODH), data is integrated in real-time and all data is automatically indexed upon ingest. The hub can store and enrich data: entities are stored as documents, and relationships as triples. This unified repository can then provide a single access point for search and discovery capabilities. The use of semantic metadata links information together while keeping all of the pedigree, provenance and attributions that are already attached to the information.
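
The "entities as documents, relationships as triples" idea can be sketched with plain Python data structures. This is an in-memory illustration only, not the API of MarkLogic or any other database; the URIs, predicate names (`hasAccessTo`, `affects`) and the `source` field standing in for provenance are all hypothetical.

```python
# Entities stored as documents, keyed by URI; "source" keeps each one's provenance.
documents = {
    "/users/jdoe": {"type": "user", "name": "J. Doe", "source": "HR database"},
    "/systems/payroll": {"type": "system", "name": "Payroll", "source": "LDAP directory"},
    "/advisories/a-1": {"type": "advisory", "title": "Patch notice", "source": "vendor notice"},
}

# Relationships stored as (subject, predicate, object) triples that link the documents.
triples = [
    ("/users/jdoe", "hasAccessTo", "/systems/payroll"),
    ("/advisories/a-1", "affects", "/systems/payroll"),
]


def subjects_of(predicate, obj):
    """Return the documents whose URI is the subject of (?, predicate, obj)."""
    return [documents[s] for s, p, o in triples if p == predicate and o == obj and s in documents]


def advisories_for_user(user_uri):
    """Follow user -> system -> advisory links across the triples."""
    hits = []
    for s, p, o in triples:
        if s == user_uri and p == "hasAccessTo":
            hits.extend(subjects_of("affects", o))
    return hits


for advisory in advisories_for_user("/users/jdoe"):
    print(advisory["title"], "from", advisory["source"])
```

The documents keep their original shape and provenance, while the triples carry the links between them, so a single query can connect an HR record, a system entry and a vendor notice.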

The flexibility of a multi-model database allows both real-time (run-the-business) and analytic (observe-the-business) data to be ingested in weeks, not months, providing an ideal platform for an all-source threat management dashboard.


Securing at the Data Level

Of course, despite the best proactive vigilance, insiders (and outsiders) through mistake or malfeasance still can pose a risk to firms. The best way to safeguard your data is to focus security at the data level. Data-centric security should have role-based, compartment-level security settings as well as “encryption at rest” to ensure that data is only shared with individuals or organizations that have consent from the citizen to whom the data pertains.
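
A minimal sketch of what a data-level check might look like: each record carries its own read roles and compartments, and access is decided per record rather than per application. The rule used here (at least one matching role, and membership in every compartment on the record) is an assumption made for illustration, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    read_roles: frozenset                 # any one of these roles may read
    compartments: frozenset = frozenset()  # the reader must belong to all of these


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    compartments: frozenset = frozenset()


def can_read(user, record):
    """Evaluate access against the record itself, not the system that holds it."""
    has_role = bool(user.roles & record.read_roles)
    in_all_compartments = record.compartments <= user.compartments
    return has_role and in_all_compartments


analyst = User("analyst1", frozenset({"analyst"}), frozenset({"finance"}))
ledger = Record("ledger-7", frozenset({"analyst", "auditor"}), frozenset({"finance"}))
salaries = Record("salaries-3", frozenset({"analyst"}), frozenset({"finance", "hr"}))

print(can_read(analyst, ledger))    # role matches and the compartment is covered
print(can_read(analyst, salaries))  # role matches, but the "hr" compartment is missing
```

Because the rules travel with the record, the same answer applies no matter which application or service is used to reach the data.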

In turn, data-centric services are layered on top to support run-the-business integration, complementing function-first SOA approaches. At the same time, as the best “first place” to integrate enterprise data, the ODH complements downstream reporting and visualization tools, providing a real-time view of data lineage regardless of how much and how frequently the business changes over time.


Characteristics of a Data-Centric, Multi-Model Database

A secure, all-source intelligence system requires putting data at the center of the data architecture, with both real-time and analytic data flowing into it, so the multi-model database has to be:

  • Flexible: Allowing multiple models, including triples (http://developer.marklogic.com/blog/making-new-connections-ml-semantics) and geospatial (http://developer.marklogic.com/blog/exciting-times-in-marklogic-geospatialdata), to be harmonized instead of settling on a compromised single model.
  • Convergent: Providing a smarter and more agile bridge between operations (run-the-business) and analysis (observe-the-business) that harmonizes data inside of a powerful and flexible database.
  • Contextual: Combining data with metadata for live and query-able lineage
  • Data-centric: Integrating at the data level, not just functionally
  • Cost-effective: Minimizing ETL, data copying, business silos, technical silos and people-centric integration
  • Secure: Providing government-grade security, including encryption-at-rest, and tested in the most demanding environments.
  • Scalable: Providing enterprise capabilities tested and hardened at scale.
  • Complementary: Leveraging existing assets without assuming a rip-and-replace strategy

The multi-model database platform, MarkLogic, allows data to be ingested as is, which is crucial for integrating all those disparate data types and sources that need to be merged together. The end result is that data comes in faster – without ETL or multiple, disconnected data stores – and goes out faster as RESTful services that include real-time, on-demand transforms to the desired data format and/or various views for business intelligence.

The job of a CISO has never been harder, especially for those who continue to implement security strategies geared exclusively toward outsider threats. Today’s CISO needs to build an all-source threat management platform – as well as lock down data at a very granular level.


For more information on this topic

Solving the Cybersecurity Puzzle: Federal Computer Week report on how organizations can safeguard against internal threats through situational awareness.

MarkLogic Solutions for National Security: Gathering all-source intelligence aids in creating a threat management dashboard. Borrow a page from national security agencies.

Operational Data Hub: An operational data hub allows you to integrate and index real-time and analytic data. The hub can store and enrich data, store entities as documents, and relationships as triples.