Can It Be Searchable — But Not Readable?

07.18.2013

When you work with brainiacs, great discussions occur spontaneously. The following situation arose: a client wanted all data to be searchable, but not all data to be readable. The example they gave was an HR person who could know that specific forms existed but was not allowed to read them. What were the best practices? Our engineering and field teams weighed in:

  • Generally speaking, database permissions do not distinguish between letting you know a document exists and letting you read the document. By default, security is role-based, so if you cannot read a document, you aren’t even allowed to know it exists. You could write a function with elevated permissions that can see documents the calling user otherwise could not, but that returns to the caller only whether or not matching documents exist.
  • We implemented something like this for a national lab. The search ran in an elevated user context and returned a set of search results with limited metadata for each asset. When users attempted to view an asset for which they lacked permission, they could fill out a form to request access to it.
  • While this is a capability we can deliver well, the prospect should think through the security implications of even letting users know that documents matching a query exist. For example, if you allow full-text search on HR docs and store comp plans in a standard format, it would be easy to brute-force everyone’s salary by searching for names and possible salary values (“john smith ‘salary: $1,000’”, “john smith ‘salary: $2,000’”, etc.) until you get a hit back, even if you can’t see the matching document.
  • In media, clients will make documents searchable, so that the asset is discoverable, but gate the documents behind a paywall if the reader does not have proper credentials.
  • In some cases, people require that parts of a document be secured against reading but are fine with those parts being in the search index. For example, in national security: ‘Show me every document pertaining to Organisation X.’ Knowing it exists is a GoodThing[™], as it then allows intelligence personnel to request access to the full content. In other cases, some users will be able to read the information in an intelligence report, but not the report section titled ‘Future surveillance,’ for example. It is possible to embed security within document content and use MarkLogic to redact these documents on the fly. This relies, though, on the application applying a standard in-document security tagging mechanism.
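
To make the trade-off in the points above concrete, here is a minimal Python sketch. It is a plain in-memory simulation, not MarkLogic code: the `Document` class, the `CORPUS` list, the role names, and the `search`, `exists`, and `guess_salary` functions are all hypothetical names invented for illustration. It shows a role-filtered search, an "elevated" helper that reports only whether a match exists, and how that helper can be probed to recover a salary.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Document:
    uri: str
    text: str
    read_roles: frozenset  # roles allowed to read this document


# Hypothetical corpus standing in for the database (illustrative data only).
CORPUS = [
    Document("/hr/comp/john-smith.txt", "john smith salary: $2,000",
             frozenset({"hr-admin"})),
    Document("/hr/forms/leave.txt", "leave request form",
             frozenset({"hr-admin", "hr-staff"})),
]


def _matches(doc: Document, query: str) -> bool:
    """True if every whitespace-separated query term occurs in the text."""
    return all(term in doc.text.lower() for term in query.lower().split())


def search(query: str, roles: Set[str]) -> List[Document]:
    """Role-based default: only documents the caller may read are returned."""
    return [d for d in CORPUS if d.read_roles & roles and _matches(d, query)]


def exists(query: str) -> bool:
    """Simulated elevated helper: looks at every document, ignoring the
    caller's roles, but reveals only whether anything matched."""
    return any(_matches(d, query) for d in CORPUS)


def guess_salary(name: str, candidates: Iterable[str]) -> Optional[str]:
    """Brute-force probe: try candidate values until exists() reports a hit."""
    for amount in candidates:
        if exists(f"{name} salary: {amount}"):
            return amount
    return None


if __name__ == "__main__":
    print(search("john smith", {"hr-staff"}))  # caller cannot read the doc
    print(exists("john smith"))                # but can learn it exists
    print(guess_salary("john smith", ["$1,000", "$2,000", "$3,000"]))
```

In this sketch the `hr-staff` caller gets no results from `search`, yet `exists` confirms a match, and a short loop over candidate values is enough to recover the salary. Restricting which fields the existence check may match against, or rate-limiting it, are possible mitigations, though whether they suffice depends on the data.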

According to my colleague Adam Fowler, “There really is no easy answer – it depends entirely on the data, what you need to index — and the organizational mandates and restrictions that exist.”

Diane Burley

Responsible for overall content strategy and developing integrated content delivery systems for MarkLogic. She is a former online executive with Gannett with astute business sense, a metaphorical communication style and no fear of technology. Diane has delivered speeches to global audiences on using technologies to transform business. She believes that regardless of industry or audience, "unless the content is highly relevant -- and perceived to be valuable by the individual or organization -- it is worthless." 
