Best for Big Data Search

A large nonprofit cut its hardware and operations costs by 70% while increasing customer satisfaction when it switched from Lucene/Solr to MarkLogic.

When building systems to manage Big Data applications, search cannot be an afterthought. From day one, MarkLogic has been built to support the fastest, highest-quality search in the industry.

MarkLogic’s scale-out, real-time platform is more than a search engine linked to a content repository – it is the most complete platform for building search-oriented applications.

  • Search all data for more value. Bring all relevant content back to users – unstructured and structured, internal and public.
  • Real-time updates. Real-time results. When documents are updated or inserted, they are available for search immediately.
  • Able to query all types of data. Structured, semi-structured, and unstructured content are all supported within the same queries.
  • Real-time alerts for fast response. MarkLogic has the highest-performance alerting engine available, capable of running millions of custom queries on each and every change to the document repository – no polling required.
  • Search you can bank on. Businesses that count on revenue through paid content search and retrieval trust MarkLogic to deliver.

Comprehensive Full-text Query

To ensure even the largest organizations can find exactly what they’re looking for, MarkLogic provides search and retrieval of documents or parts of documents with enterprise-class, full-text query. The capabilities include word and phrase, stemming, Boolean, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and weighting.

Results That Are Relevant

Document-level relevance boosting to support “page ranking” ensures the most relevant results. Options for relevance algorithms include term frequency/inverse document frequency, term frequency only, and simple term match. Results can be ordered by relevance or content metadata (in string, numerical, or date/time formats). Proximity boosting ensures that terms found close to each other are scored higher than terms found farther apart.
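
For readers who want to see what term frequency/inverse document frequency scoring means in practice, here is a minimal sketch in generic Python. It is not MarkLogic's implementation or API: the function names, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tf_idf_scores(query, docs):
    """Return (doc_index, score) pairs sorted by descending score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term]:
                # Assumed smoothing: the +1 keeps very common terms from scoring zero.
                idf = math.log(n_docs / df[term]) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "search engine for big data",
    "big data storage",
    "real-time search search alerts",
]
ranking = tf_idf_scores("search", docs)
```

In this toy corpus the third document ranks first because the query term makes up a larger share of its words, and the second document scores zero because it never mentions the term.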

Highlighted Matches For Ease

Scanning for search terms in document results is no longer necessary because text snippets that match the query are highlighted for ease of discovery. This is typically used for results lists, but it can also be used as a search-and-replace function.
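
The idea can be sketched in a few lines of generic Python. This is an illustration only, not MarkLogic's highlighting API: the `highlight` function, its `render` parameter, and the `<b>` markers are hypothetical names chosen for the example.

```python
import re


def highlight(text, terms, render=lambda match: f"<b>{match}</b>"):
    """Wrap (or replace) every whole-word, case-insensitive match of any term.

    `render` receives the matched text and returns what should appear in its
    place, so the same function covers highlighting and search-and-replace.
    """
    if not terms:
        return text
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: render(m.group(0)), text)


snippet = highlight("Search big data with search alerts", ["search"])
replaced = highlight("Search big data", ["search"], render=lambda match: "Query")
```

Passing a different `render` callback is what turns highlighting into replacement; the matching logic stays the same.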

Global Language Support

Basic language support for more than 200 languages is included in MarkLogic. Advanced language support includes stemming, tokenization, and collation rules for 14 languages to enable more precise, language-specific search.

Optimized Search Development

Developers can take advantage of the Search API, which supports popular functionality such as automatic query text parsing, constrained searches, faceted navigation, snippeting, and search term completion.

Lexicon, Custom Dictionary, and Thesaurus

Easily build lexicons for all words in the database, support custom dictionaries that affect stemming behavior, customize a thesaurus to suggest synonyms by user or user group, and search on distinctive terms to ensure you get the best results.

Range Queries For Speed

Ensure high-speed lookups of values in elements using range indexes. This enables fast range querying and popular search functionality such as faceted navigation and term auto-suggestion.
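
To show why a value-ordered index makes range lookups fast, here is a minimal sketch in generic Python. It is a conceptual illustration, not how MarkLogic stores range indexes: the `RangeIndex` class and its methods are hypothetical, and a sorted in-memory list stands in for the real index structure.

```python
import bisect
from collections import Counter


class RangeIndex:
    """Toy value index: values are kept sorted so a range lookup is a binary search."""

    def __init__(self):
        self._values = []   # sorted element values
        self._doc_ids = []  # document ids, parallel to _values

    def add(self, value, doc_id):
        # Insert at the sorted position so both lists stay aligned.
        pos = bisect.bisect_right(self._values, value)
        self._values.insert(pos, value)
        self._doc_ids.insert(pos, doc_id)

    def between(self, low, high):
        """Document ids whose value v satisfies low <= v <= high."""
        lo = bisect.bisect_left(self._values, low)
        hi = bisect.bisect_right(self._values, high)
        return self._doc_ids[lo:hi]

    def facet_counts(self):
        """Count documents per distinct value, as a facet list would."""
        return Counter(self._values)


index = RangeIndex()
for year, doc_id in [(2011, "a"), (2009, "b"), (2013, "c"), (2011, "d")]:
    index.add(year, doc_id)

hits = index.between(2010, 2012)
facets = index.facet_counts()
```

Because the values are already in order, both the range query and the facet counts can be answered from the index alone, without opening the documents.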

Search Security

Search results will return only the records that end-users are authorized to see. If permissions are updated on documents, those updates are reflected automatically and immediately in the indexes and in subsequent searches.

Location Awareness in Content

In many industries, location data is an important aspect of information search. By combining geographic data and content such as text, imagery, and video, users can more easily analyze, exploit, or assimilate information. By narrowing the scope, this combined analysis leads to greater accuracy, new knowledge, and better decisions.

Geographic Boundaries to Filter Data

Users may need to know which regions lie completely within the boundaries of a specified containment region. Polygon intersection and region intersection filters let users identify which geographic areas touch or intersect another specified area. The containment region is often a polygon specified by a series of vertices, but it can also be a circle or a box.
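
The containment and intersection tests described above can be sketched with simple planar geometry in generic Python. This is an illustration under stated assumptions, not MarkLogic's geospatial API: the function names are hypothetical, coordinates are treated as flat (x, y) pairs rather than points on a sphere, and boxes are written as (min_x, min_y, max_x, max_y).

```python
def point_in_polygon(point, vertices):
    """Ray-casting test: count how many polygon edges a rightward ray crosses."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def box_within(inner, outer):
    """True when `inner` lies completely inside `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def boxes_intersect(a, b):
    """True when the two boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
inside_square = point_in_polygon((2, 2), square)
outside_square = point_in_polygon((5, 2), square)
```

Note the difference between the two box tests: a box that merely overlaps the containment region passes `boxes_intersect` but fails `box_within`.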

Be the First to Know With Alerts

To ensure the right people are alerted to critical information in real time, the alerting framework uses efficient reverse queries: it applies a large set of saved queries to each incoming document in an optimized way. Alerting enables capabilities such as instant delivery of newly discovered information and automatic categorization.
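
A reverse query turns ordinary search around: the queries are stored and each new document is matched against them. The sketch below shows one simple way to do that in generic Python. It is not MarkLogic's alerting API: the `AlertRegistry` class is hypothetical, and it assumes each saved query is just a set of terms that must all appear in the document.

```python
from collections import defaultdict


class AlertRegistry:
    """Toy reverse-query index: saved queries are indexed by the terms they require."""

    def __init__(self):
        self._queries = {}                # query id -> frozenset of required terms
        self._by_term = defaultdict(set)  # term -> ids of queries that mention it

    def register(self, query_id, terms):
        required = frozenset(t.lower() for t in terms)
        self._queries[query_id] = required
        for term in required:
            self._by_term[term].add(query_id)

    def matches(self, document_text):
        """Return the ids of saved queries satisfied by an incoming document."""
        doc_terms = set(document_text.lower().split())
        # Only queries sharing at least one term with the document are candidates,
        # so most saved queries are never examined.
        candidates = set()
        for term in doc_terms:
            candidates |= self._by_term.get(term, set())
        return sorted(q for q in candidates if self._queries[q] <= doc_terms)


registry = AlertRegistry()
registry.register("q1", ["flood", "warning"])
registry.register("q2", ["merger"])
registry.register("q3", ["flood", "merger"])

alerts = registry.matches("Flood warning issued for the coast")
```

Indexing the saved queries by term is what lets a large query set be applied to every incoming document without running each query one by one, and without polling.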