Search and Query

MarkLogic’s search and query capability makes it easier to find better answers in today’s complex, heterogeneous data. As the only Enterprise NoSQL database, MarkLogic gives organizations the ability to accelerate virtually any query over all of their data, thanks to sophisticated, best-in-class indexes. These same indexes also power full-text search, and MarkLogic is consistently chosen to power enterprise search applications over offerings from the world’s largest search engine companies.

Better Answers in Today’s Data

Enterprise search that’s built-in, not bolt-on

MarkLogic has enterprise search built in, enabling organizations to turn “big data” (petabytes of information stored across multiple systems) into useful results, without the need to shred the data. MarkLogic indexes data on load and makes it immediately searchable.

Powerful, complex query capability

MarkLogic’s customizable indexing lets you run complex queries across all of your data using JavaScript, XQuery, and SPARQL, all inside MarkLogic. A single query can even use many indexes at once, rather than just one or two as in most databases.

Build Advanced Search Applications

MarkLogic’s full-text search engine makes it an ideal platform to power advanced search applications. MarkLogic’s full-text search includes faceting, real-time alerting, type-ahead suggestions, snippeting, language support, and much more. Search applications in production hold hundreds of billions of documents and hundreds of terabytes of data, and still return relevant, filtered search results in microseconds. For a longer list of MarkLogic’s specific search features, download the Built-in Search Datasheet.

Run Complex Queries Across All of Your Data

MarkLogic’s indexes provide the ability to run complex queries across multiple data types. You can run these queries quickly and easily in Query Console, an interactive, web-based query development tool for writing and executing ad hoc queries in XQuery, SQL, and SPARQL. In MarkLogic 8, JavaScript was introduced, extending MarkLogic’s powerful query capabilities to the emerging language of the Web. With MarkLogic’s server-side query capabilities, developers have a friendly API to express queries, aggregates, and data manipulation, while query evaluation is automatically distributed across a cluster and run in parallel, close to the data.

To learn more about MarkLogic’s interfaces, read through the guides on the Java API, REST API, and XQuery Search API that provide data access for applications.

Universal Index

The Underlying Search Technology

The underlying technology beneath MarkLogic’s search and query capability is the Universal Index. The Universal Index helps MarkLogic function like a search engine. When new documents are loaded, the database immediately compiles a list of words or numbers that appear in each document. As more documents are added, each word is associated with a list of documents. These are called term lists because they list all documents associated with a particular term. An index is composed of these term lists, and the Universal Index is a compilation of the key indexes in MarkLogic.
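The term-list idea described above can be sketched in a few lines of Python. This is only a conceptual illustration, not MarkLogic code: the function names `build_term_lists` and `search_all`, the sample documents, and the simple word tokenizer are all assumptions made for the example.

```python
from collections import defaultdict
import re


def build_term_lists(documents):
    """Map each lowercased word or number to the set of document IDs containing it."""
    term_lists = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"\w+", text.lower()):
            term_lists[term].add(doc_id)
    return term_lists


def search_all(term_lists, *terms):
    """Return IDs of documents containing every term, by intersecting term lists."""
    lists = [term_lists.get(term.lower(), set()) for term in terms]
    return set.intersection(*lists) if lists else set()


# Hypothetical sample documents, keyed by document ID.
docs = {
    "doc1": "MarkLogic indexes data on load",
    "doc2": "Indexes make queries fast",
    "doc3": "Data loaded in 2015",
}
term_lists = build_term_lists(docs)
print(sorted(term_lists["indexes"]))                      # ['doc1', 'doc2']
print(sorted(search_all(term_lists, "data", "indexes")))  # ['doc1']
```

A query for several terms never scans document text; it only intersects the precomputed term lists, which is why adding documents costs work at load time rather than at query time.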

The Universal Index keeps track of words, phrases, and values in documents. It also indexes the structure of documents, which provides context for search. Because MarkLogic indexes like a search engine, queries are very fast. But the Universal Index does more than just speed up queries. It makes it possible to determine schema later, reducing application time-to-market and facilitating agile development.

The Universal Index is what gives MarkLogic a clear speed and cost advantage over what is offered by traditional relational databases and even newer NoSQL databases that limit what can be indexed and the number of indexes that can be queried. For a deeper understanding of MarkLogic’s indexing capabilities, read the whitepaper, Inside MarkLogic.

Additional Indexes

Range Index

The range index is useful for searching values like dates quickly and returning the results or extracting information from the documents in the result set. It is also good for sorting information, and is the index that enables facets—one of the key MarkLogic search features. Range indexes are also used for bitemporal queries across valid and system time axes, a new feature in MarkLogic 8 that allows you to track information “as it actually was” in combination with “as it was recorded.”
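One way to picture a range index is as a list of (value, document) pairs kept sorted by value, so that range lookups become binary searches and facet counts become a tally over values. The Python sketch below illustrates that idea only; the `RangeIndex` class, its methods, and the sample dates are hypothetical and do not reflect MarkLogic's internal implementation. It relies on ISO-formatted date strings sorting in chronological order.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


class RangeIndex:
    """Keep (value, doc_id) pairs sorted by value so range lookups are binary searches."""

    def __init__(self, pairs):
        self.entries = sorted(pairs)
        self.values = [value for value, _ in self.entries]

    def between(self, low, high):
        """Return doc IDs whose value lies in [low, high], in value order."""
        start = bisect_left(self.values, low)
        end = bisect_right(self.values, high)
        return [doc_id for _, doc_id in self.entries[start:end]]

    def facet(self, key=lambda value: value):
        """Count indexed values per facet bucket, for example per year of a date."""
        return Counter(key(value) for value in self.values)


# Hypothetical publication dates for four documents.
published = RangeIndex([
    ("2014-11-03", "doc1"),
    ("2015-02-10", "doc2"),
    ("2015-06-21", "doc3"),
    ("2016-01-05", "doc4"),
])
print(published.between("2015-01-01", "2015-12-31"))  # ['doc2', 'doc3']
print(dict(published.facet(key=lambda date: date[:4])))
```

Because the entries are already in value order, the same structure also serves sorted result lists without a separate sort step.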

Geospatial Index

The geospatial index is similar to a range index, with built-in support for points, boxes, circles, linestrings, and complex polygons. MarkLogic also supports multiple geospatial data formats such as GML, KML, and GeoRSS. MarkLogic integrates with a variety of geographic-aware products such as Esri ArcGIS, Google Earth, Google Maps, Yahoo Maps, and Microsoft Bing Maps to help visualize the data. Download the Geospatial Search datasheet to learn more.
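To make the box and circle query shapes concrete, here is a small Python sketch of the two containment tests. It is a plain linear scan over a handful of points, so it shows what the queries mean rather than how an index answers them. The function names, the sample coordinates, and the use of the haversine formula with a mean Earth radius are assumptions for illustration, and the box test does not handle boxes that cross the antimeridian.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def in_box(point, south, west, north, east):
    """True if a (lat, lon) point lies inside the box (no antimeridian handling)."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def in_circle(point, center, radius_km):
    """True if the point is within radius_km of the center."""
    return haversine_km(point, center) <= radius_km


# Hypothetical points of interest as (latitude, longitude).
points = {
    "san_francisco": (37.7749, -122.4194),
    "san_carlos": (37.5072, -122.2605),
    "new_york": (40.7128, -74.0060),
}
center = points["san_francisco"]
near = [name for name, p in points.items() if in_circle(p, center, 50)]
print(near)  # ['san_francisco', 'san_carlos']
```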

Triple Index

The triple index is what powers the semantics capabilities for storing and managing RDF triples. Triples are facts, each made up of a subject, a predicate, and an object, that are stored natively in MarkLogic and can be queried with SPARQL. Both RDF and SPARQL are W3C standards for linked data. Download the Semantics datasheet to learn more.
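The subject, predicate, and object structure can be illustrated with a short Python sketch. The `match` helper, the sample facts, and the predicate names are hypothetical; the sketch scans a small list and only mimics the kind of pattern matching a SPARQL query expresses, without showing how a triple index is organized.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard, like a SPARQL variable."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Hypothetical facts stored as (subject, predicate, object).
triples = [
    ("John", "livesIn", "London"),
    ("London", "isIn", "England"),
    ("Mary", "livesIn", "Paris"),
]

# Roughly: SELECT ?who WHERE { ?who :livesIn ?city . ?city :isIn :England }
people = [
    who
    for (who, _, city) in match(triples, predicate="livesIn")
    if match(triples, subject=city, predicate="isIn", obj="England")
]
print(people)  # ['John']
```

The second pattern reuses the city bound by the first, which is the join that lets a query answer a question no single triple states directly.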

Tuning the Database

Most of MarkLogic’s indexes can be toggled on and off in order to tune the database to your content and optimize search performance. For example, an application may need only 15 of the 30 different text indexes that are available for fine-tuning search to handle different languages, wildcards, lexicons, collections, and so on.




Search Developer’s Guide

Dig into the details by reading through the documentation on search, learning how to use the Search API, and more.



Search API in 5 Minutes

Do a quick run-through of the Search API so you can start doing flexible, Google-style searches today.



Search, Relevance, Context

Hear from a search expert on how to build a search application and the tools you can use to do it.