Do you want MarkLogic to be integrated with your Single Sign-On (SSO) solution? Does your organization, like others, have security standards that do not allow passwords to be sent? And, of course, you do not want to require users to type in passwords multiple times. You’ve reviewed the out of the box external security features (LDAP, Active Directory, and SAML) and see that they all require the password and the username to be sent to MarkLogic. This blog post will go over options that will not require the password of the user for integrating with SSO solutions.
Before integrating MarkLogic with an SSO solution, we first need to understand that there are three main ways that MarkLogic can reside within a system’s architecture. In the first, requests hit the MarkLogic server directly, or pass through a load balancer that forwards them straight to MarkLogic. This means that all logic that needs to be secured lives within that MarkLogic Server.
The final way that MarkLogic can reside in a system’s architecture is where another server does some business logic and queries MarkLogic through one of the API connectors, such as the Java API. This means that most or all of the logic is handled in a system outside of the MarkLogic Server, and MarkLogic just executes the queries sent to it.
In order to leverage MarkLogic’s security features and simplify the security requirements, we recommend you do your SSO integration at the MarkLogic server level, using a rewriter.
You are probably more familiar with a rewriter as a feature of a web server such as Microsoft’s IIS or Apache. The rewriter intercepts all the requests sent to an application server and, based on code or configuration, dispatches the requests to the desired module file. What I will be discussing is a rewriter as a database feature. MarkLogic interprets the URL of the incoming request and rewrites it to an internal URL that services the request. A rewriter can be implemented as an XQuery module or as an XML file.
If you are using the out-of-the-box REST endpoints, stick with the XML rewriter; I included information about this at the end of this piece. But if you are NOT using the out-of-the-box APIs, you will need to use the Interpretive Rewriter, which has been available since MarkLogic Server version 8. This will allow you to execute code that integrates with the SSO solution.
What we have seen that works best is to send the user’s information to MarkLogic in the http headers. This means you need to have something in between MarkLogic and the user’s request. This can typically be done with a load balancer, but can also be done with a reverse proxy or a middle tier. Sometimes there is more information that you want passed to MarkLogic than a load balancer can easily do, and that’s when you’d need to use a reverse proxy or a middle tier.
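Before a load balancer or proxy is in place, the request it would send can be simulated from Query Console. The following is a minimal sketch, assuming an application server at http://localhost:8040/ and a header named userID; both are placeholders for illustration. It uses xdmp:http-get with a headers option to attach the header.

```xquery
xquery version "1.0-ml";

(: Hypothetical values: replace with your own app server URL and user ID. :)
let $endpoint := "http://localhost:8040/"
let $user-id  := "jdoe"
return
  (: Each child of <headers> becomes one HTTP request header. :)
  xdmp:http-get(
    $endpoint,
    <options xmlns="xdmp:http">
      <headers>
        <userID>{ $user-id }</userID>
      </headers>
    </options>
  )
```

Because any client can send such a header, this only makes sense together with the hardening steps described later in this post.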
Once you have the information sent to MarkLogic, you’ll want to read that information and give the current user more privileges. You can do this by using xdmp:login. The fourth parameter of xdmp:login takes role names that will be added to the user you log in as. This means that for the current user’s session, you can elevate the MarkLogic user’s privileges.
(: gets the userID header and gives the user an extra role if it's there :)
let $userID := xdmp:get-request-header('userID')
let $roles :=
  if (fn:exists($userID))
  then ("secret", "public")
  else ("public")
return xdmp:login(xdmp:get-current-user(), (), fn:false(), $roles)

Securing the Integration
By passing MarkLogic the user’s information and using xdmp:login to elevate privileges, you are adding a point of attack. We can secure this potential vulnerability with a few MarkLogic features. We can limit the default user of the application server, limit the roles that can be used in the xdmp:login call, use LDAP to look up the roles, use an amp when calling xdmp:login, and finally, restrict the application server to only allow requests from machines whose certificates we have authenticated.
The first step in securing the SSO integration is to give the default user of the application server as few privileges as possible. Since we are using xdmp:login, the application server authentication type must be set to “application-level”, which in turn requires a default user. The default user is the user that requests run as before we call xdmp:login. This user needs read access to the rewriter and any library files that are referenced in it. It will also need the privileges for any protected functions you call, such as xdmp:login.
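The application server settings described above can also be scripted with the Admin API. This is a minimal sketch, assuming an application server named "sso-app-server" in the "Default" group and an existing low-privilege user named "sso-default-user"; both names are hypothetical.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical names: "sso-app-server" and "sso-default-user". :)
let $config    := admin:get-configuration()
let $group-id  := admin:group-get-id($config, "Default")
let $server-id := admin:appserver-get-id($config, $group-id, "sso-app-server")
(: xdmp:login only works as intended with application-level authentication :)
let $config    := admin:appserver-set-authentication(
                    $config, $server-id, "application-level")
(: every request runs as this user until xdmp:login is called :)
let $config    := admin:appserver-set-default-user(
                    $config, $server-id, xdmp:user("sso-default-user"))
return admin:save-configuration($config)
```

The same settings can be made by hand in the Admin Interface; a script is mainly useful when the configuration has to be repeated across environments.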
To make it more secure, the privileges the default user needs can be granted through an amp instead. Amps in MarkLogic give us a way to elevate privileges inside a specific function. So we can create a role that has the privileges needed to call functions like xdmp:login and xdmp:get-request-header, and amp only the login function to that role. This means that someone cannot use the default user in an unexpected way, such as in Query Console, and still have the xdmp:login privilege.
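An amp of this kind could be created with the Security API. The sketch below makes several assumptions: the login logic lives in a function named login-User in a library module at /ext/sso/lib.xqy with the namespace example.marklogic.com/sso-lib, the modules are stored in a database called "Modules", and a role named "sso-login-role" already holds the required execute privileges. The role and database names are hypothetical. The query has to be evaluated against the Security database.

```xquery
xquery version "1.0-ml";

import module namespace sec = "http://marklogic.com/xdmp/security"
  at "/MarkLogic/security.xqy";

(: Run this against the Security database.
   All names below are assumptions for illustration. :)
sec:create-amp(
  "example.marklogic.com/sso-lib",   (: namespace of the library module :)
  "login-User",                      (: local name of the amped function :)
  "/ext/sso/lib.xqy",                (: URI of the module document :)
  xdmp:database("Modules"),          (: database that holds the module :)
  "sso-login-role"                   (: role(s) granted while the function runs :)
)
```

With the amp in place, the default user itself no longer needs the xdmp:login privilege; it is only in effect while the amped function executes.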
Since xdmp:login can take any role we give it, we have to be careful that we don’t give the users too much power. We can do this by limiting the roles that can be sent to it. The best way to do this is to assign the roles in MarkLogic and not have them passed in. For example, you might hard-code roles with an if or switch statement, or there might be a role mapping document that maps, for example, user IDs to MarkLogic roles. Alternatively, something such as the middle tier might pass the roles in for xdmp:login. Since MarkLogic has so many roles, it’s best to allow only the roles that you expect. This can easily be done by putting a predicate on the fourth parameter of xdmp:login.
(: only allows roles that are expected :)
let $expectedRoles := ("public", "classified", "secret", "top-secret")
(: would likely be passed in or dynamic :)
let $roles := ("admin", "public")
(: "=" compares each role against the whole list; "eq" would raise an
   error because $expectedRoles holds more than one item :)
return xdmp:login("default-user", (), fn:true(), $roles[. = $expectedRoles])
(: This filters out the admin role passed in and gives the user only the public role :)
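The role mapping document mentioned above could look something like the following. This is only a sketch: the element names, the user IDs, and the fallback to "public" are assumptions, and the mapping is held in a variable here so the example stands alone, whereas in practice it would more likely be a document read with fn:doc().

```xquery
xquery version "1.0-ml";

(: Hypothetical mapping of user IDs to MarkLogic roles. :)
declare variable $ROLE-MAPPING :=
  <role-mapping>
    <user id="jdoe"><role>public</role><role>secret</role></user>
    <user id="asmith"><role>public</role></user>
  </role-mapping>;

(: Returns the roles mapped to a user ID; unknown or missing IDs
   fall back to the "public" role only. :)
declare function local:roles-for($user-id as xs:string?) as xs:string*
{
  let $mapped := $ROLE-MAPPING/user[@id eq $user-id]/role/fn:string()
  return
    if (fn:exists($mapped)) then $mapped else "public"
};

local:roles-for(xdmp:get-request-header("userID"))
```

Because the roles come from the mapping rather than from the request, a caller cannot ask for a role that the mapping does not assign to them.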
To further secure the integration, you can move ownership of the role assignment to LDAP/Active Directory. During the authentication part of the rewriter, you can use xdmp:ldap-lookup. You’ll need credentials in order to authenticate to the LDAP server; typically these belong to a service account. You’ll then pass something that identifies the user who needs to be authenticated into MarkLogic, typically their cn.
This will give you results of user payloads; if you use the cn, there should only be one. The payload has memberOf LDAP attribute elements. You can use XPath to select these and filter out the ones that you want to add as roles. It helps to give the relevant groups a common prefix, such as marklogic- followed by the role name. You would then take those roles and pass them into the xdmp:login function. Here is some example code for what that would look like:
(: $cn would be passed from the SSO system, but could come through as a header
   that MarkLogic would look up. You would need to change the path, username,
   password and server-uri. :)
let $ldapResults := xdmp:ldap-lookup(
  "CN=" || $cn || ",CN=Users,DC=MLTEST1,DC=LOCAL",
  <options xmlns="xdmp:ldap">
    <username>admin</username>
    <password>admin</password>
    <server-uri>ldap://dc1.mltest1.local:389</server-uri>
  </options>)
let $groups := $ldapResults/ldap-attribute[@id eq "memberOf"]
(: gets the groups that have the prefix we want :)
let $MLgroups := $groups[fn:starts-with(., "CN=marklogic-")]
(: only takes the group cn, without the prefix :)
let $roles :=
  for $group in $MLgroups
  return fn:substring-after(fn:substring-before($group, ","), "CN=marklogic-")
(: only allows roles that are expected :)
let $expectedRoles := ("public", "classified", "secret", "top-secret")
return xdmp:login("default-user", (), fn:true(), $roles[. = $expectedRoles])
Finally, we can restrict which machines are allowed to send requests to the application server running the code. This can be done by setting “ssl require client certificate” to true in the application server configuration. Typically, an SSL certificate has a Fully Qualified Domain Name; however, you can also give it IP addresses. This means you can add another layer of security on top of the SSL certificate, since you can restrict requests to certain IP addresses. An attacker would now also have to spoof an IP address over HTTPS with two-way SSL.
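The client certificate requirement can be switched on through the Admin API as well as through the Admin Interface. A minimal sketch follows, assuming an application server named "sso-app-server" in the "Default" group (a hypothetical name) that already has SSL enabled with a certificate template and has its trusted client certificate authorities configured.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical app server name; SSL must already be set up on it. :)
let $config    := admin:get-configuration()
let $group-id  := admin:group-get-id($config, "Default")
let $server-id := admin:appserver-get-id($config, $group-id, "sso-app-server")
(: reject any request that does not present a trusted client certificate :)
let $config    := admin:appserver-set-ssl-require-client-certificate(
                    $config, $server-id, fn:true())
return admin:save-configuration($config)
```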
Here are some common questions that get asked when integrating MarkLogic with SSO solutions:
It can be time consuming to log in on every request. To save time, you can have MarkLogic save sessions for the users. By default, xdmp:login does not save the login/user and roles. You can tell it to do that with the third parameter, set-session. It defaults to false; if you set it to true, it will keep the role information in a session. This session is tied to whichever MarkLogic server handled the request.
If you are going to save the sessions, you’ll want to update the login code to check whether there is already a session. You can do this with xdmp:set-session-field and xdmp:get-session-field. These functions require extra execute privileges, so you’ll want to add them to the role that the amp uses. You will also want to configure your load balancer to use sticky sessions, so that all requests within a session are sent to the same MarkLogic server.
(: checks to see if there is a login session before logging in :)
if (xdmp:get-session-field("login"))
then ()
else (
  let $expectedRoles := ("public", "classified", "secret", "top-secret")
  (: would likely be passed in or dynamic :)
  let $roles := ("admin", "public")
  (: sets the login session field so we don't have to log in again :)
  let $_ := xdmp:set-session-field("login", fn:true())
  return xdmp:login("default-user", (), fn:true(), $roles[. = $expectedRoles])
)
The process to integrate MarkLogic with an SSO system using the Declarative XML Rewriter is very similar to the Interpretive Rewriter with just some extra steps. First, you need to know what “dispatches” you want to have integrated. Most of the time, people want to integrate the out of the box REST endpoints and any custom extensions.
To do this, it’s best to copy the global XML configuration to your project code and edit it there. A global XML rewriter is shipped with MarkLogic. However, it cannot be easily changed and really shouldn’t be modified in place. What you can do is copy it and set the application server’s rewriter to the project copy. The global XML rewriter is located at (MarkLogic-Install-Directory)/Modules/MarkLogic/rest-api/rewriter.xml. Just edit the “dispatches” that you want to change. You’ll need to point each dispatch at a local module file that runs the same code as before, but also performs the login logic described in the “Integrating at the MarkLogic Server level” section of this document.
For example, if you wanted to change the custom extensions and the out-of-the-box REST endpoints, you’d copy the global rewriter and find the part of it that matches the dispatch you want. See the example below. The important part to look for is the matches attribute. You’ll want to change the dispatch element to point to your own code.
<match-path matches="^/(v1|LATEST)/resources/([^/]+)/?$">
  <match-query-param name="database">
    <set-database checked="true">$0</set-database>
  </match-query-param>
  <add-query-param name="name">$2</add-query-param>
  <match-method any-of="GET HEAD POST">
    <match-query-param name="txid">
      <set-transaction>$0</set-transaction>
      <set-transaction-mode>query</set-transaction-mode>
    </match-query-param>
    <dispatch>/MarkLogic/rest-api/endpoints/resource-service-query.xqy</dispatch>
  </match-method>
  <match-method any-of="PUT DELETE">
    <match-query-param name="txid">
      <set-transaction>$0</set-transaction>
      <set-transaction-mode>update</set-transaction-mode>
    </match-query-param>
    <dispatch>/MarkLogic/rest-api/endpoints/resource-service-update.xqy</dispatch>
  </match-method>
</match-path>
(...)
<match-path matches="^/(v1|LATEST)/documents/?$">
  <match-query-param name="database">
    <set-database checked="true">$0</set-database>
  </match-query-param>
  <match-method any-of="POST">
    <match-query-param name="txid">
      <set-transaction>$0</set-transaction>
      <set-transaction-mode>update</set-transaction-mode>
    </match-query-param>
    <match-content-type any-of="application/x-www-form-urlencoded">
      <dispatch>/etc/sso/resources-wrapper.xqy</dispatch>
    </match-content-type>
  </match-method>
  <match-method any-of="GET HEAD OPTIONS">
    <match-query-param name="txid">
      <set-transaction>$0</set-transaction>
      <set-transaction-mode>query</set-transaction-mode>
    </match-query-param>
    <dispatch>/etc/sso/resources-wrapper.xqy</dispatch>
  </match-method>
  <match-method any-of="PUT POST DELETE PATCH">
    <match-query-param name="txid">
      <set-transaction>$0</set-transaction>
      <set-transaction-mode>update</set-transaction-mode>
    </match-query-param>
    <dispatch>/etc/sso/resources-wrapper.xqy</dispatch>
  </match-method>
</match-path>
Your wrapper module then runs the login code, followed by whatever the original endpoint was already doing, as in the example below:
xquery version "1.0-ml";

import module namespace ssolib = "example.marklogic.com/sso-lib" at "/ext/sso/lib.xqy";

declare private variable $MODULE-ENDPOINT-RESOURCES-READ := "/MarkLogic/rest-api/endpoints/resource-service-query.xqy";
declare private variable $MODULE-ENDPOINT-RESOURCES-UPDATE := "/MarkLogic/rest-api/endpoints/resource-service-update.xqy";
declare private variable $MODULE-ENDPOINT-DOCUMENTS-READ := "/MarkLogic/rest-api/endpoints/document-item-query.xqy";
declare private variable $MODULE-ENDPOINT-DOCUMENTS-UPDATE := "/MarkLogic/rest-api/endpoints/document-item-update.xqy";
declare private variable $MODULE-ENDPOINT-SEARCH-READ := "/MarkLogic/rest-api/endpoints/search-list-query.xqy";
declare private variable $MODULE-ENDPOINT-SEARCH-UPDATE := "/MarkLogic/rest-api/endpoints/search-list-update.xqy";
declare private variable $MODULE-ENDPOINT-SUGGEST := "/MarkLogic/rest-api/endpoints/suggest.xqy";
declare private variable $MODULE-ENDPOINT-TRANSACTIONS := "/MarkLogic/rest-api/endpoints/transaction-item-default.xqy";

(: logic that we have shown before :)
let $_ := ssolib:login-User()
let $context := map:map()
let $params := map:map()
(: puts all the rs: fields, plus uri, in the params map :)
let $_ :=
  for $field in xdmp:get-request-field-names()[fn:starts-with(., 'rs:') or . eq ("uri")]
  let $key := fn:replace($field, "rs:", "")
  return map:put($params, $key, xdmp:get-request-field($field))
let $method := fn:lower-case(xdmp:get-request-method())
(: the URL as requested by the client, before the rewriter changed it :)
let $original-url := xdmp:get-original-url()
return
  if (fn:starts-with($original-url, '/v1/resources/')) then
    if ($method = "put" or $method = "delete")
    then xdmp:invoke($MODULE-ENDPOINT-RESOURCES-UPDATE)
    else xdmp:invoke($MODULE-ENDPOINT-RESOURCES-READ)
  else if (fn:starts-with($original-url, '/v1/documents/')) then
    if ($method = "put" or $method = "post" or $method = "delete" or $method = "patch")
    then xdmp:invoke($MODULE-ENDPOINT-DOCUMENTS-UPDATE)
    else xdmp:invoke($MODULE-ENDPOINT-DOCUMENTS-READ)
  else if (fn:starts-with($original-url, '/v1/search/')) then
    if ($method = "delete")
    then xdmp:invoke($MODULE-ENDPOINT-SEARCH-UPDATE)
    else xdmp:invoke($MODULE-ENDPOINT-SEARCH-READ)
  else if (fn:starts-with($original-url, '/v1/suggest/')) then
    xdmp:invoke($MODULE-ENDPOINT-SUGGEST)
  else if (fn:starts-with($original-url, '/v1/transactions/')) then
    xdmp:invoke($MODULE-ENDPOINT-TRANSACTIONS)
  else ()
When you go to upgrade, keep in mind that we have copied the global rewriter. You will likely have to merge any changes when upgrading.
I strongly suggest integration at the MarkLogic server level; however, if you have to integrate in the middle layer, there is a way to do it. What you need to do is make an eval call to the MarkLogic Server before any other calls are made to it. In this eval call, you call the xdmp:login function with the extra roles and with the set-session parameter set to true. This will create a session with the extra privileges. Remember to configure the load balancer so that all requests for each session are sent to the same MarkLogic server. You will also want to follow all the steps described in the “Securing the Integration” section.
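The XQuery sent in that first eval call might look like the sketch below. It assumes the middle tier binds an external variable named $requested-roles containing a comma-separated list of role names; the variable name and the list format are hypothetical, and how the variable is bound depends on the client API in use. The allow-list is the same one used earlier in this post.

```xquery
xquery version "1.0-ml";

(: Comma-separated role names supplied by the middle tier (hypothetical name). :)
declare variable $requested-roles as xs:string external;

let $expectedRoles := ("public", "classified", "secret", "top-secret")
(: split the list and trim whitespace around each role name :)
let $roles :=
  for $role in fn:tokenize($requested-roles, ",")
  return fn:normalize-space($role)
(: set-session is true so later requests in the session keep the roles :)
return xdmp:login("default-user", (), fn:true(), $roles[. = $expectedRoles])
```

Filtering against the allow-list inside MarkLogic means a compromised or misconfigured middle tier still cannot request a role such as admin.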
Download the Integrating with Single Sign-On white paper today.