Resolving “Unresolvable” Deadlocks

05.01.2018
7 minute read

MarkLogic can resolve most deadlocks on its own. When two updates depend on each other’s locks, MarkLogic detects the deadlock and resolves it by restarting the update with the fewest locks. However, there’s one scenario where this solution doesn’t work: when one update nests another update in a separate transactional context and the deadlock happens between the two updates. A restart would only cause the same issue to happen again. These “unresolvable” deadlocks are essentially code bugs. They can happen when using xdmp:eval, xdmp:invoke, or xdmp:invoke-function, and in this article we’ll show techniques to avoid these problematic deadlocks.

When using these functions, you choose whether the invoked code runs in the caller’s transactional context (isolation set to same-statement, in which case locks are shared and the two cannot deadlock) or in a separate one (isolation set to different-transaction, which is where you need to be careful). The default is different-transaction, so by default you need to be careful.

The following situations are examples of when a programmer risks unresolvable deadlocks:

  1. A REST call implementing PUT and/or DELETE that does an internal invoke. Both methods run in update transaction mode.
  2. An update statement that audits the user’s action and must record it even if the outer update fails. Writing the audit record to a separate database is fine, but writing it to the same database can cause problems.
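
As a rough illustration of the second situation, here is a minimal sketch of an audit helper that writes to a separate database, which is the case described above as safe. The database name "Audit", the /audit/ URI prefix, and the function name local:audit are hypothetical and not part of the original example; the sketch also assumes that database already exists.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: record an audit entry in its own transaction,
 : in a separate database, so it survives a rollback of the caller.
 : "Audit" and the /audit/ prefix are assumed names. :)
declare function local:audit($action as xs:string) as item()*
{
  xdmp:invoke-function(
    function() {
      xdmp:document-insert(
        fn:concat("/audit/", sem:uuid-string(), ".json"),
        object-node {
          "user"   : xdmp:get-current-user(),
          "action" : $action,
          "at"     : fn:current-dateTime()
        }
      )
    },
    <options xmlns="xdmp:eval">
      <database>{xdmp:database("Audit")}</database>
      <transaction-mode>update-auto-commit</transaction-mode>
      <isolation>different-transaction</isolation>
    </options>
  )
};

local:audit("PUT /content.html attempted")
```

Because the audit entry lives in another database, the helper never asks for a lock on a document the calling transaction already holds.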

The example below is a REST extension that allows an update of a record at most once every minute:

sample-update.xqy – a REST API extension
xquery version "1.0-ml";

module namespace ns = "http://marklogic.com/rest-api/resource/sample-update";
declare default function namespace "http://marklogic.com/rest-api/resource/sample-update";

declare variable $tracker-file := "/tracker.json";
declare variable $content-file := "/content.html";

(:
 : HTTP PUT runs in update transaction mode by default.
 : More information about transactions is available at
 : https://docs.marklogic.com/guide/app-dev/transactions
 :)
declare function put(
    $context as map:map,
    $params  as map:map,
    $input   as document-node()*
) as document-node()?
{
  if (check-tracker()) then (
    xdmp:document-insert($content-file, $input)
    ,
    update-tracker()
    ,
    document{ fn:true() }
  ) else (
    document{ fn:false() }
  )
};

(: 
 : limit update to once per minute 
 : other applications may want to deduct a certain balance for each 
 : transaction made
 :)
declare function check-tracker(
) as xs:boolean {
  let $tracker := doc($tracker-file)
  (: cast explicitly: the JSON property holds the timestamp as text :)
  let $age := fn:current-dateTime() - xs:dateTime($tracker/timestamp)
  return not(fn:exists($tracker)) or $age gt xs:dayTimeDuration('PT60S')
};

declare function update-tracker(
) {
  xdmp:invoke-function(
    function(){
      xdmp:document-insert($tracker-file, object-node{'timestamp' : fn:current-dateTime()})
    }
  )
};

If we try to invoke this operation via curl, it will fail as follows:

$> curl -X PUT --anyauth -u admin:admin "http://localhost:9999/v1/resources/sample-update" \
  -H "Content-Type: application/json" -d '{"new" : "content"}'

{"errorResponse":{"statusCode":500, "status":"Internal Server Error", "messageCode":"INTERNAL ERROR", 
"message":"SVC-EXTIME: xdmp:document-insert("/tracker.json", 
object-node{"timestamp":text{"2018-04-11T18:40:51.4589003+08:00"}}) 
-- Time limit exceeded . See the MarkLogic server error log for further detail."}}

This error produces many “Notice” level entries in the MarkLogic log. In MarkLogic 8 and below, they all go to ErrorLog.txt. In MarkLogic 9 and above, where each app server gets its own error log, they go to 9999_ErrorLog.txt:

2018-04-18 21:18:29.580 Notice: SVC-EXTIME: xdmp:document-insert("/tracker.json", 
object-node{"timestamp":text{"2018-04-18T21:08:29.47+08:00"}}) -- Time limit exceeded
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy, at 46:6,
2018-04-18 21:18:29.580 Notice:+in function() as item()*() [1.0-ml]
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy,
2018-04-18 21:18:29.580 Notice:+in xdmp:invoke(function() as item()*) [1.0-ml]
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy, at 44:2,
2018-04-18 21:18:29.580 Notice:+in update-tracker() [1.0-ml]

The above logs provide us with the following information:

  1. The transaction trying to acquire the lock is located at /sample-update/assets/resource.xqy, at 46:6 which is our sample-update.xqy
  2. It involves the document “/tracker.json”
  3. And it was spawned from /sample-update/assets/resource.xqy, at 44:2

Reviewing sample-update.xqy for anything that touches “/tracker.json”, we find that check-tracker() reads the document via fn:doc before update-tracker() calls xdmp:invoke-function. Since PUT runs in update mode, that read takes a read lock on “/tracker.json”, and the lock is held until the outer transaction finishes. The child transaction needs a write lock on the same document to proceed with the update, so it waits for the parent to release its read lock, while the parent waits for the child to return. Neither can make progress, and the request runs until it hits the time limit. (See the MarkLogic transactions guide linked in the code comment above for more on locks and transactions.)

There are several options on how to resolve this issue. Each of them can be implemented independently or collectively. Let’s go through them one by one.

Solution 1: Use the same transaction (same-statement isolation)

We modify update-tracker() as follows:

declare function update-tracker(
) {
  xdmp:invoke-function(
    function(){
      xdmp:document-insert($tracker-file, object-node{'timestamp' : fn:current-dateTime()})
    }
    , 
    <options xmlns="xdmp:eval">
      <isolation>same-statement</isolation>
    </options>
  )
};

This minor change will allow the child transaction to share the lock that has been acquired by the main transaction. However, this approach may not be an option if you want the child transaction to execute regardless of the success or failure of the outer transaction, e.g. creating an audit trail of attempts.

Solution 2: Read “lock-free”

We modify check-tracker() as follows:

declare function check-tracker(
) as xs:boolean {
  let $tracker := xdmp:invoke-function(
    function(){
      doc($tracker-file)
    }
    , 
    <options xmlns="xdmp:eval">
      <transaction-mode>query</transaction-mode>
    </options>
  )
  let $age := fn:current-dateTime() - xs:dateTime($tracker/timestamp)
  return not(fn:exists($tracker)) or $age gt xs:dayTimeDuration('PT60S') 
};

This approach moves the doc call into a separate read-only transaction, so the document content can be read without the main transaction holding a read lock on it.

However, this approach will not work if done within a multi-statement transaction as the invoke transaction will not see the temporary changes that are only available inside the multi-statement transaction.

Additionally, each lock-free invoke runs at the timestamp current when it starts, which can be later than the timestamp of the calling transaction and of any invoke made before it. An implementation like the following can therefore behave unpredictably:

let $query := cts:word-query('agent smith')
let $result1 := xdmp:invoke-function(
  function(){
    cts:search(/, $query)[1]
  }
  ,
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
let $_ := xdmp:invoke-function(
  function(){
    xdmp:document-insert(concat('/item.',sem:uuid-string(),'.json'), object-node{'name' : 'Agent Smith'})
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>update-auto-commit</transaction-mode>
  </options>
)
let $result2 := xdmp:invoke-function(
  function(){
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
return document-uri($result1) = document-uri($result2)

$result1 and $result2 can differ: the second search runs at a later timestamp than the first and sees the document inserted in between. To address this, we acquire a timestamp once and pass it to every query invoke, as in the example below:

let $query := cts:word-query('agent smith')
let $timestamp := xdmp:invoke-function(
  function(){
    xdmp:request-timestamp()
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
let $result1 := xdmp:invoke-function(
  function(){
    cts:search(/, $query)[1]
  }
  ,
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
    <timestamp>{$timestamp}</timestamp>
  </options>
)
let $_ := xdmp:invoke-function(
  function(){
    xdmp:document-insert(concat('/item.',sem:uuid-string(),'.json'), object-node{'name' : 'Agent Smith'})
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>update-auto-commit</transaction-mode>
  </options>
)
let $result2 := xdmp:invoke-function(
  function(){
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
    <timestamp>{$timestamp}</timestamp>
  </options>
)
return document-uri($result1) = document-uri($result2)

This makes both search transactions execute at the same timestamp. The second search remains unaware of the insert that happened a step before.
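
The repeated options blocks can be folded into a small helper. The following is a minimal sketch of that idea, not part of the original example: local:query-at is a hypothetical name, and the sketch assumes the timestamp is obtained the same way as above.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: run $f as a lock-free query pinned to $timestamp. :)
declare function local:query-at(
  $f         as function() as item()*,
  $timestamp as xs:unsignedLong
) as item()*
{
  xdmp:invoke-function(
    $f,
    <options xmlns="xdmp:eval">
      <transaction-mode>query</transaction-mode>
      <timestamp>{$timestamp}</timestamp>
    </options>
  )
};

let $query := cts:word-query('agent smith')
(: acquire one timestamp up front, from a query transaction :)
let $timestamp := xdmp:invoke-function(
  function() { xdmp:request-timestamp() },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
let $result1 := local:query-at(function() { cts:search(fn:doc(), $query)[1] }, $timestamp)
let $result2 := local:query-at(function() { cts:search(fn:doc(), $query)[1] }, $timestamp)
return document-uri($result1) = document-uri($result2)
```

Routing every read through one helper makes it harder to forget the timestamp option on a single call.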

The following guidelines can be used as reference when developing your applications:

  1. Keep your queries/reads lock-free as much as possible. They run faster, especially when retrieving a large number of documents.
  2. By default, xdmp:eval, xdmp:invoke, and xdmp:invoke-function run with update set to auto, isolation set to different-transaction, and prevent-deadlocks set to false, which increases the risk of deadlocks.
  3. If you intend only to query, say so explicitly: set the update option to false, i.e. <update>false</update> (or <transaction-mode>query</transaction-mode>, as in the examples above). This guarantees the transaction executes lock-free and throws an exception if it attempts any update. Relying on auto can lead to a query accidentally running in update mode.
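
Guidelines 2 and 3 can be sketched as two small wrappers that state the intent of each invoke instead of relying on the defaults. The function names local:read-only and local:guarded-update are hypothetical, and the sketch assumes MarkLogic 9 or later, where the update option is available. As we understand the prevent-deadlocks option, setting it to true makes the server raise XDMP-PREVENTDEADLOCKS when an update transaction invokes another update in a different transaction, rather than letting the request wait until it times out; check the xdmp:eval options documentation for your version before relying on this.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: run $f as a query, so it takes no locks.
 : An update attempted inside $f raises an error. :)
declare function local:read-only($f as function() as item()*) as item()*
{
  xdmp:invoke-function(
    $f,
    <options xmlns="xdmp:eval">
      <update>false</update>
    </options>
  )
};

(: Hypothetical helper: run $f as a separate update transaction, asking the
 : server to fail fast instead of risking a deadlock with the caller. :)
declare function local:guarded-update($f as function() as item()*) as item()*
{
  xdmp:invoke-function(
    $f,
    <options xmlns="xdmp:eval">
      <update>true</update>
      <isolation>different-transaction</isolation>
      <prevent-deadlocks>true</prevent-deadlocks>
    </options>
  )
};

(: read the tracker without holding a read lock in the calling transaction :)
local:read-only(function() { fn:doc("/tracker.json") })
```

During development, failing fast with an explicit error is usually easier to diagnose than an SVC-EXTIME timeout in the log.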
