MarkLogic has always tried to ensure that well-designed XML performs well “as is” in MarkLogic Server. For example, if your schema uses descriptive and unique element names, your application code will be not only clean and readable but fast as well. On the other hand, if your schema contains a lot of generic element names (such as “item”) used in multiple ways, your code (in XQuery or XSLT) will be harder to read, and you may also need to do some extra legwork to get the best performance.
For example, consider a schema that has a lot of elements named <group> (or <section> or <item> or some other generic name) but which play very different roles—in this case indicated by the value of an attribute:
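As a rough illustration of that kind of schema, here is a minimal sketch using Python's standard library rather than MarkLogic itself. The `<catalog>` wrapper, the "gadget" and "gizmo" values, and the text content are all hypothetical, invented only to make the sample complete.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: every container is a <group> and every child an <item>;
# only the "type" attribute says what role each one plays.
GENERIC_XML = """
<catalog>
  <group type="widget">
    <item type="sprocket">Sprocket A</item>
    <item type="gear">Gear B</item>
  </group>
  <group type="gadget">
    <item type="gizmo">Gizmo C</item>
  </group>
</catalog>
"""

root = ET.fromstring(GENERIC_XML)

# Anything keyed on element name alone sees a single name for both groups.
group_names = {g.tag for g in root.iter("group")}
group_types = [g.get("type") for g in root.iter("group")]

print(group_names)  # {'group'}
print(group_types)  # ['widget', 'gadget']
```

The two groups are indistinguishable by name; the distinction lives entirely in the attribute value.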
Since MarkLogic indexes elements by their name, it is not automatically going to make a distinction between the various <group> elements you have, because they have the same name. That being said, certain queries will still run maximally fast, such as when you want to restrict your results to a particular attribute value, using a simple XPath expression like this:
//group[@type eq 'widget']
MarkLogic Server will use its Universal Index to avoid reading any documents that don’t have a <group> element whose “type” attribute is equal to “widget”. So we’re okay so far.
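To see what that attribute-restricted query selects, the sketch below runs an equivalent path over a small hypothetical document with Python's `xml.etree.ElementTree`. This only mirrors the selection logic: ElementTree walks the tree in memory, whereas MarkLogic resolves the query from its indexes, and ElementTree's limited XPath dialect writes the comparison as `=` instead of `eq`.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element content is made up for illustration.
doc = ET.fromstring(
    '<catalog>'
    '<group type="widget"><item type="sprocket">Sprocket A</item></group>'
    '<group type="gadget"><item type="gizmo">Gizmo C</item></group>'
    '</catalog>'
)

# Same restriction as //group[@type eq 'widget'], in ElementTree's syntax.
widget_groups = doc.findall(".//group[@type='widget']")
first_item_text = widget_groups[0][0].text

print(len(widget_groups))  # 1
print(first_item_text)     # Sprocket A
```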
But there are still a few issues here. For one thing, your code will not be very readable. This expression:
is pretty noisy compared to, for example:
//widgets/sprocket
which is what your code would look like if you used more descriptive element names.
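The readability gap can be sketched side by side. The generic-name path below is an assumed example built from the <group> and <item> names mentioned in this post, not necessarily the exact expression the comparison refers to; both documents are hypothetical, and Python's `xml.etree.ElementTree` stands in for a real XQuery engine.

```python
import xml.etree.ElementTree as ET

# The same hypothetical data modeled two ways.
generic = ET.fromstring(
    '<catalog><group type="widget">'
    '<item type="sprocket">Sprocket A</item>'
    '<item type="gear">Gear B</item>'
    '</group></catalog>'
)
descriptive = ET.fromstring(
    '<catalog><widgets>'
    '<sprocket>Sprocket A</sprocket>'
    '<gear>Gear B</gear>'
    '</widgets></catalog>'
)

# Generic names need a predicate at every step; descriptive names do not.
noisy_path = ".//group[@type='widget']/item[@type='sprocket']"
clean_path = ".//widgets/sprocket"

noisy_hits = [el.text for el in generic.findall(noisy_path)]
clean_hits = [el.text for el in descriptive.findall(clean_path)]

print(noisy_hits)  # ['Sprocket A']
print(clean_hits)  # ['Sprocket A']
```

Both paths select the same content, but the second states the intent directly in the element names.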
The other issue is that you may run into problems when you want to do something more advanced, such as word search within subsets of your documents. Specifically, if you want to restrict your search results to all group elements except widget groups, that will be challenging. (Fields can help you do the converse, but in that case you may have to enumerate all the ones you are interested in getting results for.)
Another issue with the above design is that, despite the potential benefit of being data-driven and extensible, it’s not possible to apply schema constraints that are unique to specific classes of <group> elements (at least in W3C XML Schemas). You can’t, for example, restrict the content of <group> elements to <sprocket> and <gear> elements only when their type attribute is “widget”. If you want different content models, then you need to use different element names. Starting off with generic <group> elements may lead you down a slippery slope: you’ll find yourself using other generic names like “item”, and even then you won’t be able to effectively restrict the “type” values to only the applicable ones.
Here’s what an arguably better (and more readable) schema design would look like:
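One way to picture such a design is to derive it mechanically from the generic one, renaming each element after its "type" value. The sketch below does this with Python's standard library on a hypothetical document; the `rename_by_type` helper and the convention of pluralizing container names (widget becomes widgets) are assumptions made for illustration, not something prescribed by MarkLogic.

```python
import xml.etree.ElementTree as ET

# Hypothetical generic-name input.
generic = ET.fromstring(
    '<catalog>'
    '<group type="widget">'
    '<item type="sprocket">Sprocket A</item>'
    '<item type="gear">Gear B</item>'
    '</group>'
    '</catalog>'
)

def rename_by_type(root):
    """Rename each typed element in place after its "type" attribute value."""
    for el in root.iter():
        kind = el.attrib.pop("type", None)
        if kind is None:
            continue  # untyped elements (such as <catalog>) keep their name
        # Assumed convention: containers take the plural form of their type.
        el.tag = kind + "s" if el.tag == "group" else kind
    return root

descriptive = rename_by_type(generic)
descriptive_xml = ET.tostring(descriptive, encoding="unicode")

print(descriptive_xml)
# <catalog><widgets><sprocket>Sprocket A</sprocket><gear>Gear B</gear></widgets></catalog>
```

With the roles carried by element names, each kind of container can have its own content model, and a path such as `//widgets/sprocket` needs no attribute predicates.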
To conclude, there are lots of good reasons to use descriptive, unique element names whenever possible, and doing so plays nicely with human readers, XQuery, XSLT, XML Schemas, and MarkLogic Server.