In the blog post Building a MarkLogic Docker Container, we created a Docker container and installed MarkLogic. We used a Dockerfile to build the MarkLogic container and run the MarkLogic installation, and we used Docker Compose to automate building a 3-node MarkLogic cluster for learning purposes. In each of these examples, we completed the MarkLogic installation manually. For any server that was not standalone, we also joined it to a MarkLogic cluster by hand.

Using MarkLogic’s REST Management API, we can automate this process further. With only one docker-compose command, we can have a 3-node (or more) MarkLogic cluster created and ready to use with no manual intervention required.

MarkLogic officially supports Docker containers with version 9.0-5 or later. Please use discretion when working with these examples using older versions of MarkLogic.

What is MarkLogic Initialization?

After installing and starting MarkLogic on a single host, or as the first host in a cluster, we normally use a browser to connect to port 8001 on the host. MarkLogic then does the following to initialize the new MarkLogic server:

  1. MarkLogic first creates initial databases and application servers. For example, a Security database must be created to store user data, roles, and other security information. Application servers provide Query Console on port 8000, the Administrative Interface on port 8001, Monitoring on port 8002, and other features.
  2. Next in the initialization process is joining a cluster. Since this is the first host, we skip this step and proceed to creating an administrator account.
  3. When we initialize the first (or a standalone) MarkLogic server, an administrator account must be created. When additional hosts join a cluster, they use the existing administrator account.

Automating Initialization

We can automate the initialization steps and create an administrator account for a single MarkLogic server by using MarkLogic’s REST API. We can also use the REST API to add a single, initialized MarkLogic server to a cluster. The previous blog post had an example Dockerfile for single MarkLogic server installations, as well as an example docker-compose.yml file for a 3-node MarkLogic cluster. Both needed additional manual steps to complete the installations once the MarkLogic containers were created. Let’s review those before we discuss automating the full installation process.

Creating MarkLogic Containers with Manual Initialization

The Dockerfile installs MarkLogic and exposes ports from the Docker container. Finally, the CMD line starts MarkLogic when the Docker container is created. We reach port 8001 inside the Docker container by pointing the host computer’s browser to the exposed port, and then proceed manually through the initialization steps.

For the complete discussion of the example MarkLogic Dockerfile and example MarkLogic docker-compose.yml file, please see the previous blog post, Building a MarkLogic Docker Container. Also, download the examples from GitHub at https://github.com/alan-johnson/docker-marklogic.

Here’s the part of the earlier Dockerfile that concerns MarkLogic:

FROM centos:centos7

...

# Copy the MarkLogic installer to a temp directory in the Docker image being built
COPY MarkLogic-RHEL7-8.0-5.5.x86_64.rpm /tmp/MarkLogic.rpm

# Install MarkLogic then delete the .RPM file if the install succeeded
RUN yum -y install /tmp/MarkLogic.rpm && rm /tmp/MarkLogic.rpm

# Expose MarkLogic Server ports
# Also expose any ports your own MarkLogic App Servers use such as
# HTTP, REST and XDBC App Servers for your applications
EXPOSE 7997 7998 7999 8000 8001 8002

# Start MarkLogic from init.d script.
# Define default command (which avoids immediate shutdown)
CMD /etc/init.d/MarkLogic start && tail -f /dev/null

Manually Creating the Cluster

Using docker-compose and a .yml build file (available on GitHub and in the previous blog post), we create three MarkLogic servers. Docker networking links these containers so that they can communicate with each other over HTTP. Once the containers are created and linked, we use a browser on the host computer to connect to port 8001 in each container. We manually initialize the first MarkLogic server (ml1.local) as the first node in the cluster and create an administrator account. Then we initialize the second (ml2.local) and third (ml3.local) MarkLogic servers and join them to the first MarkLogic server, creating the cluster.

Automating the Process

We can automate the initialization and cluster joining process by implementing the following steps. Note: all files, including the automation scripts, can be downloaded from the GitHub repository.

Step One: Create two shell scripts

The first shell script, initialize-ml.sh, uses MarkLogic’s REST API to initialize the MarkLogic Server and create an administrator account.

  • Using curl, we call the MarkLogic REST API to initialize MarkLogic after it starts. The request goes to the /admin/v1/init endpoint on the MarkLogic server named in the ML_HOST shell variable, which the script sets to the container’s hostname.
TIMESTAMP=`curl -X POST -d "" http://${ML_HOST}:8001/admin/v1/init`
  • We use curl again to call MarkLogic’s REST API to create the administrator account, using the /admin/v1/instance-admin endpoint. The variables USER, PASS and SEC_REALM are set from command-line arguments to the script.
TIMESTAMP=`$CURL -X POST -H \
  "Content-type: application/x-www-form-urlencoded" \
  --data "admin-username=${USER}" --data "admin-password=${PASS}" \
  --data "realm=${SEC_REALM}" \
  http://${ML_HOST}:8001/admin/v1/instance-admin`
  • The script also includes code to iterate through the passed-in arguments and verify that an administrator username and password have been given. If the arguments are missing, the script returns an error and prints a usage message. Additional code initializes variables and uses the MarkLogic REST API to wait for MarkLogic to restart.

Did you know? You can view what a Docker container writes to standard output or standard error by using the command: docker logs <containername or id>
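The argument checking described for initialize-ml.sh could look roughly like the following. This is a minimal sketch, not the script from the GitHub repository: the option letters (-u, -p, -r) and the "public" default come from the usage comment in the Dockerfile, while the function names parse_args and usage are hypothetical.

```shell
#!/bin/sh
# Sketch only: validate the options passed to initialize-ml.sh.
# parse_args and usage are hypothetical names chosen for illustration.

usage() {
  echo "Usage: initialize-ml.sh -u <admin username> -p <admin password> [-r <realm>]" >&2
}

parse_args() {
  USER=""
  PASS=""
  SEC_REALM="public"   # default realm when -r is not given
  OPTIND=1             # reset so the function can be called more than once
  while getopts ":u:p:r:" opt; do
    case "$opt" in
      u) USER="$OPTARG" ;;
      p) PASS="$OPTARG" ;;
      r) SEC_REALM="$OPTARG" ;;
      *) usage; return 1 ;;   # unknown option or missing option value
    esac
  done
  # Both the username and the password are required.
  if [ -z "$USER" ] || [ -z "$PASS" ]; then
    usage
    return 1
  fi
}
```

A script built this way would call parse_args "$@" || exit 1 before making any REST calls, so a missing username or password stops the container command early and the usage message shows up in docker logs.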

The second shell script, add-to-cluster.sh, uses the MarkLogic REST API’s /admin/v1/server-config endpoint to retrieve the configuration of each server that is joining (the second and third MarkLogic servers), and the /admin/v1/cluster-config endpoint to merge that configuration into the cluster of the initial MarkLogic server.

  • Use curl to get the current server configuration from the MarkLogic server that will be joining the cluster. The variable, JOINING_HOST, is set in a loop in the script to each of the passed-in MarkLogic server names that will join the cluster. Store the returned configuration in the variable, JOINER_CONFIG.
JOINER_CONFIG=`curl --anyauth --user admin:admin -X GET \
  -H "Accept: application/xml" \
  http://${JOINING_HOST}:8001/admin/v1/server-config`
  • Use curl again to send the configuration data stored in JOINER_CONFIG to a MarkLogic server already in the cluster. Since we are creating a new cluster, this would be the first MarkLogic server created and initialized. This REST API call returns cluster configuration information that is saved to a file, cluster-config.zip.
curl --anyauth --user admin:admin -X POST \
  -o cluster-config.zip -d "group=Default" \
  --data-urlencode "server-config=${JOINER_CONFIG}" \
  -H "Content-type: application/x-www-form-urlencoded" \
  http://${BOOTSTRAP_HOST}:8001/admin/v1/cluster-config
  • Finally, we use curl to send the new cluster configuration to the MarkLogic server joining the cluster. The variable JOINING_HOST is set in a loop in the script for each MarkLogic server name that is to join the cluster.
TIMESTAMP=`curl --anyauth --user admin:admin -X POST \
  -H "Content-type: application/zip" \
  --data-binary @./cluster-config.zip \
  http://${JOINING_HOST}:8001/admin/v1/cluster-config`

This puts all MarkLogic servers in the same cluster as the initial MarkLogic server.
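Several of the curl calls above store their response in TIMESTAMP because the target server restarts afterwards, and the script has to wait for it to come back. The sketch below shows one way such a wait could be written. It assumes that the response contains a last-startup element and that GET /admin/v1/timestamp returns the time of the most recent restart. The function names, the retry variables, and the sed expression are illustrative and not taken from the repository.

```shell
#!/bin/sh
# Sketch only: wait until a MarkLogic host reports a new restart timestamp.
N_RETRY=${N_RETRY:-10}                # how many times to poll
RETRY_INTERVAL=${RETRY_INTERVAL:-5}   # seconds between polls

# Pull the timestamp text out of a response read from standard input.
# Assumption: the response holds <last-startup ...>TIMESTAMP</last-startup>.
extract_timestamp() {
  sed -n 's%.*<last-startup[^>]*>\([^<]*\)</last-startup>.*%\1%p'
}

# Ask a host for its latest restart timestamp.
# Assumption: /admin/v1/timestamp answers with that timestamp.
fetch_timestamp() {
  curl -s -S --anyauth --user "${USER}:${PASS}" \
    "http://$1:8001/admin/v1/timestamp"
}

# wait_for_restart <host> <timestamp seen before the restart>
wait_for_restart() {
  host="$1"
  previous="$2"
  attempt=1
  while [ "$attempt" -le "$N_RETRY" ]; do
    current=$(fetch_timestamp "$host" 2>/dev/null) || current=""
    # Empty reply: the server is still down.
    # Unchanged timestamp: the server has not restarted yet.
    if [ -n "$current" ] && [ "$current" != "$previous" ]; then
      return 0
    fi
    sleep "$RETRY_INTERVAL"
    attempt=$((attempt + 1))
  done
  echo "ERROR: $host did not restart in time" >&2
  return 1
}
```

Polling with a bounded number of retries keeps a container from hanging forever if a server never comes back, and the error message then appears in the container's log.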

Step Two: Change the Dockerfile to call our initialize-ml.sh script after starting MarkLogic

The changes to the Dockerfile include:

  • Copy the shell scripts to the same directory in the Docker container as the MarkLogic installer.
  • Change the CMD line to run the initialize-ml.sh script after starting MarkLogic. This initializes MarkLogic and creates an administrator account.

Updated Dockerfile:

FROM centos:centos7

# Get any CentOS updates then clear the Docker cache
RUN yum -y update && yum clean all

# Install MarkLogic dependencies
RUN yum -y install glibc.i686 gdb.x86_64 redhat-lsb.x86_64 && yum clean all

# Install the initscripts package so MarkLogic starts ok
RUN yum -y install initscripts && yum clean all

# Set the Path
ENV PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/MarkLogic/mlcmd/bin

# Copy the MarkLogic installer to a temp directory in the Docker image being built
COPY MarkLogic-RHEL7-8.0-5.5.x86_64.rpm /tmp/MarkLogic.rpm

# Copy the shell scripts to a temp directory in 
# the Docker image and change permissions to make
# them executable
COPY initialize-ml.sh /tmp/initialize-ml.sh
COPY add-to-cluster.sh /tmp/add-to-cluster.sh
RUN chmod +x /tmp/*.sh

# Install MarkLogic then delete the .RPM file if the install succeeded
RUN yum -y install /tmp/MarkLogic.rpm && rm /tmp/MarkLogic.rpm

# Expose MarkLogic Server ports
# Also expose any ports your own MarkLogic App Servers use such as
# HTTP, REST and XDBC App Servers for your applications
EXPOSE 7997 7998 7999 8000 8001 8002

# Start MarkLogic. After MarkLogic has started
# successfully, run the initialize script.
# initialize-ml.sh usage:
#   initialize-ml.sh -u <desired admin username>
#                    -p <desired admin password>
#                    -r <realm for the password> 
#                       Realm is optional, 
#                            the default is "public"
#  
# Also, execute tail such that it waits forever. 
# This prevents the container from automatically
# stopping after starting MarkLogic.
CMD /etc/init.d/MarkLogic start && /tmp/initialize-ml.sh -u admin -p admin -r public && tail -f /dev/null

Step Three: Change the docker-compose.yml file to call the add-to-cluster.sh script for the second and third MarkLogic servers

Updated docker-compose.yml file:

version: '2'
services:
  mlnode1: 
    build: .
    image: ml8:build
    expose:
      - "7997"
      - "7998"
      - "7999"
    ports:
      - "8000:8000"
      - "8001:8001"
      - "8002:8002"
      - "8010:8010"
    hostname: "ml1.local"
    container_name: "ml1.local"
  mlnode2:
    image: ml8:build
    expose:
      - "7997"
      - "7998"
      - "7999"
    ports:
      - "18000:8000"
      - "18001:8001"
      - "18002:8002"
      - "18010:8010"
    hostname: "ml2.local"
    container_name: "ml2.local"
    links:
      - mlnode1:mlnode1
  mlnode3:
    image: ml8:build
    expose:
      - "7997"
      - "7998"
      - "7999"
    ports:
      - "28000:8000"
      - "28001:8001"
      - "28002:8002"
      - "28010:8010"
    hostname: "ml3.local"
    container_name: "ml3.local"
    links:
      - mlnode1:mlnode1
      - mlnode2:mlnode2
    command: /bin/sh -c "/etc/init.d/MarkLogic start && /tmp/initialize-ml.sh -u admin -p admin -r public && /tmp/add-to-cluster.sh -u admin -p admin ml1.local ml2.local ml3.local && tail -f /dev/null"

The changes to the docker-compose.yml file include adding a command: option to the last MarkLogic node to join the cluster. The command: option overrides the default CMD line in the Dockerfile but only for the node under which it appears in the docker-compose.yml file. By doing so, we replace the command that is executed when the container is created, allowing us to:

  • Start MarkLogic on the last node to join the cluster, as the new Dockerfile also does.
  • Initialize MarkLogic and create an administrator account. Again, same as our new Dockerfile.
  • After successful initialization, call the add-to-cluster.sh script, passing in the arguments listed in the table below.
  • Last, just like our new Dockerfile, call the tail command so that the container doesn’t exit after it is created.

Argument     Meaning
-u admin     The administrator username created in the initial MarkLogic server.
-p admin     The administrator password.
ml1.local    The Docker hostname of the first MarkLogic server.
ml2.local    The Docker hostname of a MarkLogic server to join the cluster of the first MarkLogic server.
ml3.local    The Docker hostname of a MarkLogic server to join the cluster of the first MarkLogic server.
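To show how the arguments in the table could be consumed, here is a minimal sketch of the argument handling and host loop in a script like add-to-cluster.sh. It assumes the first hostname is the server already in the cluster and every later hostname is a server to join. The functions add_to_cluster and join_host are hypothetical, and join_host only prints a message where a real script would make the three curl calls shown earlier.

```shell
#!/bin/sh
# Sketch only: split "-u admin -p admin ml1.local ml2.local ml3.local"
# into credentials, a bootstrap host, and a list of joining hosts.

# Placeholder for the server-config / cluster-config exchange.
join_host() {
  echo "joining $2 to the cluster of $1"
}

add_to_cluster() {
  USER=""
  PASS=""
  OPTIND=1
  while getopts ":u:p:" opt; do
    case "$opt" in
      u) USER="$OPTARG" ;;
      p) PASS="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))   # drop the options, keep only the hostnames

  # Need credentials, one bootstrap host, and at least one joining host.
  if [ -z "$USER" ] || [ -z "$PASS" ] || [ "$#" -lt 2 ]; then
    echo "Usage: add-to-cluster.sh -u <user> -p <password> <bootstrap host> <joining host>..." >&2
    return 1
  fi

  BOOTSTRAP_HOST="$1"
  shift
  for JOINING_HOST in "$@"; do
    join_host "$BOOTSTRAP_HOST" "$JOINING_HOST" || return 1
  done
}
```

Because the joining hosts are simply the remaining positional arguments, growing the cluster beyond three nodes would only mean appending more hostnames to the command: line.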

Wrap up

MarkLogic server initialization is required after MarkLogic has been installed and started. Unless the MarkLogic server is joining a cluster, an administrator account must also be created. These steps are normally completed manually.

We can automate this process by using the MarkLogic REST API. With scripting, servers can be initialized, administrator accounts created, and clusters joined. We used curl to do these steps from shell scripts, which are a good fit with Docker.

Enjoy MarkLogic and Docker!
