Elsevier is a leading publisher and information provider for medical, academic and health-related organizations. Elsevier prides itself on supplying customers with the information they need to conduct research, perform experiments, aid patients, and achieve mission-critical objectives.
Despite Elsevier’s significant investments in search technology, its users found it increasingly time-consuming to extract the information they needed from the company’s mountain of data. Elsevier faced four formidable challenges:
Lack of a central repository. Each body of content existed in a separate database – either in a relational database format or a proprietary one – with several applications built on each database.
Huge range of file formats. Normalizing content was extremely time-consuming. For one application project alone, there were 35 different document formats involved.
High cost. New functionality was time-consuming and expensive to build. The complex logic needed to deconstruct a document and analyze relationships between documents had to be built application-by-application.
Massive amounts of content. The final content repository was estimated to exceed 5 terabytes in size. It included more than five million full-text journal articles across 1,800 journals; over 60 million citations and abstracts (separate from the articles); 20,000 in-print books; 9,000 out-of-print books; and thousands of informational pamphlets.
Recognizing the potential of tagged search elements, Elsevier kept pace with the evolution of descriptive signature technologies by investing in XML. By 2004, Elsevier had reengineered its products along the lines of web service architectures, creating an XML repository that offered new efficiencies to its IT staff and higher functionality for users. But its reliance on relational database technology still tied the company to long, expensive product development cycles and less-than-optimal performance.
“We offered to show Elsevier how the MarkLogic Server could leverage their investment in XML to deliver on Elsevier’s vision,” recalls MarkLogic Co-founder and Chief Technologist Paul Pedersen. “Our promise was simple. Hand us any amount of data, as is, from your archives. We’ll hand you back an entirely new application based on that content.” The application MarkLogic delivered in just a few days was more flexible than anything Elsevier had online at the time. This accomplishment was all the more remarkable considering that the 0.5 terabytes of content loaded into MarkLogic Server consisted of over 35 different formats. Impressed, Elsevier engaged MarkLogic and used MarkLogic Server to consolidate all of its archives, rapidly build new applications, and create value-added services from its repository.
The majority of our time on a project is consumed by deciding exactly how the content will be used and preparing it for the database. With MarkLogic, we’ve now cut that time in half.
MarkLogic has dramatically accelerated the deployment of Elsevier’s products and services while greatly reducing the costs of content loading and design. The result is a new generation of solutions that help professionals find exactly the information they need, when they need it most, which translates into even faster research cycles and clinical diagnoses.