R.R. Bowker is an ISBN (International Standard Book Number) agency for the U.S. The company receives data feeds from publishers, wholesalers, and distributors. Bowker’s Global Books in Print database contains bibliographic information on millions of book titles, with more than 20 million documents accessible to a customer base comprising the entire book industry, from publishers and libraries to bookstores and distributors.
I now control when the content gets out to the customers. Now we run a day behind, but that’s a choice we made, not because we were limited by the tool. The search response time is sub-second, whereas the search response time in the Verity world was around two and a half to three seconds.
The MarkLogic XML content server essentially combines full-text search with the W3C-standard XQuery language. The platform can load, query, manipulate, and render content. Employing the MarkLogic Server enabled Bowker to improve its search capabilities through a combination of XML element query, XML proximity search, and full-text search. It took only about 4 to 5 months for MarkLogic and Bowker to develop the solution and implement it.
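To make the three search styles named above more concrete, here is a minimal sketch in plain Python using only the standard library. It is not MarkLogic's API and it is not XQuery. The sample records, element names, and function names are all hypothetical and exist only to show how an element query, a full-text search, and a proximity search differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample records; element names and values are illustrative only.
SAMPLE = """
<books>
  <book isbn="0000000001">
    <title>Introduction to XML Search</title>
    <description>A practical guide to full-text search over structured XML content.</description>
  </book>
  <book isbn="0000000002">
    <title>Gardening Basics</title>
    <description>Search for the best soil, then plant. Structured advice comes later.</description>
  </book>
</books>
"""


def tokens(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def element_query(root, element, term):
    """Element query: match a term only inside one named element."""
    term = term.lower()
    return [
        book.get("isbn")
        for book in root.iter("book")
        if any(term in tokens(el.text or "") for el in book.iter(element))
    ]


def full_text_search(root, term):
    """Full-text search: match a term anywhere in the record's text."""
    term = term.lower()
    return [
        book.get("isbn")
        for book in root.iter("book")
        if term in tokens(" ".join(book.itertext()))
    ]


def proximity_search(root, first, second, max_distance):
    """Proximity search: both terms in one element, within max_distance words."""
    first, second = first.lower(), second.lower()
    hits = []
    for book in root.iter("book"):
        for el in book:
            words = tokens(el.text or "")
            pos_first = [i for i, w in enumerate(words) if w == first]
            pos_second = [i for i, w in enumerate(words) if w == second]
            if any(abs(i - j) <= max_distance for i in pos_first for j in pos_second):
                hits.append(book.get("isbn"))
                break  # one match per book is enough
    return hits


root = ET.fromstring(SAMPLE)
print(element_query(root, "title", "search"))
print(full_text_search(root, "search"))
print(proximity_search(root, "search", "structured", 3))
```

In this toy data, the word "search" appears in both records, but only the first has it in the title, and only the first has "search" and "structured" within three words of each other. A production XML content server would answer such queries from indexes rather than by scanning every record as this sketch does.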
Heinzelman says that the way in which MarkLogic stores the data makes it easier for Bowker to make changes in document structure and add new content when desired. “It was very difficult to add different types of content into the Verity world,” says Heinzelman. “It almost invariably led to us having to rebuild the whole database and that would take 3 to 4 weeks. Now we can drop in new document types very quickly.”
Beyond helping Bowker solve its immediate need for a better search engine, MarkLogic also assisted the company with its long-term goals for a solid content repository that can grow with it. “Initially, I started looking at it as just a search engine, not a content repository,” says Heinzelman. “At the time, we had a very strong Oracle database, where we were storing content. Since then, as I started to roll out future plans for 2008, 2009, and 2010, our plan is to move all of our content into MarkLogic as a content repository.”
Another key benefit is the cost savings Bowker has realized as a result of the initiative. Previously, Bowker needed a full-time employee on staff to manage Verity. Now, the company has an employee who spends, at most, one-quarter of his time managing the current infrastructure.
Heinzelman says the flexibility of the MarkLogic solution has Bowker already contemplating next steps, which will include making the most of full book content. The company plans to use the technology “as a tool to mine the content and create tools to come up with ways to sell relevancy,” says Heinzelman. “We want to reduce the amount of time people need to look. We think we can do that by mining content and selling information and metadata around relevancy; and with the sales data that we have, be able to tie that into it too.” Heinzelman says the Verity platform did not provide the flexibility to allow for full book content. As long as Bowker can collect such data, the MarkLogic Server will be able to handle it.