Royal Society of Chemistry
When the content experts at the Royal Society of Chemistry (RSC) found themselves struggling to manage millions of buried data files, they partnered with MarkLogic to build a new solution. Using the MarkLogic Enterprise NoSQL database, the RSC has made over a century’s worth of information accessible to entrepreneurs, educators, and researchers around the world.
Founded over 150 years ago in the United Kingdom, the RSC is Europe’s largest organization dedicated to furthering awareness of the chemical sciences. With more than 48,000 global members, the RSC is the heir and successor of four renowned and long-established chemical science bodies: The Chemical Society, The Society for Analytical Chemistry, The Royal Institute of Chemistry, and The Faraday Society. The RSC’s headquarters are in London and Cambridge, UK, with international offices in the USA, China, Japan, India, and Brazil.
To strengthen knowledge of the profession and science of chemistry, the RSC holds conferences, meetings, and public events, and also publishes industry-renowned scientific journals, books and databases.
Adding to its wealth of content, the RSC recently acquired the rights to The Merck Index. Widely considered the worldwide authority on chemistry information, this renowned reference book has been used by industry professionals for over 120 years.
It’s a tall order to manage a single year’s worth of data—so how about 170 of them? Since the 1840s, the RSC has gathered millions of images, science data files and articles from more than 200,000 authors. All of that information was stored in a wide range of formats at multiple locations and was growing by the day.
In 2010, largely due to the huge growth of social media and digital formats, the RSC launched an initiative to make its data more accessible, fluid and mobile.
David Leeming, strategic innovation group solutions manager for RSC, sums up the society’s goal: “We needed an integrated repository that would make all of our content accessible online to anyone—from teachers to businesses to researchers. The key was finding the right technology.”
After evaluating several major providers, the RSC chose MarkLogic as the best platform for its needs and built three sites on it: RSC Publishing (http://www.rsc.org/publishing), Learn Chemistry (http://www.rsc.org/learn-chemistry), and Chemistry World (http://www.rsc.org/chemistryworld).
Given the society’s wide range of information media—books, emails, manuals, tweets, metadata, and more—the data does not conform to a single schema, which means a traditional relational database can’t accommodate it. MarkLogic’s document-based data model is ideal for varied formats and hierarchical metadata. The RSC can simply load its information as-is, without having to conform to a rigid format.
As Leeming points out, “A book chapter is very different from a journal article. A relational database can’t combine the two. MarkLogic is flexible enough to handle all types of unstructured content in a single delivery mechanism, from spreadsheets and images to videos and social media comments.”
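To make Leeming's point concrete, here is a minimal sketch in Python of how a book chapter and a journal article with different structures can sit side by side in one document store and be searched together. It uses only the Python standard library and is not MarkLogic's API; the URIs, element names, and sample text are all hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: URIs, element names, and text are invented.
# Note that the two documents do not share a schema.
DOCS = {
    "/books/ch1.xml": (
        "<chapter><book>Intro to Catalysis</book><title>Zeolites</title>"
        "<body>Zeolite frameworks act as solid acid catalysts.</body></chapter>"
    ),
    "/journals/a1.xml": (
        "<article><journal>Example Journal</journal><volume>1</volume>"
        "<title>Acid catalysts in flow</title>"
        "<abstract>We study solid acid catalysts under flow conditions.</abstract>"
        "</article>"
    ),
}


def search(docs, term):
    """Return sorted URIs of documents whose text contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    for uri, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        # Flatten all text in the document, whatever its element structure is.
        text = " ".join(root.itertext()).lower()
        if term in text:
            hits.append(uri)
    return sorted(hits)


if __name__ == "__main__":
    print(search(DOCS, "solid acid"))  # matches both document types
    print(search(DOCS, "zeolite"))     # matches only the book chapter
```

The search never needs to know whether it is reading a chapter or an article, which is the property the document model is being credited with here. A production system would index documents rather than scan them on every query.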
MarkLogic offers many key benefits, including the ability to store content as XML documents. The database also enables logical associations between different types of content. Each image, video, and article is automatically tagged, allowing users to find, understand, and process the information they need. As shown in the image below, searching RSC publications is a quick, intuitive process using a standard web browser.
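The idea of tagging content so that different media types can be associated with one another can be sketched as follows. This is an illustrative Python example only, not a description of how MarkLogic or the RSC implements tagging; the item URIs and tag names are hypothetical.

```python
from collections import defaultdict

# Hypothetical content items, each with a set of tags (all names invented).
ITEMS = {
    "/images/benzene.png": {"benzene", "aromatic"},
    "/videos/nitration.mp4": {"benzene", "nitration"},
    "/articles/aromatic-rings.xml": {"aromatic", "benzene"},
    "/articles/titration.xml": {"titration"},
}


def build_tag_index(items):
    """Map each tag to the set of item URIs that carry it."""
    index = defaultdict(set)
    for uri, tags in items.items():
        for tag in tags:
            index[tag].add(uri)
    return index


def related(items, index, uri):
    """Return sorted URIs that share at least one tag with the given item."""
    found = set()
    for tag in items[uri]:
        found |= index[tag]
    found.discard(uri)  # an item is not related to itself
    return sorted(found)


if __name__ == "__main__":
    index = build_tag_index(ITEMS)
    print(related(ITEMS, index, "/images/benzene.png"))
```

Because the association is made through shared tags rather than through the content type, an image can lead a reader to a video or an article without any of the three sharing a common structure.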
The new MarkLogic platform will be a significant benefit as the RSC brings The Merck Index into its collection. “We’re eagerly looking forward to developing The Merck Index for the digital future,” says Dr. James Milne, RSC Publishing Executive Director. The schema-less MarkLogic database will help to ensure the continued growth of the publication’s online format.
Sharing the Knowledge
With the greater data accessibility afforded by the new MarkLogic database, the RSC’s publishing division has become much more productive, publishing more than 20,000 articles in 2011. “We can now publish three times as many journals and four times as many articles as we did in 2006, and get them to market faster,” says Leeming. “And we have the ability to build new educational programs to spread chemistry knowledge among more people.”
In addition, since implementing the integrated MarkLogic database, the RSC has seen a 30 percent increase in article views, a 70 percent traffic boost on its educational websites, and a spike in research activity in India, China, and Brazil.
Although the integrated data repository has been the biggest game-changer, the MarkLogic technology has enabled other opportunities. Leveraging MarkLogic’s Enterprise NoSQL database, the RSC has launched many new research journals, mobile applications, social media forums, and applications for children.
Dr. Robert Parker, RSC Chief Executive, sums up the major role MarkLogic has played in this successful transition. “Using MarkLogic’s big data platform has allowed us to open up the world of chemistry to a much wider audience, whilst increasing the volume and quality of the research that we publish.”
The RSC continues to innovate in the way it delivers content—and to rely on MarkLogic. “We need to move into an online community and provide the information in the way that end-users want to access that information,” Leeming says. “MarkLogic is key to providing content to our subscribers through mobile devices, Web browsers—whatever technology they choose.”