
Big Data On A Crowded Train

A crowded train is usually a bit overwhelming, especially when a guy drags his overcoat across your sandwich. However, I was recently riding a packed Acela from a customer’s site in Boston and had a good chat with my seatmate, who worked at a non-profit organization. Her job was to guide first-generation college students through an unfamiliar landscape of academics and financial aid. “It’s hard for some of the families,” she said, “who might be supportive of their kid’s aspirations, but who don’t have the experience with the system.” I’ve got two kids in school myself, and though my family has sent a couple of generations through college, I still find it tough to help my kids keep up with their coursework and deal with all the bureaucracy. “Two kids at once!” people sympathize. Although maybe they’re just talking about the tuition.

“What’s your caseload?” I asked my seatmate, figuring it must be five or six. Her answer shocked me and revealed something fundamental about the role of big data in education.

Her caseload is 40 kids. She helps 40 kids navigate through college. Suddenly my travails with two did not seem so difficult.

“Yes,” she said, “and I have a dozen colleagues each with 40 kids too, across hundreds of colleges.”

“How do you keep track of them all?” I said.

“I don’t know,” she said. “We put them in a spreadsheet.”

“Ah,” I said. “The old pivot table.”

“I guess,” she said.

In a recent post I talked about building a lightweight Learning Management System using the MarkLogic architecture you’ve deployed to support your other business objectives. A suite of apps provides the storage, search and assembly you need for a viable learning experience. When I wrote that post, I focused on the educational content, the facts you might extract from the articles, books, or other items you already publish.

But what I realized on this train is that big data in education is not only about educational material — narratives, tables, equations, and figures. Big data in education is also about students. Maybe even primarily about students — lots and lots of them. What if you, as an educator, could find groups of similar students among thousands or millions, use information about them to assemble compelling curricula, and discover learning trends in a single student’s history so you can suggest her next step?

There is a new approach to tracking testing and course progress called TinCan. TinCan is described as an “experience-based API” because it defines how a learning system can store a student’s lifelong learning activities. It expects to receive statements like “Frank completed exercise 5 on chapter three of The Essentials of Interaction Design.” For many people who’ve been involved with the Semantic Web, this statement is familiar as a subject-predicate-object pattern. It can therefore be expressed as a triple and stored in MarkLogic 7’s new triples index. So let’s do that. Let’s build an implementation of TinCan on MarkLogic. While we’re at it, we’ll use MarkLogic’s Java or REST API to store that experience triple. In fact, we’ll store all the experiences of all the students my train companion’s organization helps. Now she can use a visualization tool like d3 to see that many of the experiences in their TinCan streams are related. Some of them perhaps form a cluster around the concept of interaction design. Further, she can see that interaction design is related to user interfaces, and that a scholarship competition for user interfaces closes in sixty days. Since we’re me, we’ve implemented this on MarkLogic, and MarkLogic’s capabilities have helped my seatmate provide better service to her students.
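To make the subject-predicate-object idea concrete, here is a minimal Python sketch of how a statement like the one about Frank might be turned into a triple and written out as an N-Triples line. The `http://example.org/` namespace, the `Triple` type, and the helper names are all assumptions made for illustration; none of them come from TinCan or from MarkLogic, and the sketch deliberately stops short of loading anything into a database.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set

# Hypothetical namespace, chosen only for this illustration.
BASE = "http://example.org/"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def _iri(kind: str, label: str) -> str:
    """Build a simple IRI from a human-readable label (illustrative scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return f"{BASE}{kind}/{slug}"


def statement_to_triple(actor: str, verb: str, activity: str) -> Triple:
    """Map an actor/verb/activity statement onto subject/predicate/object."""
    return Triple(_iri("student", actor), _iri("verb", verb), _iri("activity", activity))


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


def students_by_activity(triples: Iterable[Triple]) -> Dict[str, Set[str]]:
    """Group students by the activity they touched, a crude stand-in for clustering."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for t in triples:
        groups[t.obj].add(t.subject)
    return dict(groups)


if __name__ == "__main__":
    frank = statement_to_triple("Frank", "completed", "Exercise 5")
    print(to_ntriples(frank))
```

Grouping by the object of each triple, as `students_by_activity` does, is only the simplest possible way to spot students with shared experiences; the clusters around a concept like interaction design would need richer relationships between activities than this sketch models.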

What do you need to know about the people you serve? Can you query for it now, on your existing system? Can you improve your students’ day by storing their learning experiences in MarkLogic?

Frank first joined MarkLogic in 2006 after a ten-year career as a Computer Scientist at Adobe Systems, building collaboration, XML, and data-driven features for Creative Suite. At MarkLogic he was a Senior Principal Consultant, working for customers like Pearson, HMH, Publishers Press, McGraw-Hill and Congressional Quarterly. He left MarkLogic to serve as CTO at Spectrum Chemical & Laboratory Products, where he led an Oracle EBS migration and an e-commerce website re-architecture that used MarkLogic for content marketing. After Spectrum, he was Executive Director of Technology and UX at Kaplan Publishing, where he built a mobile content delivery platform for 200,000 students. In 2011, he rejoined MarkLogic and took a Solutions Director role, where he enjoys a mix of development, architecture, and sales projects. He tweets at @xmlnovelist.
