High-profile hacks and cyber espionage by nation states (or those working at their behest) have been in the spotlight of late. But a far more insidious computer threat lurks much closer to home: criminals perpetrating fraud. According to a 2011 report by public accounting firm UHY Advisors, fraud is a whopping $1 trillion dilemma in the US alone – and that number could be considerably higher. It is hard to put an exact number on electronic fraud because 1) companies are loath to report that they have been had, and 2) there is no clean definition of it. Criminals, like virtually all humans, have learned that computers make most endeavors more efficient, so electronics are now part of a criminal ecosystem that includes social engineering in addition to a computer.
Credit card companies like Visa measure fraud in terms of identity theft and stolen cards: the cardholder is NOT the person making the purchase, and thus the transaction is fraudulent. In Medicaid, unscrupulous medical providers, sometimes in collaboration with compromised beneficiaries, are billing for services NOT rendered – or for patients who don’t exist. In both of these situations, the key is not in finding the computer network that abets the fraud, but in finding the social network that masterminds it. And the key to that is in Big Data.
With the ubiquity of computers, it is hard for humans to avoid leaving electronic footprints. However, the complexity of human behavior means the hard part is tracking the right footprints to find intersections and correlations. This is no easy task for risk managers because integrating hundreds of datasets into relational models that could reveal fraudulent patterns has been expensive and time-consuming. According to The Wall Street Journal, it took Visa years to expand its analytic engine from 40 aspects in 2005 to the 500 it now handles. That expansion has significantly cut the risk of fraud, but it took years to get there. A pilot between MarkLogic and The Centers for Medicare & Medicaid Services (CMS) suggests there is a much better way, and that’s good news for taxpayers.
Fraud in healthcare is massive. The FBI estimates that CMS was ripped off $80 billion in just 2012. CMS was able to recover a paltry $4.2 billion – and it cost $1.2 billion to do that. Fraud discovery has been nearly impossible because of all the handcuffs on the process, imposed by both databases and regulations. Claims, Beneficiaries and Provider info all sit in different relational databases – each set up and maintained by individual states and territories. Medical billing is beyond complex. You can have 1000-line-item claims that get broken up into hundreds of tables in a relational database, which then need to be joined. Congress mandates that bills be paid within 30 days – and so it becomes a “pay and chase” routine. This traditional combination of high gain and low risk of being discovered has empowered criminals – but all that (finally) seems set to change.
Recently CMS teamed up with MarkLogic to find new, more efficient means to detect fraud. MarkLogic ingested over 600M records comprising multiple heterogeneous claims types and related data for a two-year period. Because MarkLogic does not require data to conform to a single schema and uses a hierarchical tree structure to describe relationships, the team was able to quickly load the various claims types and easily add data from external datasets, such as Dun & Bradstreet Reports, National Provider Identifiers database, Facebook, multiple claims, extracts claims, DEA schedules, list of excluded individuals and entities (LEIE), diagnostic codes, indictment documents, SAS fraud algorithm reports and state policy documents. All of this was incorporated into rich hierarchical Provider profiles that presented a picture of many aspects of a provider's behavior and history. An interesting picture began to emerge.
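To make the idea of a schema-free, hierarchical Provider profile a little more concrete, here is a minimal sketch in plain Python. It is not MarkLogic code and does not use any MarkLogic API; the field names, values and the `find_values` helper are all hypothetical and exist only for illustration. The point is that claims of different shapes can sit side by side in one tree and still be queried without first being forced into a common table layout.

```python
def find_values(node, key):
    """Yield every value stored under `key` anywhere in a nested dict/list tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep walking: the same key may appear deeper in the tree.
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


# Hypothetical profile: every field name and value here is made up.
profile = {
    "npi": "0000000000",
    "business": {"owner": "Owner A", "address": "100 Main St, Suite 200"},
    "claims": [
        # An inpatient claim with nested line items...
        {
            "type": "inpatient",
            "amount": 1200.0,
            "lines": [
                {"code": "X1", "amount": 700.0},
                {"code": "X2", "amount": 500.0},
            ],
        },
        # ...next to a pharmacy claim with a completely different shape.
        {"type": "pharmacy", "amount": 80.0, "drug": {"schedule": "II"}},
    ],
    "exclusions": [],
}

claim_types = list(find_values(profile, "type"))
amounts = list(find_values(profile, "amount"))
```

Adding a new external dataset in this sketch is just a matter of attaching another branch to the profile; nothing already stored has to change shape.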
As a result of the database and the application we built, we found a number of aberrant behaviors. As my colleague Mike Doane likes to emphasize, MarkLogic developers had “no fraud experts on the team.” Bad “stuff just jumped out,” he said in a presentation on Innovative Fraud Detection at CMS given at MarkLogic World. “We found doctors with very high billings per year, a high percentage of high-cost procedures, too many people operating in one geospatial area, providers who showed up on exclusion lists (indictment documents).
“By using D&B data, they could see one owner of several different provider companies – all located at the same address, right down to the suite. Further, those provider companies had a high concentration of shared beneficiaries between those entities. Those shared beneficiaries may be innocent — or part of the network of thieves,” he explained.
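The pattern Doane describes (one owner, several provider companies at one address, and many beneficiaries in common) can be sketched in a few lines of Python. This is an illustration only, not the application built for the pilot: the records, the field names, the use of Jaccard overlap as the "shared beneficiaries" measure and the 0.5 threshold are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical provider records; all names and IDs are made up.
providers = [
    {"id": "P1", "owner": "Owner A", "address": "100 Main St, Suite 200",
     "beneficiaries": {"B1", "B2", "B3", "B4"}},
    {"id": "P2", "owner": "Owner A", "address": "100 Main St, Suite 200",
     "beneficiaries": {"B2", "B3", "B4", "B5"}},
    {"id": "P3", "owner": "Owner B", "address": "55 Oak Ave",
     "beneficiaries": {"B6", "B7"}},
]


def shared_owner_clusters(records):
    """Group providers that share both an owner and an address."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["owner"].strip().lower(), rec["address"].strip().lower())
        groups[key].append(rec)
    # Only groups with more than one provider are interesting here.
    return [group for group in groups.values() if len(group) > 1]


def beneficiary_overlap(a, b):
    """Jaccard overlap of two providers' beneficiary sets (0.0 to 1.0)."""
    union = a["beneficiaries"] | b["beneficiaries"]
    if not union:
        return 0.0
    return len(a["beneficiaries"] & b["beneficiaries"]) / len(union)


def flag_pairs(records, threshold=0.5):
    """Return (id, id, overlap) for co-owned, co-located providers
    whose beneficiary overlap meets the (assumed) threshold."""
    flagged = []
    for cluster in shared_owner_clusters(records):
        for a, b in combinations(cluster, 2):
            score = beneficiary_overlap(a, b)
            if score >= threshold:
                flagged.append((a["id"], b["id"], score))
    return flagged


flagged = flag_pairs(providers)
```

A flagged pair is a lead for an investigator, not a verdict. As the quote above notes, the shared beneficiaries may be entirely innocent.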
With a successful pilot behind them, CMS is set to roll out a more comprehensive fraud discovery system. And Doane concluded, “while there’s lots of room for improvement to reduce and recover fraud – there is even more money CMS can save in reducing waste.” Stay tuned.