In data we trust, or do we?

The world of BI and analytics is abuzz with lots of action today.

Speed is the mantra. Analyze data in real-time, in memory. Disk reads too slow; got to shave off those pesky milliseconds.

Everyone wants to have a go at it. It used to be that when you needed something, you had to reach out to IT. Now the message is: do it yourself (you still need IT, though!). Which is where data discovery tools come in: a picture paints a thousand words, after all. In this case, it is probably thousands of data points and more.

Clearly the tools have gotten better and better. We have been pushing technology pretty much to its limits. Hardware acceleration through Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs) is perhaps right at the pinnacle of the “faster and faster” movement.

What about the data itself?

After all, what good is lots of intense analytics based on suspect data? That brings up the subject of data quality. However, data quality appears to be harder to get your arms around. Deploying cool technology sounds easier.

A recent statistic points to how common the gaps in data quality are: “44 percent of companies don’t have a formal data governance policy, and 22% of firms without a data policy have no plans to implement one.” And we are talking about data in general.

But turn your attention to big data, which is presumably where a lot of the technology and speed really come in handy. Especially external big data: social media data, sensor data, and so on. Clearly not a lot of control can be had over it.

Implementing data governance on internal data stores is challenging enough. Come to big data and it likely becomes a much tougher nut to crack. Yet implementing solid data governance processes along with data quality filters and MDM in the context of big data looks to be the way to go.

A key benefit appears to be trust. How do you trust what your big data analytics is telling you? Can you go ahead and take action based on the insight being delivered? Knowing that the technology is backed by meaningful processes certainly should help.

Deploying any form of technology without robust, well-thought-out processes seems akin to driving a car in a place where the rules of the road are not clearly laid out or enforced. We all know how that will turn out.

“In data we trust, but we verify (and verify)” is surely a useful strategy to hang on to.

