By Arun Dendukuri
I recently saw a television show about mining gold. Miners used huge bulldozers and excavators to scoop up hundreds of tons of dirt and rock. They poured it through a series of machines that bumped and rumbled as they discarded the bigger, lighter debris and allowed the finer, heavier gold to pass through to be collected. In the end, they extracted a few ounces of gold from a mountain of rubble.
Mining useful information from a mountain of data is a similar problem. The sheer volume of data available from sources like the Internet, social media, business transactions, search, and call data is staggering. Businesses want the gold that’s there (consumer trends, customer sentiment, customer behavior, geolocation, and sensor data), but they must sift through mountains of information rubble to find it. Most of the data is unstructured human information: text, emails, images, and videos that lack the well-defined, meaningful fields used in relational databases. Traditional storage systems and databases quickly break down under that volume and variety.
I’m doing a demo at HP Discover 2012 that shows how our HP Technical Services storage consulting team has combined business intelligence techniques with storage hardware and analytic software to help our customers find the gold in “big data.”
The first challenge is simply volume. Big data applications can consume terabytes or even petabytes of storage, so we need new techniques. We’re using scale-out NAS to meet the storage volume needs. You must also be able to query that data quickly, so we’ve applied the HP Vertica Analytics Platform to reduce query times on structured data from hours to minutes or seconds. But it’s about more than just volume. Most of the data held by enterprises is unstructured, and unstructured data is growing at three times the rate of structured data. So we rely on Hadoop and Autonomy pattern-matching technology to extract meaning from documents, emails, social media communications, and even images, audio, and video, giving you faster access to relevant information. That helps you make timely, well-informed decisions, and that’s where the gold is.
Success with big data depends on having a proactive approach rather than constantly trying to catch up.
Each organization must develop its own strategy and roadmap. To help you do that, we’ve put together a three-day Big Data Strategy Workshop. Over six sessions, we present the opportunities, challenges, technologies, and approaches available. It’s a great way to get started finding the gold and monetizing big data. I’ll explain more about it in my HP Discover 2012 demo. I hope to see you there.
Read more about big data analytics from HP.
Read about the latest HP Converged Infrastructure announcements, including the news about big data.
Follow the latest Discover 2012 blogs at the HP blog hub feed.
Arun Dendukuri is an accomplished IT management professional with more than 20 years’ experience. He has been involved in analyzing, designing, consulting, benchmarking, and deploying infrastructure-related custom solutions. Arun has performed numerous high-availability implementations of Linux, Oracle, and Microsoft® clusters. He has been a trusted advisor to numerous organizations, and has provided assessment, evaluation, and optimization services. He has benchmarked iSCSI and FC storage for optimal database performance and has developed best practices for building such solutions. He is very familiar with HP ProLiant and Blade technology and with scale-out NAS, an integral component of a storage cloud.