Journey through Enterprise IT Services
In Journey through Enterprise IT Services, Nadhan, HP Distinguished Technologist, explores the IT Services industry, and discusses technology trends in simplified terms.

Using the right Big Data tool helps win presidential elections—Vertica Analytics Platform

Big Data analytics played a significant role in Barack Obama's presidential campaign, as outlined in this featured story in the MIT Technology Review by Sasha Issenberg. To understand the voter sentiment that drove its strategies across voter segments, the campaign needed to turn its 180-million-person voter file, along with the data about volunteers, donors and online constituents, into usable information. And it needed to do this fast, with two very simple objectives: get 2008 Obama voters to do it again, and register and mobilize new voters. To do this, the campaign needed a robust analytics platform.

 

The Obama campaign's strategy was to build a holistic view of each constituent based on their data, voting pattern and campaign interactions, and to make this information available to the campaign as a whole. As in many other political campaign infrastructures, knowledge about constituents and campaign interactions was spread across disparate databases. The campaign needed to analyze this large volume of data and extract an integrated perspective from it. Enter HP Vertica Analytics Platform.


Here are the top 5 defining characteristics of Vertica, along with an analysis of its enabling capabilities in this context:

 

1. Storage: Analytical processing involves large volumes of data. Aggressive encoding and compression of the data stored allows for high-volume storage and retrieval of data in a timely manner, enabling more views. Vertica supports 13 different types of encoding with compression ratios of up to 60 to 1. Several operations can be performed on the native encoded data. Deferred decoding of the data ensures its timely materialization.

 

2. Query: The platform must enable high-speed queries on the data that matters. Traditional row-oriented relational databases tend to retrieve all the columns in a row, pertinent or otherwise, when the purpose behind the query should determine which columns are retrieved for analysis. Vertica supports columnar storage, which lets a query read only the columns it needs.

 

3. Processing: Massively Parallel Processing (MPP) is critical to leveraging data projections that enable distributed storage and workload. MPP enables real-time analytics on large volumes of data. Vertica’s asynchronous tuple mover process enables concurrent load/query with very low data latency (seconds) and full context (years of detailed history).

 

4. Scalability: In this environment, the only constant is the continuous growth of data to be analyzed. The ability to add more resources on the fly through clustering is essential. Vertica's grid-based architecture provides linear scalability on clusters of commodity servers, allowing the choice of the appropriate performance curve.

 

5. Availability: The need to have the right information at the right time makes the availability of the analytics platform a vital requirement. Vertica's grid-based architecture has built-in, native high availability. The projections are organized so that if a node fails, a copy is available on one of the surviving nodes.
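
To make the storage and query points above a little more concrete, here is a minimal Python sketch. It is not Vertica's implementation: the voter rows, field names and helper functions are all made up for illustration, and run-length encoding is used as a simple stand-in for one kind of column encoding. It shows three ideas: pivoting rows into columns, encoding a single column, and answering a count query directly on the encoded runs without decoding them.

```python
from collections import Counter
from itertools import groupby

# Hypothetical voter rows; field names and values are invented for illustration.
rows = [
    {"voter_id": 1, "state": "OH", "age": 34, "voted_2008": True},
    {"voter_id": 2, "state": "OH", "age": 51, "voted_2008": False},
    {"voter_id": 3, "state": "OH", "age": 29, "voted_2008": True},
    {"voter_id": 4, "state": "FL", "age": 62, "voted_2008": True},
    {"voter_id": 5, "state": "FL", "age": 45, "voted_2008": True},
]


def to_columns(records):
    """Pivot row-oriented records into one list of values per column."""
    return {name: [record[name] for record in records] for name in records[0]}


def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def count_by_value(runs):
    """Count occurrences per value directly on the encoded runs."""
    counts = Counter()
    for value, length in runs:
        counts[value] += length
    return dict(counts)


columns = to_columns(rows)

# A "voters per state" query only needs the 'state' column;
# the other columns are never read.
state_runs = rle_encode(columns["state"])
voters_per_state = count_by_value(state_runs)

print(state_runs)        # [('OH', 3), ('FL', 2)]
print(voters_per_state)  # {'OH': 3, 'FL': 2}
```

Run-length encoding pays off most when a column is sorted or has long runs of repeated values, which is why the sample rows are grouped by state. A real columnar engine would choose among several encodings per column and distribute the work across nodes.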

 

Vertica is architected from the ground up for complex, large-scale, real-time analytics. In our world of petabytes (headed toward brontobyte land), we are guaranteed to have unimaginable volumes of data and an ever-increasing need to analyze it for valuable intelligence. Without such analysis, all we will have is Big Data without information and, therefore, no ROI (here, Return on Information).

 

To win a presidential campaign in a country the size of the United States, you need the right tools. And these tools must enable the effective execution of an underlying strategy. I have no intention of running for political office. But if I do, I know which analytics platform I will use to better comprehend my voter base.

 

What say you? How are you analyzing the data that your enterprise has access to? What do you think are other defining characteristics for a viable analytics platform? Please let me know.

 

Connect with Nadhan on: Twitter, Facebook, LinkedIn and Journey Blog.

 

References:

 

HP Vertica Analytics Platform

HP Information Management & Analytics Solutions

Comments
Yogi Sikri(anon) | ‎01-08-2013 07:51 PM

very good article.

Yogi Sikri(anon) | ‎01-08-2013 08:05 PM

Was there unstructured data involved as well? Vertica can work great with unstructured data. Does it also support capabilities for unstructured data, including text, video, blobs, audio, etc., and make sense of that data? A lot of FB data is very unstructured, for example.

Nadhan | ‎01-08-2013 08:53 PM

Thank you, Yogi. The MIT Technology Review report does not expand on the extent to which unstructured data was involved. Please feel free to give it a closer read. We are in a world of exploding volumes of brontobytes of big data, both structured and unstructured. The most effective way of gleaning meaningful Return on Information from all this data is to use an integrated set of tools, including those that turn the tables on computers.

 

