Rethink BI: Business Insights over Business Intelligence
The purpose of this business insights thought leadership blog is to share the HP point of view on industry trends such as Big Data and Real Time Analytics, and provide updates on key innovations and solutions.

Big Data architecture in the New Style of IT—Part 4

In this fourth part of our series, we ask Greg Battas, CTO of Business Intelligence Solutions for HP Converged Systems, to help us understand the trend toward Software Defined Storage, along with some common misconceptions about it. In the second part of this series, Greg reminded us that today's application-centric data centers have grown up around the processing of single jobs, leaving a succession of compute nodes each dedicated to a single data cluster or application. His sense is that the demands of Big Data require a more converged system approach.

 

What about the trend toward Software Defined Storage?

GB: Closely tied to the common misconception that it is always better to take the processing to the data is the emergence of Software Defined Storage (SDS). Big Data began with the idea of moving away from traditional storage architectures and instead leveraging industry-standard servers running a Distributed File System (DFS) as a kind of software-defined storage array. In fact, if you look inside many modern industry-standard storage arrays, you find what looks like a cluster of nodes running an advanced DFS. What we're seeing in Big Data is more and more companies choosing industry-standard servers running parallel file systems or NoSQL products over traditional storage arrays or databases. Unlike a purpose-built storage array, a DFS cluster gives you options for co-locating work with data, so it offers the best of both worlds: it is not a traditional storage array, and because the nodes have general-purpose compute, you can actually push a lot of work toward the data.
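To make the co-location idea concrete, here is a minimal sketch of the classic Hadoop word-count job in Java. The framework ships the map logic to the nodes where the HDFS blocks live, so the heavy scanning happens next to the data and only small aggregated results cross the network. The paths are placeholders for illustration, not details from this post.

  import java.io.IOException;
  import java.util.StringTokenizer;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class WordCount {

    // Map task: runs on the node holding each HDFS block, so the full
    // scan of the raw data never leaves that node.
    public static class TokenizerMapper
        extends Mapper<Object, Text, Text, IntWritable> {
      private static final IntWritable ONE = new IntWritable(1);
      private final Text word = new Text();

      @Override
      public void map(Object key, Text value, Context context)
          throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
          word.set(itr.nextToken());
          context.write(word, ONE);
        }
      }
    }

    // Reduce task: receives only the compact (word, count) pairs.
    public static class IntSumReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      public void reduce(Text key, Iterable<IntWritable> values,
          Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
          sum += val.get();
        }
        context.write(key, new IntWritable(sum));
      }
    }

    public static void main(String[] args) throws Exception {
      Job job = Job.getInstance(new Configuration(), "word count");
      job.setJarByClass(WordCount.class);
      job.setMapperClass(TokenizerMapper.class);
      job.setCombinerClass(IntSumReducer.class);
      job.setReducerClass(IntSumReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      // Placeholder paths for data already stored in the DFS.
      FileInputFormat.addInputPath(job, new Path("hdfs:///data/input"));
      FileOutputFormat.setOutputPath(job, new Path("hdfs:///data/output"));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }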

 

Clearly, a distributed file system on industry-standard servers is not a replacement for storage arrays in traditional enterprise applications, but it excels in areas like price per TB and scalability. For these reasons we commonly see these systems in emerging areas, such as Big Data with HDFS, and in object stores used for workloads like archival. Both of these uses value scale and cost per TB more than raw performance.

 

Some of these projects are becoming more valued for their interfaces than their implementations. HDFS is becoming the de facto interface for Big Data, with several vendors offering differentiated file systems that present themselves as HDFS. In the object store world, OpenStack Swift and Amazon S3 are becoming the default interfaces for many products. This opens the door for companies such as Cleversafe and others to build products that serve specific needs without having to create a new standard.
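As a rough illustration of why the interface matters more than the implementation, the short Java sketch below reads a file through Hadoop's generic FileSystem API. The client code never names a particular storage product; it resolves whatever implementation is registered for the URI scheme, so a vendor file system that presents itself as HDFS slots in without code changes. The host and path are hypothetical.

  import java.io.BufferedReader;
  import java.io.InputStreamReader;
  import java.net.URI;
  import java.nio.charset.StandardCharsets;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class InterfaceDemo {
    public static void main(String[] args) throws Exception {
      // The same client code runs against any implementation registered
      // under a URI scheme: a stock HDFS cluster, the local filesystem,
      // or a vendor DFS presenting itself as HDFS.
      String uri = args.length > 0
          ? args[0]
          : "hdfs://namenode:8020/data/events.log";  // placeholder

      FileSystem fs = FileSystem.get(URI.create(uri), new Configuration());
      try (FSDataInputStream in = fs.open(new Path(uri));
           BufferedReader reader = new BufferedReader(
               new InputStreamReader(in, StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          System.out.println(line);
        }
      }
    }
  }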

 

So there's an explosion of interest in how these distributed file systems are going to be deployed, and Big Data was the "canary in the coal mine." Ultimately, using a DFS on industry-standard servers opens the door to a different enterprise data architecture.

 

At HP, we have our own version of this with our HAVEn products. Here we use the Hadoop Distributed File System as an underlying substrate to store data that can be analyzed and understood with Vertica and processed with HP Autonomy. With the wealth of connectivity options in our software portfolio, we can uniquely connect HDFS and Hadoop to the enterprise and keep it fed with useful data. And that's exciting!

 

To learn more about HAVEn and how you can profit from Big Data, check out the HP HAVEn webpage for useful tools, videos and presentations.

 

 

About Greg Battas

Greg's background in solving business problems for customers, particularly in the retail, telecommunications and financial services sectors, and in product development for relational database management systems has played a critical role in bridging the gap between IT and business decision-makers and in explaining how technology can solve challenging organizational issues.
