Around the Storage Block Blog
Find out about all things data storage from Around the Storage Block at HP Communities.

Consolidating file and object with hyperscalable HP StoreAll

By Calvin Zito, @HPStorageGuy

I don’t like having a day with multiple blog posts but I’m compelled to give you more of the news from HP Discover today.  In my previous blog posts, I first gave you an overview of our Converged Storage announcement and in the second I focused on our HP 3PAR StoreServ announcement.  In my final blog post for today (promise, this is it for today) I want to highlight the HP StoreAll Storage platform.  There are three things I want to call out and then as in my other blog posts today, I have a Countdown and ChalkTalk video to share.


Hyperscale is a great word to describe HP StoreAll, and when you think about how you can manage hundreds of millions of files and objects, it’s exactly what you need.  I’m not talking about a mid-range NAS that can provide file access to a couple hundred clients – I’m talking about scaling to over 1.6PB and over a thousand nodes.  I’m almost embarrassed to mention the data explosion (I don’t know about you but I’m tired of hearing about it) – but it really is what is driving a solution like StoreAll.

StoreAll supports a Representational State Transfer (REST) API, which enables sync/share application deployments at massive scale and lets you get value out of all that unstructured data.  And it is a converged file and object platform.


I feel the need for speed
If you can’t find the information you need then you’re just holding on to a digital landfill.  HP StoreAll Express Query, developed by HP Labs, lets you extract value from your data almost instantly.  I heard a great story from our StoreAll product manager – the team was pulling together a benchmark to test how fast it would be.  They were loading up the system with 500,000,000 (half a billion) files and objects, thinking this test would take a bit of time.  When they ran the query, it finished in 1.434 seconds!  The team wanted to load StoreAll with a billion files but then stopped to consider how long it would take to run the same query with a traditional file system scan.  That scan took over 42 hours - 151,278.706 seconds to be exact!  The team wanted to compare a billion files thinking the results would likely be even better but ran out of time.  Maybe they’ll get to that.
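To put those two numbers side by side, here is a quick back-of-the-envelope calculation in Python. It is only arithmetic on the two timings quoted above (1.434 seconds and 151,278.706 seconds); the variable and function names are my own and nothing here touches a StoreAll system.

```python
# Timings quoted in the benchmark story above, in seconds.
express_query_seconds = 1.434
filesystem_scan_seconds = 151_278.706


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times faster the fast run is than the slow run."""
    return slow_seconds / fast_seconds


# Convert the scan time to hours to check the "over 42 hours" figure.
scan_hours = filesystem_scan_seconds / 3600
ratio = speedup(filesystem_scan_seconds, express_query_seconds)

print(f"File system scan: {scan_hours:.2f} hours")
print(f"Express Query speedup: roughly {ratio:,.0f}x")
```

On these figures the scan works out to about 42.02 hours, and the query comes out roughly 105,000 times faster than the scan.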

There’s also integration with HP Autonomy IDOL.  Here’s what the fact sheet says about this: In addition, HP StoreAll Express Query IDOL connector integrates with HP Autonomy Intelligent Data Operating Layer (IDOL) to streamline the processing of dynamic content across large data sets. HP StoreAll Express Query's accelerated file namespace scan delivers inline updates to IDOL-based applications, making it possible to rapidly process new data changes and deliver up to date analytics so that decisions are based on real-time vs. outdated information, while using considerably less compute resources than conventional storage technologies. HP StoreAll Storage also is integrated with HP Autonomy Consolidated Archive, a meaning-based archiving solution powered by HP Autonomy IDOL. This enables it to deliver advanced analytics resulting in superior information searches and significant cost savings by identifying business value within the data, enabling more appropriate and accurate retention schemes.


Over the next few days, I hope to get deeper on HP StoreAll Express Query as we have HP Labs folks here in Frankfurt.  Hopefully we’ll get time to do a podcast! 


Scale-out economics
There are a few things about the StoreAll platform that make it an economical platform.


  • policy-based tiering
  • converged file and object
  • scale-out pay as you grow architecture

The result is that HP StoreAll Storage improves operational flexibility and reduces storage costs. 

Here are the videos.   First, here is the StoreAll Countdown video:



Next, huddle up because here's my StoreAll ChalkTalk:



(January 2 update)

I had intended to do another blog post pointing to this architecture overview ChalkTalk I did but haven't gotten around to it yet so I'll add it here. 



Phew - HP Discover hasn't officially started yet - that will happen in a few hours, so it's time to get a few hours of sleep before the event kicks off.  Lots more to come - stay tuned!

nate | ‎12-06-2012 01:09 AM

Can you do me a favor  - during your Autonomy tests can you :

  1. Show examples as to what sort of data is being queried
  2. Explain why this data would be in Autonomy (vs HP Vertica)

In the speech that I saw, the Autonomy guy (forgot his name, I'm terrible with names) said they used some sort of social media data set, and said it was really fast. What was in that data set specifically? From the sounds of it, it seems like it could easily have been a Vertica data set.


Autonomy is apparently for unstructured data and Vertica is for structured data. Vertica doesn't depend on any fancy storage integration; it is just warp speed all on its own. (If you haven't looked into it I strongly urge you to do so. It really is a product that folks can get excited about, especially when you combine it with something like Tableau software.)


I'd just like to see Autonomy do something special on unstructured data that couldn't be accomplished just as well by structuring the data and putting it in Vertica (which I imagine is significantly faster than Autonomy, in part because of the benefits of structuring the data).







| ‎12-07-2012 04:49 AM

Hey Nate - I'm no expert here but I think Vertica is mainly focused on structured data and is really good at making sense of that data.  Obviously with StoreAll and the links into Autonomy, we're talking about unstructured data.


I think you've asked some great questions here and I'll work with my StoreAll and StoreAll Express Query experts to do something to address it. 

patrick.osborne ‎12-10-2012 09:22 PM - edited ‎12-10-2012 09:40 PM



Our primary goal for V1 of StoreAll and Express Query was to help customers provide some "structure to their unstructured data". The StoreAll platform is extremely scalable and useful for storing unstructured file-based data, and we use Express Query to search the metadata for SRM functionality as well as to tag the data with custom tags in the EQ database.


One of the use cases we decided to invest in was with Autonomy IDOL. IDOL is an enterprise search and index platform that allows customers to index more than 1,000 file types for the purpose of meaning-based analytics based on pattern matching algorithms. In this case, you can store petabytes of unstructured content (video, audio, docs, news feeds, tweets, etc.) and derive sentiment and actionable information from lots of enterprise unstructured data. For example, finding and tagging things that look like a company's Intellectual Property in a broad array of file content.


As for Vertica, we are working with them as well. In fact, we are both located in Massachusetts. Vertica is a columnar database that focuses on analytics for semi-structured data. This is a little different from storing content in a scale-out NAS or object store, but we are investigating solutions around ETL and data conditioning in the Vertica pipeline. If you have a specific need in this area, we would love to hear what you are trying to accomplish.


Thanks for the inquiry and your commitment to HP Storage.



About the Author
25+ years of experience around HP Storage. The go-to guy for news and views on all things storage.

The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.