Innovation @ HP Labs
Insights on research, innovation, and emerging technology from HP Labs researchers around the world.

Enabling End-to-End Data Tracing in the Cloud with HP TrustCloud

In my last blog post in the series “Tracing Data for Provenance, Transparency and Accountability in Cloud Computing”, I discussed how data is taking centre stage in cloud computing, and how current system tools are unable to effectively log file accesses and transfers within a Cloud environment.

 

These two factors call for a data-centric, detective approach that enables data events in the cloud to be captured, recorded and analyzed. We need a solution that lets all cloud stakeholders monitor data files in the cloud and ensure that they remain where they should be.

 

In this post, I want to introduce our proposed framework for managing the information in the cloud, as well as the granularities of data logs collected.

 

TrustCloud, proposed by HP Labs Singapore in collaboration with ArcSight (an enterprise security leader HP acquired in 2010), enables all cloud stakeholders to trace their data in and out of the cloud. It adopts an end-to-end, data-centric methodology grounded in a five-layer framework: systems, data, workflow, laws and regulations, and policies (see Figure 1).

 

 


Figure 1: HP Labs Singapore TrustCloud Framework

 

At the systems layer, data events (such as file create, write, delete, or transfer) are tracked at both the file level and the block level. They are recorded as data logs by kernel-space sensors planted on all virtual and physical machines in the cloud. These logs are then securely transmitted and, at the data layer, analyzed for end-to-end cloud data provenance. Workflows and audit trails linking to human users and policies are then distilled at the workflow layer and checked against the laws and regulations layer and the policies layer.
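To make the idea of a data log more concrete, here is a minimal Python sketch of what a single logged data event might look like and how a per-file history could be rebuilt from a batch of such events. This is only an illustration under my own assumptions: the `DataEvent` record, its field names, and the `provenance_by_file` helper are hypothetical and are not taken from the TrustCloud implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataEvent:
    """Hypothetical shape of one data-log record (not TrustCloud's format)."""
    timestamp: float  # seconds since the epoch
    host: str         # machine on which the sensor observed the event
    user: str         # account that triggered the event
    action: str       # "create", "write", "delete" or "transfer"
    path: str         # file the event refers to


def provenance_by_file(events):
    """Group events by file path, each group ordered by timestamp."""
    history = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        history[event.path].append(event)
    return dict(history)


# Events may arrive out of order from different machines.
events = [
    DataEvent(3.0, "vm-2", "bob", "transfer", "/data/report.csv"),
    DataEvent(1.0, "vm-1", "alice", "create", "/data/report.csv"),
    DataEvent(2.0, "vm-1", "alice", "write", "/data/report.csv"),
]

trail = provenance_by_file(events)
actions = [e.action for e in trail["/data/report.csv"]]
print(actions)
```

Sorting by timestamp before grouping is the simplest way to obtain an ordered trail for each file. A real deployment would also have to deal with clock skew between machines, which this sketch ignores.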

 

HP Labs Singapore and ArcSight are collaborating to build data-centric cloud forensics tools designed to give cloud stakeholders the ability to track their data. For example, stakeholders could identify important files to track and, if those files are violated or stolen, be alerted with the history of data violations. Another example is the ability to send text messages to cloud stakeholders when their files leave predefined boundaries (e.g. banking data leaving a country).
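The boundary example can be sketched in a few lines of Python. The sketch below assumes that each transfer event carries a source and destination region and that stakeholders supply a set of tracked files and a set of allowed regions. The names `TransferEvent`, `boundary_violations`, and `alert_message` are hypothetical, and actually delivering the text message is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEvent:
    """Hypothetical record of a file moving between regions."""
    path: str
    source_region: str
    dest_region: str


def boundary_violations(events, tracked_files, allowed_regions):
    """Return transfers of tracked files to a region outside the allowed set."""
    return [
        e for e in events
        if e.path in tracked_files and e.dest_region not in allowed_regions
    ]


def alert_message(event):
    """Build the alert text; sending it is outside the scope of this sketch."""
    return (
        f"ALERT: {event.path} moved from "
        f"{event.source_region} to {event.dest_region}"
    )


events = [
    TransferEvent("/bank/accounts.db", "SG", "SG"),  # stays in the country
    TransferEvent("/bank/accounts.db", "SG", "US"),  # leaves the boundary
    TransferEvent("/tmp/cache.bin", "SG", "US"),     # not a tracked file
]

violations = boundary_violations(events, {"/bank/accounts.db"}, {"SG"})
messages = [alert_message(v) for v in violations]
print(messages)
```

Only the second event is flagged: the first stays inside the allowed region and the third concerns a file nobody asked to track.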

 

Now the key question is: how does TrustCloud achieve data traceability in an environment as complex, virtualized, dynamic and distributed as the cloud? The answer lies in the systems layer, which I will discuss in my next post.

 

For more information on TrustCloud, please refer to the following HP Labs Technical Report: TrustCloud: A Framework for Accountability and Trust in Cloud Computing

About the Author
  • Managing Editor, Innovation @ HP Labs blog, Strategic Planning manager at HP Labs
  • Steve Simske is an HP Fellow and Director in the Printing and Content Delivery Lab in Hewlett-Packard Labs, and is the Director and Chief Technologist for the HP Labs Security Printing and Imaging program.