Despite all the hype around cloud, and despite Gartner's prediction that "by 2012 20% of businesses will get rid of all IT assets as they embrace cloud", large enterprises will not drop their ERP implementations any time soon. And guess what, their key data is embedded in those ERP systems. As David Linthicum points out in an excellent blog post titled "The data-integration buzzkill for cloud computing", moving any business-relevant functionality to the cloud requires addressing the issue of integrating the cloud-based applications with the enterprise IT systems.
Identifying integration mechanisms is critical for the adoption of the hybrid cloud model, where public and private clouds are integrated to address the variable needs of enterprises. Discussing the latest RSA conference in her blog post "Hybrid Clouds hit Data Centers", Christine Dunlap points out that security and infrastructure providers realize the importance of private cloud as a first step towards moving to public ones. Linking private and public cloud data is not only an issue of privacy and legal requirements around the location of data; it is also an operational need.
Starting from the premise that CIOs and CEOs are not ready to put all their data in the cloud, there are two possible approaches to date:
- Keeping all the data in the private cloud and developing a mechanism for the public cloud functionality to fetch the data out of the private cloud when required (single location approach)
- Duplicating some data in the public cloud and synchronizing it between the two locations (dual location approach).
Let's look at each of those in more detail, but first let's point out that most companies today will only allow a small portion of their data to be available in a public cloud. This demonstrates the need for enterprises to clearly identify the confidentiality level of each data item. It's a data classification exercise, and a prerequisite for moving to a public cloud beyond initial proofs of concept.
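To make the classification exercise concrete, here is a minimal sketch of what the outcome could look like: each data item gets a confidentiality level, and a simple policy maps that level to where the data may live. The levels, the policy and the catalog entries are illustrative assumptions, not a standard.

```python
from enum import Enum

class Confidentiality(Enum):
    PUBLIC = 1        # e.g. marketing material, list prices
    RESTRICTED = 2    # e.g. customer contact details
    CONFIDENTIAL = 3  # e.g. trade secrets, formulas

def cloud_placement(level: Confidentiality) -> str:
    """Hypothetical placement policy: only PUBLIC data may reside
    permanently in a public cloud; RESTRICTED data stays under
    enterprise control but may be made visible on request;
    CONFIDENTIAL data never leaves the private cloud."""
    if level is Confidentiality.PUBLIC:
        return "public cloud"
    if level is Confidentiality.RESTRICTED:
        return "private cloud, visible on request"
    return "private cloud only"

# A toy data catalog produced by the classification exercise
catalog = {
    "price_list": Confidentiality.PUBLIC,
    "customer_emails": Confidentiality.RESTRICTED,
    "perfume_recipe": Confidentiality.CONFIDENTIAL,
}

for item, level in catalog.items():
    print(f"{item}: {cloud_placement(level)}")
```

The point is not the code itself but the discipline: until every item in the catalog carries a level, no placement decision can be defended.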
In the single location approach, a staging server is established in the DMZ. On this server, the "public data" is stored. A mechanism is established to synchronize this data with the master copy available behind the firewall. An agent located on the staging server is in contact with the cloud-based application and sends the data across on request. Obviously, encryption, VPNs and digital certificates may be used to secure the connection. Once the data has been used by the cloud-based application it is wiped out. The advantage of this approach is that data never resides permanently in the cloud, limiting its exposure. The disadvantage is that it results in increased network traffic, which comes at a cost. Also, if extremely large amounts of data are required (for CAD applications, for example) the approach may not be practical as the latency might be too high.
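The fetch-use-wipe lifecycle described above can be sketched in a few lines. The `StagingAgent` class and its in-memory store are hypothetical stand-ins; a real deployment would put the agent behind TLS or a VPN in the DMZ, with the synchronization to the master copy happening separately behind the firewall.

```python
class StagingAgent:
    """Holds the 'public data' copy synchronized from the master
    behind the firewall (synchronization not shown here)."""

    def __init__(self, records):
        self._records = dict(records)

    def fetch(self, key):
        # Hand out a copy when the cloud application requests it.
        return self._records.get(key)

def process_in_cloud(agent, key, compute):
    """Fetch the record, use it, then discard the transient
    cloud-side copy so nothing persists in the cloud."""
    data = agent.fetch(key)
    try:
        return compute(data)
    finally:
        del data  # transient copy wiped after use

agent = StagingAgent({"order-42": {"qty": 10, "sku": "A1"}})
result = process_in_cloud(agent, "order-42", lambda d: d["qty"] * 2)
print(result)  # 20
```

The design choice is visible in the shape of the code: the cloud side only ever holds a short-lived copy, at the price of one network round trip per use.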
The dual location approach implies the permanent presence of data in the cloud and requires an application to push data to the cloud as soon as the master data has been updated. It limits communication costs, as data is only transferred when it changes, but it leaves data in the cloud, with the proliferation, privacy and other security issues we discussed in previous posts.
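A hedged sketch of that push-on-update mechanism: a change hook on the master store pushes a record to the cloud copy only when its content actually changed, so identical re-saves generate no traffic. The `push_to_cloud` function is a stand-in for a real transport (HTTPS call, message queue, etc.).

```python
import hashlib
import json

cloud_copy = {}  # stand-in for the cloud-side store

def push_to_cloud(key, record):
    # In reality: an authenticated, encrypted transfer to the cloud.
    cloud_copy[key] = record

def fingerprint(record):
    """Content hash used to detect whether a record really changed."""
    return hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()

class MasterStore:
    def __init__(self):
        self._records = {}
        self._pushed = {}  # fingerprint of the last version sent

    def update(self, key, record):
        self._records[key] = record
        fp = fingerprint(record)
        if self._pushed.get(key) != fp:  # push only on real change
            push_to_cloud(key, record)
            self._pushed[key] = fp

m = MasterStore()
m.update("part-7", {"price": 10})
m.update("part-7", {"price": 10})  # identical: no second push
m.update("part-7", {"price": 12})  # changed: pushed again
print(cloud_copy["part-7"])  # {'price': 12}
```

This is exactly the trade-off the text describes: transfer costs are minimized, but the cloud copy is permanent.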
Google released the Google Secure Data Connector, an example of how data can be transferred securely between the data center and the cloud. One element has to be clear, however: the approaches described above require custom application development. SaaS offerings typically do not provide such secure integration mechanisms.
So, classify your data first, analyze how exposed you are in case the data you want to put in the cloud becomes public, and then identify what data can reside permanently in the cloud, what data you should keep under your control but make visible to the cloud, and what data has NO place in the cloud at all. This will allow you to identify not only what business logic you can migrate to a public cloud, but also what mechanisms you should use to protect the data.
As I have already mentioned several times, cloud is in many ways a paradigm shift. Although, as mentioned in a previous post, cloud is in many aspects more evolutionary than revolutionary, maintaining data in the cloud raises many questions. The immaterial nature of the cloud makes it difficult to assess the legal nature of the services. It is interesting to realize, though, that little has been written on the subject and that compliance does not appear in challenges/issues surveys, such as the latest one from IDC (slide 11). Let me make a non-exhaustive list of those questions:
- Trans-border data flows. Two main elements have to be taken into account. First, any data passing through the US can be accessed by law enforcement agencies (US Patriot Act). Not knowing where data is stored, nor how it is routed between data centers, results in the inability to assess whether the data passes through the US, and what the exposure is to improper use of that data. The second element has to do with privacy. The EU, for example, has stringent privacy laws prohibiting the transfer of personal information to countries that do not provide the same level of protection for that data. The US happens to be one of those countries. In other words, am I allowed to store the e-mail address and birth date of a European citizen in salesforce.com?
- If a company outsources the handling of personal information to another company, it may bear responsibility for ensuring the outsourcer applies a reasonable level of security to protect personal and confidential information. An example of what needs to be taken into account is described in the AICPA/CICA privacy framework, organized in 10 components (management, notice, choice & consent, collection, use & retention, access, disclosure, security, quality, and monitoring & enforcement). The issue is that many cloud services are provided through a "supply chain" of companies, making the assessment of "reasonable security" extremely difficult.
- What happens when cloud data is required in litigation? One specific case, U.S. v. Weaver, demonstrates the intricacies of who owns data residing in the cloud (in Hotmail in this particular instance). Electronic communications stored in the cloud are entitled to the full scope of the Stored Communications Act's protections with regard to requests from non-governmental entities.
- Enforcement of record retention policies implies that data no longer needed is destroyed. However, in the cloud, multiple copies of the data are maintained. What guarantee do we have that, when the data must be destroyed, all available copies are? In particular, when multiple companies teamed up to provide the service in the first place, are we sure those relationships are still active by the time the data has to be destroyed?
As David Navetta and Tanya Forsheit point out in a series of blog entries on the subject: "There is going to be incredible financial pressure on organizations to take advantage of the pricing and efficiency of cloud computing and if attorneys fail to understand the issues ahead of time there is a serious risk of getting 'bulldozed' into cloud computing arrangements without time or resources to address some serious legal issues that are implicated."
Companies eager to migrate data to the cloud should keep a couple of elements in mind. First, they should assess the confidential nature of the data they plan to migrate, using rules similar to the ones they use for documents (confidential, private, restricted, etc.), and should assess which classes of data can be stored in the cloud. As a ground rule, I would only store in the cloud data whose accidental exposure does not harm the company or any of its stakeholders.
Secondly, companies should ensure the agreements with their service providers include:
- How they handle third-party requests for access to information
- How not only the data, but also the cloud metadata, is protected
- How the authenticity of the data is ensured; in particular, this should include audit trails
- Who bears the cost of gathering the data that might be required for legal proceedings
Storing data in the cloud does not relieve the subscriber from adhering to the legislation in force around the world. As most of it was established prior to the cloud era, it does not take cloud into account, which results in a number of caveats that need to be addressed. Over time, jurisdictions will fine-tune the law, but that will take time. In the meantime, companies must carefully review what they do to ensure compliance.
At the Gartner Symposium last week, HP's CEO Mark Hurd was quoted on the lack of security in the cloud. This is only one of the voices heard about cloud computing security. So, should we stop thinking about linking our partners in the cloud to gain visibility in our supply chain? Maybe, maybe not. What vulnerabilities are we talking about? In public clouds there are three major ones:
- The transfer of information from your partner to the cloud. Here, standard SSL security (with 128-bit encryption) is used. This issue is not specific to cloud; it actually applies to e-commerce as well. Yes, there have been breaches at that level (nothing is fully secure), but we continue shopping, don't we? To address this, some cloud providers allow VPN connections, though often more in a "private cloud" type of offering.
- The hacking of the data center in which the data is maintained. Here again, this has happened in multiple environments. Mark Hurd pointed out that HP gets 1000 attacks per day. Obviously, hosting applications and data in the cloud forces companies to trust the cloud providers who, for very understandable reasons, typically do not highlight or explain all the security measures they take. So, this is a chicken-and-egg problem. Data centers have been hacked, but that is not stopping companies from storing credit card numbers and the like in internet-enabled data centers.
- The third vulnerability is the least known. As the cloud implies running applications in shared environments, using virtual machines, there is a possibility for tech-savvy hackers to co-locate themselves with the application they want to hack and penetrate that VM container. This is obviously only applicable in public cloud environments. Security at the hypervisor level (the software allowing multiple VMs to run on the same hardware) is the main question here. Unfortunately, there is not a huge amount of experience in this space yet, as it is a rather young area. HP Labs is currently working on the concept of secure cells to address this.
So, this being said, should we use cloud computing to share our ecosystem information? The fundamental question to ask ourselves is how private this information really is. Let me give an example. If you are a cosmetics company, you are probably not interested in putting perfume recipes in the cloud, as they are what make you unique. So, even with a very small chance of the information becoming public, it does not make sense to take that risk. On the other hand, marketing material and prices/discounts are publicly available. Yes, competitors may have to search a little, but they can and will find the information if they wish. Having that information in the cloud does not increase the risk drastically.
So, prior to using cloud services to collaborate in the supply chain, it is important to assess the confidential nature of that information, and whether it can be obtained by other means. Objectively assessing the nature of the information is critical to establishing whether putting the data in the cloud is or is not a real threat to the future of the enterprise and its ecosystem.
If no clear consensus can be reached, you may want to look at intermediate solutions. For example, utility-based environments such as AIS (Adaptive Infrastructure Services) provide secure access to the environment (using VPN or leased lines). As these environments have more stringent security rules, they may appear to the community as less subject to hacking. Ultimately, the security debate is one about trust. The fundamental question is whether the supply chain community trusts the provider or not.
New security techniques will be developed in the future and will change the perception of companies. However, if companies want to start experimenting with cloud today, they should start in non-critical areas.
As part of HP's latest press release on its partnership with GS1 Canada, HP is announcing the HP cloud computing ecosystem for manufacturing industries. This is a really exciting development. Let me explain why.
In many entries of this blog, I have talked about the need for companies to increase visibility across their value chain, exchanging information with suppliers and distribution partners to reduce supply chain costs, improve responsiveness and mitigate risk. Although many companies understand the rationale for such an approach, they have issues with the investments required to develop the infrastructure needed to obtain that visibility. Also, many of them do not trust the use of public hubs. They are looking for something else, but what?
In a previous post, titled "Cloud Computing in Manufacturing", I mentioned that the cloud could be used for cross enterprise collaboration. Well, that's exactly what the HP cloud computing ecosystem for manufacturing industries enables. Now, why is that the case and what is different from the approaches discussed above?
First, by putting the collaborative functionality in the cloud, we remove the hurdle of having to invest upfront in infrastructure, loading the CAPEX portion of the balance sheet, as the cloud typically works on a pay-per-use model. But where things get even more interesting is that a cloud approach allows massive amounts of distributed data to be accessed by discrete or composite services, created by any number of community stakeholders. This actually resolves another perceived problem of the traditional hub approach. Suppliers in particular are often reluctant to have their information located outside their control, not knowing how far this information will percolate. Allowing them to keep responsibility for their portion of the data makes them far more willing to participate in such an environment.
- HP cloud services - technology-enabled services used by manufacturing, distribution and retail companies. The available services allow those companies to track, trace, recall and authenticate manufactured products. Working with customers, we intend to develop more services addressing a variety of cross-enterprise collaboration needs.
- The HP cloud platform - a development and run-time environment that provides data, analytics, management and security services. This platform has been developed by our team in Galway, Ireland, in conjunction with HP Labs.
- The HP cloud infrastructure - a scalable infrastructure that automates resource provisioning and system management
The first commercial application has been developed with GS1 Canada to enhance the product recall process. Over the last couple of years, an increasing number of recall cases have hit the press, and millions have been spent on ensuring food safety. By providing a better understanding of where particular batches of products have been distributed, the service not only enables a significant decrease in the time needed to respond to a recall (hence improving safety), but also limits the amount of product that has to be recalled, hence reducing costs.
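The recall-narrowing idea can be illustrated with a toy example: given distribution records of which batch was shipped where, a recall of one batch touches only the locations that actually received it, instead of every outlet carrying the product. The record layout below is an assumption for illustration, not the GS1 data model.

```python
# Toy distribution records: which batch of which product went where
shipments = [
    {"batch": "B-101", "product": "yogurt", "destination": "store-east"},
    {"batch": "B-101", "product": "yogurt", "destination": "store-north"},
    {"batch": "B-102", "product": "yogurt", "destination": "store-west"},
]

def recall_scope(batch_id, records):
    """Return only the destinations that received the recalled batch."""
    return sorted({r["destination"] for r in records if r["batch"] == batch_id})

# Recalling batch B-101 leaves store-west untouched
print(recall_scope("B-101", shipments))  # ['store-east', 'store-north']
```

With full track-and-trace data, the recall scope shrinks from "all stores selling yogurt" to two stores; that is where both the safety gain and the cost reduction come from.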
The biggest challenge facing both the food industry and consumers in the provision of a global food supply chain that can be tracked and traced is not the absence of data; it is how:
- Data is integrated and managed
- Data privacy is maintained
- Value-added services that harness the collective intelligence the data represents are provisioned. The proposed services need to:
  - Protect the food business organizations' IP and brand
  - Meet the customers' appetite for information
HP saw what the market lacked: a single service acting as an independent broker of information and spanning the entire supply chain, and a provider offering a full suite of services addressing the industry's needs.
The HP Cloud Computing Platform for manufacturing industries is the supporting platform for these services. Although it focuses on the provision of real-time information, communication standardization, the generation of alerts and the maintenance of audit trails, many other business problems can be addressed using the same platform. We are currently looking for partners and customers interested in developing other cross-industry services.
Two weeks ago I was asked to run a workshop on collaboration at an event in China. Based on some of the ideas I shared in an earlier blog, I developed the content of the workshop. What dawned on me, listening to people in the audience, is that both users and providers keep talking about tools, without looking at the full portfolio of needs. Let's look at what those are.
I am starting from a very simple model: in collaboration, both structured and unstructured information needs to be shared, while interactions may be synchronous or asynchronous. Let's look at each of the four areas in a little more detail:
- Asynchronous exchange of structured data. Most business-to-business transactions fall into this category. Information is shared by the sender when it becomes available, and picked up by the receiver when it can be processed. There is no expectation by the sender of an immediate response. EDI transactions, for example, follow this scheme.
- Asynchronous exchange of unstructured data. Exchange of documents, review processes, calendar planning, etc. typically follow such a scheme. Here again, the information is shared by the sender at a particular moment in time, and processed by the receiver when he/she has availability. A good example is the review of a document. The author sends it when he/she finishes the first draft. The reviewer takes time to review it and post comments. When done, the document is sent back to the author, who takes the feedback into account and issues draft 2.
- Synchronous exchange of structured data. Some collaboration requires immediate commitments. For example, when an OEM asks whether a contract manufacturer has the necessary capacity available to produce a particular batch of product, the OEM expects a response (positive or negative) allowing it to allocate the production to this or another supplier. This is a two-way collaboration where the messages are directly related to each other and as such synchronous.
- Synchronous exchange of unstructured data typically relates to direct human interactions or joint work on a document, a CAD file etc.
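The four quadrants above can be sketched as a simple lookup from (timing, data type) to the typical tool category. The tool names mirror the examples in the text and are illustrative, not prescriptive.

```python
# The 2x2 collaboration model: (timing, data type) -> typical tooling
TOOLS = {
    ("asynchronous", "structured"): "B2B messaging (e.g. EDI)",
    ("asynchronous", "unstructured"): "e-mail / document exchange",
    ("synchronous", "structured"): "request/response B2B transactions",
    ("synchronous", "unstructured"): "unified communication / telepresence",
}

def tool_for(timing, data_type):
    """Look up the typical tool category for a collaboration quadrant."""
    return TOOLS[(timing, data_type)]

print(tool_for("asynchronous", "structured"))   # B2B messaging (e.g. EDI)
print(tool_for("synchronous", "unstructured"))  # unified communication / telepresence
```

The lookup makes the gap discussed next visible: each quadrant has its own tool, but nothing spans the quadrants when a conversation crosses them.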
You are probably saying: what's the issue? We use B2B software for the first and third categories; we use e-mail for the second, and unified communication and telepresence for the fourth. This is how many companies look at the collaboration issue. But they forget one thing: what if, within one conversation, we move from one category to another? How do we keep track of what has happened and how things evolved?
Let me take an example. Let's assume an OEM and its contract manufacturer jointly work on the development of a new product. First, specifications are written. The OEM develops the first draft and sends it to the CM for review and comment. The CM realizes that by slightly changing the specifications, manufacturing could be done more cheaply, resulting in benefits for both companies. It sends an updated specification to the OEM. All this happens through e-mail. But the OEM is not really convinced and decides a synchronous interaction is required. So, a call is set up between the parties and a negotiation takes place, after which a new version of the specifications is developed. At this point at least two technologies have been used. But now the actual development starts, using CAD and CAE tools. Subsequent versions of the design are exchanged to ensure manufacturability, and regular design reviews take place, until the product is finalized and production is planned. Engineering change management tools, collaboration and communication tools are used throughout the process, but how do we maintain consistency of the information and log all the decisions taken?
You may ask yourself why this is important. Well, let's assume the product has warranty problems: how can we identify which decision got us in trouble, allowing us to understand better what happened and improve things in the future?
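A hedged sketch of the missing piece: a single conversation log that records each interaction, whatever tool carried it, together with the artifact version and any decision taken, so a later warranty problem can be traced back to the decision that introduced it. All names and the record layout here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    channel: str        # "email", "call", "CAD exchange", ...
    artifact: str       # e.g. "spec v2", "design v1"
    decision: str = ""  # decision taken at this step, if any

@dataclass
class ConversationLog:
    entries: list = field(default_factory=list)

    def record(self, channel, artifact, decision=""):
        self.entries.append(Entry(channel, artifact, decision))

    def decisions_for(self, artifact):
        """Trace back: which decisions touched this artifact version?"""
        return [e.decision for e in self.entries
                if e.artifact == artifact and e.decision]

# The OEM/CM story from the example, logged across tools
log = ConversationLog()
log.record("email", "spec v1")
log.record("call", "spec v2", decision="relax tolerance to cut cost")
log.record("CAD exchange", "design v1")

print(log.decisions_for("spec v2"))  # ['relax tolerance to cut cost']
```

The point is that the log is channel-agnostic: e-mail, calls and CAD exchanges all land in one timeline, which is precisely what today's siloed tools fail to provide.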
The tools exist today, but each of them is taken in isolation. Google seems to be trying to address this through its Wave project, but Wave is focused on consumers. Who will address the true problem of collaboration in the enterprise space, particularly now that business travel has been reduced while companies become more international? Employees are expected to increase their productivity while neither tools nor training are provided for global collaboration.