The rush to the Cloud is well underway. For some, the transition has gone smoothly. For others, the switch has been more expensive and problematic than anticipated. The difference lies in the level of preparation, education and planning undertaken before making the move. The leap into the Cloud (or any new technology) has never been as easy or smooth as promoters and some vendors promise. Most enterprises are reluctant to share a public analysis of their problems. To our benefit, many government entities operate under different rules and motivations, so their follies and missteps get identified and publicly aired, allowing others to learn from them. Let’s look at one case.
Enterprises evaluating the Cloud can learn something from NASA’s experience. Recently, NASA’s Inspector General (IG) published an audit of the agency’s Cloud usage that highlighted some issues to which enterprises should pay attention. There is no intent here to bash NASA. The agency has a well-known history of pioneering in the Cloud. In fact, NASA created and contributed key technology that forms the basis of OpenStack, which has been widely adopted across the industry as a standard foundation for Cloud technology.
We use NASA’s experience to provide examples of issues that can arise when dealing with the Cloud. The reader should keep in mind that NASA accepted the IG’s findings and has already made plans to correct the problems that were uncovered. The points we make apply equally to government agencies and private enterprises.
The issues that we explore fall into four broad categories: 1) Governance, 2) Security, 3) Reliability, and 4) Interoperability and Open Standards. Let’s examine each of these in turn.
Governance relates to the overall management of the Cloud. The NASA auditors began with a survey of the agency’s divisions to determine actual Cloud usage. They discovered that the NASA CIO was unaware of all of the ways the Cloud was being used in different parts of the agency. The various departments had signed contracts with a variety of different Cloud suppliers. In fact, some individuals had used NASA credit cards to buy time from Cloud providers. These deals were made while generally ignoring processes and criteria relating to security requirements as well as existing Federal standards for Cloud contracts.
We suspect that most companies would be surprised to find their own internal Cloud usage is more widespread and unregulated than anyone truly knows. The first order of business, then, is to determine how many departments have made their own arrangements with any of the numerous and varied Cloud suppliers. For most organizations, we believe the CIO’s organization will be expected to take responsibility for Cloud governance. The NASA example shows that this assignment of responsibility must be clearly communicated throughout the organization. In most situations, it will be the CEO’s responsibility to communicate this fact clearly and unequivocally to everyone. Failure to communicate in a clear, pervasive, structured manner was a key factor in NASA’s problems.
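A first pass at the discovery step described above can be as simple as scanning expense records for charges from known Cloud vendors, much as the NASA IG uncovered credit-card purchases of Cloud time. The sketch below is purely illustrative: the vendor list, department names, and record format are our own assumptions, not part of any audit methodology.

```python
# Hypothetical first-pass shadow-Cloud inventory: scan expense records
# for charges from known Cloud vendors. Vendor names, departments, and
# amounts below are invented for illustration.

KNOWN_CLOUD_VENDORS = {"amazon web services", "rackspace", "google cloud"}

expenses = [
    {"dept": "Engineering", "vendor": "Rackspace", "amount": 1200.00},
    {"dept": "Marketing",   "vendor": "Office Depot", "amount": 89.50},
    {"dept": "Research",    "vendor": "Amazon Web Services", "amount": 430.75},
]

def find_cloud_spend(records):
    """Return expense records whose vendor matches a known Cloud provider."""
    return [r for r in records if r["vendor"].lower() in KNOWN_CLOUD_VENDORS]

for hit in find_cloud_spend(expenses):
    print(f'{hit["dept"]} paid {hit["vendor"]}: ${hit["amount"]:.2f}')
```

A real inventory would also need to cover departmental contracts that never touch a corporate card, which is why the survey and cataloging steps remain essential.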
Next, it is necessary to catalog and review all of the business arrangements these departments have made with the various suppliers. In most cases, they have probably just signed the ‘boiler-plate’ agreements the vendor presented to them. The organization will need to develop its own specific standards for contracts and service level agreements to be used in Cloud service procurements. The CIO will need to inform those responsible in the different functions that they must follow these standards. Good governance requires monitoring to assure compliance. Existing agreements that do not meet these standards will have to be cancelled, modified or renegotiated to comply, whichever makes the most sense for the given department or group.
The next item to be addressed is security. Before getting into Cloud-specific security requirements, the enterprise must analyze its applications to identify and assess the risk undertaken if an application moves to the Cloud. The Federal government created a three-tier classification of applications that provides a good starting point for other organizations; the NASA IG used this method. It divides applications into low, moderate, and high risk. Low risk means there would be little damage if the application were compromised. Moderate means there would be damage, but it could be contained. High risk means there is the potential for severe damage. Security requirements escalate in scope and depth as the risk increases. Enterprises cannot move forward with a Cloud security strategy until they have an appropriate classification for each of their applications. It is worth noting that NASA organizations moved two moderate-risk applications to outside Cloud providers with no special security steps taken.
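The triage described above can be expressed mechanically. One common approach in federal-style classification is to rate an application on several impact dimensions and take the highest rating as the overall risk. The sketch below assumes that "high-water mark" rule and uses invented application names and ratings; it is an illustration of the idea, not the government's actual procedure.

```python
# Sketch of three-tier risk triage: rate each application on several
# impact dimensions and let the worst rating set the overall risk.
# Application names and ratings are hypothetical examples.

LEVELS = {"low": 1, "moderate": 2, "high": 3}

def classify(confidentiality, integrity, availability):
    """Overall risk is the highest impact across the three dimensions."""
    return max(confidentiality, integrity, availability, key=LEVELS.get)

apps = {
    "public-outreach-site": ("low", "low", "moderate"),
    "mission-telemetry":    ("high", "high", "moderate"),
    "internal-wiki":        ("moderate", "low", "low"),
}

for name, impacts in apps.items():
    print(f"{name}: {classify(*impacts)} risk")
```

Once every application carries a classification like this, the organization can decide which ones are candidates for an outside Cloud at all, and what extra security steps the moderate- and high-risk ones demand.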
The recent Snowden case dramatically makes the point that many of the most egregious security violations originate from within an organization. Therefore, using a private Cloud and keeping it within the organization’s firewall does not eliminate all security threats. On the other hand, using a public Cloud does demand that the organization review, evaluate and monitor the security offered by the Cloud provider.
At the most fundamental level, good security demands that enterprises take the necessary steps to ensure that unauthorized parties are denied access to sensitive information. This requirement continues when such information resides in a Cloud. We believe that as enterprises study their data they will decide that some information is too sensitive or its release would be too damaging to allow it to be in the Cloud.
We said earlier that many organizations might be surprised at the amount of work that has already been moved to the Cloud. Generally, there is little long-term harm when developers do this. However, moving production work is another matter entirely. Business units tend to take a very short-term view with a focus on ‘getting the job done’ or meeting immediate goals. This can lead to serious problems when a quick solution results in data breaches or security, reliability, or availability failures. History teaches us that correcting the resulting problems can be very expensive. In the category of production we include enterprise web sites. The NASA IG discovered that NASA’s key web site (NASA.gov) had been moved to the Cloud. No test of the security of the Cloud in question had ever been done, nor does it appear any attention was paid to adhering to appropriate contracts or SLAs. The web site did not comply with government policies.
For an enterprise running applications in its own datacenter, the cost of reliability and its cousin availability is clear. There must be redundancy in the servers, storage, software, and networking. The mantra is: ‘eliminate any single point of failure’. Periodically, failover drills are conducted to assure that the operations staff can manage moving applications to backup systems. Moreover, some provision needs to be made to handle environmental threats such as fire, flood, hurricane, etc. All of these steps need to be taken; otherwise reliability and availability can be seriously compromised.
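The ‘eliminate any single point of failure’ mantra lends itself to a simple automated check: every tier of a deployment should have at least two independent instances. The sketch below is a toy illustration with invented tier names and counts; a real audit would also verify that the instances are genuinely independent (separate power, network paths, and sites).

```python
# Toy single-point-of-failure check: flag any tier of a deployment
# that has fewer than two independent instances. Tier names and
# instance counts are illustrative assumptions.

deployment = {
    "web":      3,   # load-balanced web servers
    "app":      2,   # application servers
    "database": 1,   # only one copy -- a single point of failure
    "network":  2,   # redundant network paths
}

single_points = [tier for tier, count in deployment.items() if count < 2]
print(single_points)  # → ['database']
```

The same inventory-and-check discipline applies whether the tiers live in the organization’s own datacenter or in a Cloud provider’s.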
We find that when organizations move applications to the Cloud, many ignore these requirements. The thinking is that they can skip them because the responsibility has shifted to the Cloud provider. However, this is only true if it is detailed and specified in the contract. Even then, reliability and performance guarantees need to be analyzed to assure they match what the organization needs and requires. Money saved moving to the Cloud is illusory if necessary reliability is compromised or not provided at all. Reliability is not free in an organization’s own datacenter. It will not be free in the Cloud either. Organizations must have a clear understanding of their reliability requirements, then communicate their expectations to their Cloud vendor. They should monitor and test to confirm that these are met over time. We expect wise organizations will split their work between different Cloud providers in different geographies, or at the least between different and geographically separate datacenters of their Cloud provider.
Interoperability & Open Standards
For widespread adoption of the Cloud, it is critical that enterprises be able to move applications and data from one Cloud to another. Col. Hill of the US Army, the head of the Futures Directorate, said that interoperability between Clouds, as well as the ability to move data from one Cloud to another, is an important factor in the acceptance of the Cloud. He went on to say that the Army needed an open architecture that would allow it to use the best features of the various Clouds in the market now. To us, OpenStack currently shows the most promise of being that architecture. We recommend that organizations consider supporting and participating in OpenStack activities.
It is clear that any enterprise evaluating the Cloud has a great deal of work to do. It needs to make sure that its data and reputation will be protected in the Cloud. The Cloud can offer substantial savings, but it is critical that these savings do not come at the expense of the security, reliability, and interoperability that will be needed in the long term. Moving to the Cloud does not relieve enterprise IT of its responsibilities for governance, security, reliability, interoperability and standards. It does mean that while the work related to these is done by others, the enterprise must have a clear understanding of what is necessary and required in each case. Then, it must monitor, manage, analyze and verify that its internal requirements are met.
By Bill Moran and Rich Ptak
1. For the report see http://oig.nasa.gov/audits/reports/FY13/IG-13-021.pdf
2. HP, Intel, and IBM are just a few of the companies that have adopted OpenStack as the basis for their Cloud technology. There are currently more than one hundred companies in the OpenStack organization. See http://www.openstack.org/
3. Developers, in particular, are likely to go outside the organization to accelerate the development and testing of applications that they are responsible for.