Cloud Infrastructure Blueprint

May 28, 2008 – 22:38 by Ryan

As cloud platforms and services start to make their way to the market, we think it’s becoming obvious how the industry will play out. To understand the future, it’s important to look at the past.

Best of Breed

Standalone providers that offer individual cloud-based services are going to find it difficult to survive on their own. Currently, enterprises have way too many Internet service vendors. The situation is reminiscent of the software industry in 2000. For a long time, enterprises took a best-of-breed approach to software vendor selection. This worked for a while, but eventually products matured, prices aligned and competitive differences dwindled.

A good example of this is the competition between J2EE container providers. For a few years, BEA provided a stronger J2EE container than IBM, JRun, JBoss, ATG, SilverStream and others. The competing products eventually matured and the industry was commoditized. This placed customers in an awkward position when justifying the expenses related to maintaining numerous software vendors. SilverStream went to Novell, BEA went to Oracle, JBoss went to Red Hat and JRun went to Adobe (by way of a sale to Macromedia).

The Stack

Eventually, the software “stack” was born. Software publishers started consolidating and now you can get just about everything you need from a single vendor. Of course some companies still take the best-of-breed approach in their vendor selection process, but when you make a lot of separate purchases you miss the economies of scale that a single large transaction can bring. Most software publishers offer sizable discounts on non-core products when you buy the complete stack.

In a nutshell, the same consolidation we witnessed in the software industry is about to hit the cloud/platform space.

Blueprint

The name of the game is efficiency, and this can only be achieved through service consolidation. The following is a list of the services and functionality we think are essential for cloud platform providers:

  • Storage – The ability to transparently increase your storage capabilities. This is going to be a tough nut to crack. If company A has a disk I/O read requirement of 400 MB/s, they may have issues with the services currently available. Today, cloud storage pricing models are based on bytes transferred in and out of the cloud and the amount of storage used. Eventually, you’ll be able to pay extra for high throughput.
  • Message Bus – The ability to reliably communicate between applications and distributed nodes using a common interface.
  • Network Isolation – Most companies don’t believe in encrypting everything that hits the wire. This results in potential security issues for applications sitting on the same network. This may be a difficult problem for cloud vendors to solve (consider that each of the nodes your app runs on may be on a separate subnet).
  • Database – This is obvious, but what is not is the death of RDBMS technology. It doesn’t make sense for companies to continue using this technology much longer. The costs associated with writing applications that require multiple programming languages are high. OO databases are starting to mature and the reduction in bookkeeping and development costs will drive people to this technology. Google, Amazon and others have already released OO database services. Google added a SQL interface to their OO database for legacy users/apps.
  • Job Scheduling – Some applications need to run at scheduled times.
  • Load Balancing – HTTP load balancing should be transparent.
  • Resource Scheduling – Clouds need to provide consistent performance for all types of applications. For non-web based applications this means that they need to dedicate specific disk, memory and CPU resources. Additionally, the cloud must detect when an application needs additional resources and dynamically allocate those resources. This is easy for most standard HTTP applications but is a more difficult problem when looking at data processing/computational applications.
  • Parallel/Grid Processing – Let’s say we need to analyze the HTTP logs we collected over the last year. This can be dispatched to a single machine but it would take forever to run. The ability to transparently process data in parallel is essential for the enterprise adoption of cloud platforms.
  • Network Capacity – Some applications are optimized and can easily max out a 1 Gb network connection. Most cloud platforms use shared or virtualized resources. This can make it difficult to isolate network capacity. The release of 10 Gb networking technology (with 100 Gb technology on the way) will drastically change the way applications are designed. Engineers will have significantly fewer restrictions when developing distributed systems. The primary focus with network capacity is on the LAN level. It’s assumed that the cloud platform has enough network capacity to handle its customers’ WAN requirements.
  • Caching – This should be transparent for data store access. Users need the ability to create complex transient data structures that support distributed access.
  • Advertising – Complete campaign management and fulfillment from a single vendor is essential. This includes email, text, banner, video, outdoor, IPTV, etc. The recent consolidation in the industry indicates that this is already underway.
  • Web Analytics – Currently, there are a variety of standalone companies that provide this service. We think these companies will be forced to sell/merge into the cloud platforms.
  • CDN – Content delivery should be as simple and transparent as possible. Cloud platforms can transparently provide this service to their customers. Expect to see a lot of consolidation in this space over the next two years. The jewel in the industry is Panther Express.
  • Email/Communication – Providing a single corporate UI to your employees is essential. Switching UIs results in a context switch and requires a brief period for the user to adjust to the alternate environment. Integrating web-based email access with the rest of the software/services customers use to run their business will further increase the efficiency of their employees.
  • Core Service APIs – Embedding new applications in a cloud needs to be fast and easy. Providing a common API/framework is critical. Imagine customer A uses service B provided by company X. Customer A already has an application management console provided by the cloud. They need to be able to configure service B using their existing admin tools. Pluggable admin tools should resemble Facebook’s development environment.
  • Content Management – Companies need to be able to organize, publish and track their content. This needs to support all common media formats.
  • DNS/Registrar – Nobody wants to think about the plumbing, nor do they want to maintain multiple accounts to manage this functionality. Using DNS provided by a single ISP doesn’t make sense from a reliability standpoint. ISP networks and services are much too xenophobic. Cloud platforms need to be 100% fault tolerant. An outage no longer impacts one company; it impacts thousands.
  • ERP Tools – All companies need a few basic tools no matter what industry they’re in (e.g., time tracking, contact lists/corporate directories, issue tracking, project management, documents, spreadsheets, billing, etc.).
  • Monitoring/Alerts – Users need to be notified when conditions change. The conditions that trigger a notification must be configurable.
  • Service Billing – Clouds will provide great environments for mashup development. A clean billing and tracking solution is necessary so that services can be used without adding more vendors to your list of partners.
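
To make the Parallel/Grid Processing item concrete, here is a minimal sketch of the HTTP log example: split the log into chunks, count status codes in each chunk independently, then merge the partial results. Everything in it is an assumption for illustration. The function names are hypothetical, the log layout is assumed to follow the Common Log Format, and a local thread pool stands in for the machines a real cloud platform would dispatch the chunks to.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_statuses(lines):
    """Map step: count HTTP status codes in one chunk of log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Assumed layout: the status code is the second-to-last field,
        # as in the Common Log Format ('... "GET / HTTP/1.1" 200 512').
        if len(fields) >= 2 and fields[-2].isdigit():
            counts[fields[-2]] += 1
    return counts


def split_into_chunks(lines, chunk_size):
    """Split the log into fixed-size chunks that can be processed independently."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def parallel_status_counts(lines, chunk_size=1000, workers=4):
    """Dispatch each chunk to a worker, then reduce the partial counts."""
    chunks = split_into_chunks(lines, chunk_size)
    total = Counter()
    # A thread pool is only a stand-in here; on a grid, each chunk
    # would be sent to a separate node.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_statuses, chunks):
            total += partial
    return total
```

The point of the shape is that the caller never sees the chunking or the workers. That is the kind of transparency the list above asks a platform to provide at data-center scale.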

Serious Clouds

Looking at the functionality list above, it’s obvious what Google, Amazon and others have been thinking about for a while.

A serious cloud platform will require a lot of custom development, but it’s still really early in the game. However, developing ALL of the systems/applications listed is a bit unrealistic. The best approach to creating a company that can compete with Google, IBM and Microsoft is to make a slew of acquisitions and integrate the applications. Most of the applications can continue to operate in relative backend silos with a completely integrated UI. The problem is that the return on investment will not be realized for some time. Enterprises are only beginning to look at moving to cloud platforms.

The golden age for platforms is not too far away and everyone and their brother will be looking for a pick and a shovel.

4 Responses to “Cloud Infrastructure Blueprint”

  1. I must say this is a great article i enjoyed reading it keep the good work :)

    By Dan Waldron on May 28, 2008

  2. Err….ARG? Is that a pun by a former employee? ;^)

    By John on May 29, 2008

  3. Editorial help from the guy traveling around the world :-) Thanks, it’s fixed.

    By Ryan on May 29, 2008

  4. Great post. I like this blueprint. How can we get people away from this “web2.0” nonsense and towards becoming more familiar with real infrastructure? It seems that too few people are thinking about competing with big Internet giants.

    By Steven on May 29, 2008
