The Services...
Let's start with the obvious: what are the generic categories of service offered by all service providers?
IaaS
Infrastructure as a Service. This is exactly what it suggests: an infrastructure component (compute, storage, or network) delivered as a service.
SaaS
Software as a Service. Software that is accessed remotely and fully managed as a service by the service provider, such as Salesforce.
PaaS
Platform as a Service. This allows you to run an application without the need to manage the underlying infrastructure.
The Cloud service providers are numerous. The top two are, not surprisingly, Amazon Web Services and Microsoft Azure. If we look a little more closely at these two (sorry Google and IBM), we will see that each has its own set of acronyms for the products and services on offer. Very broadly speaking (I did mention complexity as one of the disadvantages of Public Cloud!), these can be separated into the physical platforms that we may already be familiar with (compute, storage, and network), the software that runs on top (databases and additional software to support functionality), and the features that improve overall efficiency (auto scaling, for example).

To understand these products and services (let's take storage as an example), it is necessary to understand what each service description actually means. It now becomes even more complex, as storage can be broadly categorised into 'tiers': block, file, object, backup and recovery, and archive.

Block is the higher cost and typically higher performing tier of storage (I am choosing my words very carefully, as this is a minefield: higher network speeds on Ethernet using IP are compared with the 'lossless' Fibre Channel protocol used for block storage, which makes the discussion a little contentious), and it is suited to types of data requiring high performance. Block storage uses LUNs (disks) to store data, and typically requires a filesystem to manage the data being written to blocks on the disk. The LUN is presented to a host across a Fibre Channel connection using SCSI to communicate with the disk, or an Ethernet connection using iSCSI (which encapsulates the SCSI commands within a TCP/IP connection). The data written to block storage is referred to as 'structured'.
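The relationship between a LUN and the filesystem above it can be sketched in a few lines of code. The model below is purely illustrative (the class and method names are my own, not any vendor's API): the 'disk' is just an array of fixed-size blocks addressed by number, and a separate filesystem layer is needed to remember which block holds which file.

```python
# Toy model of block storage: the LUN knows only numbered blocks;
# mapping file names to block addresses is the filesystem's job.
BLOCK_SIZE = 512  # bytes per block, a common sector size

class Lun:
    """A 'disk' presenting raw, numbered blocks."""
    def __init__(self, block_count: int):
        self.blocks = [bytes(BLOCK_SIZE)] * block_count

    def write_block(self, address: int, data: bytes) -> None:
        # Pad to a full block, as a real device writes whole sectors.
        self.blocks[address] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, address: int) -> bytes:
        return self.blocks[address]

class TinyFilesystem:
    """Tracks which block holds which file; this bookkeeping is the
    part that block storage does NOT provide on its own."""
    def __init__(self, lun: Lun):
        self.lun = lun
        self.table = {}       # file name -> block address
        self.next_free = 0

    def write_file(self, name: str, data: bytes) -> None:
        self.table[name] = self.next_free
        self.lun.write_block(self.next_free, data)
        self.next_free += 1

    def read_file(self, name: str) -> bytes:
        return self.lun.read_block(self.table[name]).rstrip(b"\x00")

fs = TinyFilesystem(Lun(block_count=8))
fs.write_file("orders.db", b"structured data")
print(fs.read_file("orders.db"))  # b'structured data'
```

In a real deployment the transport between host and LUN would be Fibre Channel or iSCSI; here the two layers simply share memory, which is enough to show why a filesystem must sit on top of raw blocks.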
File storage is fairly self-explanatory, and can be seen in all workplaces when a user connects to their desktop and opens a file. Data held on file storage is referred to as 'unstructured', and is accessed as file 'shares' (folders) using the NFS, SMB, and CIFS protocols (CIFS being an older dialect of SMB). File storage is typically a medium performing tier of storage, and unlike block storage it can be easily shared between end users and servers.
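The reason file storage is so easy to share is that the NFS or SMB client in the operating system hides the protocol entirely: to an application, a mounted share looks like an ordinary directory. In this sketch a temporary directory stands in for a mount point such as /mnt/share (NFS) or \\fileserver\share (SMB); those paths are illustrative only.

```python
# A mounted file share behaves like any local directory, so normal
# file operations work unchanged.
import tempfile
from pathlib import Path

share = Path(tempfile.mkdtemp())      # stand-in for the mount point
folder = share / "team-documents"     # a folder within the share
folder.mkdir()

# Writing and reading use ordinary file semantics; any user or server
# with the share mounted sees the same files.
report = folder / "report.txt"
report.write_text("quarterly figures")
print(report.read_text())             # quarterly figures
```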
Object storage, on the other hand, is more complex in its use cases, and is typically a lower cost option. It is lower performing, and can be treated essentially as a 'cheap-and-deep' tier of storage. Object storage stores data as objects, and each object contains metadata describing that object. Because high performance is not a characteristic of object storage, objects will consist of mainly static data, and will be replicated among other object stores in other datacentre locations to provide very high resilience against data loss. The complexity continues when considering the type of storage being used at the physical layer: this can be either a hard drive or a solid state drive (SSD). The key premise behind object storage is availability through replication of objects between datacentres and regions.
For compute resources, to fully understand what is on offer from a service provider (Amazon refer to their Cloud compute as Elastic Compute Cloud, or EC2 for short, and Microsoft have Virtual Machines), you need to have an appreciation of the physical aspects of the virtual machine being provisioned. Remember, this is all virtualised, so a virtual machine will be provisioned from a physical server running a specific rating of CPU, with a finite number of physical cores. The virtual machine is then configured with a count of virtual CPUs (vCPUs), which map directly to the physical and logical cores of the host server. The physical platform will be shared with other virtual machines running on the same physical server. Pricing for compute is based on the performance characteristics of the physical server, and there will be varying levels of performance available, with the lower performing platforms at a lower price point. It is therefore essential that the performance profile of an application moving to a Cloud compute platform is understood.
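The vCPU-to-core relationship and the price/performance trade-off above can be made concrete with a small sketch. The instance names, tiers, and hourly prices below are entirely invented for illustration; the arithmetic (two logical cores per physical core on a hyper-threaded host, one vCPU per logical core) reflects the common mapping.

```python
# Sketch of vCPU counts and price-tier selection. The catalogue is
# invented; real provider catalogues work on the same principle.

def logical_cores(physical_cores: int, threads_per_core: int = 2) -> int:
    """A hyper-threaded host exposes two logical cores per physical
    core; one vCPU is typically one logical core."""
    return physical_cores * threads_per_core

# Invented catalogue: name -> (vCPUs, performance tier, $/hour)
catalogue = {
    "small-burst":   (2, "low",  0.02),
    "general-4":     (4, "mid",  0.08),
    "compute-opt-8": (8, "high", 0.34),
}

def cheapest(min_vcpus: int, min_tier: str) -> str:
    """Pick the lowest-priced instance meeting both requirements,
    mirroring how an application's profile should drive selection."""
    rank = {"low": 0, "mid": 1, "high": 2}
    candidates = [
        (price, name)
        for name, (vcpus, tier, price) in catalogue.items()
        if vcpus >= min_vcpus and rank[tier] >= rank[min_tier]
    ]
    return min(candidates)[1]

# A 16-physical-core host offers 32 logical cores, so it could carry
# e.g. eight 4-vCPU machines (ignoring oversubscription).
print(logical_cores(16))                      # 32
print(cheapest(min_vcpus=4, min_tier="mid"))  # general-4
```

The point of the sketch is the last line: without knowing the application's performance profile (minimum vCPUs and tier), there is no way to choose the right, and cheapest, instance.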