Amazon Web Services (AWS) started with elastic and scalable compute and storage offerings – an elastic compute cloud (EC2) and a simple storage service (S3). EC2 was based on virtual machines. S3 was a simple key value store with an object store backend.
Heroku came up with a deploy an app with a single command line model – heroku deploy. It would allow automation of deployment of rails apps. It was an early example of a platform as a service (PAAS).
NASA engineers wanted to store and process big data and tasked an internal team and Rackspace to build a web application framework for it. When they tried to use the existing aws offerings, they found the time to upload the data to AWS would take too long. They came up with their own implementation to manage virtual machines and handle storage, with openstack. It provided EC2 and S3 like apis and worked with certain hypervisors.
Greenplum which is a big data company acquired by EMC, wanted a framework to make use of big data easily. EMC had brought VMware which brought Cloudfoundry. Greenplum worked with VWware and a dev company pivotal labs to build up Cloudfoundry as a mechanism to deploy apps so they could be scaled up easily. Pivotal got acquired by VMware and later spun out as Pivotal Cloudfoundry to provide enterprise level support to the open source cloudfoundry offering. This had participation from EMC, VMWare (both were one already), and a third participant – GE with an interest to build an industrial cloud. Cloudfoundry forms the basis of the GE predix cloud.
Cloudfoundy is now a 3rd gen PAAS which works on top of an IAAS layer such as openstack or vmware vsphere or even docker containers running locally via a Cloud Provider Interface (CPI).
Another interesting project is Apache Geode, an in-memory database along the lines of Hana.
Meanwhile Amazon and Azure increase the number of webservices available rapidly as well as reducing their costs.
There was a meetup on Ranger recently at Pivotal labs which discussed authorization for various components in the Hadoop ecosystem, including Kafka.