Some interesting diagrams on composition of a device data processing pipeline with AWS are at –http://aws-de-media.s3.amazonaws.com/images/jmetzner_Hackday_Berlin.pdf
The services listed are:
Amazon Cognito: Identity and Security. Gets token with role for API access by a certain user.
Amazon Kinesis: Massive data ingestion. Uses token auth, but token signing can be a easiest.
AWS Lambda: Serverless Data Compute. Supposed to save on EC2 instance costs ( at the expense of lock-in ).
Amazon S3: Virtually unlimited storage. This is what really makes AWS tick.
Amazon Redshift: Petabyte-scale data analysis
It does not say what data goes to S3 and what data goes to the database.
On Redshift, here’s a comment from Nokia:
“http://www.cio.com/article/2860383/data-warehousing/7-amazon-redshift-success-stories.html#slide3” where their volume of data “literally broke the database”, prompting them to look for more scalable solutions.
There is a tension between “servers” and “services”, which goes back to IAAS vs PAAS distinction. PAAS can be faster to develop with reduced focus on server maintenance. However the number of PAAS concepts to deal with is neither small nor particularly inviting, as instead of a single server, one now has to deal with multiple services, each has to be authenticated, priced, guarded for possible misuse and each has the potential for surprises. A key to simplicity is how composable the services are.