Category: cloud

Kubernetes security

Kubernetes is a Platform-as-a-Service (PAAS) build on top of (docker) containers, but with an additional unit of abstraction called a pod, which a) is its smallest unit of execution b) has a single external IP address, c) is a group of one or more containers where d) the group of containers are connected over a network namespace and e) each pod is isolated from others by network namespaces. Within a pod, different containers can see each other over different ports over a loopback interface. Within an instance, while different pods can see each other as different IP addresses. It has a control plane built on top of etcd, a consistent, distributed, highly available key-value store, which is an independent opensource CNCF project.

It is conceptually similar to Cloud Foundry, Mesos, OpenStack, Mirantis and similar abstraction layers.

A threat matrix for Kubernetes by MS – https://www.microsoft.com/security/blog/2020/04/02/attack-matrix-kubernetes/

From RSA’20, here’s a talk on ‘The future of Kubernetes attacks’ – https://youtu.be/CH7S5rE3j8w

From Coinbase, a blog on ‘Why Kubernetes is not part of our stack’https://blog.coinbase.com/container-technologies-at-coinbase-d4ae118dcb6c makes these points

  • it needs a full-time compute team to maintain
  • securing it is neither trivial nor well understood. SPIFFE, SPIRE, Envoy, Istio, OPA, service mesh are a few of the technologies.

This blog links to – https://k8s.af/

Another viewpoint – https://pythonspeed.com/articles/dont-need-kubernetes/

A counterpoint to the Coinbase blog – https://blog.kumina.nl/2020/07/in-response-to-container-technologies-at-coinbase/

Scratch notes:

K8S is based on a Controller pattern:

  • Resources capture the desired state
  • Current state is kept centralized in etcd, a distributed key-value store, similar to Consul
  • Controllers reconcile current state with desired state

Pod is a top level resource, is the smallest deployment unit, and is a group of one or more containers described by a yaml file, similar to docker-compose.yml .

K8S Operator is a kind of resource manager, for Custom resources.

https://blog.frankel.ch/your-own-kubernetes-controller/1/

https://pushbuildtestdeploy.com/when-do-kubernetes-operators-make-sense

Spinnaker is a Continuous Delivery platform that itself runs on k8s as a set of pods which can be scaled up

A kubectl cheat sheet:

https://kubernetes.io/docs/reference/kubectl/cheatsheet

An article on cloud security https://medium.com/xm-cyber/having-fun-with-cloud-services-e281f8a7fe60 , which I think makes the point of why things are relatively complex to begin with.

One comes across the terms helm and helm charts. Helm is a way to package a complex k8s application. This adds a layer of indirection to an app – https://stepan.wtf/to-helm-or-not/ .

A repo to list failing pods – https://github.com/edrevo/suspicious-pods

Exploring networking in k8s – https://dustinspecker.com/posts/how-do-kubernetes-and-docker-create-ip-addresses/

Plugin for Pod networking on EKS using ENIs – https://github.com/aws/amazon-vpc-cni-k8s

Hardening EKS with IAM, RBAC – https://snyk.io/blog/hardening-aws-eks-security-rbac-secure-imds-audit-logging/

EKS Authentication with IAM – how does it work ?

IAM is only used for authentication of valid IAM entities. All permissions for interacting with EKS Kubernetes API is managed through the native Kubernetes RBAC system.

AWS IAM Authenticator for EKS is a component that enables access to EKS via IAM, for provisioning, managing, updating the cluster. It runs on the EKS Control Plane – https://github.com/kubernetes-sigs/aws-iam-authenticator#aws-iam-authenticator-for-kubernetes

A k8s ConfigMap is used to store non-confidential data in key-value pairs.

The above authenticator gets its configuration information from the aws-auth ConfigMap. This ConfigMap can be edited via eksctl (recommended) or be directly edited.

A Kubernetes service account provides an identity for processes that run in a pod. For more information see Managing Service Accounts in the Kubernetes documentation. If your pod needs access to AWS services, you can map the service account to an AWS Identity and Access Management identity to grant that access.

Aviatrix Controller on AWS for secure networking

These are some notes from a talk by Aviatrix last week. Many customers get started with Aviatrix orchestration system for deploying AWS Transit Gateway (TGW) and Direct Connect. The transit gateway is the hub gateway that connects multiple VPCs with an on-premise link, possibly over Direct Connect. The Aviatrix product can then deploy and manage multiple VPCs and the communication between them, directing which VPC can talk to which other VPC. It controls the communication by simply deleting the routes.

The advanced transit controller solution is useful for multiple regions, to manage the communication between regions. Another aspect is there are high speed interconnects between the cloud providers and Aviatrix builds an overlay that bridges between public clouds. Multi-account communication and secure communication between the networks using segmentation can be enabled.

According to Aviatrix, AWS’s motto is go build, and do it yourself, it is designed for the builders. But when you go beyond 3 VPCs to 3000 VPCs, one needs a solution to manage the routes in an automated manner. This is the situation for many larger customers. For smaller ones where there are Production, Development and Edge/On-premise network components to manage it also finds use.

Remote user VPN is another use case. Not only can one VPN in and get to all the VPCs, but specify which CIDR they can get to and other restrictions.

WebServices Composition with AWS

Some interesting diagrams on composition of a device data processing pipeline with AWS are at –http://aws-de-media.s3.amazonaws.com/images/jmetzner_Hackday_Berlin.pdf

The services listed are:
Amazon Cognito: Identity and Security. Gets token with role for API access by a certain user.
Amazon Kinesis: Massive data ingestion. Uses token auth, but token signing can be a easiest.
AWS Lambda: Serverless Data Compute. Supposed to save on EC2 instance costs ( at the expense of lock-in ).
Amazon S3: Virtually unlimited storage. This is what really makes AWS tick.
Amazon Redshift: Petabyte-scale data analysis
It does not say what data goes to S3 and what data goes to the database.
On Redshift, here’s a comment from Nokia:
http://www.cio.com/article/2860383/data-warehousing/7-amazon-redshift-success-stories.html#slide3” where their volume of data “literally broke the database”, prompting them to look for more scalable solutions.
There is a tension between “servers” and “services”, which goes back to IAAS vs PAAS distinction. PAAS can be faster to develop with reduced focus on server maintenance. However the number of PAAS concepts to deal with is neither small nor particularly inviting, as instead of a single server, one now has to deal with multiple services, each has to be authenticated, priced,  guarded for possible misuse and each has the potential for surprises. A key to simplicity is how composable the services are.