Category: containers

DevSecOps – Securing the Software Supply Chain

A position paper from the CNCF on securing the software supply chain argues for hardening the software construction process by securing each link in the production chain –

https://github.com/cncf/tag-security/blob/main/supply-chain-security/supply-chain-security-paper/CNCF_SSCP_v1.pdf

Quote – “To operationalize these principles in a secure software factory several stages are needed. The software factory must ensure that internal, first party source code repositories and the entities associated with them are protected and secured through commit signing, vulnerability scanning, contribution rules, and policy enforcement. Then it must critically examine all ingested second and third party materials, verify their contents, scan them for security issues, evaluate material trustworthiness, and material immutability. The validated materials should then be stored in a secure, internal repository from which all dependencies in the build process will be drawn. To further harden these materials for high assurance systems it is suggested they should be built directly from source.

Additionally, the build pipeline itself must be secured, requiring the “separation of concerns” between individual build steps and workers, each of which are concerned with a separate stage in the build process. Build Workers should consider hardened inputs, validation, and reproducibility at each build. Finally, the artifacts produced by the supply chain must be accompanied by signed metadata which attests to their contents and can be verified independently, as well as revalidated at consumption and deployment.”

The issue is that software development is a highly collaborative process. Walking down the chain and ensuring the ingested software packages are bug-free is where it gets challenging.

The Department of Defense Enterprise DevSecOps Reference Design speaks to securing the build pipeline –

https://dodcio.defense.gov/Portals/0/Documents/DoD%20Enterprise%20DevSecOps%20Reference%20Design%20v1.0_Public%20Release.pdf?ver=2019-09-26-115824-583

The DoD Container Hardening Guide referenced in the CNCF doc is at –

https://software.af.mil/wp-content/uploads/2020/10/Final-DevSecOps-Enterprise-Container-Hardening-Guide-1.1-Public-Release.pdf

which has a visual Iron Bank flow diagram on p. 20.

Kubernetes security

Kubernetes is a Platform-as-a-Service (PaaS) built on top of (Docker) containers, with an additional unit of abstraction called a pod, which a) is its smallest unit of execution, b) has a single IP address, c) is a group of one or more containers, where d) the containers in the group share a network namespace and e) each pod is isolated from other pods by its network namespace. Within a pod, the containers can reach each other on different ports over the loopback interface, while within a node, different pods see each other as different IP addresses. The control plane is built on top of etcd, a consistent, distributed, highly available key-value store, which is an independent open-source CNCF project.

It is conceptually similar to Cloud Foundry, Mesos, OpenStack, and similar abstraction layers.

A threat matrix for Kubernetes by MS – https://www.microsoft.com/security/blog/2020/04/02/attack-matrix-kubernetes/

From RSA’20, here’s a talk on ‘The future of Kubernetes attacks’ – https://youtu.be/CH7S5rE3j8w

From Coinbase, a blog on ‘Why Kubernetes is not part of our stack’ – https://blog.coinbase.com/container-technologies-at-coinbase-d4ae118dcb6c – makes these points:

  • it needs a full-time compute team to maintain
  • securing it is neither trivial nor well understood; SPIFFE, SPIRE, Envoy, Istio, OPA, and service meshes are a few of the technologies involved.

This blog links to – https://k8s.af/

Another viewpoint – https://pythonspeed.com/articles/dont-need-kubernetes/

A counterpoint to the Coinbase blog – https://blog.kumina.nl/2020/07/in-response-to-container-technologies-at-coinbase/

Scratch notes:

K8S is based on a Controller pattern:

  • Resources capture the desired state
  • Current state is kept centralized in etcd, a distributed key-value store, similar to Consul
  • Controllers reconcile current state with desired state
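
To make the pattern concrete – a sketch, with a hypothetical Deployment named web – the desired state lives in .spec and the observed state in .status; a client only ever edits the desired state, and the controller converges:

kubectl get deployment web -o jsonpath='{.spec.replicas}'        # desired replica count
kubectl get deployment web -o jsonpath='{.status.readyReplicas}' # currently ready replicas
kubectl scale deployment web --replicas=5                        # change desired state; the controller reconciles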

A Pod is a top-level resource: the smallest deployment unit, a group of one or more containers described by a YAML file, similar to docker-compose.yml.
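
As a sketch (names made up), a minimal two-container pod showing the shared network namespace – the sidecar reaches the web server over loopback:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: web
    image: nginx
  - name: sidecar
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
EOF

kubectl exec demo-pod -c sidecar -- wget -qO- http://localhost:80   # both containers share one pod IP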

A K8S Operator is a kind of resource manager for custom resources.

https://blog.frankel.ch/your-own-kubernetes-controller/1/

https://pushbuildtestdeploy.com/when-do-kubernetes-operators-make-sense

Spinnaker is a Continuous Delivery platform that itself runs on k8s as a set of pods, which can be scaled up.

A kubectl cheat sheet:

https://kubernetes.io/docs/reference/kubectl/cheatsheet

An article on cloud security – https://medium.com/xm-cyber/having-fun-with-cloud-services-e281f8a7fe60 – which I think illustrates why things are relatively complex to begin with.

One comes across the terms Helm and Helm charts. Helm is a way to package a complex k8s application; it adds a layer of indirection to an app – https://stepan.wtf/to-helm-or-not/.
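
For a feel of that indirection, a typical Helm workflow looks roughly like this (repo, chart, and release names illustrative):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-nginx bitnami/nginx      # render the chart's templates and apply them
helm get manifest my-nginx               # the raw k8s YAML the chart actually produced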

A repo to list failing pods – https://github.com/edrevo/suspicious-pods

Exploring networking in k8s – https://dustinspecker.com/posts/how-do-kubernetes-and-docker-create-ip-addresses/

Plugin for Pod networking on EKS using ENIs – https://github.com/aws/amazon-vpc-cni-k8s

Hardening EKS with IAM, RBAC – https://snyk.io/blog/hardening-aws-eks-security-rbac-secure-imds-audit-logging/

EKS Authentication with IAM – how does it work?

IAM is only used for authentication of valid IAM entities. All permissions for interacting with the EKS Kubernetes API are managed through the native Kubernetes RBAC system.

AWS IAM Authenticator for EKS is a component that enables access to EKS via IAM, for provisioning, managing, and updating the cluster. It runs on the EKS control plane – https://github.com/kubernetes-sigs/aws-iam-authenticator#aws-iam-authenticator-for-kubernetes
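
The client side of this flow is visible in the kubeconfig the AWS CLI writes: kubectl invokes the CLI to fetch a short-lived, pre-signed STS token, which the authenticator on the control plane verifies against IAM. A sketch (cluster name and region illustrative):

aws eks update-kubeconfig --name my-cluster --region us-west-2   # writes an exec-plugin entry into kubeconfig
aws eks get-token --cluster-name my-cluster                      # the token kubectl presents to the API server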

A k8s ConfigMap is used to store non-confidential data in key-value pairs.

The above authenticator gets its configuration from the aws-auth ConfigMap, which can be managed via eksctl (recommended) or edited directly.
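
For example, to inspect the ConfigMap and to add an IAM role mapping with eksctl (ARN, cluster, and user names illustrative):

kubectl describe configmap aws-auth -n kube-system

eksctl create iamidentitymapping --cluster my-cluster \
  --arn arn:aws:iam::111122223333:role/eks-admin \
  --group system:masters --username admin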

A Kubernetes service account provides an identity for processes that run in a pod. For more information see Managing Service Accounts in the Kubernetes documentation. If your pod needs access to AWS services, you can map the service account to an AWS Identity and Access Management identity to grant that access.
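
On EKS this mapping is IAM Roles for Service Accounts (IRSA); eksctl can create the IAM role and annotate the service account in one step (cluster, namespace, and policy here are illustrative, and the cluster needs an associated OIDC provider):

eksctl create iamserviceaccount --cluster my-cluster \
  --namespace default --name s3-reader \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve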

EKS Networking – https://aws.amazon.com/blogs/containers/de-mystifying-cluster-networking-for-amazon-eks-worker-nodes/

https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html

https://docs.aws.amazon.com/vpc/latest/reachability/getting-started-cli.html

https://docs.aws.amazon.com/eks/latest/userguide/dashboard-tutorial.html

Firecracker MicroVM Security

I wanted to get a better understanding of firecracker microVM security, from the bottom up. A few questions –

a) how does the firecracker design achieve a smaller threat surface than a typical VM or container?

b) what mechanisms are available to secure code running in a microVM?

c) and lastly, how can microVMs change security considerations when deploying code for web services?

The following design elements contribute to a smaller threat surface:

  • minimal design, in Rust, a memory-safe language; the code is compact and readable
  • minimal guest virtual device model: a network device, a block I/O device, a timer, a KVM clock, a serial console, and a partial keyboard
  • minimal networking; from docs/vsock.md: “The Firecracker vsock device aims to provide full virtio-vsock support to software running inside the guest VM, while bypassing vhost kernel code on the host. To that end, Firecracker implements the virtio-vsock device model, and mediates communication between AF_UNIX sockets (on the host end) and AF_VSOCK sockets (on the guest end).”
  • static linking of the firecracker process limits dependencies
  • seccomp BPF limits the process to 35 allowed system calls: 30 with simple filtering, and 5 with advanced filtering that restricts a call based on its parameters (SeccompFilter::new in vmm/src/default_syscalls/filters.rs and seccomp/src/lib.rs)

The production security setup recommends using the jailer to apply isolation based on cgroups, namespaces, and seccomp. These techniques are typical of container isolation and act in addition to KVM-based isolation.
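
A sketch of a jailer invocation (IDs, uid/gid, and paths illustrative; the exact flag set varies by Firecracker release, so check the jailer docs for your version):

jailer --id my-microvm \
       --exec-file /usr/bin/firecracker \
       --uid 1000 --gid 1000 \
       --chroot-base-dir /srv/jailer \
       -- --api-sock /run/firecracker.socket   # arguments after -- go to firecracker itself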

The Firecracker Host Security Configuration recommends a series of checks to mitigate side-channel issues for a multi-tenant system:

  • Disable Simultaneous Multithreading (SMT)
  • Check Kernel Page-Table Isolation (KPTI) support
  • Disable Kernel Same-page Merging (KSM)
  • Check for speculative branch prediction issue mitigation
  • Apply L1 Terminal Fault (L1TF) mitigation
  • Apply Speculative Store Bypass (SSBD) mitigation
  • Use memory with Rowhammer mitigation support
  • Disable swapping to disk or enable secure swap
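
Several of these can be checked or applied directly from sysfs (paths as on recent Linux kernels; the Firecracker host setup docs are the authoritative list):

cat /sys/devices/system/cpu/smt/control            # 'off' when SMT is disabled
echo off > /sys/devices/system/cpu/smt/control     # disable SMT (as root)
grep . /sys/devices/system/cpu/vulnerabilities/*   # KPTI, L1TF, SSBD, Spectre mitigation status
echo 0 > /sys/kernel/mm/ksm/run                    # disable kernel same-page merging
swapoff -a                                         # disable swapping to disk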

How is the firecracker process organized? docs/design.md has the following descriptions:

Internal Design: Each Firecracker process encapsulates one and only one microVM. The process runs the following threads: API, VMM and vCPU(s). The API thread is responsible for Firecracker’s API server and associated control plane. It’s never in the fast path of the virtual machine. The VMM thread exposes the machine model, minimal legacy device model, microVM metadata service (MMDS) and VirtIO device emulated Net and Block devices, complete with I/O rate limiting. In addition to them, there are one or more vCPU threads (one per guest CPU core). They are created via KVM and run the `KVM_RUN` main loop. They execute synchronous I/O and memory-mapped I/O operations on device models.

Threat Containment: From a security perspective, all vCPU threads are considered to be running malicious code as soon as they have been started; these malicious threads need to be contained. Containment is achieved by nesting several trust zones which increment from least trusted or least safe (guest vCPU threads) to most trusted or safest (host). These trusted zones are separated by barriers that enforce aspects of Firecracker security. For example, all outbound network traffic data is copied by the Firecracker I/O thread from the emulated network interface to the backing host TAP device, and I/O rate limiting is applied at this point.
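
The host end of that barrier is an ordinary TAP device, created before the microVM boots. A typical host-side setup (device name and addresses illustrative):

ip tuntap add dev tap0 mode tap    # TAP device backing the guest's virtio-net interface
ip addr add 172.16.0.1/24 dev tap0
ip link set dev tap0 up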

What about mechanisms to secure the code running inside firecracker? The serverless environment, AWS Lambda, and its security best practices are a place to start. Resources on these are here, here, here and here. AWS API Gateway supports input validation, as described here. While serverless reduces the attack surface, web threats such as those in the OWASP Top 10 still apply and must be taken into account during design and testing.

For the last question – microVMs and serverless appear to offer a promising model for building a service incrementally from small, secure building blocks – and this is something to explore further.

Linux Bridges and Firewalls for Containers

Hubs and Repeaters work on bits at Layer 1. Bridges and Switches work on frames, the protocol data unit for Layer 2. A Hub supports a star configuration – like a splitter, it simply forwards traffic received on one segment to all the others. A Repeater amplifies the signal for longer distances. A Bridge is a smarter type of Hub, usually implemented in software, which looks at the destination MAC address of each frame and forwards it only to the segment with a MAC matching that destination, thereby avoiding unnecessary traffic on all the other segments. It does not look at IP addresses. A Switch is a reincarnation of a Bridge in hardware, with several ports. The basic operation of a Bridge/Switch is L2 forwarding, which includes ARL (Address Resolution Logic) table learning and ageing. An L3 Router, by contrast, forwards packets based on the IP address instead of the frame's MAC address.

A Linux bridge from a VM to the host does not need an IP address, but assigning one is a good idea if you want to talk to the host for management or set firewall rules on it.

To create a bridge br0 between two interfaces eth0 and eth1:

brctl addbr br0
brctl stp br0 on
brctl addif br0 eth0
brctl addif br0 eth1

Since a bridge is an L2 construct, how can one filter IP addresses with it? A quote: “An Ethernet interface cannot usefully have an IP address if it is also attached to a bridge. However it is possible to achieve the same effect by binding an address to the bridge itself:”

ifconfig br0 192.168.0.1
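
The iproute2 equivalent, with an explicit prefix length:

ip addr add 192.168.0.1/24 dev br0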

An example of firewall configuration on a bridge.
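
As a generic sketch of the idea (addresses illustrative): ebtables filters on the bridge at L2, while iptables can match bridged traffic through the physdev module – noting that bridged IPv4 traffic only reaches iptables when br_netfilter (bridge-nf-call-iptables) is enabled:

ebtables -A FORWARD -p IPv4 --ip-src 192.168.0.66 -j DROP        # drop bridged frames from one IP
iptables -A FORWARD -m physdev --physdev-in eth0 --physdev-is-bridged -j ACCEPT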

Troubleshooting bridges:

brctl showmacs br0

Docker creates a few bridges by default. A primer on Docker container networking and the bridges it creates is here and here. Here are three common network modes:

bridge: the docker0 bridge. This is the default, allowing all containers attached to it to talk to each other.

host: the container shares the host's network stack, without any isolation.

none: the container gets only a loopback interface, with no external networking.
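
To see these on a host:

docker network ls               # the built-in bridge, host, and none networks
docker network inspect bridge   # docker0's subnet and the containers attached to it
ip addr show docker0            # the bridge as the host sees it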

Some problems with host iptables rules reveal themselves when working with Docker.

A look at firewalld, ebtables, ipset, and nftables: https://www.slideshare.net/albertspijkers/red-hat-enterprise-linux-next-generation-firewall. The command line tools talk to the kernel over a netlink socket.

A more extensible approach is Berkeley Packet Filter (BPF) – the classical form is now called cBPF, its extended successor eBPF. A very basic problem is filtering HTTP or DNS packets based on the destination hostname, rather than the IP address. This is discussed in an example by Cloudflare here.
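
As a small illustration (interface name assumed), tcpdump compiles its filter expression into cBPF: matching DNS by port is easy, but matching on a hostname means parsing the packet payload, which is where eBPF earns its keep:

tcpdump -i eth0 -nn 'udp port 53'          # filter DNS by port with classic BPF
tcpdump -i eth0 'udp port 53' -d | head    # the compiled BPF instructions behind that expression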