I wanted to get a better understanding of firecracker microVM security, from the bottom up. A few questions I had in mind –
a) how does firecracker design achieve a smaller threat surface than a typical vm/container ?
b) what mechanisms are available to secure code running in a microvm ?
c) and lastly, how can microvms change security considerations when deploying code for web services ?
The following design elements contribute to a smaller threat surface:
- minimal design, in a memory safe, compact, readable rust language
- minimal guest virtual device model: a network device, a block I/O device, a timer, a KVM clock, a serial console, and a partial keyboard
- minimal networking; from docs/vsock.md : “The Firecracker vsock device aims to provide full virtio-vsock support to software running inside the guest VM, while bypassing vhost kernel code on the host. To that end, Firecracker implements the virtio-vsock device model, and mediates communication between AF_UNIX sockets (on the host end) and AF_VSOCK sockets (on the guest end).”
- static linking of the firecracker process limits dependancies
- seccomp BPF limits the system calls to 35 allowed calls, 30 with simple filtering, 5 with advanced filtering that limits the call based on parameters (SeccompFilter::new call in vmm/src/default_syscalls/filters.rs, seccomp/src/lib.rs)
The production security setup recommends using jailer to apply isolation based on cgroups, namespaces, seccomp. These techniques are typical of container isolation and act in addition to KVM based isolation.
The Host Security Configuration recommends a series of checks to mitigate side-channel issues for a multi-tenant system:
- Disable Simultaneous Multithreading (SMT)
- Check Kernel Page-Table Isolation (KPTI) support
- Disable Kernel Same-page Merging (KSM)
- Check for speculative branch prediction issue mitigation
- Apply L1 Terminal Fault (L1TF) mitigation
- Apply Speculative Store Bypass (SSBD) mitigation
- Use memory with Rowhammer mitigation support
- Disable swapping to disk or enable secure swap
How is the firecracker process organized ? The docs/design.md has the following descriptions:
Internal Design: Each Firecracker process encapsulates one and only one microVM. The process runs the following threads: API, VMM and vCPU(s). The API thread is responsible for Firecracker’s API server and associated control plane. It’s never in the fast path of the virtual machine. The VMM thread exposes the machine model, minimal legacy device model, microVM metadata service (MMDS) and VirtIO device emulated Net and Block devices, complete with I/O rate limiting. In addition to them, there are one or more vCPU threads (one per guest CPU core). They are created via KVM and run the `KVM_RUN` main loop. They execute synchronous I/OÂ and memory-mapped I/O operations on devices models.
Threat Containment: From a security perspective, all vCPU threads are considered to be running malicious code as soon as they have been started; these malicious threads needÂ to be contained. Containment is achieved by nesting several trust zones which increment from least trusted or least safe (guest vCPU threads) to most trusted or safest (host). These trusted zones are separated by barriers that enforce aspects of Firecracker security. For example, all outbound network traffic data is copied by the Firecracker I/O thread from the emulated network interface toÂ the backing host TAP device, and I/O rate limiting is applied at this point.
As to question about mechanisms to secure code running in firecracker, the serverless environment, AWS Lambda, and its security best practices are a good place to start . A few resources on this are here, here, here and here. AWS API gateway supports input validation, as described here. However, while serverless reduces the attack surface, the web threats such as OWASP still apply and must be taken into account during design and testing.
For the last question – uVMs and serverless appear to offer a promising model to build a service incrementally from small secure building blocks – and this is something to explore further.