robgibbon
on 18 August 2021
There are some great reasons to use nested virtualisation on compute clouds. Nested virtualisation is when you have Virtual Machines running within Virtual Machines. This approach offers a stronger isolation model than containers: by using a dedicated kernel and IO devices, bad behaviour cannot easily break out and spread. In fact, nested virtualisation is at the heart of Firecracker (which is the foundation for AWS Lambda) and the Kata Containers project. And recent work on streamlining device drivers, kernel footprint and VM image size have meant that the performance and startup times are pretty amazing. But surely the main reason to use nested virtualisation on the cloud, is just because you can!
There are some limitations, though. The biggest limitation is that, while GCP and Azure offer varying levels of support for nested virtualisation, AWS only offers support for it on bare metal instances. There are a couple of solutions that claim to work around that limitation, like the Xen hypervisor running in paravirtualisation mode, which has been extended to minimize instances of kernel traps with the XenBlanket and Xen-Blanket-NG patchsets. But for this series of how-to articles, we’ll run nested virtualisation with the free and open source LXD instead. LXD can run nested, fully virtualised instances on GCP and Azure, and fall back to Linux Containers on AWS and any other clouds that don’t support it yet.
Let’s get started!
Spark up: building a Spark cluster
In this series of blog posts, we’ll build an Apache Spark cluster, running on MicroK8s on Ubuntu Core on the cloud to demonstrate the versatility of this approach. First, we will build up a basic solution locally. Then, in Part 2, we will have a go on the cloud. In Part 3, we’ll put Apache Spark on top. Finally, in Part 4, we’ll build a fully distributed MicroK8s compute cluster.
Ubuntu Core is a nifty new operating system that’s built from first principles with zero trust security in mind. Running MicroK8s on Ubuntu Core obviously brings those advantages of robust computing foundations to Kubernetes. And with MicroK8s, it’s quite literally a snap to set up a K8s cluster.
Apache Spark is the data engineers’ tool of choice for data lifting, transforming, loading, analysing and so much more. Folks are even busy hooking it up to data visualization and exploration tools like Apache SuperSet. By the end of this four-part series, we’ll have a working Spark setup that we can use for interactive data science.
But before we build our Spark cluster, we need to build up the foundational services for the solution. We need Ubuntu Core.
Core to the fore: introducing Ubuntu Core
Ubuntu Core is Canonical’s free and open source next-generation operating system based on snapd. In Ubuntu Core, everything is a snap. Even the kernel is a snap! Ubuntu Core uses snapd’s granular, AppArmor-driven and enforced sandbox permissions system, called confinement. Only snaps that implement the strict confinement model can be installed on Ubuntu Core. That means it’s harder for a rogue process to break out from the sandbox and do bad things across your system and beyond. We’ll be using Ubuntu Core for this project.
Let’s launch an Ubuntu Core VM on our local system using Multipass, install LXD on it, and then blast MicroK8s over the top. We’ll do some configuration tweaking along the way to make it all sing and dance nicely. Use the following commands:
sudo snap install multipass
multipass launch -m 12G -d 60G -n ubu-core core18
multipass shell ubu-core
sudo snap install lxd
That was quite easy! The VM might update itself and reboot a few times, but don’t worry about that. If a package installation is in progress while the reboot takes place, the installation will be rolled back and then resumed.
Lucy in the X with Diamonds: LXD and Linux Containers
LXD is a sophisticated, next-generation cloud management and orchestration engine with hypervisor and Linux Container engine support built in. With LXD, you can run Linux Container workloads, Linux VMs, or other operating systems like FreeBSD or Microsoft Windows if you prefer. Note that, for this first demo, we’ll use Linux Containers, but when we do this on the cloud, we’ll use nested virtualisation to benefit from the enhanced security of a dedicated kernel and virtual IO device sandboxing. It’s all possible, let’s just roll with it.
Let’s set up LXD to run MicroK8s. MicroK8s is the awesome new easy-peasy, lemon squeezy way to deploy Kubernetes. It can run standalone on a workstation, or banded together as a highly available cluster, and it’s a piece of pie to set up, even in HA mode. Oh and it’s also free open source software. For this post, we’re going to run it standalone, but you can learn more about MicroK8s on its website, including how to set it up as a full-on distributed compute cluster.
First, we’ll initialize our LXD environment so that we provision 40GB of space on a ZFS filesystem, mounted as a loopback to a file, with automatic IP addressing and bridged networking. Of course, this is a demo; when we do this for real on the cloud, we’ll allocate a dedicated block device rather than a file, for better performance. Use the following commands:
cat > lxd-init.yaml <<EOF
config: {}
networks:
- config:
ipv4.address: auto
ipv6.address: auto
description: ""
name: lxdbr0
type: ""
project: default
storage_pools:
- config:
size: 40GB
description: ""
name: default
driver: zfs
profiles:
- config: {}
description: ""
devices:
eth0:
name: eth0
network: lxdbr0
type: nic
root:
path: /
pool: default
type: disk
name: default
projects: []
cluster: null
EOF
cat lxd-init.yaml | sudo lxd init --preseed
rm lxd-init.yaml
Now for the next step. We’ll make a Linux Container profile especially tuned for MicroK8s:
sudo lxc profile create microk8s
cat > microk8s.profile <<EOF
config:
boot.autostart: "true"
linux.kernel_modules: ip_vs,ip_vs_rr,ip_vs_wrr,ip_vs_sh,ip_tables,ip6_tables,netlink_diag,nf_nat,overlay,br_netfilter,nf_conntrack_ipv4
raw.lxc: |
lxc.apparmor.profile=unconfined
lxc.mount.auto=proc:rw sys:rw cgroup:rw
lxc.cgroup.devices.allow=a
lxc.cap.drop=
security.nesting: "true"
security.privileged: "true"
security.syscalls.intercept.bpf: "true"
security.syscalls.intercept.bpf.devices: "true"
security.syscalls.intercept.mknod: "true"
security.syscalls.intercept.setxattr: "true"
description: ""
devices:
aadisable:
path: /sys/module/nf_conntrack/parameters/hashsize
source: /sys/module/nf_conntrack/parameters/hashsize
type: disk
aadisable1:
path: /sys/module/apparmor/parameters/enabled
source: /dev/null
type: disk
aadisable2:
path: /dev/zfs
source: /dev/zfs
type: disk
aadisable3:
path: /dev/kmsg
source: /dev/kmsg
type: disk
aadisable4:
path: /sys/fs/bpf
source: /sys/fs/bpf
type: disk
name: microk8s
used_by: []
EOF
cat microk8s.profile | sudo lxc profile edit microk8s
rm microk8s.profile
The next step is to fire up a Linux container and install the MicroK8s snap in it. After that, we’ll do some tweaks to get it running smoothly:
sudo lxc launch -p default -p microk8s ubuntu:20.04 microk8s
sleep 10
sudo lxc exec microk8s -- sudo snap install microk8s --classic
sudo lxc shell microk8s
cat > /etc/rc.local <<EOF
#!/bin/bash
apparmor_parser --replace /var/lib/snapd/apparmor/profiles/snap.microk8s.*
exit 0
EOF
chmod +x /etc/rc.local
systemctl restart rc-local
echo 'L /dev/kmsg - - - - /dev/null' > /etc/tmpfiles.d/kmsg.conf
echo '--conntrack-max-per-core=0' >> /var/snap/microk8s/current/args/kube-proxy
exit
sudo lxc restart microk8s
sleep 15
sudo lxc exec microk8s -- sudo swapoff -a
Drop a bot: testing with Microbot
Alright – we’re all set to deploy some stuff on Kubernetes! Let’s drop a Microbot on there, and make sure that we can see it from outside of the LXD environment:
sudo lxc exec microk8s -- sudo microk8s.kubectl create deployment microbot --image=dontrebootme/microbot:v1
sudo lxc exec microk8s -- sudo microk8s.kubectl expose deployment microbot --type=NodePort --name=microbot-service --port=80
MICROBOT_PORT=$(sudo lxc exec microk8s -- sudo microk8s.kubectl get all | grep service/microbot | awk '{ print $5 }' | awk -F':' '{ print $2 }' | awk -F'/' '{ print $1 }')
UK8S_IP=$(sudo lxc list microk8s | grep microk8s | awk -F'|' '{ print $4 }' | awk -F' ' '{ print $1 }')
sudo snap install curl
curl http://$UK8S_IP:$MICROBOT_PORT/
If you’ve seen some microbot HTML in your terminal, then you’re all good – it’s all working as planned. If not, you might need to check back and review the previous steps carefully to get it running. It’ll be even better when this thing is running on the cloud though!
We’ll get to that in Part 2 of this blog post. Stay tuned…