Scaling Out

Scaling Out #

Scaling out, or scaling horizontally, describes the process of adding more nodes to a system in order to improve its performance. In a distributed-system simulation scenario, the aim of scaling out is to simulate and analyse a more realistic scenario with peers communicating over a physical network. Compared to a single-machine setup, this lowers the reliability and performance of the network connections and therefore comes closer to how components would behave in a real-world application.

The table below describes the hardware used to horizontally scale the decentralized learning system:

| Machine        | RAM (GB) | Architecture | Quantity |
|----------------|----------|--------------|----------|
| Raspberry Pi 4 | 8        | ARM64v8      | 2        |
| Jetson Nano    | 4        | ARM64v8      | 2        |
| ODROID N2      | 4        | ARM64v8      | 3        |
Small ARM devices become unresponsive very quickly. These devices might need to be restarted manually if resource usage exceeds their capacity.

Orchestration #

Orchestration tools automatically configure and deploy containers on multiple nodes within a system.

Docker Swarm #

Docker Swarm ships with the default Docker installation. After initializing a cluster on the master node with

docker swarm init

and joining from a worker node with

docker swarm join --token <token> <manager-ip>:2377

the application can already be scaled horizontally when working with a docker-compose file.
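The exact join command, including the token and the manager's address, is printed by docker swarm init; it can also be printed again on the master node:

docker swarm join-token worker

The stack can then be built, deployed, and scaled: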

docker-compose build && docker-compose push # This needs to be re-run for every code change!
docker stack deploy --compose-file docker-compose.yml tangle # To deploy the containers to the swarm
docker service scale tangle_peer=<number-of-peers>
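For reference, a minimal docker-compose.yml for this workflow might look as follows. The registry address and image name are placeholders, not the project's actual configuration; Swarm workers pull images from a registry, so each service needs both a build context and an image tag for docker-compose push to work:

version: "3.7"
services:
  peer:                                              # deployed as tangle_peer in the stack
    image: registry.example.com/tangle/peer:latest   # placeholder registry and image
    build: ./peer                                    # built on the master, pushed to the registry
    deploy:
      replicas: 1                                    # later adjusted via docker service scale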

Docker Swarm lacks two features required for the decentralized learning use case. The IPFS client requires multicast to communicate with other peers, which is not implemented by Docker Swarm’s networking overlay. The second feature becomes relevant when the cluster consists of multiple architectures: Docker Swarm always deploys the image matching the master's architecture. If a worker has a different architecture, the deployment fails even if the image could run on the target host (such as arm32 images on an arm64 host). Since some of our dependencies do not support arm64, the latter was especially important. Furthermore, since Swarm relies on Docker-Compose, the management processes become very slow once many large containers are running. This problem is especially prevalent when scaling many containers on a single machine, as in the scale-up scenario.

Kubernetes #

Kubernetes is another orchestration tool, initially developed by Google and now maintained by the Cloud Native Computing Foundation. YAML files are used to configure the resources to be deployed.

Compared to Docker Swarm, Kubernetes is more mature and can be controlled in a more fine-grained way; these advantages come with additional complexity. Unlike Docker Swarm, it already provides the two features required for the decentralized learning use case: multicast support on the network layer and per-worker image architectures, i.e. each worker pulls the image variant matching its own host machine. It also supports deploying images of compatible architectures to hosts (such as arm32 images on arm64 hosts), which allowed us to use arm32 images where dependencies were not available for arm64.
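As an illustration of this fine-grained control (a sketch, not necessarily the configuration used here), a deployment can be pinned to workers of a specific architecture via the well-known kubernetes.io/arch node label; the deployment and image names are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: peer                                         # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: peer
  template:
    metadata:
      labels:
        app: peer
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64                    # schedule only on arm64 workers
      containers:
        - name: peer
          image: registry.example.com/tangle/peer:latest   # placeholder image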

K3s #

There are many tools apart from the original Kubernetes distribution implementing the Kubernetes API. One of these tools is K3s, a lightweight Kubernetes distribution built for IoT and edge devices. It is much smaller than the original Kubernetes distribution (the binary is smaller than 40 MB) and is also optimized for ARM usage.

Installation #

Please refer to the official documentation for further details about the K3s setup process.

For installation on the master node run:

curl -sfL https://get.k3s.io | sh -

For installation on worker nodes run:

curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=<my-node-token> sh -
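The value for K3S_TOKEN is generated during the server installation and, according to the K3s documentation, can be read on the master node:

sudo cat /var/lib/rancher/k3s/server/node-token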

Kubernetes and K3s support only 110 pods per node in their default configuration. This number can be increased by passing the max-pods argument during installation. If more than 254 pods are planned on one node, the per-node subnet size must also be changed by passing the node-cidr-mask-size argument, because the default /24 mask per node only provides addresses for about 254 pods.

curl -sfL https://get.k3s.io | sh -s - --kubelet-arg="max-pods=2000" --kube-controller-manager-arg=node-cidr-mask-size=19
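Whether the new limit is active can be verified from the node's reported pod capacity; the node name is a placeholder:

kubectl get node <node-name> -o jsonpath='{.status.capacity.pods}'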

To check whether the cluster is properly connected, use this command:

kubectl cluster-info

Docker-Compose to Kubernetes configuration files #

Kompose translates a docker-compose file into Kubernetes configuration files. After installing it, a compose file can be converted into the respective Kubernetes resources:

kompose convert -f docker-compose.yml -o k8s/

The converted configurations can then be applied directly to the Kubernetes cluster using the apply command:

kubectl apply -f k8s/
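Afterwards, it can be checked whether the pods were scheduled across the cluster as expected; the -o wide flag also lists the node each pod runs on:

kubectl get pods -o wide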

Scaling #

To scale a deployment to multiple instances, use the scale command. For example, to scale the peer deployment up to three replicas:

kubectl scale --replicas=3 -f k8s/peer-deployment.yaml
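Equivalently, the deployment can be addressed by name; peer is assumed here to match the deployment generated from peer-deployment.yaml:

kubectl scale deployment peer --replicas=3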

Minikube #

Minikube helps to run a Kubernetes cluster on a single machine, on top of a VM. Since Minikube runs inside a VM, it comes with considerable overhead: it is less responsive and consumes more resources. Moreover, at scale Minikube became even less responsive and quickly turned unavailable. It is therefore not suited to a benchmarking use case, since it limits the ability to simulate larger decentralized systems with more peers.