Reliable Kubernetes on a Raspberry Pi Cluster: Storage

Scott Jones · Published in CodeX · Jan 15, 2021


With great cluster capabilities come great storage requirements. But how do we cater for them? I wanted to keep ownership of my own data, so cloud storage was immediately out. A fancy NAS setup was out too, due to cost. That left me with a very limited number of options.

Part 1: Introduction
Part 2: The Foundations
Part 3: Storage
Part 4: Monitoring
Part 5: Security

Local Path Storage

K3s ships with local path storage capability. This means that a persistent volume is stored locally on the node where its pod is deployed, which can obviously lead to some unexpected results as pods get shifted around the cluster. To combat this, local volumes require a node affinity, so you can ensure the workload is always deployed to the same place as its data.
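
For comparison, here is what a dynamic claim against K3s's bundled local-path storage class might look like. This is a minimal sketch; the claim name and size are placeholders of my own:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-local-claim
spec:
  accessModes:
    - ReadWriteOnce
  # local-path is the storage class K3s ships with out of the box
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi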

NFS Storage

Perhaps the most widely used option in the RPi cluster world is NFS storage. This relies on an NFS server sitting somewhere on the network, serving up files. It gives you a centralized service responsible for interacting with the files, so all your nodes can talk to it and get the same results back no matter where they run.
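
If you already have an NFS server on your network, you can sanity-check a share from any node before pointing Kubernetes at it. A rough sketch, where the server address and export path are placeholders:

# nfs-common provides the NFS mount helper on Raspberry Pi OS / Debian
$ sudo apt install nfs-common
$ sudo mount -t nfs <NFS-SERVER-IP>:/exports /mnt
$ ls /mnt
$ sudo umount /mnt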

Longhorn et al.

There are other, more sophisticated cluster storage mechanisms, but at the time of writing they have very little support for the ARM architecture, especially for clusters mixing 32-bit and 64-bit ARM. I may look to upgrade to a more sophisticated mechanism at some point, but for now they are out of scope.

My Solution

I did not want my cluster to have any external dependencies, so relying on an external NFS server was out. But spinning one up as part of my cluster was definitely in. I decided to run an NFS server backed by local path storage. I had to pin it down to a specific node, but that was fine: the HDD I wanted to use was plugged into one Pi anyway. I created my nfs-server.yaml as below:

apiVersion: v1
kind: Namespace
metadata:
  name: storage
  labels:
    app: storage
---
apiVersion: v1
kind: PersistentVolume
metadata:
  # PersistentVolumes are cluster-scoped, so no namespace is needed
  name: local-pv
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: <<PATH-TO-SHARE>>
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: hdd
              operator: In
              values:
                - enabled
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-claim
  namespace: storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 500Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
  namespace: storage
  labels:
    app: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-server
  template:
    metadata:
      labels:
        app: nfs-server
      name: nfs-server
    spec:
      containers:
        - name: nfs-server
          image: itsthenetwork/nfs-server-alpine:11-arm
          env:
            - name: SHARED_DIRECTORY
              value: /exports
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /exports
              name: mypvc
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: local-claim
      nodeSelector:
        hdd: enabled
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-server
  namespace: storage
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  clusterIP: 10.43.184.230 # Optional, but pinning the ClusterIP guarantees a stable address for other PersistentVolumes to reference
  selector:
    app: nfs-server

Apply this the usual way:

$ sudo kubectl apply -f nfs-server.yaml

This won’t actually deploy it yet, though. Our node selector says it must be deployed on a node carrying the label hdd=enabled. So let's go ahead and tag the node which contains the path you wish to share:

$ sudo kubectl label node k3s-master hdd=enabled
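
You can confirm the label has stuck by filtering the node list on it:

$ sudo kubectl get nodes -l hdd=enabled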

Once that is done, go ahead and check that the NFS server has started up:

$ sudo kubectl get pods -n storage
A running NFS server
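
You can also confirm that the PersistentVolume and its claim bound correctly:

$ sudo kubectl get pv,pvc -n storage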

Once you have this up and running, you can reference the server in your other PersistentVolumes and use NFS for everything else across the cluster, meaning everything else can run anywhere. As an example, you could have something like the YAML below. (Spoiler: this is the exact storage we will be touching on next time!)

apiVersion: v1
kind: PersistentVolume
metadata:
  # Again, PersistentVolumes are cluster-scoped, so no namespace here
  name: grafana-nfs-volume
  labels:
    directory: grafana
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  nfs:
    # Path relative to the root of the exported share; create the directory before use
    path: /grafana
    server: 10.43.184.230
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-nfs-claim
  namespace: monitoring
spec:
  storageClassName: slow
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      directory: grafana
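
To verify that a claim like this binds and mounts as expected, you could spin up a throwaway pod against it. This is a sketch only; the pod name and busybox image are my own choices for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: nfs-test
  namespace: monitoring
spec:
  containers:
    - name: test
      image: busybox
      # List the share's contents, then idle so you can exec in for a closer look
      command: ["sh", "-c", "ls /data && sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: grafana-data
  volumes:
    - name: grafana-data
      persistentVolumeClaim:
        claimName: grafana-nfs-claim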

One thing of note: the setup above does not expose the NFS server outside the cluster. For me, that was a conscious decision, but you may have reasons to want to do this. To expose it, all you have to do is create a LoadBalancer service (like the one we saw last time) exposing all the relevant ports.
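
A minimal sketch of such a service is below; the name nfs-server-external is my own, and it assumes the load balancer from the previous part is handling LoadBalancer services:

kind: Service
apiVersion: v1
metadata:
  name: nfs-server-external
  namespace: storage
spec:
  type: LoadBalancer
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    app: nfs-server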

So there we have it! Local persistent storage, all contained within your cluster, all still owned by you, and with limited impact on reliability. Granted, we still have a single point of failure, but by pinning the NFS server to the single master node (itself already a single point of failure) we end up with one failure point instead of two. I'll see you next time, when we look into how to monitor our cluster effectively!
