It’s been a while. Here’s some Kubernetes storage chat (1/?)
So, it’s been a busy time, and I haven’t gotten to the stage where I reflexively reach for this blog whenever I’m short of a diversion. Still, on we go.
For some time I’ve been getting frustrated with volume management, which, as anyone who’s tried bare-metal Kubernetes hosting knows, is a bit of a pain point.
K3s by default ships with local path storage provisioning, which effectively just points each volume at a folder on one of the hosts (there’s a quick sketch of what that looks like after this list). Cons of this method:
- backups are harder and have to be filesystem level, not volume level
- no redundancy, so each volume is locked to a single node
- not a lot of management options - WYSIWYG
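To make that concrete, here’s a minimal sketch of what using the default provisioner looks like. The PVC name and size are made up for illustration; `local-path` is the StorageClass K3s ships with, and on a stock install the backing directory lives under `/var/lib/rancher/k3s/storage` on whichever node the pod lands on (check your own config if you’ve changed it).

```shell
# K3s registers a default StorageClass backed by the local-path provisioner
kubectl get storageclass

# A claim against it ends up as a plain directory on one node's disk
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data              # hypothetical name, purely for illustration
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path # the default class K3s ships with
  resources:
    requests:
      storage: 1Gi
EOF
```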
Introducing Longhorn, which implements the Container Storage Interface (CSI - the standard interface that storage drivers use to provide volumes in K8s) to offer block storage for volumes. Now, under the hood it’s still “just” files sitting on the nodes’ own filesystems, but it comes with bells and whistles like (there’s an install sketch after this list):
- replication of volumes!
- backups!
- monitoring dashboard, I guess that’s nice :shrug:
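For reference, getting it in is roughly this (a sketch, not gospel): the Helm repo URL is the official one, `longhorn-system` is just the conventional namespace, and the `defaultReplicaCount` value key is how I remember the chart spelling it, so double-check against `helm show values` for your chart version.

```shell
# Install Longhorn from its Helm chart
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.defaultReplicaCount=2   # keep a replica of each volume on 2 nodes

# Workloads then just ask for the "longhorn" StorageClass instead of "local-path",
# and the dashboard is reachable via the frontend service
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
```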
So this is all very nice. However, if you have cheap, I mean, cost-effective hardware, this can present some problems, in that there’s now a different point of failure. Issues include:
- my node broke and so did Longhorn, so everything else broke
- Longhorn broke so everything else broke
- backups broke my database workloads with locks
Nothing’s perfect, I get it. These problems were, however, magnified by having a teeny tiny Postgres database per application. It’s convenient in that I don’t really have to manage the databases, but inconvenient in that it’s that many more moving parts.
Well, I woke up to broken things one too many times (looking at you, Nextcloud) and finally got tired of it. I tried the Bitnami PostgreSQL High Availability chart and found that, while it has some flaws - a variety of open GitHub issues aside, my main concern is not hardcoding credentials in git - it worked pretty well for my needs.
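For the credentials bit, the approach that made sense to me is creating the secret out-of-band (or via something like sealed-secrets or SOPS) and pointing the chart at it, rather than putting passwords in a values file that gets committed. A hedged sketch follows: the secret name is made up, and the `existingSecret` parameter and the key names inside the secret are as I recall them from the chart docs, so verify with `helm show values bitnami/postgresql-ha` for your chart version.

```shell
# Generate credentials outside of git and store them as a Secret
# (key names are assumptions - check what your chart version expects)
kubectl create secret generic postgres-ha-creds \
  --from-literal=postgresql-password="$(openssl rand -base64 24)" \
  --from-literal=repmgr-password="$(openssl rand -base64 24)"

# Point the chart at the existing Secret instead of hardcoding values
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install postgres bitnami/postgresql-ha \
  --set postgresql.existingSecret=postgres-ha-creds
```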
And that’s how I found myself taking PostgreSQL backups from each of the individual instances and restoring them into a shiny new Postgres cluster, usually breaking something multiple times before getting it right, in my spare moments over my Easter weekend.
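In case future me wants the recipe, each migration was roughly shaped like the following. All the names are hypothetical (adjust pod, service, user and database names to whatever your releases are called); the general idea is a `pg_dump` out of the old single-instance database and a `pg_restore` into the new cluster via pgpool.

```shell
OLD_POD=nextcloud-db-0                       # hypothetical: one of the old per-app Postgres pods
PGPOOL_SVC=postgres-postgresql-ha-pgpool     # hypothetical: service in front of the new HA cluster

# Dump one app's database out of its old single-instance Postgres
kubectl exec "$OLD_POD" -- pg_dump -U nextcloud -Fc nextcloud > nextcloud.dump

# Reach the new cluster from the workstation and restore into it
kubectl port-forward "svc/$PGPOOL_SVC" 5432:5432 &
createdb -h localhost -U postgres nextcloud
pg_restore -h localhost -U postgres -d nextcloud --no-owner nextcloud.dump
```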
Part 2 (coming soon) will be about how I caved and mounted NFS shares instead of using PVCs for any data that I actually cared about keeping.
A Happy Easter, Chag Sameach, and Ramadan Mubarak to all.