It’s been a while. Here’s some Kubernetes storage chat (1/?)
So, it’s been a busy time, and I haven’t gotten to the stage where I reflexively reach for this blog whenever I’m short of a diversion. Still, on we go.
For some time I’ve been getting frustrated with volume management, which, as anyone who’s tried bare-metal Kubernetes hosting knows, is a bit of a pain point.
K3s by default ships with local path storage provisioning, which effectively just points each volume at a folder on one of the hosts (there’s a quick sketch of what that looks like after this list). Cons of this method:
- backups are harder and have to be filesystem level, not volume level
- no redundancy, so each volume is locked to a single node
- not a lot of management options - WYSIWYG
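To make that concrete, here’s a minimal sketch of what using the default provisioner looks like. The PVC name and size are made up for illustration; `local-path` is the StorageClass K3s ships with, and on a stock install the backing directory lives under `/var/lib/rancher/k3s/storage` on whichever node the pod lands on (check your own config if you’ve changed it).

```shell
# K3s registers a default StorageClass backed by the local-path provisioner
kubectl get storageclass

# A claim against it ends up as a plain directory on one node's disk
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data              # hypothetical name, purely for illustration
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path # the default class K3s ships with
  resources:
    requests:
      storage: 1Gi
EOF
```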
Introducing Longhorn, which implements the Container Storage Interface (CSI - the standard interface that storage drivers use to provide volumes in K8s) to offer block storage for volumes. Now, under the hood it’s still “just” files sitting on the nodes’ own filesystems, but it comes with bells and whistles like (there’s an install sketch after this list):
- replication of volumes!
- backups!
- monitoring dashboard, I guess that’s nice :shrug:
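For reference, getting it in is roughly this (a sketch, not gospel): the Helm repo URL is the official one, `longhorn-system` is just the conventional namespace, and the `defaultReplicaCount` value key is how I remember the chart spelling it, so double-check against `helm show values` for your chart version.

```shell
# Install Longhorn from its Helm chart
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.defaultReplicaCount=2   # keep a replica of each volume on 2 nodes

# Workloads then just ask for the "longhorn" StorageClass instead of "local-path",
# and the dashboard is reachable via the frontend service
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
```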
So this is all very nice. However, if you have cheap, I mean, cost-effective hardware, this can present some problems, in that there’s now a different point of failure. Issues include:
- my node broke and so did Longhorn, so everything else broke
- Longhorn broke so everything else broke
- backups broke my database workloads with locks
Nothing’s perfect, I get it. These problems were, however, magnified by having a teeny tiny Postgres database per application. It’s convenient in that I don’t really have to manage the databases, but inconvenient in that it’s that many more moving parts.
Well, I woke up to broken things one too many times (looking at you, Nextcloud) and finally got tired of it. I tried the Bitnami PostgreSQL High Availability chart and found that, while it has some flaws - a variety of open GitHub issues aside, my main concern is not hardcoding credentials in git - it worked pretty well for my needs.
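For the credentials bit, the approach that made sense to me is creating the secret out-of-band (or via something like sealed-secrets or SOPS) and pointing the chart at it, rather than putting passwords in a values file that gets committed. A hedged sketch follows: the secret name is made up, and the `existingSecret` parameter and the key names inside the secret are as I recall them from the chart docs, so verify with `helm show values bitnami/postgresql-ha` for your chart version.

```shell
# Generate credentials outside of git and store them as a Secret
# (key names are assumptions - check what your chart version expects)
kubectl create secret generic postgres-ha-creds \
  --from-literal=postgresql-password="$(openssl rand -base64 24)" \
  --from-literal=repmgr-password="$(openssl rand -base64 24)"

# Point the chart at the existing Secret instead of hardcoding values
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install postgres bitnami/postgresql-ha \
  --set postgresql.existingSecret=postgres-ha-creds
```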
And that’s how I found myself taking PostgreSQL backups from each of the individual instances and restoring them into a shiny new Postgres cluster, usually breaking something multiple times before getting it right, in my spare moments over my Easter weekend.
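In case future me wants the recipe, each migration was roughly shaped like the following. All the names are hypothetical (adjust pod, service, user and database names to whatever your releases are called); the general idea is a `pg_dump` out of the old single-instance database and a `pg_restore` into the new cluster via pgpool.

```shell
OLD_POD=nextcloud-db-0                       # hypothetical: one of the old per-app Postgres pods
PGPOOL_SVC=postgres-postgresql-ha-pgpool     # hypothetical: service in front of the new HA cluster

# Dump one app's database out of its old single-instance Postgres
kubectl exec "$OLD_POD" -- pg_dump -U nextcloud -Fc nextcloud > nextcloud.dump

# Reach the new cluster from the workstation and restore into it
kubectl port-forward "svc/$PGPOOL_SVC" 5432:5432 &
createdb -h localhost -U postgres nextcloud
pg_restore -h localhost -U postgres -d nextcloud --no-owner nextcloud.dump
```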
Part 2 (coming soon) will be about how I caved and mounted NFS shares instead of using PVCs for any data that I actually cared about keeping.
A Happy Easter, Chag Sameach, and Ramadan Mubarak to all.