Built fast, not right
The backend architecture of siev is not something I really want to carry into the future - it’s cobbled together and spread across 4 different machines to try and keep costs down while staying focused on building something interesting.
One corner of this stack is a very old server that I'd had running for about a decade. It's nothing game-changing, but it's been cheaper than running systems in AWS for the majority of the time that I've had it. Some time in the past 2-3 years it's probably become slower than an equivalent system in the cloud, but the time-cost of actually switching over hasn't made sense.
As a result, this box is a place where I've put anything that needs to be long-running and internet accessible - in particular, it's what I've been using to pull podcast data for the past 2 years, and it's the box I query for new shows that are available for processing and analysis.
In an ideal world this system would not be a separate code base from what I’m building now, but again, that takes time and I’m here to ship.
The downside is that a physical box like this requires some maintenance beyond what the cloud provides: disk space would slowly get eaten up on this machine, and then a nightly backup service would stop working. So every 6 months or so I need to go find some files to delete, free up some space, and let the system keep chugging along.
A permanent fix?
Backups started failing in the past two weeks, so I decided to take on a more aggressive solution so I didn't have to care about this problem for at least another year. This involved searching for very large directories[1] and then deleting them with the equivalent of a nuclear weapon - the dreaded "sudo rm -rf /dir" command.
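The search step is just the du command from the footnote plus a sort so the biggest offenders land at the bottom. A sketch of it (using /tmp instead of / here so the demo doesn't need root):

```shell
# Per-directory sizes one level deep, staying on a single filesystem (-x),
# human-readable (-h), sorted so the largest directories appear last.
# /tmp stands in for / purely for the sake of a safe, root-free demo.
du -h -x -d1 /tmp 2>/dev/null | sort -h
```

Run against / with sudo, this is exactly the "find the big directories" step - which is what made /var show up as a tempting target in the first place.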
For the non-technical, “sudo rm -rf /dir” will delete all files in the folder “dir” with admin level permissions. This means really delete. Unrecoverable levels of delete. Not double checking if you really want to delete a file because you’re the admin levels of delete.
So I’m checking for large directories and completely obliterating them. Cache? Gone. OldCode? Gone. Python2.7? You bet I don’t want that around.
Then I got sloppy.
There's this folder called "/var" which can get real big, but it holds both super essential libraries and absolute nonsense like logs that you can just wipe away. My intention was to find the nonsense that can just be wiped away.
So once I was done deleting the obvious candidates, I wanted to poke around /var to see if there was anything else worth wiping out. I tapped up in my shell history, replaced the directory with /var, and got an error message.
What? Since when does listing the directory size produce an error message?
In that moment I saw the horror that I had just run:
sudo rm -rf /var
That, kids, is something you never ever want to do.[2]
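The habit that would have caught this is cheap: preview the target before recursing into it. A minimal sketch, with a hypothetical scratch path standing in for the real directory:

```shell
# Hypothetical scratch directory standing in for the real target
TARGET=/tmp/cleanup-demo
mkdir -p "$TARGET"

# Look before you leap: if the listing shows something you didn't
# expect (say, all of /var), stop here.
ls -la "$TARGET"

# Only then delete. GNU rm's --one-file-system flag refuses to recurse
# across mount points, which limits the blast radius of a bad path.
rm -rf --one-file-system "$TARGET"
```

None of this makes rm safe, but pairing the delete with a listing of the same variable means a fat-fingered history edit shows you what you're about to destroy before you destroy it.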
A permanent fix
On the plus side, I now had a ton more disk space. On the downside, I had just deleted all of the code that lets the server run in the first place.
The one sliver of an advantage I had is that these files are only needed on boot, so since the server was still on and I had a shell open, I could perform any emergency stabilization efforts needed and extract any data that was still useful.
I did attempt to fix the server - simply reinstalling every package should have replaced all the missing files and gotten everything back to a reasonable place. Unfortunately, the list of what was installed was itself stored in /var, so it wouldn't have been the easiest process.
The server is decommissioned, its data has been moved to a new location, and I'm working on merging together the two code bases so that I have one less system to deal with. Overall this is maybe two or three days of lost time, but it should even out in the long run with me not having to worry about the complexity of managing an entire other system.
It’s been a good ride server, but when I shut you down, you’re not rebooting.
Think of the rabbits and the alfalfa you’ll grow for them. You’ll get to tend to the rabbits and live on the fat of the land.
[1] "sudo du -h -x -d1 /" is a nifty little command for finding large directories
[2] I have run the even more dreaded "sudo rm -rf /" before, twice, but that's a story for another time