An update from me, for posterity: I’d previously figured I’d do a few more final tests before potentially giving up and sending it back for repairs. Hopefully this helps anyone else who comes across this issue.
I logged sensor data with HWiNFO up to the point where the PC started freezing, and realised that the CPU’s baseline temps were way too high. I just hadn’t noticed before because they were always high for me, even when the CPU supposedly wasn’t under load.
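For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch of the idea: flag samples where the CPU is hot even though its load is low. The column names, the sample data, and the 80 °C / 10% thresholds are all made up for illustration. A real HWiNFO CSV export has different headers depending on the sensors in the machine, so you would need to swap in the names from your own file.

```python
import csv
import io


def find_hot_idle_samples(rows, temp_col, load_col, temp_limit=80.0, load_limit=10.0):
    """Return the rows where the CPU is hot even though its load is low."""
    hot_idle = []
    for row in rows:
        try:
            temp = float(row[temp_col])
            load = float(row[load_col])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or non-numeric cells
        if temp >= temp_limit and load <= load_limit:
            hot_idle.append(row)
    return hot_idle


# Tiny made-up sample standing in for a real log export (hypothetical headers).
SAMPLE_LOG = """Time,CPU Temp [C],CPU Usage [%]
12:00:00,82.5,3.0
12:00:02,84.0,4.5
12:00:04,61.0,2.0
12:00:06,88.0,95.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_LOG)))
hot_idle = find_hot_idle_samples(rows, "CPU Temp [C]", "CPU Usage [%]")
print(f"{len(hot_idle)} of {len(rows)} samples were hot while idle")
```

To run it against a real log, you would open the exported file instead of `SAMPLE_LOG` and pass your own column names. If most of the low-load samples come back as hot, that points to the same problem I had.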
The next realisation was that the CPU was somehow always running at its boosted clock speed, which kept temps at 80+ °C even when idle. I wasn’t even on the High Performance power plan in Windows, so I was a little surprised at that. The next step I took was to go into the BIOS and disable Core Performance Boost, which I think forces the CPU to run at base speeds. That’s just for testing, for now.
And voila, it’s been running for the past 4 days without a hitch. I think this is the most promising lead I’ve had so far. I’ve ordered some thermal paste, and I’ll probably find some time in the next few weeks to do the replacement, and then start digging into what was causing the CPU to run at the boost clock the whole time. All my Docker services and stuff are running OK now, but my game server hosting has been impacted by only running at base clock speeds lol.
My guess is that it’s been running like that for the past year or so, but eventually the high temps caught up with it (and probably crept higher and higher as the thermal paste degraded) and started causing these seemingly random issues. I’ll just keep monitoring for now.