I installed Debian Bookworm, Docker, and Frigate (in Docker) on an older i5 Dell with a Coral TPU (USB) and an old GeForce 6600 GPU. The TPU and GPU are for Frigate frame processing and video conversion. It was working well for months, and then I upset the apple cart by adding Whisper and Piper containers in Docker for my Home Assistant (on a different server) to use. After I got that all working, I noticed that Debian would freeze after several hours with no user activity (no one logged in on either the console or SSH). This now seems to happen even after I removed the Piper and Whisper containers. Thinking it might be a suspend issue, I masked all the suspend points and set the power profile to “never”, but that did not help. journalctl does not show anything I can see; it seems to just stop logging at the freeze point. Interestingly, the Ethernet jack still shows activity, but the machine does not respond to ping :(
What would be the next troubleshooting steps to find what is causing the freeze?
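Since the journal just stops at the freeze point, one low-effort step might be a small heartbeat logger that forces every line to disk, so the last entry before a freeze at least gives a timestamp, the load, and temperatures. This is only a rough sketch: the log path, the interval, and the function names are made up for illustration, and it assumes the standard Linux `/proc/loadavg` and `/sys/class/thermal` files are present (missing ones are reported as `n/a`).

```python
import os
import time
from datetime import datetime
from pathlib import Path


def read_text_or_na(path):
    """Return the stripped contents of a small /proc or /sys file, or 'n/a'."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "n/a"


def snapshot():
    """Build one log line: timestamp, load average, thermal zone temperatures."""
    load = read_text_or_na("/proc/loadavg")
    temps = []
    for zone in sorted(Path("/sys/class/thermal").glob("thermal_zone*/temp")):
        raw = read_text_or_na(zone)
        if raw.lstrip("-").isdigit():
            # The kernel reports thermal zone temperatures in millidegrees Celsius.
            temps.append(f"{int(raw) / 1000:.1f}C")
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"{stamp} load={load} temps={','.join(temps) or 'n/a'}"


def append_durably(log_path, line):
    """Append a line and fsync it so it is on disk before a possible freeze."""
    with open(log_path, "a") as fh:
        fh.write(line + "\n")
        fh.flush()
        os.fsync(fh.fileno())


def run_heartbeat(log_path="/var/tmp/heartbeat.log", interval=10, max_beats=None):
    """Write a snapshot every `interval` seconds (forever if max_beats is None)."""
    beats = 0
    while max_beats is None or beats < max_beats:
        append_durably(log_path, snapshot())
        beats += 1
        time.sleep(interval)
```

To use it, add a call to `run_heartbeat()` at the bottom and leave the script running in the background. After the next freeze and reboot, the last line of the log shows roughly when the machine stopped and what the load and temperatures looked like just before. `journalctl -b -1` shows the previous boot's journal for comparison, provided the journal is stored persistently.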
I have a similar problem. Have you tried turning off C-states? For me it didn’t completely resolve the issue, but it reduced the frequency a bit.
I looked in the BIOS settings but did not see where I could disable the c-states unfortunately. I did try masking all the triggers in the OS with no change.
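If the BIOS has no C-state option, the kernel still exposes the idle states through the cpuidle sysfs interface, so it may be worth at least looking at what is in use. Below is a minimal sketch that lists them for one CPU. The function names are made up for illustration, and it assumes the usual `/sys/devices/system/cpu/cpuN/cpuidle/stateM/` layout. Whether C-states have anything to do with this particular freeze is only a guess.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def read_attr(state_dir, name):
    """Read one cpuidle attribute file, returning '?' if it is unavailable."""
    try:
        return (Path(state_dir) / name).read_text().strip()
    except OSError:
        return "?"


def list_cstates(cpu=0, root=CPU_ROOT):
    """Return the idle states the kernel exposes for one CPU, ordered by index."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    states = []
    for state_dir in base.glob("state[0-9]*"):
        index = state_dir.name[len("state"):]
        if not index.isdigit():
            continue
        states.append({
            "index": int(index),
            "name": read_attr(state_dir, "name"),
            "latency_us": read_attr(state_dir, "latency"),
            "disabled": read_attr(state_dir, "disable"),
        })
    return sorted(states, key=lambda s: s["index"])


def format_states(states):
    """Render the states as one human-readable line per state."""
    if not states:
        return "no cpuidle states found"
    return "\n".join(
        f"state{s['index']}: {s['name']} "
        f"(exit latency {s['latency_us']} us, disabled={s['disabled']})"
        for s in states
    )


print(format_states(list_cstates(0)))
```

Writing `1` to a state's `disable` file as root turns that state off for that CPU until the next reboot. The kernel command-line parameters `intel_idle.max_cstate=1 processor.max_cstate=1` are another commonly used way to limit deep C-states from the OS side without touching the BIOS.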
Have you tried booting a live distro off USB to see if it’s the machine or the software? Maybe something that can test memory, CPU, GPU, and disk, or run a burn-in.
No, I have not tried that. That’s a good idea. Is there a live boot image geared toward burn-in / fault testing?
Not really. I see https://www.stresslinux.org/sl/
Besides that, you could probably use whatever live distro you like and just install tools from this list: