01

09/09

Bitchy little watchdog

17:00 by skaven. Filed under: Hardware

Yesterday we tried to install OpenSolaris 2009.06 on an x86-based server at work. The first attempts ended in a sudden reboot without any error messages. Eventually the installation went through, but the installed system kept rebooting as well. We didn’t expect any hardware errors since the server is brand-new. Nevertheless we ran some tests, and memtest showed the same behavior: the whole system rebooted. After some observation we found out that the reboots occurred after exactly 4 minutes of testing.
We searched the internet for solutions but found no mention of the problem. So we went through the manual of the Super Micro X7SBE motherboard, and my instructor found something that was worth a try. It mentioned a jumper which controls the behavior of the built-in watchdog timer. The jumper was set to reboot the whole system if an application hangs. So we set it to just send an interrupt to the application and… behold… memtest just stopped after 4 minutes with a message about an unexpected interrupt. The third setting disables the watchdog timer completely. This is the setting we finally chose, and now the system runs fine.
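
For anyone wondering why the jumper made such a difference, here is a rough Python sketch of how a watchdog timer like this behaves. It is only a toy model, not how the board actually implements it: the names `JumperMode`, `WatchdogTimer`, `kick` and `run_without_kicking` are made up for illustration, the 240-second timeout is simply the 4 minutes we observed, and it assumes that nothing on our system was resetting the timer.

```python
from enum import Enum


class JumperMode(Enum):
    """The three jumper settings described above (names are hypothetical)."""
    RESET = "reset"          # reboot the whole system on expiry
    INTERRUPT = "interrupt"  # only raise an interrupt on expiry
    DISABLED = "disabled"    # watchdog never fires


class WatchdogTimer:
    """Toy model of a hardware watchdog: counts down unless it is kicked."""

    def __init__(self, mode, timeout_s=240):
        # 240 s is an assumption taken from the 4 minutes we observed.
        self.mode = mode
        self.timeout_s = timeout_s
        self.remaining_s = timeout_s

    def kick(self):
        """Software that knows about the watchdog resets the countdown."""
        self.remaining_s = self.timeout_s

    def tick(self, seconds=1):
        """Advance time; return 'reboot', 'interrupt', or None."""
        if self.mode is JumperMode.DISABLED:
            return None
        self.remaining_s -= seconds
        if self.remaining_s > 0:
            return None
        self.remaining_s = self.timeout_s
        return "reboot" if self.mode is JumperMode.RESET else "interrupt"


def run_without_kicking(mode, duration_s=600):
    """Simulate software that never kicks the watchdog (like our memtest run)."""
    watchdog = WatchdogTimer(mode)
    for elapsed in range(1, duration_s + 1):
        event = watchdog.tick()
        if event is not None:
            return elapsed, event
    return None


if __name__ == "__main__":
    for mode in JumperMode:
        print(mode.name, run_without_kicking(mode))
```

In this model, software that never calls `kick()` gets a reboot or an interrupt once the timeout runs out, depending on the jumper, and nothing at all when the watchdog is disabled. That matches what we saw with memtest.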