Tiny Core Linux

Tiny Core Extensions => TCE Corepure64 => Topic started by: FlyingDutchman on January 27, 2022, 02:48:10 PM

Title: How to enable watchdog?
Post by: FlyingDutchman on January 27, 2022, 02:48:10 PM: Hi,

Is there any way to activate the Linux software watchdog? The watchdog command is not available in busybox. I do have two devices, /dev/watchdog and /dev/watchdog0.

There is a package watchdog-KERNEL-tinycore64.tcz, but it seems to contain only kernel modules: no binaries, daemons, or watchdog.conf file.

BR
Title: Re: How to enable watchdog?
Post by: bmarkus on January 27, 2022, 03:02:39 PM: Try the WDT kernel boot option.
Title: Re: How to enable watchdog?
Post by: FlyingDutchman on January 27, 2022, 04:10:37 PM: Hi bmarkus,

Thanks for your quick answer. I will try your suggestion tomorrow on my test VM.

However, I think the kernel part is running. If I open the /dev/watchdog device file (by reading from it), the device reboots after a minute or so. The part I'm missing is the user space daemon that should regularly perform tests and then write to /dev/watchdog to reset the keepalive timer.
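
The missing user-space piece described above can be sketched in a few lines. This is only a minimal illustration, not the daemon from any existing package: the function name feed_watchdog, its parameters, and the health-check callable are hypothetical. It assumes a driver that treats any write to the device as a keepalive. What happens when the device is closed depends on the driver and its nowayout setting.

```python
import time


def feed_watchdog(device, is_healthy, interval=10.0, max_beats=None):
    """Write a keepalive to `device` for as long as is_healthy() returns True.

    Returns the number of keepalive writes made. `max_beats` exists only so
    the loop can be bounded for testing; a real daemon would leave it as None.
    """
    beats = 0
    # Unbuffered binary mode, so each write reaches the driver immediately.
    with open(device, "wb", buffering=0) as wd:
        while max_beats is None or beats < max_beats:
            if not is_healthy():
                # Stop feeding. If the timer stays armed after close
                # (driver dependent), the kernel reboots the machine.
                break
            wd.write(b"\0")  # any write counts as a keepalive
            beats += 1
            time.sleep(interval)
    return beats


# Hypothetical usage (opening the real device arms the timer, so try it
# on a test VM first):
# feed_watchdog("/dev/watchdog", lambda: True, interval=10.0)
```

The interval has to be comfortably shorter than the watchdog timeout, otherwise the timer expires between two writes.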
Title: Re: How to enable watchdog?
Post by: FlyingDutchman on January 28, 2022, 02:22:21 PM: Hi bmarkus,

I tested it and nothing changed. The /dev/watchdog and /dev/watchdog0 devices are still there, but there is no userspace daemon to control them. I read the Core cookbook and the tc-config script, and neither mentions a wdt boot code. What is it supposed to do?
Title: Re: How to enable watchdog?
Post by: Rich on January 28, 2022, 05:12:53 PM: Hi FlyingDutchman
The util-linux extension contains wdctl which allows you to check the status and set the timeout of the
hardware watchdog timer.
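
For anyone curious about what reading and setting the timeout involves, the sketch below uses the ioctl interface declared in the kernel header linux/watchdog.h. It is not how wdctl itself is implemented. The helper names are hypothetical, and the request numbers are computed with the generic ioctl encoding used on x86, which is an assumption that does not hold on every architecture. Note that opening /dev/watchdog to obtain a file descriptor arms the timer.

```python
import fcntl
import struct

# Generic ioctl direction bits (assumption: asm-generic encoding, as on x86).
_IOC_WRITE = 1
_IOC_READ = 2


def _ioc(direction, type_char, nr, size):
    """Build an ioctl request number from its four fields."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


_INT_SIZE = struct.calcsize("i")
WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, "W", 6, _INT_SIZE)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, "W", 7, _INT_SIZE)


def get_timeout(fd):
    """Return the current watchdog timeout in seconds."""
    buf = fcntl.ioctl(fd, WDIOC_GETTIMEOUT, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]


def set_timeout(fd, seconds):
    """Request a new timeout; return the value the driver actually applied."""
    buf = fcntl.ioctl(fd, WDIOC_SETTIMEOUT, struct.pack("i", seconds))
    return struct.unpack("i", buf)[0]
```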

There is also watchdogd available on github:
https://github.com/troglobit/watchdogd

If you describe what you are looking to monitor and under what conditions you are looking to reboot, you may
get more responses.
Title: Re: How to enable watchdog?
Post by: FlyingDutchman on January 29, 2022, 04:34:59 AM: Hi Rich,

Thanks for your response and the pointers towards wdctl and watchdogd. I'm running a small home server based on TCL; its main tasks are to act as a NAS and router. I have a health check script scheduled in cron to perform some basic checks every 5 minutes and take action if needed. Mostly this works well, but on rare occasions this script does more harm than good. The result is a system with too many processes running, probably short on memory and under high CPU load. At those times, I can't even log in and a hard reboot is the only solution.

I stumbled upon the Linux watchdog mechanism, which I had never heard of before last week, and am now investigating whether it is a solution. This is mostly for education, and partly to solve the minor issue above.
Title: Re: How to enable watchdog?
Post by: Rich on January 29, 2022, 02:09:41 PM: Hi FlyingDutchman
Quote from: FlyingDutchman on January 29, 2022, 04:34:59 AM
... I have a health check script scheduled in cron to perform some basic checks every 5 minutes and take action if needed. Mostly this works well, but on rare occasions this script does more harm than good. The result is a system with too many processes running, probably short on memory and under high CPU load. ...
I would seriously try to find out why your script causes this behavior before considering a watchdog timer.

A few things come to mind:
Sometimes your script doesn't exit. Over time multiple copies are left running, possibly fighting each other.

Sometimes your script takes more than 5 minutes to run. A second copy gets started and clashes with the first copy.

A loop in your script reads a system file (/proc, /sys, ... ) waiting for something to happen. Without a sleep command
to slow down that loop, CPU usage will quickly rise to 100%.

Your script launches a command in the background that doesn't always complete. Over time multiple copies are left running.
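
Two of the failure modes listed above, overlapping runs and commands that never finish, are commonly guarded against with a lock file and a per-command timeout. The sketch below is one possible way to do that, not a fix for the actual script, which was not posted. The function names and any lock path passed in are hypothetical.

```python
import fcntl
import subprocess


def acquire_single_instance_lock(lock_path):
    """Return an open lock file, or None if another copy holds the lock.

    Keep the returned file object open for the lifetime of the script;
    the lock is released when it is closed or the process exits.
    """
    lock_file = open(lock_path, "w")
    try:
        # Non-blocking exclusive lock: fails at once if already held.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()
        return None
    return lock_file


def run_check(command, timeout_seconds):
    """Run one check command and report whether it exited with status 0.

    subprocess.run kills the child if it exceeds the timeout, so a hung
    command cannot pile up across cron runs.
    """
    try:
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

A script started from cron would call acquire_single_instance_lock first and exit immediately if it returns None, so a slow run is never joined by a second copy.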