
Author Topic: How to enable watchdog?  (Read 3139 times)

Offline FlyingDutchman

  • Newbie
  • *
  • Posts: 36
How to enable watchdog?
« on: January 27, 2022, 11:48:10 AM »
Hi,

Is there any way to activate the Linux software watchdog? The watchdog command is not available from busybox. I do have two devices, /dev/watchdog and /dev/watchdog0.

There is a package watchdog-KERNEL-tinycore64.tcz, but it only seems to contain kernel modules: no binaries, daemons or watchdog.conf file.

BR

Offline bmarkus

  • Administrator
  • Hero Member
  • *****
  • Posts: 7183
    • My Community Forum
Re: How to enable watchdog?
« Reply #1 on: January 27, 2022, 12:02:39 PM »
Try the WDT kernel boot option.
Béla
Ham Radio callsign: HA5DI

"Amateur Radio: The First Technology-Based Social Network."

Offline FlyingDutchman

Re: How to enable watchdog?
« Reply #2 on: January 27, 2022, 01:10:37 PM »
Hi bmarkus,

Thanks for your quick answer. I will try your suggestion tomorrow on my test VM.

However, I think the kernel part is running. If I open the /dev/watchdog device file (by reading from it), the device reboots after a minute or so. The part I'm missing is the user space daemon that should regularly perform tests and then write to /dev/watchdog to reset the keepalive timer.
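The missing piece described above can be sketched in a few lines of Python. This is only a minimal illustration, not the watchdog package's daemon: `feed_watchdog`, `health_check` and `max_feeds` are hypothetical names. It relies on the kernel's documented watchdog behaviour that any write to the device resets the timer, and that writing the character `V` just before closing asks the driver to disarm (the "magic close", which only works if the driver supports it and `nowayout` is off).

```python
import os
import time


def feed_watchdog(device_path, health_check, interval_s=10.0, max_feeds=None):
    """Feed the watchdog at device_path for as long as health_check() is true.

    Hypothetical helper. Returns the number of keepalive writes performed.
    max_feeds exists only so the loop can be ended cleanly (for example in a test).
    """
    # Opening the device is what arms the timer.
    fd = os.open(device_path, os.O_WRONLY)
    feeds = 0
    try:
        while max_feeds is None or feeds < max_feeds:
            if not health_check():
                # Unhealthy: stop feeding and close WITHOUT the magic
                # character, so the timer keeps running and the board resets.
                return feeds
            os.write(fd, b"\0")  # any write counts as a keepalive
            feeds += 1
            time.sleep(interval_s)
        # Orderly shutdown: request disarm via the magic close character.
        os.write(fd, b"V")
        return feeds
    finally:
        os.close(fd)
```

A call such as `feed_watchdog("/dev/watchdog", my_check, interval_s=10)` would run until `my_check` (an assumed user-supplied function) returns false. The interval has to be comfortably shorter than the timer's timeout.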

Offline FlyingDutchman

Re: How to enable watchdog?
« Reply #3 on: January 28, 2022, 11:22:21 AM »
Hi bmarkus,

I tested it and there is no change. The /dev/watchdog and /dev/watchdog0 devices are still there, but there is no userspace daemon to control them. I read the Core cookbook and the tc-config script, and there is no mention of a wdt boot code. What is it supposed to do?

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11178
Re: How to enable watchdog?
« Reply #4 on: January 28, 2022, 02:12:53 PM »
Hi FlyingDutchman
The util-linux extension contains wdctl, which allows you to check the status and set the timeout of the
hardware watchdog timer.
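
For anyone who wants to query or change the timeout from a script instead, here is a rough Python sketch using the kernel's WDIOC_GETTIMEOUT and WDIOC_SETTIMEOUT ioctls. It is not a description of how wdctl works internally. The helper names are made up for illustration, and the request-number encoding assumes the common asm-generic ioctl layout (x86, ARM); a few architectures encode the direction bits differently. Note that opening /dev/watchdog to obtain the file descriptor arms the timer.

```python
import fcntl
import struct

_IOC_WRITE = 1
_IOC_READ = 2


def _ioc(direction, type_char, number, size):
    """Build an ioctl request number (asm-generic layout, assumed here)."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | number


INT_SIZE = struct.calcsize("i")
# From linux/watchdog.h: _IOWR('W', 6, int) and _IOR('W', 7, int)
WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, "W", 6, INT_SIZE)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, "W", 7, INT_SIZE)


def get_timeout(fd):
    """Return the current timeout in seconds for an open watchdog fd."""
    buf = fcntl.ioctl(fd, WDIOC_GETTIMEOUT, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]


def set_timeout(fd, seconds):
    """Request a new timeout; returns the value the driver actually applied."""
    buf = fcntl.ioctl(fd, WDIOC_SETTIMEOUT, struct.pack("i", seconds))
    return struct.unpack("i", buf)[0]
```

Drivers may round the requested value, which is why `set_timeout` returns what the kernel reports back.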

There is also  watchdogd  available on github:
https://github.com/troglobit/watchdogd

If you describe what you are looking to monitor and under what conditions you are looking to reboot, you may
get more responses.

Offline FlyingDutchman

Re: How to enable watchdog?
« Reply #5 on: January 29, 2022, 01:34:59 AM »
Hi Rich,

Thanks for your response and the pointers to wdctl and watchdogd. I'm running a small home server based on TCL; its main tasks are to act as a NAS and router. I have a health check script scheduled in cron that performs some basic checks every 5 minutes and takes action if needed. Mostly this works well, but on rare occasions the script does more harm than good. That results in a system with too many processes running, probably short on memory and under high CPU load. At those times I can't even log in, and a hard reboot is the only solution.

I stumbled upon the Linux watchdog mechanism last week, having never heard of it before, and am now investigating whether it is a solution. This is mostly for education, and a bit to solve the minor issue above.

Online Rich

Re: How to enable watchdog?
« Reply #6 on: January 29, 2022, 11:09:41 AM »
Hi FlyingDutchman
Quote from FlyingDutchman: "... I have a health check script scheduled in cron to perform some basic checks every 5 minutes and take action if needed. Mostly this works well, but in some rare occasions this script does more damage than good. That results in a system with too many processes running, probably short on memory and high CPU load. ..."
I would seriously try to find out why your script causes this behavior before considering a watchdog timer.

A few things come to mind:
Sometimes your script doesn't exit. Over time multiple copies are left running, possibly fighting each other.

Sometimes your script takes more than 5 minutes to run. A second copy gets started and clashes with the first copy.

A loop in your script reads a system file (/proc, /sys, ... ) waiting for something to happen. Without a sleep command
to slow down that loop, CPU usage will quickly rise to 100%.

Your script launches a command in the background that doesn't always complete. Over time multiple copies are left running.
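
As a rough illustration of guarding against those failure modes, here is a Python sketch. All function names and the lock path are hypothetical, and the actual health check script is not shown in this thread, so treat it as a pattern and not a drop-in fix. It covers three ideas: a non-blocking lock so a second copy exits at once, a timeout on every external command, and a sleep inside any polling loop.

```python
import fcntl
import os
import subprocess
import time


def acquire_single_instance_lock(lock_path):
    """Return an open fd holding an exclusive lock, or None if another copy has it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB makes this fail immediately instead of queueing behind
        # the copy that is still running.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # keep it open; the lock is released when the process exits


def run_check(command, timeout_s):
    """Run one external command; report failure if it errors or hangs."""
    try:
        result = subprocess.run(command, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout, so nothing is left behind.
        return False
    return result.returncode == 0


def wait_until(predicate, poll_interval_s=1.0, timeout_s=60.0):
    """Poll predicate() with a sleep between attempts and an overall deadline."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_interval_s)  # without this the loop pins a CPU core
    return False
```

In a cron job, the script would call `acquire_single_instance_lock` first (with a path of its choosing, for example under /tmp) and exit immediately if it returns None, then run each check through `run_check` with a timeout well under the 5 minute schedule.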