
Author Topic: [Solved] anything in base that automatically reboots the system?  (Read 7560 times)

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #15 on: June 16, 2022, 08:31:18 AM »
Ok, you guys convinced me to keep hardware issue on the list of possibilities. At this point I'm really hoping that the problem recurs soon so that we can have a log to look at ;D

If it turns out to be a hardware issue, I have a different laptop sitting on a shelf somewhere that could take over. That would be easier than swapping out this laptop's CPU or giving it the kind of cleanup you are suggesting. Let's wait and see what the log shows.

Offline tacpilot

  • Newbie
  • *
  • Posts: 30
Re: anything in base that automatically reboots the system?
« Reply #16 on: June 16, 2022, 08:43:45 AM »
You may want to install and set up something like lm-sensors so as to include
temperature monitoring in your logs. Correlating temps with a seemingly random
crash could help narrow down the cause.

Quote
I'd run Memtest86 to see whether the RAM might be getting dodgy.
That should always be at the top of your toolbox. It wouldn't be the first time I've seen
memory causing issues, where pulling out, cleaning, and reinstalling the same
sticks fixed the issue.

Thermal issues can cause memory issues, especially in a laptop, so you definitely need to watch temps.
« Last Edit: June 16, 2022, 08:54:32 AM by tacpilot »
Never limit your creativity by the imagination of others.

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #17 on: June 16, 2022, 09:27:31 AM »
Thanks, tacpilot. I installed lm-sensors and will keep a persistent temperature log. I will also check the RAM.
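
For anyone following along, a persistent temperature log along these lines might look like the sketch below. It is not the script used in this thread. The log path, the `TEMP_LOG` variable, and the function names are made up for illustration, and it assumes either the `sensors` command from lm-sensors or the kernel's `thermal_zone` files are available.

```shell
#!/bin/sh
# Hypothetical log location; pick a path on persistent storage so the
# log is still there after an unexpected reboot.
LOG="${TEMP_LOG:-/tmp/temps.log}"

# Prefix every line read from stdin with a timestamp and append it to the log.
log_reading() {
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    while IFS= read -r line; do
        printf '%s %s\n' "$stamp" "$line"
    done >> "$LOG"
}

# Take one sample: lm-sensors output if installed, otherwise the kernel's
# thermal zones (values are in millidegrees Celsius).
sample_temps() {
    if command -v sensors >/dev/null 2>&1; then
        sensors
    else
        cat /sys/class/thermal/thermal_zone*/temp 2>/dev/null
    fi
}

# Loop only when started with the argument "run", so the functions above
# can also be sourced and tried by hand.
if [ "${1:-}" = "run" ]; then
    while :; do
        sample_temps | log_reading
        sync        # flush to disk so the last sample survives a sudden reset
        sleep 10
    done
fi
```

Started as `sh templog.sh run &` (the file name is arbitrary), this appends one timestamped sample every 10 seconds. The `sync` is there because a sample that is still in the page cache when the machine resets would be lost.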

Offline tacpilot

  • Newbie
  • *
  • Posts: 30
Re: anything in base that automatically reboots the system?
« Reply #18 on: June 16, 2022, 11:04:24 PM »
Just a reminder to check the sensor logs to make sure they are generating as expected.

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #19 on: June 17, 2022, 09:51:57 AM »
It happened again in the middle of the night last night. syslog shows nothing unusual (in fact, last entries in the syslog were 10 minutes before the reboot). I have a script that logs temperatures every 10 seconds and the last entry shows normal temperatures. Whatever caused the reboot didn't run "reboot" or "/sbin/reboot" because I deleted /sbin/reboot and have a script in my PATH called "reboot" that does nothing except create a log entry.
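
A stand-in "reboot" script of the kind described above could be sketched roughly as follows. This is an illustration, not the actual script from this setup: the log path, the `REBOOT_LOG` variable, and the function name are assumptions. It only catches callers that find "reboot" through PATH. It cannot see a process that calls the kernel's reboot system call directly or that uses a different binary.

```shell
#!/bin/sh
# Hypothetical stand-in for "reboot", placed early in PATH.
# It never reboots; it only records who asked.
REBOOT_LOG="${REBOOT_LOG:-/tmp/reboot-attempts.log}"

log_reboot_attempt() {
    # $PPID is the calling process; /proc/<pid>/cmdline is NUL-separated,
    # so turn the NULs into spaces to get a readable command line.
    caller=$(tr '\0' ' ' 2>/dev/null < "/proc/$PPID/cmdline" || true)
    printf '%s reboot requested: args=[%s] ppid=%s caller=[%s]\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$*" "$PPID" "$caller" >> "$REBOOT_LOG"
}

log_reboot_attempt "$@"
```

Recording the parent's command line is the useful part: if some background process ever does call "reboot", the log entry names it.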

Could it be that even though this laptop can operate with a broken battery (or even no battery), sometimes some low-level piece of software (e.g., BIOS or kernel) causes a reboot when it sees a broken battery or missing battery?

This morning I removed the harddrive and put it in a different laptop (with new battery, different power adapter, and different RAM). If it happens again I think it would point to something in the OS or one of the background processes.

P.S. The only hardware that remains the same in my setup is the modem, powerstrip, and harddrive (plus everything on it, namely TCL13 and my extensions). I will replace the powerstrip and harddrive soon so that I can confidently exclude a hardware issue.
« Last Edit: June 17, 2022, 10:00:28 AM by GNUser »

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #20 on: June 17, 2022, 10:50:52 AM »
Neither syslog nor my temperature log gave me any leads the last time there was an unexpected reboot. Can you guys think of anything else I should be logging?
« Last Edit: June 17, 2022, 11:07:10 AM by GNUser »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11568
Re: anything in base that automatically reboots the system?
« Reply #21 on: June 17, 2022, 11:27:42 AM »
Hi GNUser
... Can you guys think of anything else I should be logging?
Power supply voltages (and currents) if possible.

... Whatever caused the reboot didn't run "reboot" or "/sbin/reboot" because I deleted /sbin/reboot and have a script in my PATH called "reboot" that does nothing except create a log entry. ...
Doesn't prevent someone from running this:
Code: [Select]
sudo busybox reboot
:o ::)

There are some watchdog options that are set in the kernel configuration. You could try adding this boot code:
Code: [Select]
nowatchdog
Post the results of this:
Code: [Select]
cat /proc/interrupts
These suggestions are for the original laptop since I suspect that and not the hard drive.
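
To confirm that a boot code such as nowatchdog actually reached the kernel, a check along these lines may help. It is only a sketch: the `has_bootcode` helper and the `CMDLINE` override are invented for illustration, and the two `/proc/sys/kernel` files exist only if the kernel was built with the lockup detectors.

```shell
#!/bin/sh
# Succeed if the given word appears as a whole parameter on the kernel
# command line. CMDLINE can be pointed at another file for testing.
has_bootcode() {
    tr ' ' '\n' < "${CMDLINE:-/proc/cmdline}" | grep -qx -- "$1"
}

if has_bootcode nowatchdog; then
    echo "nowatchdog is on the kernel command line"
else
    echo "nowatchdog is NOT on the kernel command line"
fi

# Lockup-detector state, if the kernel exposes it (1 = enabled, 0 = disabled).
for f in /proc/sys/kernel/watchdog /proc/sys/kernel/nmi_watchdog; do
    if [ -r "$f" ]; then
        printf '%s = %s\n' "$f" "$(cat "$f")"
    fi
done
```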

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #22 on: June 17, 2022, 11:41:00 AM »
Doesn't prevent someone from running this:
Code: [Select]
sudo busybox reboot
:o ::)
Of course not ;) Not having /sbin/reboot is just to stop processes that may try to reboot but are too unsophisticated to do "sudo busybox reboot". Until I sort this out, "sudo busybox reboot" and "sudo busybox poweroff" are how I myself am rebooting and powering off my "router".

Logging power supply voltages and currents is a good idea. I will arrange that.
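
One way this could be arranged is to read whatever the kernel exposes under /sys/class/power_supply. The sketch below assumes the usual sysfs attribute names (`status`, `voltage_now` in microvolts, `current_now` in microamps). The function names and the `PS_ROOT` override are made up for illustration. Many drivers provide only some of these files, and an AC adapter entry may expose little more than `online`, so missing values are printed as "n/a".

```shell
#!/bin/sh
# Standard sysfs location; overridable so the function can be tried
# against a test directory.
PS_ROOT="${PS_ROOT:-/sys/class/power_supply}"

# Print the attribute's value, or "n/a" if the driver does not provide it.
read_attr() {
    cat "$1" 2>/dev/null || echo "n/a"
}

# Print one timestamped line per power supply device.
log_power() {
    for dev in "$PS_ROOT"/*; do
        [ -d "$dev" ] || continue
        printf '%s %s status=%s voltage_uV=%s current_uA=%s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" "$(basename "$dev")" \
            "$(read_attr "$dev/status")" \
            "$(read_attr "$dev/voltage_now")" \
            "$(read_attr "$dev/current_now")"
    done
}

log_power
```

Calling `log_power >> some.log` from the same loop that records temperatures keeps both readings on a shared timeline.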

Thanks for the "nowatchdog" boot code. I will try it on the new machine if the problem recurs.

I don't have access to the old machine at the moment. I will post output of "cat /proc/interrupts" on the new machine if the problem recurs.

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #23 on: June 17, 2022, 12:11:16 PM »
Hi, Rich. I'm at work but my wife just let me know that WiFi went down again. I can confirm via SSH that the new laptop-router (i.e., different RAM, new battery, different power adapter, different BIOS) rebooted at 11:51 a.m. my local time.

I will attach syslog, temperature/voltage log (from within 10 seconds of the reboot), and output of "cat /proc/interrupts"

Things are pointing to a software issue but I have no idea how to clinch it. I'm so frustrated by this :'( At this point I'm completely out of ideas.
« Last Edit: June 17, 2022, 01:14:45 PM by Rich »

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #24 on: June 17, 2022, 01:01:35 PM »
I'm getting a sinking feeling about this. I'm going to try downgrading the laptop-router to TCL12 x86_64. I don't remember ever having this problem in the past.

Rich, after you've downloaded the three files above to inspect them, would you kindly delete the attachments from my post? There are some private things in there (MAC addresses, etc) that I'd prefer not to be a permanent part of this thread.
« Last Edit: June 17, 2022, 01:03:36 PM by GNUser »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11568
Re: anything in base that automatically reboots the system?
« Reply #25 on: June 17, 2022, 02:04:47 PM »
Hi GNUser
Attachments removed.  :)

There's a lot going on in that log file and I don't understand all of it. A few things did catch my
attention, though I don't know if they are problems.

Two copies of dnsmasq being launched:
Code: [Select]
Jun 17 09:40:12 x200 daemon.info dnsmasq[3531]: started, version 2.79 cachesize 150
Jun 17 09:40:12 x200 daemon.info dnsmasq[3531]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Jun 17 09:40:12 x200 daemon.info dnsmasq-dhcp[3531]: DHCP, IP range 192.168.x.x -- 192.168.x.x, lease time 1d
Jun 17 09:40:12 x200 daemon.info dnsmasq-dhcp[3531]: DHCP, sockets bound exclusively to interface wlan1
Jun 17 09:40:12 x200 daemon.info dnsmasq[3531]: reading /etc/resolv.conf
Jun 17 09:40:12 x200 daemon.info dnsmasq[3531]: using nameserver 75.75.75.75#53
Jun 17 09:40:12 x200 daemon.info dnsmasq[3531]: using nameserver 75.75.76.76#53
 ----- Snip -----
Jun 17 09:40:13 x200 daemon.info dnsmasq[3531]: using nameserver 193.138.218.74#53
 ----- Snip -----
Jun 17 09:40:13 x200 daemon.info dnsmasq[3761]: started, version 2.79 cachesize 150
Jun 17 09:40:13 x200 daemon.info dnsmasq[3761]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Jun 17 09:40:13 x200 daemon.info dnsmasq-dhcp[3761]: DHCP, IP range 192.168.x.x -- 192.168.x.x, lease time 1d
Jun 17 09:40:13 x200 daemon.info dnsmasq-dhcp[3761]: DHCP, sockets bound exclusively to interface wlan2
Jun 17 09:40:13 x200 daemon.info dnsmasq[3761]: reading /etc/resolv.conf
Jun 17 09:40:13 x200 daemon.info dnsmasq[3761]: using nameserver 193.138.218.74#53
 ----- Snip -----
By the way, that's the only reference to  wlan2  I see in there.

This series of errors:
Code: [Select]
Jun 17 11:07:26 x200 auth.info sshd[5301]: Accepted publickey for bruno from 162.x.x.x port 47x ssh2: RSA SHA256:Secret
Jun 17 11:07:29 x200 auth.err sshd[5303]: error: connect_to 192.168.x.161 port 22: failed.
Jun 17 11:07:29 x200 auth.info sshd[5306]: Accepted publickey for bruno from 162.x.x.x port 47x ssh2: RSA SHA256:Secret
Jun 17 11:07:32 x200 auth.err sshd[5308]: error: connect_to 192.168.x.62 port 22: failed.
Jun 17 11:07:35 x200 auth.info sshd[5309]: Accepted publickey for bruno from 162.x.x.x port 47x ssh2: RSA SHA256:Secret
Jun 17 11:07:38 x200 auth.err sshd[5311]: error: connect_to 192.168.x.161 port 22: failed.
Jun 17 11:07:39 x200 auth.info sshd[5314]: Accepted publickey for bruno from 162.x.x.x port 47x ssh2: RSA SHA256:Secret
Jun 17 11:07:42 x200 auth.err sshd[5316]: error: connect_to 192.168.x.62 port 22: failed.
For some reason the target (LAN) address was alternating between .62 and .161, or 2 devices were trying to connect at about the same time.

Then I saw this message amongst the traffic 2 seconds before the end:
Code: [Select]
Jun 17 11:48:27 x200 daemon.notice hostapd: wlan1: STA xx.xx.xx.xx.xx.xx IEEE 802.11: did not acknowledge authentication response

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #26 on: June 17, 2022, 02:13:07 PM »
Hi, Rich. Thank you very much for looking it over. I have two wireless USB adapters connected to the laptop (wlan1 and wlan2). There are two instances of hostapd and two instances of dnsmasq, one for each wireless USB adapter. So that's all fine. wlan2 is there for when we have guests staying over. We don't have any guests visiting us right now, so wlan2 isn't doing much.

Some devices on the network get an IP via DHCP (192.168.x.100-192.168.x.200) and others get static addresses (192.168.x.2-192.168.x.99), so that's also fine.

Sometimes devices in my home network get powered off or go into suspend, so that might explain the lack of authentication response, although I'm not 100% sure about that final excerpt that you quoted.

Thanks for deleting the attachments. I have downgraded the laptop-router's OS to TCL12 x86_64. Let's see how that goes.

Offline tacpilot2

  • Newbie
  • *
  • Posts: 3
Re: anything in base that automatically reboots the system?
« Reply #27 on: June 17, 2022, 04:08:47 PM »
I'm at work, can't remember my password, and the reset option doesn't seem to be working.

Still leaning towards hardware.
My next guess was failing RAM or a failing HDD.
Since you have removed all the other options from the
equation, it would appear to be a failing HDD.

You should be able to verify by looking at the logs stored
in the HDD itself and/or running drive health checking tools.
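
As one possible health check, smartctl from smartmontools can read the drive's own SMART verdict, attributes, and error log. That tool is an assumption here, since the post does not name one, and the `DISK` variable and `smart_verdict` helper are invented for this sketch. The helper matches the result line that smartctl prints for ATA drives, and other drive types may word it differently.

```shell
#!/bin/sh
# Reduce "smartctl -H" output (read from stdin) to a single word.
smart_verdict() {
    awk -F': *' '/overall-health self-assessment test result/ { print $2; found = 1 }
                 END { if (!found) print "UNKNOWN" }'
}

# Runs only when DISK is set, e.g.  DISK=/dev/sda sh smartcheck.sh  (as root).
if [ -n "${DISK:-}" ] && command -v smartctl >/dev/null 2>&1; then
    smartctl -H "$DISK" | smart_verdict   # overall health verdict
    smartctl -A "$DISK"                   # attributes such as reallocated sectors
    smartctl -l error "$DISK"             # the drive's own error log
fi
```

A passing verdict does not prove the drive is healthy, so the attribute table and error log are worth reading as well.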

Until then, an easier way to see whether some weird software may
be at fault is to remove the HDD from the equation.
  • Set the copy2fs.flg flag so that everything gets copied to RAM.
  • Use the noswap boot code to ensure it's not trying to use the drive's swap space.
  • Then unmount anything mounted from that drive.
At that point, even if the drive goes offline, it won't matter to TC.
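
The three steps above can be sketched as follows. The paths are assumptions: it takes /etc/sysconfig/tcedir to be the link to the tce directory and /dev/sda as the drive, and the helper names plus the `PROC_SWAPS` and `PROC_MOUNTS` overrides are invented for illustration.

```shell
#!/bin/sh
# Step 1: create the flag file in the tce directory so that extensions
# are copied into RAM at the next boot.
enable_copy2fs() {
    touch "$1/copy2fs.flg"
}

# After rebooting with the noswap boot code, list what still depends on
# the drive: every active swap area, plus mounts whose device starts
# with the given prefix.
drive_dependencies() {
    dev="$1"
    # /proc/swaps has one header line; anything after it is active swap.
    awk 'NR > 1 { print "swap: " $1 }' "${PROC_SWAPS:-/proc/swaps}"
    # /proc/mounts: field 1 is the device, field 2 the mount point.
    awk -v d="$dev" 'index($1, d) == 1 { print "mount: " $1 " on " $2 }' \
        "${PROC_MOUNTS:-/proc/mounts}"
}
```

Typical use would be `enable_copy2fs /etc/sysconfig/tcedir` before the reboot, then `drive_dependencies /dev/sda` and `drive_dependencies /dev/loop` afterwards. Any "mount:" line for the drive can then be cleared with `sudo umount` on that mount point. Leftover loop mounts would suggest some extensions are still being read from the drive.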

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1494
Re: anything in base that automatically reboots the system?
« Reply #28 on: June 17, 2022, 04:14:03 PM »
TCL uses the harddrive at boot but then runs from RAM, so a harddrive failure shouldn't cause a crash and reboot. Nevertheless, I will be replacing the harddrive soon just to exclude all hardware.

Offline tacpilot2

  • Newbie
  • *
  • Posts: 3
Re: anything in base that automatically reboots the system?
« Reply #29 on: June 17, 2022, 04:32:51 PM »
Looking at corebook.pdf, page 8,
section 1.9 (Copy mode),
it seems to indicate that unless copy2fs.flg is set,
any externally loaded extensions are mounted from the drive, not held in RAM.

If swap space exists on the drive, it is mounted by default as well.

I'm betting a new HDD will solve the issue.
« Last Edit: June 17, 2022, 04:34:53 PM by tacpilot2 »