net.core.rmem_default = 67108864
net.core.rmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.udp_rmem_min = 1501632
net.core.wmem_default = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.udp_wmem_min = 1501632
vm.lowmem_reserve_ratio = 256 256 32 0
netstat -s -u
IcmpMsg:
InType3: 53854
InType5: 29
InType8: 103009
InType11: 1912
OutType0: 103009
OutType3: 752672
Udp:
35873971 packets received
1116273 packets to unknown port received.
127929 packet receive errors
57372802 packets sent
127819 receive buffer errors
0 send buffer errors
IgnoredMulti: 13692
UdpLite:
IpExt:
InMcastPkts: 32074
OutMcastPkts: 8271
InBcastPkts: 13692
InOctets: 1378192101
OutOctets: -1523059425
InMcastOctets: 2571852
OutMcastOctets: 658752
InBcastOctets: 2262805
InNoECTPkts: 41282534
InECT0Pkts: 5512
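To put the Udp counters above in proportion, the receive buffer errors can be compared with the packets received. This is only a minimal sketch and not something posted in the thread: the function name `udp_counters` and the `SAMPLE` text are illustrative, and it assumes the net-tools `netstat -s -u` layout shown above.

```python
import re

def udp_counters(netstat_text):
    """Collect the counters listed under the 'Udp:' header of `netstat -s -u` output."""
    counters = {}
    in_udp = False
    for line in netstat_text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":") and " " not in stripped:
            # Section header such as "Udp:" or "UdpLite:".
            in_udp = (stripped == "Udp:")
            continue
        if in_udp:
            match = re.match(r"(\d+)\s+(.*)", stripped)
            if match:
                counters[match.group(2).rstrip(".")] = int(match.group(1))
    return counters

# Sample copied from the Udp section of the output above.
SAMPLE = """\
Udp:
35873971 packets received
1116273 packets to unknown port received.
127929 packet receive errors
57372802 packets sent
127819 receive buffer errors
0 send buffer errors
IgnoredMulti: 13692
UdpLite:
"""

counters = udp_counters(SAMPLE)
ratio = counters["receive buffer errors"] / counters["packets received"]
print(f"receive buffer errors: {ratio:.2%} of packets received")
```

Running it again on a later `netstat -s -u` capture would show whether the error count is still growing or dates from an earlier period.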
ifconfig eth0
eth0 Link encap:Ethernet HWaddr D4:7C:44:D2:5B:69
inet addr:IP Bcast: Mask:255.255.255.0
inet6 addr: IPV6/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:41340621 errors:0 dropped:10 overruns:0 frame:0
TX packets:61277449 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6253258711 (5.8 GiB) TX bytes:12207575731 (11.3 GiB)
Memory:92200000-9227ffff
cat /proc/770859/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 102400000 102400000 bytes
Max core file size unlimited unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 127717 127717 processes
Max open files 1048576 1048576 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 127717 127717 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
The version you have is quite old. I've been using TC and bind as my internet facing domain DNS for years without any problems. Are you using iptables with logging also? The system logs from that could be quite helpful. Some information about your configuration would be helpful, such as if you are using master/slave replication, a chroot jail, split horizon, etc. I should have an updated extension for bind 9.18 available for TC 14 this week. I need to do some more testing. Would you be able to upgrade once the extensions are available on the mirrors?

Thanks for your reply. The 9.11 version has not encountered similar issues on other Linux systems, but has encountered this issue on TC. I think it may not necessarily be a bind issue, because when the fault occurs, the jobs that cron runs every minute (logged in crond.log) stop running. A new SSH connection from a client also hangs, and the root prompt does not appear until the fault clears on its own. Do you have any other suggestions for what to look at?
... It seems that when a malfunction occurs, the entire system seems to freeze, with cron not executing, SSH new connections waiting, and existing SSH connections functioning normally.

If the system is still responsive enough to run commands from a terminal, here ...
for i in `seq 1 1 5`; do sleep 1; ps aux > ps"$i".txt; done
grep -v "0.0 0.0 0 0" ps1.txt | less
Entries that have %CPU %MEM VSZ RSS all set to zero will be filtered out.
free -m
Look at the -/+ buffers/cache: row. If its free column is approaching zero, your ...
vmstat 1
Look at the si and so columns to see if the system is busy swapping.

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 288374 18.9 15.4 5980300 5082544 ? Ssl 00:01 158:24 /opt/bind9/sbin/named -u named -c /opt/bind9/etc/named.conf
Due to loading RPZ in multiple regions, it takes up a lot of memory.
root@AAA:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 24639664 3692 2108904 0 0 52 79 74 63 4 3 93 1 0
0 0 0 24639656 3692 2108976 0 0 0 0 3735 16767 5 5 90 0 0
0 0 0 24639800 3700 2109076 0 0 0 432 3609 16210 6 4 90 1 0
0 0 0 24639800 3700 2109128 0 0 0 0 3821 16340 5 4 92 0 0
0 0 0 24639800 3700 2109200 0 0 0 32 3671 16029 5 4 91 0 0
0 0 0 24639800 3700 2109304 0 0 0 0 3717 15870 5 4 91 0 0
0 0 0 24639800 3700 2109372 0 0 0 0 3869 17196 5 5 90 0 0
0 0 0 24639548 3708 2109464 0 0 0 428 3996 17018 5 5 89 1 0
0 0 0 24639296 3708 2109560 0 0 0 0 3992 18611 5 4 91 0 0
0 0 0 24639296 3708 2109680 0 0 0 0 4203 18668 5 4 91 0 0
^C
root@AAA:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 24609900 3792 2110680 0 0 52 79 75 64 4 3 93 1 0
0 0 0 24605324 3792 2110664 0 0 0 0 398 281 1 1 98 0 0
0 0 0 24605072 3792 2110664 0 0 0 0 235 27 0 0 100 0 0
0 0 0 24604568 3792 2110664 0 0 0 0 231 52 0 0 100 0 0
0 0 0 24604316 3792 2110644 0 0 0 0 263 55 0 0 100 0 0
0 0 0 24603812 3792 2110644 0 0 0 0 232 55 0 0 100 0 0
0 0 0 24603308 3792 2110652 0 0 0 104 263 68 0 0 100 0 0
0 0 0 24602552 3792 2110652 0 0 0 0 301 101 0 0 100 0 0
0 0 0 24602300 3792 2110652 0 0 0 0 269 62 0 0 100 0 0
0 0 0 24602048 3792 2110652 0 0 0 0 278 64 0 0 100 0 0
root@AAA:~# free -m
total used free shared buff/cache available
Mem: 32169 6044 24065 1907 2058 23810
Swap: 8191 0 8191
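The two vmstat runs above differ mostly in the in (interrupts) and cs (context switches) columns, so averaging them makes the contrast easier to see. The following is a rough sketch under assumptions: `vmstat_means`, `NORMAL` and `FAULT` are illustrative names, the samples are the first few rows of each run, and labelling the second run as the fault follows the later reply in the thread rather than anything in the output itself.

```python
def vmstat_means(vmstat_text, columns=("in", "cs", "id")):
    """Average selected columns of `vmstat 1` output, skipping the first sample."""
    header = None
    rows = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            header = fields  # column-name line: r b swpd free ...
        elif header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    samples = rows[1:]  # the first row reports averages since boot
    if not samples:
        raise ValueError("need at least two vmstat samples")
    return {c: sum(r[c] for r in samples) / len(samples) for c in columns}

HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st\n"

# First rows of the first run above.
NORMAL = HEADER + """\
1 0 0 24639664 3692 2108904 0 0 52 79 74 63 4 3 93 1 0
0 0 0 24639656 3692 2108976 0 0 0 0 3735 16767 5 5 90 0 0
0 0 0 24639800 3700 2109076 0 0 0 432 3609 16210 6 4 90 1 0
0 0 0 24639800 3700 2109128 0 0 0 0 3821 16340 5 4 92 0 0
"""

# First rows of the second run above.
FAULT = HEADER + """\
1 0 0 24609900 3792 2110680 0 0 52 79 75 64 4 3 93 1 0
0 0 0 24605324 3792 2110664 0 0 0 0 398 281 1 1 98 0 0
0 0 0 24605072 3792 2110664 0 0 0 0 235 27 0 0 100 0 0
0 0 0 24604568 3792 2110664 0 0 0 0 231 52 0 0 100 0 0
"""

normal = vmstat_means(NORMAL)
fault = vmstat_means(FAULT)
print("normal:", normal)
print("fault: ", fault)
```

A drop of this size in both columns with no swap activity (si and so stay at zero) is what the sketch is meant to surface. It does not say why the system went quiet.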
[code][ 36.176529] pcm512x 1-004d: Failed to get supply 'AVDD': -517
[ 36.176536] pcm512x 1-004d: Failed to get supplies: -517
[ 36.191753] pcm512x 1-004d: Failed to get supply 'AVDD': -517[/code]
Hi zbs888
Are you running this in some kind of virtual environment
like qemu, vmware, etc.? Or maybe chroot?

No, sir.

Is the console of the physical machine showing the same slowness in responding as the network connections, or is it just the networking part that seems to be having problems? Do you have access to another computer on the same network running wireshark or tcpdump?

Thanks for your reply.
dmesg
[150339.863197] myshell (527294): drop_caches: 3
[150844.672845] myshell (532878): drop_caches: 3
[151455.711752] myshell (538907): drop_caches: 3
[152056.927311] myshell (547089): drop_caches: 3
[152763.928666] myshell (556228): drop_caches: 3
[153244.841877] myshell (566220): drop_caches: 3
[153850.340627] myshell (576719): drop_caches: 3
tail /var/log/kernel.log
Apr 19 09:40:07 localhost kernel: myshell (521604): drop_caches: 3
Apr 19 09:51:41 localhost kernel: myshell (527294): drop_caches: 3
Apr 19 10:00:06 localhost kernel: myshell (532878): drop_caches: 3
Apr 19 10:10:17 localhost kernel: myshell (538907): drop_caches: 3
Apr 19 10:20:18 localhost kernel: myshell (547089): drop_caches: 3
Apr 19 10:32:05 localhost kernel: myshell (556228): drop_caches: 3
Apr 19 10:40:06 localhost kernel: myshell (566220): drop_caches: 3
Apr 19 10:50:12 localhost kernel: myshell (576719): drop_caches: 3
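The spacing between the drop_caches entries can be read straight from the dmesg timestamps. Here is a small sketch, assuming the `[seconds] name (pid): drop_caches: N` line format shown above. The names `DROP_RE` and `drop_cache_intervals` are made up for illustration.

```python
import re

# Matches lines such as "[150339.863197] myshell (527294): drop_caches: 3".
DROP_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\S+ \(\d+\): drop_caches: (\d)")

def drop_cache_intervals(dmesg_text):
    """Return the seconds elapsed between consecutive drop_caches entries."""
    times = [float(m.group(1)) for m in DROP_RE.finditer(dmesg_text)]
    return [round(later - earlier, 1) for earlier, later in zip(times, times[1:])]

# First three entries from the dmesg output above.
SAMPLE = """\
[150339.863197] myshell (527294): drop_caches: 3
[150844.672845] myshell (532878): drop_caches: 3
[151455.711752] myshell (538907): drop_caches: 3
"""

intervals = drop_cache_intervals(SAMPLE)
print(intervals)
```

Intervals of roughly 500 to 600 seconds are consistent with a job scheduled about every ten minutes.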
... (fault)
Based on this it almost looks like the system is sleeping.
root@AAA:~# vmstat 1
...
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 24609900 3792 2110680 0 0 52 79 75 64 4 3 93 1 0
0 0 0 24605324 3792 2110664 0 0 0 0 398 281 1 1 98 0 0
0 0 0 24605072 3792 2110664 0 0 0 0 235 27 0 0 100 0 0
0 0 0 24604568 3792 2110664 0 0 0 0 231 52 0 0 100 0 0
0 0 0 24604316 3792 2110644 0 0 0 0 263 55 0 0 100 0 0
0 0 0 24603812 3792 2110644 0 0 0 0 232 55 0 0 100 0 0
0 0 0 24603308 3792 2110652 0 0 0 104 263 68 0 0 100 0 0
0 0 0 24602552 3792 2110652 0 0 0 0 301 101 0 0 100 0 0
0 0 0 24602300 3792 2110652 0 0 0 0 269 62 0 0 100 0 0
0 0 0 24602048 3792 2110652 0 0 0 0 278 64 0 0 100 0 0
What I'm really trying to do is figure out what kind of problem you are having. I haven't heard anything yet which tells me for sure it's a computer problem or a networking problem.

Dear sir. My problem is on my computer: the bind service is intermittently unreachable. This may look like a bind issue, but after some testing I don't think it is. For example, when the problem occurs, setting rules with iptables may hang, but iptables -L -nv can still be executed. The jobs that run every minute (logged in crond.log) stop running, and existing SSH connections keep working, but creating a new SSH connection hangs. I currently have no clue where to start, so please give me some advice.
...
Why are you constantly dropping caches?
tail /var/log/kernel.log
...
Apr 19 09:40:07 localhost kernel: myshell (521604): drop_caches: 3
Apr 19 09:51:41 localhost kernel: myshell (527294): drop_caches: 3
Apr 19 10:00:06 localhost kernel: myshell (532878): drop_caches: 3
Apr 19 10:10:17 localhost kernel: myshell (538907): drop_caches: 3
Apr 19 10:20:18 localhost kernel: myshell (547089): drop_caches: 3
Apr 19 10:32:05 localhost kernel: myshell (556228): drop_caches: 3
Apr 19 10:40:06 localhost kernel: myshell (566220): drop_caches: 3
Apr 19 10:50:12 localhost kernel: myshell (576719): drop_caches: 3
Hi zbs888
Hi, Rich, thanks for your help.
... is beneficial for keeping the system with more cache to handle other things.

The problem with that is it clears the entire cache. You might be clearing large
amounts of frequently used data when you do this. The system will then need to
fetch that data again which may impact the system's response time.
Your system appears to have about 32 Gbytes of RAM, so I doubt you will run out of cache.
Hi zbs888
Thanks, Rich. I removed drop_caches. Is there anything else to pay attention to when checking the cache script that still has issues?

... Do you mean this problem occurs because I clear the cache every 10 minutes? ...
I don't know that, but it may be contributing to the problem.
The operating system will flush older unused entries on its own if it needs more space.
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 24534844 8176 2198516 0 0 47 84 36 22 4 3 92 1 0
0 0 0 24534844 8176 2198516 0 0 0 0 215 60 0 0 100 0 0
0 0 0 24534844 8176 2198516 0 0 0 0 240 52 0 0 100 0 0
0 0 0 24534844 8176 2198516 0 0 0 0 253 44 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 8 305 74 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 0 319 82 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 0 370 73 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 0 363 48 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 0 409 68 0 0 100 0 0
0 0 0 24534592 8176 2198516 0 0 0 0 389 86 0 0 100 0 0
0 0 0 24534592 8180 2198512 0 0 0 4 322 83 0 0 99 0 0
0 0 0 24534592 8180 2198512 0 0 0 0 373 70 0 0 100 0 0
0 0 0 24534592 8180 2198512 0 0 0 0 324 52 0 0 100 0 0
0 0 0 24534592 8180 2198512 0 0 0 0 278 39 0 0 100 0 0
Apr 20 09:20:00 localhost crond[2490]: USER root pid 408819 cmd /root/shell1
Apr 20 09:21:00 localhost crond[2490]: USER root pid 412708 cmd /root/shell2
Apr 20 09:21:00 localhost crond[2490]: USER root pid 412709 cmd /root/shell3
Apr 20 09:21:00 localhost crond[2490]: USER root pid 412710 cmd /root/shell4
Apr 20 09:22:00 localhost crond[2490]: USER root pid 413146 cmd /root/shell2
Apr 20 09:22:00 localhost crond[2490]: USER root pid 413147 cmd /root/shell3
Apr 20 09:22:00 localhost crond[2490]: USER root pid 413148 cmd /root/shell4
Apr 20 09:23:00 localhost crond[2490]: USER root pid 413589 cmd /root/shell2
Apr 20 09:26:25 localhost crond[2490]: USER root pid 413900 cmd /root/shell3
Apr 20 09:26:25 localhost crond[2490]: USER root pid 413902 cmd /root/shell4
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell3
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell4
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell3
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell4
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell3
Apr 20 09:26:30 localhost crond[2490]: user root: process already running: /root/shell4
Apr 20 09:26:30 localhost crond[2490]: USER root pid 414214 cmd /root/shell5
Apr 20 09:26:30 localhost crond[2490]: USER root pid 414215 cmd /root/shell6
Apr 20 09:26:30 localhost crond[2490]: USER root pid 414216 cmd /root/shell2
Apr 20 09:27:00 localhost crond[2490]: USER root pid 414538 cmd /root/shell2
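Since cron normally logs at least one job per minute here, a stall shows up as a gap between consecutive crond timestamps. The sketch below looks for such gaps. It is illustrative only: `cron_gaps` and the 90 second threshold are assumptions, and because syslog lines carry no year, the `year` argument is a placeholder needed only to build a full date.

```python
from datetime import datetime

def cron_gaps(log_text, year=2023, threshold_s=90):
    """Return (start, end, seconds) for pauses between crond log entries."""
    stamps = []
    for line in log_text.splitlines():
        if "crond[" not in line:
            continue
        # The first 15 characters hold the syslog timestamp, e.g. "Apr 20 09:23:00".
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        if not stamps or stamp != stamps[-1]:
            stamps.append(stamp)
    return [
        (earlier, later, int((later - earlier).total_seconds()))
        for earlier, later in zip(stamps, stamps[1:])
        if (later - earlier).total_seconds() > threshold_s
    ]

# A few lines copied from the crond log above.
SAMPLE = """\
Apr 20 09:22:00 localhost crond[2490]: USER root pid 413146 cmd /root/shell2
Apr 20 09:23:00 localhost crond[2490]: USER root pid 413589 cmd /root/shell2
Apr 20 09:26:25 localhost crond[2490]: USER root pid 413900 cmd /root/shell3
Apr 20 09:26:30 localhost crond[2490]: USER root pid 414214 cmd /root/shell5
Apr 20 09:27:00 localhost crond[2490]: USER root pid 414538 cmd /root/shell2
"""

gaps = cron_gaps(SAMPLE)
for start, end, seconds in gaps:
    print(f"no crond entries between {start:%H:%M:%S} and {end:%H:%M:%S} ({seconds} s)")
```

In the log above, the 09:23 jobs for shell3 and shell4 do not appear until 09:26:25, which is the kind of pause this reports.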
... Is there anything else to pay attention to when checking the cache script that still has issues? ...
I haven't seen your cache script so I can't answer that.

Hi zbs888
Sorry, sir. What I mean is, what other commands do I need to use to check for potential issues? I don't have a clue at all.

This fault occurs every few tens of minutes. Below is some information about the time the fault occurred. Could you please help me take a look.
Add the boot code syslog to your boot loader.
When the system faults, grab a copy of /var/log/messages and
attach it to your next post. Maybe it will contain something interesting.
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz64 panic=3 noapic acpi=ht console=ttyS1,115200 console=ttyS0,115200
tail /var/log/messages.log
Apr 20 10:17:16 localhost named[2441]: none:104: 'max-cache-size 90%' - setting to 28952MB (out of 32169MB)
Apr 20 10:17:16 localhost named[2441]: set up managed keys zone for view UNICOM-user, file 'ab05e628ff9a962c.mkeys'
Apr 20 10:17:16 localhost named[2441]: none:104: 'max-cache-size 90%' - setting to 28952MB (out of 32169MB)
Apr 20 10:17:16 localhost named[2441]: set up managed keys zone for view CMCC-user, file '0d52e9253c2aad60.mkeys'
Apr 20 10:17:16 localhost named[2441]: none:104: 'max-cache-size 90%' - setting to 28952MB (out of 32169MB)
Apr 20 10:17:16 localhost named[2441]: set up managed keys zone for view any-user, file 'any-user.mkeys'
Apr 20 10:17:16 localhost named[2441]: none:104: 'max-cache-size 90%' - setting to 28952MB (out of 32169MB)
Apr 20 10:17:16 localhost named[2441]: command channel listening on 127.0.0.1#953
Apr 20 10:17:17 localhost kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 20 10:18:05 localhost kernel: floppy0: no floppy controllers found
tail /var/log/kernel.log
Apr 20 10:16:58 localhost kernel: igb 0000:02:00.0 eth0: renamed from lan1
Apr 20 10:16:58 localhost kernel: igb 0000:03:00.0 eth1: renamed from lan2
Apr 20 10:16:58 localhost kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 20 10:17:06 localhost kernel: NET: Registered protocol family 10
Apr 20 10:17:06 localhost kernel: Segment Routing with IPv6
Apr 20 10:17:09 localhost kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 20 10:17:09 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Apr 20 10:17:13 localhost kernel: ip_local_port_range: prefer different parity for start/end values.
Apr 20 10:17:17 localhost kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 20 10:18:05 localhost kernel: floppy0: no floppy controllers found
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz64 panic=3 console=ttyS1,115200 console=ttyS0,115200 syslog processor.max_cstate=0 nohz=off intel_idle.max_cstate=0 idle=halt idle=nomwait selinux=0
tail kernel.log
Apr 24 10:47:09 localhost kernel: igb 0000:03:00.0 eth2: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 24 10:47:10 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
Apr 24 10:47:14 localhost kernel: ip_local_port_range: prefer different parity for start/end values.
Apr 24 10:47:17 localhost kernel: igb 0000:03:00.0 eth2: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 24 10:48:05 localhost kernel: floppy0: no floppy controllers found
Apr 24 11:23:25 localhost kernel: igb 0000:03:00.0 eth2: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 24 11:25:20 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 11:44:53 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 12:43:48 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 13:50:22 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
# cat /proc/cpuinfo | grep 'cpu M'
cpu MHz : 968.333
cpu MHz : 974.472
cpu MHz : 961.290
cpu MHz : 949.197
cpu MHz : 955.121
cpu MHz : 959.014
cpu MHz : 963.917
cpu MHz : 940.815
# cat /proc/cpuinfo | grep 'cpu M'
cpu MHz : 1073.077
cpu MHz : 1087.866
cpu MHz : 1061.614
cpu MHz : 1079.117
cpu MHz : 1069.619
cpu MHz : 1057.048
cpu MHz : 1095.803
cpu MHz : 1074.776
# cat /proc/cpuinfo | grep 'cpu M'
cpu MHz : 1040.258
cpu MHz : 1062.527
cpu MHz : 1022.373
cpu MHz : 998.141
cpu MHz : 977.028
cpu MHz : 1057.740
cpu MHz : 1035.175
cpu MHz : 1020.519
# cat /sys/devices/system/cpu/cpuidle/current_driver
none
cat /sys/module/intel_idle/parameters/max_cstate
0
Hi zbs888
----- Snip -----
I added syslog in grub.conf. When the system faults, /var/log/messages and kernel.log show nothing. ...

Are you saying /var/log/messages is empty?
Or
Are you saying you looked at /var/log/messages and decided it contained nothing important?

That looks kind of interesting.
Apr 24 11:25:20 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 11:44:53 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 12:43:48 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
Apr 24 13:50:22 localhost kernel: TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending cookies. Check SNMP counters.
What does this command return:
sudo sysctl -a 2>&1 | grep -Ei "_syn|backlog|somax|abort"

Hi, Rich. /var/log/messages only has the normal named startup messages, nothing about any errors.
sysctl -a 2>&1 | grep -Ei "_syn|backlog|somax|abort"
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.ipv4.fib_sync_mem = 524288
net.ipv4.tcp_abort_on_overflow = 1
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syncookies = 1
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 5
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 5
Hi zbs888
Hi, Rich.

Most of that looks OK. net.ipv4.tcp_abort_on_overflow should
probably be set to zero:
sudo sysctl -w net.ipv4.tcp_abort_on_overflow=0
Unless you really need IPv6 try adding the boot code:
disable_ipv6=1
You might want to check the config files for software that uses
the network connection.
Look for settings that involve:
number of connections
queues
backlog
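The sysctl output posted earlier can be turned into a dictionary and checked against the suggestion above. What follows is a minimal sketch, with `parse_sysctl` and `SAMPLE` as illustrative names. It only covers the one setting the reply singles out and assumes the `key = value` format that `sysctl -a` prints.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style "key = value" lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

# A few of the values reported earlier in the thread.
SAMPLE = """\
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.ipv4.tcp_abort_on_overflow = 1
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syncookies = 1
"""

settings = parse_sysctl(SAMPLE)
if settings.get("net.ipv4.tcp_abort_on_overflow") != "0":
    # With this set to 1 the kernel resets connections when the accept
    # queue overflows instead of silently dropping them.
    print("net.ipv4.tcp_abort_on_overflow is not 0; the reply above suggests 0")
```

The same dictionary could be compared with the listen backlog configured in the software that uses the network connection, once those config values are known.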
/var/log/kernel.log
Apr 25 15:31:23 localhost kernel: mce: CPU3: Core temperature above threshold, cpu clock throttled (total events = 3167)
Apr 25 15:31:23 localhost kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4653)
Apr 25 15:31:23 localhost kernel: mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 4653)
Apr 25 15:31:23 localhost kernel: mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 4653)
Apr 25 15:31:23 localhost kernel: mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 4647)
Apr 25 15:31:23 localhost kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4650)
Apr 25 15:31:23 localhost kernel: mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 4651)
Apr 25 15:31:23 localhost kernel: mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4652)
Apr 25 15:31:23 localhost kernel: mce: CPU1: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU7: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU2: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU4: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU5: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU6: Package temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU3: Core temperature/speed normal
Apr 25 15:31:23 localhost kernel: mce: CPU0: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)
Apr 25 16:01:28 localhost kernel: mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4988)
Apr 25 16:01:28 localhost kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4989)
Apr 25 16:01:28 localhost kernel: mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 4988)
Apr 25 16:01:28 localhost kernel: mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 4989)
Apr 25 16:01:28 localhost kernel: mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 4987)
Apr 25 16:01:28 localhost kernel: mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 4989)
Apr 25 16:01:28 localhost kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4986)
Apr 25 16:01:28 localhost kernel: mce: CPU6: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU3: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU1: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU0: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU5: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU4: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU2: Core temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU7: Package temperature/speed normal
Apr 25 16:26:05 localhost kernel: floppy0: no floppy controllers found
Apr 25 16:33:43 localhost kernel: floppy0: no floppy controllers found
Apr 26 00:01:03 localhost kernel: mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 287)
Apr 26 00:01:03 localhost kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4989)
Apr 26 00:01:03 localhost kernel: mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 4992)
Apr 26 00:01:03 localhost kernel: mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 4990)
Apr 26 00:01:03 localhost kernel: mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 4985)
Apr 26 00:01:03 localhost kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4992)
Apr 26 00:01:03 localhost kernel: mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4991)
Apr 26 00:01:03 localhost kernel: mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 4991)
Apr 26 00:01:03 localhost kernel: mce: CPU5: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU4: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU7: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU3: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU2: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU6: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU0: Core temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU1: Package temperature/speed normal
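The throttle messages above arrive in bursts that share one timestamp, so grouping them by timestamp gives a compact list of when throttling was logged. This is a sketch under assumptions: `THROTTLE_RE` and `throttle_bursts` are illustrative names, and the pattern assumes the syslog line format shown in kernel.log above.

```python
import re
from collections import Counter

# Matches e.g. "Apr 25 15:31:23 localhost kernel: mce: CPU3: Core temperature above threshold, ..."
THROTTLE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: mce: CPU(\d+): "
    r"(Core|Package) temperature above threshold",
    re.MULTILINE,
)

def throttle_bursts(log_text):
    """Count 'temperature above threshold' lines per syslog timestamp."""
    return Counter(m.group(1) for m in THROTTLE_RE.finditer(log_text))

# A few lines copied from kernel.log above.
SAMPLE = """\
Apr 25 15:31:23 localhost kernel: mce: CPU3: Core temperature above threshold, cpu clock throttled (total events = 3167)
Apr 25 15:31:23 localhost kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4653)
Apr 25 15:31:23 localhost kernel: mce: CPU1: Package temperature/speed normal
Apr 25 16:01:28 localhost kernel: mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)
"""

bursts = throttle_bursts(SAMPLE)
for stamp, count in bursts.items():
    print(f"{stamp}: {count} throttle message(s)")
```

Lining these timestamps up against the times the fault was observed would show whether the two coincide. The sketch itself makes no claim that they do.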
/var/log/messages.log
Apr 26 00:01:02 localhost named[483136]: compiled by GCC 9.2.0
Apr 26 00:01:02 localhost named[483136]: compiled with OpenSSL version: OpenSSL 1.1.1f 31 Mar 2020
Apr 26 00:01:02 localhost named[483136]: linked to OpenSSL version: OpenSSL 1.1.1f 31 Mar 2020
Apr 26 00:01:02 localhost named[483136]: compiled with libxml2 version: 2.9.7
Apr 26 00:01:02 localhost named[483136]: linked to libxml2 version: 20907
Apr 26 00:01:02 localhost named[483136]: compiled with libjson-c version: 0.11
Apr 26 00:01:02 localhost named[483136]: linked to libjson-c version: 0.11
Apr 26 00:01:02 localhost named[483136]: compiled with zlib version: 1.2.11
Apr 26 00:01:02 localhost named[483136]: linked to zlib version: 1.2.11
Apr 26 00:01:02 localhost named[483136]: threads support is enabled
Apr 26 00:01:02 localhost named[483136]: ----------------------------------------------------
Apr 26 00:01:02 localhost named[483136]: BIND 9 is maintained by Internet Systems Consortium,
Apr 26 00:01:02 localhost named[483136]: Inc. (ISC), a non-profit 501(c)(3) public-benefit
Apr 26 00:01:02 localhost named[483136]: corporation. Support and training for BIND 9 are
Apr 26 00:01:02 localhost named[483136]: available at https://www.isc.org/support
Apr 26 00:01:02 localhost named[483136]: ----------------------------------------------------
Apr 26 00:01:02 localhost named[483136]: adjusted limit on open files from 100000 to 1048576
Apr 26 00:01:02 localhost named[483136]: found 8 CPUs, using 8 worker threads
Apr 26 00:01:02 localhost named[483136]: using 7 UDP listeners per interface
Apr 26 00:01:02 localhost named[483136]: using up to 21000 sockets
Apr 26 00:01:02 localhost named[483136]: loading configuration from '/opt/bind9/etc/named.conf'
Apr 26 00:01:02 localhost named[483136]: reading built-in trust anchors from file '/opt/bind9/etc/bind.keys'
Apr 26 00:01:02 localhost named[483136]: using default UDP/IPv4 port range: [9000, 65000]
Apr 26 00:01:02 localhost named[483136]: using default UDP/IPv6 port range: [9000, 65000]
Apr 26 00:01:02 localhost named[483136]: listening on IPv6 interfaces, port 53
Apr 26 00:01:02 localhost named[483136]: listening on IPv4 interface lo, 127.0.0.1#53
Apr 26 00:01:02 localhost named[483136]: listening on IPv4 interface lo:1, 202.194.98.98#53
Apr 26 00:01:02 localhost named[483136]: listening on IPv4 interface eth2, 202.194.97.134#53
Apr 26 00:01:02 localhost named[483136]: listening on IPv4 interface eth3, 192.168.1.110#53
Apr 26 00:01:02 localhost named[483136]: generating session key for dynamic DNS
Apr 26 00:01:02 localhost named[483136]: sizing zone task pool based on 2845 zones
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view SDCERNET-user, file 'aafee67691ba58de.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view EDU-user, file 'd0cc50c716520045.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view TELECOM-user, file '2ff95fc2a86c198f.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view UNICOM-user, file 'ab05e628ff9a962c.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view CMCC-user, file '0d52e9253c2aad60.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: set up managed keys zone for view any-user, file 'any-user.mkeys'
Apr 26 00:01:02 localhost named[483136]: none:104: 'max-cache-size 90%' - setting to 57995MB (out of 64439MB)
Apr 26 00:01:02 localhost named[483136]: command channel listening on 127.0.0.1#953
Apr 26 00:01:03 localhost kernel: mce: CPU5: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU4: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU7: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU3: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU2: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU6: Package temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU0: Core temperature/speed normal
Apr 26 00:01:03 localhost kernel: mce: CPU1: Package temperature/speed normal
dmesg
[23726.715413] mce: CPU0: Package temperature/speed normal
[25531.425320] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)
[25531.425321] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4988)
[25531.425322] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4989)
[25531.425363] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 4988)
[25531.425364] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 4989)
[25531.425366] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 4987)
[25531.425367] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 4989)
[25531.425368] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4986)
[25531.426350] mce: CPU6: Package temperature/speed normal
[25531.426352] mce: CPU3: Package temperature/speed normal
[25531.426353] mce: CPU1: Package temperature/speed normal
[25531.426353] mce: CPU0: Package temperature/speed normal
[25531.426354] mce: CPU5: Package temperature/speed normal
[25531.426355] mce: CPU4: Package temperature/speed normal
[25531.432933] mce: CPU2: Core temperature/speed normal
[25531.440852] mce: CPU7: Package temperature/speed normal
[27008.886266] floppy0: no floppy controllers found
[27466.693450] floppy0: no floppy controllers found
[54306.838281] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 287)
[54306.838282] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4989)
[54306.838283] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 4992)
[54306.838284] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 4990)
[54306.838321] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 4985)
[54306.838323] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 4992)
[54306.838324] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4991)
[54306.838325] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 4991)
[54306.839300] mce: CPU5: Package temperature/speed normal
[54306.839301] mce: CPU4: Package temperature/speed normal
[54306.839322] mce: CPU7: Package temperature/speed normal
[54306.839359] mce: CPU3: Package temperature/speed normal
[54306.839359] mce: CPU2: Package temperature/speed normal
[54306.839360] mce: CPU6: Package temperature/speed normal
[54306.845887] mce: CPU0: Core temperature/speed normal
[54306.853799] mce: CPU1: Package temperature/speed normal
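A rough way to check how far apart the overheating bursts in this dmesg output really are is to pull the uptime stamps out of the "above threshold" lines. This is only a sketch in Python: the function name `throttle_bursts`, the 60 second grouping window, and the shortened `sample_lines` excerpt are assumptions made for illustration, not anything provided by the kernel or by TC.

```python
import re

# Matches lines such as:
# [25531.425320] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)
THROTTLE_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+mce: CPU(\d+): (Core|Package) temperature above threshold"
)


def throttle_bursts(lines, gap_seconds=60.0):
    """Group throttle events into bursts separated by more than gap_seconds."""
    bursts = []
    for line in lines:
        match = THROTTLE_RE.match(line)
        if not match:
            continue  # skip "speed normal", floppy0 and other lines
        stamp = float(match.group(1))  # seconds since boot
        if bursts and stamp - bursts[-1]["end"] <= gap_seconds:
            bursts[-1]["end"] = stamp
            bursts[-1]["events"] += 1
        else:
            bursts.append({"start": stamp, "end": stamp, "events": 1})
    return bursts


# Shortened excerpt of the dmesg output above (assumption: the full log is
# normally read from a file or from the output of `dmesg`).
sample_lines = [
    "[23726.715413] mce: CPU0: Package temperature/speed normal",
    "[25531.425320] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)",
    "[25531.425321] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 4988)",
    "[25531.426350] mce: CPU6: Package temperature/speed normal",
    "[27008.886266] floppy0: no floppy controllers found",
    "[54306.838281] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 287)",
    "[54306.838282] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 4989)",
]

bursts = throttle_bursts(sample_lines)
for previous, current in zip(bursts, bursts[1:]):
    gap_hours = (current["start"] - previous["start"]) / 3600.0
    print(f"burst at {current['start']:.0f}s, {gap_hours:.2f} h after the previous one")
```

In the excerpt shown, the two bursts start at about 25531 s and 54306 s of uptime, which is roughly eight hours apart. Comparing that with the kernel.log timestamps may help decide whether the two logs are really telling the same story.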
Today's faults occurred at:
2023-04-26 01:25:00-@127.0.0.1
2023-04-26 01:37:00-@127.0.0.1
2023-04-26 01:38:00-@127.0.0.1
2023-04-26 01:39:00-@127.0.0.1
2023-04-26 01:40:00-@127.0.0.1
2023-04-26 01:45:00-@127.0.0.1
2023-04-26 01:47:00-@127.0.0.1
2023-04-26 01:48:00-@127.0.0.1
2023-04-26 01:49:00-@127.0.0.1
2023-04-26 01:50:00-@127.0.0.1
2023-04-26 02:09:00-@127.0.0.1
2023-04-26 02:10:00-@127.0.0.1
2023-04-26 02:14:00-@127.0.0.1
2023-04-26 02:15:00-@127.0.0.1
2023-04-26 03:04:00-@127.0.0.1
2023-04-26 03:05:00-@127.0.0.1
2023-04-26 04:03:01-@127.0.0.1
2023-04-26 04:04:01-@127.0.0.1
2023-04-26 04:05:01-@127.0.0.1
2023-04-26 04:18:01-@127.0.0.1
2023-04-26 04:19:01-@127.0.0.1
2023-04-26 04:20:01-@127.0.0.1
2023-04-26 05:54:01-@127.0.0.1
2023-04-26 05:55:01-@127.0.0.1
2023-04-26 07:19:01-@127.0.0.1
2023-04-26 07:20:01-@127.0.0.1
2023-04-26 07:40:01-@127.0.0.1
2023-04-26 08:00:01-@127.0.0.1
2023-04-26 08:24:01-@127.0.0.1
2023-04-26 08:25:01-@127.0.0.1
2023-04-26 08:43:01-@127.0.0.1
2023-04-26 08:44:01-@127.0.0.1
2023-04-26 08:45:01-@127.0.0.1
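To put numbers on how often the freezes happen and how long they last, the timestamps above can be merged into episodes. The following Python sketch assumes each line marks one minute in which the check failed (the thread does not say exactly how the list was produced), and the names `parse_fault` and `group_episodes` as well as the two minute merge window are hypothetical choices.

```python
from datetime import datetime, timedelta


def parse_fault(line):
    """Parse a line like '2023-04-26 01:25:00-@127.0.0.1' into a datetime."""
    stamp = line.split("-@", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


def group_episodes(times, max_gap=timedelta(minutes=2)):
    """Merge timestamps that are at most max_gap apart into (start, end) pairs."""
    episodes = []
    for moment in sorted(times):
        if episodes and moment - episodes[-1][1] <= max_gap:
            episodes[-1][1] = moment  # extend the current episode
        else:
            episodes.append([moment, moment])  # start a new episode
    return [(start, end) for start, end in episodes]


# First ten entries of the list above.
fault_lines = [
    "2023-04-26 01:25:00-@127.0.0.1",
    "2023-04-26 01:37:00-@127.0.0.1",
    "2023-04-26 01:38:00-@127.0.0.1",
    "2023-04-26 01:39:00-@127.0.0.1",
    "2023-04-26 01:40:00-@127.0.0.1",
    "2023-04-26 01:45:00-@127.0.0.1",
    "2023-04-26 01:47:00-@127.0.0.1",
    "2023-04-26 01:48:00-@127.0.0.1",
    "2023-04-26 01:49:00-@127.0.0.1",
    "2023-04-26 01:50:00-@127.0.0.1",
]

episodes = group_episodes(parse_fault(line) for line in fault_lines)
for start, end in episodes:
    print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"))
```

With a two minute window, the first ten entries collapse into three episodes: 01:25, 01:37 to 01:40, and 01:45 to 01:50. Running it over the whole list gives the gaps between episodes, which can then be compared with the times of the mce messages.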
... At present, the frequency of BIND and CROND service freezes is about 30-40 minutes, ...
The system overheating events are 30 minutes apart. I'd call that a pretty good clue.
/var/log/kernel.log
Apr 25 15:31:23 localhost kernel: mce: CPU3: Core temperature above threshold, cpu clock throttled (total events = 3167)
----- Snip -----
Apr 25 16:01:28 localhost kernel: mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 262)
----- Snip -----
... lasting 4-5 minutes each time. However, this frequency does not match the mce information in the kernel file, which is really confusing. ...
If the system is overheating and malfunctioning, I wouldn't necessarily expect all of the logs to make sense.
What's going on here? Your dmesg says there is no floppy controller.
----- Snip -----
Apr 25 16:26:05 localhost kernel: floppy0: no floppy controllers found
Apr 25 16:33:43 localhost kernel: floppy0: no floppy controllers found
----- Snip -----
Hit Ctrl-Alt-F1 to switch to the console and see if any mce errors are
being reported there. Hit Ctrl-Alt-F2 to switch back to the GUI.
ipmitool sensor
Temp_GPU_SLOT0 | na | degrees C | na | na | na | na | 100.000 | 105.000 | na
Temp_GPU_SLOT1 | na | degrees C | na | na | na | na | 100.000 | 105.000 | na
Temp_GPU_SLOT2 | na | degrees C | na | na | na | na | 100.000 | 105.000 | na
S_Host_Power | 0x0 | discrete | 0x0180| na | na | na | na | na | na
Temp_Ambient | 35.000 | degrees C | ok | na | na | na | 43.000 | 46.000 | na
Temp_VR_CPU | 42.000 | degrees C | ok | na | na | na | 100.000 | 105.000 | na
Temp_VR_GT | na | degrees C | na | na | na | na | 100.000 | 105.000 | na
Temp_PCH | 44.000 | degrees C | ok | na | na | na | 81.000 | 85.000 | na
Temp_Outlet | 37.000 | degrees C | ok | na | na | na | 65.000 | 70.000 | na
DTS_CPU | 52.000 | degrees C | ok | na | na | na | na | na | na
State_CPU0 | 0x0 | discrete | 0x8080| na | na | na | na | na | na
Temp_CPU_0 | 48.000 | degrees C | ok | na | na | na | na | 100.000 | na
Tmargin_CPU | 14.000 | degrees C | ok | na | na | na | na | na | na
TJMAX_CPU | 100.000 | degrees C | ok | na | na | na | na | na | na
S_CPU_CH_A1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
S_CPU_CH_A2 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
S_CPU_CH_B1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
S_CPU_CH_B2 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
T_DIMM_A_1 | na | degrees C | na | na | na | na | 78.000 | 80.000 | na
T_DIMM_A_2 | na | degrees C | na | na | na | na | 78.000 | 80.000 | na
T_DIMM_B_1 | na | degrees C | na | na | na | na | 78.000 | 80.000 | na
T_DIMM_B_2 | na | degrees C | na | na | na | na | 78.000 | 80.000 | na
Speed_Fan_1 | 4120.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na
Speed_Fan_2 | 5253.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na
Speed_Fan_3 | 5459.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na
Speed_Fan_4 | 0.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na
Speed_Fan_5 | 0.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na
State_HDD_0 | na | discrete | na | na | na | na | na | na | na
State_HDD_1 | na | discrete | na | na | na | na | na | na | na
State_HDD_2 | na | discrete | na | na | na | na | na | na | na
State_HDD_3 | na | discrete | na | na | na | na | na | na | na
State_HDD_4 | na | discrete | na | na | na | na | na | na | na
State_HDD_5 | na | discrete | na | na | na | na | na | na | na
State_HDD_6 | na | discrete | na | na | na | na | na | na | na
State_HDD_7 | na | discrete | na | na | na | na | na | na | na
P12V_1_SCALED | 12.096 | Volts | ok | 10.240 | 10.496 | 10.752 | 12.992 | 13.248 | 13.504
P12V_2 | 12.032 | Volts | ok | 10.304 | 10.624 | 10.944 | 13.376 | 13.696 | 14.016
P1V05_PCH_AUX | 1.036 | Volts | ok | 0.826 | 0.861 | 0.896 | 1.099 | 1.127 | 1.155
P1V0_VCCST | 1.029 | Volts | ok | 0.819 | 0.854 | 0.896 | 1.099 | 1.141 | 1.176
P1V15_BMC_AUX | 1.134 | Volts | ok | 0.959 | 0.994 | 1.029 | 1.260 | 1.288 | 1.309
P1V2_BMC_AUX | 1.190 | Volts | ok | 1.008 | 1.043 | 1.078 | 1.316 | 1.344 | 1.379
P1V2_VDDQ | 1.197 | Volts | ok | 1.008 | 1.043 | 1.078 | 1.316 | 1.365 | 1.400
P1V8_PCH_AUX | 1.786 | Volts | ok | 1.523 | 1.570 | 1.617 | 1.974 | 2.021 | 2.068
P2V5_VPP | 2.556 | Volts | ok | 2.092 | 2.175 | 2.258 | 2.739 | 2.822 | 2.905
P3V3_AUX | 3.270 | Volts | ok | 2.805 | 2.888 | 2.971 | 3.635 | 3.718 | 3.801
P3V3_SCALED | 3.337 | Volts | ok | 2.805 | 2.888 | 2.971 | 3.635 | 3.718 | 3.801
P3V_BAT_SCALED | 3.052 | Volts | ok | 2.492 | 2.576 | 2.688 | 3.276 | 3.416 | 3.584
P5V | 5.074 | Volts | ok | 4.300 | 4.429 | 4.515 | 5.504 | 5.590 | 5.719
PVCC_CPU | 0.940 | Volts | ok | na | na | na | 1.889 | 1.993 | 2.096
PVCC_VCCIO_SEN | 0.938 | Volts | ok | 0.805 | 0.826 | 0.840 | 1.169 | 1.190 | 1.211
PSU1_Fan | na | RPM | na | na | na | na | na | na | na
PSU1_Input_Vol | na | Volts | na | na | na | na | na | na | na
PSU1_Output_Vol | na | Volts | na | na | na | na | na | na | na
PSU1_Pin | na | Watts | na | na | na | na | na | na | na
PSU1_Pout | na | Watts | na | na | na | na | na | na | na
State_PSU1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
PSU1_Temp | na | degrees C | na | na | na | na | na | na | na
PSU2_Fan | na | RPM | na | na | na | na | na | na | na
PSU2_Input_Bol | na | Volts | na | na | na | na | na | na | na
PSU2_Output_Vol | na | Volts | na | na | na | na | na | na | na
PSU2_Pin | na | Watts | na | na | na | na | na | na | na
PSU2_Pout | na | Watts | na | na | na | na | na | na | na
State_PSU2 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
PSU2_Temp | na | degrees C | na | na | na | na | na | na | na
Redundancy_PSU | 0x0 | discrete | 0x0080| na | na | na | na | na | na
System_Power | 0.000 | Watts | ok | na | na | na | na | na | na
Watchdog2 | 0x0 | discrete | 0x0080| na | na | na | na | na | na
SEL_Full | 0x0 | discrete | 0x0080| na | na | na | na | na | na
BIOS_FW_Update | 0x0 | discrete | 0x0080| na | na | na | na | na | na
BMC_FW_Update | 0x0 | discrete | 0x0080| na | na | na | na | na | na
BMC_Reset | 0x0 | discrete | 0x0080| na | na | na | na | na | na
BMC_System | 0x0 | discrete | 0x0080| na | na | na | na | na | na
CPU_Power | 13.000 | Watts | ok | na | na | na | na | na | na
MEM_Power | 6.200 | Watts | ok | na | na | na | na | na | na
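A listing this long is easier to check with a small script than by eye. The Python sketch below assumes the usual column order of `ipmitool sensor` (name, reading, units, status, then lower non-recoverable, lower critical, lower non-critical, upper non-critical, upper critical, upper non-recoverable). The short field names and the helper functions are hypothetical and are not part of ipmitool.

```python
FIELDS = ("name", "reading", "units", "status",
          "lnr", "lcr", "lnc", "unc", "ucr", "unr")
NUMERIC = ("reading", "lnr", "lcr", "lnc", "unc", "ucr", "unr")


def to_float(text):
    """Return a float, or None for 'na' and discrete values such as '0x0'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_sensor_line(line):
    """Split one pipe-delimited line into a dict, or None if it is malformed."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != len(FIELDS):
        return None
    row = dict(zip(FIELDS, parts))
    for key in NUMERIC:
        row[key] = to_float(row[key])
    return row


def out_of_range(row):
    """List the thresholds that the current reading violates."""
    reading = row["reading"]
    if reading is None:
        return []
    hits = [key for key in ("lnr", "lcr", "lnc")
            if row[key] is not None and reading < row[key]]
    hits += [key for key in ("unc", "ucr", "unr")
             if row[key] is not None and reading > row[key]]
    return hits


# A few lines copied from the listing above.
sample = [
    "Temp_CPU_0 | 48.000 | degrees C | ok | na | na | na | na | 100.000 | na",
    "Speed_Fan_3 | 5459.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na",
    "Speed_Fan_4 | 0.000 | RPM | ok | na | 824.000 | na | na | 24926.000 | na",
    "State_PSU1 | 0x0 | discrete | 0x0080| na | na | na | na | na | na",
]

rows = [row for row in map(parse_sensor_line, sample) if row is not None]
flagged = {row["name"]: out_of_range(row) for row in rows if out_of_range(row)}
print(flagged)
```

On the listing above, Speed_Fan_4 and Speed_Fan_5 read 0 RPM, which is below the 824 RPM value in the lower critical column even though the BMC reports them as ok. That may simply mean those fan headers are unpopulated, but it seems worth ruling out given the throttling messages.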
I added nomce in grub.conf. However, the system freezing issue still arises.
... I added nomce in grub.conf. However, the system freezing issue still arises. ...
You have 64 bit CPUs, don't you? That boot code only applies to 32 bit CPUs:
nomce [X86-32] Disable Machine Check Exception
Found here:
acpi=ht
That is no longer a valid boot code:
maxcpus=1
1. See if the overheating stops occurring.