
Author Topic: How to fix failed firmware loading in TC16  (Read 1735 times)

Offline CNK

  • Wiki Author
  • Sr. Member
  • *****
  • Posts: 365
How to fix failed firmware loading in TC16
« on: June 01, 2025, 09:57:53 AM »
In TC16 on x86_64 loading firmware-radeon.tcz then graphics-6.12.11-tinycore64.tcz causes the screen on my laptop to go blank. When I force a reboot with Ctrl-Alt-Delete, the display returns and I can see an error something like "radeon_cp: failed to load firmware: radeon/R520_cp.bin" before it reboots.

This is fixed (as root) either by creating this symlink:
Code: [Select]
ln -s /usr/local/lib/firmware /lib/firmware
Or by writing to this file in /sys:
Code: [Select]
echo -n /usr/local/lib/firmware > /sys/module/firmware_class/parameters/path
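If the second fix needs to be repeated (for example from a script run before graphics-KERNEL is loaded), it could be wrapped roughly as below. This is only a sketch: `set_fw_path` is a made-up helper name, and the optional second argument exists only so the function can be tried against an ordinary file instead of /sys.

```sh
#!/bin/sh
# Hypothetical helper: point the kernel's firmware loader at an extra directory.
# $1 = firmware directory
# $2 = parameter file (assumed default: the sysfs path used above)
set_fw_path() {
    dir=$1
    param=${2:-/sys/module/firmware_class/parameters/path}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Skip the write if the path is already set to this directory.
    [ "$(cat "$param" 2>/dev/null)" = "$dir" ] && return 0
    printf '%s' "$dir" > "$param"
}

# Usage (as root), before loading graphics-KERNEL:
#   set_fw_path /usr/local/lib/firmware
```

The setting does not survive a reboot. Passing `firmware_class.path=/usr/local/lib/firmware` on the kernel command line should have the same effect, but I have not tested that on TC16.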

The kernel documentation doesn't list /usr/local/lib/firmware as a default firmware search path, so I'm not sure how this worked in TC15 on the same PC, but it seems to need help finding it now.

Hmm it seems this bug was reported already in alpha testing. I'll post this in "Tips and Tricks" then instead of the Bugs section, since I guess a fix wasn't found (although I wonder why not).

Offline nick65go

  • Hero Member
  • *****
  • Posts: 893
Re: How to fix failed firmware loading in TC16
« Reply #1 on: June 01, 2025, 01:22:50 PM »
Hi CNK. I think TC15 (and previous versions) worked like you said,
Code: [Select]
ln -s /usr/local/lib/firmware /lib/firmware
searching for firmware in two places, because in some TCZs the firmware blob was not placed in /lib/firmware by default. The symlink was already present in core.

My solution for intel firmware (generation 13 CPU & generation 12 GPU) is to bind (concatenate or remaster) firmware into core.gz. It is a task of a few minutes, tailored to your specific CPU/GPU. The rest of the firmware can be loaded later on demand (for WiFi, Bluetooth, etc.) because it is not critical (no blank or frozen display).
BTW: I think that when you compile the kernel, a path to search for firmware can be provided (I think I read that for Gentoo Linux). Sorry if this is wrong info.
« Last Edit: June 01, 2025, 01:27:18 PM by nick65go »

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 15192
Re: How to fix failed firmware loading in TC16
« Reply #2 on: June 01, 2025, 02:45:55 PM »
I can’t check for a week or so, but I thought there might have been a kernel configuration setting for the firmware location?

Offline CNK

  • Wiki Author
  • Sr. Member
  • *****
  • Posts: 365
Re: How to fix failed firmware loading in TC16
« Reply #3 on: June 01, 2025, 07:15:34 PM »
@nick65go:
I checked in TC15 on the same laptop and there's no symlink nor is there a path set at /sys/module/firmware_class/parameters/path, but loading graphics-KERNEL works fine there whereas in TC16 I get stuck at a black screen unless I make one of those changes first.

@juanito:
You did mention some before. But based on the descriptions for EXTRA_FIRMWARE and EXTRA_FIRMWARE_DIR, it does seem like they're only used for specifying firmware files built into the kernel.

The search paths are here in the kernel source code. The first path defaults to empty and is skipped unless it's set from the firmware module's "path" parameter below (such as when set at /sys/module/firmware_class/parameters/path or on the kernel command line). Unfortunately I don't see anywhere that a custom path is set from a build-time configuration value.
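To see where the direct loader would (or would not) find a given blob, the search order can be imitated in a few lines of shell. This is a sketch based on my reading of the kernel's path list (custom path first, then the /lib/firmware variants); `fw_find` and its optional root-prefix argument are made up for illustration, and compressed firmware files are not handled.

```sh
#!/bin/sh
# Sketch: look for a firmware file in the same order as the kernel's direct loader.
# $1 = firmware name, e.g. radeon/R520_cp.bin
# $2 = optional root prefix (hypothetical, only for trying this on a scratch tree)
fw_find() {
    name=$1
    root=${2:-}
    rel=$(uname -r)
    # The custom path is empty unless it was set via sysfs or the command line.
    custom=$(cat "$root/sys/module/firmware_class/parameters/path" 2>/dev/null) || custom=""
    for dir in "$custom" \
               "/lib/firmware/updates/$rel" \
               "/lib/firmware/updates" \
               "/lib/firmware/$rel" \
               "/lib/firmware"; do
        [ -n "$dir" ] || continue    # an empty custom path is skipped
        if [ -f "$root$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Usage:
#   fw_find radeon/R520_cp.bin || echo "not in any direct-load path"
```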

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1616
Re: How to fix failed firmware loading in TC16
« Reply #4 on: June 02, 2025, 12:13:02 PM »
Quote from: nick65go
  My solution for intel firmware (generation 13 CPU & generation 12 GPU) is to bind (concatenate or remaster) firmware into core.gz.
Another option is to put the firmware in its own initramfs. This approach makes it obvious to you how your setup is different from vanilla TCL, which is helpful to remember when upgrading. It's also convenient because you don't have to remaster  core.gz  every time you upgrade.

If you use grub, you specify the multiple initramfs's on the  initrd  line, like this:
Code: [Select]
initrd /boot/core.gz /boot/firmware.gz
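For anyone who has not built a separate initramfs before, one possible way to pack the blobs is sketched below. The function name `make_fw_initrd` is made up, and placing the files under /lib/firmware inside the image is an assumption on my part (it is a directory the kernel searches by default). Run it as root if file ownership inside the image matters to you.

```sh
#!/bin/sh
# Sketch: pack selected firmware blobs into their own initramfs image.
# $1 = output image, $2 = source firmware directory, remaining args = blob names
make_fw_initrd() {
    out=$1
    src=$2
    shift 2
    stage=$(mktemp -d) || return 1
    for blob in "$@"; do
        mkdir -p "$stage/lib/firmware/$(dirname "$blob")"
        cp "$src/$blob" "$stage/lib/firmware/$blob" || { rm -rf "$stage"; return 1; }
    done
    # newc is the cpio format the kernel expects for an initramfs.
    (cd "$stage" && find . | cpio -o -H newc 2>/dev/null | gzip -9) > "$out"
    rm -rf "$stage"
}

# Usage:
#   make_fw_initrd /tmp/firmware.gz /usr/local/lib/firmware radeon/R520_cp.bin
```

Copy the resulting file next to core.gz and list it on the initrd line as shown above.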

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 15192
Re: How to fix failed firmware loading in TC16
« Reply #5 on: June 03, 2025, 04:15:54 AM »
If things work in tc-15.x and not in tc-16.x either something changed, there’s a kernel bug or the hardware driver has the firmware path hardcoded.

Offline nick65go

  • Hero Member
  • *****
  • Posts: 893
Re: How to fix failed firmware loading in TC16
« Reply #6 on: June 03, 2025, 04:45:14 AM »
Quote from: GNUser
  Another option is to put the firmware in its own initramfs. This approach makes it obvious to you how your setup is different from vanilla TCL, which is helpful to remember when upgrading. It's also convenient because you don't have to remaster core.gz every time you upgrade.
Right, better, but with one small remark: firmware.gz should contain only the necessary blobs. I mean, avoid loading something like 16MB into RAM instead of just a few KB of gzipped files. The necessary files can be discovered from the dmesg kernel log.
As for a firmware path hard-coded in the driver, one can see "interesting things" with a hex viewer/editor, or strings, if modinfo is clueless.
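A rough way to pull the blob names out of the kernel log is sketched here. It assumes the kernel logs failed direct loads with a line of the form "Direct firmware load for NAME failed"; blobs that loaded successfully will not show up this way. `fw_missing_from_dmesg` is just an illustrative name.

```sh
#!/bin/sh
# Sketch: list firmware files the kernel failed to load directly,
# reading dmesg text on stdin.
fw_missing_from_dmesg() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed.*/\1/p' | sort -u
}

# Usage:
#   dmesg | fw_missing_from_dmesg
#   modinfo -F firmware radeon    # blobs the module declares it may request
```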

[rant] We are captive to greedy IP (intellectual property) bullshit: closed-source firmware in software, in hardware, UEFI, etc. So no security is possible in this environment. It's like this: you cannot safely build your house on other people's land... Maybe we just naively bet on the attacker's lack of resources/hardware/technology, or lack of knowledge, or on security by obscurity (0-day bugs). [/rant]
« Last Edit: June 03, 2025, 05:04:10 AM by nick65go »

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1616
Re: How to fix failed firmware loading in TC16
« Reply #7 on: June 03, 2025, 07:53:25 AM »
Quote from: nick65go
  only the necessary blobs.
Of course.

I run TCL on several different machines. I found it simpler to create a single firmware initramfs that contains all the blobs I need on all my various devices. Even so, my  firmware.gz  is only 1.1 MB. To me, the slight bloat is worth the simplicity of having the same  firmware.gz  on all my devices. I also have a tiny text file in there with my notes on the purpose of each blob.
« Last Edit: June 03, 2025, 07:58:13 AM by GNUser »

Offline CNK

  • Wiki Author
  • Sr. Member
  • *****
  • Posts: 365
Re: How to fix failed firmware loading in TC16
« Reply #8 on: June 03, 2025, 07:06:18 PM »
Quote from: Juanito
  If things work in tc-15.x and not in tc-16.x either something changed, there’s a kernel bug or the hardware driver has the firmware path hardcoded.

If you want the specifics, it seems that there's an old and new way to do firmware loading. The old way was using udev, which appears to call the script "/lib/udev/firmware.sh", where in TC the "/usr/local/lib/firmware" path is specified. The new way is for the kernel to do it without udev using the code I pointed to before.
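For reference, the udev-side helper in the old method talks to the kernel through the "loading" and "data" files that appear under the firmware device in /sys. The sketch below shows that handshake as I understand it; it is not TC's actual firmware.sh. The directory list and the `SYSFS` and `FIRMWARE_DIRS` overrides are assumptions added so the logic can be tried on a scratch directory.

```sh
#!/bin/sh
# Sketch of a udev-style firmware helper (NOT the actual TC firmware.sh).
# udev supplies DEVPATH and FIRMWARE in the environment of the RUN program.
# SYSFS and FIRMWARE_DIRS are hypothetical overrides for experimenting.
fw_helper() {
    sysfs=${SYSFS:-/sys}
    ctl="$sysfs$DEVPATH"
    [ -e "$ctl/loading" ] || return 1
    for dir in ${FIRMWARE_DIRS:-/lib/firmware /usr/local/lib/firmware}; do
        [ -f "$dir/$FIRMWARE" ] || continue
        echo 1 > "$ctl/loading"               # tell the kernel a transfer is starting
        cat "$dir/$FIRMWARE" > "$ctl/data"    # copy the blob into the kernel
        echo 0 > "$ctl/loading"               # signal that the transfer is complete
        return 0
    done
    echo -1 > "$ctl/loading"                  # abort so the driver need not wait for the timeout
    return 1
}
```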

Apparently this all started with a big argument in 2012, leading to this solution:
Quote
An alternative is to simply short out udev for firmware loading altogether. That is, in fact, what has been done; the 3.7 kernel will include a patch (from Linus) that causes firmware loading to be done directly from the kernel without involving user space at all. If the kernel is unable to find the firmware file in the expected places (under /lib/firmware and variants) it will fall back to sending a request to udev in the usual manner. But if the kernel-space load attempt works, then udev will never even know that the firmware request was made.

"CONFIG_FW_LOADER_USER_HELPER=y" and "CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y" in TC16's kernel configuration enable the fall-back method.

Either udev is failing during the fall-back method (an issue like what prompted the change), or someone's changed the kernel's Radeon GPU driver code to not call udev anymore (in drivers this can be done by calling request_firmware_direct() instead of request_firmware()). The kernel documentation for the uevent fall-back firmware loader says support for it was removed from Systemd udev in 2014, so it might easily be overlooked. The udev fall-back code is still there in the kernel though.
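To compare the firmware-loader options between two kernels (TC15 versus TC16, say), something like the following can filter a kernel config. It is a sketch: `fw_loader_config` is a made-up name, and the usage line assumes the running kernel exposes /proc/config.gz, which not every build does.

```sh
#!/bin/sh
# Sketch: show firmware-loader options from kernel config text on stdin,
# including options that are explicitly "not set".
fw_loader_config() {
    grep -E '^(# )?CONFIG_FW_LOADER[A-Z_]*[= ]'
}

# Usage, assuming /proc/config.gz exists (CONFIG_IKCONFIG_PROC):
#   zcat /proc/config.gz | fw_loader_config
```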

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 12139
Re: How to fix failed firmware loading in TC16
« Reply #9 on: June 03, 2025, 11:46:46 PM »
Hi CNK
Quote from: CNK
  ... If you want the specifics, it seems that there's an old and new way to do firmware loading. The old way was using udev, which appears to call the script "/lib/udev/firmware.sh", where in TC the "/usr/local/lib/firmware" path is specified. ...
I think that method may have been broken for some time:
Code: [Select]
tc@E310:~$ cat /etc/udev/rules.d/50-firmware.rules
# do not edit this file, it will be overwritten on update

# firmware-class requests, copies files into the kernel
#SUBSYSTEM=="firmware", ACTION=="add", RUN+="firmware --firmware=$env{FIRMWARE} --devpath=$env{DEVPATH}"
SUBSYSTEM=="firmware", ACTION=="add", RUN+="firmware.sh"
tc@E310:~$
There is no path included with "firmware.sh", so I don't think
it will be found.

Offline CNK

  • Wiki Author
  • Sr. Member
  • *****
  • Posts: 365
Re: How to fix failed firmware loading in TC16
« Reply #10 on: June 04, 2025, 12:40:34 AM »
In Devuan udev(7) it says that when using RUN "the program is expected to live in /lib/udev; otherwise, the absolute path must be specified" and /lib/udev is where I find firmware.sh.

Although another man page for udev online says /usr/lib/udev and there's no udev-doc.tcz for TC's udev.

I could try running udev debugging via SSH on the laptop as described here while loading graphics-KERNEL with the default firmware configuration if that's worthwhile?

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 15192
Re: How to fix failed firmware loading in TC16
« Reply #11 on: June 04, 2025, 03:57:37 AM »
If you could try the debugging that would be much appreciated.

Offline CNK

  • Wiki Author
  • Sr. Member
  • *****
  • Posts: 365
Re: How to fix failed firmware loading in TC16
« Reply #12 on: June 05, 2025, 08:29:03 AM »
It turns out the Radeon driver does time out after 60 seconds when the firmware loading fails, and the display then returns to the text-mode terminal. I just wasn't waiting long enough before.

udev correctly receives the firmware event:
Code: [Select]
KERNEL[366.250566] add      /devices/pci0000:00/0000:00:01.0/0000:01:00.0/firmware/radeon!R520_cp.bin (firmware)
ACTION=add
ASYNC=0
DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/firmware/radeon!R520_cp.bin
FIRMWARE=radeon/R520_cp.bin
SEQNUM=4721
SUBSYSTEM=firmware
TIMEOUT=60

In the syslog (with extra udev logging enabled using "sudo udevadm control --log-priority debug") you can see that firmware.sh seems to hang until udevd aborts due to the timeout:

Code: [Select]
Jun  5 10:57:11 TPT60 daemon.info udevd[4267]: seq 4721 running
Jun  5 10:57:11 TPT60 daemon.info udevd[4267]: device 0x3ff90340 has devpath '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/firmware/radeon!R520_cp.bin'
Jun  5 10:57:11 TPT60 daemon.info udevd[4267]: no db file to read /run/udev/data/+firmware:radeon!R520_cp.bin: No such file or directory
Jun  5 10:57:11 TPT60 daemon.info udevd[4267]: RUN 'firmware.sh' /etc/udev/rules.d/50-firmware.rules:5
Jun  5 10:57:11 TPT60 daemon.info udevd[4267]: device 0x3ffb9b70 has devpath '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
Jun  5 10:58:10 TPT60 daemon.err udevd[4308]: timeout '/sbin/modprobe -bv pci:v00001002d00007149sv000017AAsd00002005bc03sc00i00'

dmesg shows this:
Code: [Select]
[  360.789186] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[  365.319064] Linux agpgart interface v0.103
[  365.431641] ACPI: bus type drm_connector registered
[  366.176654] [drm] amdgpu kernel modesetting enabled.
[  366.176881] amdgpu: Virtual CRAT table created for CPU
[  366.176917] amdgpu: Topology: Add CPU node
[  366.240728] [drm] radeon kernel modesetting enabled.
[  366.241067] Console: switching to colour dummy device 80x25
[  366.241187] radeon 0000:01:00.0: vgaarb: deactivate vga console
[  366.241387] [drm] initializing kernel modesetting (RV515 0x1002:0x7149 0x17AA:0x2005 0x00).
[  366.241458] resource: resource sanity check: requesting [mem 0x00000000000c0000-0x00000000000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dbfff window]
[  366.241465] caller 0xffffffff814c86b3 mapping multiple BARs
[  366.241576] ATOM BIOS: M64CSP/M62CSP/M54CSP/M52CSP
[  366.241619] [drm] Generation 2 PCI interface, using max accessible memory
[  366.241628] radeon 0000:01:00.0: VRAM: 128M 0x0000000000000000 - 0x0000000007FFFFFF (64M used)
[  366.241635] radeon 0000:01:00.0: GTT: 512M 0x0000000008000000 - 0x0000000027FFFFFF
[  366.241662] [drm] Detected VRAM RAM=128M, BAR=128M
[  366.241666] [drm] RAM width 64bits DDR
[  366.241849] [drm] radeon: 64M of VRAM memory ready
[  366.241855] [drm] radeon: 512M of GTT memory ready.
[  366.241874] [drm] GART: num cpu pages 131072, num gpu pages 131072
[  366.242756] [drm] radeon: power management initialized
[  366.247987] [drm] radeon: 1 quad pipes, 1 z pipes initialized.
[  366.250396] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
[  366.250430] radeon 0000:01:00.0: WB enabled
[  366.250437] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000008000000
[  366.250647] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[  366.250680] [drm] radeon: irq initialized.
[  366.250699] [drm] Loading R500 Microcode
[  366.250752] radeon 0000:01:00.0: Direct firmware load for radeon/R520_cp.bin failed with error -2
[  366.250757] radeon 0000:01:00.0: Falling back to sysfs fallback for: radeon/R520_cp.bin
[  426.227308] radeon_cp: Failed to load firmware "radeon/R520_cp.bin"
[  426.227329] [drm:0xffffffffa0a9b3a8] *ERROR* Failed to load firmware!
[  426.227345] radeon 0000:01:00.0: failed initializing CP (-4).
[  426.227356] radeon 0000:01:00.0: Disabling GPU acceleration

I've uploaded full logs here; however, there's a lot of noise from other udev events triggered when graphics-KERNEL is loaded. It seems to hang running /lib/udev/firmware.sh, or perhaps after it.
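The wait can be checked against the TIMEOUT=60 in the uevent by subtracting the dmesg timestamps. A small sketch, assuming the two message texts shown in the dmesg output above (`fw_fallback_wait` is an illustrative name):

```sh
#!/bin/sh
# Sketch: print the seconds between "Falling back to sysfs fallback" and the
# next "Failed to load firmware" message, reading dmesg text on stdin.
fw_fallback_wait() {
    awk '
        # Extract the numeric timestamp from a "[  123.456789] ..." line.
        function ts(line,   s) {
            s = line
            sub(/^\[ */, "", s)
            sub(/\].*$/, "", s)
            return s + 0
        }
        /Falling back to sysfs fallback/ { start = ts($0) }
        /Failed to load firmware/ && start != "" {
            printf "%.1f\n", ts($0) - start
            start = ""
        }
    '
}

# Usage:
#   dmesg | fw_fallback_wait
```

With the log above this prints 60.0 (426.227308 - 366.250757 is about 59.98 seconds), which matches the 60-second timeout.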
« Last Edit: June 05, 2025, 08:30:51 AM by CNK »

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 15192
Re: How to fix failed firmware loading in TC16
« Reply #13 on: June 11, 2025, 06:39:30 AM »
Quote from: Rich
  There is no path included with "firmware.sh", so I don't think
  it will be found.

Could you try with the full path to firmware.sh and see if it helps?

i.e.
Code: [Select]
SUBSYSTEM=="firmware", ACTION=="add", RUN+="/lib/udev/firmware.sh"

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 12139
Re: How to fix failed firmware loading in TC16
« Reply #14 on: June 11, 2025, 07:38:39 AM »
Hi Juanito
It's starting to sound like the simplest fix would be to add a
link in rootfs.gz from /usr/local/lib/firmware to /lib/firmware.

If the kernel is loading firmware from /lib/firmware, why fight it.
Existing firmware extensions could remain as is, and just work.

Just my 2 cents.
« Last Edit: June 11, 2025, 08:04:46 AM by Rich »