
Topic: [SOLVED] Wrong locale / charset for FAT flash drive, maybe Unicode problem…

theYinYeti
Hi! I have a problem with the names of files on my FAT-formatted flash drive, which is annoying because I boot TC from there, so I don’t get a chance to change the mount options.

Whether in Linux or in Windows, I can give French file names without a problem, and all is fine. With TC however, I don’t really know where I stand.

On the one hand, urxvt+coreutils seems mostly happy with the current configuration; it’s unclear though:
Quote
tc@tinycore:/mnt/sdb1/Tan$ ls
95 - Horaires du 01-10-12 au 12-07-13.pdf  Express Beaujoire Carquefou - Horaires du 01-10-12 au 12-07-13.pdf
C1 - Horaires du 01-10-12 au 12-07-13.pdf  Plan g?n?ral du r?seau 2012-2013.pdf
tc@tinycore:/mnt/sdb1/Tan$ ls P*
ls: cannot access P*: No such file or directory
tc@tinycore:/mnt/sdb1/Tan$ ls P<TAB>lan\ général\ du\ réseau\ 2012-2013.pdf
Plan g?n?ral du r?seau 2012-2013.pdf
tc@tinycore:/mnt/sdb1/Tan$ cp P<TAB>lan\ général\ du\ réseau\ 2012-2013.pdf test.pdf
tc@tinycore:/mnt/sdb1/Tan$ ls -l P<TAB>lan\ général\ du\ réseau\ 2012-2013.pdf test.pdf
-rwxrwxrwx 1 root root 959224 nov.  14  2012 Plan g?n?ral du r?seau 2012-2013.pdf
-rwxrwxrwx 1 root root 959224 juin  18 13:24 test.pdf
(<TAB> means I hit the TAB key for auto-completion)
Moreover, when a UTF-8 character appears in the command line, urxvt loses sync between the displayed and actual position of the caret (to the point where I can backspace into the prompt!), whereas aterm at least remains consistent.

On the other hand, some applications plainly seem to consider something is not as it should be:
— ROX tells me “This filename is not valid UTF-8. You should rename it.” But then, ROX itself seems to have problems managing the caret while renaming where such filenames appear.
— In Evince’s Open dialog, the filename is “Plan g[FFFD]n[FFFD]ral du r[FFFD]seau 2012-2013.pdf”.
— In MS Office, such a filename would appear as “Plan gnral du rseau 2012-2013”.
— And so on…

Some more info:
Code: [Select]
tc@rescue16g:~$ ls -l $(which ls)
lrwxrwxrwx 1 root root 38 juin  18  2013 /usr/local/bin/ls -> /tmp/tcloop/coreutils/usr/local/bin/ls
tc@rescue16g:~$ env | grep -iE 'lang|lc_'
LANG=fr_FR.utf8
« Last Edit: June 19, 2013, 04:23:50 AM by theYinYeti »

curaga
« Reply #1 on: June 18, 2013, 05:02:35 AM »
What are the mount options?
The only barriers that can stop you are the ones you create yourself.

theYinYeti
« Reply #2 on: June 18, 2013, 05:11:12 AM »
Hi curaga. Here is the requested information:
Code: [Select]
tc@tinycore:~$ mount | grep sdb
/dev/sdb1 on /mnt/sdb1 type vfat (rw,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=cp437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
These are (or should be) the default TinyCore options, since /dev/sdb1 is my boot device and it is mounted automatically at boot.
« Last Edit: June 18, 2013, 05:13:09 AM by theYinYeti »
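
In the mount line above, the options that affect how file names are presented are “codepage”, “iocharset” and “utf8”. The following is only a minimal sketch, not something shipped with TinyCore, that extracts just those options from a line of “mount” output. The function name “vfat_charset_opts” is made up for illustration.

```sh
#!/bin/sh
# Print the charset-related options of vfat mount lines read from stdin.
# Input lines are expected in the format printed by `mount`.
vfat_charset_opts() {
    # Keep the text inside the parentheses, split it on commas,
    # then select the options that affect file-name encoding.
    sed -n 's/.*(\(.*\)).*/\1/p' \
        | tr ',' '\n' \
        | grep -E '^(codepage|iocharset|utf8)(=|$)'
}

# The line reported in the thread, used here as sample input.
line='/dev/sdb1 on /mnt/sdb1 type vfat (rw,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=cp437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)'
printf '%s\n' "$line" | vfat_charset_opts
# prints:
#   codepage=cp437
#   iocharset=iso8859-1
```

On a live system the same function could be fed with “mount | grep ' type vfat '”. With iocharset=iso8859-1 the kernel presents file names as ISO-8859-1 bytes, while the session runs with LANG=fr_FR.utf8. That mismatch would be consistent with the “?” placeholders and the “not valid UTF-8” message reported in the first post.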

theYinYeti
« Reply #3 on: June 18, 2013, 05:14:46 AM »
Here’s some more information:
Code: [Select]
tc@tinycore:~$ tr '\0' ' ' </proc/cmdline
BOOT_IMAGE=/.boot/corelnx/vmlinuz waitusb=5 tce=LABEL=FLASH/.boot/corelnx/tce loop.max_loop=256 showapps lang=fr_FR.utf8 kmap=azerty/fr-latin1 tz=Europe/Paris noutc restore=LABEL=FLASH/.boot/corelnx lst=onboot.lst desktop=fluxbox

Rich
« Reply #4 on: June 18, 2013, 05:18:22 AM »
Hi theYinYeti
Quote
Code: [Select]
tc@tinycore:/mnt/sdb1/Tan$ ls P*
ls: cannot access P*: No such file or directory
That might be caused by the tab character in the directory name. In my opinion, embedding spaces in path and directory names can also sometimes cause problems.

theYinYeti
« Reply #5 on: June 18, 2013, 05:22:47 AM »
No no :D
The <TAB> I wrote above is for when I hit the TAB key for auto-completion ;-)
(original post updated with this tip)
« Last Edit: June 18, 2013, 05:25:48 AM by theYinYeti »

tinypoodle
« Reply #6 on: June 18, 2013, 05:31:48 AM »
The boot device is totally irrelevant to your issue.
It's the "tce=" option which makes you end up with default mount options for vfat.
You would have to omit "tce=" and add "base" to be able to use explicit mount options suiting your purposes best.
"Software gets slower faster than hardware gets faster." Niklaus Wirth - A Plea for Lean Software (1995)

theYinYeti
« Reply #7 on: June 18, 2013, 05:42:42 AM »
@tinypoodle: If I use “base” instead of “tce=…”, then OK, my drive won’t get auto-mounted, but I won’t have my favourite apps loaded either, nor the whole TinyCore experience in general, will I?
When I wrote “boot device”, I meant it in the TC meaning of the word, not the BIOS meaning. I do want to use “tce=…”, unless there’s something I’m missing.

I know I can use the “file” command to find out what encoding is used inside a file, but I know of nothing similar for checking the encoding of the filenames themselves… I mean, “ls -1 | hexdump -C” is nice’n’all, but how can I trust the output? There are layers I don’t fully understand between what is actually on the disk and what ls prints: the kernel, the mount command, the ls command, and what else…
By the way, the above command actually works and indicates ISO-8859-1 encoding (1 byte per “é”), not UTF-8 (there would be 2 bytes per “é”).
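
The byte counts mentioned above can be reproduced without touching the flash drive. This is a minimal sketch that assumes “od” and “iconv” are available. The helper names “hex_bytes” and “is_valid_utf8” are invented for this example.

```sh
#!/bin/sh
# Hex dump of stdin as one lowercase string, e.g. "c3a9".
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# Succeeds only if stdin is well-formed UTF-8
# (iconv exits non-zero on an invalid sequence).
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "é" written as octal escapes so that this script stays pure ASCII.
latin1_e_acute='\351'      # ISO-8859-1: one byte, 0xE9
utf8_e_acute='\303\251'    # UTF-8: two bytes, 0xC3 0xA9

printf "$latin1_e_acute" | hex_bytes; echo    # e9
printf "$utf8_e_acute"   | hex_bytes; echo    # c3a9

# A Latin-1 encoded name is rejected as UTF-8, a UTF-8 encoded one is accepted.
printf "g${latin1_e_acute}n" | is_valid_utf8 || echo "Latin-1 bytes: not valid UTF-8"
printf "g${utf8_e_acute}n"   | is_valid_utf8 && echo "UTF-8 bytes: valid"
```

When the output of ls goes to a pipe rather than a terminal, coreutils ls writes the name bytes unchanged instead of substituting “?”, so something like “ls -1 | hex_bytes” shows what the kernel actually returns for the current mount options. It still does not show what is stored on the disk itself.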

theYinYeti
« Reply #8 on: June 18, 2013, 05:45:49 AM »
I forgot to mention: I surmise the problem lies in the initrd (core.gz), and I’m perfectly willing to alter it to get rid of the problem (been there, done that). But first I have to understand where the problem comes from…

Juanito
« Reply #9 on: June 18, 2013, 06:10:55 AM »
Does loading any of the eglibc* extensions improve things?

theYinYeti
« Reply #10 on: June 18, 2013, 06:19:52 AM »
Currently, “eglibc_gconv.tcz” gets auto-loaded by “mylocale.tcz”, which was generated by “getlocale.tcz”.
By the way, my current setup with “mylocale.tcz” is:
Code: [Select]
tc@tinycore$ locale -a
C
fr_FR
fr_FR@euro
fr_FR.iso88591
fr_FR.iso885915@euro
fr_FR.utf8
POSIX
and I choose “fr_FR.utf8” at boot.

I tried loading “eglibc_apps.tcz” as well, but there’s no change in subsequently opened terminal windows, even though I ran “sudo ldconfig” just before.

curaga
« Reply #11 on: June 18, 2013, 12:30:21 PM »
There's no way to autodetect the charset of a FAT stick. You could

- always mount FAT as UTF-8, by editing the fat32 default options in rebuildfstab
- boot with base and waitusb, having the mount command in bootsync.sh. You can call tce-setup manually to load your extensions in setups like this.
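
The second option could look roughly like the sketch below. It is an assumption-laden illustration, not a tested TinyCore recipe: the helper name “mount_fat_utf8” and the DRY_RUN switch are invented, the device and mount point are the ones from this thread, and “umask=000” is chosen only to mirror the fmask/dmask values shown earlier. “utf8” is the standard vfat mount option that makes the kernel present file names as UTF-8.

```sh
#!/bin/sh
# Hypothetical helper for /opt/bootsync.sh: mount a FAT partition
# so that file names are presented as UTF-8.
mount_fat_utf8() {
    dev=$1
    dir=$2
    # utf8      -> file names are converted to UTF-8 by the kernel
    # umask=000 -> permissive modes, like fmask=0000,dmask=0000 above
    opts='utf8,umask=000'
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command, to check it before rebooting.
        echo "mount -t vfat -o $opts $dev $dir"
    else
        mkdir -p "$dir" && mount -t vfat -o "$opts" "$dev" "$dir"
    fi
}

# Dry run with the device from this thread.
DRY_RUN=1 mount_fat_utf8 /dev/sdb1 /mnt/sdb1
# prints: mount -t vfat -o utf8,umask=000 /dev/sdb1 /mnt/sdb1
```

With “base” on the boot line, extensions are not loaded automatically, so tce-setup would still have to be called by hand after the mount, as described above.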

tinypoodle
« Reply #12 on: June 18, 2013, 09:51:23 PM »
Be aware that having empty spaces in file names (incl. dirs) is a totally distinct issue.

theYinYeti
« Reply #13 on: June 18, 2013, 11:40:09 PM »
Hi! I think I’ll go with curaga’s first option. My core.gz is already a custom one anyway. It is still interesting to know that the second solution exists, though, in case anyone else encounters the same issue.
Let me try…

tinypoodle
« Reply #14 on: June 19, 2013, 12:18:15 AM »
You could try to pack your modified 'rebuildfstab' (incl. full path) into a separate gzipped cpio archive and load that after core.gz, so it would overwrite the original, thus making upgrading simpler.
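
A minimal sketch of that packing step follows, assuming a cpio that supports the “newc” format. The function name “pack_overlay”, the staging directory and the archive name are placeholders, and /usr/sbin/rebuildfstab is assumed to be the script’s location, so check it on your own system.

```sh
#!/bin/sh
# Pack one file, with its full path, into a gzipped newc-format cpio archive.
#   $1 = staging directory that mirrors the root file system
#   $2 = path of the file, relative to the staging directory
#   $3 = output archive
pack_overlay() {
    stage=$1
    rel=$2
    out=$3
    # cpio reads the list of files to archive from stdin;
    # -H newc selects the format the kernel expects for an initramfs.
    ( cd "$stage" && printf '%s\n' "$rel" | cpio -o -H newc 2>/dev/null | gzip -9 ) > "$out"
}

# Example usage (run as root so the archived file keeps root ownership):
#   mkdir -p stage/usr/sbin
#   cp /usr/sbin/rebuildfstab stage/usr/sbin/    # then edit the copy
#   pack_overlay stage usr/sbin/rebuildfstab myfstab.gz
```

How the extra archive is loaded depends on the boot loader. SYSLINUX, for instance, accepts several comma-separated files on its initrd line, and GRUB 2 accepts several space-separated files. Later archives are unpacked over earlier ones, which is what lets the modified script replace the original.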