
Author Topic: GTK2 and GTK3 cannot handle Unicode characters in file names  (Read 4714 times)

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
GTK2 and GTK3 cannot handle Unicode characters in file names
« on: December 09, 2019, 09:55:52 AM »
I'm on Pure64 10.1. In all GTK2 and GTK3 applications, I get an error ("Invalid file name") in the file selection box whenever I include a Unicode character in the filename. Here's an example when trying to save a file while using a GTK3 application:



I'm using a Unicode locale:
Code: [Select]
bruno@box:~$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
BTW I do not have this problem with applications that use a different graphical toolkit: Xfe file manager uses fox toolkit and can create new files and directories with Unicode characters in their names without any problem.

What do I need to do in order for GTK2 and GTK3 applications in Pure64 to support Unicode characters in file names?

P.S. The same GTK2 and GTK3 applications running on a different OS (e.g., Devuan) can handle Unicode characters in filenames. So my guess is that this is either a base system issue in Pure64 or an issue with how GTK2 and GTK3 were compiled for Pure64.
« Last Edit: December 09, 2019, 10:01:41 AM by GNUser »

Offline curaga

  • Administrator
  • Hero Member
  • *****
  • Posts: 10957
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #1 on: December 09, 2019, 11:39:39 PM »
That "Invalid file name" dialog in gtk2 comes from gtkfilechooserdefault.c, which shows it when g_file_get_child_for_display_name from glib2 fails. Maybe this helps in pinpointing the problem.
The only barriers that can stop you are the ones you create yourself.
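
One thing that may be worth ruling out, given the pointer to g_file_get_child_for_display_name: GLib documents two environment variables, G_FILENAME_ENCODING and G_BROKEN_FILENAMES, that change the encoding it assumes for filenames on Unix. If either is set, a UTF-8 display name has to be converted to another charset, and that conversion can fail. The sketch below mirrors that documented precedence in plain shell. The function name glib_filename_charset is made up for illustration, the script does not call GLib itself, and nothing in this thread confirms that this is the cause.

```sh
#!/bin/sh
# Sketch only: guess which filename encoding GLib would assume, following the
# documented precedence G_FILENAME_ENCODING > G_BROKEN_FILENAMES > UTF-8.
# glib_filename_charset is a hypothetical helper, not part of GLib.
# The locale charset is passed in as $1 so the logic can be checked in isolation.
glib_filename_charset() {
    locale_charset=$1
    if [ -n "${G_FILENAME_ENCODING:-}" ]; then
        # Comma-separated list; the first entry is the one used for new names.
        first=${G_FILENAME_ENCODING%%,*}
        if [ "$first" = "@locale" ]; then
            echo "$locale_charset"
        else
            echo "$first"
        fi
    elif [ -n "${G_BROKEN_FILENAMES+set}" ]; then
        # Set (even if empty): filenames are assumed to be in the locale charset.
        echo "$locale_charset"
    else
        echo "UTF-8"
    fi
}

# Report the guess for the current environment.
charset=$(locale charmap 2>/dev/null || echo unknown)
printf 'Filename encoding GLib would likely assume: %s\n' \
    "$(glib_filename_charset "$charset")"
```

If this reports anything other than UTF-8, unsetting the two variables before starting the GTK application would be a cheap experiment.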

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 14516
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #2 on: December 10, 2019, 12:00:45 AM »
Since the error occurs when accessing the file system, does your mount command need a utf8 switch?

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #3 on: December 10, 2019, 05:26:36 AM »
juanito - No, the mount command in urxvt just works (even if there are Unicode characters in the mountpoint). No need for a utf8 switch:

Code: [Select]
bruno@box:~$ which mount
/bin/mount
bruno@box:~$ ls -l /bin/mount
lrwxrwxrwx    1 root     root            12 Jun  9  2019 /bin/mount -> busybox.suid

bruno@box:~$ sudo mkdir /mnt/eĥoŝanĝoĉiuĵaŭde
bruno@box:~$ sudo mount /dev/sdc1 /mnt/eĥoŝanĝoĉiuĵaŭde
bruno@box:~$ ls /mnt/eĥoŝanĝoĉiuĵaŭde
somefile.txt   test1.txt      test2.txt      some_dir/
« Last Edit: December 10, 2019, 05:28:46 AM by GNUser »

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #4 on: December 11, 2019, 08:26:30 AM »
At the lowest level, the C library in Pure64 can handle UTF-8 in filenames beautifully:

Code: [Select]
bruno@box:~$ cat test.c
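#include <stdio.h>

int main(void)
{
    FILE *fp;

    /* Pass a UTF-8 filename straight to the C library. */
    fp = fopen("/home/bruno/eĥoŝanĝoĉiuĵaŭde.txt", "w+");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    fprintf(fp, "hello world");
    fclose(fp);
    return 0;
}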
bruno@box:~$ tce-load -wi compiletc
bruno@box:~$ gcc test.c
bruno@box:~$ ./a.out
bruno@box:~$ cat eĥoŝanĝoĉiuĵaŭde.txt
hello world

Also, it seems that GTK is UTF-8 aware out of the box, so the GTK2 and GTK3 libraries themselves are probably innocent. There's some helpful information here: https://wiki.gentoo.org/wiki/UTF-8

Since both GTK2 and GTK3 are affected, my guess is that the problem lies with one of their shared dependencies responsible for parsing filenames.

Alas, I know very little about the GUI stack. I will not be able to investigate this further without guidance from someone more knowledgeable.
« Last Edit: December 11, 2019, 08:44:30 AM by GNUser »

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #5 on: December 11, 2019, 08:33:15 AM »
P.S. The Gentoo wiki (see link above) has a "filenames" section on its UTF-8 page. It mentions convmv (not in the Pure64 repository) and iconv (part of glibc_apps.tcz). Loading glibc_apps.tcz does not solve the problem.

One last tidbit: My GTK2 and GTK3 applications running in Pure64 can display UTF-8 characters just fine. I can also type UTF-8 characters into those applications. The problem seems limited to filenames.

« Last Edit: December 11, 2019, 08:36:32 AM by GNUser »

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #6 on: December 11, 2019, 03:49:28 PM »
A helpful GNOME/GTK user (developer?) suggested this issue may be due to Unicode data tables missing from glib2. He recommended that I try installing glib2-locale.tcz: https://discourse.gnome.org/t/support-for-unicode-characters-in-gtk2-3-file-selection-box/2338/5

Alas, glib2-locale.tcz is not available in the Pure64 repository :( I put in an extension request.

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11178
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #7 on: December 11, 2019, 05:49:21 PM »
Hi GNUser
I just ran a diff between the 32-bit and 64-bit base-locale.tcz extensions and they appear to be identical. See if this works for you:
http://tinycorelinux.net/10.x/x86/tcz/glib2-locale.tcz

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #8 on: December 11, 2019, 07:40:24 PM »
Hi, Rich. Thank you for that. I loaded glib2-locale.tcz from x86 using the link you provided. It has no deleterious effects, but makes no difference with my issue :(

I looked at the contents of glib2-locale.tcz and noticed that my locale (en_US) is not represented. Could that have something to do with why it doesn't help with my issue?

I always use the lang=en_US.UTF-8 boot code, and mylocale.tcz (which contains only en_US.UTF-8) is in my onboot.lst. You'd think that would be sufficient to provide the Unicode data tables that glib2 needs? (Sorry if I sound clueless. Truth is I am clueless with regard to this issue.)
« Last Edit: December 11, 2019, 07:41:56 PM by GNUser »

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11178
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #9 on: December 11, 2019, 08:04:45 PM »
Hi GNUser
There's a previous version that contains:
Code: [Select]
usr/local/share/locale/en_GB/LC_MESSAGES/glib20.mo
usr/local/share/locale/en_CA/LC_MESSAGES/glib20.mo
found here:
http://tinycorelinux.net/9.x/x86/tcz/glib2-locale.tcz
I don't know how version dependent these files are. TC9 glib2 is version 2.52.3 but the locale file is 2.45.2.

I also noticed this:
http://tinycorelinux.net/10.x/x86/tcz/gtk2-locale.tcz

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #10 on: December 11, 2019, 08:31:55 PM »
Thanks, Rich. Loading gtk2-locale.tcz made no difference to my gtk2 application (thunderbird).

Hoping to take advantage of en_CA in glib2-locale.tcz, I deleted mylocale.tcz from tcedir/optional/, loaded getlocale.tcz, and generated a new mylocale.tcz containing en_CA.UTF-8. I rebooted with the new locale, then loaded glib2-locale.tcz. No difference.

Quite the stubborn problem!

P.S. Truth of the matter is that I only rarely need to create filenames with fancy characters. Where I get hit with this bug the most (sometimes daily) is when I try to print a webpage to PDF from my browser (iridium-browser, which uses GTK3) and there is a dash (not an ASCII hyphen/minus) somewhere in the webpage's name. As an example, check out duckduckgo.com's homepage. The dash in the page name triggers this bug--if I try to print the page to a PDF file, three "invalid file name" dialogs appear. The dialogs must be closed (in the correct order UGH) before I can delete the dash and proceed with printing.
« Last Edit: December 11, 2019, 08:40:18 PM by GNUser »
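
As a stopgap for the print-to-PDF case described above, a proposed filename can be checked for non-ASCII bytes before it reaches the file chooser. The sketch below is only an illustration: the function name has_non_ascii is hypothetical, and it assumes that any byte outside printable ASCII is what trips the dialog on this system, which matches the reports in this thread but has not been confirmed.

```sh
#!/bin/sh
# Sketch only: succeed (exit status 0) if the given name contains any byte
# outside printable ASCII (octal 040-176). has_non_ascii is a hypothetical
# helper name. Control bytes are flagged as well.
has_non_ascii() {
    # Delete every printable ASCII byte; whatever remains is suspect.
    rest=$(printf '%s' "$1" | LC_ALL=C tr -d '\040-\176')
    [ -n "$rest" ]
}

# Example: classify a couple of candidate filenames.
for name in "report.pdf" "eĥoŝanĝo.pdf"; do
    if has_non_ascii "$name"; then
        echo "contains non-ASCII bytes: $name"
    else
        echo "printable ASCII only: $name"
    fi
done
```

This only detects the problem characters; it does not fix the underlying failure in the file chooser.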

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 14516
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #11 on: December 11, 2019, 08:53:18 PM »
As I understand it, en_US is the default in Linux, so no additional locale files are required.

Offline curaga

  • Administrator
  • Hero Member
  • *****
  • Posts: 10957
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #12 on: December 12, 2019, 12:43:08 AM »
Indeed, the .mo files are translated messages, not tables/etc. They would let you get glib2 errors in German for example.

Offline GNUser

  • Hero Member
  • *****
  • Posts: 1343
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #13 on: December 12, 2019, 05:14:21 AM »
Thank you, juanito and curaga. So we can confidently eliminate glib2-locale as the missing piece here.

Online Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11178
Re: GTK2 and GTK3 cannot handle Unicode characters in file names
« Reply #14 on: December 12, 2019, 06:00:55 AM »
Hi GNUser
Quote from: GNUser on December 12, 2019, 05:14:21 AM
"Thank you, juanito and curaga. So we can confidently eliminate glib2-locale as the missing piece here."
As well as any other -locale extension, since they only translate pre-canned messages (e.g., "Invalid file name") compiled into the library.