Author Topic: UTF-8 (Read 11480 times)

bmarkus · « **on:** August 21, 2010, 01:13:00 PM »

As TC users are all around the world, localization is important. I see the fear that it will increase size when a small footprint is one of the key differentiators of TC/MC. Fortunately, the other key differentiator is modularity.

Localization requires many things: tools to easily set up and change the language, font, and keyboard; availability of translated applications; moving to UTF-8; and so on. It is certainly a long process. The good thing is that most of it can be done in parallel with base development, at the extension level.

Moving ncurses out of the base is one of the crucial initial steps. I will submit the UTF-8 (wide) version of ncurses. I think establishing a small ad hoc team (task force) would be useful to get the work started.

Feedback is welcome.

tinypoodle · « **Reply #1 on:** August 21, 2010, 11:07:40 PM »

Perhaps a meta-extension (similar to compiletc) for localization could be useful?

curaga · « **Reply #2 on:** August 22, 2010, 12:21:41 AM »

I suppose a script to create a locale-archive would be useful. That way it has to be done only once, and doesn't contain bloat unnecessary to the user.
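To illustrate what building a customized locale-archive involves, here is a minimal sketch that turns a glibc SUPPORTED-style entry ("name/charset") into a localedef command. It only prints the command instead of running it. The helper name localedef_cmd is hypothetical, and nothing here is taken from getlocale.sh, whose internals are not shown in this thread.

```shell
#!/bin/sh
# Sketch only: localedef_cmd is a hypothetical helper, not part of getlocale.sh.

# localedef_cmd ENTRY: print (do not run) a localedef command for a
# SUPPORTED-style entry of the form "name/charset".
localedef_cmd() {
    name=${1%%/*}       # part before the slash, e.g. de_DE.UTF-8
    charset=${1##*/}    # part after the slash, e.g. UTF-8
    # Source locale: the name without its ".charset" part, keeping any
    # @modifier such as @euro.
    input=$(echo "$name" | sed 's/\([^.]*\)[^@]*\(.*\)/\1\2/')
    echo "localedef -i $input -f $charset $name"
}
```

For example, `localedef_cmd 'de_DE.UTF-8/UTF-8'` prints `localedef -i de_DE -f UTF-8 de_DE.UTF-8`. When run without --no-archive, glibc's localedef adds the compiled locale to the locale-archive, so only the selected locales end up in it.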

There is no stable FLTK with utf-8, but everything else should be possible.

hiro · « **Reply #3 on:** August 22, 2010, 02:40:50 AM »

It's always good if the core gets smaller. If that also makes UTF-8 support easier, great!
I don't like ncurses anyway.

curaga · « **Reply #4 on:** August 22, 2010, 04:56:31 AM »

Said script posted:

http://distro.ibiblio.org/pub/linux/distributions/tinycorelinux/3.x/tcz/getlocale.tcz.info

Quote

Title: getlocale.tcz
Description: Script to build customized locale support
Version: 1.0
Author: Curaga
Original-site: http://tinycorelinux.com
Copying-policy: GPLv3
Size: 4k
Extension_by: Curaga
Comments: To avoid having one huge locale support extension, this
      script builds a customized one according to your
      selections.
-
      If you load this in the console, the script is called
      getlocale.sh.
-
      The new extension will be called mylocale.tcz.
Change-log:
Current: 2010/08/22 Original

SvOlli · « **Reply #5 on:** August 22, 2010, 07:28:24 AM »

Hello Bélà,

moving ncurses out of base sounds like a good idea. The only program in the base that depends on ncurses is /usr/bin/tset, which itself is not executed from any other program of the base. Once the decision has been made on what to do with it (remove or replace), there's only one thing left to do before ncurses can be moved out: write a script that peeks into the extensions to see if they link against ncurses, and adds a line to a .dep file if they do. Both requirements can be taken care of in a rather short timeframe, probably even during the current rc phase of 3.1.
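The dependency check described above could look roughly like the following sketch. It assumes the extension has already been unpacked into a directory and that readelf is available. The function names and the way the .dep file is passed in are hypothetical, not taken from any Tiny Core script.

```shell
#!/bin/sh
# Sketch only: all helper names here are hypothetical.

# links_ncurses: read `readelf -d` output on stdin and succeed if any
# NEEDED entry names a libncurses shared object (libncursesw matches too).
links_ncurses() {
    grep -q 'NEEDED.*\[libncurses'
}

# add_dep DEPFILE EXT: append EXT to DEPFILE unless that exact line exists.
add_dep() {
    if [ -f "$1" ] && grep -qxF "$2" "$1"; then
        return 0
    fi
    echo "$2" >> "$1"
}

# scan_tree DIR DEPFILE: look at every regular file under DIR (an unpacked
# extension) and record ncurses.tcz once if any of them links against it.
scan_tree() {
    find "$1" -type f | while read -r f; do
        if readelf -d "$f" 2>/dev/null | links_ncurses; then
            add_dep "$2" ncurses.tcz
            break
        fi
    done
}
```

A call such as `scan_tree /tmp/unpacked/foo foo.tcz.dep` (both paths made up for illustration) would leave foo.tcz.dep untouched unless some binary in the tree needs libncurses.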

Greetings,
SvOlli

curaga · « **Reply #6 on:** August 22, 2010, 07:33:55 AM »

The transition is done already

roberts · « **Reply #7 on:** August 22, 2010, 08:02:36 AM »

Quote

moving ncurses out of base sounds like a good idea. The only program in the base that depends on ncurses is /usr/bin/tset, which itself is not executed from any other program of the base

Already removed for 3.1rc2; libncurses.so* and tset are already in the recently posted ncurses.tcz.

TaoTePuh · « **Reply #8 on:** August 22, 2010, 08:53:50 AM »

@curaga

Thank you for the script!

Maybe you want to change line 66 to avoid double entries in onboot.lst :

Code: [Select]

grep -q mylocale.tcz "${TCEDIR}/onboot.lst" || echo "mylocale.tcz" >> "${TCEDIR}/onboot.lst"

curaga · « **Reply #9 on:** August 22, 2010, 09:06:41 AM »

Quote from: TaoTePuh on August 22, 2010, 08:53:50 AM

@curaga

Thank you for the script!

Maybe you want to change line 66 to avoid double entries in onboot.lst :

Code: [Select]
grep -q mylocale.tcz "${TCEDIR}/onboot.lst" || echo "mylocale.tcz" >> "${TCEDIR}/onboot.lst"

Updated to check for that.

Syun · « **Reply #10 on:** August 22, 2010, 10:53:30 AM »

Hello. I'm Japanese. Please excuse my weak English.

I make Japanese versions of Tiny Core Linux
and redistribute them for Japanese users.
[^thehatsrule^: removed, remaster?]

I need to remaster tinycore.gz for several reasons.
Most USB sticks are formatted as VFAT.
They must be mounted with the "codepage=932,iocharset=utf8" options to display Japanese characters.

a) A Japanese NLS module is needed for the kernel.
b) The mount options in rebuildfstab need to be changed.
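As a rough illustration of point b), the sketch below prints an fstab entry that carries the two Japanese options quoted above. The function name, the device and mount point in the usage note, and the base options (noauto,users,exec) are assumptions for illustration and are not copied from rebuildfstab.

```shell
#!/bin/sh
# Sketch only: vfat_fstab_line is a hypothetical helper; the base options
# noauto,users,exec are illustrative and not taken from rebuildfstab.

# vfat_fstab_line DEV MNT: print a tab-separated fstab entry for a VFAT
# device with the codepage and iocharset options for Japanese file names.
vfat_fstab_line() {
    printf '%s\t%s\tvfat\tnoauto,users,exec,codepage=932,iocharset=utf8\t0 0\n' "$1" "$2"
}
```

For instance, `vfat_fstab_line /dev/sdb1 /mnt/sdb1` prints one line for that (made-up) device. The options only work if the kernel can load the matching NLS support, which is point a).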

I can't implement this with localized extensions.
I also need to change the usbinstall script to change the kernel boot options.

Code: [Select]

APPEND initrd=/boot/$ROOTFS.gz quiet waitusb=5:"$TARGETUUID" tce="$TARGETUUID" lang=ja_JP.utf8 kmap=jp106 tz=Asia/Tokyo noutc showapps

tc-config executes the hwclock command before loading extensions.
Therefore the clock can't be adjusted after the extensions are loaded.

I think remastering is a smarter method than localized extensions.
But I don't reject localized extensions.

Thank you.

tinypoodle · « **Reply #11 on:** August 22, 2010, 01:10:28 PM »

Quote from: Syun on August 22, 2010, 10:53:30 AM

tc-config executes the hwclock command before loading extensions.
Therefore the clock can't be adjusted after the extensions are loaded.

I don't think that is particular to localization; the same applies if hwclock is to be synced over the network, which may require drivers/firmware.
There are several threads in the forum referring to this subject.

eluring · « **Reply #12 on:** August 23, 2010, 12:06:35 PM »

@curaga
I am pleasantly surprised by the speed with which proposals are adopted and implemented.
Thank you for the script!

@all
Just an idea for improvement

Code: [Select]

71 echo "Reboot with lang=xx_YY to start using this."

This message could avoid confusion, but it also creates some: if UTF-8 is chosen, it has to be lang=xx_YY.UTF-8
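One way to make the reboot hint match the chosen locale is to derive it from the selected entry. This is a minimal sketch that assumes entries have the SUPPORTED form "name/charset"; lang_hint is a hypothetical name and is not part of getlocale.sh.

```shell
#!/bin/sh
# Sketch only: lang_hint is a hypothetical helper.

# lang_hint ENTRY: print the lang= boot code for a "name/charset" entry by
# dropping everything from the first slash onward.
lang_hint() {
    echo "lang=${1%%/*}"
}
```

With this, `lang_hint 'de_DE.UTF-8/UTF-8'` prints `lang=de_DE.UTF-8`, while `lang_hint 'de_AT@euro/ISO-8859-15'` prints `lang=de_AT@euro`, so the message no longer has to guess the encoding part.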

My proposal:
As you have been happily living without locales and the demand is UTF-8, cut the old braids (ISOs)

Code: [Select]

grep 'UTF-8/' < SUPPORTED | cut -d '/' -f 1 > SUPPORTED2.utf8

The number of locales decreases from 415 to 148;
we don't need to copy bloated stuff, do we? Let TC stay tiny in extensions, too.
If every locale is UTF-8 only, we can totally omit the term 'UTF-8' in the getlocale.sh dialog and in the lang boot code.
Just use it!
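To show what the proposed filter keeps, here is a small worked example on four made-up SUPPORTED-style lines. The function names and the sample data are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: utf8_names and sample_supported are hypothetical helpers.

# utf8_names: read "name/charset" lines on stdin and print the names of the
# entries whose name part ends in "UTF-8" (same pipeline as proposed above).
utf8_names() {
    grep 'UTF-8/' | cut -d '/' -f 1
}

# sample_supported: four illustrative entries, not the real SUPPORTED file.
sample_supported() {
    printf '%s\n' \
        'de_DE.UTF-8/UTF-8' \
        'de_DE/ISO-8859-1' \
        'de_AT@euro/ISO-8859-15' \
        'et_EE.UTF-8/UTF-8'
}
```

Running `sample_supported | utf8_names` prints de_DE.UTF-8 and et_EE.UTF-8, one per line. Note that the pattern looks for "UTF-8" directly before the slash, so an entry written as "name/UTF-8" without the suffix in its name would be skipped.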

Less is more (usability)

Discussion?

ps: The dialog looks very nice in Aterm, and would look even nicer if long terms like 'de_AT@euro/ISO-8859-15' fit into the frame. After the proposed cut everything will fit.

curaga · « **Reply #13 on:** August 24, 2010, 06:23:16 AM »

Quote from: eluring on August 23, 2010, 12:06:35 PM

@curaga
I am pleasantly surprised by the speed with which proposals are adopted and implemented.
Thank you for the script!

I was bored

You're welcome.

Quote

@all
Just an idea for improvement
Code: [Select]
71 echo "Reboot with lang=xx_YY to start using this."

This message could avoid confusion, but it also creates some: if UTF-8 is chosen, it has to be lang=xx_YY.UTF-8

Since there are various encodings for each locale, plus possible @euro, I doubt saying the utf-8 bit would help much.

Quote

My proposal:
As you have been happily living without locales and the demand is UTF-8, cut the old braids (ISOs)
Code: [Select]
grep 'UTF-8/' < SUPPORTED | cut -d '/' -f 1 > SUPPORTED2.utf8

The number of locales decreases from 415 to 148;
we don't need to copy bloated stuff, do we? Let TC stay tiny in extensions, too.
If every locale is UTF-8 only, we can totally omit the term 'UTF-8' in the getlocale.sh dialog and in the lang boot code.
Just use it!

I don't think cutting on the selection would be that useful. The extension would not drop in size at all, since squashfs rounds up to the blocksize (4k). Users may want any supported locale.
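The rounding argument can be illustrated with a little arithmetic. The sketch below rounds a byte count up to the next multiple of a block size, 4096 by default. The function name and the byte counts in the usage note are made up for illustration and are not measurements of getlocale.tcz.

```shell
#!/bin/sh
# Sketch only: round_up_block is a hypothetical helper.

# round_up_block BYTES [BLOCK]: print BYTES rounded up to the next multiple
# of BLOCK (default 4096).
round_up_block() {
    block=${2:-4096}
    echo $(( ($1 + block - 1) / block * block ))
}
```

Both `round_up_block 1500` and `round_up_block 3900` print 4096, so shrinking the contents from 3900 to 1500 bytes would not change the size of the resulting file.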

Also, any utf-8 locale creates a much bigger locale-archive than the corresponding ISO-encoded one.

Quote

Less is more (usability)

Discussion?

ps: The dialog looks very nice in Aterm, and would look even nicer if long terms like 'de_AT@euro/ISO-8859-15' fit into the frame. After the proposed cut everything will fit.

The three numbers (0 0 10) correspond to height, width, and items to show. If you wish to tune the middle number, post what number looks good.
It's in characters, so 22 would be a minimum.
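To find a width that fits every entry, one could measure the longest line in the list. This is a minimal sketch with a hypothetical helper name. It reports only the text length, and how many extra columns the dialog needs for borders and padding is left open here.

```shell
#!/bin/sh
# Sketch only: longest_len is a hypothetical helper.

# longest_len: read lines on stdin and print the length of the longest one
# (0 if there is no input).
longest_len() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}
```

For example, `printf '%s\n' 'de_AT@euro/ISO-8859-15' | longest_len` prints 22, and feeding it 'et_EE.ISO-8859-15/ISO-8859-15' prints 29.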

eluring · « **Reply #14 on:** August 24, 2010, 09:33:38 AM »

Quote

The three numbers (0 0 10) correspond to height, width, and items to show. If you wish to tune the middle number, post what number looks good.

Code: [Select]

42 will satisfy any potential user from Estonia
'et_EE.ISO-8859-15/ISO-8859-15'
