
Author Topic: Suggestion: XZ compression in kernel + tcz; squashfs with xz + bigger block size  (Read 13257 times)

Offline nick65go

  • Hero Member
  • *****
  • Posts: 799
A summary of storage space for ALL the tcz of TC10x64 as of 16/05/19

     size [bytes]    File        count
4,341,977,088    Total        2458
  679,399,424    dev only      689
   87,920,640    doc only      114
   22,732,800    gir only      134
   54,812,672    lang only      60
  232,706,048    kernel         71
3,264,405,504    rest         1390

This shows that less than 8 GB of storage (USB/HDD/SSD) is necessary to hold all the packages.
Storage is not a big constraint anymore today (OK, let's not abuse it). How many Linux users have a USB/HDD smaller than 8 GB but can afford an x64 CPU?
Remark 1: storage is not a real problem that TCx64 solves today.

RAM is hard (nearly impossible) to upgrade in modern laptops. So let's optimize for RAM size :)
Any x64 CPU is (a little) more powerful than an x86 CPU.
So better compression (and a smaller size) for core.gz (= rootfs.gz + modules.gz) would help.
Even better if a boot code is used (by some users) to copy (not mount) some firmware*.tcz or *-tinycore64.tcz into RAM.

Proposal for x64 (not for x86):
- add xz compression support to the kernel for loading modules
- add xz compression to {mk,un}squashfs.tcz
- possibly a bigger (> 4 KB) block size for larger extensions (such as > 50 MB?)
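To put a number on the core.gz part of the proposal, here is a rough sketch of how one could repack the gzip initrd with xz and compare sizes. It assumes a local copy of core.gz in the current directory; it runs unprivileged (so device nodes may be skipped, which is fine for a size comparison), and the kernel requires `--check=crc32` to unpack an xz initramfs:

```shell
#!/bin/sh
# Sketch: repack the gzip initrd (core.gz) with xz and compare sizes.
# Assumes core.gz is in the current directory; runs unprivileged, so
# device nodes in the archive may be skipped (fine for a size test).
if [ -f core.gz ]; then
    mkdir -p core-tree
    ( cd core-tree && zcat ../core.gz | cpio -i -H newc -d 2>/dev/null )
    ( cd core-tree && find . | cpio -o -H newc 2>/dev/null ) \
        | xz -9e --check=crc32 > core.xz   # crc32: required by the kernel
    ls -l core.gz core.xz
    repacked=yes
else
    echo "core.gz not found - copy it here first"
    repacked=no
fi
```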

 51,507,200   palemoon.tcz
 54,022,144   fpc.tcz
 54,099,968   wine.tcz
 63,582,208   openshot.tcz
 63,787,008   libclc.tcz
 66,842,624   firefox-ESR.tcz
 68,485,120   dotnet-runtime.tcz
 68,829,184   qt-5.x-webengine.tcz
 85,229,568   mono.tcz
 91,643,904   chromium-browser.tcz
101,867,520   lazarus.tcz
105,447,424   clang.tcz
147,341,312   libreoffice.tcz
147,984,384   timidity-freepats.tcz

Starting with the x64 versions, the TC target moves to more powerful CPUs (decompression speed is not the limiting factor anymore).
- x64 started in 2004, 15 years ago. What target CPU/audience is TCx64 for in 2019?

The future attraction of TC comes from using less RAM and the diversity of applications.
Summary: the strong points of TinyCore x64 (versus other/bigger Linux distributions) remain modularity and small RAM-size packages.
« Last Edit: May 17, 2019, 05:44:45 AM by nick65go »

Offline jazzbiker

  • Hero Member
  • *****
  • Posts: 933
Hi, nick65go!

I've read the Corebook - it has been my favorite book since not long ago! It explains the reasons for using gzip instead of xz for extensions. If a squashed extension is loop-mounted, it is not moved to RAM as a whole; only the compression overhead consumes RAM. And for gzipped extensions this overhead is much smaller than for xz. So using gzipped extensions decreases RAM usage and increases the disk space needed - everything just as you want! Modern storage sizes would accept even uncompressed extensions.
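One crude way to put a number on that overhead is to sample MemAvailable before and after a loop mount. This is only a sketch: it needs root, the numbers are noisy (average several runs), and `some.tcz` is a stand-in for whatever extension you have locally:

```shell
#!/bin/sh
# Sketch: approximate the RAM cost of loop-mounting one extension by
# sampling MemAvailable before and after. Numbers are noisy; average
# several runs. Needs root and an extension file named some.tcz.
mem_kb() { awk '/^MemAvailable/ {print $2}' /proc/meminfo; }
if [ -f some.tcz ] && [ "$(id -u)" = 0 ]; then
    before=$(mem_kb)
    mkdir -p /mnt/tcz-test
    mount -o loop,ro some.tcz /mnt/tcz-test
    after=$(mem_kb)
    echo "approx kB consumed by the mount: $((before - after))"
    umount /mnt/tcz-test
else
    echo "needs root and a local some.tcz; skipping"
fi
```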

Offline andyj

  • Hero Member
  • *****
  • Posts: 1020
Quote
Summary: the strong points of TinyCore x64 (versus other/bigger Linux distributions) remain modularity and small RAM-size packages.

Ever worked on embedded systems? Yes, they do exist, and yes, they are resource limited and frequently without local writable storage.

The RAM-only design also works well for internet-facing servers, which face a continuous barrage of attack bots. If one were compromised or went down, a reboot would restore the root file system and executables to a clean state (like the ghost twins in the Matrix).

Offline Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 14516
Quote
- add xz compression to {mk,un}squashfs.tcz

squashfs-tools supports several compression formats, so you can try xz and others to see what difference it makes if you so wish  :)
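For example, a minimal sketch of such a try-out (here `myext/` is a placeholder for any unpacked extension directory, and the installed mksquashfs must have xz support compiled in):

```shell
#!/bin/sh
# Sketch: repack an unpacked extension directory with xz instead of the
# default gzip, with a larger block size. "myext" is a placeholder name.
if command -v mksquashfs >/dev/null 2>&1 && [ -d myext ]; then
    mksquashfs myext myext-xz.tcz -comp xz -b 256K -noappend
    ls -l myext-xz.tcz
    status=ran
else
    echo "needs squashfs-tools and an unpacked extension in ./myext"
    status=skipped
fi
```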

Offline hiro

  • Hero Member
  • *****
  • Posts: 1217
lz4 might be more practical.

ram shouldn't be any issue.

also, don't use those big packages in the first place :P

btw, i always use copy2fs.flg. still plenty free. and my laptop is now nearly a decade old hahaha
« Last Edit: May 17, 2019, 06:54:08 AM by hiro »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 799
hi andyj, agreed, running from RAM is a very strong advantage; sorry I missed listing it among TC's strong points! (like in other distros, alpine linux, etc.)

As for laptops (same as for embedded systems), I would rather boot with kernel+modules (xz compressed) and chroot into x86 for most applications; exceptions maybe for applications such as servers (SQL, MariaDB), SMP parallel processing (libreoffice-calc), video/audio editors (Blender?) or virtual machine hosts (qemu, vbox).
Only where x64 makes a visible difference.

hi jazzbiker, the Corebook is beautiful. It was written with TC x86 in mind :); I agree, MOUNTING squashed tcz does not matter much; but as I said, LOADING them into RAM, then mounting from RAM (copy flags) matters for a 6 MB core. The gz-versus-xz overhead matters for x86, not so much for x64; do you agree?

So, I repeat my main dilemma: what target CPU/audience is TCx64 for in 2019?
Maybe I am wrong - is something like an Intel Core Duo 1.7 GHz with 2 GB RAM the minimum target for a useful MINIMAL Linux from 2014, 5 years ago?
I take care of my devices, but for a maximum of 5-6 years (they are designed to fail!, so the maintenance cost, mostly the battery, is not worth it for me).
[e.g.: $30-50 /battery /2 years x 3-4 years = $100-200 (a new PC), without considering video card overheating/failure, keyboard keys only partially working, etc.; stuff happens over time.]
« Last Edit: May 17, 2019, 07:31:16 AM by nick65go »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 799
Quote
squashfs-tools supports several compression formats, so you can try xz and others to see what difference it makes if you so wish  :)
thanks juanito, I did not know that you had updated squashfs.tcz to include xz as an option. It was not there for a long time. I will check again.

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11178
Hi nick65go
For more information, try this ::):
Code: [Select]
tc@box:~$ mksquashfs --help
SYNTAX:mksquashfs source1 source2 ...  dest [options] [-e list of exclude
dirs/files]

Filesystem build options:
-comp <comp> select <comp> compression
Compressors available:
gzip (default)
lzo
xz
zstd
-b <block_size> set data block to <block_size>.  Default 128 Kbytes
Optionally a suffix of K or M can be given to specify
Kbytes or Mbytes respectively
-no-exports don't make the filesystem exportable via NFS
-no-sparse don't detect sparse files
-no-xattrs don't store extended attributes (default)
-xattrs store extended attributes
-noI do not compress inode table
-noD do not compress data blocks
-noF do not compress fragment blocks
-noX do not compress extended attributes
-no-fragments do not use fragments
-always-use-fragments use fragment blocks for files larger than block size
-no-duplicates do not perform duplicate checking
-all-root make all files owned by root
-force-uid uid set all file uids to uid
-force-gid gid set all file gids to gid
-nopad do not pad filesystem to a multiple of 4K
-keep-as-directory if one source directory is specified, create a root
directory containing that directory, rather than the
contents of the directory

Filesystem filter options:
-p <pseudo-definition> Add pseudo file definition
-pf <pseudo-file> Add list of pseudo file definitions
Pseudo definitions should be of the format
filename d mode uid gid
filename m mode uid gid
filename b mode uid gid major minor
filename c mode uid gid major minor
filename f mode uid gid command
filename s mode uid gid symlink
-sort <sort_file> sort files according to priorities in <sort_file>.  One
file or dir with priority per line.  Priority -32768 to
32767, default priority 0
-ef <exclude_file> list of exclude dirs/files.  One per line
-wildcards Allow extended shell wildcards (globbing) to be used in
exclude dirs/files
-regex Allow POSIX regular expressions to be used in exclude
dirs/files

Filesystem append options:
-noappend do not append to existing filesystem
-root-becomes <name> when appending source files/directories, make the
original root become a subdirectory in the new root
called <name>, rather than adding the new source items
to the original root

Mksquashfs runtime options:
-version print version, licence and copyright message
-exit-on-error treat normally ignored errors as fatal
-recover <name> recover filesystem data using recovery file <name>
-no-recovery don't generate a recovery file
-quiet no verbose output
-info print files written to filesystem
-no-progress don't display the progress bar
-progress display progress bar when using the -info option
-processors <number> Use <number> processors.  By default will use number of
processors available
-mem <size> Use <size> physical memory.  Currently set to 503M
Optionally a suffix of K, M or G can be given to specify
Kbytes, Mbytes or Gbytes respectively

Miscellaneous options:
-root-owned alternative name for -all-root
-o <offset> Skip <offset> bytes at the beginning of the file.
Default 0 bytes
-noInodeCompression alternative name for -noI
-noDataCompression alternative name for -noD
-noFragmentCompression alternative name for -noF
-noXattrCompression alternative name for -noX

-Xhelp print compressor options for selected compressor

Compressors available and compressor specific options:
gzip (default)
  -Xcompression-level <compression-level>
<compression-level> should be 1 .. 9 (default 9)
  -Xwindow-size <window-size>
<window-size> should be 8 .. 15 (default 15)
  -Xstrategy strategy1,strategy2,...,strategyN
Compress using strategy1,strategy2,...,strategyN in turn
and choose the best compression.
Available strategies: default, filtered, huffman_only,
run_length_encoded and fixed
lzo
  -Xalgorithm <algorithm>
Where <algorithm> is one of:
lzo1x_1
lzo1x_1_11
lzo1x_1_12
lzo1x_1_15
lzo1x_999 (default)
  -Xcompression-level <compression-level>
<compression-level> should be 1 .. 9 (default 8)
Only applies to lzo1x_999 algorithm
xz
  -Xbcj filter1,filter2,...,filterN
Compress using filter1,filter2,...,filterN in turn
(in addition to no filter), and choose the best compression.
Available filters: x86, arm, armthumb, powerpc, sparc, ia64
  -Xdict-size <dict-size>
Use <dict-size> as the XZ dictionary size.  The dictionary size
can be specified as a percentage of the block size, or as an
absolute value.  The dictionary size must be less than or equal
to the block size and 8192 bytes or larger.  It must also be
storable in the xz header as either 2^n or as 2^n+2^(n+1).
Example dict-sizes are 75%, 50%, 37.5%, 25%, or 32K, 16K, 8K
etc.
zstd
  -Xcompression-level <compression-level>
<compression-level> should be 1 .. 22 (default 15)
tc@box:~$

Offline nick65go

  • Hero Member
  • *****
  • Posts: 799
thanks rich, my fault, I did not check it; lesson learnt!

Offline hiro

  • Hero Member
  • *****
  • Posts: 1217
try out zstd then, nick :)

Offline andyj

  • Hero Member
  • *****
  • Posts: 1020
Remember to keep your eye on the prize when it comes to compression. The goal might be the smallest size when transferring large files over the internet, because bandwidth is the limiting factor. When booting, however, size is less important: you compress once but decompress many times, so the fastest decompression is preferred. If you want to make a convincing argument, make a test we can all run on the different systems we have, so we can get some data on the compression flavors.
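As a starting point, here is a deliberately crude sketch of such a test on a plain generated file. A real run would build squashfs images from an actual extension tree and time the unsquash step as well; gzip/xz/zstd binaries are assumed, and any that are missing are skipped:

```shell
#!/bin/sh
# Crude sketch: same input, several compressors, report compressed size.
# A proper test would use mksquashfs on a real extension tree and time
# decompression too, since decompression speed is what matters at boot.
seq 1 200000 > sample.txt          # repetitive but nontrivial input
orig=$(wc -c < sample.txt)
for c in gzip xz zstd; do
    if command -v "$c" >/dev/null 2>&1; then
        "$c" -9 -c sample.txt > "sample.$c"
        echo "$c: $(wc -c < "sample.$c") of $orig bytes"
    else
        echo "$c: not installed, skipped"
    fi
done
```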

Offline hiro

  • Hero Member
  • *****
  • Posts: 1217
thanks andyj. that's kinda what i would have preferred to have said, hehe.
https://code.fb.com/core-data/zstandard/

Offline curaga

  • Administrator
  • Hero Member
  • *****
  • Posts: 10957
Just wondering what the target case here is. "Copy the tcz to RAM, then mount from there" is not a mode we designed. copy2fs unpacks to RAM, so XZ compression would not save any RAM for copy2fs.

There are also slow x64 cpus, Atoms and Bobcats mainly.

I do think x64 does not have the constraints of x86, and could use higher block sizes or a different algo. The question is what makes the most sense.
The only barriers that can stop you are the ones you create yourself.

Offline nick65go

  • Hero Member
  • *****
  • Posts: 799
Thank you all who helped with your ideas/reasons. Of course I can re-compress the tcz, re-master the core, etc.; that is not the point, otherwise I could just recompile the kernel tuned to my CPU and be done. But that is not portable, and it will not help the TC user community.

hi curaga; as you said
Quote
I do think x64 does not have the constraints of x86, and could use higher block sizes or a different algo. The question is what makes the most sense.
I just asked, to clarify for myself and maybe other users, what the experts/developers intend to do in the future for TCx64: what target audience is covered regarding resources (minimum CPU), what speed gains (if any) come from changing the compression algorithm for booting or internet bandwidth, any lower RAM usage, etc.

EDIT: I am just a user, without multiple machines/devices to test on, nor the time (or knowledge) to PROPERLY test and measure the various data sets. Plus, if the gains are too small, maybe it is not worth my time. So I had better trust the experts/experiments.
« Last Edit: May 17, 2019, 10:41:38 AM by nick65go »

Offline hiro

  • Hero Member
  • *****
  • Posts: 1217
i already gave my rather naive opinion (that everything but lz4 will be too slow to utilize the system to its fullest available extent - in this case the bottleneck is believed to be the IO, not the cpu), BUT i think testing it out in practice would be a very, very valuable experience. :)

about higher block sizes: if you *really* care about all performance scenarios, more modern algorithms using the dictionary approach seem more practical than higher block sizes.

it's an exciting topic, so i hope you have fun :)
« Last Edit: May 17, 2019, 02:16:08 PM by hiro »