Tiny Core Linux
Tiny Core Extensions => TCE Talk => Topic started by: gutmensch on October 29, 2010, 09:12:18 AM
-
Heya,
since the man viewer is able to decompress gzip'ed man pages on the fly (at least it should be if configured correctly, says the Makefile), I'd suggest shipping man pages in extensions only in compressed form, with a .gz suffix, as other distros do. What do you think? Maybe the extension audit script could even check for uncompressed man pages...
Best regards,
Robert
-
Perhaps a good point. To do that and preserve the symlinks that would be broken by the renaming to .gz, Slackware uses this routine in its SlackBuild scripts. I can also add a "check and repack if not gzipped" function to the audit routine. I will add and test it before using it on live packages.
if [ -d "$PKG/usr/man" ]; then
  ( cd "$PKG/usr/man" || exit 1
    # Visit every section directory (man1, man5, ...)
    for manpagedir in $(find . -type d -name "man*"); do
      ( cd "$manpagedir" || exit 1
        # Re-point each symlink at the .gz name its target will get
        for eachpage in $(find . -maxdepth 1 -type l); do
          ln -s "$(readlink "$eachpage").gz" "$eachpage.gz"
          rm "$eachpage"
        done
        # Compress the remaining regular pages
        gzip -9 *.?
      )
    done
  )
fi
-
Just an observation:
busybox gzip does not appear to have a "-9" option.
However, when specifying this option from command line, it simply gets ignored, it seems.
-
Actually, tonight I will test and see what difference it makes to gzip the individual files versus just letting the squashfs compression take care of it.
But gzipping the files themselves will help those who use the copy-to-filesystem loading as opposed to mounted mode.
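For anyone who wants to try the copy-to-filesystem side of this comparison, here is a minimal sketch. The function name man_size_report is made up for illustration, and it only compares raw bytes against per-file gzip output. It does not build a tcz, so it says nothing about what squashfs would do on its own. It uses gzip's default level, since busybox gzip may ignore -9 as noted above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report how many bytes the
# regular files under a man directory occupy raw versus gzipped.
# Files are compressed to a pipe and counted; nothing on disk changes.
man_size_report() {
    dir=$1
    # Total size of all regular files as they are now
    raw=$(find "$dir" -type f -exec cat {} + | wc -c | tr -d ' ')
    # Sum of the sizes each file would have after gzip
    packed=$(find "$dir" -type f \
        -exec sh -c 'for f; do gzip -c "$f" | wc -c; done' sh {} + |
        awk '{ s += $1 } END { print s + 0 }')
    echo "raw=$raw gzip=$packed"
}
```

Usage would be something like `man_size_report /tmp/tcloop/some-doc/usr/local/man`, where the path is only an example of where a mounted extension might live.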
-
Oh, and I would likely use gnu gzip in the audit routine to use the -9 option, but I will see what difference it makes too.
-
I made a script to compress man pages. I will post it here for testing for now and upload it to programming and scripting after any needed changes are made. I can also plug it into the audit script when checking. Basically it compresses man pages, fixes the symlinks, and rebuilds the package if uncompressed man pages are found.
Also, the man command supports bzip2, so that is what is used here.
I see that man directories are found in various places in extensions. I think we should standardize on one, and /usr/local/man would likely be the best.
##Script moved to programming and scripting - 11.4.10
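To see which man locations an unpacked extension actually uses, something like the following could help. This is only a sketch: list_man_roots is a hypothetical name, and it assumes section directories are named man1, man3p, man8 and so on.

```shell
#!/bin/sh
# Hypothetical helper: print each directory that contains man section
# directories (man1, man3p, man8, ...) below the given root, one per line.
list_man_roots() {
    find "$1" -type d -name 'man[0-9]*' |
        sed 's|/[^/]*$||' |   # strip the section directory itself
        sort -u
}
```

Run against an unpacked extension tree, it would list entries such as .../usr/man, .../usr/share/man or .../usr/local/man, which shows at a glance what would need moving if /usr/local/man became the standard.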
-
Was it smaller than just bare squashfs?
-
I see a space savings of about 10% in the size of the actual tcz in the tests I ran. But there is an issue with hard links that exist in some doc extensions, which I will have to work in.
For copy-2-fs mode the space savings would be much higher. Since man pages take up only a small part of the average person's system, we will need to consider whether the extra trouble is worth it, but it is at least worth pondering.
-
As a n00b I would massively appreciate the integration of the man pages into apps. Often I have to google around for info, and more often than not I run across 'man xyz' and lament that I can't do it.
-
There are many internet man pages readable from TCL.
EX:
http://www.manpagez.com/man/1/ls/
-
As a n00b I would massively appreciate the integration of the man pages into apps. Often I have to google around for info, and more often than not I run across 'man xyz' and lament that I can't do it.
http://distro.ibiblio.org/pub/linux/distributions/tinycorelinux/3.x/tcz/man-pages.tcz.info
-
I am aware of these and others, but man pages for specific apps are what I was referring to.
-
Check out the -doc.tcz extensions, that is where they are normally found.
I have an aversion to tampering with files in submitted extensions unless it is absolutely necessary, so I am still thinking about whether this is a road we want to go down just to gain a 10% reduction in the size of doc extensions. And the hard link situation I ran into may bring a host of potential issues with it. Still pondering.
-
I see compressing the man pages as just one step toward a clear and optimal file structure in extensions. There are so many extensions that don't have a -dev although they should (e.g. jack), some with really big man pages (like lvm), some with libs in them that should be split out, etc.
It would even be possible to patch the man page viewer to use xz, which could reduce the size even more and since it's in base, why not make use of it?
So in the tradition of trying to "make it better" than other distros I'd of course prefer xz for man pages ;-)
-
Ok, I have dealt with the hard link situation by converting them to softlinks to allow compression of the files. Tcl-doc was originally 1.187 MB, and after compressing the man pages with gzip it is now 1.015 MB. A decent reduction in tcz size, to about 85% of the original. Also, for those who copy-2-fs, the space taken up by the extension in RAM is 3.2 MB before compression and 1.4 MB after, around 44% of the original. However, I only noticed about a 1 or 2 percent space savings from using bzip2 instead of gzip, which is consistent with other reports I have read.
Most of the largest -doc extensions are made up of html pages, which we cannot compress. And man pages as well as html abound on the internet. But for the sake of principle, if it can be done simply and reliably, I think we can compress the man pages as a general rule. I would like it to be used or tested a bit before trying any wholesale conversion of doc extensions. For now, those who are interested can use the script and check the result.
I have edited the above script with the changes. It works with gzip or bzip2 by changing a variable, but we may as well use gzip as there is no real benefit to bzip2.
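The script itself is not shown in this thread, so the following is only a sketch of what converting hard links to softlinks could look like, not the actual routine. The name hardlinks_to_symlinks is hypothetical, it handles a single directory only, and it relies on the -ef test, which bash, dash and busybox ash provide but strict POSIX does not require. The step matters because gzip refuses to compress a file that has more than one link.

```shell
#!/bin/sh
# Hypothetical helper: within one directory, keep the first name of each
# hard-linked file as the real file and turn the other names into
# relative symlinks that point at it.
hardlinks_to_symlinks() {
    ( cd "$1" || exit 1
      for f in *; do
          # Only real regular files can serve as the link target
          [ -f "$f" ] && [ ! -L "$f" ] || continue
          for g in *; do
              [ "$f" != "$g" ] || continue
              [ -f "$g" ] && [ ! -L "$g" ] || continue
              # -ef is true when both names refer to the same inode
              if [ "$f" -ef "$g" ]; then
                  rm -- "$g"
                  ln -s -- "$f" "$g"
              fi
          done
      done )
}
```

The nested loop is quadratic in the number of files, which should be acceptable for a man section directory but is a deliberate simplification.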
-
That makes me wonder if it would be possible to squeeze out a bit more by using gzip and additionally invoke advdef (as recommended for initrd's).
-
Makes sense that advdef should work, can at least try it out.
Also, info pages can be compressed too and read from info readers, I will add that into the routine.
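Trying advdef on an already gzipped page could look like the sketch below. It assumes advdef from AdvanceCOMP is installed and uses the `advdef -z4` form commonly used for initrd images. The wrapper name recompress_gz is made up, and whether the gain is worth it for small man pages is untested here.

```shell
#!/bin/sh
# Hypothetical wrapper: recompress an existing .gz file in place with
# advdef if it is available, otherwise leave the file untouched.
recompress_gz() {
    file=$1
    # Refuse to touch anything that is not a valid gzip file
    gzip -t "$file" || return 1
    if command -v advdef >/dev/null 2>&1; then
        advdef -z4 "$file"
    else
        echo "advdef not found; leaving $file as is" >&2
    fi
}
```

The result stays a normal .gz file, so the man viewer would not need any changes to read it.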
-
Added compression of info files, which info readers can also read when compressed with bzip2. These normally larger files show a significant size reduction with bzip2, so I am still pondering the best overall compression format. Info files benefit from bz2, and man pages won't suffer from it. Performance is not critical when compressing or reading man pages, so I am leaning back toward bz2, at least for the info pages, even though the compression difference is insignificant for the smaller files.
-
Moved the script to programming and scripting.
-
I am going to compress the man/info pages of the extensions of mine that I upload. Bzip2 gives a good advantage for the larger info files, and it would probably be good to standardize on one compression method for both man and info files, though the advantage is much smaller for man pages. My preference is to standardize on bzip2, but it is open for discussion.
-
Just submitted an updated man.tcz which now supports .gz, .bz2 and .xz compressed man pages. xz is provided by base, so xz compression needs no additional dependency, not even the more advanced xz.tcz extension, and it gives a smaller result. So I propose using xz.
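With three suffixes now supported, the Slackware-style routine from earlier in the thread can be generalized. The following is a sketch under assumptions, not the script from programming and scripting: compress_man_dir is a hypothetical name, it works on one section directory, and it expects hard links to have been dealt with beforehand.

```shell
#!/bin/sh
# Hypothetical helper: compress the man pages in one section directory
# with gzip, bzip2 or xz (default xz) and re-point symlinks at the
# compressed names.
compress_man_dir() {
    dir=$1
    tool=${2:-xz}
    case $tool in
        gzip)  ext=gz ;;
        bzip2) ext=bz2 ;;
        xz)    ext=xz ;;
        *) echo "unsupported tool: $tool" >&2; return 1 ;;
    esac
    ( cd "$dir" || exit 1
      for page in *.[0-9n]*; do
          # Skip pages that are already compressed
          case $page in *.gz|*.bz2|*.xz) continue ;; esac
          if [ -L "$page" ]; then
              # The target is about to gain a suffix, so the link must too
              ln -s -- "$(readlink "$page").$ext" "$page.$ext"
              rm -- "$page"
          elif [ -f "$page" ]; then
              "$tool" -9 -- "$page"
          fi
      done )
}
```

For example, `compress_man_dir usr/local/man/man1 xz` would leave foo.1.xz in place of foo.1, and a symlink bar.1 -> foo.1 would become bar.1.xz -> foo.1.xz.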
-
posted, thanks bela! I'll go with xz for man pages too (and already have)... makewhatis and apropos didn't work in man-1.6f, would be worth figuring out if they do now :)
-
posted, thanks bela! I'll go with xz for man pages too (and already have)... makewhatis and apropos didn't work in man-1.6f, would be worth figuring out if they do now :)
Well, let's collect some feedback and I will check the other uses. I only use man pages, not apropos, so to be honest I did not test it :)
-
I think man pages were once a very powerful tool. It's definitely the way to document a programming environment with all its tools and libraries.
But now that modern applications' man pages are often longer than a full-blown book, I seldom find meaningful documentation inside them.
Because of this and various "standards", i.e. gzipped man pages, html files and other undocumented formats, I have stopped caring and now use today's equivalent of grep to search through dozens of exabytes of data first, and then get all the information I need, but more up-to-date, shorter and easier to read.
Also, now that nobody knows how to read all the different types of manpages anymore, developers have started to simply include all the documentation inside their executables. They now have options called --usage, --full-usage, --short-help, or --long-help. Try to compress that away...
-
developers have started to simply include all the documentation inside their executables. They now have options called --usage, --full-usage, --short-help, or --long-help. Try to compress that away...
What about UPX?
-
Also, now that nobody knows how to read all the different types of manpages anymore, developers have started to simply include all the documentation inside their executables. They now have options called --usage, --full-usage, --short-help, or --long-help. Try to compress that away...
IMHO that is not the case. There is nothing wrong with man pages regarding formats; they are still readable on the 'big' distros I use. What is happening is that man pages are losing importance due to available storage capacity, bandwidth and alternative formats like PDF or HTML. Nowadays it is much easier, more common and more practical to create a PDF from any system on any platform, while creating man pages requires extra work.
However, man pages are still alive and provided by many packages. Whether they are up to date is a good question. They are especially useful for the classic Linux tools.
-
Look at the busybox manual as an example for what I said.
What you say is true of course as I've also been using a lot of man pages on debian. Most tools still have them, although some are getting too long or outdated. And I don't want a different reader for every program. Even a web search is easier than that.
But creating man pages is not by any means difficult. Why should it be any extra work?
-
But creating man pages is not by any means difficult. Why should it be any extra work?
I'm writing docs in Word and publishing them as PDF. Do you have a 'man page printer'?
-
I don't know what you mean. nroff? troff?
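To give an idea of how little source a basic man page needs, here is a minimal sketch that writes one from a shell here-document. The program "hello", its version and the date are invented for illustration; the macros (.TH, .SH, .B, .RI, .IR) are the standard man macro set.

```shell
#!/bin/sh
# Hypothetical example: write a tiny man page for an imaginary "hello"
# command to the given path.
write_demo_manpage() {
    cat > "$1" <<'EOF'
.TH HELLO 1 "2010-11-04" "hello 1.0" "User Commands"
.SH NAME
hello \- print a friendly greeting
.SH SYNOPSIS
.B hello
.RI [ name ]
.SH DESCRIPTION
Prints a greeting for
.IR name ,
or for the world if no name is given.
EOF
}
```

After `write_demo_manpage hello.1`, the page can be previewed with `nroff -man hello.1 | less` where nroff is installed.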