Tiny Core Linux

Tiny Core Base => TCB Talk => Topic started by: mocore on October 21, 2022, 04:26:40 PM

Title: tce/app-browser , sparing of storage or network
Post by: mocore on October 21, 2022, 04:26:40 PM

thinking about repository and related tools 

Using app browser or tce to browse the extension info requests each .info/etc. file viewed separately,

and also, AFAIR, makes a separate request for each dependent tcz.

I have in the past wondered...

0) Could the time/bandwidth be reduced
if the repo metadata (as mentioned in the quote below) and the individual extension metadata files could be
downloaded once and then used/viewed locally, as opposed to being requested repeatedly?

I guess this makes less difference to those with a local mirror or who know which packages are required.

For the case of browsing the apps and searching/exploring possibilities before testing (whether the build works at all, has the required deps, etc.),
waiting for many requests to complete takes time. At this point storage and bandwidth are abundant for the most part; time, however, is constantly constrained.

1) When downloading extensions & dependencies, could the initial connection be reused??!!


Which raised the question of *exactly* what version of HTTP is supported by tce/app-browser / tce-fetch.sh / wget and/or by the server,
as this is sort of an implicit dependency of TC operation,

one IMHO worth some consideration,
and which seems a fair question considering the TC philosophies:
 http://tinycorelinux.net/concepts.html - 'Easy, fast, and simple renew-ability and stability is a principle goal of Tiny Core.'
This appears to fall under the speed consideration.

I wonder if anyone else has given this any thought (I don't remember finding any post with a similar gist in the past),

although I expect (statistically) nothing has explicitly changed since 1.x (apart from perhaps the server, to nginx https://forum.tinycorelinux.net/index.php/topic,20499.0.html ),
so it is statistically unlikely to do so in the future  :P


@mocore: I'm not certain what purpose you're after, but we already have a number of text-based files in the repo itself which may suit your purpose.
Bear in mind they're a part of the repository as opposed to the forum.

tcz/info.lst is basically a file listing of TCZs
tcz/sizelist is the above plus file lengths
tcz/tags.db is a g-zipped version of info plus tag words
tcz/provides.db is similar to an INI file with file listings for each extension
tcz/md5.db is, as you guessed it, the signatures for each file
* most of the items above are available in g-zip form, some are gz only.




Title: Re: tce/app-browser , sparing of storage or network
Post by: mocore on December 08, 2022, 09:47:15 AM

Some similar/relevant observations
wrt network connections etc. apparently inspired an alternative backend for the core package manager
(at least that's the gist I get from scanning the thread):

http://forum.tinycorelinux.net/index.php/topic,12458.0.html - "corepkg - a new core package and updates manager"
to match all current md5 files against the ones in the repository, which generates a LOT of small web server requests to the repository and can take a long time, depending on how many extensions you have,
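
A minimal sketch of the single-download alternative, assuming the repo's md5.db.gz holds one md5sum-style line per extension (hash, whitespace, file name). The function name `outdated` and its arguments are illustrative only, not part of any Tiny Core tool.

```shell
#!/bin/sh
# outdated TCZ_DIR MD5DB_GZ
# Print the local files in TCZ_DIR whose md5 differs from the downloaded database.
# MD5DB_GZ must be an absolute path, because the function changes directory.
outdated() {
    (
        cd "$1" || exit 1
        # Assumed line format: "<md5>  <name>.tcz", as written by md5sum.
        gzip -dc "$2" | while read -r sum name; do
            [ -f "$name" ] || continue        # extension not present locally
            [ "$(md5sum "$name" | cut -d' ' -f1)" = "$sum" ] || echo "$name"
        done
    )
}
```

Something like `outdated /etc/sysconfig/tcedir/optional /tmp/md5.db.gz` would then need one request for the database instead of one request per extension.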
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 27, 2023, 06:41:22 AM
@mocore: I'm inclined to think like you.

Change the main scripts a little, e.g. tce-fetch.sh, for (tags.db.gz + provides.db.gz + sizelist.gz + md5.db.gz), so that they first check whether the files are ALREADY in /tmp and, if so, do not download them again  :)
 (Possibly also check their date; hm.. I am not sure how these files could arrive in /tmp unauthorized, so maybe it is not necessary to touch/check their date.)
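
The check could look roughly like this. It is only a sketch: `fetch_once`, `CACHE_DIR`, `MIRROR` and `FETCH` are hypothetical names, the mirror URL is just an example, and the real tce-fetch.sh is not modified here.

```shell
#!/bin/sh
# Hypothetical settings; override them from the environment as needed.
CACHE_DIR="${CACHE_DIR:-/tmp/tcz-meta}"
MIRROR="${MIRROR:-https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz}"
FETCH="${FETCH:-wget -q -O}"      # called as: $FETCH <destination> <url>

# fetch_once FILE: download FILE from the mirror unless a cached copy exists.
fetch_once() {
    mkdir -p "$CACHE_DIR" || return 1
    # -s: exists and is not empty, so a failed earlier download is retried.
    if [ -s "$CACHE_DIR/$1" ]; then
        return 0
    fi
    $FETCH "$CACHE_DIR/$1" "$MIRROR/$1" || { rm -f "$CACHE_DIR/$1"; return 1; }
}
```

A caller would loop over the four files, e.g. `for f in tags.db.gz provides.db.gz sizelist.gz md5.db.gz; do fetch_once "$f"; done`. Only the first run touches the network; since /tmp is in RAM, the cache disappears on reboot.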

[rant] Time is vital in our life, because it is finite on this planet! Storage space is not. So speed (the time involved) is the focus for me. Even [hated] M$soft at one point broke DOS compatibility, going from win 9.x to winNT, in order to advance.

I am not suggesting we break tcz compatibility regarding running locally on the host machine/PC/laptop; however, some tcz do not fit with the core 486 CPU compatibility anyway (see firefox, which needs a CPU with SSE instructions, etc).
But TC started in 2009 and we are now in 2023, so servers (or their MIRRORS), storage and protocols have advanced in the meantime. The assumptions today are not what they were 14 years ago. [/rant]
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 27, 2023, 08:29:39 AM
Looking in /usr/bin/tce-ab, it downloads the specific *.tcz.tree or *.tcz.dep for each extension on demand; but on the server there is no file containing all the *.tcz.tree or *.tcz.dep files concatenated.

So, when the user looks around in tce or its app GUI at, let's say, 100 individual programs (because the user is just evaluating the potential of the TC software repository), the user will connect to and disconnect from the server 100 times (with wget).

I re-discovered (LOL, some will say: the cold water, :P ) a small program in TC14 also:
Code: [Select]
Title:          tdb.tcz
Description:    trivial database
Version:        1.2.12
Size: 48KB
Extension_by:   juanito
                ----------
Change-log:     first version
Current:        2013/11/21

I wonder (and will play with it) whether it is faster / more useful than the classic tce tools (awk, sed, grep) to process a local database created from the downloaded server files (tags.db.gz + provides.db.gz + sizelist.gz + md5.db.gz). Anybody can share their experience of tcz alternative-management?

Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on February 27, 2023, 09:52:03 AM
Anybody can share their experience of tcz alternative-management?
My tcz alternative for large applications that I only use rarely (e.g., libreoffice, gimp, qbittorrent, electrum) is AppImage. AppImages work perfectly, load quickly, and add nothing to boot time.

Two caveats:
1. Most AppImages these days are 64-bit only, so 64-bit TCL is needed
2. A few minor tweaks are needed to allow AppImages to work on TCL (see here (http://forum.tinycorelinux.net/index.php/topic,26113.msg167596.html#msg167596))
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 27, 2023, 10:37:13 AM
@GNUser: Thanks, good tip. But now you must trust TC and AppImage.
But.. you do not use TC applications. I was asking for something like scripts, to search TCZ, for example:
- show me all apps maintained by GNUuser, when I run from aterm
- list all tcz which depend on fltk-1.3.tcz, when nothing is installed (just kernel + busy-box).
PS:
- for big apps, like ex: ffmpeg (used by firefox), I have a big ffmpeg-all.tcz (containing all its dependencies, and it will populate also /usr/local/tc-local/ with null dep loaded, etc)

- for new versions of (lets say) VLC, I chroot and un-squash an Alpinelinux-VLC.sqfs, but usually I have also a partition with full Alpinelinux installed in its root.
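
For the second kind of query (which tczs depend on fltk-1.3.tcz), a rough sketch is possible without anything installed, assuming a local directory that already holds a copy of the repo's *.tcz.dep files, one dependency per line. The function name `rdeps` and the directory are hypothetical, and only direct dependents are listed.

```shell
#!/bin/sh
# rdeps EXT.tcz [DEP_DIR]
# List the extensions whose .dep file names EXT.tcz directly.
rdeps() {
    awk -v want="$1" '
        { sub(/[ \t\r]+$/, "") }        # tolerate trailing whitespace
        $0 == want { print FILENAME }   # whole-line match only
    ' "${2:-.}"/*.tcz.dep |
        sed 's|.*/||; s|\.dep$||' |     # strip the directory and the .dep suffix
        sort -u
}
```

Example: `rdeps fltk-1.3.tcz /tmp/depfiles` prints one dependent tcz per line.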
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on February 27, 2023, 11:09:47 AM
@GNUser: Thanks, good tip. But now you must trust TC and AppImage.
But.. you do not use TC applications. I was asking for something like scripts, to search TCZ, for example...
Hi nick65go. A few thoughts about this.

If you download AppImage directly from the developer, there is actually less trust needed on your part. If you were to use electrum.tcz (it is not available in the repo, but let's pretend) you would have to trust the electrum developers and the TCL contributor who put together the extension. This is a sensitive application because it manages one's bitcoin wallet. Downloading AppImage directly from developer eliminates the need to trust a middle man creating the TCL extension.

I actually do use TC extensions for most applications. I'm comfortable trusting the extension contributors in vast majority of cases.

I'm not sure how to search for extensions by maintainer (short of having your own TCL repo and grepping through the .info files). If you discover a way to do that, please share.

Your approach for big apps is ingenious but sounds labor-intensive. I'm too lazy to do something like create ffmpeg-all.tcz :) (BTW, there is a statically-linked--i.e., no dependencies--version of ffmpeg. It's available here (https://johnvansickle.com/ffmpeg/).)
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 27, 2023, 12:10:25 PM
@GNUser: Each with their paranoia security level :) mine is big I think.

1. G: "grepping through the .info files)"
-  that is why I propose to the server admins small / insignificant sized files like tcz.info.db.gz, tcz.dep.db.gz, tcz.tree.db.gz etc, to allow users to build/query their own local database with META data for tcz.

2. G: "Your approach for big apps is ingenious but sounds labor-intensive."
- you only do it one time in life (OK, maybe few times), and only for few (let's say 10) tczs, like Xorg-3D, gtk3, ffmpeg. ex:
Code: [Select]
for i in $your_list; do unsquashfs -f -d /tmp/x "$i"; done

This unpacks all the tczs into the same /tmp/x folder. Your list can be read from cpanel /stat: take all the loops that appear AFTER the mandatory Xorg loops are already loaded. Do not forget to populate /usr/local/tclocal with a fake null dep and to have a start script named like your extension (ex: gnumeric needs this script for gtk-icons/menu/schema init).

3. G:"there is a statically-linked--i.e., no dependencies--version of ffmpeg"
- No: all the built-in libs are already loaded (or will be loaded later) by Xorg or other apps, so it is a waste of RAM.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on February 27, 2023, 12:26:35 PM
n: "I propose to the server admins small / insignificant sized files like tcz.info.db.gz..."
That's an excellent idea. It sounds like curaga is receptive to these files existing on the server. If you have the time and inclination, please build the search functionality into tce (CLI extension explorer). I think it would be a significant contribution to the distro.

n: "foo-all.tcz"
It is a very interesting idea. I do care about boot time, but not that much. I may give your idea a shot sometime.

n: "waste of RAM"
Good point. That's one of the drawbacks of all portable packaging strategies, whether AppImage or statically-linked binary or whatever else. Convenience always comes at a price.
Title: Re: tce/app-browser , sparing of storage or network
Post by: curaga on February 28, 2023, 02:17:45 AM
Info and tree combined files would have less of use, IMHO.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 28, 2023, 05:26:37 AM
Info and tree combined files would have less of use, IMHO.
Well, in the absence of a good html page (as there was with tc9), the concatenation of the *.tcz.info files into something like tcz.info.db.gz would allow searching by a few criteria/fields, like Version:, Current:, Extension_by:, Comments:;
Right, it is not for functionality in TC, but skimming on version, meta data.
I think the size will be small, but could you give a size for a tc13_x86 sum(*.tcz.info.) and its gzip size? just to see its impact for server. Thanks.
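
To make the idea concrete, here is a sketch of how such a concatenated file could be built and queried, assuming a local directory of *.tcz.info files. The `=== name` separator line and both function names are my own invention; no such file format exists on the server.

```shell
#!/bin/sh
# build_info_db INFO_DIR OUT_GZ
# Concatenate all .info files, each preceded by a "=== <file name>" marker line.
build_info_db() {
    for f in "$1"/*.tcz.info; do
        printf '=== %s\n' "${f##*/}"
        cat "$f"
    done | gzip -9 > "$2"
}

# search_info_db DB_GZ FIELD TEXT
# Print the .info names whose FIELD line contains TEXT (case-insensitive).
search_info_db() {
    gzip -dc "$1" | awk -v field="$2" -v want="$3" '
        /^=== / { name = $2; next }
        index($0, field ":") == 1 {
            value = tolower(substr($0, length(field) + 2))
            if (index(value, tolower(want)) > 0) print name
        }
    '
}
```

For example, `search_info_db /tmp/tcz.info.db.gz Extension_by juanito` would list every .info file whose Extension_by line mentions juanito.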
Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on February 28, 2023, 09:14:46 AM
Hi nick65go
... I think the size will be small, but could you give a size for a tc13_x86 sum(*.tcz.info.) ...
The sum of the apparent sizes is 3801251 bytes.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on February 28, 2023, 01:13:34 PM
Rich: "The sum of the apparent sizes is 3801251 bytes." sum(*.tcz.info.) ...
If these 3.8 MB are not already gzip-ed, they will be about 380 KB gzip-ed, as statistically about 10 % remains after compression.
And... the app GUI, or tce-ab, needs an improvement to use it to search by field criteria.
So we can happily spy on the [GN :) ]user nugget contributions, which we/some of us use. Frankly, I first looked for/at curaga's (and the other TC founders') small / nice gold nuggets before I jumped to the fat equivalents.
Title: Re: tce/app-browser , sparing of storage or network
Post by: curaga on March 01, 2023, 02:24:54 AM
That's the thing, I expect searching by submitter to be a very rare use case.
Title: Re: tce/app-browser , sparing of storage or network
Post by: mocore on March 01, 2023, 09:41:30 AM
tree combined files would have less of use,

this makes me wonder ..
...how are the .tree files created? (is/could the script be published?)

I think I've read they are created by combining the .dep files? *somehow*

given that dep.db.gz now exists,
I guess with a/the mk_tree script it would be possible to (dynamically) create .tree from dep.db ??...


I think tcz.info.db.gz or repo-info.tcz (all .info) or even! repo-meta.tcz collecting all the .gz files (could be easily scripted)
 would be nice/convenient, to have all the repo metadata (& perhaps the metadata creation scripts!!) accessible to the TC system / user.
Title: Re: tce/app-browser , sparing of storage or network
Post by: curaga on March 01, 2023, 10:29:17 AM
The tree files are simply the deps recursed and KERNEL replaced. It used to be an awk script that created them, and then in 2016 I made a faster C++ version. There's not exactly anything secret in there, just not very high priority to share that.

Yes, tree files could be created locally now. For faster CPUs it may make sense, for slower ones it could be slower.
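
As an illustration of "deps recursed and KERNEL replaced", a local version could be sketched as below. This is not the awk or C++ generator mentioned above; it assumes a hypothetical directory `DEP_DIR` with one NAME.tcz.dep file per extension, and the three-space indent per level is likewise an assumption.

```shell
#!/bin/sh
# Hypothetical location of the local .dep files, and the kernel version to substitute.
DEP_DIR="${DEP_DIR:-/tmp/depfiles}"
KERNEL_VER="${KERNEL_VER:-$(uname -r)}"

# mk_tree NAME.tcz [INDENT]
# Print NAME.tcz, then each of its dependencies recursively, one indent level deeper.
mk_tree() {
    printf '%s%s\n' "${2-}" "$1"
    # No .dep file means no dependencies, which ends the recursion.
    [ -f "$DEP_DIR/$1.dep" ] || return 0
    # Guard against circular dependencies: stop at 20 levels (60 spaces of indent).
    [ "${#2}" -lt 60 ] || return 0
    while read -r dep || [ -n "$dep" ]; do
        if [ -n "$dep" ]; then
            # Replace the KERNEL placeholder before looking up the next .dep file.
            dep=$(printf '%s\n' "$dep" | sed "s/KERNEL/$KERNEL_VER/")
            mk_tree "$dep" "${2-}   "
        fi
    done < "$DEP_DIR/$1.dep"
    return 0
}
```

Running `mk_tree firefox.tcz` would walk the whole dependency chain locally, at the cost of one sed call per dependency, which is the kind of overhead that may hurt on slower CPUs.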
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 01, 2023, 12:19:58 PM
In the past I downloaded ALL the .info and all the .dep files (of course not from the main repo, but from mirrors) and gzip-ed them, because they are small and waste space on disk (4K clusters, files of a few bytes).
With a little patience we get all we need. But why force each interested user to repeat this task?
Anyway, the .dep files will not change very often. But the .info files do, because of the Version: field.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 01, 2023, 01:02:21 PM
Hi nick65go. I think it would be interesting to search extensions by maintainer. I have a private TCL mirror and also operate my own http server at home, so I will come up with an unofficial solution. Stay tuned.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 01, 2023, 01:09:51 PM
Rich, you are the man! 
This is not a competition, but.. in 2023 for TC14 (as an example), it will show, for nosy people like me, who the tcz maintainer is, how many tczs they created and how often, etc. So we could see how tc evolves (from tc-5 to tc-10 to tc-14 etc).
Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on March 01, 2023, 01:38:21 PM
Hi nick65go
Hi nick65go
... I think the size will be small, but could you give a size for a tc13_x86 sum(*.tcz.info.) ...
The sum of the apparent sizes is 3801251 bytes.
Correction:
The sum of the apparent sizes for x86 is    1698939 bytes.
The sum of the apparent sizes for x86_64 is 2102312 bytes.
                                      Total 3801251 bytes.

Title: Re: tce/app-browser , sparing of storage or network
Post by: Greg Erskine on March 01, 2023, 04:15:19 PM
Can I place a seed of thought here for a future requirement for extensions?

I would like a mechanism to record which repository an extension was downloaded from. The existing scripts assume that the different repositories are always mirror copies, so there is not a problem.

If the repositories are not actual mirrors, you need to know which repository to update from.

Maybe just an extra field in the info file?

Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on March 01, 2023, 09:47:44 PM
Hi Greg Erskine
I'm not sure I follow what you are looking for. Repositories should always be
mirrors, unless maybe you are running private repository.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 01, 2023, 10:49:36 PM
we could see how tc evolves (from tc-5 to tc-10 to tc-14 etc).
Hi nick65go. I only sync my local mirror to the TCL version I'm currently using (TCL14 x86_64 right now). So unfortunately you will not be able to track evolution of TCL contributors over time with my unofficial solution.

Another caveat is that my wife says I have a computer problem and made me promise to skip TCL releases. So I'm stuck on TCL14 until TCL16 comes out. It's either that or the doghouse ;D

Anyway, I came up with an unofficial solution for you, the attached contributor.sh script. Anybody is welcome to use it. Please give it a shot and let me know if you find any bugs.

The script can now be downloaded from the link provided here:
https://forum.tinycorelinux.net/index.php/topic,25982.msg167922.html#msg167922

    [EDIT]: Attachment removed. Source for download link added.  Rich
Title: Re: tce/app-browser , sparing of storage or network
Post by: Greg Erskine on March 02, 2023, 03:30:59 AM
Hi Greg Erskine
I'm not sure I follow what you are looking for. Repositories should always be
mirrors, unless maybe you are running private repository.

Yes I am referring to private repositories.

For example, piCorePlayer has its own repository, containing common extensions and piCorePlayer specific extensions. But a user may load a piCore extension, ending up with a mixture of extensions from different repositories and no way to tell where they came from.
Title: Re: tce/app-browser , sparing of storage or network
Post by: patrikg on March 02, 2023, 03:55:15 AM
Maybe it's time for some cert checking with curl or wget, to ensure the package comes from the correct repo.
And warn the user when installing some package from outside.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 09:35:05 AM
Another caveat is that my wife says I have a computer problem and made me promise to skip TCL releases. So I'm stuck on TCL14 until TCL16 comes out. It's either that or the doghouse ;D
Fortunately we do not have a doghouse, but my better half still has few methods to convince (blackmail) me that computer "work" is not a priority. Some wise person said: "happy wife, happy life"  :)
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 09:39:51 AM
Wisest saying ever ;)
Is the contributor.sh script to your liking?
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 09:43:32 AM
@GNUser, please provide here the full path to the contributor.sh script. I was busy (wasting time) with daily job. I did not convince her [yet] to let me retire early.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 09:57:14 AM
Oh, I see now. The "problem" was that I mostly read the forum off-line, just digesting info.
So the post/reply number is not shown if I am not logged in.

The other "issue" is that it is a SHAME that good scripts are ONLY visible in the forum; and IF someone cannot log in (captcha bugs, you know), the attachment cannot be seen. If the programmer has a "site", then a link in the forum to his/her GitHub is maybe permitted by the "forum rules".

EDIT: from employer win10 (even in my spare time) the stupid filters /blocker said:
Code: [Select]
Sorry, you don't have permission to visit this site.
Website blocked as per Company policy.
Not allowed to access this file type
You tried to visit:http://forum.tinycorelinux.net/index.php?action=dlattach;topic=25982.0;attach=6373.


EDIT2: even stupid Google will not let me attach a script to MY email (as a draft), saying retry / help (dead end).
I 7zip-ed the script with a password of 25 characters, and if Google's A.I. cannot break it then it forbids the A.7z attachment. Oh boy.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 10:10:47 AM
Hi nick65go. Between the official depends-on.sh and the unofficial contributor.sh hopefully all your metadata dreams have come true ;D
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 10:12:46 AM
Hi nick65go. The script is attached to Reply #22.

Edit:
Here is a direct link to the script for folks who cannot login to download it from Reply #22. It's better to use the link below anyway, as it's easier for me to make any necessary changes to it:
https://gnuser.ddns.net/public/contributor.sh (https://gnuser.ddns.net/public/contributor.sh)
Code: [Select]
Sorry, you don't have permission to visit this site.
Website blocked as per Company policy.
Not allowed to browse Dynamic DNS Host category
You tried to visit:https://gnuser.ddns.net/public/contributor.sh
If you believe you received this message in error, please click here to request a review of this site.
hm.. the internet Zscaler in action. I need to wait until I get on my linux laptop.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 10:56:09 AM
That's your workplace's firewall interfering. I don't have a static public IP address so am stuck with DDNS.
When you are at home (or using workplace's visitor wifi) you should be able to reach my server.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 12:38:52 PM
Code: [Select]
tc@box:~/Downloads$ ./contributor.sh

  Find which extensions are maintained by a particular user.
  Usage example:
         contributor.sh juanito
 
   List extensions that contain juanito (case-insensitive) in Extension_by field
   To search everywhere in the .info file, use -e flag.


Code: [Select]
tc@box:~/Downloads$ ./contributor.sh GNU
tar: invalid magic
tar: short read
./contributor.sh: cd: line 59: can't cd to /etc/sysconfig/tcedir/infofiles: No such file or directory
cat: can't open 'disclaimer.txt': No such file or directory
tc@box:~/Downloads$

Code: [Select]
tc@box:~/Downloads$ ./contributor.sh GNUuser
tar: invalid magic
tar: short read
./contributor.sh: cd: line 59: can't cd to /etc/sysconfig/tcedir/infofiles: No such file or directory
cat: can't open 'disclaimer.txt': No such file or directory
tc@box:~/Downloads$


I miss the database or infos.
EDIT: I also downloaded the tbz; now it "works". Read ./contributor.sh with less to see which file it needs.
Code: [Select]
tc@box:~/Downloads$ ./contributor.sh GNUuser
Note: These results are for TCL14 x86_64 only. My mirror was last synced with upstream on 03/01/23. -GNUser
tc@box:~/Downloads$ ./contributor.sh Rich
AutoCursor.tcz.info
HideMouse.tcz.info
PicFormat.tcz.info
..
Note: These results are for TCL14 x86_64 only. My mirror was last synced with upstream on 03/01/23. -GNUser
tc@box:~/Downloads$
Thank you! Could you also keep these tbz for few old versions, or else I need to "solicit" the servers for all .info, in other versions.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 12:45:29 PM
Hi nick65go. The wget step on line 52 (which gets the database) is failing, so everything after that is also failing.

How did you get the script? I'm guessing from forum link. It seems you're still behind a firewall that's not allowing access to my server.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 12:49:45 PM
No firewall, I am on Linux. I truncated the URL, went to your website
https://gnuser.ddns.net/public/
and manually copied the two files with Firefox.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 12:53:34 PM
What happens if you run this command in a terminal?
Code: [Select]
wget -q -O /etc/sysconfig/tcedir/infofiles.tbz https://gnuser.ddns.net/public/infofiles.tbz
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 12:54:28 PM
I edited my previous reply #33; now it works. THANK YOU.
Now I can choose a contributor and I can see all his/her tcz extensions.
How about a list of all contributors with the number of tcz for each?
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 12:59:44 PM
I'm glad you sorted it out.

For the benefit of other potential users it would be nice to know what the problem was and how you fixed it. If user has to manually download the database then the script is not working as intended.

Can you please run the wget command in a terminal and share the output?

P.S. Yes, no problem--when I upgrade to a new TCL version I will rename old database file (e.g., to infofiles-tcl14.tbz) and will keep it on the server.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 01:14:05 PM
The pseudo-problem was my approach:
You provided a URL link. I was in the forum with Firefox, so I downloaded the one file from your post, the sh script. I tried to run it and I got the errors. Then I read its content with cat / less and I saw what was missing. I downloaded the missing file and moved it into my $TCE. Job done. It works.

PS: the listed command does not work
Code: [Select]
tc@box:~$ wget -q -O /etc/sysconfig/tcedir/infofiles.tbz https://gnuser.ddns.net/public/infofiles.tbz
wget: error getting response: Connection reset by peer

tc@box:~$ wget https://gnuser.ddns.net/public/infofiles.tbz
Connecting to gnuser.ddns.net (73.198.149.97:443)
wget: error getting response: Connection reset by peer
tc@box:~$
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 01:17:53 PM
That command works just fine for me. There is a networking problem somewhere between you and my server. Oh, well. I'm glad you got it working. Enjoy!
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 01:35:22 PM
Now I can choose a contributor and I can see all his/her tcz extensions.
How about a list of all contributors with the number of tcz for each? [I know that I am a little picky]. No rush.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 01:40:00 PM
That command works just fine for me. There is a networking problem somewhere between you and my server. Oh, well. I'm glad you got it working. Enjoy!
SOLUTION: should not check for certificates
Code: [Select]
tc@box:~$ wget --spider --no-check-certificate https://gnuser.ddns.net/public/infofiles.tbz
Connecting to gnuser.ddns.net (73.198.149.97:443)
remote file exists
tc@box:~$ wget --no-check-certificate https://gnuser.ddns.net/public/infofiles.tbz
Connecting to gnuser.ddns.net (73.198.149.97:443)
saving to 'infofiles.tbz'
infofiles.tbz        100% |*******************   ***|  304k  0:00:00 ETA
'infofiles.tbz' saved
tc@box:~$

Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 01:56:22 PM
Hi nick65go. Can you please try this version? (If the problem has to do with SSL certificates, this should work.)
Code: [Select]
$ wget -q -O /etc/sysconfig/tcedir/infofiles.tbz http://gnuser.ddns.net/public/infofiles.tbz

To get number of extensions for a particular contributor, just do
Code: [Select]
$ contributor.sh juanito | wc -l
You need to subtract 1 from the number because one of the lines is my little disclaimer. Listing all contributors with number of extensions for each should not be too difficult. Feel free to send me a patch, or I'll tinker with it tomorrow when I have more time.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 02:11:30 PM
Yes, with HTTP (not HTTPS) it does not require the certs option AND it will overwrite any existing file at the destination.
Code: [Select]
tc@box:~$ wget -O /etc/sysconfig/tcedir/infofiles.tbz http://gnuser.ddns.net/public/infofiles.tbz
Connecting to gnuser.ddns.net (73.198.149.97:80)
saving to '/etc/sysconfig/tcedir/infofiles.tbz'
infofiles.tbz        100% |******** ******|  304k  0:00:00 ETA
'/etc/sysconfig/tcedir/infofiles.tbz' saved
tc@box:~$

G: "or I'll tinker with it tomorrow when I have more time."
me: Better  :)
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 02:19:41 PM
Thank you very much for that, nick65go. I think I understand what's going on here.

Web browsers come bundled with CA certificates. CLI utilities such as wget don't--they rely on the certificates being present on your system. So if you don't have ca-certificates.tcz loaded, https will only work in your browser.
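
A script could test for this before choosing a URL scheme. The sketch below assumes that loaded extensions leave an entry under /usr/local/tce.installed; the helper names `ext_loaded` and `pick_scheme` are made up for illustration.

```shell
#!/bin/sh
# Assumed location where loaded extensions are recorded.
TCE_INSTALLED="${TCE_INSTALLED:-/usr/local/tce.installed}"

# ext_loaded NAME: succeed if extension NAME appears to be loaded.
ext_loaded() {
    [ -e "$TCE_INSTALLED/$1" ]
}

# pick_scheme: print "https" when ca-certificates appears loaded, else "http".
pick_scheme() {
    if ext_loaded ca-certificates; then
        echo https
    else
        echo http
    fi
}
```

A download line would then read something like `wget -q -O "$dest" "$(pick_scheme)://gnuser.ddns.net/public/infofiles.tbz"`. Loading the extension first with `tce-load -wi ca-certificates` is the other option.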
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 02:28:09 PM
I changed the script to use http. This link will always point to the most recent version of the script:
http://gnuser.ddns.net/public/contributor.sh

Rich: Would you please delete the attachment to Reply #22 and change the link in Reply #28 to http (or, preferably, delete Reply #28 altogether)?
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 03:23:55 PM
Hi nick65go. As you requested, I modified the script to tally number of extensions per contributor. Not my finest work--I'm sure there's a more elegant way to do it--but it works. Please grab the latest version of the script and try:
Code: [Select]
$ contributor.sh -t
I also found a little bug: Script assumed Extension_by was present in every .info file. In fact, some .info files have Extension-by instead. Latest version of the script (Version 2.0) doesn't care either way.
Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on March 02, 2023, 03:48:54 PM
Hi GNUser
Reply #22: Attachment removed. Note directing users to reply #45 for download link added.
Reply #28: Removed.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 04:31:29 PM
Thank you, Rich. Looks great!
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 05:55:06 PM
in the mean time, my naive script shows what you found: misspelling; however I found even different \s, or \t instead of \s etc:
Code: [Select]
#!/bin/sh
# Tally how many .info files carry each Extension_by value.
LIST="/tmp/authors.lst"
rm -f "$LIST"

cd /home/tc/Downloads/infofiles/ || exit 1
# Collect the Extension_by line of every .info file that has one.
for i in *; do grep "Extension_by:" "$i" >> "$LIST"; done
echo "From total infos " `ls -1 . | wc -l` ", I found number of authors: `wc -l < "$LIST"` "

# Count identical lines; whitespace and spelling variants stay separate rows.
sort "$LIST" | uniq -c > /tmp/authors2.lst
echo "total rows: " `wc -l < /tmp/authors2.lst`
The results are:
Code: [Select]
tc@box:~$ ./A.sh
From total infos  2840 , I found number of authors: 2725
total rows:  90
tc@box:~
The gap between 2840 and 2725 shows the Extension_by vs. Extension-by mismatch.

Code: [Select]
tc@box:~$ grep "gnuser" /tmp/authors2.lst
      6 Extension_by:   gnuser    # here is a TAB
     31 Extension_by:   gnuser    # here are 3 spaces
      1 Extension_by:  gnuser     # here are 2 spaces
tc@box:~$

or things like:
Code: [Select]
      1 Extension_by:       coreplayer2  # here an extra \t
      1 Extension_by:   Corplayer2           # with capital letter
      6 Extension_by:   coreplayer2
      1 Extension_by:   aus9, coreplayer2
      1 Extension_by:    aus9                # here an extra space
     13 Extension_by:   aus9 at gmx dot com

 PS: because this is about only a few rows out of at most 90 in the final result, maybe they could be corrected directly on the server, in their info files.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 02, 2023, 06:35:03 PM
Hi nick65go. As you requested, I modified the script to tally number of extensions per contributor. Not my finest work--I'm sure there's a more elegant way to do it--but it works.
Thank you, I tried it and it works.
One small possible improvement: you extract infofiles.tbz (which will be on my disk) into a folder below it. So you populate the hdd with 2841 small files. And then you read them (from the slow disk) to process them. My suggestion is to unpack them in /tmp/whatever as well, as you did with /tmp/contributors. Then everything will be processed in RAM. And maybe there is no need to display each step in the terminal ("parsing info file ### of 2839"); send them to /dev/null.
Anyway, we will not run this script every day, maybe once in a while, so basically the speed should not matter.

All in all, an excellent job, and very fast accomplished by you.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 02, 2023, 06:57:16 PM
Hi nick65go. These are all excellent suggestions, thank you. I've incorporated them into version 3.0 of the script.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 02:44:37 AM
My initial idea was to:
- harness the power of zcat and pipes (as I saw in the core-remaster script), like zcat | grep ;
- use the auto-sum power of "uniq -c" ;
- for the "grep", use a regular expression like "^Extension[-_]", i.e. lines starting with "Extension", to catch the few variants of it.
- then use sed to strip ^Extension[-_ ]by:[ \t]* up to the author name.
Code: [Select]
zcat | grep $X | sed $Y > /tmp/Z.txt | sort | uniq -c
Maybe we need an "xargs" or "tee" somewhere, I do not know.
That works until we need to parse a line with multiple authors, with awk. When I hit this, I gave up.
Now that the task is done (for tcz.info files we have  its infofiles.tbz, THAT IS THE CORE), we can afford the luxury to fool around for version 4 :)
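One way the pipeline idea above could be wired up is with tee, which keeps a copy of the intermediate list while the data continues down the pipe. A minimal sketch on a small gzip sample built on the spot (the sample lines and file names are assumptions; `gzip -dc` is used as the portable equivalent of zcat):

```shell
#!/bin/sh
# Build a tiny gzip-compressed sample so the sketch runs anywhere.
work=$(mktemp -d)
printf 'Extension_by:   gnuser\nExtension-by:   gnuser\nExtension_by:\tjuanito\nTitle:          demo.tcz\n' |
    gzip > "$work/sample.gz"

# Decompress, keep only the author lines, strip the field name,
# save the cleaned list with tee, then count each name.
gzip -dc "$work/sample.gz" |
    grep '^Extension[-_]by:' |
    sed 's/^Extension[-_]by:[[:space:]]*//' |
    tee "$work/authors.txt" |
    sort | uniq -c > "$work/tally.txt"

cat "$work/tally.txt"
```

Redirecting with `>` in the middle of a pipeline sends the data to the file and leaves nothing for the next command, which is why tee is used for the intermediate copy.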
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 04:56:40 AM
Do you want to solve a riddle: who should be credited for this, the author or the updater?
https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz/netsurf-gtk3.tcz.info
I bet on "jazzbiker", do you?
"Jason W" created it in 2010 (congratulations!) but "jazzbiker" maintains it (STILL doing the same work!) in 2021.
Title: netsurf-gtk3.tcz 
 Extension_by: Jason W
 Change-log: 2010/11/20 first version
            2019/01/17 updated 2.6 -> 3.8 (juanito)
            2020/02/16 updated 3.8 -> 3.9 (neonix)
Current:    2021/04/14 updated 3.9 -> 3.10 (jazzbiker)

Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 06:22:29 AM
@GNUser: few small proposals for YOUR script:

1. a typo: provides.db instead of infofiles, but it is only in a commented line, so no functional impact.

2. it would be nice to also show the release date, alongside the version, in the script help; no functional impact.

3. From the total (ex: 2839 info files), process only those WITHOUT *-doc.tcz [134], *-locale.tcz [58], *-dev.tcz [752], *-gir.tcz [164];

so, calc 2839-134-58-752-164 ->1731; calc 1731/2839 ->0.60972, basically process only 61% info files.
Because the extension creator sent "tcz + doc.tcz + locale.tcz", it should not be counted 3 times (like when someone uses the "submitqc" procedure for a tcz, which also creates its complementary files).
And "-dev.tcz + -gir.tcz" by juanito [1891 infos] (with calc 1891/2839 -> 66.6% of total), I hope he will not be upset. IMMV.

If my proposal is undesirable, then maybe add a parameter when calling the script.
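The subset in proposal 3 comes down to a filter on the file names. A minimal sketch, using a made-up listing (the file names are assumptions, not taken from the repo) and not the way contributor.sh necessarily does it:

```shell
#!/bin/sh
# Hypothetical listing of .info file names.
infos='abiword.tcz.info
abiword-doc.tcz.info
abiword-locale.tcz.info
gtk3.tcz.info
gtk3-dev.tcz.info
gtk3-gir.tcz.info
gnumeric.tcz.info'

# Drop the complementary packages: -doc, -locale, -dev and -gir.
subset=$(printf '%s\n' "$infos" | grep -Ev -- '-(doc|locale|dev|gir)\.tcz\.info$')

total=$(printf '%s\n' "$infos" | awk 'END { print NR }')
kept=$(printf '%s\n' "$subset" | awk 'END { print NR }')

echo "$subset"
echo "kept $kept of $total"

# The arithmetic from the post, done by the shell:
echo $((2839 - 134 - 58 - 752 - 164))   # 1731
```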
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 03, 2023, 08:02:52 AM
Hi nick65go. I implemented your three proposals into version 3.5 of the script, including new -ts flag for "tally subset". The new flag makes my ranking go up ;D

No matter how we slice and dice, Juanito will always be supreme champion. It boggles my mind how he can be so prolific--it seems superhuman. I don't think any distro has a more prolific packager. Without him, TCL would just be a great concept, not a usable distro.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 09:23:47 AM
Juanito will always be supreme champion. It boggles my mind how he can be so prolific--it seems superhuman. I don't think any distro has a more prolific packager. Without him, TCL would just be a great concept, not a usable distro.
"Render unto Caesar" -- the things that are Caesar's.  :) But in any war you need also the peons.
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 03, 2023, 09:34:25 AM
Yes. Three cheers for Juanito!
I'm okay just being a grateful peon.

Do you want to solve a riddle, who should be, the author or the updater for this?
https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz/netsurf-gtk3.tcz.info
Haha. I'm also betting on jazzbiker but I will ask Google to train my contributor.sh script on billions of example .info files. Then, when the script becomes sentient, we can ask it for the correct answer :D
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 10:52:25 AM
Now on general subject in the title of post: "tce/app-browser , sparing of storage or network ".
With the addition of dep.db.gz (thank you curaga), and infofiles.tbz (thank you GNUser), we now need just one more NEW file on the server: tree.db.gz

Because with all these SMALL files (dep.db.gz, md5.db.gz, size.gz, tags.gz, provides.db.gz, and infofiles.tbz + tree.db.gz), tce-ab can be used almost off-line to look up tcz packages, showing the full tiny-core potential.

As long as we do not need to download any new tcz package, the search can be done FULLY off-line, since we will have all the necessary meta-data. No need to contact the server to download *.tcz.tree;

ex: if I switch 5 times between searching the trees, first for abiword.tcz tree and then for gnumeric.tcz tree, and repeat this 5 times (ex: because I forgot something I saw previously), then those two trees will be downloaded from the server five times. (originally they are downloaded and, after being viewed, they are deleted).
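The repeated downloads in that example could be avoided with a small cache in front of the download step. A minimal sketch, not how tce-ab actually works: CACHE_DIR, MIRROR, fetch and cached_tree are all hypothetical names, and the mirror URL is a placeholder.

```shell
#!/bin/sh
# Sketch only: CACHE_DIR, MIRROR and both function names are hypothetical.
CACHE_DIR="${CACHE_DIR:-/tmp/tree-cache}"
MIRROR="${MIRROR:-http://repo.example.invalid/tcz}"

# fetch URL DEST: download one file. Kept separate so it can be swapped out.
fetch() {
    wget -q -O "$2" "$1"
}

# cached_tree NAME: print NAME.tree, downloading it only on the first request.
cached_tree() {
    file="$CACHE_DIR/$1.tree"
    mkdir -p "$CACHE_DIR" || return 1
    if [ ! -s "$file" ]; then
        # On a failed download, remove the partial file so it is retried later.
        fetch "$MIRROR/$1.tree" "$file" || { rm -f "$file"; return 1; }
    fi
    cat "$file"
}

# Usage (needs a reachable mirror):
#   cached_tree abiword.tcz
#   cached_tree abiword.tcz    # second call is served from CACHE_DIR
```

Switching back and forth between abiword.tcz and gnumeric.tcz would then cost two downloads in total, however many times the trees are viewed.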
Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on March 03, 2023, 05:00:38 PM
Hi nick65go
... ex: if I switch 5 times between searching the trees, first for abiword.tcz tree and then for gnumeric.tcz tree, and repeat this 5 times (ex: because I forgot something I saw previously), then those two trees will be downloaded from the server five times. (originally they are downloaded and, after being viewed, they are deleted).
If you select the  Size  option a copy of the  .tree  file is saved in  /tmp.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 03, 2023, 06:24:56 PM
Yes Rich. But the main point is that I prefer, if possible, not to go online just for searching / reading almost static info. Plus no wifi means more battery life, etc.

It is like an HTML book which has its chapters in separate HTML pages online: I prefer the full HTML documentation downloaded all at once, to read in my own time, even if / when internet is not available or is expensive in that location. Anyway, in the end I still need to download all the pieces to read the full book, sometimes going back and forth many times.

The point being that a foreign entity (even a non-malicious one) should not profile me: when I read, what I read, from which IP/location, how many times I read, how fast, etc.
If someone / an A.I. is determined, it could in the end aggregate this info about me, but why should I make this easy for them.
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 07, 2023, 08:08:15 PM
GNUser built an excellent script, contributor.sh, in record time by my measure.

Unfortunately, in win10, using Qemu without acceleration, the original script took nearly 2m 48s (168 seconds) for the "-t" option. I built my own script, which runs in about 8 seconds. All measurements were taken after the tbz had already been downloaded, to compare apples to apples. How I tested / debugged:

Code: [Select]
time zcat "$TCE"/infofiles.tbz | grep -i '^extension' > /tmp/a1.txt          # real 0m 8.21s
time cat /tmp/a1.txt | sed 's/^Extension.*:\W*//' > /tmp/a2.txt             # real 0m 0.03s
#echo "remains only: 14[,] 13[aus9 at gmx dot com] 5[/] 1[(]"
cat /tmp/a2.txt | sed 's/,.*//' | sed 's/aus9.*/aus9/' | sed 's|/||' > /tmp/a3.txt   # real 0m 0.02s
time awk '{x[$1]++} END{ for (i in x) {print x[i], i}}' /tmp/a3.txt | sort -r     # real 0m 0.06s

the final script is this:
Code: [Select]
zcat "$TCE"/infofiles.tbz | grep -i '^extension' | sed 's/^Extension.*:\W*//' | sed 's/,.*//' | sed 's/aus9.*/aus9/' | awk '{x[$1]++} END{ for (i in x) {print x[i], i}}' | sort -r
One small issue: the final counts are compared as text, not as numbers, so the sort is "alphabetical". At least I tried my hand at sed, awk and regex.
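The alphabetical ordering can be seen, and avoided with the numeric flag of sort, on a few made-up count lines (the counts and names below are assumptions for illustration):

```shell
#!/bin/sh
# Hypothetical "count name" lines, as printed by an awk tally.
counts='9 aus9
100 juanito
31 gnuser'

# Plain reverse sort compares text, so "9" beats "100".
alpha=$(printf '%s\n' "$counts" | LC_ALL=C sort -r | head -n 1)

# -n compares the leading number instead.
numeric=$(printf '%s\n' "$counts" | LC_ALL=C sort -rn | head -n 1)

echo "sort -r  puts first: $alpha"
echo "sort -rn puts first: $numeric"
```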
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 07, 2023, 10:34:37 PM
Hi nick65go. I had not encountered zcat before. Working with the .tbz archive would be a much more elegant solution than extracting the archive and working with 2000+ files. I'll try to refactor my script using zcat when I get a chance.

Your script is not working for me, unfortunately:
Code: [Select]
$ ./tally
./tally: line 3: {x[]++}: not found
BusyBox v1.36.0 (2023-01-17 09:43:30 UTC) multi-call binary.

Usage: awk [OPTIONS] [AWK_PROGRAM] [FILE]...

-v VAR=VAL Set variable
-F SEP Use SEP as field separator
-f FILE Read program from FILE

Does it require GNU awk? If so, I'll try to come up with a strategy that uses zcat and only what's available in base TCL.

Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 08, 2023, 02:24:44 AM
@GNUser: short answer, I think we can replace awk with:
Code: [Select]
cat /tmp/a3.txt | uniq -c | sort -r
The main gain was working only in RAM. The second gain came from pipes, and the third from cleaning up the strings "Extension* + non-word characters" at the front, and then the unused strings ( , / ) at the back. I used awk just to sum the unique strings, so it can be replaced.
FYI: regarding your error, I think it may be a typo: your X[ ]++ should be X[$1]++, because in awk $1 is the first field. This fills the array X[] with string indexes (not numbers), like X[aus9], X[gnuser], counts each of them, and prints the counts in the END block.
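A message like `{x[]++}: not found` is what a shell prints when an awk program reaches it wrapped in backticks instead of single quotes: the backticks trigger command substitution, `$1` expands to the script's (empty) first argument, and the shell then tries to run `{x[]++}` as a command. That is a likely cause here, though not a confirmed one. A minimal sketch of the tally with single quotes and one consistent array name, on made-up names (the input is an assumption, not repo data):

```shell
#!/bin/sh
# Hypothetical sample input: one author name per line.
authors='juanito
gnuser
juanito
aus9
juanito
gnuser'

# Single quotes keep $1 away from the shell; awk reads it as "first field".
# awk variable names are case-sensitive, so the same name is used throughout.
tally=$(printf '%s\n' "$authors" |
    awk '{ count[$1]++ } END { for (name in count) print count[name], name }')

printf '%s\n' "$tally"
```

The order of the output lines is not guaranteed, since awk does not define an iteration order for `for (name in count)`.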
Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 08, 2023, 09:54:07 AM
Hi nick65go. Your zcat and uniq -c ideas translate into a 91.5% improvement in contributor.sh -t speed on my machine: 0.58 seconds now vs. 6.68 seconds before. This is the elegant solution I was looking for but couldn't find. Well done.

I revised the script to include your ideas and uploaded it to the usual location.

I'd like to achieve similar improvement in speed with contributor.sh -ts but it would be much more tricky. See what you can come up with :)



Title: Re: tce/app-browser , sparing of storage or network
Post by: GNUser on March 08, 2023, 10:16:08 AM
I'd like to achieve similar improvement in speed with contributor.sh -ts but it would be much more tricky. See what you can come up with :)
If we recreate the .tbz file with only the .info files we're interested in, then it's simple. I implemented this. Version bump to 5.1. Many thanks for your ideas. The gain in efficiency is dramatic.
Title: Re: tce/app-browser , sparing of storage or network
Post by: Rich on March 08, 2023, 10:36:15 AM
Hi nick65go
@GNUser: short answer, I think we can replace awk with:
Code: [Select]
cat /tmp/a3.txt | uniq -c | sort -r ...

You may want to take the advice from the GNU uniq --help message:
Quote
... Note: 'uniq' does not detect repeated lines unless they are adjacent.
You may want to sort the input first, or use 'sort -u' without 'uniq'.

Also, comparisons honor the rules specified by 'LC_COLLATE'. ...

If you are looking for a count, piping a list through  wc -l  will give that to you:
Code: [Select]
tc@E310:~$ cat /etc/init.d/tc-config | wc -l
631
tc@E310:~$
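The adjacency caveat from the uniq help text can be seen on a small made-up list (the names are sample data only):

```shell
#!/bin/sh
# Hypothetical unsorted list with repeated names that are not adjacent.
names='gnuser
juanito
gnuser
juanito
juanito'

# Without sorting, uniq only merges neighbouring duplicates.
unsorted=$(printf '%s\n' "$names" | uniq -c | awk 'END { print NR }')

# Sorting first brings all duplicates together.
sorted=$(printf '%s\n' "$names" | sort | uniq -c | awk 'END { print NR }')

echo "groups without sort: $unsorted"
echo "groups with sort:    $sorted"
```

Without the sort, the two names are reported as four separate groups; with it, as two.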
Title: Re: tce/app-browser , sparing of storage or network
Post by: nick65go on March 08, 2023, 11:40:50 AM
GNUser and Rich, thank you for your feed-back and lessons. I just propose ideas, and I am happy to let the proficient persons to implement them for TC community, if it is suitable and not burden for them  :)

The stake is low in this game here, it is mostly a brain challenge for elegance merely, in tiny(core) spirit.
 
Title: Re: tce/app-browser , sparing of storage or network
Post by: mocore on March 21, 2023, 08:07:22 PM
i wander if anyone else has given this any thought ( dont remember finding any post with similar gist in the past )

correction ftr
 @ Topic: tgrex.pl - tcz/scm full text info search and download tool
https://forum.tinycorelinux.net/index.php/topic,14237.msg80232.html#msg80232 -

Quote
tgre update - downloads all the .info files
Yikes, that's hitting the mirror hard :P
all .info files  compressed
  gz or even xz
seems to me at least useful to have locally
...
+ it could make app browser and ab faster to browse .info for each app
   if tcz-info.xz was downloaded once
   and the files viewed from the archive locally
   rather than downloaded individual while browsing through the apps
...
 .. tho i guess that having a local mirror of the repo is the alternative option

...
Info and tree combined files would have less of use, IMHO.

@curaga
 i wonder for what reason info & tree combined would be considered less useful ?  ???

at least one compressed file would mean fewer connections than one request per file,
although perhaps *now* (for the server) that's less of a concern?

Now on general subject in the title of post: "tce/app-browser , sparing of storage or network ".
With the addition of dep.db.gz (thank you curaga)
+1
Title: Re: tce/app-browser , sparing of storage or network
Post by: curaga on March 22, 2023, 02:56:51 AM
The typical session may access just a couple infos and trees. People wanting the full set (for various analyses) are going to be quite rare.
Title: Re: tce/app-browser , sparing of storage or network
Post by: mocore on August 17, 2024, 02:34:00 PM
The typical session may access just a couple infos and trees. People wanting the full set (for various analyses) are going to be quite rare.

considering the above and also the conflicting aspect mentioned @ "Proposed improvement of repo and related tools" https://forum.tinycorelinux.net/index.php/topic,4786.msg25203.html#msg25203
 ( although this thread and others seem to have moved forward the core idea ;P of indexing this and that metadata  ) 

my thinking is that a compromise would be a script to create an extension containing "the full set" of repo metadata
for the rare ones!
out there ... leeching the repos

I guess one of the main reasons for creating my own private TCZ directory mirror was that I wanted to know the content of all the extensions (and not only via the appbrowser on a per session basis).

... i guess the option of rsync
[1] https://unix.stackexchange.com/questions/366583/can-rsync-update-a-large-file-that-has-only-changed-partially-without-full-retra/366607#366607
appears to reduce bandwidth by default when changes to existing local copies of files are downloaded