
Author Topic: tce/app-browser , sparing of storage or network  (Read 14321 times)

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #45 on: March 02, 2023, 02:28:09 PM »
I changed the script to use http. This link will always point to the most recent version of the script:
http://gnuser.ddns.net/public/contributor.sh

Rich: Would you please delete the attachment to Reply #22 and change the link in Reply #28 to http (or, preferably, delete Reply #28 altogether)?

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #46 on: March 02, 2023, 03:23:55 PM »
Hi nick65go. As you requested, I modified the script to tally number of extensions per contributor. Not my finest work--I'm sure there's a more elegant way to do it--but it works. Please grab the latest version of the script and try:
Code: [Select]
$ contributor.sh -t
I also found a little bug: Script assumed Extension_by was present in every .info file. In fact, some .info files have Extension-by instead. Latest version of the script (Version 2.0) doesn't care either way.
« Last Edit: March 02, 2023, 03:25:34 PM by GNUser »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11702
Re: tce/app-browser , sparing of storage or network
« Reply #47 on: March 02, 2023, 03:48:54 PM »
Hi GNUser
Reply #22: Attachment removed. Note directing users to reply #45 for download link added.
Reply #28: Removed.

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #48 on: March 02, 2023, 04:31:29 PM »
Thank you, Rich. Looks great!

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #49 on: March 02, 2023, 05:55:06 PM »
In the meantime, my naive script shows what you found (the misspelling); however, I also found differing whitespace: a varying number of spaces, or a tab (\t) instead of spaces, etc.:
Code: [Select]
#!/bin/sh
# Collect every Extension_by: line from the extracted .info files and tally duplicates.
LIST="/tmp/authors.lst"
rm -f "$LIST"   # start from an empty list

cd /home/tc/Downloads/infofiles/ || exit 1
for i in * ; do grep "Extension_by:" "$i" >> "$LIST" ; done
echo "From total infos " `ls -1 . | wc -l` ", I found number of authors: `cat "$LIST" | wc -l` "

# identical lines are counted together; whitespace variants stay separate
sort "$LIST" | uniq -c > /tmp/authors2.lst
echo "total rows: " `cat /tmp/authors2.lst | wc -l`
The results are:
Code: [Select]
tc@box:~$ ./A.sh
From total infos  2840 , I found number of authors: 2725
total rows:  90
tc@box:~
These numbers show the effect of the Extension_by vs. Extension-by mismatch: 2840 info files, but only 2725 Extension_by: lines.

Code: [Select]
tc@box:~$ grep "gnuser" /tmp/authors2.lst
      6 Extension_by:   gnuser    # here is a TAB
     31 Extension_by:   gnuser    # here are 3 spaces
      1 Extension_by:  gnuser     # here are 2 spaces
tc@box:~$

or things like:
Code: [Select]
      1 Extension_by:       coreplayer2  # here an extra \t
      1 Extension_by:   Corplayer2           # with capital letter
      6 Extension_by:   coreplayer2
      1 Extension_by:   aus9, coreplayer2
      1 Extension_by:    aus9                # here an extra space
     13 Extension_by:   aus9 at gmx dot com

PS: Because this concerns only a few of the at most 90 rows in the final result, maybe they could be corrected directly on the server, in their info files.
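The whitespace and capitalization variants listed above can also be merged at tally time rather than on the server. What follows is only a minimal sketch, not part of contributor.sh: the function name normalize_authors is made up for illustration, and it assumes the field is always spelled Extension_by: or Extension-by:. It does not repair misspellings such as Corplayer2, and it leaves multi-author lines as they are.

```shell
#!/bin/sh
# Hypothetical helper (not from contributor.sh): read .info text on stdin and
# print "count name" for each author, ignoring case and surrounding blanks.
normalize_authors() {
    awk '
        /^[ \t]*Extension[-_]by:/ {
            sub(/^[ \t]*Extension[-_]by:[ \t]*/, "")  # drop field name and leading blanks
            sub(/[ \t]+$/, "")                        # drop trailing blanks
            count[tolower($0)]++                      # "GNUser" and "gnuser" share one key
        }
        END { for (name in count) print count[name], name }
    '
}
```

It could be fed the list built by the script above, for example `normalize_authors < /tmp/authors.lst`. The output order is not sorted; pipe it through `sort -rn` for a ranking.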
« Last Edit: March 02, 2023, 05:58:47 PM by nick65go »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #50 on: March 02, 2023, 06:35:03 PM »
Hi nick65go. As you requested, I modified the script to tally number of extensions per contributor. Not my finest work--I'm sure there's a more elegant way to do it--but it works.
Thank you, I tried it and it works.
One small possible improvement: you extract infofiles.tbz (which is on my disk) into a folder below it, so you populate the HDD with 2841 small files and then read them back (from the slow disk) to process them. My suggestion is to extract them in /tmp/whatever instead, as you did with /tmp/contributors, so everything is processed in RAM. Also, there may be no need to display each step in the terminal ("parsing info file ### of 2839"); send that output to /dev/null.
Anyway, we will not run this script every day, maybe once in a while, so the speed should not matter.
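The extract-to-/tmp suggestion could look roughly like the sketch below. It is not taken from contributor.sh: the function name unpack_to_tmp and the /tmp/infofiles.XXXXXX template are assumptions, and it relies on /tmp living in RAM, as the post describes. Leaving out tar's -v flag keeps the extraction quiet.

```shell
#!/bin/sh
# Hypothetical helper: unpack a bzip2 tarball into a fresh directory under /tmp
# and print that directory's path. The caller removes it when finished.
unpack_to_tmp() {
    tarball=$1
    workdir=$(mktemp -d /tmp/infofiles.XXXXXX) || return 1
    # -j selects bzip2, matching the .tbz suffix; no -v, so no per-file output
    tar -xjf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    echo "$workdir"
}
```

Usage would be along the lines of `dir=$(unpack_to_tmp infofiles.tbz)`, then processing the files in "$dir", then `rm -rf "$dir"`.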

All in all, an excellent job, and accomplished very fast.

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #51 on: March 02, 2023, 06:57:16 PM »
Hi nick65go. These are all excellent suggestions, thank you. I've incorporated them into version 3.0 of the script.

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #52 on: March 03, 2023, 02:44:37 AM »
My initial idea was to:
- harness the power of zcat and pipes (as I saw in the core-remaster script), like zcat | grep ;
- use the auto-sum power of "uniq -c" ;
- for the "grep", use a regular expression like "^Extension[-_]by:", i.e. match lines that start with "Extension" and catch the few variants of the field name;
- then use sed to eliminate ^Extension[-_ ]by:[\s, \t] up to the author name.
Code: [Select]
zcat | grep $X | sed $Y | sort | uniq -c > /tmp/Z.txt
Maybe we need an "xargs" or "tee" somewhere, I do not know.
That works until we need to parse a line with multiple authors, with awk. When I hit this, I gave up.
Now that the task is done (for the tcz.info files we have infofiles.tbz, THAT IS THE CORE), we can afford the luxury of fooling around for version 4 :)
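For what it is worth, the pipe idea can be written without awk if multi-author lines are simply split on commas. This is a sketch under that assumption, not how contributor.sh does it; the function name tally_authors is invented here, and names that differ only in case or spelling are still counted separately.

```shell
#!/bin/sh
# Hypothetical filter: read concatenated .info text on stdin and print
# "count author" lines, most frequent first.
tally_authors() {
    grep '^[[:space:]]*Extension[-_]by:' |
        sed 's/^[[:space:]]*Extension[-_]by:[[:space:]]*//' |
        tr ',' '\n' |                                  # one author per line
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |   # trim blanks around each name
        grep -v '^$' |
        sort | uniq -c | sort -rn
}
```

Because infofiles.tbz is a bzip2 tarball, the input would come from tar rather than zcat, for example `tar -xjOf infofiles.tbz | tally_authors`, where -O writes the extracted files to stdout so nothing touches the disk.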
« Last Edit: March 03, 2023, 03:12:07 AM by nick65go »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #53 on: March 03, 2023, 04:56:40 AM »
Do you want to solve a riddle: who should be credited for this one, the author or the updater?
https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz/netsurf-gtk3.tcz.info
I bet on "jazzbiker", do you?
"Jason W" created it in 2010 (congratulations!), but "jazzbiker" maintains it (STILL doing the same work!) as of 2021.
Title: netsurf-gtk3.tcz 
 Extension_by: Jason W
 Change-log: 2010/11/20 first version
            2019/01/17 updated 2.6 -> 3.8 (juanito)
            2020/02/16 updated 3.8 -> 3.9 (neonix)
Current:    2021/04/14 updated 3.9 -> 3.10 (jazzbiker)
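If one wanted to credit the updater, the name could be pulled from the Current: line. This is a minimal sketch that assumes the layout shown above (the updater in parentheses at the end of the Current: line); other .info files may not follow it, and the function name latest_updater is made up.

```shell
#!/bin/sh
# Hypothetical helper: read one .info file on stdin and print the name found in
# parentheses at the end of its "Current:" line, if there is one.
latest_updater() {
    sed -n 's/^[[:space:]]*Current:.*(\([^()]*\))[[:space:]]*$/\1/p'
}
```

Called as `latest_updater < netsurf-gtk3.tcz.info`, it prints nothing when the Current: line carries no parenthesized name, so a caller could fall back to Extension_by in that case.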

« Last Edit: March 03, 2023, 05:00:34 AM by nick65go »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #54 on: March 03, 2023, 06:22:29 AM »
@GNUser: a few small proposals for YOUR script:

1. A typo: provides.db instead of infofile, but it is only in a commented line, so no influence.

2. It would be nice to show the issue date alongside the version in the script help; no influence either.

3. From the total (ex: 2839 info files), process only those that are NOT *-doc.tcz [134], *-locale.tcz [58], *-dev.tcz [752], or *-gir.tcz [164];

so, calc 2839-134-58-752-164 -> 1731; calc 1731/2839 -> 0.60972, basically process only 61% of the info files.
Because the extension creator submitted "tcz + doc.tcz + locale.tcz" together, do not count it 3 times (like when someone uses the "submitqc" procedure for a tcz, which also creates its complementary files).
And "-dev.tcz + -gir.tcz" by juanito [1891 infos] (calc 1891/2839 -> 66.6% of total); I hope he will not be upset. IMMV.

If my proposal is undesirable, then maybe add it as an optional parameter when calling the script.
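The subset count in proposal 3 can be reproduced on a directory of extracted info files with something like the sketch below. It is only an illustration: count_subset is an invented name, and it assumes the files are named *.tcz.info, as in the mirror URL quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: count the .tcz.info files in a directory and report how
# many remain after skipping the -doc, -locale, -dev and -gir companions.
count_subset() {
    ls "$1" | grep '\.tcz\.info$' |
        awk '
            { total++ }
            !/-(doc|locale|dev|gir)\.tcz\.info$/ { kept++ }
            END { if (total) printf "%d of %d (%.0f%%)\n", kept, total, 100 * kept / total }
        '
}
```

Run as `count_subset /home/tc/Downloads/infofiles`. The subtraction itself can be checked in the shell with `echo $((2839 - 134 - 58 - 752 - 164))`, which prints 1731.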
« Last Edit: March 03, 2023, 06:28:44 AM by nick65go »

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #55 on: March 03, 2023, 08:02:52 AM »
Hi nick65go. I implemented your three proposals into version 3.5 of the script, including new -ts flag for "tally subset". The new flag makes my ranking go up ;D

No matter how we slice and dice, Juanito will always be supreme champion. It boggles my mind how he can be so prolific--it seems superhuman. I don't think any distro has a more prolific packager. Without him, TCL would just be a great concept, not a usable distro.
« Last Edit: March 03, 2023, 08:12:48 AM by GNUser »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #56 on: March 03, 2023, 09:23:47 AM »
Juanito will always be supreme champion. It boggles my mind how he can be so prolific--it seems superhuman. I don't think any distro has a more prolific packager. Without him, TCL would just be a great concept, not a usable distro.
"Render unto Caesar" -- the things that are Caesar's.  :) But in any war you need also the peons.
« Last Edit: March 03, 2023, 09:26:35 AM by nick65go »

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1530
Re: tce/app-browser , sparing of storage or network
« Reply #57 on: March 03, 2023, 09:34:25 AM »
Yes. Three cheers for Juanito!
I'm okay just being a grateful peon.

Do you want to solve a riddle: who should be credited for this one, the author or the updater?
https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz/netsurf-gtk3.tcz.info
Haha. I'm also betting on jazzbiker but I will ask Google to train my contributor.sh script on billions of example .info files. Then, when the script becomes sentient, we can ask it for the correct answer :D
« Last Edit: March 03, 2023, 10:01:06 AM by GNUser »

Offline nick65go

  • Hero Member
  • *****
  • Posts: 839
Re: tce/app-browser , sparing of storage or network
« Reply #58 on: March 03, 2023, 10:52:25 AM »
Now on the general subject in the title of the thread: "tce/app-browser , sparing of storage or network ".
With the addition of dep.db.gz (thank you curaga) and infolist.tbz (thank you GNUser), we now just need one more NEW file on the server: tree.db.gz

With all these SMALL files (dep.db.gz, md5.db.gz, size.gz, tags.gz, provides.db.gz, and infolist.tbz + tree.db.gz), tce-ab could be used almost off-line to look up tcz packages, showing the full Tiny Core potential.

As long as we do not need to download any new tcz package, the search can be done FULLY off-line, since we will have all the necessary meta-data. No need to contact the server to download *.tcz.tree;

ex: if I switch between searching the trees, first for the abiword.tcz tree and then for the gnumeric.tcz tree, and repeat this 5 times (ex: because I forgot something I saw previously), then those two trees will be downloaded from the server five times each (as it stands, they are downloaded and, after being viewed, deleted).
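Until something like tree.db.gz exists, the repeated downloads could be avoided by keeping each tree file after its first fetch. The sketch below is not tce-ab's code. The fetch_tree function, the CACHE_DIR location, and the assumption that .tree files sit next to the .info files on the mirror quoted earlier in the thread are all illustrative.

```shell
#!/bin/sh
# Hypothetical cache for .tree files; both settings can be overridden.
CACHE_DIR=${CACHE_DIR:-/tmp/tree-cache}
MIRROR=${MIRROR:-https://mirrors.dotsrc.org/tinycorelinux/14.x/x86/tcz}

# Print the tree for an extension (e.g. abiword.tcz), downloading it only
# when no non-empty cached copy exists.
fetch_tree() {
    ext=$1
    cached="$CACHE_DIR/$ext.tree"
    mkdir -p "$CACHE_DIR"
    if [ ! -s "$cached" ]; then
        wget -q -O "$cached" "$MIRROR/$ext.tree" || { rm -f "$cached"; return 1; }
    fi
    cat "$cached"
}
```

With this, switching between `fetch_tree abiword.tcz` and `fetch_tree gnumeric.tcz` five times contacts the server twice in total; the cache disappears on reboot because it lives in /tmp.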

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11702
Re: tce/app-browser , sparing of storage or network
« Reply #59 on: March 03, 2023, 05:00:38 PM »
Hi nick65go
... ex: if I switch between searching the trees, first for the abiword.tcz tree and then for the gnumeric.tcz tree, and repeat this 5 times (ex: because I forgot something I saw previously), then those two trees will be downloaded from the server five times each (as it stands, they are downloaded and, after being viewed, deleted).
If you select the Size option, a copy of the .tree file is saved in /tmp.