Tiny Core Linux

dCore Import Debian Packages to Mountable SCE extensions => dCore X86 => Topic started by: sm8ps on January 02, 2019, 10:24:42 AM

Title: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 02, 2019, 10:24:42 AM
Greetings and happy new year! I have not been very successful with my dCore adventures last year and so I am hoping that 2019 will allow more break-throughs, for myself and also for everybody else!

I have been tinkering with dCore upgrades for a long time and while at it I had quite enough opportunities for wondering why sce-update takes so long. It even seems that it takes more time to update a package than to simply re-import it. Today I decided to take a closer look on a netbook with a base extension called X-LIST that contains the following packages:
Code: [Select]
pm-utils, graphics-4.14.10-tinycore, xorg, xorg-intel, xserver-xorg-input-synaptics,
arandr, openbox, flwm, lxpanel, xinput, dbus, dbus-x11, suru-icon-theme

Including dependencies, there are 380 packages to be imported. I compared updating with 'sce-update -rn' to (re-)importing with 'sce-import -rln'. I repeated the test twice in order to eliminate the effect of downloading (because packages are taken from 'tce/import/debs/' if available).

Furthermore, I compared two different sets of repositories under '/opt/debextra/': A) only three PPAs, B) the same three PPAs plus all the following Ubuntu repositories:
Code: [Select]
bionic-backports-main, bionic-backports-multiverse,bionic-backports-restricted, bionic-backports-universe,
bionic-main, bionic-multiverse, bionic-restricted, bionic-universe,
bionic-updates-main, bionic-updates-multiverse, bionic-updates-restricted, bionic-updates-universe
The PPAs just happened to be there and do not play any particular role.

Here are the results:
Code: [Select]
sce-update -rn X-LIST    A)  4'55"    B) 13'29"
sce-import -rln X-LIST   A)  1'59"    B)  6'54"
Interpretation:
This is on dCore-bionic:2018.08.10.22.24. Grep is imported and thus not Busybox-grep.

Is this all expected behavior?
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 02, 2019, 08:40:49 PM
Hi sm8ps

The short answer is yes, sce-update by its nature takes longer than sce-import.  And extra repos do add to that time for both sce-update and sce-import.

Much effort has been spent in streamlining sce-update and sce-import for performance.  100% accuracy is essential, but performance is also important.  I have tried different routines using variations of awk, grep, sed, and other commands in the scripts to speed things up but the current is the best performing way found so far.




Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 03, 2019, 01:26:11 PM
Thanks for your answer, JasonW! I am aware that there is much effort behind all the sce-tools. Nevertheless, I am wondering why one should bother using sce-update at all. Probably because it does tell when there are no updates necessary, as opposed to sce-import. Even so, could a combination of an sce-update check and "blindly" sce-importing anew perform better? -- This is all very naive as I do not know the code behind the tools. So I would be grateful for another short answer.

Cheers!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 04, 2019, 08:27:47 PM
Here is my result from using sce-update on my SCE directory with all SCEs being checked. First, the size of my SCE directory, which is about 9.2 GB not counting the update subdirectory:

Code: [Select]
jason@box:/mnt/sda1/tceimport-bionic64/sce$ du -h
1.8G ./update
11G .


Below is the sce-update of the SCE directory, with checking all SCEs and no re-importing:

Code: [Select]
jason@box:~$ time sce-update -an
* Using nice level 19.
* Using the -a option.
* Using the -n option.
* DEBINX sync based on ubuntu bionic.
* Debian index sync: ubuntu_bionic_main_amd64_Packages
* Debian security index sync: ubuntu_bionic_security_amd64_Packages
* Using repo: http://security.ubuntu.com/ubuntu bionic main
Checking all system SCEs for updates:
 
         200-bionic.sce update check.
         alsa-utils.sce update check.
         atril.sce update check.
         audacity.sce update check.
         avidemux2.6-qt.sce update check.
         bionic-desk.sce update check.
         brasero.sce update check.
         cdrdao.sce update check.
         cheese.sce update check.
         chromium-browser.sce update check.
         dCore-chrome-stable-installer.sce update check.
         dCore-firefox-installer.sce update check.
         dCore-google-chrome-stable-installer.sce update check.
         dCorePlus-bionic64.sce update check.
         dCore-usbinstall.sce update check.
         devede.sce update check.
         dosfstools.sce update check.
         e17.sce update check.
         e3.sce update check.
         emelfm2.sce update check.
         enlightenment.sce update check.
         evince.sce update check.
         ffmpeg.sce update check.
         file.sce update check.
         firefox.sce update check.
         firefox-latest.sce update check.
         fonts-freefont-ttf.sce update check.
         gdb-dbg.sce update check.
         git.sce update check.
         gnome-screensaver.sce update check.
         google-chrome.sce update check.
         gparted.sce update check.
         graphics-4.14.10-tinycore64.sce update check.
         grep.sce update check.
         icewm.sce update check.
         imagemagick.sce update check.
         k3b.sce update check.
         kernel-all-4.14.10-tinycore64.sce update check.
         leafpad.sce update check.
         libbullet2.87.sce update check.
         libgail-3-0.sce update check.
         libgconf2-4.sce update check.
         libgles2.sce update check.
         libgtk-3-bin.sce update check.
         libk3b7-extracodecs.sce update check.
         libmadlib.sce update check.
         libmad-ocaml.sce update check.
         libmagickwand-6.q16-3.sce update check.
         libopenjp2-7.sce update check.
         libpaper1.sce update check.
         libpaper-utils.sce update check.
         libpoppler73.sce update check.
         libpoppler-glib8.sce update check.
         libpostproc54.sce update check.
         libraw16.sce update check.
         libreoffice-gtk3.sce update check.
         librsvg2-bin.sce update check.
         libspectre1.sce update check.
         linux-headers-generic.sce update check.
         mesa-utils.sce update check.
         minitube-2.9.sce update check.
         mkvtoolnix.sce update check.
         module-assistant.sce update check.
         ndiswrapper-modules-4.14.10-tinycore64.sce update check.
         nouveau-4.14.10-tinycore64.sce update check.
         nvidia-340.106-4.14.10-tinycore64.sce update check.
         nvidia-340-dev.sce update check.
         picard.sce update check.
         poppler-data.sce update check.
         python3-sip.sce update check.
         python-sip.sce update check.
         sce-ppa-add.sce update check.
         sce-update.sce update check.
         smplayer.sce update check.
         ssh.sce update check.
         suru-icon-theme.sce update check.
         terminology.sce update check.
         tumbler.sce update check.
         unetbootin.sce update check.
         usb-creator-gtk.sce update check.
         veracrypt-cli.sce update check.
         veracrypt-gtk.sce update check.
         virtualbox-5.2.10-host-modules-4.14.10-tinycore64.sce update check.
         virtualbox-dkms.sce update check.
         virtualbox-qt.sce update check.
         vlc.sce update check.
         webp.sce update check.
         wifi.sce update check.
         wine1.6.sce update check.
         wireless.sce update check.
         wireless-4.14.10-tinycore64.sce update check.
         xdg-user-dirs.sce update check.
         xdg-utils.sce update check.
         xfburn.sce update check.
         xfig.sce update check.
         xorg-all.sce update check.
         xorg-dev.sce update check.
 
No updates available for main or any dependency SCEs.
Command exited with non-zero status 1
real 8m 28.30s
user 5m 40.03s
sys 1m 53.19s
jason@box:~$

8 and a half minutes to check the entire SCE directory, 9.2 GB of files.

And below is simply re-importing all the above mentioned SCEs with the below script run in the SCE directory:

Code: [Select]
#!/bin/sh

# Re-import every SCE found in the current directory, by name without the .sce suffix.
for I in *.sce; do E=$(basename "$I" .sce); sce-import -n "$E"; done



Results:

Code: [Select]
* Imported xorg-dev.sce.
real 1h 17m 34s
user 45m 34.98s
sys 11m 48.18s



Only the last snippet was posted as the output was too long.  1 hour and 17 min versus 8 and a half minutes.  So simply re-importing is good for an install with few or small SCEs, but the savings in time with sce-update show with larger installs.

Hope this explains the usefulness of sce-update. 
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 05, 2019, 06:23:25 AM
Many thanks for taking the time to do the test on your own, JasonW! Obviously, there is a great advantage in not re-importing extensions for which there is no update available. This is what sce-update is really useful for, no doubt about it and your results make that very clear.

Though if updates are available, I expected that sce-update and sce-import would perform about the same with a slight penalty for sce-update due to its searching for updates. What struck me is how big the penalty is, increasing the total time by a factor of 2.5. That led me to the preliminary conclusion that sce-update was fast at checking but slow at importing.

After re-considering your statements, I realized that I was confused by the fact that sce-update worked almost instantly in my cases. It took less than 20" even with all additional repositories. This must be due to the checking for changes in the debinx-files which had not happened when I tested it after performing the update/import.

In reversed order, however, it shows that the checking part of sce-update indeed does take considerable time which explains it all. So I step back from my suspicion that sce-update might not perform well enough. I had been hoping for shorter system upgrade times but I understand now why this won't easily happen. All in all, sce-update is an indispensable tool and does work well. Thanks again for all your time and effort!

Cheers!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 05, 2019, 09:33:45 PM
Hi sm8ps,

I will look into the performance of sce-update when there are updates available, which I did not test in my last post.  I am always looking for better performance.  Thanks for your observations. 
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 06, 2019, 09:58:05 AM
Thanks for the follow-up, Jason! -- I was going to say: "why bother?" Though after re-considering the results, I do think it would be interesting to compare.

In my case, it seems that sce-update adds 150% of the time of sce-import for checking if it should upgrade or not. This does sound inefficient but only pertains to one single extension with updates indeed available. If no updates are available, it does save a lot of time, of course.

I do not really see how to compare your results to mine because the difference in performance is so striking in your case. For a one-to-one comparison, you could add or remove some extra repositories and compare sce-update to sce-import for one single extension.

I have not really understood the code, although I took a closer look at it. (My avatar says it all when it comes to my understanding of Awk.) So maybe there is no much more efficient way of finding out if an extension is due for an update or not. Still I do not really understand why the searching for an update should take more time (+150%) than the actual upgrade.

Cheers!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 08, 2019, 05:10:01 PM
Hi sm8ps,

I don't have extra repos, but sce-import and sce-update use the same routines in parsing them.  But I did test 'sce-import -n firefox', 'sce-update -n firefox' with no updates, and tweaked the resulting firefox.sce to trigger an update and used 'sce-update -n firefox' that caused an update.  The results are below, and none of these required downloads:

Code: [Select]
dCore-stretch:

firefox initial import:
real    1m 12.98s
user    0m 38.45s
sys     0m 12.56s

'sce-update -n firefox' full checking but with no re-import:
real    1m 32.82s
user    0m 14.15s
sys     0m 5.76s

'sce-update -n firefox' full checking with re-import:
real    2m 36.13s
user    0m 51.99s
sys     0m 17.67s

I am seeing an expected time result in sce-update that triggers an import that is roughly equal to the times of the sce-update checking plus sce-import.  I am running a machine with 6GB of RAM (~4GB recognized on dCore-stretch) and a 3 GHz dual core.
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 13, 2019, 01:08:12 PM
Thanks again for taking the time, Jason! Your results do look like the expected behavior and there is no real point in striking up a debate. Doing so nevertheless is not for the sake of argument but because you said you were always looking for ways to increase performance. Unfortunately, my Bash- and Awk-fu is not so strong that I could simply understand the source code (see my avatar ...) so I can only formulate conjectures.

I would expect sce-update to be quicker in discovering if an extension can be updated. Because it is enough to discover any single package in the chain of dependencies that can be updated and there is no need for checking all of the packages. sce-import on the other hand does have to touch every single package and therefore should take considerably more time. The difference in performance may vary by the position of the first package that can be updated but on average it should be about 50%.

Judging from the results, it looks to me like sce-update does not only check if some single package can be updated but seems to do a full check for all the upgradeable packages. I do not know how exactly your tweaking of firefox.sce worked but in my case of a base extension consisting of 300+ individual packages it almost certainly was not the last one which triggered the upgrade. I would have expected sce-update to signal much quicker that an update has been found. In my case, however, the full process of checking for updates takes more time than the full process of importing the extension anew.

It is acceptable for sce-update to take (at most) as much time as sce-import to determine that there is no update available if this query is so complex. However, it should be no more and, at best, much less. Does that make sense? As I said, I have not understood the programming logic involved in sce-update and my reasoning is pure speculation. If I am going astray then I shall be just happy to learn so.

Cheers!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 13, 2019, 06:50:10 PM
Hi sm8ps,

You make a point in that sce-update can simply trigger an update on the first found instance that requires an update on any possible package, startup scripts, or data.tar.gz, rather than having to go through the full checking for updates in the SCE.  I think most folks just want to know if an SCE needs updating, since a change to any startup script, data.tar.gz, or .deb package can be very important, but they don't need to know why.

I will look into how it can be done, perhaps with a "-f" "full check" option for sce-update for those who want a full checking and output. 
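
The difference between stopping at the first hit and doing a full check can be sketched roughly as follows. This is only an illustration and not the actual sce-update code: the two "package version" lists, the function names and the version numbers are all made up, and the real tool also has to consider startup scripts and data.tar.gz.

```sh
#!/bin/sh
# Hypothetical inputs: two "package version" lists, one for what the SCE
# contains and one for what the repo index offers. This format and the
# function names are assumptions for illustration, not sce-update internals.

# Simple mode: stop at the first package whose version string differs.
first_outdated() {
    awk 'NR == FNR { have[$1] = $2; next }
         ($1 in have) && (have[$1] "") != ($2 "") { print $1; exit }' "$1" "$2"
}

# Full check: list every package whose version string differs.
all_outdated() {
    awk 'NR == FNR { have[$1] = $2; next }
         ($1 in have) && (have[$1] "") != ($2 "") { print $1 }' "$1" "$2"
}

installed=$(mktemp)
available=$(mktemp)
trap 'rm -f "$installed" "$available"' EXIT

# Made-up version numbers.
printf '%s\n' 'gpg 1.0-1' 'libnss3 1.0-1' 'xorg 1.0-1' > "$installed"
printf '%s\n' 'gpg 1.0-2' 'libnss3 1.0-2' 'xorg 1.0-1' > "$available"

first_outdated "$installed" "$available"   # prints: gpg
all_outdated "$installed" "$available"     # prints: gpg and libnss3, one per line
```

The simple variant only answers "is an update needed?", while the full variant also answers "for which packages?" at the cost of always reading the whole list.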
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 15, 2019, 10:22:32 PM
That sounds promising! I am looking forward to testing any new versions.

Cheers!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 18, 2019, 10:04:07 AM
It seems it is not so much the checking of all packages in an SCE that was causing the performance difference between sce-import and sce-update, but the sce-update routines needed to be adjusted with the changes made recently to sce-import.  Below is a comparison of the performance of sce-update before and after the changes made now.  All seems well, so sce-update has been updated, please test.

Code: [Select]
Size of 200-bionic.sce:
$ ls -lh 200-bionic.sce  ... 1.3G

Number of packages in 200-bionic.sce (lines in below file):
$ wc -l /usr/local/sce/200-bionic/200-bionic.md5sum
1498 /usr/local/sce/200-bionic/200-bionic.md5sum

--------

New sce-update, simple mode:
$ time sce-update -c 200-bionic.sce
....
real    1m 30.34s
user    1m 7.85s
sys     0m 25.47s

$ cat /tmp/updateavailable
====  200-bionic updates  ====
chromium-browser    ubuntu bionic security-updates

--------

New sce-update, full check mode:
$ time sce-update -cf 200-bionic.sce
....
real    2m 1.88s
user    1m 27.87s
sys     0m 36.10s

$ cat /tmp/updateavailable
====  200-bionic updates  ====
chromium-browser    ubuntu bionic security-updates
chromium-codecs-ffmpeg    ubuntu bionic security-updates
chromium-codecs-ffmpeg-extra    ubuntu bionic security-updates
dirmngr    ubuntu bionic security-updates
gnupg    ubuntu bionic security-updates
gnupg-l10n    ubuntu bionic security-updates
gnupg-utils    ubuntu bionic security-updates
gpg    ubuntu bionic security-updates
gpg-agent    ubuntu bionic security-updates
gpgconf    ubuntu bionic security-updates
gpgsm    ubuntu bionic security-updates
gpgv    ubuntu bionic security-updates
gpg-wks-client    ubuntu bionic security-updates
gpg-wks-server    ubuntu bionic security-updates
libarchive13    ubuntu bionic security-updates
libcaca0    ubuntu bionic security-updates
libexiv2-14    ubuntu bionic security-updates
libgssapi-krb5-2    ubuntu bionic security-updates
libjavascriptcoregtk-4.0-18    ubuntu bionic security-updates
libk5crypto3    ubuntu bionic security-updates
libkrb5-3    ubuntu bionic security-updates
libkrb5support0    ubuntu bionic security-updates
libnss3    ubuntu bionic security-updates
libpam-systemd    ubuntu bionic security-updates
libpolkit-agent-1-0    ubuntu bionic security-updates
libpolkit-backend-1-0    ubuntu bionic security-updates
libpolkit-gobject-1-0    ubuntu bionic security-updates
libsystemd0    ubuntu bionic security-updates
libudev1    ubuntu bionic security-updates
libwebkit2gtk-4.0-37    ubuntu bionic security-updates
libzmq5    ubuntu bionic security-updates
policykit-1    ubuntu bionic security-updates
systemd-sysv    ubuntu bionic security-updates

--------

Old sce-update:
$ time sce-update -c 200-bionic.sce
....
real    7m 27.23s
user    1m 56.17s
sys     0m 44.86s

$ cat /tmp/updateavailable
====  200-bionic updates  ====
chromium-browser    ubuntu bionic security-updates
chromium-codecs-ffmpeg    ubuntu bionic security-updates
chromium-codecs-ffmpeg-extra    ubuntu bionic security-updates
dirmngr    ubuntu bionic security-updates
gnupg    ubuntu bionic security-updates
gnupg-l10n    ubuntu bionic security-updates
gnupg-utils    ubuntu bionic security-updates
gpg    ubuntu bionic security-updates
gpg-agent    ubuntu bionic security-updates
gpgconf    ubuntu bionic security-updates
gpgsm    ubuntu bionic security-updates
gpgv    ubuntu bionic security-updates
gpg-wks-client    ubuntu bionic security-updates
gpg-wks-server    ubuntu bionic security-updates
libarchive13    ubuntu bionic security-updates
libcaca0    ubuntu bionic security-updates
libexiv2-14    ubuntu bionic security-updates
libgssapi-krb5-2    ubuntu bionic security-updates
libjavascriptcoregtk-4.0-18    ubuntu bionic security-updates
libk5crypto3    ubuntu bionic security-updates
libkrb5-3    ubuntu bionic security-updates
libkrb5support0    ubuntu bionic security-updates
libnss3    ubuntu bionic security-updates
libpam-systemd    ubuntu bionic security-updates
libpolkit-agent-1-0    ubuntu bionic security-updates
libpolkit-backend-1-0    ubuntu bionic security-updates
libpolkit-gobject-1-0    ubuntu bionic security-updates
libsystemd0    ubuntu bionic security-updates
libudev1    ubuntu bionic security-updates
libwebkit2gtk-4.0-37    ubuntu bionic security-updates
libzmq5    ubuntu bionic security-updates
policykit-1    ubuntu bionic security-updates
systemd-sysv    ubuntu bionic security-updates
 

Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 18, 2019, 10:36:20 AM
Now that does sound exciting! The time savings really do look spectacular.
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: jls on January 19, 2019, 04:14:54 AM
Hi
Thanks Jason for the update and sm8ps for pushing Jason
-f needs to be adjusted in sce-update -h
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 19, 2019, 06:16:51 AM
Thanks jls, -f option has been corrected in the -h help menu. 
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 19, 2019, 11:48:14 AM
Now that I have remembered that sce-update is a separate extension and thus it does not make sense waiting for a new release candidate (I am either old-school or old and slow to adapt or all of it) I shall test my case in the next few days and report back.
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 21, 2019, 10:42:49 AM
Here are my new test results. The new sce-update does show a change in performance.

As before, case A) uses three (non-relevant) PPAs whereas case B) uses the same three PPAs plus 12 Ubuntu repositories (cf. previous posts for detail).

Old sce-update for reference:
Code: [Select]
sce-update -crn X-LIST    A) 3'30"    B) 7'00"
sce-update -rn X-LIST     A) 4'56"    B) 13'30"

New sce-update:
Code: [Select]
sce-update -crn X-LIST    A) 0'47"    B) 6'48"
sce-update -rn X-LIST     A) 2'38"    B) 13'18"

For comparison the time for importing the extension. I have not tested case B) yet.
Code: [Select]
sce-import -rpln    A) 1'55"

The situation was set up such that by the changing of the repos there was a definite need for update.  The timing was taken upon changing from case A to B or vice versa.

My conclusion is that the new update routines do have a considerable effect. This is great! Case A) shows that the time for checking is much lower than the time for importing, as was hoped for. The results look rather solid to me and they add up as expected.

I cannot explain why the effect in case B) is so small. Does anybody have a clue?
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 21, 2019, 05:58:42 PM
When sce-update or sce-import is dealing with extra repos, it uses the slower awk routine in fetching package data rather than the faster grep routine that is used with the standard and security update repos.  The Packages files of the standard and security repos that are enabled by default have been formatted to where grep can be used.  Any Packages files from extra repos have not been formatted, so the awk routine must be used, which is not as fast but accurately deals with them.  So there is a performance penalty in using extra repos both in import and update.  More extra repos mean slower performance, depending on the size of the Packages files of those extra repos.
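
To make the two approaches a bit more concrete, here is a minimal sketch of looking up a package version in a file using the standard Debian Packages layout (stanzas separated by blank lines). It is not the code used by sce-import or sce-update: the function names, the sample data and the 20-line grep window are assumptions, and package names containing regex characters would need escaping.

```sh
#!/bin/sh
# Sample index in the Debian "Packages" layout; the contents are made up.
pkgs=$(mktemp)
trap 'rm -f "$pkgs"' EXIT
cat > "$pkgs" <<'EOF'
Package: alpha
Version: 1.0
Depends: beta

Package: beta
Version: 2.5
EOF

# awk program: paragraph mode (RS = "") makes each stanza one record and
# each line one field; print the Version of the stanza for package p.
stanza_version='BEGIN { RS = ""; FS = "\n" }
$1 == "Package: " p {
    for (i = 2; i <= NF; i++)
        if ($i ~ /^Version: /) { print substr($i, 10); exit }
}'

# awk only: awk itself has to read through the stanzas of the whole file.
version_awk() {
    awk -v p="$1" "$stanza_version" "$2"
}

# grep first: cut out a small snippet after the matching "Package:" line
# and let awk parse only that snippet.
version_grep() {
    grep -A 20 "^Package: $1\$" "$2" | awk -v p="$1" "$stanza_version"
}

version_awk beta "$pkgs"    # prints: 2.5
version_grep beta "$pkgs"   # prints: 2.5
```

Both functions return the same answer; the second one only hands awk a few lines instead of the complete file, which is where the time difference would come from on large Packages files.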

Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 22, 2019, 05:54:00 AM
Okay, that makes perfect sense. Thank you, Jason, master of Awk (among many other martial arts), for sharing your insight!

Stretching the topic of this thread and combining it with the decision to only support LTS versions plus the current release (I cannot find the thread at the moment): would it make sense (also in terms of effort) to pre-format also the {main, backports, update}-{multiverse, restricted, universe} repo lists, at least for the LTS version?
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 22, 2019, 08:22:27 AM
Also for limiting the expectations, I would like to add that the official repos are at the same time relevant for many users and huge in size. Both aspects make them stick out among the various other repos like PPAs etc.
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 22, 2019, 06:16:33 PM
One question, do you have GNU awk installed?  GNU awk is almost twice as fast as Busybox awk in getting package info during sce-import and sce-update, just did some tests on it.  And awk is what is used in extra repo functions.  Awk is always used in sce-import/sce-update, but grep is used to quickly get a snippet of the Packages files in the main and security repo before awk is run on it.  Grep gives an about 20% performance increase over using only the awk routine when GNU awk is installed. 

I will think of how I can truncate extra repo files during use like the main and security ones are done on the server to only include what is needed.  That will save some time. 

Thanks


Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: jls on January 23, 2019, 12:48:50 AM
Hi
do you mean the gawk package?
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 23, 2019, 02:44:19 AM
Hi.  Yes, gawk is the Debian/Ubuntu package name.   
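
A quick way to see which awk the shell actually picks up might look like the sketch below. It is a heuristic only: it assumes that GNU awk reports "GNU Awk" for --version and that the Busybox applet mentions "BusyBox" in its usage text when it does not know an option. The function name is made up.

```sh
#!/bin/sh
# Guess the awk flavor from the output of "awk --version".
# Heuristic: relies on the banner texts described above.
awk_flavor() {
    out=$(awk --version 2>&1 </dev/null)
    case "$out" in
        *"GNU Awk"*) echo gnu ;;
        *BusyBox*)   echo busybox ;;
        *)           echo other ;;
    esac
}

awk_flavor
```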
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 23, 2019, 02:22:50 PM
Right you are, Jason, GNU Awk does make a big difference compared to Busybox Awk which had been in use before. Here are my updated results. Sorry to repeat most of it but I believe it is easier to read the data when it is all in one place.

Case A) only three (non-relevant) PPAs; case B) the same three PPAs plus 12 Ubuntu repositories. The tests were always run at least twice and proved to be consistent after the DEBINX-file had been updated. In retrospect it would have been better to exclude all extra repositories but the influence of the PPAs does not seem very important.

Old sce-update with Busybox Awk for reference:
Code: [Select]
sce-update -rn X-LIST     A) 4'56"    B) 13'30"
sce-update -crn X-LIST    A) 3'30"    B) 7'00"

New sce-update with Busybox Awk:
Code: [Select]
sce-update -rn X-LIST     A) 2'38"    B) 13'18"
sce-update -crn X-LIST    A) 0'47"    B) 6'48"

New sce-update with GNU Awk:
Code: [Select]
sce-update -rn X-LIST      A) 2'40"    B) 9'05"
sce-update -crn X-LIST     A) 0'41"    B) 4'17"

Time for sce-import:
Code: [Select]
sce-import -rpln X-LIST     A) 1'51"    B) 4'49"

A) As observed before, the overall time is reduced to about 55% from the original state for standard repositories due to the new sce-update routine. The checking time (option -c) is reduced to about 20% which is pretty spectacular.
B) Using now also GNU Awk reduces the overall time for several big extra repositories to about 65% and the checking time to about 60% which is quite an improvement given the absolute values.

A) Comparing to sce-import, which had been my original motivation, the checking time of sce-update has decreased to about 40% for standard repositories.
B) With several big extra repositories, it is reduced to about 90%. Comparing to the almost 150% from the original state this is impressive and makes sce-update much more performant and thus usable than before.
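
For anyone who wants to re-check the percentages, the arithmetic can be reproduced with a small helper like this one. The function names are made up, and taking the old sce-update with Busybox Awk as the baseline for the first four values is an assumption about how the figures above were derived.

```sh
#!/bin/sh
# Convert a time written as M'SS" (as in the tables above) to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F"'" '{ sub(/"/, "", $2); print $1 * 60 + $2 }'
}

# Print the first time as a rounded percentage of the second time.
percent_of() {
    awk -v n="$(to_seconds "$1")" -v o="$(to_seconds "$2")" \
        'BEGIN { printf "%d\n", n / o * 100 + 0.5 }'
}

percent_of "2'38\"" "4'56\""    # A) update, new vs. old:            53
percent_of "0'47\"" "3'30\""    # A) check, new vs. old:             22
percent_of "9'05\"" "13'30\""   # B) update, GNU Awk vs. old:        67
percent_of "4'17\"" "7'00\""    # B) check, GNU Awk vs. old:         61
percent_of "0'41\"" "1'51\""    # A) check vs. sce-import:           37
percent_of "4'17\"" "4'49\""    # B) check vs. sce-import:           89
```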

Many, many thanks for your efforts, Jason!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: sm8ps on January 24, 2019, 02:01:20 PM
I got to test the new debGet* routines that will be in the next release-candidate. They have a similar effect for extra repositories as the new sce-update routine had for the standard repositories. Since the latter case can be considered all solved, I shall only state the results for my former case B) with the extra Ubuntu repositories. This is still with GNU Awk installed but I think this does not matter anymore.

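I compare the checking time with the total update time. Rows 1.) to 3.) repeat the earlier results for case B) (old sce-update with Busybox Awk, new sce-update with Busybox Awk, new sce-update with GNU Awk); row 4.) is with the new debGet* routines.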
Code: [Select]
       Check    Update
1.)    7'00"    13'30"
2.)    6'48"    13'18"
3.)    4'17"     9'05"
4.)    2'20"     5'22"
Code: [Select]
sce-import:
1.-3.) 4'49"
4.)    3'14"

The results were consistent across three runs. The difference should agree with the time for sce-import which does hold true indeed! Only note that that time has decreased by about 35% as well! I believe this is due to the new debGet* routines. Otherwise this change would point to a flaw in my measurements.

At first, I suspected that the pruning of the repo files only has to be performed once and the fact that I had to run the checking before the actual update is responsible for the miracle. However, this is not supported by the fact that the very first run by mistake was an update without prior checking.

So the new debGet* routines are tremendously effective. The checking time is down to about 75% of the import time which by itself has been reduced by about 35%! Naturally, these values depend on the actual changes in the package dependencies but the performance is lightning fast now.

It is absolutely stunning what JasonW has achieved. Congratulations and many thanks, Jason!
Title: Re: Why does sce-update take so much time (even more than sce-import)?
Post by: Jason W on January 24, 2019, 04:39:44 PM
Thanks for testing, the new RC is now uploaded. 

Now grep is used in sce-import/sce-update with extra repo DEBINX files just like the main and security repo; the extra DEBINX files are formatted to reduce size and standardize the contents.

The GNU grep package provides better performance over the Busybox version, though both function the same in this case.