
Author Topic: script that generates .tree files  (Read 1781 times)

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1842
Re: script that generates .tree files
« Reply #30 on: May 03, 2026, 06:55:33 AM »
I think we have a winner. ;D
Hi Rich. Good to hear. I'll be using this version myself :)

Out of curiosity, how long does this latest version take on your hardware?
Here you go:

Code: [Select]
$ time treegen vlc-dev.tcz >/tmp/vlc-dev.tcz.tree
real 0m 0.67s
user 0m 0.40s
sys 0m 0.22s
So 1.86 sec before, 0.67 sec now. Your optimizations help a whole lot: The new version gets the job done in less than half the time, even with the call to gsub.

Thank you for your collaboration with this. Always a pleasure.

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 12734
Re: script that generates .tree files
« Reply #31 on: May 03, 2026, 08:39:16 AM »
Hi GNUser
Beautiful.

Overall, I'd say your script stacks up quite well against the C++ version:
Code: [Select]
                                     awk            C++
Size                                 ~900 bytes     ~25K bytes
Time to generate vlc-dev.tcz.tree    0.67 Secs.     0.31 Secs.

Online Paul_123

  • Administrator
  • Hero Member
  • *****
  • Posts: 1543
Re: script that generates .tree files
« Reply #32 on: May 03, 2026, 09:55:18 AM »
For processing hundreds of tree files, I’ll take the speed of C++, but was there a robustness change needed for the repo, related to spaces and/or blank lines?

Offline GNUser

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 1842
Re: script that generates .tree files
« Reply #33 on: May 03, 2026, 08:01:39 PM »
was there a robustness change needed for the repo, related to spaces and/or blank lines?
Hi Paul_123. Yes, trailing whitespace in .dep files is causing problems for the C++ version of treegen. See Rich's analysis in Reply #20.
« Last Edit: May 03, 2026, 08:05:56 PM by GNUser »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 12734
Re: script that generates .tree files
« Reply #34 on: May 03, 2026, 08:24:40 PM »
Hi Paul_123
The C++ version does not recurse on entries that contain spaces.
From the TC17 x86_64 repo.
This is libsndfile-dev.tcz.dep with spaces converted to underscores:
Code: [Select]
rich@tcbox:~/libsndfile$ cat libsndfile-dev.tcz.dep | tr " " "_"
libsndfile.tcz
flac-dev.tcz_
libvorbis-dev.tcz___
The two -dev entries have trailing spaces.
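The same check can be done programmatically. Below is a minimal sketch in C, assuming each .dep entry is already available as a string without its newline; `trailing_ws` is a hypothetical helper for illustration and is not part of treegen.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of treegen): returns how many
 * whitespace characters sit at the end of a .dep entry. */
size_t trailing_ws(const char *entry) {
        size_t len = strlen(entry);
        size_t n = 0;

        /* Walk backwards from the last character while it is whitespace. */
        while (n < len && isspace((unsigned char)entry[len - 1 - n]))
                n++;
        return n;
}
```

For the three entries above, `trailing_ws("libsndfile.tcz")` returns 0, `trailing_ws("flac-dev.tcz ")` returns 1, and `trailing_ws("libvorbis-dev.tcz   ")` returns 3.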

This is the output of treegen:
Code: [Select]
rich@tcbox:~/libsndfile$ treegen libsndfile-dev.tcz 6.18.2-tinycore64
libsndfile-dev.tcz
   libsndfile.tcz
      flac.tcz
         libogg.tcz
      libvorbis.tcz
         libogg.tcz
      opus.tcz
      libmpg123.tcz
      lame.tcz
   flac-dev.tcz
   libvorbis-dev.tcz
The two dev entries get listed but not their dependencies.

Running treegen on those two entries shows they have dependencies:
Code: [Select]
rich@tcbox:~/libsndfile$ treegen flac-dev.tcz 6.18.2-tinycore64
flac-dev.tcz
   flac.tcz
      libogg.tcz
   libogg-dev.tcz
      libogg.tcz

rich@tcbox:~/libsndfile$ treegen libvorbis-dev.tcz 6.18.2-tinycore64
libvorbis-dev.tcz
   libvorbis.tcz
      libogg.tcz
   libogg-dev.tcz
      libogg.tcz

Online Paul_123

  • Administrator
  • Hero Member
  • *****
  • Posts: 1543
Re: script that generates .tree files
« Reply #35 on: May 03, 2026, 10:51:30 PM »
That's simple enough to fix. We should just ignore whitespace: leading, trailing, or a line that is otherwise blank (stray tabs or \r as well).

Change the "nukenewline" function to:

Code: [Select]
static void nukenewline_space(char buf[]) {
        unsigned src, dst;

        for (src = 0, dst = 0; buf[src] != '\0'; src++) {
                if (buf[src] == '\n')
                        break;
                if (isspace((unsigned char)buf[src]))
                        continue;
                buf[dst++] = buf[src];
        }
        buf[dst] = '\0';
}

Need to add ctype.h as an include too.
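To try the function outside treegen, here is a small self-contained sketch. The function body is the one posted above with its include added; `show_cleaned` and its 256-byte buffer are hypothetical, added only for the demonstration.

```c
#include <ctype.h>
#include <stdio.h>

/* As posted above: drop every whitespace character, stop at the first newline. */
static void nukenewline_space(char buf[]) {
        unsigned src, dst;

        for (src = 0, dst = 0; buf[src] != '\0'; src++) {
                if (buf[src] == '\n')
                        break;          /* end of the entry */
                if (isspace((unsigned char)buf[src]))
                        continue;       /* skip spaces, tabs, \r */
                buf[dst++] = buf[src];  /* keep everything else */
        }
        buf[dst] = '\0';
}

/* Hypothetical demo helper: cleans a copy of `line` and prints it in brackets. */
void show_cleaned(const char *line) {
        char buf[256];

        snprintf(buf, sizeof buf, "%s", line);
        nukenewline_space(buf);
        printf("[%s]\n", buf);
}
```

With this, `show_cleaned("flac-dev.tcz \n")` prints `[flac-dev.tcz]` and `show_cleaned(" \t\r\n")` prints `[]`. Note that whitespace inside a name is removed as well, not just leading and trailing whitespace, and anything after the first newline is dropped.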

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 12734
Re: script that generates .tree files
« Reply #36 on: Today at 11:42:46 AM »
Hi Paul_123
I think that looks OK.

If I'm reading it right:
1. You first test for a newline and break, so that isspace can't remove it.
2. Then isspace is used to skip past any whitespace character; only the src index is incremented.
3. Otherwise the character at the src index is copied to the dest index, then the dest index is incremented.
4. Repeat those steps until the character at the src index is the string terminator.
5. Write the string terminator at the dest index.

At that point I guess the calling function discards any strings containing only
a newline character.

Online Paul_123

  • Administrator
  • Hero Member
  • *****
  • Posts: 1543
Re: script that generates .tree files
« Reply #37 on: Today at 12:22:47 PM »
If I'm reading it right:

Yes. When the newline is encountered, that is the end of processing; there is no need to waste time on anything else. Only non-whitespace characters are left in the string. If the current character is not whitespace, it is copied from the source index to the destination index.


At that point I guess the calling function discards any strings containing only
a newline character.

Yes. In the calling function, right after the string is processed for newlines (and now whitespace), there is a check to make sure the string length is not less than 4 (.tcz).
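The length check described here could look something like the sketch below. This is not the actual treegen caller: `drop_short_entries` and `MIN_ENTRY_LEN` are hypothetical names, and the entries are assumed to have been cleaned of whitespace and newlines already.

```c
#include <stddef.h>
#include <string.h>

/* ".tcz" is 4 characters, so nothing shorter can name an extension. */
#define MIN_ENTRY_LEN 4

/* Hypothetical helper: keeps only entries of at least MIN_ENTRY_LEN
 * characters, compacting the array in place. Returns how many were kept. */
size_t drop_short_entries(const char *entries[], size_t count) {
        size_t src, dst;

        for (src = 0, dst = 0; src < count; src++) {
                if (strlen(entries[src]) < MIN_ENTRY_LEN)
                        continue;       /* blank or too short: discard */
                entries[dst++] = entries[src];
        }
        return dst;
}
```

A line that held only whitespace is an empty string by the time it reaches this check, so it is discarded along with any other entry shorter than 4 characters.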