
Author Topic: How to find which extension provides a file  (Read 4959 times)

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
How to find which extension provides a file
« on: November 13, 2024, 09:53:34 AM »
Hi gadget42
... when i use the "provides" function and search for "ps" it reports many results. how would the uninitiated decide which to use? ...
Good question.

The following was performed on TC10 x86 using the command line for
provides instead of the GUI. The only difference is the command line
requires quotes and the GUI can not handle quotes.

Trying to find which extension provides a program can be a frustrating
experience, especially with short names like ps or dd which will return
many results. A search for ps returned 580 results:
Code: [Select]
tc@E310:~$ provides.sh ps | wc -l
580
tc@E310:~$

The reason for so many matches is provides searches for a match anywhere
in the string, not an exact match. Searching for the calendar program called
cal returns 2661 results:
Code: [Select]
tc@E310:~$ provides.sh cal | wc -l
2661
tc@E310:~$
That's because it's also matching on every  usr/local/  string in the provides.db
file. That's important because we can use that to create a more precise match by
including some path information.

We know cal is a program so it's probably in a bin or sbin directory. We can cover
both cases with a search like this:
Code: [Select]
tc@E310:~$ provides.sh "bin\/cal"
ax25-apps.tcz
fox-apps.tcz
fox.tcz
util-linux.tcz
valgrind.tcz
xastir.tcz
tc@E310:~$
Notice we need to escape the path separator  "/"  with a backslash  "\".
As I said earlier, command line, use quotes. GUI, don't use quotes.
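To see why adding path information narrows the results, here is a minimal sketch that runs the same kind of awk record search against a tiny made-up database instead of the real provides.db. The contents of sample_db and the function name search are invented for illustration, and the pattern is handed to awk through the environment, so no backslash escaping is needed in this version.

```shell
#!/bin/sh
# Hypothetical three-record stand-in for provides.db (blank line between records).
sample_db() {
cat << 'EOF'
util-linux.tcz
/usr/local/bin/cal
/usr/local/bin/dmesg

fox.tcz
/usr/local/bin/calculator

nano.tcz
/usr/local/bin/nano
EOF
}

# Print the first line (the extension name) of every record matching the regex in $1.
search() {
    PAT="$1" awk 'BEGIN {FS="\n"; RS=""} $0 ~ ENVIRON["PAT"] {print $1}'
}

sample_db | search "cal"       # all three records: "cal" also occurs inside "local"
sample_db | search "bin/cal"   # util-linux.tcz and fox.tcz only
```

The second search still returns fox.tcz because the match is not anchored at the end of the file name.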

Let's see what happens for ps:
Code: [Select]
tc@E310:~$ provides.sh "bin\/ps"
aspell-dev.tcz
ghostscript.tcz
gnutls3.6.tcz
gnutls.tcz
lcms2.tcz
libcap-ng.tcz
libpsl.tcz
pax-utils.tcz
postgresql-10-client.tcz
postgresql-10.tcz
postgresql-11-client.tcz
postgresql-11.tcz
postgresql-12-client.tcz
postgresql-12.tcz
postgresql-9.5-client.tcz
postgresql-9.5.tcz
postgresql-9.6-client.tcz
postgresql-9.6.tcz
procps-ng.tcz
procps.tcz
pstree.tcz
putty.tcz
sc.tcz
tc@E310:~$
23 results.

Let's lengthen the path a little:
Code: [Select]
tc@E310:~$ provides.sh "local\/bin\/ps"
aspell-dev.tcz
ghostscript.tcz
gnutls3.6.tcz
gnutls.tcz
lcms2.tcz
libcap-ng.tcz
libpsl.tcz
pax-utils.tcz
procps-ng.tcz
procps.tcz
pstree.tcz
putty.tcz
sc.tcz
tc@E310:~$
That's a little better.

The previous search was picking up this from the postgresql extensions:
/usr/local/pgsql10/bin/psql


That brings me to another point.
Some extensions have another sub-directory between local and bin, sbin, or lib
as you just saw:
/usr/local/pgsql10/bin/psql

Just something to keep in mind when performing searches.
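If you do want those extra sub-directories included, one option is a pattern that allows at most one directory level between local and bin. This is only a sketch against invented sample data. The function name search_local_bin is hypothetical, and with the stock provides.sh the slashes in such a pattern would presumably still need escaping as shown earlier.

```shell
#!/bin/sh
# Hypothetical stand-in for provides.db.
sample_db() {
cat << 'EOF'
postgresql-10.tcz
/usr/local/pgsql10/bin/psql

procps-ng.tcz
/usr/local/bin/ps

ghostscript.tcz
/usr/local/bin/ps2pdf
EOF
}

# Match "local/", then optionally one more directory level, then "bin/NAME".
# [^/\n]+ keeps the optional level from running across a "/" or a line break.
search_local_bin() {
    NAME="$1" awk 'BEGIN {FS="\n"; RS=""}
        BEGIN { pat = "local/([^/\n]+/)?bin/" ENVIRON["NAME"] }
        $0 ~ pat {print $1}'
}

sample_db | search_local_bin ps     # all three extensions
sample_db | search_local_bin psql   # postgresql-10.tcz only
```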

Offline Leee

  • Full Member
  • ***
  • Posts: 125
Re: How to find which extension provides a file
« Reply #1 on: November 13, 2024, 05:08:01 PM »
While browsing through provides.db, I note that most file paths start with /usr/ but in a few extensions they start with usr/ or ./usr/
Is this legit? irrelevant? indicative of a problem?
examples:  base-locale.tcz uses ./usr/  bash-completion.tcz uses usr/
core 15.0 x86_64

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
Re: How to find which extension provides a file
« Reply #2 on: November 13, 2024, 09:47:48 PM »
Hi Leee
Yes, I've noticed that too. Whether tce-load does a copy to file
system or creates symlinks to the file system, the destination
is always "/".

That would translate to  //usr/ , /usr/ , or /./usr/ , all of which resolve
to the same location. Extra consecutive slashes are ignored. The
"./" means the current directory. So "/./" becomes root "current directory"
which is still the root directory. Hope that makes sense. Try this:
Code: [Select]
cd /./
then do a directory listing and you'll see you're in the root directory.
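A quick way to check that the different spellings name the same directory is to compare inode numbers. This is a small sketch. inode_of is a made-up helper name, and it assumes /usr exists on the machine where you run it.

```shell
#!/bin/sh
# Print the inode number of the directory named by $1.
inode_of() {
    ls -di "$1" | awk '{print $1}'
}

# All three spellings should report the same inode number.
for p in /usr //usr /./usr; do
    echo "$p -> inode $(inode_of "$p")"
done
```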

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
Re: How to find which extension provides a file
« Reply #3 on: November 14, 2024, 01:01:01 AM »
Hi gadget42
I've created a modified provides.sh script and copied it
to /usr/bin/provides.sh. Then I did:
Code: [Select]
sudo ln -rs /usr/bin/provides.sh /usr/bin/Provides.sh
tc@E310:~/Scripting/Provides$ ls -l /usr/bin/Provides.sh
lrwxrwxrwx 1 root root 11 Nov 14 00:02 /usr/bin/Provides.sh -> provides.sh

When called as provides.sh, it behaves as it always has, so the Apps GUI
won't know the difference.

When called as Provides.sh, it behaves like an anchor was added to
the end of the search term. So a search for ps would reject ps2pdf
because it does not end in ps.
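The trick of one script behaving differently depending on the name it was called by can be tried in isolation. The sketch below uses throwaway file names (lower.sh and Upper.sh are invented) and only echoes which branch was taken. It is not the real provides.sh.

```shell
#!/bin/sh
# Throwaway directory holding a script and a symlink that points at it.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

cat << 'EOF' > "$dir/lower.sh"
# ${0##*/} removes everything up to the last "/" of the invoked path.
if [ "${0##*/}" = "lower.sh" ]; then
    echo "substring search"
else
    echo "anchored search"
fi
EOF
ln -s lower.sh "$dir/Upper.sh"

sh "$dir/lower.sh"   # prints: substring search
sh "$dir/Upper.sh"   # prints: anchored search
```

Both names run the same file. Only the value of $0 differs, which is what the script tests.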

Here is the search for ps repeated with both versions:
Code: [Select]
tc@E310:~/Scripting/Provides$ time provides.sh "\/local\/bin\/ps"
aspell-dev.tcz
ghostscript.tcz
gnutls3.6.tcz
gnutls.tcz
lcms2.tcz
libcap-ng.tcz
libpsl.tcz
pax-utils.tcz
procps-ng.tcz
procps.tcz
pstree.tcz
putty.tcz
sc.tcz
real    0m 1.81s
user    0m 0.90s
sys     0m 0.18s

Code: [Select]
tc@E310:~/Scripting/Provides$ time Provides.sh "\/local\/bin\/ps"
procps-ng.tcz
procps.tcz
real    0m 1.92s
user    0m 1.00s
sys     0m 0.20s

Here is the search for cal repeated with both versions:
Code: [Select]
tc@E310:~/Scripting/Provides$ time provides.sh "\/local\/bin\/cal"
ax25-apps.tcz
fox-apps.tcz
fox.tcz
util-linux.tcz
valgrind.tcz
xastir.tcz
real    0m 1.99s
user    0m 0.86s
sys     0m 0.24s

Code: [Select]
tc@E310:~/Scripting/Provides$ time Provides.sh "\/local\/bin\/cal"
util-linux.tcz
real    0m 1.96s
user    0m 0.98s
sys     0m 0.23s

This is the modified provides.sh:
Code: [Select]
#!/bin/busybox ash
. /etc/init.d/tc-functions
useBusybox

TARGET="$1"
[ -z "$TARGET" ] && exit 1

TCEDIR="/etc/sysconfig/tcedir"
DB="provides.db"

getMirror
cd "$TCEDIR"
if zsync -i "$TCEDIR"/"$DB" -q "$MIRROR"/"$DB".zsync
then
  rm -f "$DB".zs-old
else
  if [ ! -f "$TCEDIR"/"$DB" ]
  then
    wget -O "$TCEDIR"/"$DB".gz "$MIRROR"/"$DB".gz
    gunzip "$TCEDIR"/"$DB".gz
  fi
fi
cd - > /dev/null

if [ "${0##*/}" == "provides.sh" ]
then
  # Called as provides.sh: match the search term anywhere in the record.
  awk 'BEGIN {FS="\n";RS=""} /'${TARGET}'/{print $1}' "$TCEDIR"/"$DB"
else
  # Called by any other name (e.g. Provides.sh): the term must end a file line.
  awk -v Find="$TARGET" -v FindEnd="$TARGET$" 'BEGIN {FS="\n";RS=""} {if ( $0 ~ Find ) { for (i=2; i <= NF; i++) { if ( $i ~ FindEnd ) {print $1; break} } } }' "$TCEDIR"/"$DB"
fi

chmod g+rw "$TCEDIR"/"$DB"

What I don't understand is the purpose of the chmod command
at the end that executes after the script has finished running.
« Last Edit: November 18, 2024, 12:54:52 AM by Rich »

Online polikuo

  • Hero Member
  • *****
  • Posts: 759
Re: How to find which extension provides a file
« Reply #4 on: November 14, 2024, 02:56:08 AM »
Hi

This is what I do in my systems.
Code: [Select]
#!/bin/sh
[ -z "$1" ] && exit 1

TCEDIR="/etc/sysconfig/tcedir"
DB="provides.db"

awk 'BEGIN {FS="\n";RS=""} /'"$1"'/{print $1}' "$TCEDIR"/"$DB"

I call it /usr/local/bin/prov, because it's a shrunken version of provides.sh that doesn't sync with the repo. (syncing takes time, especially if you're searching frequently.)

While searching, you can add "\n" to the end of the query.
Code: [Select]
tc@pi4:~$ prov 'bin\/ps\n'
procps-ng.tcz

You can do the same with the regular provides.sh (or the app browser) as well.

What I don't understand is the purpose of the chmod command
at the end that executes after the script has finished running.

I believe it was meant to make sure the file is modifiable by any user to sync with the repo.
« Last Edit: November 14, 2024, 03:02:07 AM by polikuo »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
Re: How to find which extension provides a file
« Reply #5 on: November 14, 2024, 11:28:21 AM »
Hi polikuo
... While searching, you can add "\n" to the end of the query. ...
I was so focused on figuring out how to include $ as an anchor that I
forgot to try \n. Thank you for pointing out the obvious and getting
me back on track.

Quote
... I believe it was meant to make sure the file is modifiable by any user to sync with the repo.
Makes sense. It just seemed like an odd place to put that line.

Looks like I spoke too soon. It seems using \n does not work if the item
you are searching for is in the last field of the record.

This is the record for squashfs-tools.tcz:
Code: [Select]
squashfs-tools.tcz
/usr/local/bin/mksquashfs
/usr/local/bin/unsquashfs

Searching for mksquashfs works:
Code: [Select]
tc@E310:~/Scripting/Provides$ provides.sh "bin\/mksquashfs\n"
squashfs-tools.tcz
tc@E310:~/Scripting/Provides$

Searching for unsquashfs (last field) fails:
Code: [Select]
tc@E310:~/Scripting/Provides$ provides.sh "bin\/unsquashfs\n"
tc@E310:~/Scripting/Provides$

Online polikuo

  • Hero Member
  • *****
  • Posts: 759
Re: How to find which extension provides a file
« Reply #6 on: November 14, 2024, 12:46:38 PM »
It seems using \n does not work if the item
you are searching for is in the last field of the record.

Hi, Rich.
Interesting...
This is what I get on piCore64 15.0 (squashfs-tools.tcz has extra files, so I'm using tar.tcz to demonstrate)

Code: [Select]
tc@pi4:/etc/sysconfig/tcedir$ grep -A3 squashfs provides.db
squashfs-tools.tcz
usr/local/bin/mksquashfs
usr/local/bin/unsquashfs
usr/local/bin/sqfscat
usr/local/bin/sqfstar

tc@pi4:/etc/sysconfig/tcedir$ grep -A3 'tar.tcz' provides.db
tar.tcz
usr/local/lib/tar/rmt
usr/local/bin/tar

tc@pi4:/etc/sysconfig/tcedir$ prov 'bin\/tar\n'
tc@pi4:/etc/sysconfig/tcedir$ prov 'bin\/tar$'
tar.tcz
tc@pi4:/etc/sysconfig/tcedir$ prov 'bin\/tar\n/||/bin\/tar$'
tar.tcz
tc@pi4:/etc/sysconfig/tcedir$ provides.sh 'bin\/tar\n'
tc@pi4:/etc/sysconfig/tcedir$ provides.sh 'bin\/tar$'
tar.tcz
tc@pi4:/etc/sysconfig/tcedir$ provides.sh 'bin\/tar\n/||/bin\/tar$'
tar.tcz

This is getting ridiculous...

Code: [Select]
tc@pi4:~$ p(){ awk 'BEGIN {FS="\n";RS=""} /'"$1"'\/'"$2"'\n/||/'"$1"'\/'"$2"'$/{print $1}' /etc/sysconfig/tcedir/provides.db; }
tc@pi4:~$ p bin tar
tar.tcz

Shrink the line
Code: [Select]
tc@pi4:~$ p(){ awk -v FS="\n" -v RS="" "/$1\/$2\n/||/$1\/$2$/{print \$1}" /etc/sysconfig/tcedir/provides.db; }
tc@pi4:~$ p bin tar
tar.tcz
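A variant of the same idea, sketched here against invented sample data, passes the two arguments to awk as data instead of splicing them into the awk program text. The name p2 and the sample records are hypothetical. Regex metacharacters in the arguments are still interpreted as such.

```shell
#!/bin/sh
# Hypothetical stand-in for provides.db.
sample_db() {
cat << 'EOF'
tar.tcz
usr/local/lib/tar/rmt
usr/local/bin/tar

wget.tcz
/usr/local/etc/wgetrc
/usr/local/bin/wget
EOF
}

# p2 DIR NAME: match DIR/NAME followed by a newline or by the end of the record.
p2() {
    D="$1" N="$2" awk 'BEGIN {FS="\n"; RS=""; t = ENVIRON["D"] "/" ENVIRON["N"]}
        $0 ~ (t "\n") || $0 ~ (t "$") {print $1}'
}

sample_db | p2 bin tar      # tar.tcz (file is the last line of its record)
sample_db | p2 etc wgetrc   # wget.tcz (file is in the middle of its record)
sample_db | p2 bin wge      # no output: "wge" is not a complete file name
```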

Other cases

Code: [Select]
tc@pi4:/etc/sysconfig/tcedir$ grep -A3 submitqc.tcz provides.db
submitqc.tcz
/usr/local/bin/submitqc
/usr/local/share/doc/submitqc/COPYING

tc@pi4:/etc/sysconfig/tcedir$ p bin submitqc
submitqc.tcz
tc@pi4:/etc/sysconfig/tcedir$ p submitqc COPYING
submitqc.tcz
tc@pi4:/etc/sysconfig/tcedir$ grep -A3 wget.tcz provides.db
wget.tcz
/usr/local/etc/wgetrc
/usr/local/bin/wget

tc@pi4:/etc/sysconfig/tcedir$ p bin wget
wget.tcz
tc@pi4:/etc/sysconfig/tcedir$ p etc wgetrc
wget.tcz
« Last Edit: November 14, 2024, 01:14:48 PM by polikuo »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
Re: How to find which extension provides a file
« Reply #7 on: November 14, 2024, 01:48:55 PM »
Hi polikuo
That looks like it works, but I don't think a user should be
forced to use an exact match. A user may not be sure what
they are searching for. For example, they might want to
find all extensions with files that have crypt in their name.
I've done those types of searches myself on occasion.

If a person needs to get more specific with the path, they
still need to do this:
Code: [Select]
p "local\/bin" ps
These are not criticisms, just observations. Your awk solution
appears to be more efficient and shorter than mine. Though
mine doesn't need quotes for some reason:
Code: [Select]
tc@E310:~$ Provides.sh local\/bin\/ps
procps-ng.tcz
procps.tcz
tc@E310:~$

Offline Leee

  • Full Member
  • ***
  • Posts: 125
Re: How to find which extension provides a file
« Reply #8 on: November 14, 2024, 01:58:48 PM »
As so often happens, I learned a lot while messing around with provides.sh.
First, while the format of provides.db is easy to produce, doesn't contain a lot of redundant data (it does not have the TCZ name at the start of each line), and remains human readable (unlike, for instance, an SQL database table), it really doesn't lend itself to simple searches.
Second, mucking around with FS and RS so that each record -does- begin with the TCZ name. Now -that- is neat.
Third, as usual, I learned a little more about awk. Alas, I haven't yet learned enough about awk (I find the syntax to be... wait for it... awkward!) so...
I tried coding the exact search, using other tools, to search for lines ending in /${TARGET} and it worked perfectly... but took four minutes to run.
I'm going to continue following this thread (big thanks to all in the thread) and continue playing with my own script... but that will have to be -after- I finish dealing with some "old, failing hardware" issues.
core 15.0 x86_64

Offline Leee

  • Full Member
  • ***
  • Posts: 125
Re: How to find which extension provides a file
« Reply #9 on: November 14, 2024, 02:23:16 PM »
Hi polikuo
That looks like it works, but I don't think a user should be
forced to use an exact match. A user may not be sure what
they are searching for. For example, they might want to
find all extensions with files that have crypt in their name.
I've done those types of searches myself on occasion.
I think any search tool should default to an exact search, possibly with an option for a fuzzy search.  Less ideal (IMO) would be defaulting to a fuzzy search with an option for an exact search.  Having only a fuzzy search is just frustrating.
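For what it's worth, a default-exact search with an opt-in fuzzy mode could look something like the sketch below. The name prov2, the -f flag, and the sample data are all invented here. It compares plain strings with substr and index rather than using regular expressions, so nothing needs escaping, and it checks one file line at a time, so expect it to be slower than a whole-record match on the real provides.db.

```shell
#!/bin/sh
# Hypothetical stand-in for provides.db.
sample_db() {
cat << 'EOF'
procps-ng.tcz
/usr/local/bin/ps

ghostscript.tcz
/usr/local/bin/ps2pdf

pstree.tcz
/usr/local/bin/pstree
EOF
}

# prov2 [-f] NAME: exact file-name match by default, substring match with -f.
prov2() {
    mode=exact
    if [ "$1" = "-f" ]; then mode=fuzzy; shift; fi
    MODE="$mode" T="$1" awk 'BEGIN {FS="\n"; RS=""; t = ENVIRON["T"]; n = length(t)}
    {
        for (i = 2; i <= NF; i++) {
            if (ENVIRON["MODE"] == "fuzzy")
                hit = (index($i, t) > 0)                 # NAME anywhere in the line
            else
                hit = (length($i) > n && substr($i, length($i) - n) == "/" t)  # line ends in /NAME
            if (hit) { print $1; break }
        }
    }'
}

sample_db | prov2 ps      # procps-ng.tcz
sample_db | prov2 -f ps   # all three extensions
```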

In the meantime, looking at provides.sh and realizing that it uses awk provides the clue that a regular slash character ( / ) needs to be escaped. That makes the stock provides.sh usable with, for instance, provides.sh "ps \/ps\n" for an exact search.  I would never otherwise have thought to escape the slash character.
core 15.0 x86_64

Offline gadget42

  • Hero Member
  • *****
  • Posts: 838
Re: How to find which extension provides a file
« Reply #10 on: November 14, 2024, 03:05:36 PM »
this forum is awesome...and at the same time humbling(much to learn grasshopper...sigh).

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11724
Re: How to find which extension provides a file
« Reply #11 on: November 14, 2024, 05:05:34 PM »
Hi Leee
... I tried coding the exact search, using other tools, to search for lines ending in /${TARGET} and it worked perfectly... but took four minutes to run. ...
I once tried that too, but nothing came close to touching the speed of awk.
It's because awk is searching the entire record for a match, not individual
fields. A record looks like this:
Code: [Select]
Field1
Field2
Field3

Field1 is the .tcz name. Field2 and Field3 are filenames. The blank line at
the end marks the end of the record. In awk land, {print $1} means print
the contents of Field1. $2 would be Field2, and so on.

Not shown is $0 which represents the entire record and looks like this:
Code: [Select]
Field1\nField2\nField3
The record is treated as one long string with embedded \n field
separators, which is what gets searched. Field3 has no field separator
because it's the end of the record. If I change {print $1} to {printf $0}
the whole record gets printed and printf doesn't add any \n characters:
Code: [Select]
tc@E310:~/Scripting/Provides$ ./prov.sh "bin\/egrep\n"
grep.tcz
/usr/local/bin/egrep
/usr/local/bin/fgrep
/usr/local/bin/greptc@E310:~/Scripting/Provides$
You can see it print Field1\nField2\nField3\n but Field4 is immediately
followed by the command prompt, which explains why search terms
in the last field were not found in reply #5.

It's because awk is scanning an entire record in one go that it's so fast.
If you tell awk to search records one field at a time, it too will run slower.
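To make the record layout visible without relying on where the prompt lands, here is a small sketch that prints each record on one line with its embedded newlines written out as a literal \n. The sample data and the function name show_records are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for provides.db.
sample_db() {
cat << 'EOF'
squashfs-tools.tcz
/usr/local/bin/mksquashfs
/usr/local/bin/unsquashfs

grep.tcz
/usr/local/bin/egrep
/usr/local/bin/grep
EOF
}

# Print every record on a single line, writing a visible \n between fields.
# Nothing is written after the last field except the real line break.
show_records() {
    awk 'BEGIN {FS="\n"; RS=""} {
        for (i = 1; i <= NF; i++) printf "%s%s", $i, (i < NF ? "\\n" : "\n")
    }'
}

sample_db | show_records
```

The last file name in each output line has no \n after it, which is the case a search term ending in \n cannot match.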

Offline patrikg

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 726
Re: How to find which extension provides a file
« Reply #12 on: November 14, 2024, 05:24:28 PM »
Maybe "someone" (= me, right now :) has a little time to make a script that runs unsquashfs -l tcefile.tce to list the files in the tce files and builds a db from that. Searching with sqlite3 is very fast.

Here you go: make.sh creates a db and fills in the file names and tcz package names.
search.sh lets you search the db; you can edit the file to suit your needs.

Happy hacking:

Code: [Select]
make.sh
cat << 'EOF' > make.sh
#!/bin/bash
sqlite3 files.db "CREATE TABLE tcz_files (name TEXT NOT NULL,file TEXT NOT NULL);"
# Loop over every extension in the current directory.
for tczfile in *.tcz
do
   # One row per entry listed by unsquashfs; only the base name is stored.
   for file in $(unsquashfs -lc "$tczfile")
   do
      sqlite3 files.db "INSERT INTO tcz_files ( NAME , FILE ) VALUES ( '$tczfile' , '$(basename "$file")' );"
   done
done
EOF

chmod u+x make.sh

search.sh
cat << 'EOF' > search.sh
sqlite3 -box files.db "SELECT * FROM tcz_files WHERE file LIKE '%$1%';"
EOF

chmod u+x search.sh

ls
files.db  gimp2.tcz  make.sh  search.sh  sqlite3-bin.tcz  sqlite3.tcz

./search.sh pdf
┌───────────┬───────────────┐
│   name    │     file      │
├───────────┼───────────────┤
│ gimp2.tcz │ file-pdf-load │
│ gimp2.tcz │ file-pdf-save │
└───────────┴───────────────┘

./search.sh sql
┌─────────────────┬─────────────────────┐
│      name       │        file         │
├─────────────────┼─────────────────────┤
│ sqlite3-bin.tcz │ sqlite3             │
│ sqlite3.tcz     │ libsqlite3.so       │
│ sqlite3.tcz     │ libsqlite3.so.0     │
│ sqlite3.tcz     │ libsqlite3.so.0.8.6 │
└─────────────────┴─────────────────────┘

sqlite3 -box files.db "SELECT COUNT(*) AS 'Number of filenames' FROM tcz_files;"
┌─────────────────────┐
│ Number of filenames │
├─────────────────────┤
│ 4191                │
└─────────────────────┘

ls -alh files.db
-rw-r--r-- 1 patrik users 164K 14 nov 22.34 files.db
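A possible variation, not part of the scripts above: the rows could also be generated from text in the provides.db layout and wrapped in a single transaction, so sqlite3 is started only once. This is a rough sketch under those assumptions. db_to_sql is a hypothetical name, it stores the full path rather than the base name, and it only prints SQL, so it has not been timed against a real database.

```shell
#!/bin/sh
# Turn provides.db-style text on stdin into SQL for the tcz_files table.
db_to_sql() {
    awk 'BEGIN {
            FS = "\n"; RS = ""
            q = sprintf("%c", 39)        # q holds one single-quote character
            print "BEGIN TRANSACTION;"
        }
        {
            for (i = 2; i <= NF; i++) {
                name = $1; file = $i
                gsub(q, q q, name); gsub(q, q q, file)   # double embedded quotes for SQL
                printf "INSERT INTO tcz_files (name, file) VALUES (%s%s%s, %s%s%s);\n", q, name, q, q, file, q
            }
        }
        END { print "COMMIT;" }'
}

# Small demonstration with one made-up record.
printf 'wget.tcz\n/usr/local/etc/wgetrc\n/usr/local/bin/wget\n' | db_to_sql
```

Once the table exists, the output could be piped into sqlite3 files.db.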



« Last Edit: November 14, 2024, 05:30:06 PM by patrikg »

Offline yvs

  • Jr. Member
  • **
  • Posts: 70
Re: How to find which extension provides a file
« Reply #13 on: November 14, 2024, 09:30:07 PM »
I think the number of files is not so big, so even a grep/sed pass through a text file takes negligible time.
For example, using a sed search in the shell's command_not_found_handler() I got:
Code: [Select]
% as
as found in: binutils

% time as
as found in: binutils
0.00s user 0.01s system 97% cpu 0.007 total

% time cpp
cpp found in: gcc
0.00s user 0.00s system 97% cpu 0.008 total

% time users
users found in: coreutils
0.01s user 0.00s system 95% cpu 0.008 total

i.e. it's about 10msec (time comparable to switching between contexts)

Offline nick65go

  • Hero Member
  • *****
  • Posts: 841
Re: How to find which extension provides a file
« Reply #14 on: November 15, 2024, 09:34:49 AM »
So, maybe it's premature, but is there a conclusion?
Is the winner sqlite (compiled), awk (compiled), or sh shell scripts?

Or, if speed is not the main goal (though it's nice for multi-core CPUs), then maybe user comfort with the search parameters matters more (GUI options would be nice too). But of course, the pro[fessionals] already know how to use "" and back-quotes, etc.
« Last Edit: November 15, 2024, 09:38:59 AM by nick65go »