Tiny Core Linux
General TC => Programming & Scripting - Unofficial => Topic started by: Juanito on November 17, 2017, 03:23:18 AM
-
As I'm hopeless at scripting, help would be appreciated with a script to remove superfluous directory names from an extension file list.
As an example, to go from this:
/usr
/usr/local
/usr/local/include
/usr/local/include/zbuff.h
/usr/local/include/zdict.h
/usr/local/include/zstd.h
/usr/local/include/zstd_errors.h
/usr/local/lib
/usr/local/lib/pkgconfig
/usr/local/lib/pkgconfig/libzstd.pc
..to this:
/usr/local/include/zbuff.h
/usr/local/include/zdict.h
/usr/local/include/zstd.h
/usr/local/include/zstd_errors.h
/usr/local/lib/pkgconfig/libzstd.pc
-
It looks like these lines were generated with the "find" command.
How about:
find /usr -not -type d
I'm having dinner at the moment, I'll try to script with this as soon as possible.
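A minimal sketch of that idea, assuming the extension's contents are available as a directory tree (for instance a loop-mounted or unpacked extension). The helper name list_non_dirs is made up for illustration, and it uses the POSIX "!" instead of "-not" so it does not depend on GNU find:

```shell
#!/bin/sh
# Hypothetical helper: print every non-directory entry below the tree
# given as $1, with paths written relative to that tree's root.
list_non_dirs() {
    # Run find from inside the tree so the output starts with "./",
    # then strip the leading "." to get paths such as /usr/local/...
    ( cd "$1" && find . ! -type d | sed 's#^\.##' | sort )
}
```

Usage would be something like "list_non_dirs /tmp/tcloop/zstd-dev", where the mount point is an assumption about where the extension is loop-mounted. Empty directories are not listed by this approach.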
-
They were generated with the "unsquashfs -l -d ''" command
-
Pipe to awk, like this:
echo "$data" | awk '/\./{print $0}'
This selects the lines containing a '.' (dot) character:
echo "$data" | awk '/\./{print $0}'
/usr/local/include/zbuff.h
/usr/local/include/zdict.h
/usr/local/include/zstd.h
/usr/local/include/zstd_errors.h
/usr/local/lib/pkgconfig/libzstd.pc
However, this will not work with:
-files with no '.ext' extension
-or folders containing a '.' char ...
-
They were generated with the "unsquashfs -l -d ''" command
Save this script as "strip-path.sh"
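#!/bin/sh
OUTPUT_DIR=/tmp/tcz-list
mkdir -p "$OUTPUT_DIR"
strip_path() {
# Read the whole listing as one record, one path per field.
awk 'BEGIN {FS="\n";RS=""} {
print $NF
# Walk backwards; print a line unless the line after it matches it.
for (i=NF;i>1;i--) {
if ($i !~ $(i - 1)) print $(i - 1)
}
}'
}
# Quote the arguments so paths containing spaces survive.
for TCZ in "$@"; do
unsquashfs -l "$TCZ" | grep 'squashfs-root/' | cut -d '/' -f 2- | strip_path > "${OUTPUT_DIR}/$(basename "$TCZ").list"
done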
To run the script
./strip-path.sh /etc/sysconfig/tcedir/optional/zstd*.tcz
Results
tc@box:/tmp/tcz-list$ ls
zstd-dev.tcz.list zstd.tcz.list
tc@box:/tmp/tcz-list$ cat zstd-dev.tcz.list
usr/local/lib/pkgconfig/libzstd.pc
usr/local/include/zstd_errors.h
usr/local/include/zstd.h
usr/local/include/zdict.h
usr/local/include/zbuff.h
tc@box:/tmp/tcz-list$ cat zstd.tcz.list
usr/local/bin/zstdmt
usr/local/bin/zstdless
usr/local/bin/zstdgrep
usr/local/bin/zstdcat
usr/local/bin/unzstd
Note that they're in reverse order.
If you're not OK with that, you can use the "tac" command from "coreutils.tcz" to flip it back, or this "sed" one-liner.
sed '1!G;h;$!d'
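For anyone wondering how that one-liner works, here is a small sketch. The wrapper name reverse_lines is only for illustration:

```shell
#!/bin/sh
# Hypothetical wrapper so the one-liner can sit at the end of a pipeline.
reverse_lines() {
    # 1!G : on every line except the first, append the hold space
    #       (the lines seen so far, already reversed) to the current line
    # h   : copy the result back into the hold space
    # $!d : delete (do not print) everything except the final result
    sed '1!G;h;$!d'
}

printf '%s\n' one two three | reverse_lines
# prints:
# three
# two
# one
```

It keeps the whole listing in sed's hold space, which is fine for file lists of this size.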
-
Hi Juanito
Here's my entry:
#!/bin/sh
SourceFile="$1"
SortedFile="sorted.lst"
DestFile="stripped.lst"
SubString=""
rm -f "$DestFile"
sort -o "$SortedFile" "$SourceFile"
while read -r String
do
case "$String" in
"$SubString"*)
;;
*)
echo "$SubString" >> "$DestFile"
;;
esac
SubString=$String
done < "$SortedFile"
echo "$SubString" >> "$DestFile"
Here's the result:
tc@box:~/remdir$
tc@box:~/remdir$ cat orig.lst
/usr/local/lib
/usr/local/include/zbuff.h
/usr/local/lib/pkgconfig/libzstd.pc
/usr/local/include/zdict.h
/usr/local
/usr/local/include/zstd.h
/usr
/usr/local/lib/pkgconfig
/usr/local/include
/usr/local/include/zstd_errors.h
tc@box:~/remdir$
tc@box:~/remdir$ ./RemoveDirs orig.lst
tc@box:~/remdir$
tc@box:~/remdir$ cat stripped.lst
/usr/local/include/zbuff.h
/usr/local/include/zdict.h
/usr/local/include/zstd.h
/usr/local/include/zstd_errors.h
/usr/local/lib/pkgconfig/libzstd.pc
tc@box:~/remdir$
tc@box:~/remdir$
The list is first sorted. Each line is then tested to see whether it is a prefix of the next line. If you have /usr/local, then local is either a file or a subdirectory; it can't be both. So if /usr/local is a prefix of the next line, it's a directory and gets discarded.
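One caveat with a plain prefix test is that a file whose name is the start of the next entry (say svn followed by svnadmin) would also be discarded. A possible variant, offered only as a sketch: it tests for the previous line plus a trailing "/", and reads from stdin instead of a file. The function name strip_dirs is made up, and the tr step is my own assumption for getting every directory sorted directly in front of its children:

```shell
#!/bin/sh
# Hypothetical helper: read a path list on stdin and print only the entries
# that are not the parent directory of the entry sorted right after them.
strip_dirs() {
    prev=""
    # Temporarily map "/" to byte 001 so that, in the C locale, a directory
    # always sorts immediately before its own children.
    tr '/' '\001' | LC_ALL=C sort | tr '\001' '/' | {
        while IFS= read -r line; do
            case "$line" in
                "$prev"/*) ;;   # prev is a parent directory of line: drop it
                *) if [ -n "$prev" ]; then printf '%s\n' "$prev"; fi ;;
            esac
            prev=$line
        done
        # The last entry has no successor, so it is always kept.
        if [ -n "$prev" ]; then printf '%s\n' "$prev"; fi
    }
}
```

Used as "strip_dirs < orig.lst > stripped.lst". An empty directory is kept, because nothing below it follows in the list.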
-
How about using the double l switch with unsquashfs:
unsquashfs -ll -d '' some-extension.tcz | grep -v '^d' | sed 's#.* /#/#'
-
Thanks for all the suggestions 🙂
-
I like this:
$ unsquashfs -ll -d '' svn.tcz | grep -v '^d' | sed 's#.* /#/#'
..but:
$ cat svn.tcz.list
Parallel unsquashfs: Using 4 processors
48 inodes (611 blocks) to write
/usr/local/bin/svn
/usr/local/bin/svnadmin
/usr/local/bin/svndumpfilter
/usr/local/bin/svnlook
/usr/local/bin/svnmucc
/usr/local/bin/svnrdump
/usr/local/bin/svnserve
/usr/local/bin/svnsync
/usr/local/bin/svnversion
/usr/local/lib/libsvn_client-1.so -> libsvn_client-1.so.0.0.0
/usr/local/lib/libsvn_client-1.so.0 -> libsvn_client-1.so.0.0.0
...
-
I knew the links would be that way when I tested it. I didn't know if that would be useful information or a problem, depending on how the resulting file list is used. Adding another regex to sed will clear that up:
unsquashfs -ll -d '' svn.tcz | grep -v '^d' | sed -e 's#.* /#/#' -e 's# -> .*##'
A question I also thought of when I was cobbling my script together is whether or not directories should be included if they are empty. With this pipeline no directories are listed at all, since grep removes every line that starts with 'd'.
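One possible hardening of that pipeline, as a sketch only. The helper name ll_to_paths is made up. It strips the " -> target" part first, so that a link pointing at an absolute path is not mistaken for the entry itself by the greedy ".* /" match, and it drops any line that carries no path at all (such as the "Parallel unsquashfs" banner):

```shell
#!/bin/sh
# Hypothetical helper: reduce "unsquashfs -ll -d ''" style lines on stdin
# to the bare paths of the non-directory entries.
ll_to_paths() {
    grep -v '^d' |
    sed -e '/ \//!d' \
        -e 's# -> .*##' \
        -e 's#.* /#/#'
    # 1st expression: delete lines that contain no " /" (banner, blank lines)
    # 2nd expression: cut off the symlink target
    # 3rd expression: remove permissions, owner, size and date before the path
}
```

It would be used as "unsquashfs -ll -d '' svn.tcz | ll_to_paths > svn.tcz.list". File names that themselves contain " /" or " -> " would still confuse it.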
-
I guess empty directories should be included if they actually exist and are required for things to work, but this would almost never be the case, as empty directories under /var, etc. should be created by a startup script.
-
I've adjusted my script so it won't accidentally drop lines we need. :)
awk '{print $1}' should drop any "link redirections".
#!/bin/sh
OUTPUT_DIR=/tmp/tcz-list
mkdir -p "$OUTPUT_DIR"
strip_path() {
# Read the whole listing as one record, one path per field.
awk 'BEGIN {FS="\n";RS=""} {
# i runs up to NF so the last line is printed too: $(NF + 1) is empty
# and therefore never matches.
for (i=1;i<=NF;i++) {
if ($(i + 1) !~ $i"/") print $i
}
}'
}
for TCZ in "$@"; do
unsquashfs -l "$TCZ" | grep 'squashfs-root/' | cut -d '/' -f 2- | awk '{print $1}' | strip_path > "${OUTPUT_DIR}/$(basename "$TCZ").list"
done
Some explanations:
In each pass of the (awk) loop, I take two adjacent lines and compare them.
If the second line does not look like something below the first one (i.e. the first is not its parent directory), then print the line we're checking.
For instance:
tc@box:/tmp$ unsquashfs -l svn.tcz | grep 'squashfs-root/' | cut -d '/' -f 2- | awk '{print $1}'
usr
usr/local
usr/local/bin
usr/local/bin/svn
usr/local/bin/svnadmin
...
In the awk function:
if ($(i + 1) !~ $i"/") print $i
loop1: if (usr/local !~ usr/) ==> "usr/local" contains "usr/" --> skip
loop2: if (usr/local/bin !~ usr/local/) ==> "usr/local/bin" contains "usr/local/" --> skip
loop3: if (usr/local/bin/svn !~ usr/local/bin/) ==> "usr/local/bin/svn" contains "usr/local/bin/" --> skip
loop4: if (usr/local/bin/svnadmin !~ usr/local/bin/svn/) ==> "usr/local/bin/svnadmin" does not fit "usr/local/bin/svn/"
--> print "usr/local/bin/svn"
And so on ~
By using this technique, empty directories are preserved.
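One thing to keep in mind is that "!~" treats the first line as a regular expression, so names containing characters like '+' or '.' can misfire. A possible variant with a literal prefix test, as a sketch only (the name strip_path_literal is made up, and it reads line by line instead of loading the listing as one record):

```shell
#!/bin/sh
# Hypothetical variant of strip_path: same adjacent-line comparison,
# but with index() doing a literal "starts with" test.
strip_path_literal() {
    awk '
        # The previous line is kept unless the current line starts with
        # the previous line followed by a slash.
        NR > 1 && index($0, prev "/") != 1 { print prev }
        { prev = $0 }
        # The last line has no successor, so it is always printed.
        END { if (NR > 0) print prev }
    '
}
```

It expects the same input order as before, i.e. each directory directly followed by its contents, and it still keeps empty directories.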
P.S.
I flipped the listing order back to normal; it's no longer in reverse. ;)
P.P.S.
IIRC, the leading slash '/' of "/usr/local/bin/svn" should be removed, no?
-
In the end I've settled on the following, but thanks to all for your help.
$ unsquashfs -ll -d '' extension.tcz | grep -v '^d' | sed -e 's#.* /#/#' -e 's# -> .*##' -e 1,3d > extension.tcz.list