(m|s|p) locate / sql ("db") vs grep / awk ("db.txt")

mocore:

--- Quote from: yvs on November 21, 2024, 03:07:50 AM ---I want to add that format and tools also depend on what kind of information is needed.
...
I.e. goals define format and tools.

--- End quote ---

compression (core currently uses gzip for provides.db) likely does a better job of de-duplication than altering the structure of the db

but if we are considering altering the structure, some of the plocate techniques (e.g. n-gram search) might be interesting to try implementing in awk (just for fun ;)
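
To make the n-gram idea concrete, here is a minimal sketch in Python (easier to show than awk). It is not plocate's actual implementation or on-disk format, only the general trigram-index technique: index every 3-character substring, intersect the candidate sets for the needle's trigrams, then verify with a plain substring test. The paths and the names `trigrams`, `build_index` and `search` are made up for illustration.

```python
from collections import defaultdict


def trigrams(s):
    """Return the set of all 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def build_index(paths):
    """Map each trigram to the set of indices of the paths containing it."""
    index = defaultdict(set)
    for i, p in enumerate(paths):
        for t in trigrams(p):
            index[t].add(i)
    return index


def search(paths, index, needle):
    """Return the paths containing needle, using the index to narrow candidates."""
    grams = trigrams(needle)
    if not grams:
        # Needle shorter than 3 characters: nothing to look up, scan everything.
        candidates = range(len(paths))
    else:
        # A matching path must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(t, set()) for t in grams))
    # The intersection may hold false positives, so confirm with a substring test.
    return sorted(paths[i] for i in candidates if needle in paths[i])


# Hypothetical sample data, not taken from a real provides.db
PATHS = [
    "usr/local/bin/gawk",
    "usr/local/bin/sed",
    "usr/local/lib/libz.so.1",
    "usr/local/bin/plocate",
]
INDEX = build_index(PATHS)
print(search(PATHS, INDEX, "locate"))
```

The index only pays off when the same file list is queried many times; for a single query, building it costs more than a linear scan.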



 

yvs:
> compression (core currently uses gzip for provides.db) likely does a better job of de-duplication than altering the structure of the db
> but if we are considering altering the structure, some of the plocate techniques (e.g. n-gram search) might be interesting to try implementing in awk (just for fun ;)
>
  Different goals.
  When looking up an executable->extension pair in a text file that was collected once (it's not provides.db and it's not compressed),
  a simple sed takes so little time that, even if it took 100-500 msec, I wouldn't bother with db/sql or awk coding for a one-time query.
  The goal was only to find out which extension provides a given executable, and no more. That's what I had in mind.

  If it's about complicated queries with many options over a large amount of data, then yes, I totally agree plocate is good.
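
  As a rough sketch of that kind of one-off lookup (in Python here rather than sed): the layout of the collected text file is not given above, so the "extension<TAB>path" line format, the sample entries and the name `find_providers` are all assumptions for illustration.

```python
def find_providers(lines, executable):
    """Yield the extensions whose listed path ends in /<executable>."""
    suffix = "/" + executable
    for line in lines:
        # Assumed layout: extension name, a tab, then the path of one file.
        ext, _, path = line.rstrip("\n").partition("\t")
        if path.endswith(suffix):
            yield ext


# Hypothetical listing; a real one would be read with open("db.txt").
SAMPLE_LINES = [
    "alpha.tcz\tusr/local/bin/foo\n",
    "beta.tcz\tusr/local/bin/bar\n",
    "gamma.tcz\tusr/local/sbin/foo\n",
    "delta.tcz\tusr/local/bin/foobar\n",
]
print(list(find_providers(SAMPLE_LINES, "foo")))
```

  It is a single linear pass with no index, which is the point: for a one-time query over a file of this kind there is nothing to set up.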

nick65go:
@mocore: :) your reminder about tcc (Tiny C Compiler) is an example of the reasons I still (from time to time) check this forum. Even if I focus (mainly) on productivity/efficiency, sometimes I use the time won by my efficiency for intellectual challenges (no money involved).

Even "old" tools can be improved without user intervention: https://www.phoronix.com/news/zlib-2.2-RC1
(Zlib-ng 2.2 Speeds Up Compression By ~12% On x86_64 CPUs), if you are not bound to a restricted architecture.

mocore:

--- Quote from: nick65go on November 17, 2024, 08:30:59 AM --- My understanding is that curaga had preferred not-compiled solutions (if I remember correctly).

--- End quote ---

I wonder what the reasons for this might be?!

I just happened to find an interesting "gray area" of not-compiled:
https://github.com/udem-dlteam/pnut - A Self-Compiling C Transpiler Targeting Human-Readable POSIX Shell

/wtf

no clue how useful / performant / bug-free it might be 
interesting nonetheless

mocore:

wrt performance / awk / "data" etc.: https://benhoyt.com/writings/goawk-compiler-vm/


--- Quote ---Why are virtual machines faster than tree-walking?

It’s not immediately obvious why compiling to virtual instructions and then executing them with a virtual machine is faster than evaluating a syntax tree (“tree-walking”).

It’s actually more work up-front: instead of just lexing and parsing into a syntax tree, we now also have a compile step. That said, virtual machine compilers (including GoAWK’s) are usually very simple and non-optimizing, so that step is fast.

One reason it’s faster to execute is this: RAM – which stands for Random Access Memory – is not actually random access on modern processors. Memory blocks are loaded into fast CPU caches as needed, so when you have to access a new block, it takes about 10x as long as if it’s in the cache. Peter Norvig’s table of timings for various operations on a typical CPU shows how fetching from level 1 cache takes about 0.5 nanosecond, fetching from level 2 cache 14x that long, and fetching from main memory another 14x!

Programming with this in mind is called “data-oriented design”. I was reminded of how much impact this makes when watching Andrew Kelley’s excellent talk, [1] A Practical Guide to Applying Data-Oriented Design. Andrew is the creator of the Zig programming language, and his talk describes how he significantly sped up the Zig compiler by applying data-oriented design techniques. That talk was what pushed me to think about this for GoAWK. But back to why a virtual machine is faster than tree-walking…

--- End quote ---

[1] "Andrew Kelley Practical Data Oriented Design " @ https://www.youtube.com/watch?v=IroPQ150F6c
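
To show the structural difference the quote is talking about, here is a toy sketch (not GoAWK's code, and Python will not reproduce the cache effect): the same expression evaluated by walking a nested tree, and by first flattening it into a linear instruction list that a small stack machine runs. The tuple layout and the names `eval_tree`, `compile_tree` and `run_vm` are made up for illustration.

```python
# AST nodes are nested tuples: ("num", value) or ("add" / "mul", left, right).

def eval_tree(node):
    """Tree-walking: recurse through the nested nodes, following references."""
    kind = node[0]
    if kind == "num":
        return node[1]
    left, right = eval_tree(node[1]), eval_tree(node[2])
    return left + right if kind == "add" else left * right


def compile_tree(node, code=None):
    """Compile step: flatten the tree into a postfix instruction list."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("PUSH", node[1]))
    else:
        compile_tree(node[1], code)
        compile_tree(node[2], code)
        code.append(("ADD" if kind == "add" else "MUL", None))
    return code


def run_vm(code):
    """Virtual machine: one loop over a contiguous list, operands on a stack."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack.pop()


EXPR = ("add", ("num", 1), ("mul", ("num", 2), ("num", 3)))  # 1 + 2 * 3
CODE = compile_tree(EXPR)
print(eval_tree(EXPR), run_vm(CODE))  # prints: 7 7
```

In a compiled language the instruction list sits in one contiguous block of memory, while tree nodes can be scattered across the heap, which is where the cache argument from the quote comes in.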

Other topics descending into hw spec minutiae:
Re: Oldest Pc @ https://forum.tinycorelinux.net/index.php/topic,3216.15.html
Open Source Firmware Conference : I have come to bury the BIOS, not to open it @ https://forum.tinycorelinux.net/index.php/topic,25959.msg166511.html#msg166511
