I have a long-running script that contains an outer loop and an inner loop. The outer loop processes each of a continuous stream of integers, and the inner loop performs an ever-increasing number of tests on the integer currently under consideration.
It's clear that the whole mess will eventually decay into uselessness as the number of iterations of the inner loop reaches some as yet undetermined value that will slow it down too much. FWIW, yes, I know I could have written the whole mess in C w/o much effort, but this was more of a "recreational scripting" project than a "gotta get results" project. Still, knowing that the inner loop would be the critical performance point, I did put a little thought into making sure I wasn't doing anything extravagant in the inner loop.
The script keeps its results in a text file, so it can be stopped and started at my whim and its progress persists across reboots. The terminal output is really just so that, during otherwise idle times, I can tell at a glance if the system has hung (*).
But, with that terminal output in view, I noticed that sometimes the script runs noticeably faster than at other times. That got my curiosity up and I tracked it down to "runs faster when compiletc is loaded", then narrowed it down to "runs faster when grep.tcz is loaded". That got me back to the code in my script...
grep is used only in the outer loop and is called one, two, or three times per iteration (looking for a success but giving up if no success by the third try (**) ). grep is never used in the inner loop, yet even with that, metrics that I added showed that the entire script runs just over ten times faster when using the grep binary as opposed to the busybox grep. This implies that in raw "grep performance" the grep binary is likely a lot more than ten times faster. Given that the grep binary is about one fifth the size of the busybox binary, it's not surprising to me that the grep binary is faster to use, but it is surprising how much faster.
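For anyone who wants to check the "raw grep performance" gap on their own box, here is a minimal sketch of a timing loop. It is not the metrics code from my script. The function name time_grep, the pattern, and the file are all made up for illustration, and the commands in the usage comment ("busybox grep" and /usr/local/bin/grep) are assumptions about where the two greps live on a given system.

```sh
#!/bin/sh
# Hypothetical helper: run one grep command N times against a file and
# print the elapsed wall-clock time in whole seconds.
#   $1 = grep command to time (may contain a space, e.g. "busybox grep")
#   $2 = pattern, $3 = file to search, $4 = number of invocations
time_grep() {
    grep_cmd=$1
    pattern=$2
    file=$3
    count=$4

    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        # $grep_cmd is deliberately unquoted so "busybox grep" splits
        # into two words. The match result is ignored; only the cost of
        # the call matters here.
        $grep_cmd -q -e "$pattern" "$file" || :
        i=$((i + 1))
    done
    end=$(date +%s)

    echo $((end - start))
}

# Example usage (commands, pattern and file name are assumptions):
#   time_grep "busybox grep"      "12345" results.txt 5000
#   time_grep /usr/local/bin/grep "12345" results.txt 5000
```

date +%s only resolves whole seconds, so the count needs to be large enough for the totals to mean something. As a sanity check on the arithmetic: if grep accounts for a fraction f of the original run time and everything else is unchanged, an overall speedup of ten is only possible when f is above 0.9, and the grep speedup itself works out to f / (f - 0.9).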
*) system hangs - I think this is a hardware issue but it's intermittent. (For safety, backups every ten minutes, if anything has changed, via cron)
**) there's probably a way to do this in one go with a regex, but I hadn't had enough coffee and it is, after all, in the outer loop.
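On the "one go" idea, a rough sketch of what it might look like, assuming the three tries are three different patterns checked against the same file (the post does not show the real patterns, so the function names and arguments below are hypothetical). Both the busybox grep and the grep binary accept -q and repeated -e options.

```sh
#!/bin/sh
# Hypothetical helpers. $1 = file to search, $2 $3 $4 = the three patterns.

# Up to three grep processes: stop at the first pattern that matches.
three_tries() {
    grep -q -e "$2" "$1" ||
    grep -q -e "$3" "$1" ||
    grep -q -e "$4" "$1"
}

# One grep process: succeeds if any of the three patterns matches.
one_try() {
    grep -q -e "$2" -e "$3" -e "$4" "$1"
}

# Example usage (file name and patterns are made up):
#   if one_try results.txt "^17 " "^170 " "^1700 "; then
#       echo "found"
#   fi
```

The single call only reports that something matched, not which pattern did, so it is a drop-in replacement only if the script does not care which of the three tries succeeded.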