When Speed Matters
I’d be the first to tell you that, when confronted with a task that requires a degree of automation, you should use a scripting language like Python, Ruby, Perl, Bash or some mix of the four. However, I recently had a problem that involved changing the permissions on a large number of files.
My initial approach used a small (10 lines or so) Bash script to traverse the filesystem hierarchy and change the permissions based on whether it was processing a file or a directory. The resulting script ran over more than 3000 files and directories in about a minute and a half. That wasn’t exactly slow, but I had to run this script about twenty to thirty times throughout the day, and it felt very unproductive to wait a whole minute and a half before I could continue my work. So I did something crazy: I rewrote that little Bash program in C.
The resulting C program - of maybe 75 lines of code - finished in 1.3 seconds.
Initially I thought I had made a mistake, so I checked the processed files. They were all exactly as I expected them to be. I was astounded: I could execute this program almost 70 times before my old Bash solution finished even once! This decision made my day - the rest of my afternoon was much more productive. I even felt happier knowing I wasn’t wasting so much time on something trivial and secondary to the actual task at hand.
Again, I’m not one to go preaching about how important performance is with respect to any given language - often it’s much, much easier to write a few Python/Ruby/Perl scripts and push data between them with some Bash glue. In this particular case, however, the choice of a lower-level language was clearly a massive win.
Even from a pragmatic perspective, it really makes you wonder just how much time is lost to “inefficient” software stacks…
I think a key point to clarify is whether the script was a blocker for you or not. If it was a blocker, then there’s no doubt you took the right approach and looked to optimise it a bit.
If it wasn’t a blocker, and you can run the script behind the scenes, then the point becomes moot. If you fire and forget, then there’s no big difference between the two.
Of course, being a bit of a performance nut, I give you kudos for what you did.
But in the “real” world, if it doesn’t slow you down by preventing you from doing anything then it’s no biggie.
When are you writing the ASM version?
Absolutely right, OJ: that one-minute wait was the bottleneck to the task I was trying to accomplish. Although one minute isn’t a long time, it *feels* like an eternity when you’re sitting there waiting for it to finish. And unfortunately, it had to be run as part of a series of steps - it was step 2 of 3. Dead bang in the middle, slowing me down to a crawl on every iteration. *sniff*
Makes me wonder how long a script in Python, Ruby, or Perl would take. I avoid Bash scripting for anything time-sensitive because the cost of forking external commands like chmod quickly adds up: creating a new process is not a lightweight operation on UNIX-like OSes.
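For what it’s worth, here is a rough sketch of what a Python version might look like. It is not the author’s script: the function name `fix_permissions` is made up, and the modes (0755 for directories, 0644 for files) are assumptions, since the post never says which permissions were applied. The point is only that `os.chmod` makes the system call from inside the one Python process, so nothing is forked per file.

```python
import os

# Assumed modes: the post does not state which permissions were actually used.
DIR_MODE = 0o755
FILE_MODE = 0o644


def fix_permissions(root, dir_mode=DIR_MODE, file_mode=FILE_MODE):
    """Walk `root` and chmod every directory and regular file in-process.

    Symbolic links are skipped. Returns the number of entries changed.
    """
    os.chmod(root, dir_mode)
    changed = 1
    # os.walk is top-down by default, so each subdirectory is chmod-ed
    # before the walk descends into it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            os.chmod(path, dir_mode)
            changed += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            os.chmod(path, file_mode)
            changed += 1
    return changed
```

Calling `fix_permissions("/path/to/tree")` would process the whole tree in a single process. I haven’t timed it against the Bash or C versions, so treat it as a starting point for a comparison rather than a result.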