Bash lessons learned with AoC 2021

I completed all the exercises of the Advent of Code 2021 in bash! (see my previous post)

You can find my solutions with comments in my GitHub repository at https://github.com/ColasNahaboo/advent-of-code-my-solutions/tree/main/bash/2021

I must say I "cheated" a bit. My solution for Day 24 was too slow in bash, and as I was a bit out of steam, I did not try to find a smarter algorithm. I noticed that my solution worked by building a bash arithmetic expression (of 1 million characters...) that I then evaluated in bash, and since its syntax was exactly that of C, I just made the bash script compile the expression as C and execute it, using the C compiler as a just-in-time compiler for bash arithmetic :-)

I just discovered AoC this year, and I am impressed. Challenging, fun, and a great way to progress. I am going to do the previous years, too, but in "real" languages this time. At least ones with data structures... I will start with Go.

What I learned:

  • Coding in bash is not so bad, even if it can border on insanity at times :-).
  • Modern bash features are often overlooked but very useful.
  • Passing shellcheck should be a mandatory goal for every bash programmer. I resented it at first, since I thought it was adding unnecessary syntactic sugar to my code, until I realized that its warnings were a symptom that my coding style was the problem.
  • Use [[...]] for strings and ((...)) for integers rather than any of the legacy constructs like [...], test, etc. The code is then much cleaner and safer (and a tad faster), as you do not have to quote as much. E.g.: [[ -z $foo ]] instead of [ -z "$foo" ], or even ((i=j)) instead of i="$j"
    • But this makes traditional debugging with set -x less useful, as the values of variables are no longer displayed. E.g., if j is 2, the tracing of i="$j" shows i=2 whereas the tracing of ((i=j)) only shows ((i=j)). This could be where a bash debugger would be useful, but I know only one, bashdb, and it does not seem to be updated anymore, and I could not find a version working with bash 5.1. The trap DEBUG trick can be useful, though.
    • So I now tend to write i=$((j+k)) while developing code, and maybe switch later, for production, to the slightly more efficient ((i = j+k))
  • I avoided arrays in bash because I found out they were abysmally slow when first introduced, to the point that managing data in files with grep, sed, ... was actually faster than using arrays. But not anymore! Bash arrays should now be used as much as possible.
  • A lot of bash parameter expansions can "map" over arrays. For instance ${tab[@]//x/y} will string-replace x by y in all the elements of the array tab, and is super fast.
  • Use arrays rather than the classic way of representing lists in bash, as space-separated (or tab-separated) substrings in a string.
  • Bash can have typed variables: integer ones via declare -i or local -i, and using them makes your code safer.
  • Bash functions can be passed variables by name, which is useful for efficiency, to avoid copying big arrays or strings, and to provide multiple return values by modifying the passed variables. But this does not work with recursion, as it is passing by name, not by reference: the passed name can clash with a local variable of the same name in the called function.
  • Working with arrays makes using $(...) impractical, as commands are executed in a subshell and cannot update arrays in the parent shell. So I tend to return the value(s) in global variables with the same name as the function. E.g. instead of x=$(foo), I write foo; x="$foo". Or I pass variables by name to set them if I want to return multiple results.
  • To parse a space-separated string, the fastest is using set to map the elements onto the positional parameters $1, $2, ... Next come the ${string#* } and ${string% *} operators, closely followed by read, with the full [[ $string =~ ([-[:digit:]]+)[[:space:]]... ]] regex match being 3 times slower. And if possible, using indexes is even faster: ${string:i:j}.
  • Use the faster $(< filename) instead of $(cat filename).
  • To copy an associative array A1 to A2 in bash 4.4+, do: A1_def=$(declare -p A1) && declare -A A2="${A1_def#*=}"
  • To access more than 9 parameters in a function, use braces: $10 won't work (it is read as ${1}0), but ${10} does.
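
Several of the points above can be condensed into one small script. This is only a sketch, not the code from my solutions: every variable and function name in it (tab, minmax, sum_of, split_words, tenth, ...) is made up for the illustration, and passing by name is shown with namerefs (local -n, bash 4.3+), which is an assumption about one possible way to do it. The string splitting with set also assumes the input contains no glob characters.

```bash
#!/usr/bin/env bash
# Illustrative sketch: every variable and function name below is hypothetical.

# [[ ... ]] and (( ... )) need less quoting than [ ... ] and test.
foo="" j=2 k=3
[[ -z $foo ]] && empty=yes      # no quotes needed around $foo
((i = j + k))                   # i is now 5

# A parameter expansion applied to [@] "maps" over all the elements.
tab=(axa bxb cxc)
mapped=("${tab[@]//x/y}")       # aya byb cyc

# Integer-typed variable: assigned values are evaluated arithmetically.
declare -i n
n="2*21"                        # n is now 42

# Pass variables by name (assumption: namerefs, bash 4.3+): the array is
# not copied, several results are returned, and no subshell is needed.
minmax() {                      # usage: minmax ARRAY_NAME MIN_NAME MAX_NAME
    local -n _arr=$1 _min=$2 _max=$3
    local v
    _min=${_arr[0]} _max=${_arr[0]}
    for v in "${_arr[@]}"; do
        if ((v < _min)); then _min=$v; fi
        if ((v > _max)); then _max=$v; fi
    done
}
values=(7 -3 12)
minmax values lo hi             # lo is -3, hi is 12

# Return a result in a global variable named after the function,
# instead of total=$(sum_of 1 2 3), which would run in a subshell.
sum_of() {
    local v
    sum_of=0
    for v in "$@"; do ((sum_of += v)); done
    return 0
}
sum_of 1 2 3; total=$sum_of     # total is 6

# Split a space-separated string (assumed free of glob characters).
line="forward 5 extra"
split_words() {
    # Unquoted on purpose: word splitting is the point here.
    # shellcheck disable=SC2086
    set -- $1
    cmd=$1 arg=$2
}
split_words "$line"             # cmd is "forward", arg is "5"
rest=${line#* }                 # "5 extra": first word removed
head=${line% *}                 # "forward 5": last word removed
read -r r1 r2 _ <<< "$line"     # r1 is "forward", r2 is "5"
piece=${line:8:1}               # "5": 1 character at offset 8

# $(< file) reads a file without running cat.
tmpfile=$(mktemp)
printf 'hello\n' > "$tmpfile"
content=$(< "$tmpfile")         # "hello" (trailing newline stripped)
rm -f "$tmpfile"

# Parameters past the ninth need braces.
tenth() {
    tenth=${10}                 # the tenth parameter
    # shellcheck disable=SC1037
    wrong=$10                   # parsed as ${1}0
}
tenth a b c d e f g h i j       # tenth is "j", wrong is "a0"
```

Note that minmax would break if the caller's variables were themselves named _arr, _min or _max, which is the name clash that makes this style awkward with recursion.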