Half-baked ideas

Bourne shell abstract interpreter

last posted May 1, 2016, 2:53 a.m.

The problem

If you write software on Unix, you've probably seen this many times:

curl example.org/install.sh | sh

Often, there is a sudo involved.

This is bad, for four main reasons:

  1. It's (usually) insecure.
    Far too many installation instructions use plain HTTP (or omit the URL scheme), which means that any malicious intermediary can inject arbitrary code and compromise the user's system. Because these URLs are well known, attackers can passively target them on any network where developers might connect to the Internet (Wi-Fi hotspots, coffee shops, hackathons…).

  2. It's dangerously unreliable.
    Even if the script is distributed over properly verified HTTPS, a connection that fails partway through still leaves the shell executing the truncated script. If you're lucky, this might only give you an error, or a broken installation. If you're unlucky, the consequences could be disastrous: for example, rm -rf /tmp/foo/... could be truncated into rm -rf /.

  3. It's usually poorly targeted.
    Even if you solve the previous two problems, these installation scripts usually target the host environment badly. They might mess with /usr/local or /opt instead of using your OS's package manager, or vice versa. They might install their own copies of dependencies, when you want them to use your existing ones, or vice versa. They might install stuff into ~/bin when you want ~/.local/bin, or vice versa. The script has to try and work on every common OS and configuration in a fool-proof fashion, which usually means not supporting any particular OS or configuration particularly well.

  4. It obscures useful choices.
    Because these scripts are intended to be fire-and-forget, many choices of versions, options, and other tweakables are obscured.


There is no shortage of articles and essays explaining why this anti-pattern is bad.

There are various attempts to address the first two problems.

This still leaves the further problems unresolved, though.
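
One mitigation sometimes used for the truncation problem is to wrap the whole installer body in a function and call it only on the last line, so the shell has to parse the complete function before anything runs. The sketch below simulates this locally rather than over a network: the installer contents, the name main, and the temporary file names are all made up for illustration, and the script is run from a file instead of a pipe.

```shell
#!/bin/sh
# Sketch: compare how sh treats a complete and a truncated copy of a
# hypothetical installer whose body is wrapped in a function.
tmpdir=$(mktemp -d)

# Hypothetical installer: nothing runs until the final line calls main.
cat > "$tmpdir/install.sh" <<'EOF'
main() {
    echo "installing"
}
main "$@"
EOF

# Simulate a connection that drops after the second line.
head -n 2 "$tmpdir/install.sh" > "$tmpdir/truncated.sh"

full_output=$(sh "$tmpdir/install.sh")
truncated_status=0
truncated_output=$(sh "$tmpdir/truncated.sh" 2>/dev/null) || truncated_status=$?

echo "full:      '$full_output'"
echo "truncated: '$truncated_output' (exit status $truncated_status)"

rm -rf "$tmpdir"
```

The truncated copy ends inside an unfinished function definition, so the shell reports a syntax error instead of running a partial command. This only guards against truncation; it does nothing for the targeting and configurability problems.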


Abstract interpretation

What I usually find myself doing with scripts like this amounts to manual abstract interpretation. Instead of executing the script, I open it in an editor and trace through it by hand.

Simple scripts may just be a list of commands to review, but more complex scripts often involve:

  • Substituting variable definitions
  • Inlining helper functions
  • Evaluating dynamic checks, and skipping code that's not relevant to my environment
  • Overriding certain paths, commands, or other hardcoded choices

What if a tool could automate this?
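
As a rough illustration of the manual steps listed above, a crude preview can be approximated in plain sh by shadowing side-effecting commands with functions that only print, pinning the result of an environment check, and overriding a default path. This is not abstract interpretation: the script really is executed, and anything called by absolute path or through command would bypass the stubs. The installer fragment, the preview function, and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical installer fragment; every name in it is made up.
installer='
PREFIX="${PREFIX:-/usr/local}"
install_file() { mkdir -p "$2" && cp "$1" "$2/"; }
if [ "$(uname -s)" = "Darwin" ]; then
    install_file tool-macos "$PREFIX/bin"
else
    install_file tool-linux "$PREFIX/bin"
fi
'

# preview PREFIX: run the fragment in a subshell where the commands
# with side effects are replaced by stubs that print their arguments.
preview() (
    mkdir() { echo "mkdir $*"; }
    cp()    { echo "cp $*"; }
    uname() { echo "Linux"; }   # pin the dynamic environment check
    PREFIX=$1                   # override the hardcoded default path
    eval "$installer"
)

# Show what the fragment would do if it installed under ~/.local.
preview "$HOME/.local"
```

The printed lines are the residual commands after variables are substituted, the helper is inlined, and the irrelevant branch is skipped, which is the kind of output an interactive evaluator could present for review or editing.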


There is at least one academic implementation of abstract interpretation for Bash.

This work is oriented toward static analysis and finding bugs, but the techniques could probably be repurposed for use in an interactive evaluator.


Philipp Emanuel Weidmann's maybe (2016) is another interesting take: it executes shell scripts (or other executables) under ptrace, to intercept, log, and stub file system operations.

This implementation offers little in the way of safety or robustness, but it's a step toward what an abstract-interpretation-based execution previewer and modifier might look like.