Shebang Shenanigans

You have probably all seen the shebang at the top of shell scripts: a first line such as #!/bin/sh.

The initial characters #! tell the OS that this isn’t a regular binary, but rather something that needs to be run through an interpreter, namely the one named after #!. That is why you also see lines like #!/usr/bin/perl, #!/usr/bin/awk, or #!/usr/bin/python.

Executing a file like ./test.sh that has the shebang #!/bin/sh is roughly equivalent to running this command: /bin/sh ./test.sh.
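As a rough illustration of that first step, here is a minimal sketch in C of how the interpreter path could be pulled out of a script's first line. The function name shebang_interpreter is made up for this post, and the logic is a simplification: real kernels have their own length limits and edge cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from any real kernel: if `line` starts
 * with "#!", copy the interpreter path (the first whitespace-delimited
 * word after "#!") into `out` and return 1. Return 0 if there is no
 * shebang, no interpreter, or the path does not fit into `out`.
 */
int shebang_interpreter(const char *line, char *out, size_t outsz)
{
	assert(line != NULL && out != NULL);

	if (strncmp(line, "#!", 2) != 0)
		return 0;

	const char *p = line + 2;
	p += strspn(p, " \t");              /* skip blanks after "#!" */
	size_t len = strcspn(p, " \t\r\n"); /* length of the interpreter path */

	if (len == 0 || len >= outsz)
		return 0;

	memcpy(out, p, len);
	out[len] = '\0';
	return 1;
}
```

For a first line of #!/bin/sh this fills the buffer with /bin/sh; the script's own path would then be appended as an argument to form the command shown above.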

While learning awk, I noticed the use of the shebang #!/usr/bin/awk -f, i.e. a shebang with extra arguments. This can also sometimes be seen for Python, usually as #!/usr/bin/python -u to enable unbuffered output.

I wanted to pass some extra arguments to (g)awk, ideally running it as #!/usr/bin/awk -i inplace -f to modify the file in place. Surprisingly (to me), this didn’t work. It turns out to be equivalent to the command awk "-i inplace -f" file.awk, i.e. awk is called with all the flags mashed into a single argument. Not what I intended, and certainly not something that worked. This led me to the question: how do different Unix-like systems handle shebang arguments?

Let’s investigate!

Helpers

To easily see how arguments are passed to a binary, I wrote a small helper application in C. It prints one line for every argument passed.

#include <stdio.h>

int main(int argc, char **argv)
{
	for (int i = 0; i < argc; ++i) {
		printf("argv[%d]: %s\n", i, argv[i]);
	}

	return 0;
}

Running it like ./args hi there reader provides the following output:

$ ./args hi there reader
argv[0]: ./args
argv[1]: hi
argv[2]: there
argv[3]: reader

I copy this binary to /usr/local/bin/args, then create the following test file and make it executable with chmod +x file.txt.

#!/usr/local/bin/args -a -b --something

hello i'm a line that doesn't matter

Now, let’s try it out on different systems!

Linux

As explained before, this produces the following output:

$ ./file.txt
argv[0]: /usr/local/bin/args
argv[1]: -a -b --something
argv[2]: ./file.txt

As we can see, argv[1] has all flags stored as a single argument. :(

FreeBSD / OpenBSD

Nothing exciting here: it turns out both of these systems handle shebangs the same way as Linux.

$ ./file.txt
argv[0]: /usr/local/bin/args
argv[1]: -a -b --something
argv[2]: ./file.txt

macOS

This worked exactly as I had initially expected! Here each argument is passed to the interpreter separately.

$ ./file.txt
argv[0]: /Users/linus/args
argv[1]: -a
argv[2]: -b
argv[3]: --something
argv[4]: ./file.txt

OpenIndiana

I also wanted to test a Solaris fork, in this case OpenIndiana, to cover another branch of the Unix family. It turns out this gave yet another result:

$ ./file.txt
argv[0]: /usr/local/bin/args
argv[1]: -a
argv[2]: ./file.txt

As we can see, OpenIndiana completely throws away everything except the first argument. (What? ಠ_ಠ )

Summary

The results can be summarized in the following table:

| System | argv[0] | argv[1] | argv[2] | argv[3] | argv[4] |
|---|---|---|---|---|---|
| Linux | `/usr/local/bin/args` | `-a -b --something` | `./file.txt` | | |
| FreeBSD / OpenBSD | `/usr/local/bin/args` | `-a -b --something` | `./file.txt` | | |
| macOS | `/usr/local/bin/args` | `-a` | `-b` | `--something` | `./file.txt` |
| OpenIndiana | `/usr/local/bin/args` | `-a` | `./file.txt` | | |
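
To make the differences concrete, the three behaviours can be modelled as one small C function. This is only a sketch of what the table describes, not how any of these kernels actually implement it. The names shebang_argv, MODE_SINGLE_ARG, MODE_SPLIT_ALL and MODE_FIRST_ONLY, as well as the fixed buffer sizes, are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16
#define MAX_LEN  128

/* Names invented for this sketch; they label the behaviours seen above. */
enum shebang_mode {
	MODE_SINGLE_ARG, /* Linux, FreeBSD, OpenBSD: rest of line is one argument */
	MODE_SPLIT_ALL,  /* macOS: every whitespace-separated word is an argument */
	MODE_FIRST_ONLY  /* OpenIndiana: only the first word is kept */
};

/* Store the first `len` bytes of `s` as argv[argc]; return the new count. */
static int push(char argv[][MAX_LEN], int argc, const char *s, size_t len)
{
	if (argc >= MAX_ARGS || len >= MAX_LEN)
		return argc; /* this sketch silently drops what does not fit */
	memcpy(argv[argc], s, len);
	argv[argc][len] = '\0';
	return argc + 1;
}

/*
 * Build the argument vector an interpreter would receive for `script`,
 * given the script's first line `line` and a splitting `mode`.
 * Returns the number of entries written, or 0 if `line` is not a shebang.
 */
int shebang_argv(const char *line, const char *script,
                 enum shebang_mode mode, char argv[][MAX_LEN])
{
	assert(line != NULL && script != NULL);
	if (strncmp(line, "#!", 2) != 0)
		return 0;

	int argc = 0;
	const char *p = line + 2;
	p += strspn(p, " \t");

	/* argv[0]: the interpreter path */
	size_t len = strcspn(p, " \t\r\n");
	if (len == 0)
		return 0;
	argc = push(argv, argc, p, len);
	p += len;
	p += strspn(p, " \t");

	/* Remaining text up to the end of the line, trailing blanks trimmed. */
	size_t rest = strcspn(p, "\r\n");
	while (rest > 0 && (p[rest - 1] == ' ' || p[rest - 1] == '\t'))
		rest--;

	if (rest > 0) {
		if (mode == MODE_SINGLE_ARG) {
			argc = push(argv, argc, p, rest);
		} else {
			const char *end = p + rest;
			while (p < end) {
				len = strcspn(p, " \t\r\n");
				argc = push(argv, argc, p, len);
				if (mode == MODE_FIRST_ONLY)
					break;
				p += len;
				p += strspn(p, " \t");
			}
		}
	}

	/* The script path always comes last. */
	return push(argv, argc, script, strlen(script));
}
```

Fed the shebang line of the test file and ./file.txt as the script, MODE_SINGLE_ARG yields three entries, MODE_SPLIT_ALL five, and MODE_FIRST_ONLY three, matching the rows above.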