internal field separator.
© 2013 Jan Zulawski <fdd@altair.pw>

Suppose you want to pass each occurrence of a certain file type in a directory hierarchy as an argument. You need not only the basename of the file but also its path, so you decide to use find(1) instead of just plain old ls(1) w/ `-R'.

Fair enough. For the sake of what you consider at the time to be simplicity over unnecessary clutter, you decide not to use find(1) w/ `-exec', but rather your favorite shell's loop control capabilities.

So you settle on writing something like

$ for i in `/usr/bin/find -name "*.bmp"`; do : insert exec here; done

$ # construct proper filenames.
$ /usr/bin/find -name "*.bmp" -print0 | xargs -0 -I '{}' echo \"'{}'\"

Anyway, if the output of find(1) is used as the loop's list, such as in the following:

$ for i in `/usr/bin/find -name "*.bmp" | sed 's/\ /\\ /g'`; do ls $i; done

or,

$ for i in `/usr/bin/find -name "*.bmp"`; do ls $i; done

(yeah, the extra whitespace escaping makes no difference here), it simply does NOT work as intended once a file name contains a space.

Why?

Because bash splits the result of an unquoted expansion (here, the command substitution feeding the loop) into words, and it uses all kinds of whitespace as delimiters, including ' '.

So, if you feed a list to the shell's for through an expansion, it will take the space, tab and newline characters as separator tokens, e.g.,

$ # "1 2 3"

so having a set of expressions like

$ list="1 2 3"
$ for i in $list; do echo $i; done

will produce the expected result, that is

> 1
> 2
> 3
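
A quick way to watch the splitting happen is to count the words the shell ends up with. The following is only a sketch, assuming the default IFS; count_fields is a made-up helper, not a builtin.

```shell
# count_fields is a hypothetical helper: it prints how many
# separate words the shell handed to it.
count_fields() {
    echo $#
}

# One string with a space, a tab and a newline between the items.
list=$(printf '1 2\t3\n4')

unquoted=$(count_fields $list)    # split on space, tab and newline: 4 words
quoted=$(count_fields "$list")    # quoting suppresses the splitting: 1 word

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

With the default separators this prints 4 for the unquoted case and 1 for the quoted one.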

Say we want our list to have only two actual arguments, one of them containing a space character, like in the following:

$ for i in 1\ 2 3; do echo $i; done

This will print out - again - a straightforward result, that is

> 1 2
> 3

Now, if the above list is embedded in a variable, this suddenly becomes exponentially obscure:

$ list="1\ 2 3"
$ for i in $list; do echo $i; done

Producing this output:

> 1\
> 2
> 3

More than just a bit weird, I'd say.
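
For what it's worth, the backslash survives because quotes and escapes are only interpreted while the shell parses what you typed, which happens before variables are expanded. Whatever comes out of $list is split into words and nothing more; it is never read a second time for backslashes. A minimal sketch of one way around that, assuming you control how the list is built, is to keep the items apart in the positional parameters (a bash array would do the same job) rather than in one flat string:

```shell
# Store the two items separately; "1 2" stays a single item.
set -- "1 2" 3

out=""
for i in "$@"; do          # "$@" expands to one word per item, unsplit
    echo "$i"
    out="$out[$i]"         # collect the items, bracketed, for inspection
done
```

This prints `1 2' and `3' on separate lines, the same as the hand-escaped loop did.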

Thing is, bash has a special shell variable that controls exactly this behaviour. One shouldn't really be surprised in any way, given bash's notoriety for having over-complex internals. We live in a modern world.

The variable in question is IFS (Internal Field Separator). It defines the set of characters used as delimiters when the results of expansions are split into fields (pretty neat by now, ain't it?).

By default, it is defined as:

$ set | grep IFS | sed 1q
> IFS=$' \t\n'
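
The pipeline above leans on bash printing the value in its $'...' notation. If you'd rather see the raw bytes, a sketch like this one should do in any POSIX-ish shell, assuming od(1) is around; 040, 011 and 012 are the octal codes for space, tab and newline.

```shell
# Dump every byte of IFS as a three-digit octal code.
bytes=$(printf '%s' "$IFS" | od -An -b)

# Left unquoted on purpose: word splitting squeezes od's padding
# down to single spaces.
echo $bytes
```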

OK, so far, so good. At last.

This means that all we have to do is temporarily set IFS to something else, something which doesn't contain the space character.

[Typing nervously]

$ IFS=$'\n'

$ list="1 2 3"
$ for i in $list; do echo $i; done
> 1 2 3

Notice that hard quoting (escaping) the space characters is not needed anymore.
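
Since the plan was to change IFS only temporarily, here is a sketch of the find(1) loop from the beginning with the old value saved and put back afterwards. It rests on a few assumptions: no file name contains a newline, the scratch directory and its file names are made up for the demo, and `set -f' is there because an unquoted expansion is also subject to globbing.

```shell
# Throwaway directory with made-up file names, one containing a space.
dir=$(mktemp -d)
touch "$dir/a b.bmp" "$dir/c.bmp" "$dir/notes.txt"

old_ifs=$IFS            # remember the current separators
IFS='
'                       # a lone newline, same as IFS=$'\n' in bash
set -f                  # no globbing on the unquoted expansion below

count=0
for i in `find "$dir" -name "*.bmp"`; do
    ls "$i"             # quoted, so it survives any later IFS change
    count=$((count + 1))
done

set +f                  # globbing back on
IFS=$old_ifs            # separators back to what they were
rm -r "$dir"
```

The loop body runs once per file, twice here, with `a b.bmp' arriving in one piece.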

That's it for tonight. Go listen to some Depeche Mode. 'Cause you best believe that's what I'm doing at this moment.


-- May 31, 2013.

tonight
black celebration
tonight
