City College of San Francisco - CS160B
Unix/Linux Shell Scripting
Module: Loops1

for loop

The for loop is the loop of choice if you need to process a list of items. The most common use of a for loop is to process a set of files:

for file in *; do
    echo "$file"    # replace echo with whatever you want to do to "$file"
done

In this loop, the next item in the list is assigned to the variable file (the iteration variable) each time through the loop. In this example, the list is generated by the wildcard *, which expands to every non-hidden name in the current directory. The loop stops when the list is exhausted, so there is no danger of the loop being infinite. This makes the for loop the loop of choice whenever you can list the things you want to process. There is one caveat: the list should not be too big. The shell builds the entire list in memory before the first iteration, and the system-dependent maximum length of a command line applies as soon as you hand a large list to an external command. A conservative rule of thumb is that if the list fits in a standard-sized terminal window, you are safe. We will discuss a more general loop for processing unlimited amounts of data in the next module.
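
To see which names the wildcard actually hands to the loop, the following is a minimal sketch. The function name list_names is made up for this illustration; it runs the loop in a subshell so that your own current directory is not changed.

```shell
# Hypothetical helper: list the names that * produces in a given directory.
list_names() {
    (
        cd "$1" || exit 1       # the subshell keeps the cd from leaking out
        for file in *; do       # * skips names that start with a dot
            echo "$file"
        done
    )
}
```

Calling list_names on a directory that holds .hidden, a, and b should print only a and b, one per line.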

Exit condition

The for loop stops when the list is exhausted. You can obtain further control of the loop by using break and continue:

Example:

Copy all regular files from the current directory to the directory $TGT

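for file in *; do
    # skip anything that is not a regular file (directories, for example)
    [ ! -f "$file" ] && continue
    # report only the copies that succeed; cp's error messages are discarded
    if cp "$file" "$TGT" 2>/dev/null; then
        echo "'$file' copied"
    fi
done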
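
The example above uses continue to skip an item. As a complement, here is a minimal sketch of break, which leaves the loop entirely. The function name first_regular_file and the variable found are made up for this illustration.

```shell
# Hypothetical helper: print the name of the first regular file in a directory.
first_regular_file() {
    dir=$1
    found=
    for file in "$dir"/*; do
        [ -f "$file" ] || continue   # not a regular file: go on to the next item
        found=$file
        break                        # we have what we need: stop looping
    done
    [ -n "$found" ] && echo "$found"
}
```

With continue the loop moves on to the next item in the list; with break the rest of the list is never looked at.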

List generation

The most important thing about the for loop is the generation of the list. Going over a few suggestions can ease this process and make your for loops easier to write. Remember, the list must be tokenized, so you should not enclose it in quotes:

for item in a b c; do      # processes three items: a, b, and c

for item in "a b c"; do    # processes one item: a b c
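
A quick way to convince yourself of the difference is to count the iterations. This is only a sketch; the counter names unquoted and quoted are made up for the illustration.

```shell
# Count how many times each loop body runs.
unquoted=0
for item in a b c; do
    unquoted=$((unquoted + 1))
done

quoted=0
for item in "a b c"; do
    quoted=$((quoted + 1))
done

echo "unquoted list: $unquoted items"   # prints: unquoted list: 3 items
echo "quoted list: $quoted item"        # prints: quoted list: 1 item
```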

The list can, of course, just be text, like the examples above. This is rarely useful, but it does happen:

for item in item1 item2 item3 .... ; do

Here, item is set successively to item1, item2, item3, etc.

The list is usually generated by command or variable substitution. The list is then broken apart using the delimiters in $IFS. For example, if the third field of a colon-delimited file $datafile contains an integer and you want to add together all the integers in this field, you could command-substitute the command cut -d: -f3 "$datafile" to generate the list:

for num in $(cut -d: -f3 "$datafile"); do

The list here contains the third field of each line in $datafile. Each field is followed by a newline, which is an IFS delimiter. 
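
Put together, the addition described above might look like the following sketch. The function name sum_third_field and the variable total are made up for this illustration, and the sketch assumes that every third field really is an integer.

```shell
# Hypothetical helper: add up the integers in field 3 of a colon-delimited file.
sum_third_field() {
    datafile=$1
    total=0
    # The output of cut is split on $IFS, giving one number per item.
    for num in $(cut -d: -f3 "$datafile"); do
        total=$((total + num))
    done
    echo "$total"
}
```

For a file containing the three lines a:x:10, b:y:20, and c:z:12, sum_third_field prints 42.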

Since the result of command or variable substitution is broken apart on the delimiters in $IFS, you cannot easily substitute names that contain embedded spaces. This creates problems with filenames imported from other systems. One easy solution is to rely on wildcards to generate filenames. Wildcards are expanded after substituted text is retokenized, so the boundaries between the generated filenames are preserved rather than being subject to retokenization. Let's look at a simple example to see what this means.

The current directory contains:

$ ls -1
file2
file3
the other file
$

We will now process the list when it is generated using command-substitution:

$ for file in $(ls); do
> echo "$file"
> done
file2
file3
the
other
file
$

Now we will process the list when it is generated by a wildcard:

$ for file in *; do
> echo "$file"
> done
file2
file3
the other file
$

When the list of files is generated using command substitution, the resulting text is broken apart using the delimiters in $IFS. When it is generated using the wildcard, the boundaries between the filenames are preserved. Note that the quotes around the iteration variable "$file" in the echo command are very important.

I know this is confusing. So just remember: use a wildcard to generate a list of files.
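
The same retokenizing happens with variable substitution. In this small sketch, the variable names and its contents are made up for the illustration: it is meant to hold the two names file2 and "the other file", but the loop sees four words.

```shell
# Two names, one with embedded spaces, stored in a single variable.
names="file2 the other file"

count=0
for file in $names; do        # unquoted: split on $IFS into four words
    count=$((count + 1))
done
echo "$count"                 # prints: 4
```

Quoting "$names" does not help either, since the whole string then becomes a single item. The boundary between the two names is simply not stored in the variable.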

Processing command-line arguments

The for loop can easily be used to process command-line arguments, since the default list is the command-line arguments. This means you do not have to specify a list.

Suppose our shell script myss was invoked like this:

myss 'these are the' command-line arguments

Inside myss, of course, there are three arguments. The most obvious way to write a for loop would be

for arg in $*; do
echo "$arg"
done

but since $* is a variable that contains all the command-line arguments as a string, the result of the substitution is broken up on the delimiters in $IFS (here, the spaces), giving the result

these
are
the
command-line
arguments

The simplest way around this is to just delete the list:

for arg; do
echo "$arg"
done

The result now is

these are the
command-line
arguments
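
You can try the list-less form without writing a separate script, because a shell function has its own positional parameters. In this sketch the function show_args is a made-up stand-in for the script myss, and it numbers the arguments to make the boundaries visible.

```shell
# Hypothetical function standing in for the script myss.
show_args() {
    count=0
    for arg; do                 # no list: loops over the positional parameters
        count=$((count + 1))
        echo "$count: $arg"
    done
}

show_args 'these are the' command-line arguments
# prints:
# 1: these are the
# 2: command-line
# 3: arguments
```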

You can get the same result with an explicit list by using the other special parameter, $@.

Using $@

$@ is similar to $*: both hold all the command-line arguments. The difference is that $* behaves like a single string containing the text of each command-line argument separated from the next by a space, while $@ behaves like an array that preserves the actual boundaries between arguments. The trick is how to preserve this information when you substitute its value.

By default (that is, unquoted), substituting $@ is exactly the same as substituting $*: the result of the substitution is broken up using the $IFS delimiters. Thus, the two loops

for arg in $*; do

and

for arg in $@; do

produce exactly the same results. If you want to preserve the boundaries between the command-line arguments (in other words, its quality as an array), you need to use a counter-intuitive trick: put double quotes around $@. Without the quotes,

for arg in $@; do
echo "$arg"
done

generates

these
are
the
command-line
arguments

while 

for arg in "$@"; do
echo "$arg"
done

generates

these are the
command-line
arguments
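
The forms can be compared side by side by counting the items each one produces. This is a sketch only: the function count_forms and its counters are made up for the illustration, and the quoted form "$*", which the text does not use, is included for completeness.

```shell
# Hypothetical function: count the items produced by each form of substitution.
count_forms() {
    n_star=0;   for arg in $*;   do n_star=$((n_star + 1));     done
    n_at=0;     for arg in $@;   do n_at=$((n_at + 1));         done
    n_qstar=0;  for arg in "$*"; do n_qstar=$((n_qstar + 1));   done
    n_qat=0;    for arg in "$@"; do n_qat=$((n_qat + 1));       done
    echo "$n_star $n_at $n_qstar $n_qat"
}

count_forms 'these are the' command-line arguments   # prints: 5 5 1 3
```

Only "$@" yields the three arguments that were actually passed.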

I know this is counter-intuitive, but think of $@ as being

$1 $2 $3 $4 $5 ...

and "$@" as being

"$1" "$2" "$3" "$4" "$5" ...

$@ behaves like an array. The book covers arrays in general, but they are not very useful in the shell and are really ugly. Since you can fill $@ easily using set --, you can use it as the only array you really need. Returning to our example:

$ ls -1
file2
file3
the other file
$

The command 

set -- *

fills the positional parameters correctly:

$ echo "$3"
the other file
$
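
Once set -- has filled the positional parameters, they can be counted with $# and looped over with "$@". The sketch below recreates the example directory in a scratch directory; the variable names scratch, count, third, and name are made up for the illustration.

```shell
# Work in a scratch directory so the sketch does not depend on your own files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch file2 file3 'the other file'

set -- *                 # the positional parameters are now the three names
count=$#                 # number of positional parameters: 3
third=$3                 # the other file
for name in "$@"; do     # "$@" keeps 'the other file' as one item
    echo "$name"
done
```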


Copyright 2010 Greg Boyd - All Rights Reserved.
