City College of San Francisco - CS160B Unix/Linux Shell Scripting Module: Loops1
The for loop is the loop of choice if you need to process a list of items. The most common use of a for loop is to process a set of files:
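A typical loop of this kind (the loop body here is just an illustrative echo) looks like this:

for file in *; do
    echo "$file"
done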
In this loop, the next item in the list is assigned to the variable file (the iteration variable) each time through the loop. In this example, the list is generated by the wildcard *, which expands to every non-hidden name in the current directory. The loop stops when the list is exhausted, so the for loop is the loop of choice if you can list the things you want to process, as there is no danger of the loop being infinite. There is one caveat: the list cannot be too big.
The "too big" limitation exists because there is a maximum length for a Unix command line, and the for loop, including its list, must fit within it. This limit is system-dependent. A conservative rule of thumb: if you can fit the list in a standard-sized terminal window, you should be safe. We will discuss a more general loop for processing unlimited amounts of data in the next module.
Exit condition
The for loop stops when the list is exhausted. You can obtain further control of the loop by using break and continue:
Example: copy all regular files from the current directory to the directory $TGT.
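A minimal sketch of this example (assuming $TGT is already set), using continue to skip anything that is not a regular file:

for file in *; do
    if [ ! -f "$file" ]; then
        continue        # not a regular file - skip it
    fi
    cp "$file" "$TGT"
done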
List generation
The most important thing about the for loop is the generation of the list. A few guidelines can ease this process and make your for loops easier to write. Remember, the list must be tokenized, so you should not enclose it in quotes:
for item in a b c; do        processes three items: a, b, and c
for item in "a b c"; do      processes one item: a b c
The list can, of course, just be literal text, like the examples above. This is only rarely useful, but it happens:
for item in item1 item2 item3 .... ; do
here item gets set successively to item1, item2, item3, etc.
The list is usually generated by command- or variable substitution. The list is then broken apart using the delimiters in $IFS. For example, if the third field of a colon-delimited file $datafile contains an integer and you want to add together all the integers in this field, you could command-substitute the command cut -d: -f3 "$datafile" to generate the list:
for num in $(cut -d: -f3 "$datafile")
The list here contains the third field of each line in $datafile. Each field is followed by a newline, which is an IFS delimiter.
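A sketch of the whole summing loop (assuming the third field of every line really is an integer):

total=0
for num in $(cut -d: -f3 "$datafile"); do
    total=$((total + num))
done
echo "$total"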
Since the result of command- or variable-substitution is broken apart on the delimiters in $IFS, you cannot easily substitute names that contain embedded spaces. This creates issues with filenames imported from other systems. One easy solution to this is to rely on wildcards to generate filenames. Wildcards are expanded after substituted text is retokenized. When the wildcard is expanded, the boundaries between the generated filenames are preserved, rather than being subject to retokenization. Let's look at a simple example of this to see what I mean.
The current directory contains:
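$ ls -1
file2
file3
the other file
$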
We will now process the list when it is generated using command-substitution:
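Here is a sketch of such a session (using ls inside a command substitution; the loop body just echoes each name):

$ for file in $(ls); do
>     echo "$file"
> done
file2
file3
the
other
file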
Now we will process the list when it is generated by a wildcard:
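The same loop body, with the list generated by the wildcard *:

$ for file in *; do
>     echo "$file"
> done
file2
file3
the other file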
When the list of files is generated using command substitution, the resulting text is broken apart using the delimiters in $IFS. When it is generated using the wildcard, the boundaries between the filenames are preserved. Note that the quotes around the iteration variable "$file" in the echo command are very important.
I know this is confusing. So just remember: use a wildcard to generate a list of files.
Processing command-line arguments
A for loop can easily be used to process command-line arguments, since the default list is the command-line arguments. This means you do not have to specify a list.
Suppose our shell script myss was invoked like this:
myss 'these are the' command-line arguments
Inside myss, of course, there are three arguments. The most obvious way to write a for loop would be
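for arg in $*; do
    echo "$arg"        # the body just echoes each argument
done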
but since $* is a variable that contains all the command-line arguments as a string, the result of the substitution is broken up on spaces giving the result
these
are
the
command-line
arguments
The simplest way around this is to just delete the list:
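for arg; do            # with no list, the loop iterates over the command-line arguments
    echo "$arg"
done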
The result now is
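these are the
command-line
arguments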
You can do this yourself using the other parameter that holds all the command-line arguments, $@.
Using $@
$@ is similar to $*. It is all the command-line arguments in one variable. However, $* is a string (or a list) that contains the text of each command-line argument separated from the next by a space. $@ is an array of the command-line arguments that preserves the actual boundaries between arguments. The trick is how to preserve this information when you substitute its value.
By default, substituting $@ is exactly the same as substituting $* - the result of the substitution is broken up using $IFS delimiters. Thus, the two loops
for arg in $*; do
and
for arg in $@; do
produce exactly the same results. If you want to preserve the boundaries between the command-line arguments (in other words, to preserve $@'s quality as an array), you need to use a counter-intuitive trick: put double-quotes around $@.
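The unquoted loop (again, with a body that just echoes each argument)

for arg in $@; do
    echo "$arg"
done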
generates
these
are
the
command-line
arguments
while
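the quoted loop

for arg in "$@"; do
    echo "$arg"
done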
generates
these are the
command-line
arguments
I know this is counter-intuitive, but think of $@ as being
$1 $2 $3 $4 $5 ...
and "$@" as being
"$1" "$2" "$3" "$4" "$5" ...
$@ is actually an array. The book covers arrays in general, but they are not very useful in the shell and are really ugly. Since you can gain access to the array $@ easily using set --, you can use it as the only array you really need. Returning to our example:

$ ls -1
file2
file3
the other file
$
The command set -- * fills the positional parameters correctly:
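A quick check (a sketch; the echo commands are just for illustration):

$ set -- *
$ echo $#
3
$ echo "$3"
the other file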