City College of San Francisco - CS160B Linux Shell Scripting Module: Loops1
for loops rely on generating a list of items to work on. Before we discuss for loops, then, we must define a term that has appeared before but is crucial to understanding the for loop: a list.
Lists
When these notes refer to a list, they mean a sequence of text words. You can think of a list as what the shell creates from the Linux command-line.
Example: if the command-line arguments were "hello there" shell script, the list created from them has three "words": hello there, shell, and script.
arguments: "hello there" shell script
list (3 items): hello there, shell, script
A listing of the current directory shows:
$ ls -1
file1
hello
this file
to day
$
If this list is created using *, the list items correspond exactly to the filenames (four items): file1, hello, this file, and to day. If the list is created using command substitution, we may have some problems. This is discussed soon.
The for loop
The for loop is the loop of choice if you need to process a list of items. The most common use of a for loop is to process a set of files:
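A minimal sketch of such a loop. The scratch directory, the sample filenames, and the echo body are assumptions added only so the example can be run anywhere; the for line itself is the part that matters.

```shell
# Work in a scratch directory so the example is repeatable (an assumption
# for illustration; normally you run the loop where your files are).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file1 hello "this file" .hidden

count=0
for file in *; do              # * expands to every non-hidden name
    echo "processing: $file"   # quote "$file" so names with spaces stay whole
    count=$((count + 1))
done
echo "$count items"            # 3 items (.hidden is not matched by *)
```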
In this loop, the next item in the list is assigned to the variable file (the iteration variable) each time through the loop. In this example, the list is generated by the wildcard *, which expands to every non-hidden name in the current directory. The loop stops when the list is exhausted. Thus, the for loop is the loop of choice if you can list the things you want to process, as there is no danger of the loop being infinite. There is one caveat: the list cannot be too big.
The "too big" limitation exists because the entire list is built in memory before the loop starts, and because Linux enforces a maximum command-line length on any external command that is handed the list as arguments. That maximum is system-dependent. A conservative rule of thumb is that if the list fits in a standard-sized terminal window, you should be safe. We will discuss a more general loop for processing unlimited amounts of data in the next module.
Exit condition
The for loop stops when the list is exhausted. You can obtain further control of the loop by using break and continue:
Example:
Copy all regular files from the current directory to the directory $TGT
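One way this example might be written is sketched below. The scratch source and target directories and the sample filenames are assumptions so that the sketch is self-contained; in real use, $TGT would already name your target directory.

```shell
# Scratch source and target directories (assumptions for illustration).
src=$(mktemp -d)
TGT=$(mktemp -d)
cd "$src" || exit 1
touch notes.txt "two words.txt"
mkdir subdir

for file in *; do
    # continue: skip anything that is not a regular file (here, subdir)
    [ -f "$file" ] || continue
    # break: stop at the first failed copy instead of repeating the error
    cp -- "$file" "$TGT" || break
done

ls -1 "$TGT"
```

Here continue moves on to the next list item, while break abandons the rest of the list.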
List generation
The most important thing about the for loop is the generation of the list. Going over a few suggestions can ease this process and make your for loops easier to write. Remember, the list must be tokenized (i.e., divided into list items), so you should not enclose it in quotes:
for item in a b c; do
processes three items: a, b, and c
for item in "a b c"; do
processes one item: a b c
The list can, of course, just be literal text, like the examples above. There are only a few cases where this is useful, but it happens:
for item in item1 item2 item3 .... ; do
Here item is set successively to item1, item2, item3, etc.
The list is usually generated by command or variable substitution. The substituted text is then broken apart using the delimiters in $IFS. For example, if the third field of a colon-delimited file $datafile contains an integer and you want to add together all the integers in this field, you could command-substitute the command cut -d: -f3 "$datafile" to generate the list:
for num in $(cut -d: -f3 "$datafile")
The list here contains the third field of each line in $datafile. Each field is followed by a newline, which is an IFS delimiter.
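A minimal sketch of the complete summing loop. The sample data file and the variable name total are assumptions made up for illustration.

```shell
# Hypothetical sample data: three colon-delimited lines, integer in field 3.
datafile=$(mktemp)
printf '%s\n' 'alice:x:10' 'bob:x:25' 'carol:x:7' > "$datafile"

total=0
for num in $(cut -d: -f3 "$datafile"); do
    total=$((total + num))     # add this line's third field to the sum
done
echo "total: $total"           # total: 42
rm -f "$datafile"
```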
Since the result of command or variable substitution is broken apart on the delimiters in $IFS, you cannot easily substitute names that contain embedded spaces. This creates issues with filenames imported from other systems. One easy solution is to rely on wildcards to generate filenames. Wildcards are expanded after substituted text is retokenized, so when the wildcard is expanded, the boundaries between the generated filenames are preserved rather than being subject to retokenization. Let's look at a simple example of this to see what I mean.
The current directory contains:
$ ls -1
file2
file3
the other file
$
We will now process the list when it is generated using command-substitution:
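A minimal sketch of the command-substitution version, assuming a scratch directory in which one filename contains spaces (the directory and the counter n are illustrative only):

```shell
# Scratch directory with one name that contains spaces (an assumption).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file2 file3 "the other file"

n=0
for file in $(ls); do      # ls output is split on spaces, tabs and newlines
    echo "$file"
    n=$((n + 1))
done
echo "$n items"            # 5 items: file2, file3, the, other, file
```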
Now we will process the list when it is generated by a wildcard:
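A minimal sketch of the wildcard version, again assuming a scratch directory in which one filename contains spaces (the directory and the counter n are illustrative only):

```shell
# Scratch directory with one name that contains spaces (an assumption).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file2 file3 "the other file"

n=0
for file in *; do          # each filename is one list item, spaces included
    echo "$file"
    n=$((n + 1))
done
echo "$n items"            # 3 items
```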
When the list of files is generated using command substitution, the resulting text is broken apart using the delimiters in $IFS. When it is generated using the wildcard, the boundaries between the filenames are preserved. Note that the quotes around the iteration variable "$file" in the echo command are very important.
I know this is confusing. So just remember: use a wildcard to generate a list of files.
(Note: If a command or variable substitution is used to generate the list, the list is affected by changes in the $IFS characters. In the example above, if IFS had contained a newline only, the for loop using * and the for loop using $(ls) would have functioned the same. (Yes, I know, now you're really confused. If this is the case, just use wildcards when you can.))
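For the curious, the effect of a newline-only IFS can be sketched as follows. The scratch directory and the names old_ifs and n are assumptions for illustration, and the sketch still breaks if a filename itself contains a newline.

```shell
# Scratch directory with one name that contains spaces (an assumption).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file2 file3 "the other file"

old_ifs=$IFS
IFS='
'                          # IFS now contains a newline only
n=0
for file in $(ls); do      # split on newlines only, so spaces survive
    echo "$file"
    n=$((n + 1))
done
IFS=$old_ifs               # restore the previous delimiters
echo "$n items"            # 3 items
```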
Processing command-line arguments
A for loop can easily be used to process command-line arguments, since the default list is the command-line arguments. This means you do not have to specify a list.
Suppose our shell script myss was invoked like this:
myss 'these are the' command-line arguments
Inside myss, of course, there are three arguments. The most obvious way to write a for loop would be
for arg in $*; do
but since $* is a variable that contains all the command-line arguments as a single string, the result of the substitution is broken up on spaces, giving the result
these
are
the
command-line
arguments
The simplest way around this is to just delete the list:
for arg; do
The result now is
these are the
command-line
arguments
You can do this yourself using the other positional parameter $@.
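The two behaviours can be compared side by side in a short sketch. Here myss is written as a shell function rather than a separate script (an assumption made so the example fits in one file); inside a function, the positional parameters are the function's own arguments.

```shell
# myss as a function: $* and the default list refer to its arguments.
myss() {
    echo 'for arg in $*:'
    for arg in $*; do      # substituted text is split on IFS delimiters
        echo "  $arg"
    done
    echo 'for arg (no list):'
    for arg; do            # default list: the arguments, boundaries intact
        echo "  $arg"
    done
}

myss 'these are the' command-line arguments
```

The first loop prints five items and the second prints three.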
Using $@
$@ is similar to $*. It is all the command-line arguments in one variable. However, $* is a string (or a list) that contains the text of each command-line argument separated from the next by a space. $@ is an array of the command-line arguments that preserves the actual boundaries between arguments. The trick is how to preserve this information when you substitute its value.
By default, substituting $@ is exactly the same as substituting $* - the result of the substitution is broken up using $IFS delimiters. Thus, the two loops
for arg in $*; do
and
for arg in $@; do
produce exactly the same results. If you want to preserve the boundaries between the command-line arguments (in other words, its quality as an array), you need to use a counter-intuitive trick: put double quotes around $@. With the list unquoted, the loop
for arg in $@; do
generates
these
are
the
command-line
arguments
while the quoted form
for arg in "$@"; do
generates
these are the
command-line
arguments
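A minimal sketch that counts the items each form produces. It uses set -- to supply the arguments, and the counter names unquoted and quoted are illustrative assumptions.

```shell
# Stand in for the command line of myss.
set -- 'these are the' command-line arguments

unquoted=0
for arg in $@; do                  # split on IFS: boundaries are lost
    unquoted=$((unquoted + 1))
done

quoted=0
for arg in "$@"; do                # one item per original argument
    echo "$arg"
    quoted=$((quoted + 1))
done

printf 'unquoted $@: %s items\n' "$unquoted"   # 5
printf 'quoted "$@": %s items\n' "$quoted"     # 3
```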
I know this is counter-intuitive, but think of $@ as being
$1 $2 $3 $4 $5 ...
and "$@" as being
"$1" "$2" "$3" "$4" "$5" ...
$@ is actually an array. The book covers arrays in general, but they are not very useful in the shell and are really ugly. Since you can gain access to the array $@ easily using set --, you can use it as the only array you really need. Returning to our example:
$ ls -1
file2
file3
the other file
$
The command
set -- *
fills the positional parameters correctly: each filename becomes one positional parameter, so $# is 3 and $3 is the other file.
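A minimal sketch of filling and then using the positional parameters this way. The scratch directory is an assumption added so the example is repeatable.

```shell
# Scratch directory matching the listing (an assumption for illustration).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file2 file3 "the other file"

set -- *               # each filename becomes one positional parameter
echo "count: $#"       # count: 3
echo "third: $3"       # third: the other file
for file in "$@"; do   # quoted, so "the other file" stays one item
    echo "[$file]"
done
```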
Iterating using an integer sequence
In some cases, you really do want to iterate a for loop through an integer sequence. As long as you really have a reason to do this, and are not just a Java or C++ programmer trying to get the shell for loop to work like a for loop in those languages, this is fine. There is a C-style for loop available in the shell. I have never used it, so you'll have to research it on your own.
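For reference, a minimal sketch of the C-style form as bash provides it. It is a bash extension and is not available in a plain POSIX sh; the variable names i and sum are illustrative.

```shell
# bash-specific: the (( ... )) arithmetic for loop is not in POSIX sh.
sum=0
for ((i = 1; i <= 5; i++)); do
    echo "$i"
    sum=$((sum + i))
done
echo "sum: $sum"       # sum: 15
```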
You can, however, generate a list of integers to iterate on and use a standard for loop to process the list. You can do this using the seq command (see the man page) or just using {x..y} where x and y are integers:
$ for s in {1..5}; do
> echo "$s"
> done
1
2
3
4
5
$
The sequence does not have to increase, and you can specify the interval between each list item:
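A minimal sketch of both variations. It assumes bash 4 or later for the {x..y..step} form; the variable names down, step, and viaseq are illustrative only.

```shell
# A decreasing sequence: just put the larger number first.
down=""
for s in {5..1}; do
    down="$down $s"
done
echo "down:$down"          # down: 5 4 3 2 1

# A step between items: {x..y..step} (bash 4 or later).
step=""
for s in {0..10..5}; do
    step="$step $s"
done
echo "step:$step"          # step: 0 5 10

# seq takes FIRST INCREMENT LAST; a negative increment counts down.
viaseq=""
for s in $(seq 10 -5 0); do
    viaseq="$viaseq $s"
done
echo "seq:$viaseq"         # seq: 10 5 0
```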
This page was made entirely with free software on Linux: Kompozer and LibreOffice