Syntax and style issues

City College of San Francisco - CS160B
Unix/Linux Shell Scripting
Module: Conditionals

Shell scripts are very sensitive to syntax. You have probably had the experience that an incorrectly written command line not only fails to do what you want, but does something different! Consider the following command:

ls -l $DIR

The intention of this simple command is to list the contents of the directory DIR. It has two problems, however:

if DIR is not set, it lists the current directory
if the directory in DIR has a space in its name, it attempts to list two incorrect paths

Both of these problems could be avoided by judicious placement of double-quotes:

ls -l "$DIR"

will list either the correct directory, or give an error about no such directory, which is far preferable to listing the wrong directory.
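The effect of the quotes can be seen by counting the arguments a command actually receives. The following is only a minimal sketch: count_args is a hypothetical helper, and the directory name is an assumed example value.

```shell
#!/bin/bash
# count_args (hypothetical helper) prints how many arguments it received.
count_args() {
    echo "$#"
}

DIR="My Documents"    # assumed example value containing a space
count_args $DIR       # unquoted: the shell splits the value into 2 words
count_args "$DIR"     # quoted: exactly 1 argument

DIR=""                # empty, as if the variable had never been set
count_args $DIR       # unquoted: the argument disappears, so 0
count_args "$DIR"     # quoted: 1 (empty) argument survives
```

With the default IFS this prints 2, 1, 0 and 1. The 0 case is why the unquoted ls silently falls back to the current directory.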

Such syntax mistakes at the command line are annoying. In a shell script, however, you do not have the ability to retype the command, so syntax mistakes make the difference between your program working reliably or not. If any of you are programmers already, you are aware of these two important axioms:

Programs tend to work when you are testing them.

Programs rarely fail at a convenient time.

The result of these two axioms is that your program will work reliably so long as you are there to fix it. It will, of course, tend to fail most disastrously when it is being run by the President of the company and you are on your dream vacation in Europe. (As a student, this could be reworded "your program will fail when it is run by your instructor after you hand it in!")

It will serve you well to write your shell scripts (and any programs) with these axioms in mind. It takes ten times as much time to debug a problem as it does to avoid it.

Quoting

Correct use of quoting will help you avoid many bugs. Usually, these situations call for adding double-quotes. The most important ones are:

enclose regular expressions given to grep in the strongest quotes possible. (If the regular expression does not contain substitutions to be performed, enclose it in single-quotes, otherwise in double quotes.)
you should generally enclose variable substitutions in double-quotes. This is particularly important if

the variable contains a filename. Although most people do not create files on Unix systems whose names have embedded spaces, files imported from other platforms often contain them.
the variable was set by an input command or by extraction from another Unix command.
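The quoting rules above can be sketched as follows. The helper count_user_lines, the scratch file, and its sample data are all assumptions made up for illustration.

```shell
#!/bin/bash
# count_user_lines NAME FILE (hypothetical helper): counts the lines in FILE
# that begin with NAME followed by a colon.
count_user_lines() {
    # The pattern contains a substitution, so double quotes are required;
    # they also keep a NAME with embedded spaces in one piece.
    grep -c "^$1:" "$2"
}

datafile=$(mktemp)    # scratch file holding assumed sample data
printf '%s\n' 'greg:staff' 'greg smith:faculty' 'amy:staff' > "$datafile"

# No substitution in the pattern: single quotes protect $ from the shell.
grep -c 'staff$' "$datafile"

# A value set by an input command: quote it wherever it is used.
read -r name <<< "greg smith"
count_user_lines "$name" "$datafile"

rm -f "$datafile"
```

Without the quotes around "$name", the helper would receive greg and smith as two separate arguments and search the wrong file.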

Filename generation

If you are generating a list of filenames, use wildcards if possible. Wildcard expansion occurs late in the series of command-line expansions, after word splitting, so the resulting filenames are not broken apart at spaces. A simple example shows this.

$ cat showargs.bash
#!/bin/bash
echo "\$#=\"$#\""
echo "\$1=\"$1\""
echo "\$2=\"$2\""
echo "\$3=\"$3\""
echo "\$4=\"$4\""

The program above simply outputs its $# and first four arguments. If we are in a directory that contains two files:

$ ls -1
my file
xxx

Let's compare the output of showargs.bash when we use a wildcard character and when we use an ls command:

$ showargs.bash *
$#="2"
$1="my file"
$2="xxx"
$3=""
$4=""

$ showargs.bash $(ls)
$#="3"
$1="my"
$2="file"
$3="xxx"
$4=""
$
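The same difference shows up in loops over filenames. This is a sketch only; list_with_glob and list_with_ls are hypothetical helpers, and the scratch directory recreates the two files from the listing above.

```shell
#!/bin/bash
# Both helpers are hypothetical. Each prints one loop item per line, in
# brackets, and runs in a subshell so the caller's directory is unchanged.
list_with_glob() {
    ( cd "$1" || exit 1
      for f in *; do echo "[$f]"; done )
}
list_with_ls() {
    ( cd "$1" || exit 1
      for f in $(ls); do echo "[$f]"; done )
}

workdir=$(mktemp -d)
touch "$workdir/my file" "$workdir/xxx"
list_with_glob "$workdir"    # two items: [my file] and [xxx]
list_with_ls "$workdir"      # three items: [my], [file] and [xxx]
rm -rf "$workdir"
```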

The test command

The most commonly used command in if statements is test. It is used to compare variables as strings or as numbers, to check if they are empty, and to check for the existence, permissions, and type of files. As an example, the following if statement cats the file "$inputfile" if it is a readable regular file:

if test -f "$inputfile" -a -r "$inputfile" ; then
    cat "$inputfile"
fi

In this if statement, there is no confusion about spacing rules in the test command. Here, test has 5 arguments, separated by spaces. We're used to that. The importance of quoting the variable $inputfile is also fairly obvious. Consider what would happen in the above test command if the variable had not been set. The equivalent resulting command would be:

test -f "" -a -r ""

test would look for a file whose name is empty, find none, and quietly return false, so cat would never be run. If, however, the variable had not been quoted, the test command after substitution would look like this:

test -f -a -r

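which no longer tests any file at all! Depending on the shell, this is either reported as a syntax error or, as in bash, read as a comparison of the two non-empty strings -f and -r, which quietly succeeds. Believe me, errors like these resulting from running your shell script will significantly affect the user's opinion of your programming ability, as well as your ability to do your job in the future!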

The above issue looks obvious. But remember, a 'bug' that is due to missing quotation marks will often not show up in normal shell script testing, because test data rarely includes empty values or names with spaces! These data-dependent bugs are the most difficult to detect.

The need for quotes and spacing is fairly easy to see when test is written as a standard Unix command. Unfortunately, a common way to write test is using the square bracket notation. In this case, it is not so obvious that the spacing rules are important:

if [ -f "$inputfile" -a -r "$inputfile" ]; then
    cat "$inputfile"
fi

[ ] is not special punctuation for the if statement. It is really the test command in disguise! Spaces are still necessary between each piece, and quotes are still necessary around the variables!

Missing spaces and quotation marks in [ ] commands are common causes of syntax errors.
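A small sketch of a correctly spaced and quoted [ ] command follows. The helper readable_file and the scratch directory are hypothetical and exist only for this demonstration.

```shell
#!/bin/bash
# readable_file NAME (hypothetical helper): succeeds only if NAME is a
# readable regular file. Note the space after [ and before ], and the
# double quotes around every use of the variable.
readable_file() {
    [ -f "$1" -a -r "$1" ]
}

scratch=$(mktemp -d)
echo "sample" > "$scratch/my file"

readable_file "$scratch/my file" && echo "readable"
readable_file ""                 || echo "empty name rejected"
readable_file "$scratch/missing" || echo "missing file rejected"

rm -rf "$scratch"
```

All three messages are printed. Removing the quotes inside the helper would make the first call fail on the embedded space and would change the meaning of the second call entirely.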

Lengthy control constructs

Even with correct indentation, try to close your if statements as soon as possible. This will enhance readability as well as make your shell script easier to write and to modify. Consider the following example of a shell program that requires a single filename argument. The approach we may learn in more structured programming languages is to write the program like this:

prog=$(basename "$0")
input=$1
if [ $# -eq 1 ]; then
    if [ -f "$input" -a -r "$input" ]; then

        # the main body of your shell program goes here,
        # even if it is several pages long

    else
        echo "$prog: '$input' is not a readable file"
        exit 1
    fi
else
    echo "$prog: single argument required."
    exit 1
fi
exit 0

We may have learned that program (or function) exits should only occur at the end of the program (or function), or, better, should be limited to a single exit. Though this may be useful in a C++ program, it is not useful in a shell script. The nested structure forces you to indent every line of your shell program, which limits the length of subsequent commands and makes the program harder to write. It also delays the disposition of errors, requiring the reader to search for them! (What do we do if the argument is not a readable file?)

Instead, take the less structured approach below. Here, the errors are disposed of immediately and clearly, and the main program does not have to be indented!

prog=$(basename "$0")
input=$1
if [ $# -ne 1 ]; then
    echo "$prog: single argument required."
    exit 1
fi
if ! [ -f "$input" -a -r "$input" ]; then
    echo "$prog: '$input' is not a readable file"
    exit 1
fi

# the main body of your shell program goes here,
# even if it is several pages long

exit 0

This is much more readable. Besides, if the arguments are incorrect, you want to give an error message and get out immediately. While both organizations do this, the latter one is much more obvious.
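One possible refinement of this early-exit style, offered only as a sketch, is to move the repeated "message, then exit" pair into a helper. The names die and check_args are hypothetical and not part of the original program; sending the message to standard error is also an added assumption.

```shell
#!/bin/bash
prog=$(basename "$0")

# die MESSAGE (hypothetical helper): report an error on stderr and exit.
die() {
    echo "$prog: $*" >&2
    exit 1
}

# check_args (hypothetical helper): dispose of every argument error up front.
check_args() {
    if [ $# -ne 1 ]; then
        die "single argument required."
    fi
    if ! [ -f "$1" -a -r "$1" ]; then
        die "'$1' is not a readable file"
    fi
}

# In a real script the next line would be:  check_args "$@"
# followed, unindented, by the main body of the program.
```

Each error is still handled at the point where it is detected, and the main body stays at the left margin.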

Case statements

The structure of case statements can be even more problematic than that of if statements, and can very easily get out of control if the case clauses are at all long. Use your case statements for simple comparisons, and avoid case clauses that are excessively long. As always, observe strict indentation. A case statement that exceeds half a page in length quickly becomes unreadable.
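One way to keep clauses short, sketched here under the assumption that the work can be moved into functions, is to let each clause make a single call. All function names below are hypothetical.

```shell
#!/bin/bash
# All function names here are hypothetical; the point is the short clauses.
start_job() { echo "starting"; }
stop_job()  { echo "stopping"; }

# dispatch COMMAND: each case clause is a single call, so the whole
# statement stays a few lines long however large the functions grow.
dispatch() {
    case "$1" in
        start)
            start_job
            ;;
        stop)
            stop_job
            ;;
        *)
            echo "unknown command: '$1'" >&2
            return 1
            ;;
    esac
}

dispatch start    # prints "starting"
dispatch stop     # prints "stopping"
```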

&& and || complexity

The && and || constructs are very popular in shell scripts in place of an if statement. This is fine if the if and else clauses are each a single command:

grep 'Greg' file1 > /dev/null && echo "Greg is in file1"

grep 'Greg' file1 > /dev/null || echo "Greg was not in file1"

These are fine. Don't go overboard with these. In general, if your && or || command needs to be continued on a second line, perhaps you should replace it with an if statement. In any case, organize it for clarity. The following && / || command is complex, but by organizing the line breaks and indenting to show dependency it becomes marginally readable:

[ -n "$(grep "^$id:" /etc/passwd | cut -d: -f5 | cut -d, -f1)" ] && \
    echo "name found in passwd file" || \
    echo "name not found in passwd file"
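For comparison, here is a sketch of the same check written as an if statement. The function name_status is hypothetical, and its optional second argument (a file to search instead of /etc/passwd) is an assumption added so the function can be tried on sample data.

```shell
#!/bin/bash
# name_status ID [FILE] (hypothetical helper). FILE defaults to /etc/passwd;
# the optional second argument is an assumption added for testing.
name_status() {
    local id=$1 file=${2:-/etc/passwd} name
    # Same pipeline as above: field 5 of the matching line, up to the
    # first comma.
    name=$(grep "^$id:" "$file" | cut -d: -f5 | cut -d, -f1)
    if [ -n "$name" ]; then
        echo "name found in passwd file"
    else
        echo "name not found in passwd file"
    fi
}

name_status root
```

Storing the pipeline's result in a variable first keeps the condition short, which is most of what makes the if version easier to read.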

Note that the translation between if statements and &&/|| is inexact. This is because && and || have equal precedence and are evaluated from left to right, so a && b || c groups as (a && b) || c. The if statement

if [ -f "$file" ]; then
    echo "'$file' found"
else
    echo "'$file' not found"
fi

when rewritten with && and ||, is actually evaluated as

( [ -f "$file" ] && echo "'$file' found" ) || echo "'$file' not found"

This means that if the first echo command failed, the second would also be run. That rarely matters here, since echo seldom fails, but use &&/|| combinations carefully with other commands.
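The difference becomes visible as soon as the middle command can fail. In this sketch, false stands in for any failing command, and with_chain and with_if are hypothetical helpers.

```shell
#!/bin/bash
# Both helpers are hypothetical. "false" stands in for any command that
# fails in the "found" branch.
with_chain() {
    [ -f "$1" ] && false || echo "'$1' not found"
}
with_if() {
    if [ -f "$1" ]; then
        false
    else
        echo "'$1' not found"
    fi
}

existing=$(mktemp)    # a file that definitely exists
echo "chain: $(with_chain "$existing")"    # wrongly reports "not found"
echo "if: $(with_if "$existing")"          # reports nothing after "if: "
rm -f "$existing"
```

The chain prints the "not found" message for a file that exists, because the failure of the middle command triggers the || branch. The if statement never does this.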
