City College of San Francisco - CS160B Unix/Linux Shell Scripting Module: Functions
If you program in most languages, you are told to use functions for everything. This is not a good rule when programming in shell. Because of their simple interface, functions can easily obscure, rather than clarify, program flow. However, well-designed functions are still useful tools in our scripting toolbox in several situations:
you want to use the same function(s) in multiple scripts. These functions would be placed in a separate file (function library) and sourced (using the . operator) by each script that needed them.
you need to perform the same task at several places in your shell script. A well-designed function may be useful to perform this repetitive task.
you have a well-defined task that produces a boolean (success/failure) result and you want to hide the complexity from your main script. Again, if you are careful about the interface, a function can clarify your code.
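The first situation, a function library, can be sketched as follows. This is only an illustration: the library file (created here with mktemp so the sketch is self-contained) and the function name say_hello are hypothetical.

```shell
#!/bin/bash
# Hypothetical function library, written to a temporary file so that
# this sketch is self-contained. Normally it would be a file you maintain.
lib=$(mktemp)
cat > "$lib" <<'EOF'
say_hello() {
    echo "Hello, $1"
}
EOF

# Source the library with the . operator. Its functions are now
# defined in the current shell.
. "$lib"
rm -f "$lib"

say_hello world
```

Every script that sources the same library gets the same definition of say_hello.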
It is, however, generally a bad idea to modularize your code using functions. If your shell script must perform four well-defined tasks sequentially, it may be tempting to write four functions and call them in sequence. This is a bad idea and will result in a script which is more difficult to write, debug, and, most importantly, to analyze than the equivalent script using sequential code.
Why all the fuss? Aren't functions wonderful? The problem is that all variables used in a shell program are, by default, global, whether they appear in a function or not! This makes it much easier to follow a shell program written as a single piece of code than it is if functions are used. Functions are still useful, but they must be written carefully to enhance, rather than obscure, your program. We will learn techniques to do this later in this module.
Function syntax
Functions look and act like mini-commands. They have their own set of arguments: $1 $2 ... $9 $* $@ $# except for $0 which, in bash, is still the name of the shell script.
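A small sketch of a function's private argument list. The function name show_args is hypothetical, and set -- is used only so the sketch behaves the same however it is run.

```shell
#!/bin/bash
# show_args is a hypothetical name used only for this illustration.
show_args() {
    echo "function received $# arguments; the first is '$1'"
}

# Replace the script's own arguments so the sketch behaves the same
# however it is run.
set -- one

echo "script received $# arguments"
show_args alpha beta gamma
echo "script still has $# arguments; the first is '$1'"
```

Inside show_args, $# and $1 describe the function's arguments. When the function returns, they again describe the script's arguments.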
Just like your shell script or any Unix command, functions can output to standard output or to standard error, they can be used in command substitution, and they have an exit status. Only in this last instance, the exit status, is there a difference - the function exit status is set using return n. If you call exit inside a function it still exits the shell script.
Listen up, all you C++ and Java programmers! return n in a function sets the function exit status to n. It is not used to return a data value to the caller!
Just like exit 0 is used to indicate the program succeeded, return 0 indicates the function succeeded. Any other value indicates the function failed.
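A minimal sketch of return setting the exit status. The function is_even is hypothetical and is not one of the course examples.

```shell
#!/bin/bash
# is_even is a hypothetical function used only for illustration.
# It reports its answer through its exit status: 0 (success) if the
# argument is even, 1 (failure) otherwise. It outputs nothing.
is_even() {
    local n=$1
    if (( n % 2 == 0 )); then
        return 0
    fi
    return 1
}

# The exit status is tested directly, just as for any command.
if is_even 4; then
    echo "4 is even"
fi

# $? holds the exit status of the most recent command, here is_even 7.
is_even 7 || echo "exit status for 7: $?"
```

Note that nothing is "returned" to a variable. The caller learns only whether the function succeeded or failed.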
Let's try a function in a complete program and see how this works. Here's our program:
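A minimal sketch of such a program, inferred from the output shown below rather than taken from the course files. The value assigned to name inside the function is an assumption.

```shell
#!/bin/bash
# Sketch of firstfuncss, inferred from its output.

firstfunc() {
    # name is not declared local, so this assignment is global.
    # The value assigned here is an assumption.
    name=firstfunc
    echo "Hello from $name"
    echo "I was called from $0"
    echo "with $# arguments. They are"
    echo "$*"
}

echo "in $0: received $# arguments. Calling firstfunc()"
firstfunc here are the arguments
```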
Functions in the shell act like mini-commands, so calling this function looks just like using a standard Unix command. The function call is on the last line of our program.
When we run our shell script (named firstfuncss), the output is:
-bash$ firstfuncss
in ./firstfuncss: received 0 arguments. Calling firstfunc()
Hello from firstfunc
I was called from ./firstfuncss
with 4 arguments. They are
here are the arguments
-bash$
Let's go through our first use of a function with a few notes:
The function interface
By default, all variables in a function are global. Thus, if the function creates or alters any variables inside the function, those variables are "visible" when the function returns. In the last example, the function assigns to a variable name. After the function is called, name is altered.
Below is a slight modification of our program firstfuncss, with the modifications in bold face. The function itself has not been modified, but notice what happens to the variable name:
and running our new program (firstfuncss1) outputs (again, changes are in bold):
Looking at the main shell script, it is not obvious that firstfunc() modifies the variable name. This modification of a program variable by a function is called a side-effect, and it is why shell functions can obscure your program.
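The same side-effect can be reproduced in a few lines. This is a sketch, not the course program: the function greet is hypothetical, and name is deliberately not declared local.

```shell
#!/bin/bash
# greet is a hypothetical function. Its variable name is global
# because it is not declared local.
greet() {
    name=$1
    echo "Hello, $name"
}

name='Greg Boyd'
echo "before the call: name is '$name'"
greet world
echo "after the call: name is '$name'"
```

After the call, name is 'world'. Nothing at the call site warns the reader that this will happen.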
This side-effect can be avoided by declaring the variable in the function using the local declaration. Variables declared as local in a function do not conflict with variables in the surrounding shell script. Look at the output of this same example after adding the line
local name
to the beginning of firstfunc():
and the output of running this version (firstfuncss_local):
-bash$ firstfuncss_local
in ./firstfuncss_local: name is 'Greg Boyd'
in ./firstfuncss_local: received 0 arguments. Calling firstfunc()
Hello from firstfunc
I was called from ./firstfuncss_local
with 4 arguments. They are
here are the arguments
after firstfuncss: name is 'Greg Boyd'
-bash$
Thus we have our first rule of using functions in shell scripts: all variables in a function should be declared local. (Note: in other shells, local would be typeset. Either declaration works in bash.)
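A short sketch of the rule in action. The function greet is hypothetical and is not one of the course examples.

```shell
#!/bin/bash
# greet is a hypothetical function. Its variable name is declared
# local, so it is separate from the script's variable of the same name.
greet() {
    local name=$1
    echo "Hello, $name"
}

name='Greg Boyd'
greet world
echo "after the call: name is '$name'"
```

Here name is still 'Greg Boyd' after the call.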
Passing data to and from functions
Passing data to a function is easy - use the function arguments. Inside your function, copy the argument(s) into named local variables for clarity. If your function needs more than a few arguments, unless they are a list of objects to work on, it probably shouldn't be a function. You can also "pass" data to a function using global variables. If you must do this, you must call attention to it in the function documentation and at the function call.
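A sketch of copying arguments into named local variables, where the remaining arguments are a list of objects to work on. The function label_items is hypothetical.

```shell
#!/bin/bash
# label_items is a hypothetical function used only for illustration.
# usage: label_items label item...
label_items() {
    local label=$1      # first argument: a label
    shift               # what remains is the list of items
    local item
    for item in "$@"; do
        echo "$label: $item"
    done
}

label_items fruit apple pear
```

Inside the function, label says far more to the reader than $1 does.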
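Getting data back from a function is much more difficult. There are two "safe" methods: use the function's exit status (set with return) to report success or failure, or write the result to standard output and capture it with command substitution.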
The only other way to "get" data from a function is to use global variables. This violates basic programming practices, but it is acceptable if it is obvious and limited. You should document such side-effects both at the function call and at the function entry.
Example #1 - a Boolean function
In this example we will write the simplest kind of function: one that returns a Boolean value. This is a very nice task for a shell function because if the function is written correctly it is very safe and its use can be self-documenting.
The name of the function is loggedin. loggedin() will take a single argument, a username. It will return success or failure depending on whether the given user is currently logged in.
The raw material for our function is the output of who, where fields are separated by multiple spaces:
Given a username, our function will succeed or fail depending on whether there is a line for that user in the output of who. We need to use a regular expression to match the username. The RE is very simple: the username must appear at the beginning of a line and it must be followed by a space. Our function will simply succeed or fail - it will not output any error messages.
Here is a reasonable function and an example of its use:
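A minimal sketch of such a function, written from the description above rather than taken from the course files. Defaulting to the user running the script (via id -un) is an assumption made so the sketch can be run without arguments.

```shell
#!/bin/bash
# loggedin: succeed if the user named in $1 appears in the output of who
# usage: loggedin username
loggedin() {
    local user=$1
    if who | grep "^$user " > /dev/null
    then
        return 0
    else
        return 1
    fi
}

# Example of use. With no argument, check the user running the script.
user=${1:-$(id -un)}
if loggedin "$user"
then
    echo "$user is logged in"
else
    echo "$user is not logged in"
fi
```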
This is straightforward and easy to read. However, there is a bit of code in loggedin() that is redundant. Here is a simpler version:
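A sketch of the simpler version, written from the description rather than taken from the course files. As before, defaulting to the user running the script is an assumption.

```shell
#!/bin/bash
# loggedin: succeed if the user named in $1 appears in the output of who
# usage: loggedin username
loggedin() {
    local user=$1
    # The exit status of grep becomes the exit status of the function.
    who | grep "^$user " > /dev/null
}

# Example of use. With no argument, check the user running the script.
user=${1:-$(id -un)}
if loggedin "$user"
then
    echo "$user is logged in"
else
    echo "$user is not logged in"
fi
```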
Here we are taking advantage of the fact that the exit status of a function is the exit status of the last command executed in it. I'll let you decide which is more clear. (The file containing this last version is named loggedin.)
Example #2 - returning a simple value using a global variable
For this example, we will expand on our loggedin program. Our function (and the program) will be named wholine. This function will extract the first line from the output of who corresponding to a user and place it in a global variable line. The user, again, is given as its single argument. We will continue to use the exit status of the function to indicate whether the user was found or not.
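A minimal sketch of such a function, written from the description above rather than taken from the course files. The use of head to keep only the first matching line is an assumption about how the original did it.

```shell
#!/bin/bash
# wholine: find the first line of who output for the user named in $1
# usage: wholine username
# SIDE EFFECT: sets the global variable line
wholine() {
    local user=$1
    line=$(who | grep "^$user " | head -n 1)
    # Succeed only if a line was found.
    [ -n "$line" ]
}

# With no argument, look up the user running the script.
user=${1:-$(id -un)}

# NOTE: wholine sets the global variable line
if wholine "$user"
then
    echo "$line"
else
    echo "$user is not logged in" >&2
fi
```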
You should notice a few things about this function:
Example #3 - returning a simple value using command substitution
We will modify the previous example slightly to make it "safer". Here, our function will write its result to standard output and we will use command substitution to capture it (we will name our new file wholine1):
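A minimal sketch of this version, written from the description rather than taken from the course files. The variable found and the use of head are assumptions.

```shell
#!/bin/bash
# wholine: output the first line of who output for the user named in $1
# usage: line=$(wholine username)
wholine() {
    # found is not declared local. Because the function is run through
    # command substitution, this assignment happens in a separate process
    # and cannot change a variable named found in the calling script.
    found=$(who | grep "^$1 " | head -n 1)
    [ -n "$found" ] && echo "$found"
}

# With no argument, look up the user running the script.
user=${1:-$(id -un)}

# The exit status of the assignment is the exit status of wholine.
if line=$(wholine "$user")
then
    echo "$line"
else
    echo "$user is not logged in" >&2
fi
```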
This technique is better than the use of a global variable for this reason: since we use command substitution to run the wholine() function, the function runs in a separate process and any variables it sets cannot affect our variables. Thus the function can use all the variables it wants and there is no need for a local declaration.
Remember, all the standard output of a command is captured when you use command substitution. This is why it is important to always output error messages and user prompts to standard error.
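A short sketch of why this matters. The function first_word is hypothetical and is not one of the course examples.

```shell
#!/bin/bash
# first_word is a hypothetical function used only for illustration.
# It prints the first of its arguments on standard output and reports
# a usage error on standard error.
first_word() {
    if [ $# -eq 0 ]; then
        echo "first_word: no arguments given" >&2
        return 1
    fi
    echo "$1"
}

word=$(first_word hello world)
echo "captured: '$word'"

# The error message goes to standard error, so it appears on the
# terminal and is not captured in the variable.
word=$(first_word) || echo "captured: '$word'"
```

If the error message were written to standard output instead, it would end up in word and be mistaken for a result.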
The three files loggedin, wholine and wholine1 are in this module's directory beneath the examples directory on hills.
Using functions safely
Let's review the guidelines for functions in the shell:
Preview question: Find the directory corresponding to this module's example files on hills and review (and run) the three programs we just discussed before proceeding to the next section.
This page was made entirely with free software on linux: Kompozer and Openoffice.org