City College of San Francisco - CS160B Unix/Linux Shell Scripting Module: Advanced Topics
In previous sections, we learned about some operators on the value of a variable, specifically
${VAR:?message}
${VAR:-value}
and
${#VAR}
In this section we will add new operators - the substring operators. These operators output the variable's value after extracting, deleting, or modifying part of it. In all cases, the value of the variable is unchanged unless you reassign to it. Also, in all cases, the result is substituted on the command-line, so you must do something with it, either use it in another command or assign it to a variable.
Extract characters by offset and length
The simplest substring operator allows you to output certain characters from the value of a variable. The characters output are indicated by the starting character and a length:
${VAR:offset:length}
As in all of these operators, VAR is the name of the variable. Here, offset and length are integers. offset indicates the first character to output, where characters are numbered starting at 0. length indicates how many characters to output.
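A minimal sketch of the offset and length form, assuming a sample value chosen here purely for illustration:

```shell
#!/bin/bash
# Sample value, chosen for illustration only.
VAR=abcdefghij

first3=${VAR:0:3}    # offset 0, length 3: characters 0, 1 and 2
middle=${VAR:3:4}    # offset 3, length 4: characters 3 through 6

echo "$first3"       # abc
echo "$middle"       # defg
echo "$VAR"          # abcdefghij (the variable itself is unchanged)
```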
If length is missing, the remainder of the value is output:
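For instance, with the same illustrative sample value, leaving out the length might look like this:

```shell
#!/bin/bash
# Sample value, chosen for illustration only.
VAR=abcdefghij

rest=${VAR:6}        # no length: everything from character 6 to the end

echo "$rest"         # ghij
```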
The offset and/or length can be variables themselves:
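A short sketch, assuming hypothetical variables named start and len. In bash the offset and length are evaluated as arithmetic, so the $ in front of the names is optional:

```shell
#!/bin/bash
# Sample value and hypothetical helper variables, for illustration only.
VAR=abcdefghij
start=2
len=5

piece=${VAR:$start:$len}   # characters 2 through 6
same=${VAR:start:len}      # same result; names are evaluated arithmetically

echo "$piece"              # cdefg
echo "$same"               # cdefg
```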
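VAR, however, must be a variable name. You cannot apply the operator directly to the output of command substitution: writing a $(...) command where VAR belongs does not work (bash reports a bad substitution). Save the output in a variable first, then apply the operator to that variable.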
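A sketch of the save-it-first approach. The printf command here is only a stand-in for whatever command produces the text, and the variable names are hypothetical:

```shell
#!/bin/bash
# The next line is the form that fails (bad substitution), so it stays commented out:
# echo ${$(printf 'Monday'):0:3}

# Save the command's output in a variable first...
day=$(printf 'Monday')
# ...then apply the substring operator to the variable.
abbrev=${day:0:3}

echo "$abbrev"       # Mon
```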
Substitute text in variable's value
The substring operator
${VAR/pat/str}
can be used to substitute str for the first instance of pat in the value of VAR. Here, pat is a wildcard pattern, and str is a string. Let's look at a few examples:
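A minimal sketch, assuming line holds abcdefdef (the value implied by the outputs shown elsewhere in this section) and using XYZ as an arbitrary replacement string:

```shell
#!/bin/bash
# Assumed sample value.
line=abcdefdef

result=${line/def/XYZ}   # only the first def is replaced

echo "$result"           # abcXYZdef
echo "$line"             # abcdefdef (unchanged)
```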
Here, of course, pat is just a string. It could just as easily be a wildcard:
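A sketch of a wildcard pattern, again assuming line holds abcdefdef and using X as an arbitrary replacement:

```shell
#!/bin/bash
# Assumed sample value.
line=abcdefdef

# *d matches any run of characters followed by d.
result=${line/*d/X}

echo "$result"           # Xef
```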
Note that there were two possible matches for the pattern *d: abcd and abcdefd. The substring operator, like grep, chooses the longest match. Another way to remember this is that patterns are greedy.
You can modify this simple behavior of substituting the first match by a few changes to your substring operators:
${VAR//pat/str}
changes all occurrences of pat to str.
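To compare the single and double slash forms side by side, here is a sketch assuming line holds abcdefdef:

```shell
#!/bin/bash
# Assumed sample value.
line=abcdefdef

first=${line/def/-}      # one slash: first match only
all=${line//def/-}       # two slashes: every match

echo "$first"            # abc-def
echo "$all"              # abc--
```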
${VAR/#pat/str}
still substitutes str for pat, but pat must be anchored at the left end of the string. Similarly, pat in
${VAR/%pat/str}
must be anchored at the right end of the string. Let's look at a few examples:
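The expression ${line/??c/} deletes the first instance of ??c in line. When the value of line is abcdefdef, the first match (abc) sits at the left end, so the result is the same as that of ${line/#??c/}, but
$ echo ${line/%??c/}
abcdefdef
deletes nothing, because the pattern anchored on the right matches only if the last character in the value is c.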
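The two anchors can be sketched together as follows, assuming line holds abcdefdef and using X as an arbitrary replacement:

```shell
#!/bin/bash
# Assumed sample value.
line=abcdefdef

left=${line/#abc/X}      # abc is at the left end, so it is replaced
no_left=${line/#def/X}   # def is not at the left end, so nothing changes
right=${line/%def/X}     # only the def at the right end is replaced
no_right=${line/%abc/X}  # abc is not at the right end, so nothing changes

echo "$left"             # Xdefdef
echo "$no_left"          # abcdefdef
echo "$right"            # abcdefX
echo "$no_right"         # abcdefdef
```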
The hardest thing about these operators is remembering which of # and % anchors on which end. An easy way to remember: the % symbol always comes to the right of a number, and # always comes to the left of a comment! (These characters and their meaning will be more important in other substring operators.)
Delete pattern operators
The last set of substring operators takes a little getting used to. These operators delete part of a variable's value using a pattern, either on the left or on the right end, then substitute what remains. This result is often compared against the original value to see if the deletion worked, or, in other words, to see if the variable contained the pattern. Although these operators are very useful, the convoluted logic is difficult at first. Let's look at an example:
Suppose you have a path in $file and want to know if the path ends in .pdf. Here is what you do:
variable $file | after deleting .pdf on right | is it different than the original? |
ends in .pdf | value changed | yes |
doesn't end in .pdf | value unchanged | no |
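In words: delete .pdf from the right end of the value, then compare the result with the original. If the two differ, the path ends in .pdf; if they are the same, it does not.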
These deletion operators have the following form
${VARoppat}
where VAR is the variable's name, op is the operator (see below), and pat is a wildcard pattern that matches the text to be deleted. If pat does not match, nothing is deleted, and the original variable's value is substituted.
operator | meaning |
# | delete shortest match anchored on left |
## | delete longest match anchored on left |
% | delete shortest match anchored on right |
%% | delete longest match anchored on right |
Before we take up our .pdf problem, let's use our new operators on our variable VAR:
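A sketch of the two left-anchored operators, assuming VAR holds the sample value abcabcdef (chosen here for illustration) and using the pattern *c:

```shell
#!/bin/bash
# Sample value, chosen for illustration only.
VAR=abcabcdef

shortest=${VAR#*c}       # shortest match on the left is abc
longest=${VAR##*c}       # longest match on the left is abcabc

echo "$shortest"         # abcdef
echo "$longest"          # def
```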
Since the variable's value does not end in c, it is silly to use the same pattern and variable when the pattern is anchored on the right. Instead we will use c*.
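The right-anchored operators with the pattern c* can be sketched the same way, again assuming VAR holds the illustrative value abcabcdef:

```shell
#!/bin/bash
# Sample value, chosen for illustration only.
VAR=abcabcdef

shortest=${VAR%c*}       # shortest match on the right is cdef
longest=${VAR%%c*}       # longest match on the right is cabcdef

echo "$shortest"         # abcab
echo "$longest"          # ab
```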
Returning to our .pdf problem, let's fill out our code:
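One way the test might be written, as a sketch. The path and the answer variable are hypothetical:

```shell
#!/bin/bash
# Hypothetical path, for illustration only.
file=/home/student/report.pdf

# Delete .pdf from the right end and compare with the original value.
if [ "${file%.pdf}" != "$file" ]; then
    answer="ends in .pdf"
else
    answer="does not end in .pdf"
fi

echo "$file $answer"
```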
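Again, here is what is happening: ${file%.pdf} deletes .pdf from the right end of the value if it is there. If the result differs from $file, the path ends in .pdf. If the result is the same as $file, it does not.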
Examples:
1. If $name contains a person's name in the form last,first or last,first middle(s), write a sequence using substring operators to output their name as first last or first middle(s) last.
First, we separate the last name from the rest:
last=${name%,*}
and the rest from the last name
rest=${name#*,}
then output the result
echo "$rest $last"
Alternately, we could do this in one step
echo "${name#*,} ${name%,*}"
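As a quick check of the steps above, here is a sketch run on a made-up name:

```shell
#!/bin/bash
# Made-up sample name in the form last,first middle.
name="Smith,John Quincy"

last=${name%,*}          # delete the comma and everything after it
rest=${name#*,}          # delete everything up to and including the comma
full="$rest $last"

echo "$full"             # John Quincy Smith
```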
2. Mimic the basename command using substring operators, using the variable $path.
The simplest solution for this is ${path##*/}, but there is a caveat:
So you would need an if statement in addition to the substring operator.
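A rough sketch of such a combination. It assumes the special cases of interest are the path / and a path with one trailing slash (basename /usr/bin/ prints bin). The function name my_basename is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper; handles only the special cases named above.
my_basename() {
    local path=$1
    if [ "$path" = "/" ]; then
        echo "/"             # the root directory is its own basename
        return
    fi
    path=${path%/}           # drop one trailing slash, if there is one
    echo "${path##*/}"       # delete everything through the last slash
}

my_basename /usr/bin/ls      # ls
my_basename /usr/bin/        # bin
my_basename /                # /
my_basename file1            # file1
```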
3. Mimic the dirname command using substring operators, using the variable $path.
The simplest solution for this is ${path%/*}, but, again, there are special cases:
$ dirname /file1
/
$ dirname file1
.
so you would, again, need a couple of if statements.
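A sketch covering the two special cases shown above. The function name my_dirname is hypothetical, and paths with trailing slashes are not handled:

```shell
#!/bin/bash
# Hypothetical helper; covers only the special cases shown above.
my_dirname() {
    local path=$1
    local dir=${path%/*}     # delete the last slash and what follows it
    if [ "$dir" = "$path" ]; then
        echo "."             # nothing was deleted: the path has no slash
    elif [ -z "$dir" ]; then
        echo "/"             # the only slash was the leading one
    else
        echo "$dir"
    fi
}

my_dirname /usr/bin/ls       # /usr/bin
my_dirname /file1            # /
my_dirname file1             # .
```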
4. In Example 1 from module 7, two lines of our shell script were:
if file "$dir/$file" | grep ':.*text' > /dev/null; then
if ! echo "$file" | grep -q '\.txt$'; then
Rewrite this code using substring operators.
Assuming the string text output by file always comes at the end of the description, we can rewrite this code as follows:
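One possible rewrite, as a sketch. The variables desc and verdict are hypothetical, and desc is set to a made-up description so the fragment runs on its own. In the real script it would be set with desc=$(file "$dir/$file"):

```shell
#!/bin/bash
# Made-up stand-in for the output of: file "$dir/$file"
desc="notes: ASCII text"
file=notes
verdict="no action"

# The description ends in text if deleting text on the right changes it.
if [ "${desc%text}" != "$desc" ]; then
    # The name lacks .txt if deleting .txt on the right changes nothing.
    if [ "${file%.txt}" = "$file" ]; then
        verdict="text file without .txt"
    fi
fi

echo "$verdict"          # text file without .txt
```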
Conclusion
After looking at these substring operators, you might ask: Why would I want to use them? You are correct: they are confusing. Even seasoned shell programmers must stop when they encounter substring operators so they can decipher them. There are three reasons we learn them:
This page was made entirely with free software on Linux: Kompozer and OpenOffice.org