By David W. Myers (dwm042@comcast.net)
written March 30, 2000
revised November 22, 2006
The Purpose of this Document
The purpose of this document is to detail ways to write clean, commercial-grade shell scripts that others can easily read, understand, maintain, and use without having to guess at the author's intent. We detail good and poor coding practices and suggest alternatives to inefficient or situation-dependent techniques. The focus is on Solaris, HP-UX, and the Korn shell, with some time taken to show how to write commercial-grade startup and shutdown scripts. Some examples are taken from Linux, as this Unix variant is gaining popularity and the contrast is useful in detailing the common features of the Unix operating system.
The Nature of Modern Shells
If you log onto a modern computer running Sun Solaris 2.6 and enter the command
# file /bin/sh
the response would be:
/bin/sh: ELF 32-bit MSB executable SPARC Version 1, dynamically linked, stripped
The pertinent information is that the executable is dynamically linked. To run, sh requires access to the shared libraries in /usr. Consequently, on a system in single user mode with only the root file system mounted, sh will fail. In fact, all the default shells in a modern SysVR4 Unix are dynamically linked except for one: the statically linked shell in /sbin/sh (in Linux, the equivalent is /sbin/sash). Therefore, the only shell that is safe to use in startup and shutdown scripts is the statically linked shell /sbin/sh.
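As a rough illustration of that check, the sketch below classifies a line of 'file' output. The function name linkage is hypothetical, and the sketch assumes that 'file' reports the phrases "dynamically linked" or "statically linked" as in the Solaris output shown above; other systems may word this differently.

```shell
# linkage - classify a description line printed by file(1).
# Hypothetical helper; it assumes the description contains the words
# "dynamically linked" or "statically linked".
linkage()
{
    case "$1" in
        *"dynamically linked"*) echo dynamic ;;
        *"statically linked"*)  echo static ;;
        *)                      echo unknown ;;
    esac
}

# Classify the sample output quoted in the text.
linkage "/bin/sh: ELF 32-bit MSB executable SPARC Version 1, dynamically linked, stripped"
```

On a live system the call would look like: linkage "$(file /sbin/sh)"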
Default shells in Unix are moving away from their Bourne roots and are becoming (as with bash) or have already become POSIX-compliant shells (8-10). This is in contrast to the Korn shell, ksh, which has noticeably different properties with regard to function handling and the passing of command line parameters to functions (12). The new Korn shell, the 1993 version of ksh, is a hybrid, capable of running as a POSIX-compliant shell while also supporting a function syntax identical to that of the 1988 shell. As a consequence, ksh is arguably one of the most versatile of the Unix shells, and it has more recently become available as an open source product (13).
A competitor to ksh among shells is the Z-shell, zsh, which has superior command line editing capabilities, sports a ksh compatibility mode, and is becoming more widely distributed (14). Zsh is now available as a download from http://sunfreeware.com/ for Solaris 8. However, age, use, and familiarity still argue in favor of ksh as the working shell for many Unix systems administrators (20). The POSIX shell /sbin/sh has to be used in startup and shutdown, and zsh has many advantages as the interactive shell of choice. But as a scripting shell, ksh has been widely used and is a de facto standard among 'old school' administrators.
Alternatives to Shell
In recent years, Perl has come to the fore as a scripting language, especially as a glue language for web-based applications (15). Larry Wall, the language's creator, has been very vocal in its support, and as an open source product it has benefited from the groundswell of support for the open source movement. The result is that Perl has full-featured open source libraries and a larger body of code than any other general purpose scripting language (20). Even so, when a machine is down and /usr isn't mounted, an administrator still has to be able to write shell script.
Shell is not fast: it's an interpreter that often relies on compiled C binaries to accomplish certain tasks (e.g. sorting). It is fairly common to write a program in shell, find that it doesn't run fast enough, and then have to rewrite it in a compiled language such as C.
Naming Your Scripts
Buchholtz (21) had this to say about the naming of files in Unix, and all I can add is that I wholeheartedly agree with this sentiment:
"Do *not* use the filename to indicate the language an executable was written in! I've been burned by this and now make it a standard part of my training program for student-sysadmins: The following happened here:
1) Nice shell script was written to handle foo. Newbie named it 'foo.sh'.
2) It got used in dozens of scripts.
3) We figured that to really handle all of the cases, the Bourne shell just wasn't cutting it, so it was re-written in C. All the scripts that used 'foo.sh' broke, so we renamed the binary executable, 'foo.sh'. Now, all the scripts run fine. Gahk.
4) Perl was born, and eventually the foo program was rewritten in perl. Now we have a perl script named 'foo.sh'.
5) We eventually hunted down all the calls and changed them.
In CP/M, DOS, and Windows using extensions for executables works because the command to run 'foo.bat' is 'foo'. In Unix, the command to run 'foo.sh' is 'foo.sh', so adding the suffix kills your implementation independence. Data files can get extensions: .tar, .Z, .gz, .tar.gz, .cf, .a, .so, etc. Executables don't get extensions. Unless you like typing 'ls.exe', 'cp.exe file1 file2', etc :-)"
Beginnings: The First Line of a Shell Script
The first line of a shell script should always tell you which shell is in use:
Either:
#!/sbin/sh
or
#!/bin/ksh
The alternative is to allow the default shell to execute the code. This can break scripts. If the user is in csh and runs a ksh script, the Korn shell script is likely to fail. Moreover, naming the language on the first line tells the reader what kind of program they are reading.
It is the belief of the author that the same practice should hold for other scripting languages, such as awk and Perl.
#!/bin/awk -f
#!/usr/bin/perl
The Minimal Header
The minimal header has three comment lines at the beginning of the script. The first line names the interpreter in use. The second should be a comment giving the name of the script and a brief summary of its usage. The third is a blank comment line, for clarity.
#!/bin/ksh
# findman - a program for searching through the file system
#           and finding all directories with man pages
#
How Much to Document in the Header?
The honest answer here is: decide what you need to document and stick to it. When I write code, my standard header is:
#!/bin/ksh
#
# name: findman
# author: David Myers
# created: 02/19/1998
# modified: 09/23/1999
#
followed by a usage function. Less information may be necessary if you are using source code control, such as CVS.
Search Path, Critical Environment Variables
Critical environment variables, such as PATH, need to be explicitly set in a good script, or they need to be sourced from an environment file by the script itself. Relying on a .login or .profile to set these variables is bad practice. An administrator could enter the working account with the command:
# su username
and thereby miss the .login or .profile altogether.
Source Files: Where Should They be Located?
The files that are sourced to provide environment variables for scripts should be located on the root file system. My personal preference is to use the /env subdirectory. Others may prefer a different location.
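A minimal sketch of this practice follows. The file name /env/myenvvars.env follows the /env preference stated above, and the fallback PATH value is only an assumption for illustration; substitute whatever your site requires.

```shell
# Environment file kept on the root file system (assumed location).
EnvFile=/env/myenvvars.env

if [ -r "$EnvFile" ] ; then
    # Source the shared settings into the current shell.
    . "$EnvFile"
else
    # No environment file: fall back to an explicit search path
    # (an assumed value) rather than trusting the caller's PATH.
    PATH=/usr/bin:/usr/sbin:/sbin:/bin
    export PATH
fi
```

Testing with -r before sourcing matters because a non-interactive shell may abort if the file given to '.' cannot be found.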
Argument Checking
A good script should check that all supplied arguments are correct.
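As a sketch of what such checking can look like, the function below validates the arguments of a hypothetical script, myscript, that expects a readable file and a whole number. The function name, the script name, and the messages are all assumptions made for illustration.

```shell
# check_args - validate the arguments of a hypothetical script that
# expects exactly two arguments: a readable file and a whole number.
check_args()
{
    if [ $# -ne 2 ] ; then
        echo "Usage: myscript file count" >&2
        return 1
    fi
    if [ ! -r "$1" ] ; then
        echo "myscript: cannot read $1" >&2
        return 1
    fi
    case "$2" in
        ''|*[!0-9]*)
            # Empty, or contains something other than a digit.
            echo "myscript: count must be a number: $2" >&2
            return 1 ;;
    esac
    return 0
}
```

In the body of the script the call would be: check_args "$@" || exit 1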
Exit Status
A good script should exit. It is not good practice to start processes with an endless loop. The script should return an exit status of 0 if the program completed correctly, and a non-zero exit status if the program aborted.
Scripts tend to build on scripts. A script, once written, can be used by another piece of code. So the onus is on the script writer to provide meaningful return codes.
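The sketch below shows one way to provide and consume such return codes. The helper backup_file and its particular codes (2 and 3) are hypothetical; the point is only that each failure gets its own non-zero value and that the caller inspects it.

```shell
# backup_file - hypothetical helper: copy a file to file.bak and
# return a distinct code for each kind of failure.
backup_file()
{
    [ -f "$1" ] || return 2          # source file is missing
    cp "$1" "$1.bak" || return 3     # the copy itself failed
    return 0
}

Status=0
if backup_file /no/such/file ; then
    echo "backup complete"
else
    Status=$?                        # status of the failed call
    echo "backup failed with status $Status" >&2
fi
```

A complete script would finish with 'exit $Status' so that any script calling it can make the same kind of test.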
Variables
Because environment variables are conventionally written in capitals, a separate convention is advisable for variables within scripts. Our recommendation is to capitalize the first letter of the variable and leave the rest lower case.
#!/bin/ksh
# variableexamplescript - shows the capitalization pattern
#                         for variables internal to a script
#
Thisvariable=$1
Thatvariable=$2
Another style is to capitalize each 'word' within the name of the variable. It seems a reasonable way to do things. Whatever convention you choose, it should be consistent. The point is to create code with a single look, and thereby ease the task of the reader.
#!/bin/ksh
# VariableExampleScripts - another capitalization pattern for
#                          variables internal to a script
#
ThisVariable=$1
ThatVariable=$2
Local Variables
You can scope a variable to a function by declaring it with 'typeset' inside the function. This avoids side effects on variables in the main body of the code.
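A small sketch of the effect, with hypothetical names (bump, Count). It assumes a shell that supports 'typeset' (ksh, bash, zsh); note that in the 1993 Korn shell only functions defined with the 'function' keyword give 'typeset' variables local scope.

```shell
Count=10

function bump
{
    typeset Count=0            # local to bump; the outer Count is untouched
    Count=$(( Count + 1 ))
    echo "inside: $Count"
}

bump                           # prints "inside: 1"
echo "outside: $Count"         # prints "outside: 10"
```

Without the typeset line, the function would overwrite the Count used by the main body of the script.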
Code Blocks
The use of '{' and '}' to separate code into functional blocks is a commendable practice. Consider the following:
Code Fragment 1:
#!/bin/ksh
# logexample – code sends output to a log file
#
. /env/myenvvars.env
Log=/tmp/mylog.log
ps -ef >> $Log
metastat >> $Log
metadb -i >> $Log
Code Fragment 2:
#!/bin/ksh
# logexample2 – code sends output to a log file
#
. /env/myenvvars.env
Log=/tmp/mylog.log
{
    ps -ef
    metastat
    metadb -i
} >> $Log
The second is much easier to read and maintain. If $Log is renamed, only one line has to change rather than many.
The 'read' Builtin
In many cases, code that would otherwise be written in awk can be written effectively in shell script by using the read builtin to parse text. Starting an awk process just to split a few fields is inefficient compared with the builtin in the Korn and POSIX shells.
#!/bin/ksh
# killname - a program to kill all processes given a command
#            line pattern
#
[ -n "$1" ] || { print "No argument given. Usage: $0 [pattern]" ; exit 1 ; }
# The second column of ps -ef output is the pid; the rest is discarded in f.
ps -ef | grep "$1" | grep -v grep | while read f Pid f ; do
    kill -9 $Pid
done
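The same builtin can split on other separators by setting IFS for the duration of the read. The sketch below does with read what 'awk -F:' might otherwise be used for; the function name show_homes and the sample passwd-style records are made up for illustration.

```shell
# show_homes - read passwd-style records on standard input and print
# the login name, uid and home directory. Unwanted fields go into f.
show_homes()
{
    while IFS=: read Name f Uid f f Home f ; do
        echo "$Name ($Uid) lives in $Home"
    done
}

# Sample records, invented for this example.
show_homes <<EOF
root:x:0:0:Super-User:/:/sbin/sh
dwm:x:1001:10:Example User:/export/home/dwm:/bin/ksh
EOF
```

Because IFS is set only on the read command itself, the rest of the script keeps its normal field splitting.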
The Use of 'cat'
Piping a file into a code block with cat is less efficient than redirecting the file into the block: the cat costs an extra process and a pipe.
Not Recommended:
cat $Log | while read f Pid f Name f Message f ;
do
    print "Process $Name with pid $Pid left error message $Message"
done
Recommended:
while read f Pid f Name f Message f ; do
    print "Process $Name with pid $Pid left error message $Message"
done < $Log
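Redirection has a second benefit that matters when a script moves between shells. In some shells (bash and several /bin/sh implementations, for instance) every stage of a pipeline runs in a subshell, so variables set inside a 'cat file | while read' loop are lost when the loop ends; ksh happens to run the last stage in the current shell. With redirection the loop runs in the current shell everywhere. The sketch below uses a scratch file and hypothetical variable names to show a counter surviving the loop.

```shell
# Build a small scratch file to read (name chosen for this example).
Log=${TMPDIR:-/tmp}/readdemo.$$
printf '%s\n' "first" "second" "third" > "$Log"

Count=0
while read Line ; do
    Count=$(( Count + 1 ))     # still set after the loop ends
done < "$Log"

echo "read $Count lines"       # prints "read 3 lines"
rm -f "$Log"
```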
Indentation
Code blocks should be indented. Whether the indentation is two spaces, three spaces, four spaces, or a tab (my personal opinion is that a tab is too much), it should be decided on as a group and then adhered to.
The Usage Function
It is good practice to write scripts that are largely self-explanatory. People should be able to look at the script and figure out what it accomplishes. Hence the usage function, which is the script's response to `script -h` or `script -q`.
function usage
{
echo "$SCRIPT [-hq] [-y value] arguments
this is a script that serves as an example of a usage function
Arguments:
-h : show this message on screen
-q : show this message on screen
-y : an argument that can accept a value
"
exit 0
}
A usage function is of little value unless you are also parsing command line arguments, which is the job of the getopts builtin.
Parsing Command Line Arguments
The getopts function is used to parse command line arguments:
#!/bin/ksh
# agetoptsexample – show how getopts can be used.
#
function usage
{
echo "$SCRIPT [-hq] [-a] [-b b_parameter ] arguments
this script shows how getopts can be used.
Arguments:
-h : show this message on screen
-q : show this message on screen
-a : an argument with no value
-b : an argument that expects a value
"
exit 0
}
SCRIPT=${0##*/}
Aflag=
Bval=
while getopts "hqab:" Arg ; do
    case $Arg in
        h) usage ;;
        q) usage ;;
        a) Aflag=1 ;;
        b) Bval=$OPTARG ;;
        *) usage ;;
    esac
done
shift $(( OPTIND - 1 ))
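To see what the final shift leaves behind, the sketch below simulates a command line with 'set --' and parses it with the same loop, trimmed to the -a and -b options. The sample arguments (blue, file1, file2) are invented for the example.

```shell
# Simulate: script -a -b blue file1 file2
OPTIND=1
set -- -a -b blue file1 file2

Aflag=
Bval=
while getopts "ab:" Arg ; do
    case $Arg in
        a) Aflag=1 ;;
        b) Bval=$OPTARG ;;
        *) echo "unknown option" >&2 ;;
    esac
done
# Discard the options that getopts consumed; operands remain in $@.
shift $(( OPTIND - 1 ))

echo "Aflag=$Aflag Bval=$Bval remaining: $*"
```

This prints "Aflag=1 Bval=blue remaining: file1 file2": after the shift, $1 is the first non-option argument.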
Startup and Shutdown Scripts
These are boilerplate, really. The outline of the script is:
#!/sbin/sh
# mystartupscript - it starts my really important processes
#
case "$1" in
'start')
    #
    # comments about the startup process, if necessary, go here.
    #
    # startup code goes here
    ;;
'stop')
    #
    # comments about the stop process, if necessary, go here.
    #
    # shutdown code goes here
    ;;
*)
    echo "Usage: $0 { start | stop }"
    ;;
esac
The placement of the script is dependent on the Unix distribution.
Solaris:
/etc/init.d
HP-UX:
/sbin/init.d
Linux:
/etc/rc.d/init.d
To execute, a link needs to be made from the runtime directory to the script in the init.d directory of the respective operating system. Which kind of link is preferred again depends on the operating system: Solaris uses hard links, while HP-UX uses symbolic links. The naming convention of the links in the runtime directories is also OS dependent. In Solaris, the name of a startup script begins with the capital letter S, followed by two digits and then an alphabetic name. In HP-UX, it begins with the capital letter S, followed by three digits and an alphabetic name.
NOTE:
In Solaris, a script named S100mystartupscript will not be the 100th script to execute. It will be treated as script number 10, with an alphabetic name that begins with the character 0.
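The ordering can be checked without touching a real system. The sketch below sorts a few link names the way a shell expands a pattern such as S* in the C locale, which is assumed here to be how the startup scripts are picked up. All names other than S100mystartupscript are hypothetical.

```shell
# Sort sample link names as plain character strings (C locale).
Order=$(printf '%s\n' S09first S10network S100mystartupscript S11last |
        LC_ALL=C sort)
echo "$Order"
```

The result lists S09first, S100mystartupscript, S10network, S11last in that order: the three-digit name lands among the scripts numbered 10, not after S11last.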
Tests in Startup Scripts
The rules here are easily enough stated: test that a directory exists, and that a program is executable, before using either.
#!/sbin/sh
# StartupExample2 - show tests for code.
#
StartDir=/extra/apps
case "$1" in
'start')
    if [ -d $StartDir ] ; then
        if [ -x $StartDir/startprocesses ] ; then
            $StartDir/startprocesses
        fi
    fi
    ;;
'stop')
    #
    # Nothing special required to stop these processes.
    #
    ;;
*)
    echo "Usage: $0 { start | stop }"
    ;;
esac
Bibliography
Unix and systems programming
Unix Systems Administration
Shells Guides, Shell programming
Perl
Online coding resources
Communications, Newsgroups, Mailing Lists