Exploring Expect


A Tcl-based Toolkit for Automating Interactive Programs

By Don Libes
1st Edition December 1994
1-56592-090-2, Order Number: 0902
602 pages, $32.95

Chapter 3
Getting Started With Expect

Three commands are central to the power of Expect: send, expect, and spawn. The send command sends strings to a process, the expect command waits for strings from a process, and the spawn command starts a process.

In this chapter, I will describe these commands and another one that is very useful: interact. To best understand this chapter, it will help to have some basic familiarity with Tcl. If you are wondering about a command that is not explained, look it up in the index for a reference in the previous chapter and read about it there.

The send Command

The send command takes a string as an argument and sends it to a process. For example:

send "hello world"

This sends the string "hello world" (without the quotes). If Expect is already interacting with a program, the string will be sent to that program. But initially, send will send to the standard output. Here is what happens when I type this to the Expect interpreter interactively:

expect1.1> send "hello world"

hello worldexpect1.2>

The send command does not format the string in any way, so after it is printed the next Expect prompt gets appended to it without any space. To make the prompt appear on a different line, put a newline character at the end of the string. A newline is represented by "\n".

expect1.1> send "hello world\n"

hello world

expect1.2>

If these commands are stored in a file, speak, the script can be executed from the UNIX command line:

% expect speak

hello world

%

With a little magic it is possible to invoke the file as just "speak" rather than "expect speak". On most systems it suffices to insert the line "#!/usr/local/bin/expect --" at the top of the file and say "chmod +x speak; rehash". I will explain this in more detail later in the book. For now, just take it on faith.

The expect Command

expect is the opposite of send. The expect command waits for a response, usually from a process. expect can wait for a specific string, but more often expect is used to wait for any string that matches a given pattern. Analogous to send, the expect command initially waits for characters from the keyboard [1]. Using the two commands, I can create a little conversation:

expect "hi\n"

send "hello there!\n"

When run, the interaction looks like this:

hi

hello there!

I typed the string hi and then pressed return. My input matched the pattern "hi\n". Ideally, a return would be matched with "\r"; however, the UNIX terminal driver translates a return to "\n" [2]. As you will see later on, it is rarely necessary to worry about this mapping because most of Expect's interactions occur with programs, not users, and programs do not "press return". Nonetheless, it is occasionally useful to expect input from people. Plus, it is much easier to experiment with Expect this way.

If expect reads characters that do not match the expected string, it continues waiting for more characters. If I had typed hello followed by a return, expect would continue to wait for "hi\n".

When the matching string is finally typed, expect returns. But before returning, expect stores the matched characters in a variable called expect_out(0,string) . All of the matched characters plus the characters that came earlier but did not match are stored in a variable called expect_out(buffer) . expect does this every time it matches characters. The names of these variables may seem odd, but they will make more sense later on.

Imagine the following script:

expect "hi\n"

send "you typed <$expect_out(buffer)>"

send "but I only expected <$expect_out(0,string)>"

The angle brackets do not do anything special. They will just appear in the output, making it clear where the literal text stops and the variable values start. When run the script looks like this:

Nice weather, eh?

hi

you typed <Nice weather, eh?

hi>

but I only expected <hi>

I typed "Nice weather, eh?" <return> "hi" <return>. expect reported that it found the hi but it also found something unexpected: "Nice weather, eh?\n".

Anchoring

Finding unexpected data in the input does not bother expect . It keeps looking until it finds something that matches. It is possible to prevent expect from matching when unexpected data arrives before a pattern. The caret ( ^ ) is a special character that only matches the beginning of the input. If the first character of the pattern is a caret, the remainder of the pattern must match starting at the beginning of the incoming data. It cannot skip over characters to find a valid match. For example, the pattern ^hi matches hiccup but not sushi .

The dollar sign ( $ ) is another special character. It matches the end of the data. The pattern hi$ matches sushi but not hiccup . And the pattern ^hi$ matches neither sushi nor hiccup . It matches hi and nothing else.

Patterns that use the ^ or $ are said to be anchored . Some programs, such as sed , define anchoring in terms of the beginning of a line. This makes sense for sed , but not for expect . expect anchors at the beginning of whatever input it has received without regard to line boundaries.

When patterns are not anchored, patterns match beginning at the earliest possible position in the string. For example, if the pattern is hi and the input is philosophic , the hi in philo is matched rather than the hi in sophic . In the next section, this subtlety will become more important.
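
The anchoring rules can be tried out in plain Tcl. The following is a minimal sketch, not Expect itself: it assumes that Tcl's regexp command is an acceptable stand-in for Expect's own matcher for these simple patterns, and the helper name matches is made up for the illustration.

```tcl
# Hypothetical helper: returns 1 if pattern matches somewhere in input, else 0.
# regexp is used as a stand-in for Expect's matcher (an assumption).
proc matches {pattern input} {
    return [regexp -- $pattern $input]
}

puts [matches {^hi} hiccup]    ;# anchored at the start: matches
puts [matches {^hi} sushi]     ;# hi is not at the start: no match
puts [matches {hi$} sushi]     ;# anchored at the end: matches
puts [matches {^hi$} hiccup]   ;# only the exact string hi would match

# An unanchored pattern matches at the earliest possible position.
regexp -indices -- hi philosophic where
puts $where                    ;# index range of the hi in "philo"
```

The patterns are written in braces so that Tcl does not try to treat the dollar sign as the start of a variable reference.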

What Happens When Input Does Not Match

Once expect has matched data to a pattern, it moves the data to the expect_out variable as I showed earlier. The matched data is no longer eligible to be matched. Additional matches can only take place with new data.

Consider the following fragment:

expect "hi"

send "$expect_out(0,string) $expect_out(buffer)"

If I execute these two commands, Expect waits for me to enter hi . If I enter philosophic followed by a return, Expect finds the hi and prints:

hi phi

If I execute the two commands again, Expect prints:

hi losophi

Even though there were two occurrences of hi , the first time expect matched the first one, moving it into expect_out . The next expect started from where the previous one had left off.

With simple patterns like these, expect always stops waiting and returns immediately after matching the pattern. If expect receives more input than it needs, that input is remembered for the possibility of matching in later expect commands. In other words, expect buffers its input. This allows expect to receive input before it is actually ready to use it. The input will be held in an input buffer until an expect pattern matches it. This buffer is internal to expect and is not accessible to the script in any way except by matching patterns against it.

After the second expect above, the buffer must hold c\n . This is all that was left after the second hi in philosophic . The \n is there, of course, because after entering the word, I pressed return.
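
The way each match consumes the front of the buffer can be simulated in plain Tcl. This is only a sketch under stated assumptions: simulate_expect is a hypothetical helper, it handles literal strings only, and it says nothing about how Expect is implemented internally.

```tcl
# Hypothetical helper: mimic one "expect <literal>" against a buffer variable.
# Returns what expect_out(buffer) would hold (everything up to and including
# the match) and leaves the unmatched remainder in the caller's variable.
# Returns an empty string when nothing matches (real Expect would time out).
proc simulate_expect {bufVar literal} {
    upvar 1 $bufVar buf
    set start [string first $literal $buf]
    if {$start < 0} {
        return ""
    }
    set end [expr {$start + [string length $literal] - 1}]
    set consumed [string range $buf 0 $end]
    set buf [string range $buf [expr {$end + 1}] end]
    return $consumed
}

set input "philosophic\n"
puts [simulate_expect input "hi"]   ;# phi
puts [simulate_expect input "hi"]   ;# losophi
puts [string length $input]         ;# 2 characters remain: c and the newline
```

A third call returns an empty string, corresponding to the case where the real expect command finds nothing to match.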

What happens if the commands are run again? In this case, expect is not going to find anything to match hi . The expect command eventually times out and returns. By default, after 10 seconds expect gives up waiting for input that matches the pattern. This ability to give up waiting is very useful. Typically, there is some reasonable amount of time to wait for input after which there is no further point to waiting. The choice of 10 seconds is good for many tasks. But there is no hard rule. Programs almost never guarantee that "if there is no response after 17 seconds, then the program or computer has crashed".

The timeout is changed by setting the variable timeout using the Tcl set command. For example, the following command sets the timeout to 60 seconds.

set timeout 60

The value of timeout must be an integral number of seconds. Normally timeouts are nonnegative, but the special case of -1 signifies that expect should wait forever. A timeout of 0 indicates that expect should not wait at all.

If expect times out, the values of expect_out are not changed. Therefore, the commands above would have printed:

hi losophi

even though only c\n remained in the buffer.

Pattern-Action Pairs

You can directly associate a command with a pattern. Such commands are referred to as actions . The association is made by listing the action immediately after the pattern in the expect command itself. For example:

expect "hi" {send "You said $expect_out(buffer)"}

The command " send "You said $expect_out(buffer)" " will be executed if and only if hi is matched from the input.

Additional pattern-action pairs can be listed after the first one:

expect "hi" { send "You said hi\n" } \
       "hello" { send "Hello yourself\n" } \
       "bye" { send "That was unexpected\n" }

This command looks for "hi", "hello", and "bye" simultaneously. If any of the three patterns is found, the action listed immediately after the first matching pattern is executed. It is possible that none of them match within the time period defined by the timeout. In this case, expect stops waiting and execution continues with the next command in the script. Actions can also be associated with timeouts, which I will describe later in the book.

In the expect command, it does not matter how the patterns and actions visually line up. They can all appear on a single line if you can fit them, but lining up the patterns and actions usually makes it easier for a human to read them.

Notice how all the actions are embedded in braces. That is because expect would otherwise misinterpret the command. What is the problem with the following command?

expect "hi" send "You said hi\n" ;# wrong!

In this case, hi is taken as a pattern, send is the associated action and " You said hi\n " is taken as the next pattern. This is obviously not what was intended! If the action is more than a single argument, you must enclose it in braces.
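
The effect of the braces can be seen without Expect by counting the words that Tcl hands to a command. In this small sketch, show_args is a hypothetical helper that simply reports how many arguments it received:

```tcl
# Hypothetical helper: report how many arguments Tcl passed in.
proc show_args {args} {
    return [llength $args]
}

# Without braces, the action is split into separate words.
puts [show_args "hi" send "You said hi\n"]      ;# 3

# With braces, the whole action arrives as one argument.
puts [show_args "hi" {send "You said hi\n"}]    ;# 2
```

In the first call the command receives a pattern, the single word send, and a third word that expect would take as another pattern.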

Because Tcl commands normally terminate at the end of a line, a backslash is used to continue the command. Since all but the last line must end with a backslash, it can be a bit painful to cut and paste lines. You always have to make sure that the backslashes are there. The expect command supports an alternate syntax that lets you put all the arguments in one big braced list. For example:

expect {
    "hi" { send "You said hi\n"}
    "hello" { send "Hello yourself\n"}
    "bye" { send "That was unexpected\n"}
}

The initial open brace causes Tcl to continue scanning additional lines to complete the command. Once the matching brace is found, all of the patterns and actions between the outer braces are passed to expect as arguments.
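
Tcl's info complete command reports whether a piece of text forms a complete command, which makes this scanning behavior visible. The following is a minimal sketch; the variable names are arbitrary, and the braces are backslash-escaped only so that the text can be built inside a quoted string.

```tcl
# The first line alone ends inside an open brace, so Tcl wants more input.
set first_line "expect \{"

# The same command with its pattern-action pair and closing brace.
set whole "expect \{\n    \"hi\" \{ send \"You said hi\\n\" \}\n\}"

puts [info complete $first_line]   ;# 0: still waiting for the matching brace
puts [info complete $whole]        ;# 1: braces balance, the command is complete
```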

Here is another way of writing the same expect commands:

expect "hi" {
    send "You said hi\n"
} "hello" {
    send "Hello yourself\n"
} "bye" {
    send "That was unexpected\n"
}

Each open brace forces more lines to be read until a close brace is encountered. But on the same line that the close brace appears, another open brace causes the search to continue once again for a mate. Even though all the arguments are not enclosed by yet another pair of braces, the whole command is nonetheless read as one. This style has the advantage that it is easier to have multi-line actions, and the actions can be moved around more easily because they are not on the same line as their patterns (presuming your editor can cut and paste by lines more easily than half-lines). If you want to further separate the patterns, you can rewrite it as:

expect {
    "hi" {
        send "You said hi\n"
    }
    "hello" {
        send "Hello yourself\n"
    }
    "bye" {
        send "That was unexpected\n"
    }
}

While this looks like it wastes a lot of space, you can now cut and paste the first pattern-action pair (hi) without disturbing the "expect {". You can decide for yourself which style is appropriate. Depending on the context, I may use any one of these. If commands are very short, I may even pack them all on a line. For example, the following command has two patterns, "exit" and "quit". Their actions are listed immediately to the right of each pattern.

expect "exit" {exit 1} "quit" abort

Example -- Timed Reads In The Shell

I have shown how to wait for input for a given amount of time and how to send data back. I will wrap this up in a script called timed-read .

#!/usr/local/bin/expect --
set timeout $argv
expect "\n" {
    send [string trimright "$expect_out(buffer)" "\n"]
}

The timeout is read from the variable argv, which is predefined to contain the arguments from the command line. I will describe argv in more detail later in the book. The next command waits for a line to be entered. When it is, the string trimright command returns the string without the newline on the end of it, and that is returned as the result of the script.
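
Because argv holds a list of arguments, using it directly as the timeout only works when exactly one argument is supplied. A more defensive variant might pick out the first argument and check it before use. The sketch below is one way to do that; the helper name timeout_from_args and the fallback of 10 seconds (chosen to mirror the default timeout) are assumptions made for the illustration.

```tcl
# Hypothetical helper: pick a timeout out of an argument list.
# Falls back to a default when no argument is given and rejects
# anything that is not an integral number of seconds.
proc timeout_from_args {arglist {default 10}} {
    if {[llength $arglist] == 0} {
        return $default
    }
    set value [lindex $arglist 0]
    if {![string is integer -strict $value]} {
        error "timeout must be an integral number of seconds: $value"
    }
    return $value
}

puts [timeout_from_args {60}]   ;# 60
puts [timeout_from_args {}]     ;# 10
```

In the script itself, the second line would then read "set timeout [timeout_from_args $argv]".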

You can now call this script from a shell as follows:

% timed-read 60

This command waits 60 seconds for the user to type a line and then it returns whatever the user typed. This ability is very useful. For example, suppose your system reboots automatically upon a crash. You could set up your system so that it gives someone the opportunity to log in to straighten out any problems before coming up all the way. Of course, if the machine crashes when no one is around, you do not want the computer to wait until someone comes in just to tell it to go ahead. To do so, just embed this in your shell script:

echo "Rebooting..."

echo "Want to poke around before coming up all the way?"

answer=`timed-read 60`

Now you could test to see if the answer is yes or no . If no one is around, the script will just time out after 60 seconds and the answer will be empty. The shell script could then continue with the rebooting process.

Surprisingly, there is no simple way for a shell script to wait for a period of time for an answer. The standard solution is to fork off another shell script that sends a signal back to the original shell script that catches the signal and tries to recover. This sounds easy but is fairly difficult to code. And if you are already in a forked process or have forked other processes, it is very tricky to keep everything straight.

By comparison, the Expect solution is straightforward. In the next chapter, I will show how to make the expect command strip off the newline automatically. This will make the script even simpler [3].

The spawn Command

While interacting with a person is useful, most of the time Expect is used to interact with programs. You have already seen enough to get a feeling for send and expect . There is more to learn about them, but now I want to explore the spawn command.

The spawn command starts another program. A running program is known as a process. Expect is flexible and will view humans as processes too. This allows you to use the same commands for both humans and processes. The only difference is that processes have to be spawned first [4].

The first argument of the spawn command is the name of a program to start. The remaining arguments are passed to the program. For example:

spawn ftp ftp.uu.net

This command spawns an ftp process. ftp sees ftp.uu.net as its argument. This directs ftp to open a connection to that host just as if the command " ftp ftp.uu.net " had been typed to the shell. You can now send commands using send and read prompts and responses using expect .

It is always a good idea to wait for prompts before sending any information. If you do not wait, the program might not be ready to listen and could conceivably miss your commands. I will show examples of this in a later chapter. For now, play it safe and wait for the prompt.

ftp begins by asking for a name and password. ftp.uu.net is a great place for retrieving things -- they let anyone use their anonymous ftp service . They ask for identification (you must enter your e-mail address at the password prompt) but it is primarily for gathering statistics and debugging.

When I run ftp by hand from the shell, this is what I see:

% ftp ftp.uu.net

Connected to ftp.uu.net.

220 ftp.UU.NET FTP server (Version 6.34 Thu Oct 22 14:32:01 EDT 1992) ready.

Name (ftp.uu.net:don): anonymous

331 Guest login ok, send e-mail address as password.

Password:

230- Welcome to the UUNET archive.

230- For information about UUNET, call +1 703 204 8000...

230- Access is allowed all day...

< a lot of stuff here omitted >

230 Guest login ok, access restrictions apply.

To automate this interaction, a script has to wait for the prompts and send the responses. The first prompt is for a name, to which the script replies "anonymous\r". The second prompt is for a password (or e-mail address), to which the script replies "don@libes.com\r". Finally, the script looks for a prompt to enter ftp commands. This looks like "ftp> ".

expect "Name"

send "anonymous\r"

expect "Password:"

send "don@libes.com\r"

expect "ftp> "

Notice that each line sent by the script is terminated with \r . This denotes a return character and is exactly what you would press if you entered these lines at the shell, so that is exactly what Expect has to send.

It is a common mistake to terminate strings sent to a process with \n. In this context, \n denotes a linefeed character. You do not interactively end lines with a linefeed, so Expect must not either. Use "\r".

Contrast this to what I was doing earlier -- sending to a user, or rather, standard output. Such strings were indeed terminated with a \n . In that context, the \n denotes a newline. Because standard output goes to a terminal, the terminal driver translates this to a carriage-return linefeed sequence.

Similarly, when reading lines from a program that would normally appear on a terminal, you will see the carriage-return linefeed sequence. This is represented as \r\n in an expect pattern.
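
The characters behind these escapes can be inspected in plain Tcl. The following is a small sketch; the sample line reuses the text of the ftp reply shown earlier, with the line ending added by hand as an assumption about how it would arrive.

```tcl
# \r is the return character and \n is the linefeed (newline) character.
scan "\r" %c cr
scan "\n" %c lf
puts "\\r is character $cr and \\n is character $lf"

# A line as it might arrive from a spawned program, ending in \r\n.
set line "230 Guest login ok, access restrictions apply.\r\n"

# Strip the trailing carriage-return linefeed before using the text.
set clean [string trimright $line "\r\n"]
puts "<$clean>"
```

Note that string trimright treats its second argument as a set of characters, so it removes any trailing run of returns and linefeeds.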

This may seem confusing at first, but it is inherent in the way UNIX does terminal I/O and in the representation of characters and newlines in strings. The representation used by Tcl and Expect is common to the C language and most of the UNIX utilities. I will have more to say on the subject of newlines and carriage returns later in the book.

Running this script produces almost the same output as when it was run by hand. The only difference is when the program is spawned. When you manually invoke ftp , you normally see something like:

% ftp ftp.uu.net

Instead expect shows:

spawn ftp ftp.uu.net

The difference is that there is no shell prompt and the string spawn appears. Later in the book, I will show how to customize this string or get rid of it entirely.

The remainder of the output is identical whether run interactively via the shell or automated via Expect.

Uunet is a very large repository of public-access on-line information. Among other things stored there are the standards and other documents describing the Internet. These are called RFCs (Request For Comments). For instance, RFC 959 describes the FTP protocol and RFC 854 describes the Telnet protocol. These RFCs are all in separate files but stored in one common directory. You can go to that directory using the following command:

send "cd inet/rfc\r"

Each RFC is assigned a number by the publisher. Uunet uses this number to name the file containing the RFC. This means that you have to know the mapping from the title to the number. Fortunately, Uunet has such an index stored as a separate document. You can download this with the following additional commands:

expect "ftp> "

send "binary\r"

expect "ftp> "

send "get rfc-index.Z\r"

expect "ftp> "

The first line waits to make sure that the ftp server has completed the previous command. The binary command forces ftp to disable any translation it might otherwise attempt on transferred files. This is a necessity because the index is not a text file but a compressed file. This format is implied by the .Z extension in the name.

The RFCs are named rfc###.Z, where ### is the RFC number. Along with the index, they are all stored in the directory inet/rfc. By passing the RFC number as an argument, it is possible to add two more commands to download any RFC.

send "get rfc$argv.Z\r"

expect "ftp> "

This extracts the number from the command line so that you could call it from the shell as:

% ftp-rfc 1178

Notice that after the get command is another expect for a prompt. Even though the script is not going to send another command, it is a good idea to wait for the prompt. This forces the script to wait for the file to be transferred. Without this wait, Expect would reach the end of the script and exit. ftp would in turn exit, and the file transfer would almost certainly not be completed by then.

ftp actually has the capability to tell if there were problems in transferring a file, and this capability should be used if you want a robust script. In the interest of simplicity I will ignore this now, but eventually I will start presenting scripts that are more robust.

However, there is one change for robustness that cannot be ignored. The default timeout is 10 seconds, and almost any ftp transfer takes at least 10 seconds. The simplest way to handle this is to disable the timeout so that the script waits as long as it takes to get the file. As before, this is done by inserting the following command before any of the expect commands:

set timeout -1

So far this script simply retrieves the RFC from Uunet. As I noted earlier, the file is compressed. Since you usually want to uncompress the RFC, it is convenient to add another line to the script that does this. The uncompress program is not interactive so it can be called using exec as:

exec uncompress rfc$argv.Z

You could certainly spawn it, but exec is better for running non-interactive programs -- you do not have to mess around with send and expect . If uncompress has any problems, Expect reports them on the standard error.

The final script looks like this:

#!/usr/local/bin/expect --
# retrieve an RFC (or the index) from uunet via anon ftp

if {[llength $argv] == 0} {
    puts "usage: ftp-rfc {-index|#}"
    exit 1
}

set timeout -1
spawn ftp ftp.uu.net
expect "Name"
send "anonymous\r"
expect "Password:"
send "don@libes.com\r"
expect "ftp> "
send "cd inet/rfc\r"
expect "ftp> "

send "binary\r"
expect "ftp> "
send "get rfc$argv.Z\r"
expect "ftp> "

exec uncompress rfc$argv.Z

I have added a comment to the top describing what the script does, and I have also added a check for the arguments. Since the script requires at least one argument, a usage message is printed if no arguments are supplied.

More checks could be added. For example, if a user runs this script as " ftp-rfc 1178 1179 ", it will not find any such file -- the get will try to get a file named rfc1178 and save it locally as 1179.Z -- obviously not what the user intended. How might you modify the script to handle this case?
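
The string that ends up being sent can be examined in plain Tcl. This sketch only reproduces the substitution, without solving the problem; get_command is a hypothetical helper that mirrors the argument of the send command in the script, minus the trailing return.

```tcl
# Hypothetical helper: build the same text the script sends to ftp.
proc get_command {argv} {
    return "get rfc$argv.Z"
}

puts [get_command {1178}]        ;# get rfc1178.Z
puts [get_command {-index}]      ;# get rfc-index.Z
puts [get_command {1178 1179}]   ;# get rfc1178 1179.Z
```

With two arguments, the whole list is substituted into the string, which is why ftp sees a remote name of rfc1178 and a local name of 1179.Z.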

The interact Command

All of the uses of Expect so far have been to totally automate a task. However, sometimes this is too rigid. For a variety of reasons you may not want to completely automate a task. A common alternative is to automate some of it and then do the rest manually.

In the previous example, anonymous ftp was used to retrieve files automatically from the site ftp.uu.net . At the beginning of that script was some interaction to identify myself to the ftp server. This consisted of entering the string anonymous\r followed by my email address. Here was the Expect fragment to do it:

expect "Name"

send "anonymous\r"

expect "Password:"

send "don@libes.com\r"

Now consider doing this manually. If you like to browse through the many computers that support anonymous ftp , repeating this little identification scenario can be a nuisance. And it seems rather silly since your computer is perfectly capable of supplying this information. This so-called password is not really a secret password -- it is just an email address. Let Expect do this part while you do the browsing.

Expect provides a command that turns control from the script over to you. It is named interact and called as:

interact

When this command is executed, Expect stops reading commands from the script and instead begins reading from the keyboard and the process. When you press keys, they are sent immediately to the spawned process. At the same time, when the process sends output, it is immediately sent to the standard output so that you can read it.

The result is that you are effectively connected directly to the process as if Expect was not even there. Conveniently, when the spawned process terminates, the interact command returns control to the script. And if you make interact the last line of the script, then the script itself terminates as well.

Example -- Anonymous ftp

The interact command is ideal for building a script I call aftp . This script consists of the user/password interaction from the previous example and an interact command. The complete aftp script is shown below.

Anytime you want to begin anonymous ftp , you can use this little script and it will automatically supply the appropriate identification and then turn control over to you. When you type quit to ftp , ftp will exit, so interact will exit, and then the script will exit.

#!/usr/local/bin/expect --

spawn ftp $argv

expect "Name"

send "anonymous\r"

expect "Password:"

send "don@libes.com\r"

interact

Notice that the script does not wait for "ftp> " before the interact command. You could add another expect command to do that, but it would be redundant. Since interact waits for characters from the process as well as the keyboard simultaneously, when the "ftp> " finally does arrive, interact will then display it. Presumably, a user will wait for the prompt before typing anyway, so there is no functional benefit to using an explicit expect.

With only a little more, this script can be jazzed up in lots of ways. For example, rather than embedding your name in the script, you can pull it out of the environment by using the expression $env(USER). The full command in the script would be:

send "$env(USER)@libes.com\r"

It is a little more difficult to make this script portable to any machine because there is no standard command to retrieve the domain name (presuming you are using domain-name style email addresses, of course). While many systems have a command literally called domainname , it often refers to the NIS domain name, not the Internet domain name. And the hostname command does not dependably return the domain name either.

One solution is to look for the domain name in the file " /etc/resolv.conf ". This file is used by the name server software that runs on most UNIX hosts on the Internet. Here is a procedure to look up the domain name:

proc domainname {} {
    set file [open /etc/resolv.conf r]
    while {[gets $file buf] != -1} {
        if {[scan $buf "domain %s" name] == 1} {
            close $file
            return $name
        }
    }
    close $file
    error "no domain declaration in /etc/resolv.conf"
}

The domainname procedure reads /etc/resolv.conf until it encounters a line that begins with the string domain . The rest of the line is returned. If no string is found, or the file cannot be read, an error is generated.
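
The scan test at the heart of the procedure can be tried without touching /etc/resolv.conf by applying it to text held in a variable. The following is a minimal sketch; domain_from_text is a hypothetical helper, and the sample contents (a documentation address and example.com) are made up.

```tcl
# Hypothetical helper: apply the same scan test to in-memory text.
proc domain_from_text {text} {
    foreach line [split $text "\n"] {
        # scan returns 1 only when the line starts with "domain"
        # and a word follows it; that word is stored in name.
        if {[scan $line "domain %s" name] == 1} {
            return $name
        }
    }
    error "no domain declaration found"
}

# Made-up sample contents in the style of a resolv.conf file.
set sample "nameserver 192.0.2.1\ndomain example.com\n"
puts [domain_from_text $sample]   ;# example.com
```

The %s conversion stops at whitespace, so what comes back is the first word after domain.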

The full command in the script can now be written as:

send "$env(USER)@[domainname]\r"

Exercises

  1. The ftp-rfc script does not understand what to do if the user enters multiple RFC numbers on the command line. Modify the script so that it handles this problem.
  2. Modify ftp-rfc so that if given an argument such as " telnet ", the script first retrieves the index, then looks up which RFCs mention the argument in the title, and downloads them. Cache the index and RFCs in a public directory so that they do not have to be repeatedly downloaded.
  3. Most ftp sites use a root directory where only pub is of interest. The result is that " cd pub " is always the first command everyone executes. Make the aftp script automatically cd to pub and print the directories it finds there before turning over control to interact .
  4. Make the aftp script cd to pub only if pub exists.
  5. Write a script to dial a pager. Use it in the error handling part of a shell script that performs a critical function such as backup or fsck .
  6. The domainname procedure shown earlier is not foolproof. For example, the file resolv.conf might not exist. Assume the procedure fails on your system and ask nslookup for the current domain name.
  7. Write a script that connects to a modem and dials phone numbers from a list until one answers.

1. It actually reads from standard input which is typically the keyboard. For now, I will treat them as if they were the same thing.

2. You can disable this behavior by saying "stty -icrnl" to the shell, but most programs expect this mapping to take place so learn to live with it.

3. Shell backquotes automatically strip trailing newlines, so the script could be simplified in this scenario just by omitting the "string trimright" command. However, in other contexts it is useful to strip the newlines.

4. Admittedly, humans have to be spawned as well; however, this type of spawning is probably best left to the confines of the bedroom.


© 2001, O'Reilly & Associates, Inc.