This is a tutorial for vda. I got it off of the news group comp.sys.att, so I assume there are no legal problems with putting it here.
Getting Started With VDA
System Software Release 2.0
Reuel Robertson

1.0 INTRODUCTION

This paper describes the Voice Power Voice Data Access Utility used for building call-in voice applications. The utility is an executable program written in the C language, and will henceforth be referred to by its command name, vda(1). Executable commands appear in italics, followed by a chapter number in parentheses, throughout this paper. The chapter number indicates the section of this manual, or the section of the AT&T 386 UNIX System V Reference Manual, that contains a manual page for the command. In this paper, boldface type indicates a literal string, and user arguments are shown in italics.

Vda(1) provides voice output to callers, and decodes the caller's touch tone inputs. The data that vda(1) provides access to is recorded 16k sub-band coded speech, 24k sub-band coded speech, or 64k PCM coded speech, arranged in a tree structure described by a set of ASCII files, called vda scripts. Access to numeric data is also provided with shell scripts that are run by vda(1).

The vda scripts that determine the structure of the tree follow a strict format. They are interpreted by vda(1) while it is running, and they are created and edited with a text editor. Voice files are created and edited with the Voice Power voice editor, ve(1). Thus, only three programs are needed to build, administer, and run a voice call-in application: vda(1), ve(1), and a text editor.

Since vda(1) is a runtime interpreter of vda scripts, vda scripts may be viewed as written in a programming language. Although this view is accurate, viewing vda scripts as descriptions of voice menus is more useful. Vda scripts describe the way a voice menu sounds, and the way it behaves. Developing a vda(1) application is largely a process of describing the voice menus needed to provide some service to a caller.
If an application can be provided through a set of voice menus, then vda(1) is a candidate tool with which to build and run the application. A surprisingly large number of applications can be provided through a set of voice menus. Information retrieval systems, such as a stock price quotation system or a news database, may be based on vda(1). Call re-direction services, or automatic attendants, may be as well. Order entry systems, telemarketing systems, message centers, and automated surveys are also applications that may be built with vda(1).

2.0 SYSTEM COMPONENTS

2.1 Voice Processing Hardware

The vda(1) application services one line per vda(1) process. Vda(1) imposes no limit on the number of channels that can be serviced. That will depend on the number of voice cards installed, the nature of the vda(1) application, and unrelated concurrent processor activities. The minimal hardware configuration consists of:

o  One 6386 WGS with at least 4 MB RAM
o  One 386 Voice Power Model VP1 or VP4 Voice Card

2.2 Voice Processing Software

The software configuration must include:

o  AT&T 386 Voice Power System Software

2.3 External Telephone Connections

For a call-in application, analog phone lines must be connected to each card. If callers are to be served through a single phone number, an external PBX must provide this service, perhaps with a hunt group. On the VP1, vda(1) can also run on a local telephone set, by starting when the set goes off-hook. This feature is provided primarily to aid in debugging and demonstrating vda scripts.

The voice editor requires an audio connection to the voice card for recording and checking announcements. You can make the audio connection to a VP1 card using a telephone set connected to the set port. You can make the connection on a VP4 (and VP1) using a powered microphone and speaker, or you may phone the voice card.
In a multicard configuration only one audio connection is required, since ve(1) is only used for announcement administration.

3.0 INSTALLATION

The voice cards are provided with installation instructions. Follow those instructions for installing the cards using the procedures in the 386 Voice Power Installation Guide. Telephone lines should be attached to each card.

4.0 VDA SCRIPT EXAMPLE

Before delving into details on file formats and vda(1) operation, it is useful to see a brief example of a vda script. The vda script shown below plays messages to callers for two touch tone selections, and branches to another voice menu for a third selection. A selection for exiting is also supported. A naming convention for voice files is used with Voice Power. A name suffix of :e:v is used for voice files, and the example below adheres to that convention.

; ****************************************************************************
; Vda script file example. This implements a single key voice menu.
; ****************************************************************************
G play greeting:e:v          ; Greeting line: executed once when
                             ; this voice menu is reached.
P play prompt:e:v            ; Prompt line. This line executes
                             ; after the greeting, and then waits
                             ; for a touch tone input. Loops below
                             ; cause this line to repeat.
1 play message1:e:v loop     ; Play message1 when the caller enters
                             ; touch tone 1, and then repeat prompt.
2 play message2:e:v loop     ; Play message2 when the caller enters
                             ; touch tone 2, and then repeat prompt.
3 branch menu2 loop          ; Branch to another voice menu (menu2)
                             ; when the caller enters touch tone 3.
                             ; Repeat prompt if menu2 returns.
0 play good-bye:e:v exit     ; Play good-bye when the caller enters
                             ; touch tone 0, and then hang up.

5.0 FILE FORMATS

Vda(1) processes a tree-structured voice data base. Callers reaching the analog line port on the voice card are answered by vda(1).
They can then traverse the data base tree by entering touch tone digits in response to prompts spoken by vda(1).

The data base tree has two types of nodes: menus and messages. The leaves on the tree are messages, and the interior nodes on the tree are menus. At a menu node, vda(1) provides the caller with a spoken menu of touch tone choices, and then a branch in the tree is traversed based on the caller's touch tone input. At a message node a recorded voice file is played.

Occasionally it is preferable to have a program decide what branch in the tree to take, rather than letting the caller decide. A special type of menu, called a shell menu, provides this functionality. In shell menus branching is based on a shell script's exit value rather than a touch tone input.

The messages in the tree are stored as binary voice files, and the menus are stored as ASCII files. Three types of menus are supported: single-key voice menus, multi-key voice menus, and shell menus. Each type of file is covered in this section. Examples are discussed in the "OPERATION" section.

5.1 Menus

Each line of a menu has the following format:

INPUT FUNCTION ARGUMENT FLOWCONTROL ; Optional Comment

The FLOWCONTROL field is not needed on some lines. Fields are white-space separated, and white-space is ignored. In addition, comments may be added to the end of any line, or put on a separate line. The ';' character designates the beginning of a comment. Vda(1) ignores the characters after a ';', up to the next line. Vda(1) also ignores blank lines and leading white-space. A character may be quoted (i.e., made to stand for itself) by preceding it with a '\'.

The valid values for each field are shown in the following sections. Where there are differences in the interpretation of these fields based on the type of menu (single-key, multi-key, or shell), these are discussed in the definition.

5.1.1 INPUT Field Values

G  Greeting.
This line is executed once, when the menu is entered from its parent. No FLOWCONTROL is needed, but nextline (see FLOWCONTROL table) may be used.

P  Prompt. It is executed immediately after the greeting, and then once each time a loop is encountered. This line is usually used to play a prompt telling the caller to select from a touch tone menu. After this line executes, vda(1) waits for the caller to enter a touch tone. No FLOWCONTROL is needed, but nextline may be used.

R  Run. It is executed once after the greeting and must have a FUNCTION value of shell (see FUNCTION table below). The presence of a run line makes the node a shell menu, in which branching is based on the exit value of a shell script rather than on the caller's touch tone input. No FLOWCONTROL is needed or supported. No prompt line is permitted in a script with a run line.

O  Options. This line sets some command line options (see "Running Vda" in the "OPERATION" section) to new values. The new values are in effect only at this menu of the tree. Syntactically, the options line does not follow the menu line format above. Instead, white-space-separated command line options follow the O (example: O -t 8). The only options supported here are: -r, -L, and -t.

+  Confirm. This line is executed when a valid touch tone selection is entered by the caller. Similarly, in a shell menu this line is executed when a shell script's exit value is valid. It is executed before the line corresponding to the selection is executed. The FLOWCONTROL value is not considered for confirmation lines. This line may be used only in scripts including explicit touch tone specifications, that is, a TT value of 0-9 *# abcd, or shell exit code values.

-  Default. This line is executed when a touch tone selection with no corresponding INPUT value is entered. It is often used to tell the caller that she or he has made an invalid touch tone selection.
Similarly, in a shell menu this line is executed when a shell script's exit value has no corresponding INPUT value.

T  Time out. This line is executed if no touch tone input is received within timeout (default 15) seconds of when the prompt line completes execution. Timeout may be modified on the vda(1) command line (see "Running Vda" in the "OPERATION" section) or with an options line.

.  Null. The '.' corresponds to no event or input. It is a place holder for a line that is part of a nextline sequence (see FLOWCONTROL table below).

0-9 * #  Single touch tone. In a single-key voice menu this line is executed if the corresponding touch tone key is entered by the caller. See touch tone string below for multi-key menus. These values determine valid touch tone selections.

Touch tone string  In a multi-key menu this line is executed if the corresponding touch tone string is entered by the caller. The string is terminated with a # by the caller, and the # is NOT a part of the string. These values determine valid touch tone selections.

Numeric string  Shell exit value. In a shell menu this line is executed if the corresponding value (0-255) is returned as an exit value from a shell script.

Note: All the lines are optional, but greeting, prompt, and run lines are optional only in a limited sense; at least one of these lines must be included in each menu.

5.1.2 FUNCTION Field Values and their Arguments

play, Play  Play the 16k or 24k voice file in the ARGUMENT field. The ARGUMENT is a file name. Play with a capital 'P' ignores touch tone interruptions.

play64, Play64  Play the 64k voice file in the ARGUMENT field. The ARGUMENT is a file name. Play64 with a capital 'P' ignores touch tone interruptions.

playnum, Playnum  Play the number in the ARGUMENT field as a whole number. ARGUMENT must be a numeric string or one of: r1, r2, r3, r4, r5, r6. Here rN refers to the Nth number written to standard output by the last shell script run by vda(1).
If an illegal ARGUMENT is encountered, playnum plays the first contiguous string within the ARGUMENT that is legal, or plays '0' otherwise. Numbers should not exceed 999,999,999,999. Playnum with a capital 'P' ignores touch tones.

playword, Playword  Identical to playnum, with two exceptions: 1) Numbers are played one digit at a time, 2) The letters A through Z are accepted in addition to the numbers. If a custom vocabulary has been installed (see the section entitled "EDITING THE VDA VOCABULARY") these letters refer to words or phrases in that vocabulary. If an illegal ARGUMENT is encountered, playword plays the first contiguous string within the ARGUMENT that is legal. Letters and numbers can be mixed in the ARGUMENT to playword, but playback terminates when an empty vocabulary element is encountered. Playword with a capital 'P' ignores touch tones.

record  Record into the 16k voice file named in the ARGUMENT field until a touch tone is encountered, MAXrecord bytes are recorded (see "Running Vda" in the "OPERATION" section), or a silence of six seconds is encountered. MAXrecord has a default value of 120,000, or 60 seconds of speech.

branch  Branch to a single-key voice menu, or a shell menu, in the ARGUMENT field. The ARGUMENT is a file name. Note that the presence or lack of a run line in the menu determines if it is a voice menu or a shell menu.

mkbranch  Branch to a multi-key voice menu, or a shell menu, in the ARGUMENT field. The ARGUMENT is a file name. Note that the presence or lack of a run line in the menu determines if it is a voice menu or a shell menu.

dial  Dial the number in the ARGUMENT field. The ARGUMENT field may contain the touch tone characters, f (switch-hook flash), p (pause 2 seconds), u (the last touch tone entry by the caller), or r (equivalent to r1 for the playnum function). Note that in a multi-key menu u refers to the user's multi-key entry.

passwd  Play the 16k or 24k voice file ARGUMENT.
Then get the caller's '#' terminated input and see if it is in the vda(1) password file. If after three tries no input matches an entry, hang up. Otherwise, continue. This function on a greeting line provides password-controlled entry to the menu. The vda password file, which contains a newline-separated list of valid passwords, is specified on the command line (see "Running Vda" in the "OPERATION" section). If the -L command line option is included, the indicated length must be sufficient for the password including the terminating '#'.

shell  Run the shell script file ARGUMENT. The script is passed the last touch tone [string] entered by the user, as a $1 argument. Vda(1) will collect up to six newline separated numbers written to standard output by the shell script, for use in some other functions. Note that shell must be the FUNCTION on all run lines, but may be on any line.

register  Operate on the ten general-purpose registers, 0 through 9, provided with vda(1). Documentation of this stack oriented operation is provided in the section entitled "COLLECTING AND REPORTING STATISTICS." Data accessible to this function includes: u for last user input (converted to an integer), and r0 - r5 for the first six numeric values written to standard output by the last shell script vda(1) ran. Operators include: i for increment, d for decrement, +, and -.

log  Print a formatted string on standard output. Documentation of this sprintf(3) oriented operation is provided in the section entitled "COLLECTING AND REPORTING STATISTICS." Numeric data (%d) accessible to this function includes: n0 - n9 for the ten general purpose registers, x for the last shell script exit value, and c for the voice channel number. String data (%s) accessible includes: r0 - r5 for the first six numeric values written to standard output by the last shell script vda(1) ran, u for the last touch tone entry by the user, and t for the date and time.
monitor  Print a formatted string in the shared memory structure. Identical to the log function, except that output goes to shared memory instead of standard output. This function is used to support a monitor program that displays vda(1) activity in real time. See the section entitled "COLLECTING AND REPORTING STATISTICS."

Note that any file name in the ARGUMENT field must be a full path if it is not in the directory where vda(1) is run.

5.1.3 Special ARGUMENT Field Characters

[  This begins an ARGUMENT value that may include white-space. All characters after the [ and up to but excluding the next ] are part of the bracketed ARGUMENT.

]  If a [ was already encountered, the ] marks the end of a bracketed ARGUMENT that may include white-space.

\  Any character following a \ character is interpreted as part of the ARGUMENT, and is not interpreted as a field delimiting character. For example, the comment character ';' may be included in an ARGUMENT as \;.

5.1.4 FLOWCONTROL Field Values

loop  After execution of the line is completed, repeat the prompt line.

return  After execution of the line is completed, return to the parent menu.

exit  After execution of the line is completed, hang up on the caller and prepare to accept the next call. On the last line of a sequence, this applies to all the elements of the sequence.

nextline  After execution of the line is completed, execute the next line in the file. A set of lines connected by nextlines is referred to as a sequence. Sequences can have no more than 50 lines.

null  The null entry has no meaning, but was used historically on greeting and prompt lines.

5.2 Voice Files

16k, 24k and 64k voice files may be used in the voice data base. The detailed format of these files is beyond the scope of this manual. Sub-band coding, but not file formatting specifics, is discussed in: R.E. Crochiere, S.A. Webber, J.L. Flanagan, "Digital Coding of Speech in Sub-bands," Bell System Technical Journal, Vol.
55, October 1976, pp. 1069-1085.

Voice files may be created, played, or edited with ve(1). When ve(1) creates a voice file, it appends a :e:v to the file name, unless the name already ends with :e:v. When ve(1) edits an existing file, it does not alter the file's name. Refer to the AT&T 386 Voice Power Application Programmer's Reference Manual for information on using ve(1), including how to create and edit 16k, 24k and 64k voice files.

Other tools are available to process voice files. A partial list includes vplay(1) and vrecord(1), which are documented in chapter 1 of the Reference Manual. Voice files of the same type (16k, 24k or 64k) may be concatenated to form larger voice files as well. In addition, 16k and 24k files may be mixed together in the same file by cat(1).

The directory /usr/lib/voice/vda_vocab contains 16k recordings of numbers that are used by the playnum and playword functions. If required, these can be recorded in a different voice by using ve(1) (see the section entitled "EDITING THE VDA VOCABULARY").

6.0 OPERATION

To set up and run a vda application, three steps must be followed: The application tree structure must be defined and implemented, the voice files in the application must be recorded, and vda(1) must be invoked on each card. Each step is discussed below.

6.1 Creating The Tree

It is easiest to sketch the tree on paper before implementing it. An example of such a sketch is shown in Figure 1, at the back of this paper. For each menu, list the choices and their associated touch tones. Then label each branch from the menu with the touch tone that leads to that traversal. At each leaf write down a brief description of the message that will be spoken there.

The example in Figure 1 is a simple news data base. The root menu lets the caller choose an introduction to the service, the top news item, a sub-menu of news stories, or to be connected to a reporter.
If the caller chooses to be connected to a reporter, he or she is transferred to another extension, thus exiting the data base.

For each menu, an ASCII file must be created to describe it. An example of the ASCII files describing the menus of the tree in Figure 1 is shown in Figure 2, at the back of this paper. Examples of the FUNCTIONs play, branch, and dial are included. The dial FUNCTION in the figure transfers a call to extension 1234, and assumes that the card is attached to a PBX capable of a blind transfer. A blind transfer means transferring and then hanging up, without waiting to determine the progress of the transfer. The line with INPUT 4 illustrates the use of the FLOWCONTROL value nextline. When the caller hits 4 the pleasewait:e:v file is played, and then the call is immediately transferred by dial on the next line. After ASCII files like these have been created for each menu in the tree, the tree has been implemented.

6.2 Recording The Voice Files

Each menu may have a prompt line, and perhaps a greeting line. The voice files that are played by these lines must be recorded with the Voice Editor. For example, if the ARGUMENT is prompt4:e:v, then the Voice Editor is invoked with ve prompt4:e:v. Then the prompt is recorded. When a new file is created with ve(1), ve(1) will append a :e:v suffix to the file name if you don't.

Some of the lines may have play FUNCTIONs. Ve(1) must be used to create the associated voice file for each of these lines too. This process must be completed for each menu in the tree, so that all the needed voice files are created. ARGUMENT values may be full paths to voice files. If file names alone are used, the files must reside in the directory where vda(1) is run (see "Running Vda", below).
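
Because a missing voice file is only discovered when a caller reaches the line that plays it, it can help to check the menu files before starting vda(1). The Python sketch below is not part of Voice Power. The function names, the choice of FUNCTIONs treated as taking a voice file, and the simplified parsing (bracketed ARGUMENTs and '\' quoting are not handled) are assumptions made for illustration. The sample menu is hypothetical and only loosely follows the INPUT 4 sequence described above.

```python
import os

# FUNCTIONs whose ARGUMENT is a voice file to be played (assumption: this set
# is taken from the FUNCTION table; record is left out because it creates files).
VOICE_FUNCTIONS = {"play", "Play", "play64", "Play64", "passwd"}


def voice_files(menu_text):
    """Return the voice file ARGUMENTs named on play-type lines of one menu."""
    found = []
    for raw in menu_text.splitlines():
        # Drop the comment part. Simplification: a quoted '\;' is not handled.
        line = raw.split(";", 1)[0].strip()
        fields = line.split()
        # A menu line is INPUT FUNCTION ARGUMENT [FLOWCONTROL].
        if len(fields) >= 3 and fields[1] in VOICE_FUNCTIONS:
            found.append(fields[2])
    return found


def missing_files(menu_text, run_dir="."):
    """List referenced voice files that do not exist.

    Bare names are looked up in run_dir, the directory where vda would be run.
    Full paths are used as given (os.path.join keeps an absolute path as is).
    """
    return [name for name in voice_files(menu_text)
            if not os.path.exists(os.path.join(run_dir, name))]


if __name__ == "__main__":
    # Hypothetical menu for illustration only.
    sample = (
        "G play greeting:e:v        ; played once\n"
        "P play prompt:e:v\n"
        "4 play pleasewait:e:v nextline\n"
        ". dial 1234 exit\n"
    )
    for name in missing_files(sample):
        print("missing voice file:", name)
```

Running the sketch in the application directory prints one line per voice file that still needs to be recorded with ve(1).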
6.3 Running Vda

The vda(1) program is run from the UNIX shell as follows:

vda [-c channel] [-L TTlength] [-t timeout] [-r MAXrecord] [-f flashdur]
    [-V vocabfile] [-v] [-l linecontrol] [-p passwordfile] [-m] [-i] [-D]
    [-F tmpl_file] menufile

Arguments inside square brackets are optional. The menufile is the root of the tree on which vda(1) operates. It is an ASCII file formatted as described in the "FILE FORMATS" section. The command line options are described in the following table.

6.3.1 Vda Command Line Options

-m  Multi-key. This option causes the root node of the database tree to be processed for multi-key, # terminated touch tone input.

-c channel  The vda(1) process will use the specified channel. By default, if no channel is specified, the channel is -1 and the channel with the lowest logical number is used. If the channel is busy, vda(1) will interrupt only lower priority applications. A value of -2 causes vda(1) to search from the lowest channel that is not busy.

-p passwordfile  An ASCII file with a set of numeric password strings, separated by newlines, can be used as the optional passwordfile. If any passwd functions are used in the database the passwordfile is required. Each entry in the passwordfile is a valid password.

-L TTlength  When processing multi-key touch tone inputs vda(1) waits for a '#' key to stop collecting digits. Vda(1) will also stop collecting digits when TTlength digits have been collected. By default digit collection stops after the caller enters 21 keys. TTlength cannot exceed 21.

-t timeout  Vda(1) waits for a touch tone input after playing a prompt. If no input is received within timeout seconds vda(1) hangs up and prepares to accept the next call. By default vda(1) waits 15 seconds for a touch tone input. A value of 0 means wait forever. Maximum value: 59 seconds.

-r MAXrecord  When vda(1) is recording speech it will stop when MAXrecord bytes have been recorded.
Recording is at 16k bits per second (2,000 bytes per second), and MAXrecord has a default value of 120,000, corresponding to 60 seconds of speech. Only values between 12,000 and 840,000 are allowed.

-f flashdur  The dial command flashes the switch hook when it encounters an 'f' in its ARGUMENT. By default, the duration of this flash is 500 milliseconds. If this value is not appropriate for a given PBX, the duration of the flash may be altered with the "-f" option. Flashdur is an integer in units of milliseconds.

-V vocabfile  Vda(1) normally uses /usr/lib/voice/vda_vocab/vocab:e:v and /usr/lib/voice/vda_vocab/vocab_h for its numbers and custom vocabulary. This is equivalent to running vda(1) with the option -V /usr/lib/voice/vda_vocab/vocab. When the -V option is used, a :e:v is appended to vocabfile to determine the vocabulary voice file, and a _h is appended to it to determine the vocabulary header file. This option permits a vda(1) application to have a private vocabulary that does not interfere with other vda(1) applications.

-v  Verbose. This option causes diagnostic information to be printed to standard error when the dial command is executed. It is useful for diagnosing problems with transferring calls.

The following options are provided to support voice applications written as shell scripts.

-i  Immediate mode. Vda(1) normally waits for the phone line to ring before it starts processing its script. With this option vda(1) starts immediately. When the conversation is complete, vda(1) then exits, rather than waiting for the phone to ring. A -l option (see below) should be used in conjunction with -i, to set the initial state of the voice card port switches.

-l linecontrol  The initial and quiescent state of the voice card port switches may be changed with this option. Linecontrol values of 0 (the default), 2, or 3 correspond to states of on-hook, line and set, or set only, respectively.
-D  If vda(1) was run with the -i option, delay automatic restoration of the voice card port switches for 10 seconds after vda(1) exits.

-F tmpl_file  Use the call classification template file tmpl_file when dialing. The template file permits the addition of new tones or the redefinition of existing tones. Its format is defined in v_cc_tfile(3).

To run several instances of the same data base on multiple channels, run several vda(1) processes in the background, each with a different channel number. If you want to be able to exit the invocation shell and leave the vda(1) processes running, use the nohup(1) command with each vda(1) invocation. Running vda(1) with nohup(1) in the background means that to stop the vda(1) process you must use kill(1). Ps(1) will report the process numbers needed to kill the vda(1) processes.

The shell command line below would run the application shown in Figures 1 and 2 at the end of this paper. Assume the ASCII file for the top menu is named topmenu. This command uses channel 0.

vda -c 0 topmenu

The command below does the same thing, but it runs in the background and it lets you exit the shell without stopping vda(1).

nohup vda -c 0 topmenu &

Much of this shell interaction can be hidden in shell scripts to improve ease of use. Note that vda(1) terminates immediately if it receives an interrupt, so hitting the interrupt key in the same shell that spawned vda(1) will kill vda(1). However, as long as vda(1) was run with nohup(1), EOF (usually ^d) will have no ill effects, so you can safely exit the shell that spawned vda(1).

6.4 Caller Interface

When a caller telephones a vda(1) application, vda(1) answers the phone with the root greeting message, or the root prompt if no greeting is present. Interaction proceeds as defined in the data base. If the caller fails to make any entry when prompted, after 15 seconds (or timeout seconds) one of two things happens.
If there is no timeout line in the menu, the data base is exited and vda(1) hangs up. If there is a timeout line in the menu it is executed, and control flows according to its FLOWCONTROL field (usually exit here).

Special care should be taken with the timeout line to prevent an endless loop of timeouts after the caller hangs up. On a VP1, vda(1) is not aware when the caller hangs up; it only sees a timeout when the caller fails to hit a touch tone. On a VP4, depending on the local switch or CO, vda(1) may see a no-line-current event (VE_LNOCURRENT) when the caller hangs up. This event causes vda(1) to hang up the current call and to wait for a new call. A FLOWCONTROL value of exit is almost always appropriate on a timeout line.

On a VP1, operation is similar for local phones, except that instead of calling to start, the set is taken off-hook to start, and hang-up is recognized immediately rather than through a timeout.

When the caller makes a menu selection the appropriate line in the ASCII menu file is executed. If the caller makes an entry that has no corresponding INPUT entry in the menu then one of two things occurs. If there is a default line in the menu it is executed. If there is no default line, the prompt is repeated. For a multi-key menu the entry is not processed until a '#' tone is received (or TTlength tones are received), so it is possible to time out if the caller fails to hit the '#' key.

The caller need not wait for a prompt to finish playing before making a selection. Hitting a touch tone during prompt playback immediately executes that selection. Likewise a leaf message may be interrupted with a touch tone. Exceptions to these behaviors are provided by capitalizing the 'P' in the play commands: play, playword, and playnum.
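
The selection rules in this section, together with the confirm, default, and time out INPUT values from the "FILE FORMATS" section, can be summarized in a few lines of code. The Python sketch below is only a model of the behavior described in this paper, not the vda(1) implementation. The function name, the two marker constants, and the representation of a menu as a dictionary are all assumptions made for illustration.

```python
# Markers for outcomes that are not menu lines (names are illustrative only).
HANG_UP = "<hang up>"
REPEAT_PROMPT = "<repeat prompt>"


def lines_to_run(selections, entry, confirm=None, default=None, timeout=None):
    """Model which menu lines run after the prompt has been played.

    selections: dict mapping a touch tone (or touch tone string) to its line
    entry:      what the caller entered, or None if nothing arrived in time
    confirm:    text of the '+' line, if the menu has one
    default:    text of the '-' line, if the menu has one
    timeout:    text of the 'T' line, if the menu has one
    """
    if entry is None:
        # No input within timeout seconds: run the T line, else hang up.
        return [timeout] if timeout is not None else [HANG_UP]
    if entry in selections:
        # Valid selection: the confirm line, if any, runs before the selection.
        chosen = [selections[entry]]
        return [confirm] + chosen if confirm is not None else chosen
    # Entry with no corresponding INPUT value: default line, else prompt again.
    return [default] if default is not None else [REPEAT_PROMPT]


if __name__ == "__main__":
    menu = {
        "1": "play message1:e:v loop",
        "0": "play good-bye:e:v exit",
    }
    for entry in ("1", "7", None):
        print(repr(entry), "->", lines_to_run(menu, entry))
```

The model makes the point about timeout lines concrete: a timeout line whose FLOWCONTROL is loop would be selected again on every later timeout, which is why exit is almost always the right choice there.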
6.5 Script-Processing Algorithm

The pseudo shell algorithm shown in Figure 4, at the back of this paper, describes how the greeting, prompt, timeout, and run lines are processed by vda(1), regardless of the order in which they appear in the file. Note that a timeout line is optional, but at least one of the other three lines must appear in every menu.

7.0 COLLECTING AND REPORTING STATISTICS

Three FUNCTIONs aid in collecting and reporting statistics on vda(1) usage: register, log, and monitor. Register provides access to a set of ten general-purpose registers from within vda(1) scripts. These registers are stored in shared memory, and their values span invocations and terminations of vda(1) processes. Log and monitor provide a method of printing variable strings from within vda(1) scripts. Log sends output to standard output, and monitor sends output to shared memory.

7.1 Register

Register implements stack-oriented calculations similar to many of those provided by the UNIX desk calculator, dc(1). Register uses a stack ten entries long, and accesses ten general-purpose registers '0' through '9'. Registers may be pushed onto or popped from the top of the stack. Literal numbers may also be pushed. Plus and minus operators work on the top two stack entries, popping both, and store the results on the top of the stack. Minus subtracts the top of the stack from the second element of the stack. Plus and minus will cause an underflow error, printed to standard error, if fewer than two values are currently on the stack.

The ARGUMENT to register may contain several white-space or ',' separated operations. If white-space is used, the entire ARGUMENT must be enclosed in square brackets ("[ ]"). The following table shows how these ARGUMENT characters are interpreted.

7.1.1 Register ARGUMENT Syntax

n0 - n9  Push numeric register onto stack. For example: n4 pushes the contents of register 4 on the stack.

s0 - s9  Pop numeric register from stack.
For example: s4 pops the top of the stack and stores the value in register 4.

i  Increment the value on the top of the stack.

d  Decrement the value on the top of the stack.

number  Push the numeric value onto the top of the stack.

+  Add the value on the top of the stack to the second value on the stack, pop both values off the stack, and store the result on the top of the stack.

-  Subtract the value on the top of the stack from the second value on the stack, pop both values off the stack, and store the result on the top of the stack.

r0 - r5  Push values written to standard output by the last shell script vda(1) ran onto the stack. For example: r1 stores the numeric value of the first numeric string the shell script wrote to standard output.

u  Push the last touch tone [string] entered by the caller. The numeric value is stored.

The example line below illustrates how the register FUNCTION is used. This line pushes registers 0 and 1 onto the stack, sums their values, and stores the result in register 2. It also stores the caller's last touch tone entry in register 3.

1 register [n0n1+s2 us3] loop

7.2 Log

Log uses a printf(3) like syntax to write strings to standard output. The formatting string appears in the ARGUMENT field and must be enclosed in square brackets ("[ ]") when it includes white-space. The formatting string has two components: a string enclosed in quotes that includes characters to be printed and special characters to format variables, and a list of variables to use with the special formatting characters. The two special formats supported are %d for numeric values and %s for strings. Refer to the printf(3) page in the AT&T 386 UNIX System V Programmer's Reference Manual for details on these formats, including how to specify field widths. Log actually passes the quoted component to sprintf(), so all string and numeric formatting provided by sprintf() is supported.
The table below lists the variables that may be used in the second component of the formatting string, and their types (string or numeric). These variables may be separated by commas or white-space.

7.2.1 Log ARGUMENT Variables

n0 - n9   Numeric (%d). The ten general-purpose registers.
c         Numeric (%d). The number of the voice channel in use.
x         Numeric (%d). The exit value from the last shell script run.
r0 - r5   String (%s). The six values written by the last shell script run.
u         String (%s). The last touch tone [string] entered by the caller.
t         String (%s). The current date and time.

The example line below logs the time, the user's input, and the contents of register 4 to standard output.

1    log    ["On %s \\n caller entered %s. R4=%d\\n",t,u,n4]    loop

Note that the newline character requires two '\' characters. This is because the vda(1) script parser uses '\' as a quoting character, just as sprintf() does. Vda(1) strips the first '\', interpreting it as a quote of the second '\'.

7.3 Monitor

Syntactically, monitor is identical to log. The difference is that monitor sends its output to the portion of shared memory that corresponds to the voice channel in use.

The rest of this section contains information needed to implement a monitor program in the C programming language. This information is not needed to use vda(1) itself. Please refer to the 386 UNIX System V Programmer's Reference Manual, chapters 2 and 3, for information on the shared memory functions mentioned in this section.

Vda(1) creates or attaches to a structured shared memory segment that holds, for each vda(1) invocation, registers, monitor strings, and other information. All interprocess communications with this segment need a key to address it. Such a key may be obtained from ftok(3), using a path of /usr/bin/vda and an id of V.
The functions shmctl(2), shmget(2), and shmop(2) are used in shared memory interprocess communications. Shmget(2) takes the key obtained from ftok(3) and returns a shared memory identifier (shmid), which shmctl(2) and shmop(2) then use. The header file vcb.h describes the structure of shared memory that vda(1) uses. A monitor function, accessing this shared memory, could be used to display vda(1) activity on the screen in real time.

8.0 EDITING THE VDA VOCABULARY

The FUNCTIONs playnum and playword play components from a single voice file, normally /usr/lib/voice/vda_vocab/vocab:e:v. This section discusses how the components of this voice file may be edited, using ve(1) and two tools provided with vda(1). The standard vocabulary has only numbers, but may also be extended to include up to twenty-six custom vocabulary words or phrases. Such extensions are also discussed below.

8.1 Extract

The utility used to extract the voice components from vocab:e:v is called extract(1). It is run from the directory /usr/lib/voice/vda_vocab. Extract(1) accepts, as an argument, the name of a directory in which to store the separate voice components. If no argument is given, the directory /usr/lib/voice/vda_vocab/scratch is used. Extract(1) creates the following separate voice files:

0:e:v   1:e:v   2:e:v   3:e:v   4:e:v   5:e:v   6:e:v   7:e:v   8:e:v   9:e:v
10:e:v  11:e:v  12:e:v  13:e:v  14:e:v  15:e:v  16:e:v  17:e:v  18:e:v  19:e:v
20:e:v  30:e:v  40:e:v  50:e:v  60:e:v  70:e:v  80:e:v  90:e:v
100:e:v  1000:e:v  million:e:v  billion:e:v
A:e:v   B:e:v   C:e:v   D:e:v   E:e:v   F:e:v   G:e:v   H:e:v   I:e:v
J:e:v   K:e:v   L:e:v   M:e:v   N:e:v   O:e:v   P:e:v   Q:e:v   R:e:v
S:e:v   T:e:v   U:e:v   V:e:v   W:e:v   X:e:v   Y:e:v   Z:e:v

Normally the twenty-six letter files are empty. These hold the custom vocabulary words or phrases.

Ve(1) may be used to edit any of the extracted voice files. If the number files are re-recorded, special care should be taken to trim leading and trailing silences, maintain monotone, and keep volume levels constant.
This will help the numbers spoken by playnum sound better.

The twenty-six letter files may be edited to hold any word or phrase. Playword may then be used to play combinations of these phrases and numeric digits. The line below illustrates this. Assume A:e:v holds a recording of the phrase "partly sunny," and B:e:v holds "weather is predicted for today."

1    playword    AB    loop

The line above plays the phrase "partly sunny weather is predicted for today." Note that "AB" could have been written to standard output by a shell script, and then the line below would have produced the same result.

1    playword    r1    loop

To have vda(1) use the extracted files that have been edited, the utility described in the next section must be run.

Do not modify any non-voice files in /usr/lib/voice/vda_vocab/scratch. These files are used and modified by extract(1) and rebuild(1).

8.2 Rebuild

The utility used to rebuild the voice file vocab:e:v from the extracted components is called rebuild(1). It is run from the directory /usr/lib/voice/vda_vocab. Rebuild(1) will replace the recordings in vocab:e:v with their extracted versions. Whatever directory extract(1) uses is automatically used by rebuild(1); that is, rebuild(1) does not accept a directory name as an argument. The name of this directory is stored in the file /usr/lib/voice/vda_vocab/scratch/scratch_path, which initially contains the scratch directory path and is updated whenever a directory is specified with extract(1). If rebuild(1) is run with a -d argument, the extracted versions will be deleted as they are built into vocab:e:v.

Normally editing a custom vocabulary or the number files is an iterative process. The new versions are tested with vda(1) until they sound acceptable. During each iteration rebuild(1) is run without the -d argument. When editing is complete, rebuild(1) is run with the -d option to save disk space. The components can always be extracted for further editing at a future date.
Rebuild(1) expects to find only the voice files created by extract(1) in the component-editing directory. Take care not to create additional voice files in that directory, and not to remove any of the voice files extract(1) puts there. Rebuild(1) will not work if voice files have been added to, or removed from, this directory. Note that running ve(1) with no arguments creates a file named voice:e:v, and this file in the component-editing directory will prevent rebuild(1) from functioning.

Do not modify any non-voice files in /usr/lib/voice/vda_vocab/scratch. These files are used and modified by extract(1) and rebuild(1).

8.3 Private Vocabularies

A vda(1) application can use a private vocabulary. This permits two different vda(1) applications that use different vocabularies to be used on the same system. Private vocabularies are used with the -V command line option.

A private vocabulary is created from a custom vocabulary by copying the vocab:e:v and vocab_h files to new locations, possibly with new names. For example, /tmp/newvocab:e:v and /tmp/newvocab_h could provide a private vocabulary. The two files must be in the same directory and must use the same name except for the suffixes :e:v and _h. Any editing of the vocabulary, with extract(1), ve(1), and rebuild(1), must be done prior to creating the private vocabulary.

Most developers of installable vda(1) applications with custom vocabularies will want to use private vocabularies. Refer to the section entitled "CREATING INSTALLABLE APPLICATIONS" below.

9.0 CREATING INSTALLABLE APPLICATIONS

Vda(1) can execute installable voice applications written by independent software developers. Such an application could include the definition of the data base tree, administration utilities, and voice files, all provided on an "Install" disk. Refer to the AT&T 386 UNIX System V Programmer's Guide for information on creating an installable application.
If a vda(1) application with a private vocabulary is created, the two vocabulary files must be put in the same directory by the install script. In addition, the application must invoke vda(1) with the appropriate -V command line option, so that the private vocabulary will be used. The extract(1) and rebuild(1) utilities described in the section entitled "EDITING THE VDA VOCABULARY" will not operate on a private vocabulary, so a private vocabulary is safe from being modified by these tools.

10.0 SUGGESTIONS

Voice menus should not contain too many choices, since the caller must remember the choices after hearing them. Keep the choices to four or fewer if possible.

A command to repeat the prompt is often useful, especially for larger menus. This can be implemented by having the prompt tell callers to hit a certain key to repeat the prompt. If there is no INPUT value for that key, the prompt is automatically repeated (or the default line is executed, if there is one). The "repeat the prompt" command should be consistent in all menus.

For wordy prompts, tell the caller when to make their entry. Prompts that trail off by listing the last choice without telling the caller to make their selection can be confusing.

For any tree structure more than one level deep, a command to leave a menu and return to the parent menu is useful. A consistent return command should be used in all menus, to avoid confusing the caller. The zero key, for example, could be used in all menus to return to the parent node of the menu.

Prompts should be recorded speaking as quickly as possible without sacrificing intelligibility. Prompts are spoken to the caller repeatedly, and a slow prompt can become annoying. Prompts should also be brief. If there is general information or friendly chit-chat to be spoken to the caller, put it in the greeting, which is only spoken once on entering a menu.
Playnum and playword will not accept an ARGUMENT value of "u", for last user input. However, the lines below illustrate a way to work around this limitation. These lines play a file "entered:e:v", and then play the number entered by the caller.

-    play       entered:e:v    nextline
.    shell      echo           nextline
.    playnum    r1             loop

Note that playword would play the number one digit at a time, so if the value is a phone number, use playword rather than playnum.

;***********************************************
; Vda script file named topmenu. Single key.
;***********************************************
G    play      greeting1:e:v
P    play      prompt1:e:v
1    play      intro:e:v         loop
2    play      topstory:e:v      loop
3    branch    newsmenu          loop
4    play      pleasewait:e:v    nextline
.    dial      fp1234            exit
0    play      exit:e:v          exit
;***********************************************

Figure 2: Top Menu

;**********************************************
; Vda script file named newsmenu. Single key.
;**********************************************
G    play    greeting2:e:v
P    play    prompt2:e:v
1    play    story1:e:v    loop
2    play    story2:e:v    loop
3    play    story3:e:v    loop
4    play    null:e:v      return
;**********************************************

Figure 3: News Menu

/* ------------------------------------------------- */
if( there is a greeting ) {
    execute the greeting
}
PROMPT_LOOP:
if( there is a prompt ) {
    execute the prompt
    wait for a touch tone input
    if( no touch tone input ) {
        if( there is a timeout line ) {
            execute the timeout line
        } else {
            hang up and exit
        }
    }
    if( there is a run line ) {
        execute the run line
        branch on the shell script's exit value
    } else {
        branch on the touch tone input
    }
    go to PROMPT_LOOP
} else {   /* there is no prompt */
    if( there is a run line ) {
        execute the run line
        branch on the shell script's exit value
    } else {
        return to parent node
    }
}
/* ------------------------------------------------- */

Figure 4: Script Processing Algorithm
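The control flow of Figure 4 can also be traced with a small Python sketch. This is only an illustration under stated assumptions, not how vda(1) is written: the function name process_menu, the set-of-line-names representation of a menu, and the step labels are all hypothetical. The sketch also assumes that after a timeout line runs the menu simply prompts again, and it stops when the supplied inputs run out, whereas a real menu keeps looping until an ACTION such as exit or return ends it.

```python
def process_menu(menu, tones):
    """Return the ordered steps Figure 4 takes for one menu.

    menu  -- hypothetical set naming the lines present in the script:
             any of "greeting", "prompt", "timeout", "run"
    tones -- caller inputs in order; None stands for "no touch tone input"
    """
    steps = []
    if "greeting" in menu:
        steps.append("greeting")

    if "prompt" not in menu:
        # No prompt: run the shell script if there is one, else go back up.
        if "run" in menu:
            steps += ["run", "branch on exit value"]
        else:
            steps.append("return to parent")
        return steps

    for tone in tones:  # each pass corresponds to PROMPT_LOOP
        steps.append("prompt")
        if tone is None:
            if "timeout" in menu:
                steps.append("timeout")
                continue  # assumption: re-prompt after the timeout line
            steps.append("hang up")
            return steps
        if "run" in menu:
            steps += ["run", "branch on exit value"]
        else:
            steps.append(f"branch on tone {tone}")
    return steps
```

For a menu with a greeting and a prompt and a single key press of "1", the trace is greeting, prompt, branch on tone 1. Modelling the menu as a set of line names mirrors the point made in section 6.5 that the order of the lines in the file does not matter.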