Home - Parsing

Expression evaluation, and more general parsers.

Almost every program will at some point read a number from somewhere, and almost everyone will have reached for a pocket calculator to work out some simple expression so that they can enter a number such as 0.7854 instead of entering pi/4. The latter is easier for someone to read and understand, is more precise and requires less typing, but few programs or programming languages make it easy to enter values in this fashion.

The purpose of this program is to make it easy. It evolves from something that shows the principle of how to parse a statement, through being a useful interpreter for arithmetic expressions, to a compiler for a simple language. If you just want to add this capability to your own programs without understanding how it works, you can go directly to eval4, but if you want to know more, it's all developed from the very beginning below.

Introduction to parsing expressions.

Parsing a source program is the process of working out the structure of that program - analysing its syntax - but for the parser to be of any use to us, we must also take actions which are relevant to those structures - analysing the meaning or the semantics of the program. In the case of an arithmetic expression, a last-in, first-out stack of values is all we need to carry out the relevant actions. As we read in a value, it is pushed onto the top of the stack. At appropriate points, we remove one or two operands from the top of the stack, perform some operation on them, and push the result onto the stack. For example, to evaluate a+b*c we would

    push a
    push b
    push c
    pop the top two elements off the stack, multiply them together, and push back the result
    pop the top two elements off the stack, add      them together, and push back the result

The required result is the top (the only) element on the stack. Successive views of the stack would look like this:

Stack
level
Step 1Step 2Step 3Step 4Step 5
1 a b c b*c a+(b*c)
2
a b a
3

a

If we did

    push a
    push b
    add      the top two elements on the stack and push back the result
    push c
    multiply the top two elements on the stack and push back the result

we would have evaluated (a+b)*c rather than a+(b*c), but most programming languages regard multiplication as having a higher precedence than addition so we'll stick to that interpretation. "Precedence" refers to the order in which the operations (add and multiply here) are carried out. High precedence operations will be performed before low precedence operations. Of course, if you are contemplating compiling any of the examples here you probably knew that to start with.

If we know when to push values onto the stack and when to apply what operations to the values on the top of the stack, we can evaluate an expression. It is the parser that tells us when to do things, and the semantics of the structures it recognises that tell us what to do.

Stack-based evaluation is described by postfix or reverse-polish notation though, rather than the normal infix notation used when writing expressions. Note that in the code fragments above, the arithmetic operation comes after the operands have been gathered together, hence postfix. So, the first step in evaluating an expression is to rewrite the expression in postfix notation.

Links to the code for each version of eval can be found at the end of each section, just before the grammar. The programs were originally developed using Borland C++ 5.01 on a PC running Microsoft Windows 95, but have been verified using Microsoft Visual C++ 6.0 under Microsoft Windows 98 and Visual C++ 7.1 under Windows XP. Any feedback on problems you encounter would be welcome, whether with one of these compilers or any other.

Where the sources use tabs, these should be set to 4 characters to see the code layout properly.

At every step, there are alternative ways of doing things. If you don't like what I've done, write your own. The whole point of this article is to show that creating a simple language is not really difficult, and to encourage you to experiment.


Home - Parsing
Valid HTML 4.01 Any comments or queries, write to me.
Site last updated 14 July, 2004