An Interpreter in Object Pascal

Writing an Interpreter in Object Pascal

Part 1 (scroll down for Part 2)

How to Purchase

Paperback Copy (with the option to get 50% off the pdf version) $16.95

Get the eBook (pdf) $15.95

For more information contact: hsauro@gmail.com

This is part 1 of a series that will show you how to write an interactive interpreter in Object Pascal. Part 1 of the series will cover introductory material including a description of the language we’ll create, a full lexical analyzer for the language, how to use DUnitX for unit testing, and an introduction to the essential concepts in syntax analysis, recursive descent, grammar, and EBNF. Along the way, we’ll create a simple REPL, give a detailed discussion of how to parse expressions and build a simple interactive calculator to illustrate the theory. The book provides fully working code and explains in plain English how the code works and why certain decisions were made, including alternative designs. The book makes liberal use of code throughout the book chapters. Everything is done without the help of third-party tools such as Yacc, ANTLR or Flex. All you need is a standard installation of Free Pascal or Embarcaderos’s Delphi (including the free community edition).

The text is geared to hobbyists and midlevel developers who need an accessible introduction to lexical analysis and parsing. It’s also for students starting out in compiler and interpreter design and need something more digestible.

All source code is open source under Apache 2.0 and available from Github.

Available in paperback form in the first week of January 2019 from Amazon.

Price $16.95 (paper), $15.95 (eBook) 170 pages

ISBN: 978-1-7325486-0-2

Contents:

  1. Introduction
  2. a) Why Object Pascal
  3. b) What is an interpreter
  4. c) Parts of an interpreter
  5. The Rhodus Language
  6. Lexical Analysis
    1. Initial API
    2. Input streams
    3. Retrieving tokens
    4. First run
    5. Adding more tokens
  7. Testing
    1. Introduction to testing
    2. Using DUnitX
  8. An Interactive Console
  9. Introduction to Syntax Analysis
    1. Grammar
    2. Production rules
    3. LL(k)
    4. Recursive descent
    5. Factoring, the dangling else
    6. Left recursion
    7. Ambiguous Grammars
    8. A simple calculator
    9. Syntax trees
    10. Adding exponentiation and the unary minus
  10. Testing the Calculator
  11. Adding Assignments and Variables
    1. Using a queue for token lookahead
    2. Updating the syntax analyzer
    3. Updating the lexical analyzer
  12. Building a Recognizer
  13. Appendix EBNF

A brief description of the Rhodus language Version One

The language for the interpreter is a mix of ideas from Pascal, Basic and Python. This is a toy language in the sense it is used to illustrate how one might go about building an interpreter. This doesn’t mean it couldn’t be used in a serious application but given the availability of existing mature scripting languages, it would seem less likely. It’s primary purpose is therefore educational. The easiest way to describe the language is via some examples. A couple of initial points worth noting:

  1. Newlines are not part of the syntax
  2. Semicolons separate statements
// Data Types
a = 3;
b = 3.1415; b = -1.234E-12;
c = True; c = False;
d = "String";
myList = {1,2,3,{5,6},{"xyz", True, {False}}};

/* Operators:
   +, -, *, /, ^, (), mod, div, and or, not, xor, ==, >, >=, <, <=, !=
*/
println ("Hello World")
// Conditionals
if a > 6 then
   b = 9
else
   if b < 3 then
      x = 7
   else
      x = 8
   end
end
// Loops
a = 10
while a > 5 do
     a = a - 1
end;

a = 10
repeat
   a = a - 1
until a < 5;

for i = 1 to 10 do
    println (i)
    if i == 7 then
       break;
end;

for i = 0 to 10 do
    a = 5;
    for j = 0 to 20 do
        b = 7;
        for k = 0 to 30 do
            c = 9
        end    
    end 
end
// User defined functions
function add (a, b)
   return a + b
end;

x = 6;
function sub (a)
  global x
  return a - b;
end;

println ("Add two numbers: ", add (4,4))

// Functions can be passed to other functions
function callfunc (func)
    return func (5,6)
end;

println ("Add two numbers: ", callfunc (add))
// Lists
function bubbleSort (a, length)
   for i = 0 to length - 1 do
       for j = i+1 to length - 1 do
           if (a[i] > a[j])
               temp = a[i]; a[i] = a[j]; a[j] = temp;
           end
       end
   end
end;

// Sort the following array
alist = {5,2,6,7,3,2,5,1,4,9};
lengthOfList = 9;
bubbleSort (alist, lengthOfList);
println (alist);
// Larger example
function hamming (limit)
   h[0] = 1;
   x2 = 2; x3 = 3; x5 = 5;
   i  = 0; j  = 0; k = 0;
   for n = 1 to limit do
        h[n] = min(x2, min(x3, x5));
        if x2 == h[n] then i = i + 1; x2 = 2*h[i] end;
        if x3 == h[n] then j = j + 1; x3 = 3*h[j] end;
        if x5 == h[n] then k = k + 1; x5 = 5*h[k] end;
    end;
    return h[limit -1];       
end;

// Currently no library support for lists until Part 2 so
// we have to declare a list of a specific size like this
h = {0,0,0,0,0,0,0,0,0,0,0,0,0};

for i = 1 to 20 do
     println (hamming [i]);
end

Part II

Progress towards Part II of the Book

May/June/July 2019:

1. Finalizing the machine instructions for the virtual machine - complete

2. Implementing the virtual machine - supporting local variables completed.

3. Adding system tests - ongoing

July 27: First 5 chapters (not including (Chapter 1) for part 2 are now in draft form (73 pages), these chapters include:

  • The Virtual Machine
  • A Simple Assembler
  • Testing and Performance
  • User Functions
  • Supporting Lists

as well as an appendix:

  • Appendix 1: List of Virtual Machine Opcodes

The figure below shows a GUI I created today to make it easier to load and run virtual machine scripts (19 July 2019)

Examples of VM scripts:

# List test, create nested list {{6,7},4,9}# pushi 1 pushi 4 pushi 9 createList 3 store 0 pushi 6 pushi 7 createList 2 pushi 0 load 0 svecIdx load 0 print
# Test simple builtin call, 3 is the sqrt function, repeat 10 times# pushd 10.0 store 0L1: load 0 builtin 3 print load 0 pushd 1.0 sub store 0 load 0 print load 0 pushd 0.0 isGt jmpIfTrue L1
# Call with arguments# pushi 4 call 1
# Repeat/until loop# i = 1# repeat# print i# i = i + 1# until i > 5# print "Done"# pushi 1 store 0l1: load 0 print pushi 1 load 0 add store 0 load 0 pushi 5 isGt jmpIfFalse l1 pushs "Done"l2: print

End of December 2019:

After been busy in the fall on other things I finally had time to get back to part 2 of the interpreter book. The book itself is now up to 70 pages and aim to submit for publication at around 150 pages or so.

As of today (1/3/20), the interpreter can now compile recursive code such as:

// Recursive version for computing the Fibonacci Sequence
// Fn = Fn-1 + Fn-2, where Fo = 0 and F1 = 0
function fib (x)
   if ((x == 0) or (x == 1)) then
      return x;
   end;
   return fib (x-1) + fib (x-2)
end;

f = fib (31);
println (f);

or user functions with local variables:

function printme (x, y)
   a = x;
   b = y;   
   println ("Print me: ");
   return a+b
end;

a = printme(2, 4) + 10;
println ("Answer = ", a)

It also has support for while, for, repeat, and conditionals:

println ("Testing while loop");
a = 6;
while a >= 1 do
  print ("while: ");
  println (a);  
  a = a - 1;
end;

println ("Testing repeat loop");
a = 6;
repeat 
  print ("repeat: ");
  println (a);  
  a = a - 1;
until a < 1;

println ("If statements");
a = 6;
if a > 4 then
   print ("If statement: ");
   println (a);
end;

println ("For loops:");
for i = 1 to 10 do
    print (i, " ")
end

As of today (1/8/20), the interpreter now supports lists and can execute code such as:

// Compute the Hilbert Matrix

n = 5; // Size of matrix
row = n*{0};
m = n*{row};

for i = 0 to n - 1 do
    for j = 0 to n - 1 do
          m[i,j] = 1/((i+1)+(j+1)-1);
    end;
end;

Write a function to compute the Hilbert matrix:

function hillbert (n)
   row = n*{0};
   m = n*{row};

   for i = 0 to n - 1 do
       for j = 0 to n - 1 do
           m[i,j] = 1/((i+1)+(j+1)-1);
       end;
   end;
   return m;
end;

println (hillbert (3));

As of today (1/14/2020), the interpreter now supports break statements in looks and can execute code such as:

i = 1;
while i < 10 do
    i = i + 1;
    println ("In loop: ", i);
    if i == 5 then
       break;
    end;
end;
println ("i = ", i);

or

for i = 3 to 10 do
    for j = 2 to 12 do
        println ("i = ", i, " j = ", j);
        if j == 5 then
           break;
        end;
    end;
    if i == 8 then
       break;
    end;
end;


That competes the feature list for Part II of the series. I am now upto 145 regression tests. These have cought many edge cases that I would have missed otherwise. I will continue to add more tests to increase the coverage and then finish writing up the text itself.

What’s coming up next in Part II?

  1. Symbol Table: (1)
  2. The Virtual Machine: (1)
  3. Code Generation: (1)


Whats coming up in Part III?

  1. Building the AST
  2. Symbol Table: (2)
  3. Error Handling
  4. Version Two of the Rhodus Language
  5. Adding Module and Array Support
  6. Math and IO Libraries
  7. Adding Exception Handling
  8. List and Array Libraries
  9. The Virtual Machine: (2)
  10. Code Generation: (2)
  11. Semantic Analysis
  12. The Virtual Machine: (3)
  13. Code Generation: (3)
  14. A Better REPL