CSE 130 - Programming Assignment #4

OCaml

125 points

(see submission instructions below)

(click your browser's refresh button to ensure that you have the most recent version)

(Programming Assignment #4 FAQ)

Note: See this for instructions on starting OCaml in the ACS lab machines. To download and install OCaml on your home machines see the instructions here. Remember that this is only to enable you to play with the assignment at home: the final version turned in must work on the ACS Linux machines. While you can use windows to begin working with OCaml, the code you turn in must be that required for the ACS Linux environment.

Code Documentation and General Requirements

Code for all programming assignments should be well documented. A working program with no comments will receive only partial credit. Documentation entails writing a description of each function/method, class/structure, as well as comments throughout the code to explain the program logic. Comments in OCaml are enclosed within (* *), and may be nested. It is understood that some of the exercises in this programming assignment require extremely little code and will not require extensive comments.

While few programming assignments pretend to mimic the "real" world, they may, nevertheless, contain some of the ambiguity that exists outside the classroom. If, for example, an assignment is amenable to differing interpretations, such that more than one algorithm may implement a correct solution to the assignment, it is incumbent upon the programmer to document not only the functionality of the algorithm (and more broadly his/her interpretation of the program requirements), but to articulate clearly the reasoning behind a particular choice of solution.


Assignment Overview

The overall objective of this assignment is to expose you to some advanced features of OCaml such as higher-order functions, abstract datatypes and modules, as well as to fully understand the notion of scoping, binding, environments and closures, by implementing a mini-ML interpreter. The template for the assignment, as well as several test cases is available as a zip file pa4.zip that you need to download.

Note: All the solutions can be done using the purely functional fragment of OCaml, using constructs covered in class, and most require the use of recursion. Solutions using imperative features such as references or while loops will receive no credit. Feel free to use any library functions that you want.

It is a good idea to start this assignment early as it is somewhat longer than the first three assignments.

Even though we have included sample inputs and expected outputs for each step in the assignment, these tests are not comprehensive. It is possible to pass them all and still have a lot of mistakes (which our grading will expose). You are responsible for understanding the problems and testing other cases to make sure your implementation is correct.

Assignment Testing and Evaluation

Your functions/programs must compile and/or run on a Linux ACS machine (e.g. ieng6.ucsd.edu , as this is where the verification of your solutions will occur. While you may develop your code on any system, ensure that your code runs as expected on an ACS machine prior to submission. You should test your code in the directories from which the zip files (see below) will be created, as this will approximate the environment used for grading the assignment.

Most of the points, except those for comments and style, will be awarded automatically, by evaluating your functions against a given test suite. In the tests directory, there is a small handful of tests. For this project, you will use make to compile your program. If for some problem, you cannot get the code to compile, leave it as is with the raise ..., with your partial solution enclosed below as a comment. There will be no exceptions to this rule. If the code you submit does not compile with make, you will receive 0 for the assignment.



Submission Instructions

1. Create the zip file for submission

Your solutions to this assignment will be stored in separate files under a directory called solution/, inside which you will place the files : config.make,main.ml,Makefile,nanoLex.mll,nano.ml,nanoParse.mly,rules.make,test.ml,toplevel.ml and there should be no other files in the directory.

After creating and populating the directory as described above, create a zip file called <LastName>_<FirstName>_cse130_pa4.zip by going into the directory solution and executing the UNIX shell command:

zip <LastName>_<FirstName>_cse130_pa4.zip *
You can refer to an example submission file to compare with yours. Make sure that your zipped file's structure is the same as the example.

2. Test the zip file to check for its validity

Once you've created the zip file with your solutions, you will use the validate_pa4 program to see whether your zip file's structure is well-formed to be inspected by our grading system by executing the UNIX shell command:

validate_pa4 <LastName>_<FirstName>_cse130_pa4.zip
The validate_pa4 program will output OK if your zip file is well-formed and your solution is compiled. Otherwise, it will output some error messages. Before going to step 3, make sure that your zip file passes validate_pa4 program. Otherwise you get a zero for the whole assignment. If you have any trouble with this, refer to the instructions in step 1.

3. Submit the zip file via the turnin program

Once your zip file passes the validation check by validate_pa4, you will use the turnin_pa4 program to submit this file for grading by going into the directory solution/ and executing the UNIX shell command:

turnin_pa4 <LastName>_<FirstName>_cse130_pa4.zip

The turnin_pa4 program will provide you with a confirmation of the submission process; make sure that the size of the file indicated by turnin_pa4 matches the size of your tar file. (turnin_pa4 is a thin wrapper script around the ACMS command turnin that repeats validation and ensures that the propper assignment name is passed) Note that you may submit multiple times, but your latest submission overwrites previous submissions, and will be the ONLY one we grade. If you submit before the assignment deadline, and again afterwards, we will count it as if you only submitted after the deadline.


Problem #0: Review the NanoML template

In the template for NanoML, we have provided an idiomatic OCaml compiler project. The project includes many files, but the only file you will need to edit is nano.ml. Nano.ml contains the definitions for the NanoML language (binop, expr, value, and env), several helper functions (*toString, fold, and listAssoc), as well as your starter code (lookup and eval). To load strings and files into the NanoML language, we have provided a parser and lexer, which you do not need to modify. To build the project, use the provided Makefile by running make. This links your code in nano.ml with the parser and lexer, and generates an executible OCAML interpreter in nanoml.top. The parser provides two useful functions for you: string_to_expr, which converts a string to an NanoML expression, and filename_to_expr, which does the same for a file. For example, you should see the following behavior:

$ make
output from make, hopefully with no errors
$ ./nanoml.top
NanoML

$ ./nanoml.top
OCaml version 4.04.0

# Main.string_to_expr "x + y";;
- : Nano.expr = Nano.Bin (Nano.Var "x", Nano.Plus, Nano.Var "y")
# Main.filename_to_expr "tests/t1.ml";;
- : Nano.expr =
Nano.Bin (Nano.Bin (Nano.Const 2, Nano.Plus, Nano.Const 3), Nano.Mul,
Nano.Bin (Nano.Const 4, Nano.Plus, Nano.Const 5))

In the command Main.string_to_expr "x + y";;, we converted the string "x+y" to a NanoML expression. Similarly, the command Main.filename_to_expr "tests/t1.ml";; converted the file tests/t1.ml to a NanoML expression. You do not need to use string_to_expr or filename_to_expr in your submitted code: doing this will cause a compile-time error. As a shortcut, the makefile also produces an executible nanoml.byte which runs filename_to_expr and eval on its argument, e.g.

$ make
output from make, hopefully with no errors
$ ./nanoml.byte tests/t1.ml
NanoML

$ ./nanoml.byte tests/t1.ml
out: Error: Failure("to be written")

To get the expected value for a test, import the file in an OCaml session e.g.

$ ocaml
OCaml version 4.04.0
# #use "tests/t1.ml";;
- : int = 45

Now that you are familiar with how to use this project, we proceed with the assignment. You will implement NanoML in several separate stages, each more expressive than the last. In each problem, you will extend eval to handle more of the expr datatype. By the end of the assignment, your NanoML interpreter will be able to run most of the code you wrote in the previous 3 programming assignments.


Problem #1: NanoML basics (nano.ml)

In this problem, you will start the assignment by implementing some language basics in OCaml. Specifically, you will extend eval to handle the following datatypes:

(a) 20 points

    type binop = Plus | Minus | Mul | Div

    type expr = Const of int
	| Var of string
	| Bin of expr * binop * expr

    type value = Int of int

    type env = (string * value) list
  

Consider the types described above: binop, expr are used to capture simple ML expressions. Each expression is either an integer constant, a variable, or a binary operator applied to two sub-expressions. A value is an integer, and an environment is a list of pairs of variable names and values. Use listAssoc to write a function lookup: string * env -> value that finds the most recent binding for a variable (i.e. the first from the left) in the list representing the environment. Using this function, write a function eval : env * expr -> value that when called with the pair (evn,e) evaluates an NanoML expression e of the above type, in the environment evn , and raises an exception MLFailure ("variable not bound: x") if the expression contains an unbound variable.

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# let evn =[("z1",Int 0);("x",Int 1);("y",Int 2);("z",Int 3);("z1",Int 4)];;
val evn : (string * Nano.value) list = [("z1", Int 0); ("x", Int 1); ("y", Int 2); ("z", Int 3); ("z1", Int 4)]
# let e1 =Bin(Bin(Var "x",Plus,Var "y"), Minus, Bin(Var "z",Plus,Var "z1"));;
val e1 : Nano.expr = Bin (Bin (Var "x", Plus, Var "y"), Minus, Bin (Var "z", Plus, Var "z1"))
# eval (evn,e1);;
- : Nano.value = Int 0
# eval (evn,Var "p");;
Exception: Nano.MLFailure "Variable not bound: p".

(b) 20 points

    type binop = Plus | Minus | Mul | Div
               | Eq | Ne | Lt | Le | And | Or

    type expr = Const of int
              | True | False
              | Var of string
              | Bin of expr * binop * expr
              | If  of expr * expr * expr

    type value = Int of int
               | Bool of bool

    type env = (string * value) list
  

Add support for the binary operators =, !=, <, <=, &&, ||. This will require using the new value type Bool. The operators = and != should work if both operands are Int values, or if both operands are Bool values. The operators < and <= are only defined for Int arguments, and && and || are only defined for Boolarguments. For all other arguments, a MLFailure exception should be raised with an appropriate error message.

Now implement If expressions. Given If(p,t,f), p should be evaluated first, and if it evaluates to true (as a Bool), then t should be evaluated, and the value of the if expression should be the value of t. Similarly, if p evaluates to false, then f should be evaluated and the result returned. If p does not evaluate to a Bool, a MLFailure exception should be raised with an appropriate error message.

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# let evn =[("z1",Int 0);("x",Int 1);("y",Int 2);("z",Int 3);("z1",Int 4)];;
val evn : (string * Nano.value) list = [("z1", Int 0); ("x", Int 1); ("y", Int 2); ("z", Int 3); ("z1", Int 4)]
# let e1 =If(Bin(Var "z1",Lt,Var "x"),Bin(Var "y",Ne,Var "z"),False);;
val e1 : Nano.expr = If (Bin (Var "z1", Lt, Var "x"), Bin (Var "y", Ne, Var "z"), False)
# eval (evn,e1);;
- : Nano.value = Bool true
# let e2 =If(Bin(Var "z1",Eq,Var "x"),Bin(Var "y",Le,Var "z"),Bin(Var "z",Le,Var "y"));;
val e2 : Nano.expr = If (Bin (Var "z1", Eq, Var "x"), Bin (Var "y", Le, Var "z"), Bin (Var "z", Le, Var "y")) # eval (evn,e2);;
- : Nano.value = Bool false


Problem #2: NanoML bindings and functions (nano.ml)

(a) 15 points

Next, you will extend eval to handle let-bindings, i.e. the following datatypes:

    type expr = ...
	| Let of string * expr * expr
	| Letrec of string * expr * expr
  

The types as shown above include "let-in" expressions that introduce local bindings exactly as in OCaml. Let (b,e1,e2) should be evaluated as the ML expression let b = e1 in e2 . Similarly, Letrec (b,e1,e2) should be evaluated as let rec b = e1 in e2 . At this point, we do not support functions, so Let and Letrec should do the same thing.

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# let e1 = Bin(Var "x",Plus,Var "y");;
val e1 : Nano.expr = Bin (Var "x", Plus, Var "y")
# let e2 = Let("x",Const 1,Let("y",Const 2,e1));;
val e2 : Nano.expr = Let ("x", Const 1, Let ("y", Const 2, Bin (Var "x", Plus, Var "y")))
# eval ([],e2);;
- : Nano.value = Int 3
# let e3 = Let("x",Const 1,Let("y",Const 2,Let("z",e1,Let("x",Bin(Var "x",Plus,Var "z"),e1))));;
val e3 : Nano.expr = Let ("x", Const 1, Let ("y", Const 2, Let ("z", Bin (Var "x", Plus, Var "y"), Let ("x", Bin (Var "x", Plus, Var "z"), Bin (Var "x", Plus, Var "y")))))
# eval ([],e3);;
- : Nano.value = Int 6

(b) 15 points

    type expr = ...
		| App of expr * expr
		| Fun of string * expr

    type value = ...
		   | Closure of env * string option * string * expr
  

We now extend the above types so that there is an expression for function application: App(e1,e2) corresponds to applying e2 to the function e1 , we allow function declarations via the expression Fun (x,e) where x and e are respectively the formal parameter and body-expression of the function. For now, assume the function is not recursive. However, functions do have values represented by the Closure (evn,n,x,e) where evn is the environment at the point where that function was declared, and n,x,e are the name, formal parameter, and body expression of the function. If the function is anonymous or declared in a let statement, the name should be None. If the function is declared in a let rec statement, then the name of the function should be Some f (where f is the name of the function). It is very important that you save the name of let-rec functions for the next problem. Extend your implementation of eval by adding the appropriate cases for the new type constructors.

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# eval ([],Fun ("x",Bin(Var "x",Plus,Var "x")));;
- : Nano.value = Closure ([], None, "x", Bin (Var "x", Plus, Var "x"))
# eval ([],App(Fun ("x",Bin(Var "x",Plus,Var "x")),Const 3));;
- : Nano.value = Int 6
# let e3=Let("h",Fun("y",Bin(Var "x", Plus, Var "y")),App(Var "f",Var "h"));;
val e3 : Nano.expr = Let ("h", Fun ("y", Bin (Var "x", Plus, Var "y")), App (Var "f", Var "h"))
# let e2 = Let("x",Const 100,e3);;
val e2 : Nano.expr = Let ("x", Const 100, Let ("h", Fun ("y", Bin (Var "x", Plus, Var "y")), App (Var "f", Var "h")))
# let e1 = Let("f",Fun("g",Let("x",Const 0,App(Var "g",Const 2))),e2);;
val e1 : Nano.expr =
Let ("f", Fun ("g", Let ("x", Const 0, App (Var "g", Const 2))),
Let ("x", Const 100,
Let ("h", Fun ("y", Bin (Var "x", Plus, Var "y")),
App (Var "f", Var "h"))))
# eval ([],e1);;
- : Nano.value = Int 102
# eval ([],Letrec("f",Fun("x",Const 0),Var "f"));;
- : Nano.value = Closure ([], Some "f", "x", Const 0)

(c) 15 points

Make the above work for recursively defined functions (when declared with let rec). The key to recursive functions is that their evaluation environment in the App needs a binding for the function's own name. Use the Some name field of the Closure datatype to properly handle recursive functions in the App rule.

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# eval ([],Letrec("fac",Fun("n",If(Bin(Var "n",Eq,Const 0),Const 1,Bin(Var "n",Mul,App(Var "fac",Bin(Var "n",Minus,Const 1))))),App(Var "fac",Const 10)));;
- : Nano.value = Int 3628800

Problem #3: NanoML lists (nano.ml)

Finally, you will add language support for lists to NanoML. You will start with the basic operations for representing and building lists:

(a) 15 points

	  type binop = ...
	            | Cons

    type expr = ...
              | NilExpr

    type value = ...
               | Nil
               | Pair of value * value
  

Extend your program to support basic list operations. We use Cons and NilExpr to represent OCaml's :: operator and [] expression. At runtime, the value of a list is either empty (Nil) or a pair of the element and the tail (i.e. element::tail -- this is known as a cons cell).

Once you have implemented this functionality and recompiled, you should get the following behavior at the ./nanoml.top prompt:

# open Nano;;
# eval ([],NilExpr);;
- : Nano.value = Nil
# eval ([],Bin(Const 1,Cons,Bin(Const 2,Cons,NilExpr)));;
- : Nano.value = Pair (Int 1, Pair (Int 2, Nil))

(b) 15 points

You will next implement several fundamental list-processing functions: hd, tl, and null. hd takes a nonempty list as input and returns the first element -- if its argument is empty, or not a list, it throws an error. tl takes a nonempty list as input and returns the remainder -- if its argument is empty, or not a list, it throws an error. null tests a list for whether the list is empty or nonempty -- if its argument is not a list, it throws an error.

You will implement this functionality by using the NativeFunc case of value:

    type value = ...
		| NativeFunc of string
  

Whenever a native function is used as a variable, you should return a value with that function's name, for example:

# open Nano;;
# eval ([],Var "hd");;
- : Nano.value = NativeFunc "hd"

Then, when a native function is applied in an App rule, you should return the behavior corresponding to that function, for example:

# open Nano;;
# eval ([],App(Var "hd",Bin(Const 1,Cons,Bin(Const 2,Cons,NilExpr))));;
- : Nano.value = Int 1
# eval ([],App(Var "tl",Bin(Const 1,Cons,Bin(Const 2,Cons,NilExpr))));;
- : Nano.value = Pair (Int 2, Nil)
# eval ([],App(Var "null",NilExpr));;
- : Nano.value = Bool true

(c) 10 points

Finally, you will implement several higher-order list functions: map and foldl. map takes a function as input and a list as (curried) inputs and applies the function to each element of the list. This is OCaml's List.map. foldl takes a function, initial value, and list as (curried) inputs and reduces the list using the function and initial value. This is Ocaml's List.fold_left. As always, if either map or foldl are passed ill-typed inputs, it should result in a MLFailure. However, the error message does not have to be specific to map or foldl, and if an ill-typed expression is never fully evaluated, for example foldl (fun x -> x + 1) true, you do not need to throw an exception.

We recommend that you use your existing list primitives to implement NanoML versions of map and foldl -- for a hint, see tests/t11.ml and tests/t12.ml. Then, when the interpreter sees a map or foldl variable, it returns the saved library version.

After you have implemented this portion, you should see the following behavior:

# open Nano;;
# eval ([],Main.string_to_expr "map (fun x -> x + 1) [1;2;3]");;
- : Nano.value = Pair(Int 2, Pair(Int 3, Pair (Int 4, Nil))
# eval ([],Main.string_to_expr "foldl (fun acc -> fun next -> acc + next) 0 [1;2;3]");;
- : Nano.value = Int 6


Problem #4: NanoML executable

Once you have completed all the above parts, you should end up with an executable "nanoml.byte". You should be able to test it as follows from the shell prompt:

$ ./nanoml.byte tests/t1.ml
...
out: 45
$ ./nanoml.byte tests/t2.ml
...
out: 0
$ ./nanoml.byte tests/t3.ml
...
out: 2

and so forth, for all the files in tests. To get the expected value for the other tests, run them with ocaml:

# #use "tests/t1.ml";;
- : int = 45


If the code you submit does not get passed by validate_pa4 or submitted by turnin_pa4, you will get 0 for the assignment.