Aragain falls

You are near the top of Aragain Falls. The only path leads north. A solid rainbow spans the falls to the west.

> go north

Sandy beach
You are on a large beach on the east shore of the river, which flows by quickly. A path runs south along the river, and a sand-filled passage leads northeast.
There is a shovel here.

> take the shovel


> go northeast

Code for this and the next several episodes is at

All right, we have the opcode and we have a list containing the types of the operands – that is, whether each operand is a one byte integer, a two byte integer, or a one-byte code that refers to a variable. I want to make a list of the actual operands.

The time has come to say how one byte describes a variable. The convention is as follows:

  • 0 means to pop the evaluation stack and use the popped value. (We’ll see later that the “store” portion of the instruction uses the same convention to mean “push the created value onto the stack”, so this is nicely symmetrical.)
  • 1 through 15 identifies a local variable in the current routine activation. There can be no more than 15 local variables in any given routine.
  • 16 through 255 identifies a global variable; the story maintains a block with 240 words that are reserved for use as global variables.

In all cases a variable stores a word.

Of course I want to capture this in the type system, rather than representing these things as naked integers. Each of these concepts gets its own type:

type local_variable = Local of int
type global_variable = Global of int
type variable_location =
  | Stack
  | Local_variable of local_variable
  | Global_variable of global_variable

And now we have enough gear to describe an operand:

type operand =
  | Large of int
  | Small of int
  | Variable of variable_location

Notice that I’m getting pretty deep in my wrapper types here. An operand could be a Variable Local_variable Local 10 for example. But I live in fear of accidentally mixing that up with the small number 10, the address 10, and so on. These wrapper types are cheap.

Anyways, we have a list of operand types, and we have the operands encoded in the instruction bytes. I want a list of operands.

A couple helper functions deal with turning integers into variables; later on I will need to go the other way, so I’ll just implement them both now:

let decode_variable n =
  let maximum_local = 15 in
  if n = 0 then Stack
  else if n <= maximum_local then Local_variable (Local n)
  else Global_variable (Global n)

let encode_variable variable =
  match variable with
  | Stack -> 0
  | Local_variable Local n -> n
  | Global_variable Global n -> n

All right, now we can take that list of operand types and the address of the start of the operands in the instruction, and go to town.

This method is not tail recursive but the maximum number of operands is eight, so I really don’t care.

  let rec decode_operands operand_address operand_types =

Here for the first time we are doing pattern matching on lists. It’s a bit surprising that it took this long to get here because this is a very normal thing to do in OCaml. We handle the list of types recursively.

    match operand_types with

First the base case. If the type list is empty then the operand list is empty.

    | [] -> []

Otherwise, there must be a first operand type, followed by a tail list with the remaining types. There are three possible cases for the head, and we cover each of them with a pattern. Either there is a large operand followed by a tail, or a small operand followed by a tail, or a variable operand followed by a tail.

    | Large_operand :: remaining_types ->
      let w = read_word (byte_addr_to_word_addr operand_address) in
      let tail = decode_operands (inc_byte_addr_by operand_address word_size) remaining_types in
      (Large w) :: tail
    | Small_operand :: remaining_types ->
      let b = read_byte operand_address in
      let tail = decode_operands (inc_byte_addr operand_address) remaining_types in
      (Small b) :: tail
    | Variable_operand :: remaining_types ->
      let b = read_byte operand_address in
      let v = decode_variable b in
      let tail = decode_operands (inc_byte_addr operand_address) remaining_types in
      (Variable v) :: tail

If we omitted Omitted, ha ha, then OCaml would give a warning that the set of patterns do not 100% cover all the possibilities. So to keep it quiet, even though you and I know that we never pushed an omitted operand onto the operand type list, we handle the case with a failure.

An alternative approach would have been to push omitteds onto the list of types, and then deal with them here, by, say, bailing out when we got there. I can see the elegance of such an approach but I don’t have a strong opinion one way or the other. I think this is fine.

    | Omitted :: _ ->
      failwith "omitted operand type passed to decode operands"

And we’re going to need to know how many bytes we consumed in operands. Two for every large operand, and one for every small or variable operand. (Again, this is not tail recursive, and again, I don’t care. It’s a list of 8 or fewer items.)

  let rec get_operand_length operand_types =
    match operand_types with
    | [] -> 0
    | Large_operand :: remaining_types -> word_size + (get_operand_length remaining_types)
    | _ :: remaining_types -> 1 + (get_operand_length remaining_types) in

All right, summing up: we now have the opcode, and we have all the operands tagged with whether they are large constants, small constants, local variables, global variables or stack pops. We know how many bytes the opcode, the operand types and the operands took up. Next time on FAIC: we’ll decode the store, if there is one.


7 thoughts on “Aragain falls

  1. ‘decode_operands’ is an excellent opportunity to break out…(checks ocaml documentation)… You’ve just mapped the obvious ‘decode_operand’ function over the operand_types list.

      • I had to cheat a bit (by letting a closure access a mutable variable), but I was able to get a (Rust) implementation using iterators/map by first mapping from the list (iterator) of the types to tuples of (Type, ByteAddress) and then mapping the tuples to the decoded Operand values.

        I suspect there’s some way to “fold into an array” using something like a tuple of (ByteAddress, Vec) as the accumulator for the fold operation, which would allow keeping all the state immutable and inside the iterator/map function chain.

  2. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #2052

  3. The language I’m playing along in (Swift) has been remarkably close to the OCaml so far, but pattern matching against arrays has finally broken it – so instead I pattern match against a tuple of (head, tail) instead – does OCaml’s list matching goes beyond just head/tail?

    • If it’s like Rust’s (experimental) version (which is based on OCaml) then there’s probably a way to match on some extra elements, something like:

      match some_array {
      [Large, middle.., Small] => {…}, // starts with Large, ends with Small, everything else is “middle”
      [front.., Small, Small] => {…}, // ends with two Smalls, everything else is “front”
      [Large, Large, middle.., Variable] => {…}, // start with two Large, end with Variable, everything else is “middle”

      [some_x, some_y, middle.., Large] => {…}, // start with any two operand_types some_x and some_y, and ends with Large,

      Of course the compiler will still require that there are no missing or overlapping cases.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s