This is a circular room with passages in all directions. Several have unfortunately been blocked by cave-ins.
> go southeast
Last time we implemented the state transition function; it takes in a zchar and a state, and produces a string fragment and new state. We can then modify our earlier byte-dumping debug method from a few episodes before to first, read the three zchars out of the current word, then use each zchar to compute the next string fragment and next state, and finally, if we’re at the end, return the accumulated string. If we’re not at the end then we bump up the word address and do a tail recursion:
let rec read story (Zstring address) = let process_zchar (Zchar zchar) state = (* Omitted; see previous episode *) let rec aux acc state1 current_address = let zchar_bit_size = size5 in let word = Story.read_word story current_address in let is_end = fetch_bit bit15 word in let zchar1 = Zchar (fetch_bits bit14 zchar_bit_size word) in let zchar2 = Zchar (fetch_bits bit9 zchar_bit_size word) in let zchar3 = Zchar (fetch_bits bit4 zchar_bit_size word) in let (text1, state2) = process_zchar zchar1 state1 in let (text2, state3) = process_zchar zchar2 state2 in let (text3, state_next) = process_zchar zchar3 state3 in let new_acc = acc ^ text1 ^ text2 ^ text3 in if is_end then new_acc else aux new_acc state_next (inc_word_addr current_address) in aux "" alphabet0 (Word_address address)
Code for this episode can be found here:
https://github.com/ericlippert/flathead/tree/blog5
A few things to note:
- the read method is marked as recursive because the nested process_zchar method calls it. This is the easiest way to get a mutually recursive function in OCaml – that is, two functions which do not call themselves, but each calls the other. It’s easiest to have one not-marked-recursive function as a nested function inside the marked-recursive outer function. (There are ways to get “sibling” methods to be mutually recursive but I understand this scenario to be relatively rare in OCaml.)
- I made a helper method fetch_bit that returns a single bit as a Boolean rather than a number.
- Look at how elegant it is to extract the values from a tuple returned by a function. Man, I wish C# had this feature.
- Comments in OCaml are marked by (* *) and they obey parenthesis nesting rules. So if you comment out something that already contains a comment, you don’t have to worry that your comment will end prematurely.
All right, we can now try out our zstring decoder. I happen to know – and how I know will have to be in a future episode – that there is a string at address 0xb106:
let () = let story = Story.load "minizork.z3" in let zstring = Zstring 0xb106 in let text = Zstring.read story zstring in Printf.printf "%s\n" text;
and sure enough:
"Flood Control Dam #3 was constructed in 783 GUE with a grant of 37 million zorkmids from Lord Dimwit Flathead the Excessive. This impressive structure is composed of 370,000 cubic feet of concrete, is 256 feet tall and 193 feet wide. The construction of FCD#3 took 112 days from ground breaking to dedication. It required a work force of 384 slaves, 34 slave drivers, and 12 engineers, 2345 bureaucrats, and nearly one million dead trees. As you start your tour, notice the more interesting features of FCD#3. On your right..."
I think we got it right this time.
And I almost forgot to mention that now we know what those “5” characters were doing at the end of the abbreviation for “the “. That string is four real characters long but there is room for six zchars in a four-byte zstring. The last two zchars are just harmless “alphabet shift” characters that contribute no text to the string, but make up the necessary two extra zchars.
Next time on FAIC: Now that we can decode strings, we can decode the dictionary the game uses to recognize user commands.
Pingback: Troll room | Fabulous adventures in coding
Isn’t the way text is stored in a story file excessively complicated? Was it due solely to compression purposes or were they thinking also in obfuscating the text as much as possible to avoid curious eyes from unveiling well kept secrets?
It does seem complicated, and the compression it achieves is not actually that great. But I gather that the by-design purpose was to make a fast-and-easy-to-decode compression scheme that could be implemented easily on a great variety of hardware, while still enabling some reduction in size. Keep in mind that most of the size of text adventure is its text. You really want to be able to fit one of these games onto as few floppy disks as possible; every additional disk greatly increases the marginal cost of the game.
The obfuscation I would think would be a nice side effect; there are plenty of other ways to obfuscate the code and text they could have chosen that would be as effective.
I wonder how long it took to feed the text of a story through the tools Infocom used to compress them, and how often the whole story would be run through (rather than just changed portions)? Experimenting to find the best story format would be easy with a modern development system, but the machines Infocom programmers were using back in the day were probably not nearly as nice.
I’ve been following along in Rust (which has been delightfully straightforward given its functional roots, and the fact that the first versions of rustc were *written* in OCaml) and got caught out by a subtle detail I’d missed.
Alphabet 2 (symbols and numbers) has *6* special codes/?s at the front, not 5 like the other two. I kept thinking I’d transposed some of the punctuation characters, “fixing” it, then finding that some other pair was wrong.
Hopefully this will save someone else a few minutes.
Correct — the rule is that 4 means “draw from alphabet 2”, and 5 means “draw from alphabet 3”, unless the 5 is followed by a 6. So 6 is only special in alphabet 3.
You’ve mentioned wanting tuple support and pattern matching in C#. Are you as excited as I am that the design team is considering those features (and non nullable references!) for C# 7? I can’t wait and hope it all makes it in.
I am super excited! Also ref-returning methods and ref locals are under consideration. I worked on prototypes of these features years ago; nice to see that they are finally getting traction.
Do you know what mechanism would be used to keep track of any objects whose fields are identified by a returned byrefs? If so, might be an interesting topic for a future blog post revisiting the return byref concept.
The CLR actually has support for this feature natively, and always has. It’s just never been exposed in a first-class way in a popular .NET language.
I think I wrote the article you want back in 2011: https://blogs.msdn.microsoft.com/ericlippert/2011/06/23/ref-returns-and-ref-locals/
Possibly stupid question – doesn’t that feature encourage sharing of mutable state, which is often a bad thing?
Yes it does. The feature is useful in more sort of “systems programming” scenarios where you want to manipulate references to memory in a manner that is more typesafe than raw pointers, but with similar performance characteristics.
I do not expect it will be heavily used in line-of-business programming.
Does .NET have the means to ensure the safety of a method that returns a byref to anything other than a storage location which is directly contained within an object which was passed to the method? I don’t think it did when that earlier post was written; much has been added to .NET since then, however, so perhaps there might be a means to make such things safe even if there wasn’t before? If so, that might make the subject worth a revisit.
I would think the safest approach from an implementation perspective would be to use auto-generated logic in a fashion vaguely analogous to the way compilers support closures and iterators, such that something like:
listThatExposedByrefs[23].X=Y;
would get translated into something like:
listThatExposesByrefs.get_as_byref(23,
(ref theType it, ref int p1)=>(it.X = p1), ref Y);
Having get_as_byref accept a generic byref would make it possible to pass any number of arguments to the inner function by value, merely by placing them in a temporary structure and passing a byref to that. Doing so would make it possible to use a static delegates rather than having to generate a closure every time the function is called, thus enabling the code to be quite efficient.
There are presently a lot of cases where the ability of arrays of structures to perform piece-wise updates “in-place” allows them to be much more efficient than any other data type, but where a variable-sized wrapper analogous to like “List” would be semantically preferable to a “bare” array. What would you think of that as an idea?
I asked the team that very question the last time I met with them and they said that the CLR team was investigating fixing the verifier so that it could be made both safe and verifiable. So that’s good news.
I think there are definitely use cases for more efficient mutable data types as well.
Pingback: The Morning Brew - Chris Alcock » The Morning Brew #2039