February | 2021 | Fabulous adventures in coding

Here we go again!

Fellow BASIC enthusiast Jeff “Coding Horror” Atwood, of Stack Overflow and Discourse fame, has started a project to translate the programs in the 1978 classic BASIC Computer Games into more modern languages. I never had a copy of this one — my first computer book was Practise Your BASIC — but these programs are characteristic of the period and I am happy to help out.

1978 was of course well into the craze of writing Life simulators for home personal computers. In this episode I’ll break down the original program assuming that you’re not a BASIC old-timer like me; this language has a lot of quirks. You can read the commentary from the book here, and f you want to run the original program yourself the copy here is a valid Vintage BASIC program.

Let’s get into it!

2 PRINT TAB(34);"LIFE"
4 PRINT TAB(15);"CREATIVE COMPUTING  MORRISTOWN, NEW JERSEY"
6 PRINT: PRINT: PRINT
8 PRINT "ENTER YOUR PATTERN:"

All lines are numbered whether they need to be or not. Line numbering is often used to indicate program structure; we’ll come back to this in a bit.
Lines are not one-to-one with statements! A single line may contain several statements separated by colons.
PRINT takes a semicolon-separated list of zero or more expressions.
PRINT outputs a newline by default; as we’ll see presently, there is a way to suppress the newline.
TAB(x) does not produce a string; it moves the “print cursor” to column x. In many versions of BASIC, TAB can only appear as an argument to PRINT.

9 X1=1: Y1=1: X2=24: Y2=70
10 DIM A(24,70),B$(24)
20 C=1

Variable names are typically one or two characters.
24 x 70 is the size of the Life board that we’re simulating here; this was quite a large screen by 1978 standards. I owned both a Commodore 64 and a Commodore PET with 40×25 character screens; it was possible to get an 80-column PET but they were rare.
Note that the authors have made the bizarre choice throughout that (X, Y) coordinates refer to (line, column) and not the more standard (column, line) interpretation. Normally we think of X as being the left-right coordinate, not the top-bottom coordinate.
Single-valued variables need not be declared.
Array-valued variables are “dimensioned”, and the dimensions typically must be literal constants. Here we have a two-dimensional array and a one-dimensional array.
Anything that is string-valued has a $ suffix, so B$ is an array of strings.
Arrays and strings are both indexed starting from 1.
As we know from early episodes in this series, a naïve implementation needs to keep track of both the “current” and “next” state. In a memory-constrained environment such as can be assumed by a 1978 BASIC program’s author, we need to keep all that in the same array. Spoiler alert: the convention used by this program for the contents of array A is:
- 0 means cell is dead, will stay dead.
- 1 means cell is alive, will stay alive.
- 2 means cell is alive, will die.
- 3 means cell is dead, will come alive.
You might have noticed that the program is formatted to have very few unnecessary spaces. Having become accustomed to vertical and horizontal whitespace being used to reduce eyestrain and guide the reader, it looks dense and hard to read. This was not a stylistic choice though; it was imposed by the limitations of the technology of the time. Line lengths were constrained, often to the width of the screen, and when you have only 40 or 80 characters of screen, unnecessary spaces are bad. But more interesting from my perspective as a compiler writer is that implementations of BASIC often tokenized as the user typed in the program, and then stored only the tokens in memory, not the original source code as it was typed. When the program was listed back, the program printer reconstituted the program from the token stream which had discarded the spaces. Pre .NET versions of VB did this!

30 INPUT B$(C)
40 IF B$(C)="DONE" THEN B$(C)="": GOTO 80
50 IF LEFT$(B$(C),1)="." THEN B$(C)=" "+RIGHT$(B$(C),LEN(B$(C))-1)
60 C=C+1
70 GOTO 30

INPUT takes a string from the console and places it in the given variable.
Arrays are indexed using parentheses.
This is a “while” loop but most BASICs did not have loop structures beyond FOR/NEXT, or even ELSE clauses for IF/THEN statements. You had to write GOTO statements to build those control flows, as shown here. Notice that all the colon-separated statements execute if the condition of the IF is met; otherwise we continue on to the next statement. (I was fortunate to have access to an original PET with Waterloo Structured BASIC which did have while loops, though I did not understand the point of them when I was in elementary school. Many years later I ended up working with some of the authors of Waterloo BASIC in my first internships at WATCOM, though I did not realize it at the time! The whole saga of Waterloo Structured BASIC and the SuperPET hardware is told here.)
LEFT$, RIGHT$ and LEN do what they say on the tin: give you the left substring of given length, right substring of given length, and length of a string. Strings are not directly indexable.
+ is both the string concatenation operator and the arithmetic addition operator.
It seems clear that we are inputting strings and sticking them in an array until the user types DONE, but why are we checking whether the first character in the string is a period, and replacing it with a space? It is because in some versions of BASIC, INPUT strips leading spaces, but they are necessary when entering a Life pattern.
Did you notice that nothing whatsoever was explained to the user of the program except “enter your pattern”? Enter it how? Why should the user know that spaces are dead, leading spaces are soft, a period in the first place is a hard space but a live cell anywhere else, and that you type DONE when you’re done? The assumption of the authors is that the only person running the code is the person who typed it in. “You should explain as little as possible” is not a great attitude to drill into beginner programmers.
Notice that we have no bounds checking. B$ was dimensioned to 24 elements; if the user types in more than 24 lines in the pattern, the program crashes. Similarly we have no checks on the lengths of the strings themselves. A whole generation of young programmers was taught from their very first lesson that crashes are the user’s fault for doing something wrong.
Our program state by the time we break out of the loop is: B$ has C lines in it, and B$(C) is an empty string.

80 C=C-1: L=0
90 FOR X=1 TO C-1
100 IF LEN(B$(X))>L THEN L=LEN(B$(X))
110 NEXT X

Since the last line is always blank, we are reducing the line count C by one.
The maximum line length L is initialized to zero, unnecessarily. I prefer an unnecessary but clear initialization to the lack thereof. But as we’ll see in a moment, some variables are just assumed to be initialized to zero, and some are initialized twice needlessly. More generally, programs that were designed to teach children about programming were chock full of sloppy inconsistencies and this is no exception.
FOR/NEXT is the only structured loop in many early BASICs. The loop variable starts at the given value and executes the body unless the loop variable is greater than the final value. Some versions of the language also had an optional STEP clause, and some would even keep track of whether the loop was counting up or down. Fancy!
Plainly the purpose of the FOR/NEXT loop here is to find the maximum line length of the pattern input by the user, but it appears to have a bug; we have strings in B$ indexed from 1 through C, but we are skipping the length check on B$(C). The subtraction here appears to be a mis-edit; perhaps the C=C-1 was moved up, but the developer forgot to adjust the loop condition. The bug only affects inputs where the last line is also the longest.
Is it inefficient to compute LEN(B$(X)) twice in the case where the maximum length L needs to be updated? Many versions of BASIC used length-prefixed strings (as do all versions of VB and all .NET languages) and thus computing length is O(1). When I first learned C as a teenager it struck me as exceedingly weird that there was no out-of-the-box system for length-prefixing a string. And it still does.

120 X1=11-C/2
130 Y1=33-L/2
140 FOR X=1 TO C
150 FOR Y=1 TO LEN(B$(X))
160 IF MID$(B$(X),Y,1)<>" " THEN A(X1+X,Y1+Y)=1:P=P+1
170 NEXT Y
180 NEXT X

There are no comments in this program; it would be nice to explain that what we’re doing here is trying to center the pattern that the user has just typed in to the interior of a 24 x 70 rectangle. (X1, Y1) is the array coordinate of the top left corner of the pattern as it is being copied from B$ to A; spaces are kept as zero, and non-spaces become 1. This automatic centering is a really nice feature of this program.
Once again we have no bounds checking. If L is greater than 67 or C is greater than 23, bad things are going to happen when we index into A. (Though if C is greater than 23, we might have already crashed when filling in B$.)
We already initialized X1 and Y1; we have not read their values at any point before they are written for a second time. By contrast, the population count P is accessed for the first time here and is assumed to be initialized to zero. Again, there is some amount of sloppiness going on here that could have been easily removed in code review.

200 PRINT:PRINT:PRINT
210 PRINT "GENERATION:";G,"POPULATION:";P;: IF I9 THEN PRINT "INVALID!";
215 X3=24:Y3=70:X4=1: Y4=1: P=0
220 G=G+1
225 FOR X=1 TO X1-1: PRINT: NEXT X

The input and state initialization phase is done. This is the start of the main display-and-compute-next-generation loop of our Life algorithm. The subtle indication is that the line numbering just skipped from 180 directly to 200, indicating that we’re starting a new section of the program.
Notice that two of our PRINT statements here end in semicolons. This suppresses the newline at the end. Notice also that the separator between G and “POPULATION:” is a comma, which instructs PRINT to tab out some whitespace after converting G to a string and printing it.
I9, whatever it is, has not been initialized yet and we are depending on it being zero. There is no Boolean type; in BASIC we typically use zero for false and -1 for true. (Do you see why -1 for true is arguably preferable to 1?)
We know that (X1, Y1) is the top left corner of the “might be living” portion of the pattern inside array A. (X2, Y2) and (X3, Y3) both appear to be the bottom right corner of the array, both being (24, 70) at this point, and (X4, Y4) is (1, 1), so it is likely another “top left” coordinate of some sort. Maybe? Let’s see what happens.
We reset the living population counter P to zero and increase the generation count G by one.
We then print out X1-1 blank lines. This implementation is quite smart for a short 1978 BASIC program! It is tracking what subset of the 24×70 grid is “maybe alive” so that it does not have to consider the entire space on each generation.
We’re continuing with the pattern established so far that X and Y are loop variables. Thankfully, this pattern is consistent throughout the program.
The assignment of P on line 215 is redundant; we’re going to assign it zero again on line 309 and there is no control flow on which it is read between those two lines.

230 FOR X=X1 TO X2
240 PRINT
250 FOR Y=Y1 TO Y2
253 IF A(X,Y)=2 THEN A(X,Y)=0:GOTO 270
256 IF A(X,Y)=3 THEN A(X,Y)=1:GOTO 261
260 IF A(X,Y)<>1 THEN 270
261 PRINT TAB(Y);"*";
262 IF X<X3 THEN X3=X
264 IF X>X4 THEN X4=X
266 IF Y<Y3 THEN Y3=Y
268 IF Y>Y4 THEN Y4=Y
270 NEXT Y
290 NEXT X

This is where things start to get interesting; this nested loop does a lot of stuff.
We are looping from (X1, Y1) to (X2, Y2), so this establishes the truth of our earlier assumption that these are the upper left and bottom right coordinates of the region of array A that could have living cells. However, note that the authors missed a trick; they set up (X1, Y1) correctly in the initialization phase, but they could have also set (X2, Y2) at that time as well.
We start with a PRINT because all the PRINTs in the inner loop are on the same line; we need to force a newline.
We update from the current state to the next state; as noted above, if current state is 2 then we were alive but we’re going to be dead, so we set the state to 0. Similarly, if current state is 3 then we were dead but are coming alive, so the state is set to 1.
It’s not clear to me why the test on line 260 is a not-equal-one instead of an equal-zero. There are only four possible values; we’ve considered two of them. It’s not wrong, it’s just a little strange.
In all cases where the cell is dead we GOTO 270 which is NEXT Y. Though some BASIC dialects did allow multiple NEXT statements for the same loop, it was considered bad form. The right thing to do was to GOTO the appropriate NEXT if you wanted to “continue” the loop.
Notice that there’s a syntactic sugar here. IF A(X,Y)<>1 THEN 270 elides the “GOTO”.
If we avoided skipping to the bottom of the loop then the cell is alive, so we tab out to the appropriate column and print it. Then we finally see the meaning of (X3, Y3) and (X4, Y4); as surmised, they are the upper left and bottom right corners of the “possibly alive” sub-rectangle of the next generation but I guessed backwards which was which. (X1, Y1), (X2, Y2) are the sub-rectangle of the current generation.
The line numbering pattern seems to have gone completely off the rails here and in the next section. This often indicates that the author got the code editor into a state where they had to introduce a bunch of code they did not initially plan for and did not want to renumber a bunch of already-written code that came later. The convention was to number lines on the tens, so that if you needed to come back and insert code you forgot, you had nine places in which to do it. Were I writing this code for publication, I would have taken the time to go back through it and renumber everything to a consistent pattern, but it really was a pain to do so with the editors of the era.

295 FOR X=X2+1 TO 24: PRINT: NEXT X
299 X1=X3: X2=X4: Y1=Y3: Y2=Y4
301 IF X1<3 THEN X1=3:I9=-1
303 IF X2>22 THEN X2=22:I9=-1
305 IF Y1<3 THEN Y1=3:I9=-1
307 IF Y2>68 THEN Y2=68:I9=-1

We’ve now processed the entire “currently maybe alive” rectangle so we print out the remaining blank lines to fill up the screen.
We copy (X3, Y3) and (X4, Y4), the “next generation” sub-rectangle to (X1, Y1), (X2, Y2) and it becomes the current generation.
Here we do something really quite clever that none of the implementations I looked at in my previous series handled. The authors of this algorithm have implemented a “rectangle of death” as a border of the array; that is a pretty standard way of handling the boundary condition. But what I really like is: they detect when a living cell hits the boundary and set flag I9 to true to indicate that we are no longer playing by infinite-grid Life rules! This flag is never reset, so you always know when you are looking at the UI that this is possibly not the true evolution of your initial pattern.

309 P=0
500 FOR X=X1-1 TO X2+1
510 FOR Y=Y1-1 TO Y2+1
520 C=0
530 FOR I=X-1 TO X+1
540 FOR J=Y-1 TO Y+1
550 IF A(I,J)=1 OR A(I,J)=2 THEN C=C+1
560 NEXT J
570 NEXT I
580 IF A(X,Y)=0 THEN 610
590 IF C<3 OR C>4 THEN A(X,Y)=2: GOTO 600
595 P=P+1
600 GOTO 620
610 IF C=3 THEN A(X,Y)=3:P=P+1
620 NEXT Y
630 NEXT X

Finally, we’ve got to compute the next generation. Note that we had a corresponding sudden increase in line number to mark the occasion.
We reset the population counter to zero and we loop over the currently-maybe-alive rectangle expanded by one cell in each direction, because the dead cells on the edge might become alive.
Variable C before was the number of valid lines in B$, our string array. Now it is the number of living neighbours of cell (X, Y). Even when restricted to two-character variables, they are in fact plentiful and there is no need to confusingly reuse them.
We count the number of living cells surrounding (X, Y) including (X, Y) itself, remembering that “is alive, stays alive” is 1, and “is alive, dies” is 2. Once we have the count then we have our standard rules of Life: if the cell is currently dead and the neighbour count is 3 then it becomes alive (3). If it is currently alive and the neighbour count including itself is not 3 or 4 then it becomes dead (2). Otherwise it stays as either 0 or 1.
We have a GOTO-to-GOTO bug here. That GOTO 600 could be replaced with GOTO 620 and save a statement.

635 X1=X1-1:Y1=Y1-1:X2=X2+1:Y2=Y2+1
640 GOTO 210
650 END

We did not track whether any cell on the border became alive, so the code makes the conservative assumption that the maybe-living-cells-in-here rectangle needs to be grown one cell on each side. Smart… or… is it?
Take a look at the logic on line 635, and then compare it to the looping constructs on lines 500 and 510. We loop from x1-1 to x2+1; we nowhere else read or write x1 or x2, and as soon as the loop is done, we reassign x1 to x1-1 and x2 to x2+1. It would have made more sense to do the increments and decrements first, and then do the loops!
Our program then goes into an infinite loop. This was common in BASIC programs of the time; when you wanted it to end, you just pressed RUN-STOP. I mean, what else are you going to do? Prompt the user? Poll the keyboard? Just run and the user will stop you when they want it stopped.
Some dialects of BASIC required an END statement even if it was unreachable.
Notice that there was never any “clear the screen” control here. It just constantly prints out new screens and hopes for the best as the old screen scrolls off the top!

I hope you enjoyed that trip down memory lane as much as I did. I am extremely happy to have, you know, while loops, in the languages I use every day. But I do miss the simplicity of BASIC programming some days. It is also quite instructive to see how you can implement all this stuff out of IF X THEN GOTO when you need to — and when you’re writing a compiler that takes a modern language to a bytecode or machine code, you really do need to.

Next time on FAIC: I’ll write this algorithm in idiomatic modern C# and we’ll see how it looks.

What regular work activity has the highest impact on the organization in the least amount of time and effort?

I haven’t done any science on this, but anecdotally it sure feels like recruiting, interviewing and mentoring are all huge impact-per-time compared to technical stuff like writing code. I can think of people I spent a few hours convincing to interview who ended up making multiple decades of contributions to Microsoft, for instance. Encouraging good hires and then helping grow each other’s skills are multipliers.

Why then, are so many companies so bad at helping their employees recruit their talented friends?

I don’t know! There’s got to be some perverse incentive somewhere that makes these processes broken. I was asked recently by a reader to share a “war story” on this subject. (I think I have told this story before but if I did I can’t find it, so here you go again.) Today’s story takes place about twenty years ago, and I sincerely hope things have improved at Microsoft in the intervening decades.

A friend of mine, let’s call them B, who I knew to be a talented software engineer with great technical PM skills was looking to change companies; I happened to know of a team in devdiv that needed someone with their exact skill set, and so I submitted an employee referral into the system. B got an interview and accepted an offer, and I was very happy right up until a few weeks later when I got a package in interoffice mail. The package contained:

A single sticky-paper gold star, like a primary school teacher puts on a perfect quiz.

A flimsily-built off-brand miniature lava lamp knockoff. (For my younger readers: a lava lamp is a novelty lamp in which the bulb is below a conical closed glass vessel containing oil and wax; the lamp melts the wax which then circulates in a sort of psychedelic convection pattern. There was a lava lamp fad in the late 1960s and early 1970s; I remember my parents had a blue lava lamp when I was a very small child. There was a (very brief!) resurgence in this fad in the late 1990s and it was common at Microsoft for lava lamps to be given out as funny prizes. I had three in my office at one point.)

A poorly photocopied note. Obviously I do not recall the exact wording of the note but I can very easily give you the gist. It was something like:

“You’re a recruiting all-star! Thanks for your successful referral. Here’s a gold star to put on your office door to let everyone know that you’re an all-star! Please fill out a survey on your referral experience at [internal site]”

Sigh.

They invited me to give them feedback and so I did. Rather than filling out the survey, I sent the head of recruiting a personal email. That of course is lost to the mists of time, but again, I can certainly give you the idea. It went something like:

Dear X, thank you for the off-brand lava lamp and gold star. As requested, I’m providing feedback on my experience of referring B to the Z team.

I do not require either a reward or recognition for successfully referring a friend. The reward is that I now get to work with someone great who will help us achieve our division’s goals. But if you are going to reward that behaviour, you could do a lot better than a literal paper gold star, a photocopied form letter, and a five dollar junk store lamp.

You could, for instance, pay out a bonus for successful referrals. You know better than I do that sourcing talent is expensive. You would pay in excess of 10% of the salary of my new employee friend if they had been sourced by an outside talent agency. You could pay employees for referrals at, say 2% of salary. We’d feel genuinely appreciated and incentivized, and you’d pay a fraction of what it would normally cost. A cheap lamp and a sticky paper star makes it seem like the company cares very little that I put in this effort, and I know that’s not your intention.

You know what I would find even more motivating than a bonus? Instead of a photocopied impersonal form letter from recruiting, you could encourage the hiring manager to send me and my manager a personal email that says “Thank you so much Eric for helping recruit B to our team; we really needed someone with that skill set and I am looking forward to working with them. I will remember that you helped out our team in the next performance review cycle.” The documented goodwill of a hiring manager in devdiv is not something I can buy with money, but it is valuable to my career.

I was young and naive. I expected a reply to my thoughtful, well-intentioned, solicited critique; I got none. I also never got another gold star from recruiting, so maybe that message landed; I don’t know.

I’d be curious to know if any of my readers have received similar “awards” that send the opposite message as was intended. Leave a comment if you have!

Next time on FAIC: I have not forgotten that I said I’d blog more about Bean Machine, but before I do that I’d like to share some thoughts on a Python library I’ve been developing as part of that effort. I am a novice Python programmer and I’d love to get some feedback from the experts and novices among my readers alike.

Fabulous adventures in coding

Eric Lippert's blog

Monthly Archives: February 2021

Life, part 38

Hey now, you’re an all-star