An interesting list structure, part 3

This is the story of how I was no-hired by Microsoft; surprisingly, it is germane to our discussion of unusual list structures.

I was a Waterloo co-op intern three times at Microsoft, each time working for the Visual Basic team; that’s where I learned how industrial developer tools work, how to build a localizable product, and many other valuable lessons. The VB team told me that on the strength of my internships — which, after all, was in total a 12 month long interview spread over two years — that they’d give me a job when I graduated. However, the VB managers were both smart and kind, and encouraged me to look around the company more broadly to see if there was any other team that I might be a good fit for.

I ended up interviewing with two or maybe three other teams; this was over 20 years ago and my memory is a bit hazy. A few points do stand out though. I had an interview with the ill-fated Blackbird team that I thought went pretty well — the technical question was about rapidly finding the seven letter words in a Scrabble rack, and I gave the standard solution of preprocessing the dictionary into a multi-dictionary that answers the question in O(1) time, and then discussed how to use a trie to solve the more general problem. I’m not sure why they no-hired me, but that team was about to be disbanded so I dodged a bullet there. In retrospect, this was over a year after the “tidal wave email” and I am surprised they lasted as long as they did.

But the thing that really stands out is my interview with the SQL Server team. The day I interviewed there was a snowstorm in the greater Seattle area, which is quite rare and there is not infrastructure to rapidly clear the streets. A grand total of zero of the people scheduled to interview me came into work that day, and so recruiting had to scramble to find interviewers. The technical question I got from the SQL team was “reverse a linked list in C++” — that was the extent of the problem statement, so I asked “Can I assume I already have a linked list class defined?”  Sure. “Is it acceptable if the original list is destroyed?” Sure. “OK,” I said, and wrote on the whiteboard something like:

LinkedList reverse(LinkedList list) 
{
  LinkedList result;
  while (!list->is_empty()) result.push(list.pop());
  return result;
}

I’m probably missing some syntactic details there; I haven’t programmed in C++ in fifteen years. But you get the idea; reversing a linked list is logically just popping the old list and pushing onto the new list.

The interviewer just said something like “that’s wrong”, and I said “no, that’s right but would you like me to implement push and pop to show you how they work?” and it went downhill from there.

That was super confusing to me at the time. I realized much later — like, years later, when I started doing interviews myself — that the interviewer was unprepared, was not expecting to do an interview that day, may have been a very inexperienced interviewer, probably picked a problem at random, and worst of all had a specific “correct” solution in mind. Almost certainly the “correct” solution was the solution where you traverse the list and rejigger the “next” pointer “in place” as you go, which of course is how you reverse a linked list in C. (I am often surprised at the number of people in interviews who write C-style solutions in C++ or C#, as though those languages did not have a culture of encapsulation of implementation details.)

So it was a confusing day and a bunch of no-hires, but no harm done. Obviously I ended up having a successful 16 year stint at Microsoft regardless, and I learned from example how not to conduct a technical interview.

All this was on my mind as I was writing this series because the original paper by Hughes that introduced the concept I’ve been discussing is called A novel representation of lists and its application to the function “reverse”.

What on Earth does “reverse a linked list” have to do with this unusual representation of linked lists? I won’t go through the whole argument presented in the Hughes paper because it gets a bit technical, but I’ll give you what was for me the key insight.

Suppose we have our “simple” linked list from last time, where a list is either empty, or a head value and a tail list. How do you reverse that linked list? The problem here is posed in the context of functional languages where we do not have mutable local variables and loops, so a recursive solution is required.

If we have an “append” operation available then the naive recursive way to reverse a simple list is to say “recursively reverse the tail, and then append the old head to the reversed tail”. That is, if we have 1, 2, 3, then we recursively reverse 2, 3 into 3, 2, and then append 1 to get 3, 2, 1. In our implementation that would be:

public SimpleList<T> NaiveReverse() =>
  this.IsEmpty ?
    this :
    this.Pop.NaiveReverse().Append(this.Peek);

The problem with that should be clear given our previous discussion. We recurse once per list element, so we recurse O(n) times. But Append is an O(n) operation on simple lists, so the total complexity is O(n2), which is pretty bad.

The standard O(n) solution is to emulate my little loop above, but non-destructively and recursively:

private SimpleList<T> ReverseWithTail(SimpleList<T> tail) =>
  this.IsEmpty ?
    tail :
    this.Pop.ReverseWithTail(tail.Push(this.Peek));
public SimpleList<T> Reverse() => this.ReverseWithTail(Empty);

But… hold on a minute here. If we said

Func<SimpleList<T>, SimpleList<T>> f = someList.ReverseWithTail;

Then what we’ve got is an object that represents a list — in this case, it represents the list that is someList reversed — where the object is a function such that when you give it a list, it gives you back the represented list concatenated with the given list, in O(n) time; a delegate to ReverseWithTail is exactly the data stored in a Hughes list!

So perhaps it is not that bizarre to represent a list as a concatenation function. Functional programs are chock-full of algorithms like this one, where the “tail end” of a list is being passed in and the algorithm represents the operation of stuffing values onto the left side of it. The genius of the Hughes list is to recognize that this technique can be generalized into an elegant, useful data type that defers expensive work until it is needed.


Next time on FAIC: System.Random is the source of so many bugs; could we come up with a better design? This is going to be a long series, because there’s a lot of improvements we could make. We’re not yet programming at the correct level of abstraction when it comes to dealing with randomness; let’s see how we can fix that.

9 thoughts on “An interesting list structure, part 3

  1. As a hobby project I worked on porting boost::random to C# while trying to understand the algorithms involved. I liked it very much – are you going to publish such a library?

    • Great question.

      A lot of the ideas I’m going to discuss are also characteristics of the boost library: the idea that we have objects that represent distribution-of-T, the idea that we have well-named objects so if you want a binomial, you say you want a binomial, and so on.

      However, rather than taking a comprehensive overview of how to develop a fully-featured library like boost’s random, I’m going to more dig into how we can represent common operations on distributions more naturally, and how we can use features already in C# to express probabilistic workflows.

  2. I can relate to the SQL Server interview story. I once showed up to an all day interview at a studio the morning after they shipped their latest product, and, as I understand, the first day off any of the team had in some time. To complicate things, the recruiter I had been working with had a death in the family just a few days earlier (I only learned this later) and communication on their end had broken down so nobody in the office was expecting me.
    Long story short: I show up, people have to be called in to interview me that are some combination of tired, unprepared, and not exactly happy to be there. One of the guys on the second panel asked a question about how the fields for a class are arranged in memory in some scenario with sub-classing. I responded that I didn’t actually know what was done in practice and would look it up if I found myself in a scenario where it mattered but offered one potential layout where the child class’s fields preceded the base class’s fields. He attacked the suggestion as unworkable and claimed it was required to be the other way around because in cases where an instance of the subclass is passed to a function accepting the base the fields all needed to be at the same offset. This prompted me to start asking my own questions like what happens with multiple inheritance? Since both base classes can’t be first, if we pass the child instance to a function expecting BaseB and BaseA is first we run into the same problem… The group as a whole then went on to debate the pros and cons of each with him getting more snippy as the session progressed. ( I looked it up when I got home the next day, turned out to be up to the implementation according to the specification, but most of the compilers of the day did follow the layout he was championing. )

    I didn’t get that job. It also turns out to be important to not come across as arguing with your interviewer.

    • Yuck. First rule of interviewing is *the person you are interviewing is a person*, so treat them like a person. Moreover, that person might also be your future coworker, and might be a future customer, and might be a current or future influencer of other customers or coworkers, so BE NICE TO THEM FOR PETE’S SAKE.

      Second, if the candidate makes a small mistake, that is a huge opportunity! I love it when candidates make mistakes because that gives you a chance to see how they engage with others when working together to solve a problem.

      Third, an interviewer who does not have thorough and deep understanding of the material they’re asking about is asking the wrong question. “How might you lay out memory in a multiple inheritance language?” is an awesome interview question, but if you don’t yourself know the answer at a great level of detail, don’t ask the question of a candidate. I would not ask that question, as I am not an expert in compiling multiple inheritance languages. I might ask how a language like C# or Java could do multiple inheritance of interfaces, since I do understand that at some level of detail.

  3. The interesting bit for me is how you can improve functions automatically, e.g. if you change NaiveReverse to have type SimpleList -> HughesList but write it in the same naive way, then compose with ToSimple, you get ReverseWithTail for free.

    The really interesting bit is the (very loose) analogy with Church-encoded free monads where you go “up a kind”:
    monoid : monad
    list : free monad
    Hughes list : Church-encoded free monad

  4. I guess it’s only fitting that about half the comments will relate to the first half of the post. 🙂

    I’ve interviewed a lot, mostly at Microsoft. Other than the first time, and then again 20 years later returning to the company, the Microsoft interviews were all internal interviews (i.e. moving from one group to another). I’ve had a few odd-ball interviewers here and there, but on the whole I’ve been pretty lucky and have had respectful, engaging interviewers. Both times in my recollection when I’ve had to correct an interviewer, they were gracious about it.

    The first time was my original hire, coming out of college. The interviewer asked me to write some list functions in C. I confused him by using pointers-to-pointers in the implementation; I hate implementations that special-case the root pointer, which is I guess what he was expecting, and pointers-to-pointers avoid that (just pass the address of the variable containing the root pointer).

    Ironically, the other time I can recall was on that returning interview (i.e. so the other “not internal” interview). By then, I’d gotten reasonably good at C# and .NET. An interviewer asked me to write some thread-safe code (I think it was a producer/consumer implementation or something like that), which I did using “lock” and the Monitor class, with calls to “Wait()” and “Pulse()”. I had to explain to him that Wait() released the lock until the signal came; he initially complained that I’d deadlocked my threads!

    But in both cases, each interviewer accepted my explanations, taking the misunderstanding in stride. Which as I learned through experience later is not at all a common human response (defensiveness and confrontation are much more common). All credit due to them for handling it with professionalism! 🙂

  5. These Hughes lists are crazy. Thanks for sharing! My brain will need a few passes over this to internalize these ideas. I find that letting a few days pass is sometimes a huge accelerator for learning.

    It is scientifically known that the brain records reality into a buffer structure (the Hippocampus). It then gradually flushes that data into a semantic index (the Cortex) over the course of multiple days. So maybe that’s what’s going on.

    > I am often surprised at the number of people in interviews who write C-style solutions in C++ or C#, as though those languages did not have a culture of encapsulation of implementation details.

    This is a piece of wisdom. These C programmers often bring their C ideas into C#. But these ideas often were wrong even in the C setting to begin with!

    Performance can justify all kinds of contortions. But even though C tends to be used in high performance scenarios I have to believe that most C code is cold, right? No need for this low level of abstraction in most places.

  6. When you refer to using a multi-dictionary, I’m assuming this means a dictionary with multiple words per key and using the alphabetically sorted set of letters in the Scrabble rack as the key? Since the rack is known to be exactly 7 characters, I’d still consider this constant time.

    Or am I missing something?

    • That’s exactly right. The specific problem was to find all the seven-letter words in a seven-letter rack. Since the rack is of a fixed size, and the dictionary is a hash table, we can say that sorting the rack into the key is O(1) and the lookup is O(1). Of course, the real problem faced by a Scrabble AI is to find the words on a rack that fit into the board, which is a rather more difficult problem.

Leave a comment