What is lexical scoping?

Happy Eliza Doolittle day all; today seems like an appropriate day for careful elocution of technical jargon. So today, yet another question about “scope”. It is one of the more overused jargon terms in programming languages, and I get a lot of questions about it.

I’ll remind you all again that in C# the term “scope” has a very carefully defined meaning: the scope of a named entity is the region of program text in which the unqualified name can be used to refer to the entity.[1. Scope is often confused with the closely related concepts of declaration space (the region of code in which no two things may be declared to have the same name), accessibility domain (the region of program text in which a member’s accessibility modifier permits it to be looked up), and lifetime (the portion of the execution of the program during which the contents of a variable are not eligible for garbage collection).]

For example, the scope of a local variable is the text of the block which declares it. The scope of a private method is the text of the body of the class or struct which declares it. And so on; the C# specification has careful definitions which define the scope of everything that has a name.

The word “lexical” means, in a broad sense, “relating to text”, and clearly we have defined “scope” as being a relationship involving text, so is this kind of scoping also called “lexical scoping”?

Sort of, but not exactly. Let me explain.

Programming languages can be broadly divided into two categories: the lexically scoped languages and the dynamically scoped languages. The difference between the two is: in a lexically scoped language, the meaning of an unqualified name can be completely determined by looking at the program text; the analysis can be done “statically”. In a dynamically scoped language the meaning of an unqualified name can change at runtime; the name analysis can only be done “dynamically”.

Let me give you an example; it is easiest to show this with lambdas.

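using System;
class C {
  public static Func<int> M() {
    int x = 123;      // the x that the lambda's text refers to
    return () => x;
  }
}
class P {
  static void Main() {
    int x = 456;      // a different local that happens to share the name
    Func<int> f = C.M();
    Console.WriteLine(f());
  }
}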

The question is: what gets printed out? C# is a lexically scoped language, so the meaning of x in the lambda is determined at compile time, by analyzing the text where the lambda was written. C# prints out 123. If C# were a dynamically scoped language then the meaning of x would be determined by analyzing the location where the delegate was executed at runtime, so it would print out 456.
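To make the hypothetical concrete, here is a minimal sketch of how name lookup would behave under dynamic scoping, simulated in JavaScript with an explicit stack of active bindings. The names `scopeStack`, `lookup`, and `withBindings` are invented for this illustration; no real language runtime is being quoted.

```javascript
// Simulated dynamic scoping: names are resolved against whatever
// bindings are active at the moment of the call, not where the code was written.
const scopeStack = [];

// Search the most recently activated bindings first.
function lookup(name) {
  for (let i = scopeStack.length - 1; i >= 0; i--) {
    if (Object.prototype.hasOwnProperty.call(scopeStack[i], name)) {
      return scopeStack[i][name];
    }
  }
  throw new ReferenceError(name + " is not bound");
}

// Activate the bindings while body runs, then discard them.
function withBindings(bindings, body) {
  scopeStack.push(bindings);
  try {
    return body();
  } finally {
    scopeStack.pop();
  }
}

// Analogue of C.M: x = 123 is active only while M is executing.
function M() {
  return withBindings({ x: 123 }, () => () => lookup("x"));
}

// Analogue of P.Main: x = 456 is active when the returned function runs.
function Main() {
  return withBindings({ x: 456 }, () => {
    const f = M();
    return f();
  });
}

const dynamicResult = Main(); // 456
```

By the time `f` runs, the bindings of `M` have already been popped, so the lookup finds the caller's `x`. A real closure captures the variable from the text where it was written instead, which is why the actual program prints 123.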

Dynamically scoped languages essentially make the C# definition of scope useless; any method that executes a lambda in a dynamically scoped language makes the region of program text in which its locals can be referred to by their names arbitrarily large.

JavaScript, though a very dynamic language, is actually for the most part lexically scoped:

function M() {
  var x = 123;
  return function () { return x; };
}
function N() {
  var x = 456;
  var f = M();
  print(f()); // 123
}

However, JavaScript does have one feature which makes it dynamically scoped:

function Q(y) {
  var x = 123;
  with (y)
    return x;
}
print(Q({ x : 456 })); // 456
print(Q(789));         // 123

Here the meaning of the unqualified name x changes at runtime depending on whether y has a member x or not. For this reason I recommend avoiding the with block in JavaScript; it makes it hard for the reader to understand the meaning of the program.
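One way to avoid the with block is to spell out the member check, so that every name in the function can be resolved by reading the text. The sketch below is a rough equivalent of Q, not an exact one; `QExplicit` is a name invented for this example.

```javascript
// Hypothetical with-free rewrite of Q: the fallback to the local is explicit.
function QExplicit(y) {
  var x = 123;
  // Object(y) mirrors the way with converts a primitive such as 789 to an object.
  // Assumption of this sketch: null and undefined count as "no member x",
  // whereas a with block would throw a TypeError for them.
  if (y !== null && y !== undefined && "x" in Object(y)) {
    return y.x; // plainly a member of y
  }
  return x;     // plainly the local
}

const fromMember = QExplicit({ x: 456 }); // 456
const fromLocal = QExplicit(789);         // 123
```

Strict mode code rejects the with statement outright, so this explicit form is also the one that keeps working there.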

Most modern languages are lexically scoped. Experience has shown that lexical scoping is easier on all concerned: developers and maintenance programmers have an easier time understanding lexically scoped languages, and compiler developers have an easier time writing efficient compilers. Some variants of Lisp use dynamic scoping, though Scheme requires lexical scoping. There are still a few dynamically scoped languages in common usage, though; PostScript, the programming language which runs on printers, is perhaps the most commonly used of them.

Next time on FAIC: We solve the mystery of the inserted method.

16 thoughts on “What is lexical scoping?”

  1. If the JavaScript function M() has x = 123 rather than var x = 123, then it will print 456. And the scope is dynamic, right?

    • Looks better than my own technique, which, seven years after its inception, has finally been detected by virus scanners as malicious code. Probably a good thing everyone I ever demonstrated it to called it overly complicated and decided not to adopt it.

  2. It’s interesting that you mention PostScript. It’s been a while since I used it, but I recall that the language had a stack to handle nested identifier contexts. If the code for a routine were to start by pushing a context containing all its identifiers, and end by popping it, the routine would behave as though its identifiers were lexically scoped. What makes PostScript “interesting” compared to other languages is that program execution, function arguments/local variables, and identifier contexts are all handled by independent stacks, while many languages keep all three in sync(*). Additionally, PostScript has a “bind” procedure which, given an array of tokens, will replace any token the system knows about with a direct reference to the thing the token refers to at that moment.

    (*) Many language implementations use one stack for both call/return flow and auto variables/parameters, and resolve identifier contexts at compile time, but that’s an implementation detail. Some embedded systems use separate stacks for call/return flow and variables; this is transparent to the programmer, but may make such systems more resistant to stack overflow exploits.

  3. The scope of “this” in JavaScript is kind of dynamically bound as well:

    function foo() { return this; }
    var a = { f : foo };
    var b = { f : foo };

    a.f() will return a, and b.f() will return b.

    • Well, you can think of “this” as simply being an argument that is passed invisibly. That is, if you wrote:

      function foo($this) { return $this; }
      var a = { f : foo };

      then it is no surprise that ‘a’ is returned.
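The "invisible argument" reading can be checked with the standard `Function.prototype.call`, which supplies the receiver explicitly. A minimal sketch reusing the names from the comment above; `viaA`, `viaB`, and `viaCall` are invented for this illustration.

```javascript
function foo() { return this; }
const a = { f: foo };
const b = { f: foo };

// Method-call syntax passes the object before the dot as the receiver.
const viaA = a.f();
const viaB = b.f();

// call() passes the receiver as an ordinary first argument instead.
const viaCall = foo.call(b);
```

Here `viaA` is `a`, while `viaB` and `viaCall` are both `b`: the value of `this` is chosen at each call site, like any other argument.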

  4. Great stuff… but a little nit-picking…

    Stated: For example, the scope of a local variable is the text of the block which declares it.

    But not always true…

    x = 5; // is in the block which is declared, but invalid as it is BEFORE the declaration
    int x;

    • My statement is correct. Your comment, which you intended to give as an example of my error actually illustrates the correctness of my statement.

      I said that the scope of a variable is the region of text in which it can be resolved by its unqualified name, and here we have a region of text in which the name “x” refers to the local — “x” means the local throughout that entire block.

      If “x” did not mean the local through the entire block then how could the compiler possibly produce the error that x is used before it is declared? The lookup of “x” results in the local, which is declared later, and therefore is an error.
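The discussion above concerns C#, but JavaScript's `let` happens to behave analogously and makes the point easy to observe at runtime. This is a sketch in a different language than the one under discussion; the names `outerX`-style setup and `probe` are invented for the example.

```javascript
const x = "outer";

function probe() {
  try {
    {
      // Throughout this block, x means the block's own variable,
      // even on lines that come before its declaration.
      const seen = x;
      let x = "inner";
      return seen;
    }
  } catch (e) {
    return e.name;
  }
}

const outcome = probe(); // "ReferenceError"
```

If the early use of `x` had resolved to the outer variable, `probe` would have returned "outer". It fails instead, because the name already refers to the inner variable, which has not been initialized yet.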

  5. Something I am starting to appreciate about lexical scoping is that it can make closures possible which then makes easier use of callbacks possible.

    I am glad that JavaScript supports closures, otherwise how would we deal with its lack of namespaces. I am also glad that Emacs now supports lexical scoping (and closures) as an opt-in feature on file-by-file basis, although special variables continue to be dynamically scoped.
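Both uses mentioned above, closures as callbacks and closures as a stand-in for namespaces, fit in one short sketch. `counterModule` and its members are hypothetical names chosen for this example.

```javascript
// An immediately invoked function acts as a namespace:
// count is visible only to the functions created inside it.
const counterModule = (function () {
  let count = 0;
  return {
    increment: function () { count += 1; return count; },
    current: function () { return count; }
  };
})();

// The closure can be handed out as a callback and still reaches count.
[10, 20, 30].forEach(counterModule.increment);

const total = counterModule.current(); // 3
```

Because `increment` refers to `count` lexically rather than through `this`, it can be passed around as a bare function without losing access to its state.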

  6. Perl is another popular language that supports dynamic scoping, with its “local” keyword. It is not uncommon for a n00b to attempt to declare a local variable, only to find that they’ve declared a variable that is globally available until its declaring function returns.

    Presumably the name is just an accident of history, because “local” was introduced with version 2 in 1988. Back then it was as local as you could get in Perl. When lexical scoping was introduced with Perl 5 in 1994, the name “local” was already taken, so “my” was chosen as the keyword to declare what we now consider to be local variables.
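The behavior described above, a temporary value that every callee can see until the declaring function returns, can be imitated in JavaScript by saving and restoring a global. This is only a simulation of the idea, not Perl's implementation; `globals`, `withLocal`, `report`, and `quiet` are invented names.

```javascript
// A stand-in for Perl's package variables.
const globals = { verbosity: 1 };

// Imitates local: install a temporary value, run body, restore the old value.
function withLocal(name, value, body) {
  const saved = globals[name];
  globals[name] = value;
  try {
    return body();
  } finally {
    globals[name] = saved;
  }
}

// report never mentions quiet, yet it sees quiet's temporary value.
function report() {
  return "verbosity=" + globals.verbosity;
}

function quiet() {
  return withLocal("verbosity", 0, () => report());
}

const during = quiet();     // "verbosity=0"
const afterward = report(); // "verbosity=1"
```

The temporary value is visible to anything called while `quiet` is running, which is exactly why it surprises someone who expected an ordinary local variable.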

  7. If I remember correctly, LISP was intended to use lexical scoping, but due to a bug in the implementation it got dynamic scoping, and then that became a tradition.

    Still, nothing beats ALGOL-60’s call-by-name, if you ask me.

