Ref returns and ref locals

“Ref returns” are the subject of another great question from StackOverflow that I thought I might share with a larger audience.

Ever since C# 1.0 you’ve been able to create an “alias” to a variable by passing a “ref to a variable” to certain methods:

static void M(ref int x)
{
  x = 123;
}
  ...
  int y = 456;
  M(ref y);

Despite their different names, x and y are now aliases for each other; they both refer to the same storage location. When x is changed, y changes too because they are the same thing. Basically, “ref” parameters allow you to pass around variables as variables rather than as values. This is a sometimes-confusing feature (because it is easy to confuse “reference types” with “ref” aliases to variables), but it is generally a pretty well-understood and frequently-used feature.
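To make the variable-versus-value distinction concrete, here is a minimal sketch that contrasts an ordinary parameter with a ref parameter. The names RefAliasDemo, SetByValue and SetByRef are made up for illustration; only the ref-parameter behaviour itself comes from the text above.

```csharp
using System;

public static class RefAliasDemo
{
    // Ordinary parameter: x is a copy of the caller's value, so the
    // assignment is invisible to the caller.
    public static void SetByValue(int x)
    {
        x = 123;
    }

    // ref parameter: x is an alias for the caller's variable, so the
    // assignment writes to the caller's storage location.
    public static void SetByRef(ref int x)
    {
        x = 123;
    }

    public static void Main()
    {
        int y = 456;

        SetByValue(y);
        Console.WriteLine(y); // 456: only the copy was changed

        SetByRef(ref y);
        Console.WriteLine(y); // 123: y itself was changed
    }
}
```

Note that the argument here is an int, a value type, which is the point: “ref” is about aliasing the variable, not about whether the type is a reference type.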

However, it is a little-known fact that the CLR type system supports additional usages of ref, though C# does not. The CLR type system also allows methods to return refs to variables, and allows local variables to be aliases for other variables. The CLR type system, however, does not allow fields that are aliases to other variables. Similarly, arrays may not contain managed references to other variables. Both fields and arrays containing refs are illegal because making them legal would overly complicate the garbage collection story.[1. I also note that the “managed reference to variable” types are not convertible to object, and therefore may not be used as type arguments to generic types or methods. For details about this feature, see the CLI specification, Partition I, Section 8.2.1.1, “Managed pointers and related types”. See also my numerous articles on memory management for more discussion of why C# and the CLR do not allow long-term storage of refs.]

As you might expect, it is entirely possible to create a version of C# which supports both these features. You could then do things like

static ref int Max(ref int x, ref int y)
{
  if (x > y)
    return ref x;
  else
    return ref y;
}

Why do this? It is quite different from a conventional “Max”, which returns the larger of two values. This returns the larger variable itself, which can then be modified:

int a = 123;
int b = 456;
ref int c = ref Max(ref a, ref b);
c += 100;
Console.WriteLine(b); // 556!

Kinda neat! This would also mean that ref-returning methods could be the left-hand side of an assignment — we don’t need the local “c”:

int a = 123;
int b = 456;
Max(ref a, ref b) += 100;
Console.WriteLine(b); // 556!

Syntactically, ref is a strong marker that something weird is going on. Every time the keyword ref appears before a variable usage, it means “I am now making some other thing an alias for this variable”. Every time it appears before a declaration, it means “this thing must be initialized with a variable marked with ref”.

I know empirically that it is possible to build a version of C# that supports these features because I have done so in order to test-drive the possible feature. Advanced programmers (particularly people porting unmanaged C++ code) often ask us for more C++-like ability to do things with references without having to get out the big hammer of actually using pointers and pinning memory all over the place. By using managed references you get these benefits without paying the cost of screwing up your garbage collection performance.

We have considered this feature, and actually implemented enough of it to show to other internal teams to get their feedback. However, at this time, based on our research, we believe that the feature does not have broad enough appeal or compelling usage cases to make it into a real supported mainstream language feature. We have other higher priorities and a limited amount of time and effort available, so we’re not going to do this feature any time soon.

Also, doing it properly would require some changes to the CLR. Right now the CLR treats ref-returning methods as legal but unverifiable because we do not have a detector that detects and outlaws this situation:

static ref int M1(ref int x)
{
  return ref x;
}
static ref int M2()
{
  int y = 123;
  return ref M1(ref y); // Trouble!
}
static int M3()
{
  ref int z = ref M2();
  return z;
}

M3 returns the contents of M2’s local variable, but the lifetime of that variable has ended! It is possible to write a detector that identifies uses of ref-returns that clearly do not violate stack safety. We could write such a detector, and if the detector could not prove that lifetime safety rules were met then we would not allow the usage of ref returns in that part of the program. It is not a huge amount of dev work to do so, but it is a lot of burden on the testing teams to make sure that we’ve really got all the cases. It’s just another thing that increases the cost of the feature to the point where right now the benefits do not outweigh the costs.
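By way of contrast with M2, here is a rough sketch of two ref-returning shapes that do not outlive their storage. It assumes a compiler that accepts the ref-return syntax used in this post (the prototype described above, or any C# compiler that supports ref returns). The names RefReturnLifetimes, ElementAt and PassThrough are invented for the example, and this is not a description of what an actual detector would check.

```csharp
using System;

public static class RefReturnLifetimes
{
    // The returned alias refers to an array element. The array lives on
    // the heap, so the storage outlives this call.
    public static ref int ElementAt(int[] items, int index)
    {
        return ref items[index];
    }

    // The returned alias is the one the caller passed in. It is only
    // valid for as long as the caller's variable is alive.
    public static ref int PassThrough(ref int x)
    {
        return ref x;
    }

    public static void Main()
    {
        int[] scores = { 10, 20, 30 };
        ref int middle = ref ElementAt(scores, 1);
        middle += 5;
        Console.WriteLine(scores[1]); // 25

        int local = 1;
        PassThrough(ref local) = 2;   // local is still alive here
        Console.WriteLine(local);     // 2
    }
}
```

The difference from M2 is that here the variable being aliased is still alive at every point where the alias is used; M2 hands the alias to its own caller after its local y is gone.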

If we implemented this feature some day, would you use it? For what? Do you have a really good usage case that could not easily be done some other way? If so, please leave a comment. The more information we have from real customers about why they want features like this, the more likely it will make it into the product someday. It’s a cute little feature and I’d like to be able to get it to customers somehow if there is sufficient interest. However, we also know that “ref parameters” is one of the most misunderstood and confusing features, particularly for novice programmers, so we don’t necessarily want to add more confusing features to the language unless they really pay their own way.

9 thoughts on “Ref returns and ref locals”

  1. A very late reply to this post, but I’ve been running into this again.

    I think ref returns are important because they make the story for mutable value types consistent between built-in and user-defined types.

    For a simple example, see http://pastebin.com/yyYtyh6u

  2. Pingback: How to: Why doesn't C# support the return of references? | SevenNet

  3. I know I’m really late to the party but I would use it to decrease risk of typos in code.
    For example, I’m currently writing a method that, depending on some condition, needs to send a reference to a field as a parameter into a few different methods (the same field is passed as a reference in four different places inside this scope). Thus allowing “aliasing” of the field inside this scope would be very nice – in particular because this is a situation prone to copy-paste errors, since there is more than one such field, and for each of them a corresponding method. As it stands, copying the method requires changing all references, and missing one would be a bug. Allowing a local “alias” would mean that for each copy of the method one would only need to change one thing: what the alias points to. Much less error prone.

  4. Also late to the party but this becomes *really* useful when you need to write code to solve GC pressure problems similar to what Sam documented years ago with Stack Overflow. I am dealing with that right now and if I could return references to *structures* in the middle of an array, that would make implementing high performance code so much easier.

    https://samsaffron.com/archive/2011/10/28/in-managed-code-we-trust-our-recent-battles-with-the-net-garbage-collector

    I doubt many people would normally run into the need for this, but when you do, it is super painful to code around it using whacky delegates or other ways to avoid copying by value when you want to pass around structures embedded in an array.

  5. I found myself writing my own Dictionary class for convenience, such as getting back the old value when setting a new one and support for my own Option type. Currently I’d have to get the index at which the entry is located to modify the structure directly through the array (var i = FindEntry(key); entries[i].Value = value;). With ref returns I could write better code (FindEntry(key).Value = value;).

    As I’m working in this area myself right now, I think this would be incredibly useful for game development in general. There, it often makes sense to pack game object data into arrays of structs for performance reasons.

  6. Pingback: Why doesn't C# support the return of references? – CodingBlog

  7. Pingback: Why doesn't C# support the return of references? – CodingBlog
