The answer to the string concatenation puzzle

Earlier on FAIC I asked for code that parses as an expression that produces different results for

s = s + expr;

and

s += expr;

This is a pretty easy puzzle; the answers posted in the comments could largely be grouped into two buckets. The first, which is a bit silly, are expressions which always produce a different value, so of course they produce different results in those two cases.

s = s + Guid.NewGuid();

produces a different result than

s += Guid.NewGuid();

but then again, it also produces different results every time you call

s += Guid.NewGuid();

so that’s not a particularly interesting answer.

The second, which is what I was looking for, are expressions which stop parsing as expressions when placed on the right side of an addition operator. When you say

s += 1 + 2;

that’s the same as

s = s + (1 + 2); // produces "3"

and not

s = s + 1 + 2;

which is the same as

s = (s + 1) + 2; // produces "12"

The trick is that the 1 + 2 portion of s = s + 1 + 2; is not an expression at all in that context.

This is particularly nasty when you consider something like:

s += M() ?? "(null)";

The intention here is to concatenate the value of M() to s, unless M() returns null in which case you concatenate "(null)". But

s = s + M() ?? "(null)";

is the same as

s = (s + M()) ?? "(null)";

and we know that s + M() is never null from previous episodes in this series.

The C# compiler could warn that the left side of a null coalescing operation is never null, but it does not; I’m considering adding this to a future version of Coverity’s C# analysis; what do you guys think? I’ve seen this “in the wild” on StackOverflow, but is it a likely bug in real production code?

15 thoughts on “The answer to the string concatenation puzzle

  1. I’ve seen the “s + M() ?? “(null)” as an interview question but never as actual production code. I’ve never worked with any developer confident enough with operator precedence to write that expression without putting parentheses (“just in case”).

    • Even if you fully understand operator precedence and how an expression will be evaluated, parentheses are useful as a form of documentation for future editors/readers of such code. It demonstrates that the behavior was there by intent not accident.

    • “I’ve seen the “s + M() ?? “(null)” as an interview question…” :
      With all due respect, the person who asks this as interview question is an idiot. Will he/she be happy when the source code of the product is filled with these and the code becomes unreadable?
      “I’ve never worked with any developer confident enough with operator precedence…” :
      It’s not about confidence of developers, let’s be honest! Who would want to work with people who code like this!(please calculate also hidden psychology of the guy who codes likes this thinking that he/she is one heck of a hacker) I wouldn’t want, that’s for sure.
      To finish up, these cases are great to analyze “deep/dark” aspects of the language and/or the compiler, but not to ask as an interview question!

      • Asking something in an interview question is not the same as recommending its use in production code. Obviously you wouldn’t want to see that kind of code without parentheses in production. Also, the topic is sufficiently obscure that you probably wouldn’t touch on it if you were interviewing for a junior position. But if it’s a senior position and the candidate is claiming (or the post requires) expert knowledge of C#, then a question on this topic seems quite valid as way of probing how much the candidate really knows. In that context I don’t think that would make the interviewer an idiot.

        After all, badly written, unclear, obscure code does exist in the wild, and does need to be debugged sometimes!

  2. This were never a bug in my case, but the conversion from object to string in the “+” operator already caused a lot of trouble for me, last week:
    aString + anObject == null ? “” : anObject.ToString()

  3. Warnings about conditions that’ll never occur are always nice, helps get rid of code bloat from minor paranoia.

    The order of operation influencing typing thing can be a problem with other types than string. Recently had an equation that required a division with integers followed by implicit conversion to float for the latter half. The intent was to use integer’s nature of automatically truncating the decimal, but what come out was division in floating point form due to one missed pair of parenthesis. Produced some pretty wild values well outside of the expected range and took digging into the generated IL to figure it out.

    • I agree about warnings for conditions that do not usually occur. I’m all in favour of software that protects me from my absent-mindedness.

      Also, the less effort needed to program defensively the better. If you know the tools will pick up a bad refactor or a typo, you can spend more mental effort on the actual features of your program. You can also feel free to make more extensive refactors, fighting the good fight against code-rot.

    • In the “Explicit Off” dialect of vb.net, different operators are used for floating-point division and integer division (int/int yields double; intint yields int); the only annoyance in that regard is that assigning the quotient of two integers to a Single (the .NET type that C# calls float) requires a silly explicit cast. Additionally, X+Y will perform concatenation if both X and Y are strings, but will not convert its operands to strings (attempting to concatenate to string a type which does not define an explicit overload for that purpose will fail). The preferred operator for string concatenation is the ampersand (the C# operator “&” is written as “And”, and the C# operator “&&” is written as “AndAlso”); also, the reference equality/inequality operators are written as “Is” and “IsNot”; the overridable equality operators only work with types that have explicitly defined behaviors.

      It’s curious that C# uses the “/”, “+”, “==”, and “!=” tokens to each have two very different meanings depending upon the compile-time types of their operands, in situations where VB.NET uses different operators. I wonder what criteria the designers of C# used in deciding when to assign multiple meanings to tokens versus defining new tokens. Certainly tokens like e.g. “===” aren’t going to be familiar to C or C++ programmers, but “X===Y” would be more concise than “Object.ReferenceEquals(X,Y)” and less ambiguous than “X==Y”.

  4. “The C# compiler could warn that the left side of a null coalescing operation is never null, but it does not; I’m considering adding this to a future version of Coverity’s C# analysis; what do you guys think?”

    It’s important enough that it’s included in ReSharper, so I’d say go ahead and include it 🙂

  5. > The C# compiler could warn that the left side of a null coalescing operation is never null, but it does not; I’m considering adding this to a future version of Coverity’s C# analysis; what do you guys think? I’ve seen this “in the wild” on StackOverflow, but is it a likely bug in real production code?

    It should be a warning.

    If the LHS of the null-coallescing op [in that statement] is never null, why is the null-coallescing op there?
    It must be an accident, and so it should be flagged.

  6. This example only proves that silent conversion of int into string is wrong concept. If compiler did not allow s += 1 + 2; but required s += 1.ToString() + 2.ToString(); then there would not be any problem.

  7. I’ve seen this is free code supposedly used in production. (I converted the code to VB.NET, and this bug was converted too.) I’m not at the computer where the souce code is, but its the Jamaa SMPP library. IIRC The error is only in the calculation of the original suggested capacity of a StringBuilder so it doesn’t actually cause a problem.

Leave a reply to Joshua Lawrence Cancel reply