A string concatenation puzzle

Happy Canada Day everyone! I’m back from a fabulous week in California and looking forward to taking a break from giving PowerPoint presentations.

Today, to follow up on my recent series on string concatenation, here’s a fairly easy little puzzle. I have a perfectly ordinary local variable:

string s = "";

Can you come up with some code that parses as a legal expression such that the statements

s = s + your_expression;

and

s += your_expression;

are both legal but produce completely different results in s? Post your proposals in the comments and I’ll give my answer later this week.

37 thoughts on “A string concatenation puzzle”

  1. I don’t know if this counts as completely different results, but “s = s + 5 + 5;” results in “55” while “s += 5 + 5;” results in “10”.
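A minimal sketch of why the two statements in this comment differ, assuming nothing beyond the standard rules: + is left-associative, while the right-hand side of += is evaluated as a whole before the concatenation. The class and method names (PrecedenceDemo, Expanded, Compound) are made up for illustration.

```csharp
using System;

static class PrecedenceDemo
{
    // Expanded form: + is left-associative, so this parses as (s + 5) + 5,
    // which is two string concatenations.
    public static string Expanded(string s)
    {
        s = s + 5 + 5;
        return s;
    }

    // Compound form: the right-hand side is a single operand, so this is
    // s + (5 + 5): integer addition first, then one concatenation.
    public static string Compound(string s)
    {
        s += 5 + 5;
        return s;
    }

    static void Main()
    {
        Console.WriteLine(Expanded("")); // 55
        Console.WriteLine(Compound("")); // 10
    }
}
```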

  2. Looks like my operator precedence got bested by dtb’s operator associativity, but here’s my answer in 14 characters:
    1is int?"a":""

    • It only took me this long because ‘is’ isn’t listed on the operator precedence list, and I didn’t think of it for a while.

  3. The solution I immediately thought of was to exploit the low precedence of the ‘as’ operator:
    s = s + (object)1 as string; // results in "1"
    s += (object)1 as string; // results in ""

    I wanted to avoid such parse tree tricks, so I thought maybe we could distinguish the two statements using a custom class that has different user-defined conversions to string based on whether the conversion is explicit or implicit. However, this doesn’t work, as the compound assignment only uses an explicit conversion if no implicit conversion exists, and only if the operator is built-in.
    None of the built-in + operators have a return type that is explicitly but not implicitly convertible to string, so this semantic difference between the compound assignment and its expanded form cannot be used for a solution.
    Neither can we exploit the ‘only evaluated once’ semantics, as ‘s’ is a simple variable.
    That leaves no semantic difference between the two, so I conclude that any possible solution must involve some kind of parse tree tricks.

    • The documentation for += states that:
      An expression using the += assignment operator, such as
      x += y
      is equivalent to
      x = x + y
      except that x is only evaluated once.
      Since evaluation of s is irrelevant, that only leaves parse tree tricks.
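To make the “x is only evaluated once” clause concrete, here is a small sketch in which the left-hand side is an array element whose index has a side effect, so the two forms really do differ. With a simple local such as s there is no such side effect, which is the point being made above. The names (EvaluatedOnceDemo, items, Compound, Expanded) are hypothetical.

```csharp
using System;

static class EvaluatedOnceDemo
{
    // Compound form: the index expression i++ is evaluated once,
    // so element 0 is read and written, and i ends at 1.
    public static string Compound()
    {
        int[] items = { 10, 20, 30 };
        int i = 0;
        items[i++] += 1;
        return string.Join(",", items) + " i=" + i;
    }

    // Expanded form: i++ is evaluated twice, left to right.
    // The target is element 0, the value read is element 1, and i ends at 2.
    public static string Expanded()
    {
        int[] items = { 10, 20, 30 };
        int i = 0;
        items[i++] = items[i++] + 1;
        return string.Join(",", items) + " i=" + i;
    }

    static void Main()
    {
        Console.WriteLine(Compound()); // 11,20,30 i=1
        Console.WriteLine(Expanded()); // 21,20,30 i=2
    }
}
```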

      • That documentation is highly simplified; the rules in the C# specification are more complex.
        Try this:
        byte b = 0;
        b += 1; // OK
        b = b + 1; // CS0266: Cannot implicitly convert type ‘int’ to ‘byte’.

        ‘x += y’ is sometimes equivalent to ‘x = (T)(x + y)’ instead of ‘x = x + y’, and I was checking if that explicit conversion could be exploited to change the behavior.
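A short sketch of the inserted cast described above, assuming an unchecked context (made explicit here with the unchecked keyword, since a checked context would throw instead of wrapping). The helper names (CompoundCastDemo, AddOneCompound, AddOneExpanded) are invented for illustration.

```csharp
using System;

static class CompoundCastDemo
{
    // Compiles without a cast: for a byte operand the compiler treats
    // b += 1 as b = (byte)(b + 1).
    public static byte AddOneCompound(byte b)
    {
        unchecked { b += 1; }
        return b;
    }

    // The expanded form needs the cast written out, because b + 1 has type int.
    public static byte AddOneExpanded(byte b)
    {
        unchecked { b = (byte)(b + 1); }
        return b;
    }

    static void Main()
    {
        Console.WriteLine(AddOneCompound(255)); // 0: 256 is truncated to a byte
        Console.WriteLine(AddOneExpanded(255)); // 0
    }
}
```

Both forms produce the same value once the cast is written out, which fits the conclusion above that the cast alone does not yield a solution for a string variable.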

  4. Ahh, evil type coercion. Visual Basic solved that problem in the 90’s when it introduced separate concatenation and addition operators. I was sorely disappointed when I first learned that C# didn’t learn from that lesson.

    • Unfortunately VB did not solve the problem; arguably it made it worse. In VB the “&” operator only concatenates strings but the “+” operator still works on strings; in VB, “123” + 123 is 246, but “123” + “123” is “123123”. This can be quite confusing. Also, string concatenation in VB6 and VBScript does not use the “hard vs soft” rule, but comparison does. See my article from 2004 on that subject for details.

      • VB.NET includes two languages: “Option Strict Off” is a weird goofy language which should never be used except in very limited contexts where dynamic binding is needed, or when it’s necessary to port horrible VB6 code that uses variable types in inconsistent fashion (e.g. a function that returns a string if it succeeds, or a numeric error code if it fails). I don’t know that anybody really likes the “Option Strict Off” language; it certainly shouldn’t have been the default.

        In the “Option Strict On” language, which is what any real users of VB.NET are really talking about when discussing the language, (“123” + 123) won’t compile, but (“123” & 123) will yield “123123” [as will, incidentally, (123 & 123), since the keyword “And”, rather than the ampersand operator, is used for the Boolean operation.]

  5. Apart from playing tricks with the compiler you can also use old fashioned side effects:
    class A
    {
        static int count = 0;
        public override string ToString()
        {
            return count++.ToString();
        }
    }
    static void Main(string[] args)
    {
        string lret = "";
        lret += new A();
        lret = lret + new A();
        Console.WriteLine(lret);
    }
    This will print 01 because every call to ToString results in a different value. I know this is cheating because it has nothing to do with the operators used, but it does produce the desired different output.

    • Seems like a lot of trouble to go to to produce a side-effected expression – there are dozens of ways to do that within the spirit of ‘just add an expression’, without the need to add a ton of boilerplate. Without leaving the System namespace, we can use “Guid.NewGuid()”, “new Random().Next()”, “DateTime.Now”, “Console.CursorLeft++”, “Environment.WorkingSet”…

      • You are right, there are many more ways to do it. Just not DateTime.Now, because its accuracy is only about 15ms, so two calls in a row will most likely return the same value. Stopwatch.GetTimestamp() would be a better alternative (if executed on recent Intel CPUs with a constant clock rate).

  6. If you want to control exactly what the values for s are in these two scenarios, you can use the following:

    class A
    {
        public static SPlusA operator +(string left, A right)
        {
            return null;
        }
    }

    class B
    {
        public static string operator +(A left, B right)
        {
            return "two";
        }
    }

    class SPlusA
    {
        public static string operator +(SPlusA left, B right)
        {
            return "one";
        }
    }

    putting whatever you like in place of “one” and “two”.

    With these types defined, “your_expression” can add any A to any B, e.g.:

    new A() + new B()

    Same basic trick that the earlier, simpler answers are using, of course… (Foolishly, when I set out to solve this, for some reason I had it in my head that we needed to be able to pick the results, which is how I ended up with this more complex variation on the theme.)

  7. Pingback: Beware operator precedence « Jim's Random Notes

  8. I cheats… 🙂

    string my_expression(string s) {
        Thread.Sleep(new Random().Next(5, 10));
        return s + DateTime.Now.Ticks;
    }

    void Main()
    {
        string s = "";
        s = s + my_expression(s);
        s.Dump();

        string t = "";
        t += my_expression(t);
        t.Dump();
    }

  9. Abusing the fact that << binds looser than +, but tighter than +=:

    struct S {
        public readonly string Value;
        public S(string value) {
            this.Value = value;
        }
        public static S operator +(string s, S b) {
            return new S(string.Format("(({0}) + ({1}))", s, b.Value));
        }
        public static string operator <<(S s, int x) {
            return new S(string.Format("(({0}) << ({1}))", s.Value, x));
        }
        public static implicit operator string(S b) {
            return b.Value;
        }
    }
    void Main() {
        var a = "";
        a += new S("_") << 1;
        a.Dump(); // "((_) << (1))"
        a = "";
        a = a + new S("_") << 1;
        a.Dump(); // "(((() + (_))) << (1))"
    }

  10. String s = "";
    Int32 i = 0;
    s = s + ++i;
    Console.WriteLine(s); // prints 1

    i = 0;
    s += ++i;
    Console.WriteLine(s); // prints 11
