a = b = c, a strange consequence of operator associativity

Author: Chloé Lourseyre
Editor: Peter Fordham

Case study

If you code in C++ regularly, you have probably encountered the following syntax:

class Foo;
Foo * make_Foo();

int main()
{

    Foo * my_foo;
    if (my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }

    return 0;
}

In terms of semantics, this code is equivalent to the following:

class Foo;
Foo * make_Foo();

int main()
{

    Foo * my_foo = make_Foo();
    if (my_foo)
    {
        // ... Do things with the my_foo pointer
    }

    return 0;
}

This is today’s subject: assignment is an expression.

How does it work?

What is the value of such an expression?

If we run the following code, we’ll have the answer:

#include <iostream>

int main()
{
    int foo = 2;
    std::cout << (foo = 3) << std::endl;
    return 0;
}

The standard output prints 3.

So we can say that an assignment expression evaluates to the assigned variable, after the assignment has taken place [1].
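Here is a minimal sketch of where this property is commonly put to use: the result of the assignment is compared in the same expression that performs it. The helper name count_digits is hypothetical and only serves the illustration.

```cpp
// Hypothetical helper, for illustration only.
// Counts the decimal digits of a non-negative number.
constexpr int count_digits(int number)
{
    int digits = 1;
    // `number = number / 10` is an expression: it yields `number` after the
    // assignment, so the result can be compared with 0 right away.
    while ((number = number / 10) != 0)
    {
        ++digits;
    }
    return digits;
}
```

With this sketch, count_digits(12345) evaluates to 5. The extra parentheses around the assignment are needed because != binds tighter than =.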

One typo away from catastrophe

Let’s say that we have three variables, a, b and c. You want the value of a to be true if (and only if) b and c are equal.

So we will write this:

bool a, b, c;
// ...
a = b == c;

But we are only one character away from a serious typo. This one:

bool a, b, c;
// ...
a = b = c;

This code will compile, and won’t give you the intended result. How come?

The expression a = b = c contains two assignment operations. According to the C++ Operator Precedence Table, the associativity of = is from right to left. So the expression a = b = c is equivalent to a = (b = c).

Since (b = c) is evaluated (as seen earlier) as the value of b after assignment, a = b = c; is equivalent to b = c; a = b;

If you then use a as a boolean, it will be evaluated as true if (and only if) c is also true.
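To make the difference concrete, here is a small sketch that puts the two spellings side by side. The function names intended and with_typo are hypothetical, chosen for this illustration only.

```cpp
// Hypothetical helpers, for illustration only.

// What was intended: a is true if (and only if) b and c are equal.
constexpr bool intended(bool b, bool c)
{
    bool a = false;
    a = b == c;   // parsed as a = (b == c)
    return a;
}

// The typo: parsed as a = (b = c), so b is overwritten by c
// and a ends up with the value of c, whatever b was.
constexpr bool with_typo(bool b, bool c)
{
    bool a = false;
    a = b = c;
    return a;
}
```

When b and c are both false, intended returns true (they are equal) while with_typo returns false (the value of c).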

Conclusion about a = b = c

There may be cases where this syntax (with two = within a single expression) is useful, but most of the time I find it obtuse and confusing.

As of today, there is no efficient way to prevent a typo like this from happening (adding parentheses will not stop the code from compiling if the typo is made). All you can do is keep your eyes wide open and use constant variables as much as possible (if b is const, then the code fails to compile if there is a typo) [2].
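The const safeguard can be sketched as follows. I use the standard trait std::is_assignable to show, without breaking the build, that assigning to a const bool is rejected. The helper equal_flags is a hypothetical name.

```cpp
#include <type_traits>

// std::is_assignable<T, U> tells whether `std::declval<T>() = std::declval<U>()` compiles.
static_assert(std::is_assignable<bool&, bool>::value,
              "a non-const bool accepts assignment");
static_assert(!std::is_assignable<const bool&, bool>::value,
              "a const bool rejects assignment");

// Hypothetical helper: with `b` const, only the intended comparison is valid.
constexpr bool equal_flags(const bool b, const bool c)
{
    bool a = false;
    a = b == c;      // OK: b is only read
    // a = b = c;    // would not compile: b is const and cannot be assigned
    return a;
}
```

This only helps when b really can be const, which is why it is a habit to adopt rather than a complete solution.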

The assignment operation returns an lvalue

I’ll finish this article by showing that an assignment expression is an lvalue.

Let’s take a = b = c again and add parentheses around the a = b.

#include <iostream>

int main()
{
    int a = 1, b = 2, c = 3;

    (a = b) = c;

    std::cout << a << b << c << std::endl;
    return 0;
}

This compiles and prints the following result: 323.

That means that a has been assigned the value of b, then the value of c. The expression a = b is indeed an lvalue.

void foo(int&);

int main()
{
    int a = 1, b = 2;

    foo(a = b); // Compiles because `a = b` is an lvalue
    foo(3); // Does not compile because `3` is an rvalue

    return 0;
}

More specifically, the assignment operation (of fundamental types) returns a reference to the resulting variable.
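This can be checked at compile time. The sketch below relies on the fact that decltype applied to an lvalue expression of type T (other than a plain variable name) gives T&. The helper bump_through_reference is a hypothetical name.

```cpp
#include <type_traits>
#include <utility>

// The built-in assignment on int is an lvalue of type int, so decltype gives int&.
static_assert(std::is_same<decltype(std::declval<int&>() = 0), int&>::value,
              "built-in assignment yields an lvalue");

// Hypothetical helper: binds a reference to the result of an assignment.
constexpr int bump_through_reference(int a, int b)
{
    int& alias = (a = b);   // alias refers to a, which now holds the value of b
    alias += 1;             // modifies a through the reference
    return a;               // b + 1
}
```

Calling bump_through_reference(1, 2) gives 3, which shows that the reference really designates a and not a temporary copy.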

Assignment operation for user-defined types

However, when you define an operator=, the standard allows you to return any type you want (refer to the Canonical implementations section of operator overloading – cppreference.com for more details [3]).

You can of course return a reference to the assigned object, like so:

struct Foo
{
    Foo& operator=(const Foo&) { return *this; }
};

int main()
{
    Foo a, b, c;
    a = b = c;
    return 0;
}

You can also return a value instead of a reference:

struct Foo
{
    Foo operator=(const Foo&) { return *this; }
};

int main()
{
    Foo a, b, c;
    a = b = c; // Also works, but a copy is made
    return 0;
}

Since the result is copied, the assignment b = c becomes an rvalue. Indeed, if you now try to bind a non-const reference to it, you get a compilation error:

struct Foo
{
    Foo operator=(const Foo& other) 
    { 
        val = other.val; 
        return *this; 
    }
    int val;
};

int main()
{
    Foo b = {1}, c = {2};
    Foo & a = b = c; // Does not compile because here, (b = c) is an rvalue
    return 0;
}

This code would compile if operator= returned a Foo& instead of a Foo.
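The two return styles can be compared at compile time. In this sketch, the struct names ByReference and ByValue and the alias assignment_result are hypothetical and exist only for the illustration.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical types: same operator=, different return types.
struct ByReference
{
    ByReference& operator=(const ByReference&) { return *this; }
};

struct ByValue
{
    ByValue operator=(const ByValue&) { return *this; }
};

// Type of the expression `x = y` for two objects of type T.
template <typename T>
using assignment_result = decltype(std::declval<T&>() = std::declval<const T&>());

// Returning a reference keeps the assignment an lvalue...
static_assert(std::is_same<assignment_result<ByReference>, ByReference&>::value,
              "assignment is an lvalue");
// ...while returning by value turns it into an rvalue (a temporary copy).
static_assert(std::is_same<assignment_result<ByValue>, ByValue>::value,
              "assignment is an rvalue");
```

Note that a const reference can still bind to the temporary, so const ByValue& a = b = c; would compile. It is only the non-const reference that is rejected.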

You can also return nothing (using void as the return type). In that case, you cannot write a = b = c at all.

struct Foo
{
    void operator=(const Foo&) {  }
};

int main()
{
    Foo a, b, c;
    a = b = c; // Does not compile because (b = c) returns nothing
    return 0;
}

This can be used as a good safeguard against the a = b = c syntax [4].
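The safeguard does not get in the way of ordinary assignments, as this sketch tries to show. The type Guarded and the helper copy_value are hypothetical names, and the constexpr qualifiers assume C++14 or later.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type whose assignment returns nothing.
struct Guarded
{
    int val;
    constexpr void operator=(const Guarded& other) { val = other.val; }
};

// The expression `x = y` has type void for Guarded objects.
static_assert(std::is_void<decltype(std::declval<Guarded&>() = std::declval<const Guarded&>())>::value,
              "assignment returns nothing");

// Hypothetical helper: a single assignment still works as usual.
constexpr int copy_value(int source)
{
    Guarded a{0};
    Guarded b{source};
    a = b;            // OK: plain assignment
    // a = b = b;     // would not compile: (b = b) is void
    // if (a = b) {}  // would not compile either, for the same reason
    return a.val;
}
```

Here copy_value(7) gives 7, while both chained and conditional uses of the assignment are rejected by the compiler.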

About declarations

There are specific cases where you can write a declaration inside another statement, in the same way we placed an assignment inside one earlier.

You can use this specific syntax in most flow control statements (like if, while, switch and, of course, for). It is not available in the arguments of a function call.

For instance, the very first example of this post can also be written like this [5]:

class Foo;
Foo * make_Foo();

int main()
{

    if (Foo * my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }

    return 0;
}

However, a declaration is not an expression, so it is neither an lvalue nor an rvalue.

You can’t write this:

int main()
{
    int a = 1, c = 3;
    a = (int b = c); // Does not compile

    return 0;
}

nor this:

int main()
{
    int b = 2, c = 3;
    (int a = b) = c; // Does not compile

    return 0;
}

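The standard’s grammar gives names to these slots: the declaration in the if above is a “condition”, and the first clause of a for is an “init-statement” (since C++17, if and switch accept an init-statement too). So when you see condition or init-statement written in a syntax description, you know you can put a declaration in there.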
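As a short sketch, here are both forms in an if statement. The function names divide_or and remainder_or are hypothetical, and the second form assumes a C++17 compiler.

```cpp
// Hypothetical helpers, for illustration only.

// Declaration used as the condition: `divisor` is declared, initialized,
// then converted to bool to decide which branch is taken.
constexpr int divide_or(int numerator, int denominator, int fallback)
{
    if (int divisor = denominator)
    {
        return numerator / divisor;
    }
    return fallback;   // `divisor` is no longer in scope here
}

// C++17 form: a declaration first, then a separate condition after the `;`.
constexpr int remainder_or(int numerator, int denominator, int fallback)
{
    if (int divisor = denominator; divisor != 0)
    {
        return numerator % divisor;
    }
    return fallback;
}
```

For example, divide_or(6, 3, -1) gives 2 and divide_or(6, 0, -1) gives -1. In both functions the variable only exists inside the if statement.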

Wrapping up

Syntaxes like a = b = c and if (a = b) are intentional and clearly defined in the standard. However, they are alien to most developers and are so rarely used that they can easily be misleading.

Bugs can occur because the symbol = really looks like ==, so be wary of that. If you want to avoid it with your user-defined types, you can declare the operator= function returning void so the a = b = c syntax becomes invalid, but this is not possible with fundamental types, and is a constraint on its own.

Thanks for reading and see you next time!


Addendum

Notes

  1. It is in fact evaluated as a reference to the variable, and not the value of the variable. This will be demonstrated further in the article.
  2. You can actually activate specific warnings to catch specific cases (for instance, -Wparentheses can be used with GCC to flag an assignment used as the condition of a flow control statement), but that doesn’t cover every case (typically, a = b = c does not trigger a warning) and sometimes you may not want to activate them, depending on how you feel about this syntax.
  3. The site cppreference.com says that “for example, assignment operators return by reference to make it possible to write a = b = c = d, because the built-in operators allow that.” However, there is no mention of this specific intention in the 4th edition of The C++ Programming Language by Bjarne Stroustrup. I suspect this is a free interpretation.
  4. You can, as you may have guessed, also return any type you want, if you have very specific needs. The prototype int operator=(const Foo&); (member function of class Foo) is valid. This can be useful, for instance, if you want to return an error code.
  5. There is a practical difference in terms of scope (which is not today’s subject): in this last example, the my_foo variable only lives within the if block, whereas in the first examples it lives until the end of main. But since the behavior is the same in this specific case (because there is nothing after the if block), I do not deem it necessary to elaborate.
