a = b = c, a strange consequence of operator associativity

Author: Chloé Lourseyre
Editor: Peter Fordham

Case study

If you code in C++ regularly, you have probably encountered the following syntax:

class Foo;
Foo * make_Foo();
int main()
{
    Foo * my_foo;
    if (my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

In terms of semantics, this code is equivalent to the following:

class Foo;
Foo * make_Foo();
int main()
{
    Foo * my_foo = make_Foo();
    if (my_foo)
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

This is today’s subject: assignment is an expression.

How does it work?

What is the value of such an expression?

If we run the following code, we’ll have the answer:

#include <iostream>
int main()
{
    int foo = 2;
    std::cout << (foo = 3) << std::endl;
    return 0;
}

The standard output prints 3.

So we can say that an assignment expression evaluates to the assigned variable, after the assignment has taken place¹.
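A common place where this value is put to use is a loop condition. Here is a minimal sketch, assuming C++17 for std::string_view; the helper name count_separators is made up for this illustration. The result of find() is stored in pos, and the value of that same assignment is then compared against npos.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts how many times `sep` occurs in `text`.
constexpr std::size_t count_separators(std::string_view text, char sep)
{
    std::size_t count = 0;
    std::size_t pos = 0;
    // The assignment stores the result of find() in `pos`; the value of the
    // whole assignment expression is then compared against npos.
    while ((pos = text.find(sep, pos)) != std::string_view::npos)
    {
        ++count;
        ++pos; // Resume the search just after the match
    }
    return count;
}
```

For example, count_separators("a,b,c", ',') finds two commas.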

One typo away from catastrophe

Let’s say that we have three variables, a, b and c, and we want a to be true if (and only if) b and c are equal.

So we will write this:

bool a, b, c;
// ...
a = b == c;

But we are only one character away from a serious typo:

bool a, b, c;
// ...
a = b = c;

This code will compile, and won’t give you the intended result. How come?

The expression a = b = c contains two assignment operations. According to the C++ Operator Precedence Table, the associativity of = is from right to left. So the expression a = b = c is equivalent to a = (b = c).

Since (b = c) is evaluated (as seen earlier) as the value of b after assignment, a = b = c; is equivalent to b = c; a = b;

If you then use a as a boolean, it will be evaluated as true if (and only if) c is also true.
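To make the difference concrete, here is a small sketch that puts the intended code and the typo side by side. The Outcome struct and both function names are invented for this illustration; they are not part of the original example.

```cpp
// Hypothetical result type: the values of a and b once the statement has run.
struct Outcome
{
    bool a;
    bool b;
};

// Intended code: a is true only when b and c are equal; b is left untouched.
constexpr Outcome with_comparison(bool b, bool c)
{
    bool a = false;
    a = b == c;
    return {a, b};
}

// Typo: parsed as a = (b = c), so b is overwritten with c, then a receives b.
constexpr Outcome with_typo(bool b, bool c)
{
    bool a = false;
    a = b = c;
    return {a, b};
}
```

With b and c both false, with_comparison gives a == true while with_typo gives a == false. With b true and c false, the typo also silently changes b to false.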

Conclusion about a = b = c

There may be cases where this syntax (with two = within a single expression) is useful, but most of the time I find it obtuse and confusing.

As of today, there is no way to efficiently prevent a typo like this from happening (adding parentheses will not stop the code from compiling if the typo is made). All you can do is open your eyes wide and use constant variables as much as possible (if b is const, then the code fails to compile if there is a typo)².
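The const safeguard can be checked without writing code that fails to compile. This is a sketch using the standard std::is_assignable_v trait (C++17); the function equal_flags is a made-up name for the illustration.

```cpp
#include <type_traits>

// `b = c` is well-formed when b is a plain bool...
static_assert(std::is_assignable_v<bool&, bool>,
              "a non-const bool can be assigned to");
// ...but ill-formed when b is const, so `a = b = c` is rejected too.
static_assert(!std::is_assignable_v<const bool&, bool>,
              "a const bool cannot be assigned to");

// Hypothetical helper taking its inputs as const.
constexpr bool equal_flags(const bool b, const bool c)
{
    // Writing `b = c` here by mistake would be a compilation error.
    return b == c;
}
```

The trade-off is that this only helps when b really does not need to change afterwards.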

The assignment operation returns an lvalue

I’ll finish this article by showing that the assignment expression is an lvalue.

Let’s take a = b = c again and add parentheses around a = b.

#include <iostream>
int main()
{
    int a = 1, b = 2, c = 3;
    (a = b) = c;
    std::cout << a << b << c << std::endl;
    return 0;
}

This compiles and prints the following result: 323.

That means that a has been assigned the value of b, then the value of c. The expression a = b is indeed an lvalue.

void foo(int&);
int main()
{
    int a = 1, b = 2;
    foo(a = b); // Compiles because `a = b` is an lvalue
    foo(3); // Does not compile because `3` is an rvalue
    return 0;
}

More specifically, the assignment operation (on fundamental types) yields an lvalue that refers to the assigned variable.
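This can be verified at compile time with decltype. The following is a minimal sketch, assuming C++17; the function name is hypothetical.

```cpp
#include <type_traits>

// Hypothetical check: is the result of `a = b` the variable a itself?
constexpr bool assignment_refers_to_left_operand()
{
    int a = 1, b = 2;
    // For fundamental types, the expression `a = b` has type int&.
    static_assert(std::is_same_v<decltype(a = b), int&>,
                  "a = b is an lvalue of type int&");
    int& result = (a = b);
    // `result` is another name for a, which now holds the value of b.
    return &result == &a && a == 2;
}
```

The operand of decltype is not evaluated, so the static_assert itself does not modify a.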

Assignment operation for user-defined types

However, when you define an operator=, the standard allows you to return any type you want (refer to the Canonical implementations section of operator overloading – cppreference.com for more details³).

You can of course return a reference to the assigned object, like so:

struct Foo
{
    Foo& operator=(const Foo&) { return *this; }
};
int main()
{
    Foo a, b, c;
    a = b = c;
    return 0;
}

You can also return a value instead of a reference:

struct Foo
{
    Foo operator=(const Foo&) { return *this; }
};
int main()
{
    Foo a, b, c;
    a = b = c; // Also works, but a copy is made
    return 0;
}

Since the result is copied, the assignment b = c becomes an rvalue. Indeed, if you now try to bind a non-const reference to it, you get a compilation error:

struct Foo
{
    Foo operator=(const Foo& other) 
    { 
        val = other.val; 
        return *this; 
    }
    int val;
};
int main()
{
    Foo b = {1}, c = {2};
    Foo & a = b = c; // Does not compile because here, (b = c) is an rvalue
    return 0;
}

This code would compile if operator= returned a Foo& instead of a Foo.

You can also return nothing (using void as the return type). In that case, you cannot write a = b = c at all.

struct Foo
{
    void operator=(const Foo&) {  }
};
int main()
{
    Foo a, b, c;
    a = b = c; // Does not compile because (b = c) returns nothing
    return 0;
}

This can be used as a good safeguard against the a = b = c syntax⁴.
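The three return types can be compared side by side with type traits, without relying on code that fails to compile. This is only a sketch, assuming C++17: the struct names ByReference, ByValue and ByVoid, and the helpers assign_result_t and can_chain, are all invented for the illustration.

```cpp
#include <type_traits>
#include <utility>

struct ByReference { ByReference& operator=(const ByReference&) { return *this; } };
struct ByValue     { ByValue operator=(const ByValue&) { return *this; } };
struct ByVoid      { void operator=(const ByVoid&) {} };

// Type of the expression `x = y` for two objects of type T.
template <typename T>
using assign_result_t = decltype(std::declval<T&>() = std::declval<const T&>());

// True when `a = b = c` is well-formed, that is, when the result of
// `b = c` can itself be assigned to a T.
template <typename T>
constexpr bool can_chain = std::is_assignable_v<T&, assign_result_t<T>>;

static_assert(std::is_same_v<assign_result_t<ByReference>, ByReference&>,
              "returning a reference gives an lvalue");
static_assert(std::is_same_v<assign_result_t<ByValue>, ByValue>,
              "returning by value gives an rvalue");
static_assert(std::is_void_v<assign_result_t<ByVoid>>,
              "returning void gives no value at all");

static_assert(can_chain<ByReference>, "a = b = c compiles");
static_assert(can_chain<ByValue>, "a = b = c compiles, with a copy");
static_assert(!can_chain<ByVoid>, "a = b = c is rejected");
```

Only the void version refuses the chained form, which is what makes it usable as a safeguard.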

About declarations

There are specific places where you can write a declaration inside another statement (as with the assignment we saw earlier).

You can use this syntax in most flow control statements (such as if, while, switch and, of course, for).

For instance, the very first example of this post can also be written like this⁵:

class Foo;
Foo * make_Foo();
int main()
{
    if (Foo * my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

However, the declaration itself is not an expression: it is neither an lvalue nor an rvalue.

You can’t write this:

int main()
{
    int a = 1, c = 3;
    a = (int b = c); // Does not compile
    return 0;
}

nor this:

int main()
{
    int b = 2, c = 3;
    (int a = b) = c; // Does not compile
    return 0;
}

The standard grammar calls these slots “condition” and “init-statement”, so when you see either of them in the syntax of a statement, you know you can put a declaration in there.
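Since C++17, if and switch also accept an init-statement followed by a separate condition, in the same way that for always has. Here is a minimal sketch of that form; extension_of is a made-up and deliberately simplistic helper.

```cpp
#include <string_view>

// Hypothetical helper: returns what follows the last '.' in `filename`,
// or an empty view when there is no dot.
constexpr std::string_view extension_of(std::string_view filename)
{
    // Init-statement: declares `dot`. The condition after the `;` is a
    // separate expression that uses it.
    if (const auto dot = filename.rfind('.'); dot != std::string_view::npos)
    {
        return filename.substr(dot + 1);
    }
    // `dot` is no longer in scope here.
    return {};
}
```

The variable is scoped to the if statement, just like my_foo in the pointer example above, but the test is no longer limited to converting the declared variable to bool.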

Wrapping up

Syntaxes like a = b = c and if (a = b) are intentional and clearly defined in the standard. However, they are alien to most developers and are so rarely used that they can easily be misleading.

Bugs can occur because the symbol = really looks like ==, so be wary of that. If you want to avoid it with your user-defined types, you can declare operator= as returning void so that the a = b = c syntax becomes invalid, but this is not possible with fundamental types, and it is a constraint of its own.

Thanks for reading and see you next time!