Author: Chloé Lourseyre
Editor: Peter Fordham
Case study
If you code in C++ regularly, you have probably encountered the following syntax:
class Foo;
Foo * make_Foo();
int main()
{
Foo * my_foo;
if (my_foo = make_Foo())
{
// ... Do things with the my_foo pointer
}
return 0;
}
In terms of semantics, this code is equivalent to the following:
class Foo;
Foo * make_Foo();
int main()
{
Foo * my_foo = make_Foo();
if (my_foo)
{
// ... Do things with the my_foo pointer
}
return 0;
}
This is today’s subject: assignment is an expression.
How does it work?
What is the value of such expression?
If we run the following code, we’ll have the answer:
#include <iostream>
int main()
{
int foo = 2;
std::cout << (foo = 3) << std::endl;
return 0;
}
The standard output prints 3.
So we can say that the assignment expression is evaluated as the assigned variable, after it is assigned [1].
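Because the assignment yields a value, it can feed a larger expression. Here is a minimal sketch of the classic read-loop idiom. The functions `next_value` and `sum_until_zero` are hypothetical helpers written for this illustration only:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for this sketch: returns the next element of `data`,
// or 0 once `pos` has reached the end.
int next_value(const std::vector<int>& data, std::size_t& pos)
{
    return pos < data.size() ? data[pos++] : 0;
}

// Sums the values until the first 0 (or the end of `data`).
// The assignment `value = next_value(...)` is itself an expression,
// so its result can be compared against 0 directly.
int sum_until_zero(const std::vector<int>& data)
{
    std::size_t pos = 0;
    int value = 0;
    int sum = 0;
    while ((value = next_value(data, pos)) != 0)
    {
        sum += value;
    }
    return sum;
}
```

The extra parentheses around the assignment are needed because `!=` binds tighter than `=`.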
One typo away from catastrophe
Let’s say that we have three variables, a, b and c. We want the value of a to be true if (and only if) b and c are equal.
So we will write this:
bool a, b, c;
// ...
a = b == c;
But we are only one character away from a serious typo:
bool a, b, c;
// ...
a = b = c;
This code will compile, but won’t give you the intended result. How come?
The expression a = b = c contains two assignment operations. According to the C++ Operator Precedence Table, the associativity of = is from right to left, so the expression a = b = c is equivalent to a = (b = c).
Since (b = c) is evaluated (as seen earlier) as the value of b after assignment, a = b = c; is equivalent to b = c; a = b;
If you then use a as a boolean, it will be evaluated as true if (and only if) c is also true.
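To make the difference concrete, here is a small sketch that puts the intended comparison and the typo side by side. The function names `with_comparison` and `with_typo` are made up for this illustration:

```cpp
// Intended version: a is true exactly when b and c are equal.
bool with_comparison(bool b, bool c)
{
    bool a = false;
    a = b == c;   // parsed as a = (b == c)
    return a;
}

// Typo version: parsed as a = (b = c), so the local b is overwritten
// with c, and a simply ends up with the value of c.
bool with_typo(bool b, bool c)
{
    bool a = false;
    a = b = c;
    return a;
}
```

With b and c both false, `with_comparison` returns true (they are equal) while `with_typo` returns false (the value of c).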
Conclusion about a = b = c
There may be cases where this syntax (with two = within a single expression) is useful, but most of the time I find it obscure and confusing.
As of today, there is no way to efficiently prevent a typo like this from happening (adding parentheses will not stop the code from compiling if the typo is made). All you can do is keep your eyes wide open and use constant variables as much as possible (if b is const, then the code fails to compile if there is a typo) [2].
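The protection offered by const can be checked at compile time. The sketch below uses std::is_assignable from `<type_traits>`. The function `equal_flags` is a hypothetical example, not code from a real project:

```cpp
#include <type_traits>

// std::is_assignable<T, U> tells whether `declval<T>() = declval<U>()` compiles.
static_assert(std::is_assignable<bool&, bool>::value,
              "a plain bool accepts assignment");
static_assert(!std::is_assignable<const bool&, bool>::value,
              "a const bool rejects assignment");

// Hypothetical example: b and c are const, so the typo cannot slip through.
bool equal_flags(const bool b, const bool c)
{
    bool a = false;
    a = b == c;      // fine: b is only read
    // a = b = c;    // would not compile: b is const and cannot be assigned
    return a;
}
```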
The assignment operation returns an lvalue
I’ll finish this article by showing that the assignment expression is an lvalue.
Let’s take a = b = c again and add parentheses around a = b.
#include <iostream>
int main()
{
int a = 1, b = 2, c = 3;
(a = b) = c;
std::cout << a << b << c << std::endl;
return 0;
}
This compiles and prints the following result: 323.
That means that a has been assigned the value of b, then the value of c. The expression a = b is indeed an lvalue.
void foo(int&);
int main()
{
int a = 1, b = 2;
foo(a = b); // Compiles because `a = b` is an lvalue
foo(3); // Does not compile because `3` is an rvalue
return 0;
}
More specifically, the assignment operation (on fundamental types) returns a reference to the assigned variable, that is, to its left operand.
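This can be verified directly. The sketch below checks the type of the expression with decltype, then checks that the result designates the same object as the left operand. The function name `assignment_refers_to_left_operand` is invented for this illustration:

```cpp
#include <type_traits>
#include <utility>

// For an int left operand, the type of the assignment expression is int&.
static_assert(std::is_same<decltype(std::declval<int&>() = 0), int&>::value,
              "built-in assignment yields an lvalue of the left operand's type");

// Returns true if the result of `a = b` refers to `a` itself.
bool assignment_refers_to_left_operand()
{
    int a = 1, b = 2;
    int& result = (a = b);   // binds because `a = b` is an lvalue
    return &result == &a && a == 2;
}
```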
Assignment operation for user-defined types
However, when you define an operator=, the standard allows you to return any type you want (refer to the Canonical implementations section of operator overloading – cppreference.com for more details [3]).
You can of course return a reference to the assigned object, like so:
struct Foo
{
Foo& operator=(const Foo&) { return *this; }
};
int main()
{
Foo a, b, c;
a = b = c;
return 0;
}
You can also return a value instead of a reference:
struct Foo
{
Foo operator=(const Foo&) { return *this; }
};
int main()
{
Foo a, b, c;
a = b = c; // Also works, but a copy is made
return 0;
}
Since the result is returned by value, the expression b = c becomes an rvalue. Indeed, if you now try to bind a non-const reference to it, you get a compilation error:
struct Foo
{
Foo operator=(const Foo& other)
{
val = other.val;
return *this;
}
int val;
};
int main()
{
Foo b = {1}, c = {2};
Foo & a = b = c; // Does not compile because here, (b = c) is an rvalue
return 0;
}
This code would compile if operator= returned a Foo& instead of a Foo.
You can also return nothing (using void as the return type). In that case, you cannot write a = b = c at all.
struct Foo
{
void operator=(const Foo&) { }
};
int main()
{
Foo a, b, c;
a = b = c; // Does not compile because (b = c) returns nothing
return 0;
}
This can be used as a good safeguard against the a = b = c syntax [4].
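The three return types can be compared at compile time. In the sketch below, the types `ByRef`, `ByValue` and `ByVoid`, the alias `AssignResult` and the function `is_chainable` are all names invented for this illustration:

```cpp
#include <type_traits>
#include <utility>

struct ByRef   { ByRef& operator=(const ByRef&) { return *this; } };
struct ByValue { ByValue operator=(const ByValue&) { return *this; } };
struct ByVoid  { void operator=(const ByVoid&) { } };

// Type of the expression `x = y` for two objects of type T.
template <typename T>
using AssignResult = decltype(std::declval<T&>() = std::declval<const T&>());

// True if `a = (b = c)` compiles for objects of type T.
template <typename T>
constexpr bool is_chainable()
{
    return std::is_assignable<T&, AssignResult<T>>::value;
}

static_assert(std::is_same<AssignResult<ByRef>, ByRef&>::value,
              "returning a reference gives an lvalue");
static_assert(std::is_same<AssignResult<ByValue>, ByValue>::value,
              "returning by value gives an rvalue");
static_assert(std::is_void<AssignResult<ByVoid>>::value,
              "returning void gives no value at all");

static_assert(is_chainable<ByRef>(), "a = b = c compiles");
static_assert(is_chainable<ByValue>(), "a = b = c compiles, with a copy");
static_assert(!is_chainable<ByVoid>(), "a = b = c is rejected");
```

If all the static_assert lines compile, the three behaviors described above hold for these types.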
About declarations
There are specific cases where you can write a declaration inside another statement, in the place where we used an assignment earlier.
You can use this specific syntax in most flow control statements (like if, while, switch and, of course, for). It is not allowed inside a function call: a function argument must be an expression, and a declaration is not one.
For instance, the very first example of this post can also be written like this [5]:
class Foo;
Foo * make_Foo();
int main()
{
if (Foo * my_foo = make_Foo())
{
// ... Do things with the my_foo pointer
}
return 0;
}
However, the declaration itself is neither an lvalue nor an rvalue, because a declaration is not an expression.
You can’t write this:
int main()
{
int a = 1, c = 3;
a = (int b = c); // Does not compile
return 0;
}
nor this:
int main()
{
int b = 2, c = 3;
(int a = b) = c; // Does not compile
return 0;
}
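In the standard’s grammar, the places that accept a declaration are named “init-statement” (and “condition” for the declaration inside the parentheses of an if, while or switch), so when you see init-statement written in a syntax description, you know you can put a declaration in there.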
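As an illustration, here is a sketch with a declaration in a while condition, in an if condition and in a for init-statement. The functions `element_at`, `sum_with_while`, `first_or` and `sum_with_for` are hypothetical and only serve this example:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pointer to the element at `pos`, or nullptr past the end.
const int* element_at(const std::vector<int>& data, std::size_t pos)
{
    return pos < data.size() ? &data[pos] : nullptr;
}

int sum_with_while(const std::vector<int>& data)
{
    int sum = 0;
    std::size_t pos = 0;
    // Declaration used as the condition of a while:
    // `item` is initialized and tested anew on each iteration.
    while (const int* item = element_at(data, pos))
    {
        sum += *item;
        ++pos;
    }
    return sum;
}

int first_or(const std::vector<int>& data, int fallback)
{
    // Declaration used as the condition of an if:
    // `item` does not exist after the if statement.
    if (const int* item = element_at(data, 0))
    {
        return *item;
    }
    return fallback;
}

int sum_with_for(const std::vector<int>& data)
{
    int sum = 0;
    // Declaration in the init-statement of a for.
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        sum += data[i];
    }
    return sum;
}
```

In each case, the declared variable is scoped to the statement that declares it, which keeps it from leaking into the rest of the function.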
Wrapping up
Syntaxes like a = b = c and if (a = b) are intentional and clearly defined in the standard. However, they are alien to most developers and are so rarely used that they can easily be misleading.
Bugs can occur because the symbol = looks a lot like ==, so be wary of that. If you want to avoid it with your user-defined types, you can declare the operator= function as returning void so that the a = b = c syntax becomes invalid, but this is not possible with fundamental types, and it is a constraint of its own.
Thanks for reading and see you next time!
Addendum
Notes
1. It is in fact evaluated as a reference to the variable, and not the value of the variable. This will be demonstrated further in the article.
2. You can actually activate specific warnings to catch specific cases (for instance, -Wparentheses can be used with GCC to flag an assignment used as the condition of a flow control statement), but that doesn’t cover every case (typically, a = b = c does not trigger a warning) and sometimes you may not want to activate them, depending on your appreciation of this syntax.
3. The site cppreference.com says that “for example, assignment operators return by reference to make it possible to write a = b = c = d, because the built-in operators allow that.” However, there is no mention of this specific intention in the 4th edition of The C++ Programming Language by Bjarne Stroustrup. I suspect this is a free interpretation.
4. You can, as you may have guessed, also return any type you want, if you have very specific needs. The prototype int operator=(const Foo&); (member function of class Foo) is valid. This can be useful, for instance, if you want to return an error code.
5. There is a practical difference in terms of scope (which is not today’s subject): in this last version, the my_foo variable only lives within the if block, whereas in the first two examples, it lives through the scope of main. But since the behavior is the same in this specific case (because there is nothing after the if block), I deem it not necessary to elaborate.