a = b = c, a strange consequence of operator associativity

Author: Chloé Lourseyre
Editor: Peter Fordham

Case study

If you code in C++ regularly, you probably have encountered the following syntax:

class Foo;
Foo * make_Foo();
int main()
{
    Foo * my_foo;
    if (my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

In terms of semantics, this code is equivalent to the following:

class Foo;
Foo * make_Foo();
int main()
{
    Foo * my_foo = make_Foo();
    if (my_foo)
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

This is today’s subject: assignment is an expression.

How does it work?

What is the value of such an expression?

If we run the following code, we’ll have the answer:

#include <iostream>

int main()
{
    int foo = 2;
    std::cout << (foo = 3) << std::endl;
    return 0;
}

The standard output prints 3.

So we can say that an assignment expression evaluates to the assigned variable, after the assignment has taken place [1].

One typo away from catastrophe

Let’s say that we have three variables, a, b and c, and that we want the value of a to be true if (and only if) b and c are equal.

So we will write this:

bool a, b, c;
// ...
a = b == c;

But, we are not very far away from a serious typo. This one:

bool a, b, c;
// ...
a = b = c;

This code will compile, and won’t give you the intended result. How come?

The expression a = b = c contains two assignment operations within one expression. According to the C++ Operator Precedence Table, the associativity of = is from right to left. So the expression a = b = c is equivalent to a = (b = c).

Since (b = c) is evaluated (as seen earlier) as the value of b after assignment, a = b = c; is equivalent to b = c; a = b;

If you then use a as a boolean, it will be evaluated as true if (and only if) c is also true.

Conclusion about a = b = c

There may be cases where this syntax (with two = within a single expression) is useful, but most of the time I find it obtuse and confusing.

As of today, there is no way to efficiently prevent a typo like this from happening (adding parentheses will not prevent compilation if the typo is made). All you can do is open your eyes wide and use constant variables as much as possible (if b is const, then the code fails to compile if there is a typo) [2].

The assignment operation returns an lvalue

I’ll finish this article by showing that the assignment expression is an lvalue.

Let’s take the a = b = c back and add parentheses around the a = b.

#include <iostream>

int main()
{
    int a = 1, b = 2, c = 3;
    (a = b) = c;
    std::cout << a << b << c << std::endl;
    return 0;
}

This compiles and prints the following result: 323.

That means that a has been assigned the value of b, then the value of c. The expression a = b is indeed an lvalue.

void foo(int&);
int main()
{
    int a = 1, b = 2;
    foo(a = b); // Compiles because `a = b` is an lvalue
    foo(3); // Does not compile because `3` is an rvalue
    return 0;
}

More specifically, the assignment operation (of fundamental types) returns a reference to the resulting variable.

Assignment operation for user-defined types

However, when you define an operator=, the standard allows you to return any type you want (refer to the Canonical implementations section of operator overloading – cppreference.com for more details [3]).

You can of course return a reference to the assigned object, like so:

struct Foo
{
    Foo& operator=(const Foo&) { return *this; }
};
int main()
{
    Foo a, b, c;
    a = b = c;
    return 0;
}

You can also return a value instead of a reference:

struct Foo
{
    Foo operator=(const Foo&) { return *this; }
};
int main()
{
    Foo a, b, c;
    a = b = c; // Also works, but a copy is made
    return 0;
}

Since the result is copied, the assignment b = c becomes an rvalue. Indeed, if you now try to take a reference out of it, you have a compilation error:

struct Foo
{
    Foo operator=(const Foo& other) 
    { 
        val = other.val; 
        return *this; 
    }
    int val;
};
int main()
{
    Foo b = {1}, c = {2};
    Foo & a = b = c; // Does not compile because here, (b = c) is an rvalue
    return 0;
}

This code would compile if operator= returned a Foo& instead of a Foo.

You can also return nothing (using void as return value). In that case, you cannot write a = b = c at all.

struct Foo
{
    void operator=(const Foo&) {  }
};
int main()
{
    Foo a, b, c;
    a = b = c; // Does not compile because (b = c) returns nothing
    return 0;
}

This can be used as a good safeguard against the a = b = c syntax [4].

About declarations

There are specific cases where you can write a declaration within another statement, in the place where we used an assignment earlier.

You can use this syntax in the condition of most flow control statements (like if, while, switch and, of course, for).

For instance, the very first example of this post can also be written like this [5]:

class Foo;
Foo * make_Foo();
int main()
{
    if (Foo * my_foo = make_Foo())
    {
        // ... Do things with the my_foo pointer
    }
    return 0;
}

However, a declaration is not an expression, so it is neither an lvalue nor an rvalue.

You can’t write this:

int main()
{
    int a = 1, c = 3;
    a = (int b = c); // Does not compile
    return 0;
}

nor this:

int main()
{
    int b = 2, c = 3;
    (int a = b) = c; // Does not compile
    return 0;
}

In the grammar of the standard, the places that accept a declaration are named “condition” and “init-statement”, so when you see one of these written in the syntax of a statement, you know you can put a declaration in there.

Wrapping up

Syntaxes like a = b = c and if (a = b) are intentional and clearly defined in the standard. However, they are alien to most developers and are so rarely used that they can easily be misleading.

Bugs can occur because the symbol = really looks like ==, so be wary of that. If you want to avoid it with your user-defined types, you can declare the operator= function returning void so the a = b = c syntax becomes invalid, but this is not possible with fundamental types, and is a constraint on its own.

Thanks for reading and see you next time!

Addendum

Notes

  1. It is in fact evaluated as a reference to the variable, and not the value of the variable. This will be demonstrated further in the article.
  2. You can actually activate specific warnings to prevent specific cases (for instance, -Wparentheses can be used with GCC to avoid assignment inside a flow control statement), but that doesn’t cover every case (typically, no warning covers a = b = c) and sometimes you may not want to activate them depending on your appreciation of this syntax.
  3. The site cppreference.com says that “for example, assignment operators return by reference to make it possible to write a = b = c = d, because the built-in operators allow that.”. However, there is no mention of this specific intention in the 4th edition of The C++ Programming Language by Bjarne Stroustrup. I suspect this is a free interpretation.
  4. You can, as you may have guessed, also return any type you want, if you have very specific needs. The prototype int operator=(const Foo&); (member function of class Foo) is valid. This can be useful, for instance, if you want to return an error code.
  5. There is a difference in pragmatics in terms of scope (that is not today’s subject), because in the last example, the my_foo variable only lives within the if block, whereas in the first examples, it lives through the scope of main. But since it’s technically the same in this specific case (because there is nothing after the if block), I deem it not necessary to elaborate.

3 interesting behaviors of C++ casts

Author: Chloé Lourseyre
Editor: Peter Fordham

This article is a little compilation [1] of strange behaviors in C++ that would not make a long enough article on their own.

Static casting an object into their own type can call the copy constructor

When you use static_cast, by default (i.e. without optimizations activated) it calls the conversion constructor of the type you are trying to cast into (if it exists).

For instance, in this code:

class Foo;
class Bar;

int main()
{
    Bar bar;
    static_cast<Foo>(bar);
}

The static_cast expression would call the following constructor (if existent): Foo(const Bar &).

So far so good, and there is a good chance that you already knew that.

But do you know what happens if you try to static cast an object into its own type?

Let’s take the following code:

struct Foo
{
    Foo(): vi(0), vf(0) {};
    Foo(const Foo & other): vi(other.vi), vf(other.vf) {};
    long vi;
    double vf;
};

int main()
{
    Foo foo1, foo2, foo3;
    foo2 = foo1;    
    foo3 = static_cast<Foo>(foo1);

    return 0;
}

And look at the assembly generated for lines 12 and 13 of this code (the two assignments in main):

Line 12 (foo2 = foo1;)

        mov     rax, QWORD PTR [rbp-32]
        mov     rdx, QWORD PTR [rbp-24]
        mov     QWORD PTR [rbp-48], rax
        mov     QWORD PTR [rbp-40], rdx

Line 13 (foo3 = static_cast<Foo>(foo1);)

        lea     rdx, [rbp-32]
        lea     rax, [rbp-16]
        mov     rsi, rdx
        mov     rdi, rax
        call    Foo::Foo(Foo const&) [complete object constructor]
        mov     rax, QWORD PTR [rbp-16]
        mov     rdx, QWORD PTR [rbp-8]
        mov     QWORD PTR [rbp-64], rax
        mov     QWORD PTR [rbp-56], rdx

We can see that when we static cast the object foo1, it calls the copy constructor of Foo as if the copy constructor was actually a “conversion constructor of a type into itself”.

(Done using GCC 11.2 x86-64, Compiler Explorer (godbolt.org))

Of course, this behavior will disappear as soon as you put an optimization option in the compiler.

This is typically useless knowledge [2] and something you don’t encounter often in real life (I happen to have encountered it once, but this was an unfortunate accident).

Static casts can call several conversion constructors

Talking about conversion constructors, they can seem to be transitive when static_cast is used.

Take the following classes:

struct Foo
{  Foo() {};  };

struct Bar
{  Bar(const Foo & other) {};  };

struct FooBar
{  FooBar(const Bar & other) {};  };

struct BarFoo
{  BarFoo(const FooBar & other) {};  };

We have four types: Foo, Bar, FooBar, and BarFoo. The conversion constructors say we can convert a Foo into a Bar, a Bar into a FooBar, and a FooBar into a BarFoo.

If we try to execute the following code:

int main()
{
    Foo foo;
    BarFoo barfoo = foo;
    return 0;
}

There is a compilation error on line 4: conversion from 'Foo' to non-scalar type 'BarFoo' requested.

However, if we static_cast foo into a FooBar, as such:

int main()
{
    Foo foo;
    BarFoo barfoo = static_cast<FooBar>(foo);
    return 0;
}

The program compiles.

If we now take a look at the assembly code associated with line 4:

        lea     rdx, [rbp-3]
        lea     rax, [rbp-1]
        mov     rsi, rdx
        mov     rdi, rax
        call    Bar::Bar(Foo const&) [complete object constructor]
        lea     rdx, [rbp-1]
        lea     rax, [rbp-2]
        mov     rsi, rdx
        mov     rdi, rax
        call    FooBar::FooBar(Bar const&) [complete object constructor]
        lea     rdx, [rbp-2]
        lea     rax, [rbp-4]
        mov     rsi, rdx
        mov     rdi, rax
        call    BarFoo::BarFoo(FooBar const&) [complete object constructor]

There are no fewer than three conversions generated by that single statement.

(Done using GCC 11.2 x86-64, Compiler Explorer (godbolt.org))

Hold up!

You may be wondering why I didn’t cast foo into a BarFoo and I only cast it into a FooBar using the static_cast.

If we try and compile the following code:

int main()
{
    Foo foo;
    BarFoo barfoo = static_cast<BarFoo>(foo);
    return 0;
}

We end up with a compilation error!

<source>:16:44: error: no matching function for call to 'BarFoo::BarFoo(Foo&)'

In fact, static_cast is not transitive

What really happens is the following:

The expression static_cast<FooBar>(foo) tries to call the following constructor: FooBar(const Foo&). However, it doesn’t exist: the only conversion constructor FooBar has is FooBar(const Bar&). But there is a conversion available from Foo to Bar, so the compiler implicitly converts foo into a Bar to call FooBar(const Bar&).

Then we try to assign the resulting FooBar to a BarFoo. Or, more precisely, we try to construct a BarFoo using a FooBar, which calls the BarFoo(const FooBar&) constructor.

That is why there is a compilation error when we try to cast a Foo directly into a BarFoo.

In fact, static_cast is not really transitive.

What to do with this information?

Implicit conversion can happen anywhere. Since static_cast (and any cast) is, pragmatically [3], a “function call” (in the sense that it takes an argument and returns a value), it gives the compiler two opportunities to try an implicit conversion.

The behavior of C-style casts

Using C-style casts is a fairly widespread bad practice in C++. It really should have made it into this old article: A list of bad practices commonly seen in industrial projects.

Many C++ developers don’t understand the intricacies of what C-style casts actually do.

How do casts work in C?

If I remember right, casts in C have three uses.

First, they can convert one scalar type into another, like this:

int toto = 42;
printf("%f\n", (double)toto);

But this can only be used to convert scalar types. If we try to convert a C struct into another using a cast:

#include <stdio.h>

typedef struct Foo
{
    int toto;
    long tata;
} Foo;

typedef struct Bar
{
    long toto;
    double tata;
} Bar;


int main()
{
    Foo foo;
    foo.toto = 42;
    foo.tata = 666;
    
    Bar bar = (Bar)foo;
    
    printf("%ld %f", bar.toto, bar.tata);

    return 0;
}

We obtain the following compilation error:

main.c:22:5: error: conversion to non-scalar type requested
   22 |     Bar bar = (Bar)foo;
      | 

(Source: GDB online Debugger | Code, Compile, Run, Debug online C, C++ (onlinegdb.com))

Second, they can be used to reinterpret a pointer into a pointer of another type, like this:

#include <stdio.h>

typedef struct Foo
{
    int toto;
    long tata;
    int tutu;
} Foo;

typedef struct Bar
{
    long toto;
    int tata;
    int tutu;
} Bar;


int main()
{
    Foo foo;
    foo.toto = 42;
    foo.tata = 666;
    foo.tutu = 1515;
    
    Bar* bar = (Bar*)&foo;
    
    printf("%ld %d %d", bar->toto, bar->tata, bar->tutu);

    return 0;
}

This prints the following output [4]:

42 666 0

(Source: GDB online Debugger | Code, Compile, Run, Debug online C, C++ (onlinegdb.com))

And finally, they can be used to add or remove a const qualifier:

#include <stdio.h>

int main()
{
    const int toto = 1;
    int * tata = (int*)(&toto);
    *tata = 42;
    
    printf("%d", toto);

    return 0;
}

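This prints 42 here. Be careful though: modifying an object that was declared const is undefined behavior, so another compiler (or another optimization level) may print something else.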

(Source: GDB online Debugger | Code, Compile, Run, Debug online C, C++ (onlinegdb.com))

This also works on structs.

And that’s pretty much all [5].

So what happens in C++?

C++ has its own cast operators (mainly static_cast, dynamic_cast, const_cast, and reinterpret_cast, but also many other casts like *_pointer_cast, etc.)

But C++ was also intended to be backward-compatible with C (at first). So we needed a way to implement the C-style casts so that they would work similarly to C casts, all in the C++ new way of casting.

So in C++, when you do a C-style cast, the compiler tries each one of the five following cast operations (in that order), and uses the first that works:

  • const_cast
  • static_cast
  • static_cast followed by const_cast
  • reinterpret_cast
  • reinterpret_cast followed by const_cast

More details here: Explicit type conversion – cppreference.com.

Why is this actually bad?

Most C++ developers agree that it is really bad practice to use C-style casts in C++. The reason is that a C-style cast does not make explicit what the compiler will do. It will often compile even when there is an error, and thereby silence that error. You always want only one of these casts, so you should name it explicitly. That way, if there is any mistake, there’s a good chance the compiler will catch it. Objectively, there are absolutely no upsides to using a C-style cast.

Here is a longer argument against C-style casts: Coding Standards, C++ FAQ (isocpp.org).

Wrapping up

Casting is a delicate operation. It can be costly (more than you think, because it gives room for implicit conversions) and, still today, there are a lot of people using C-style casts without knowing how bad they are.

It is tedious, but we need to understand how casts work and the specificities of each one.

Thanks for your attention and see you next time!

Addenda

Notes

  1. Pun intended.
  2. If you know a case where it is useful, please share in the comments.
  3. In the linguistic field, pragmatics is the study of context (complementary to semantics, which studies the meaning, and many other fields). In terms of programming languages, pragmatics can be interpreted as how features interact with others in a given context. In our example, a static_cast can hardly be considered a function call in the semantic sense, but it acts as one in the interactions it has with its direct environment (as explained in the paragraph). The technical truth is in-between: for POD it is not a function call, but for classes that define a copy-constructor it is.
  4. I won’t explain in detail why it prints 0 instead of 1515 for the value of tutu: just know that because we reinterpret the data stored in memory, reading a Foo as if it was a Bar leads to errors.
  5. I am not as fluent in C as I am in C++. I may have forgotten another use of C casts. If so, please contribute in the comments.

Yet another reason to not use printf (or write C code in general)

Author: Chloé Lourseyre

Recently, Joe Groff @jckarter tweeted a very interesting behavior inherited from C:

Obviously, it’s a joke, but we’re gonna talk more about what’s happening in the code itself.

So, what’s happening?

Just to be 100% clear, double(2101253) does not actually double the value of 2101253. It’s a cast from int to double.

If we write this differently, we can obtain this:

#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d\n", double(42));
}

On the x86_64 gcc 11.2 compiler, the output is as follows:

666
4202506

So we can see that the value 4202506 has nothing to do with the 666 nor the 42 values.

In fact, if we launch the same code in the x86_64 clang 12.0.1 compiler, things are a little bit different:

666
4202514

You can see the live results here: https://godbolt.org/z/c6Me7a5ee

You may have guessed it already, but this comes from line 5, where we print a double as an int. But this is not some kind of conversion error (of course your computer knows how to convert from double to int, and it would do it fine if that was what was happening); the issue comes from somewhere else.

The truth

If we want to understand why it works that way, we’ll have to take a look at the assembly code (https://godbolt.org/z/5YKEdj73r):

.LC0:
        .string "%d\n"
main:
        push    rbp
        mov     rbp, rsp
        mov     esi, 666
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     rax, QWORD PTR .LC1[rip]
        movq    xmm0, rax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 1
        call    printf
        mov     eax, 0
        pop     rbp
        ret
.LC1:
        .long   0
        .long   1078263808

(use this Godbolt link to have a clearer matching between the C++ code and the assembly instructions: https://godbolt.org/z/5YKEdj73r)

In the first part of the assembly code (lines 6 to 9, the equivalent of printf("%d\n", 666);) we can see that everything’s fine: the 666 value is put in the esi register and then the function printf is called. So it’s an educated guess to say that when the printf function reads a %d in the string it is given, it’ll look in the esi register for what to print.

However, we can see in the second part of the code (lines 10 to 14, the equivalent of printf("%d\n", double(42));) that the value is put in another register: the xmm0 register. Since it is given the same string as before, it’s a fair guess that the printf function will look into the esi register again, whatever is in there.

We can prove that statement pretty easily. Take the following code:

#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d %d\n", double(42), 24);
}

It’s the same code, with an additional integer that is printed in the second printf instruction.

If we look at the assembly (https://godbolt.org/z/jjeca8qd7):

.LC0:
        .string "%d %d\n"
main:
        push    rbp
        mov     rbp, rsp
        mov     esi, 666
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     rax, QWORD PTR .LC1[rip]
        mov     esi, 24
        movq    xmm0, rax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 1
        call    printf
        mov     eax, 0
        pop     rbp
        ret
.LC1:
        .long   0
        .long   1078263808

The double(42) value still goes into the xmm0 register, and the 24 integer, logically, ends up in the esi register. Thus, this happens in the output:

666
24 0

Why? Well, since we asked for two integers, the printf call will look into the first integer register (esi) and print its content (24, as we stated above), then look in the following integer register (edx) and print whatever is in it (incidentally 0).

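In the end, the behavior we see occurs because of how arguments are passed on the x86_64 architecture: with the calling convention used here (the System V AMD64 ABI), integer arguments and floating-point arguments travel in two separate sets of registers.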

What does the doc say?

The truth is that according to the reference (printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s – cppreference.com):

If a conversion specification is invalid, the behavior is undefined.

And this same reference is unambiguous about the %d conversion specifier:

converts a signed integer into decimal representation [-]dddd.
Precision specifies the minimum number of digits to appear. The default precision is 1.
If both the converted value and the precision are ​0​ the conversion results in no characters.

So, giving a double to a printf argument where you are supposed to give a signed integer is UB. So it was our mistake to write this in the first place.

This actually generates a warning with clang. But with gcc, you’ll have to activate -Wall to see any warning about that.

Wrapping up

The C language is a very, very old language. It’s older than C++ (obviously), which is itself very old. As a reminder, the first edition of the K&R was printed in 1978. This was thirteen years before my own birth. And unlike us humans, programming languages don’t age well.

I could have summarized this article with a classic “don’t perform UB”, but I think it’s a bit off-purpose this time. So I’ll go and say it: don’t use printf at all.

The problem is not with printf itself, it’s with using a feature from another language [1] that was originally published forty-three years ago. In short: don’t write C code.

Thanks for reading and see you next week!

1. Yeah, like it or not, but C and C++ are different languages. Different purpose, different intentions, different meta. That is exactly why I always decline job offers that have the tag “C/C++” because they obviously can’t pick a side.

About sizes

Author: Chloé Lourseyre

If I came up with a pop quiz about sizes in C++, most C++ developers would fail it (I know I would), because sizes in C++ are complicated.

The sizes of the fundamental types are not fixed; they are implementation-defined.

Still, the standard defines constraints on these sizes. Each constraint takes one of these forms:

  • A comparison of the types’ sizeof.
  • The type’s minimum number of bits.

What is sizeof()?

One of the most widespread (and harmless) misconceptions about type sizes is that a byte holds 8 bits.

Although this is mostly true in practice, this is technically false.

A byte is actually defined as the size of a char. Though a char must have at least 8 bits, it can hold more. sizeof(N) returns the number of bytes of the type N, and every type size is expressed in terms of a char length. Thus, a 4-byte int is at least 32-bit, but you may not assume more from it (it could be 128-bit if a char is 32-bit).

The actual number of bits in a byte is recorded in the CHAR_BIT macro (defined in <climits>).

Summary of sizes in C++

Here are all the definitions and restrictions of fundamental types’ sizes in C++:

  • 1 ≡ sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long)
  • 1 ≤ sizeof(bool) ≤ sizeof(long)
  • sizeof(char) ≤ sizeof(wchar_t) ≤ sizeof(long)
  • sizeof(float) ≤ sizeof(double) ≤ sizeof(long double)
  • sizeof(N) ≡ sizeof(unsigned N) ≡ sizeof(signed N)
  • A char has at least 8 bits
  • A short has at least 16 bits
  • A long has at least 32 bits

… and nothing more.

Fun fact: according to this definition, it is technically possible that all fundamentals have 32 bits.

Two words of wisdom

Since the sizes of the fundamental types entirely depend on the architecture, it may sometimes be hard to write consistent code.

#include <limits>

The <limits> header of the standard library contains the upper and lower limits of every fundamental type, and allows you to know if a given type is signed or not.

Example:

#include <limits>
#include <iostream>

int main()
{
    std::cout << "largest double == " << std::numeric_limits<double>::max() << std::endl;
    std::cout << "char is signed == " << std::numeric_limits<char>::is_signed << std::endl;
}

More at std::numeric_limits – cppreference.com.

Reminder: signed integer overflow is undefined behavior. Using limits helps you in preventing that from happening.

#include <cstdint>

Sometimes you may want to deal with fixed-size types (in terms of bits) and not rely on the implementation specifics. For instance, if you implement serialization, work on very limited memory space, or develop cross-platform software, you may want to explicitly provide the bit-size of the type you’re using.

You can do so with the <cstdint> header, which contains fixed-size types.

Here are a few of them:

  • int8_t, int16_t, int32_t, int64_t: signed integer type with width of exactly 8, 16, 32 and 64 bits respectively, with no padding bits and using 2’s complement for negative values (provided only if the implementation directly supports the type)
  • int_least8_t, int_least16_t, int_least32_t, int_least64_t: smallest signed integer type with width of at least 8, 16, 32 and 64 bits respectively
  • intmax_t: maximum-width signed integer type
  • uint8_t, uint16_t, uint32_t, uint64_t: unsigned integer type with width of exactly 8, 16, 32 and 64 bits respectively (provided only if the implementation directly supports the type)
  • uint_least8_t, uint_least16_t, uint_least32_t, uint_least64_t: smallest unsigned integer type with width of at least 8, 16, 32 and 64 bits respectively
  • uintmax_t: maximum-width unsigned integer type

More at Fixed width integer types (since C++11) – cppreference.com.

Wrapping up

If you want to read more about type size, refer to section §6.2.8 of The C++ Programming Language (Bjarne Stroustrup). More broadly, you can read about types and declarations in the whole section §6 of the book.

You can also refer to Fundamental types – cppreference.com for online documentation.

Thanks for reading and see you next week!

Lambdas as const ref

Author: Chloé Lourseyre

Context

This week, I will present a piece of code I bumped into not so long ago.

It looked like this:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int>& v)
{
    const auto& my_func = [] (int i)->bool
    {
        return i%2==0; 
    };

    return std::count_if(std::cbegin(v), std::cend(v), my_func);
}

Seeing this, Visual Studio was unsure whether it should compile or not.

And by unsure, I mean that a compilation error stating something like “my_func is used but it is already destroyed” was periodically prompted then discarded during the compilation of the library. But in the end, it compiled.

When I saw this, I thought two things:

“Wait, we can bind a temporary to a const ref?”

and

“What’s the use of binding a lambda to a const ref?”

These are the questions I will answer today.

Binding a rvalue to a const reference

In short: yes, you can bind an rvalue to a const ref.

Intuitively, I would have said that trying to do so would only result in a dangling reference, but it does not.

This language is smart. I tend to forget it, but it is only logical that if you bind a const ref to a temporary object, the lifetime of that object is extended to match the lifetime of the reference: the object is put on the stack and accessed via the const ref.

To illustrate this, here are two functions:

void foo()
{
    const int& i = 1;
    const int& j = i+1;
}

void bar()
{
    int iv = 1;
    const int& i = iv;
    int jv = i+1;
    const int& j = jv;
}

The function foo directly binds the rvalues to const refs, whereas the function bar instantiates named variables before binding them to const refs (thus binding an lvalue to a const ref).

The assembly code that clang generates for this piece of code is the following:

foo():                                # @foo()
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 12], 1
        lea     rax, [rbp - 12]
        mov     qword ptr [rbp - 8], rax
        mov     rax, qword ptr [rbp - 8]
        mov     eax, dword ptr [rax]
        add     eax, 1
        mov     dword ptr [rbp - 28], eax
        lea     rax, [rbp - 28]
        mov     qword ptr [rbp - 24], rax
        pop     rbp
        ret
bar():                                # @bar()
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 1
        lea     rax, [rbp - 4]
        mov     qword ptr [rbp - 16], rax
        mov     rax, qword ptr [rbp - 16]
        mov     eax, dword ptr [rax]
        add     eax, 1
        mov     dword ptr [rbp - 20], eax
        lea     rax, [rbp - 20]
        mov     qword ptr [rbp - 32], rax
        pop     rbp
        ret

These two functions are the same (apart from the stack offsets, which are not really important).

This fact is confirmed by the IBM documentation: Initialization of references (C++ only) – IBM Documentation, and, of course, by the standard (The C++11 Programming Language, §7.7.1).

This is a very simple fact that is really clear in the standard, but it is rarely used in code and rarely mentioned on the web.

The reason for that is that binding to a const ref instead of a const value – or even a plain value – seems useless.

But is it?

Binding a lambda to a const ref

Coming back to the initial context, the question was why it is legal to bind a lambda to a const ref and if it is useful.

As a reminder, here is the example I showed you earlier:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int>& v)
{
    const auto& my_func = [] (int i)->bool
    {
        return i%2==0; 
    };

    return std::count_if(std::cbegin(v), std::cend(v), my_func);
}

When we put it in C++ Insights (cppinsights.io), we obtain the following code:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int, std::allocator<int> > & v)
{
    
  class __lambda_6_27
  {
    public: 
    inline /*constexpr */ bool operator()(int i) const
    {
      return (i % 2) == 0;
    }
    
    using retType_6_27 = auto (*)(int) -> bool;
    inline /*constexpr */ operator retType_6_27 () const noexcept
    {
      return __invoke;
    };
    
    private: 
    static inline bool __invoke(int i)
    {
      return (i % 2) == 0;
    }
    
    public: 
    // inline /*constexpr */ __lambda_6_27(const __lambda_6_27 &) noexcept = default;
    // inline /*constexpr */ __lambda_6_27(__lambda_6_27 &&) noexcept = default;
    // /*constexpr */ __lambda_6_27() = default;
    
  };
  
  const __lambda_6_27 & my_func = __lambda_6_27{};
  return static_cast<int>(std::count_if(std::cbegin(v), std::cend(v), __lambda_6_27(my_func)));
}

As you might have guessed, a lambda function is in fact a functor (here called __lambda_6_27). Thus, the initialization constructs a temporary object of that functor type, which is an rvalue.

We just saw that an rvalue can be bound to a const ref, which is why binding a lambda to a const ref is legal.

Performance and optimization

To answer the question of whether we should bind a lambda to a const ref instead of a value, we have to evaluate whether one method is faster than the other.

Execution time

I’ll use Quick C++ Benchmarks (quick-bench.com) to evaluate execution time.

Here are the snippets I’ll use:

std::vector<int> v = {0,1,2,3,4};

static void ConstRef(benchmark::State& state)
{
  const auto& l = [](int i)->bool{ return i%2 == 0;};
  for (auto _ : state)
  {
    std::count_if(cbegin(v), cend(v), l);
  }
}
BENCHMARK(ConstRef);

static void Plain(benchmark::State& state)
{
  auto l = [](int i)->bool{ return i%2 == 0;};
  for (auto _ : state)
  {
    std::count_if(cbegin(v), cend(v), l);
  }
}
BENCHMARK(Plain);

I ran benchmarks using Clang 11.0 and GCC 10.2, with all optimization options between -O0 and -O3.

And here are the results:

Compiler    Optimization  ET with     ET with      const ref / plain
            option        const ref   plain value  value ratio
Clang 11.0  -O0           32.306      33.732       0.958
Clang 11.0  -O1           224.96      204.92       1.097
Clang 11.0  -O2           3.9982e-6   4.0088e-6    0.997
Clang 11.0  -O3           3.7273e-6   4.1281e-6    0.903
GCC 10.2    -O0           64.379      65.017       0.990
GCC 10.2    -O1           11.754      11.871       0.990
GCC 10.2    -O2           3.7470e-6   4.0196e-6    0.932
GCC 10.2    -O3           3.6523e-6   3.9021e-6    0.936

What you want to look at is the last column, which gives the ratio between the const-ref and plain-value execution times (ET). The execution times themselves have no labeled unit, so it's useless to try and compare them directly.

A value greater than 1 means the const-ref version is slower than the plain version, while a value lower than 1 means the const-ref version is faster.

Here are the charts:

All in all we can see that while the const-ref version is faster than the value version in most configurations (Clang at -O1 being the exception), the difference doesn’t go above 10%. The highest differences in favor of the const ref are at the maximum optimization level.

If the code is in a bottleneck (in the 20% of Pareto’s law) then this 10% can make a little difference, but I wouldn’t expect more than one or two percent gain overall (including other parts of your code).

However, if this is not a bottleneck, then there’s no version to prefer over the other one.

Compilation time

Can the const ref binding affect compilation time? To answer this, I used C++ Build Benchmarks (build-bench.com).

I compiled the following snippets under the same conditions, using Clang 11.0 and GCC 10.2, with all optimization options between -O0 and -O3:

#include <algorithm>
#include <vector>

int main() 
{
    const auto& l = [](int i)->bool{ return i%2 == 0;};
    std::vector<int> v= {0,1,2,3,4};
    std::count_if(cbegin(v), cend(v), l);
}

Then, without using const ref:

#include <algorithm>
#include <vector>

int main() 
{
    auto l = [](int i)->bool{ return i%2 == 0;};
    std::vector<int> v= {0,1,2,3,4};
    std::count_if(cbegin(v), cend(v), l);
}

And here are the results:

Compiler    Optimization  BT with     BT with      const ref / plain
            option        const ref   plain value  value ratio
Clang 11.0  -O0           0.3629      0.3510       1.034
Clang 11.0  -O1           0.4010      0.4035       0.994
Clang 11.0  -O2           0.3755      0.3765       0.997
Clang 11.0  -O3           0.3745      0.3735       1.003
GCC 10.2    -O0           0.3915      0.3900       1.004
GCC 10.2    -O1           0.3830      0.3810       1.005
GCC 10.2    -O2           0.3765      0.3775       0.997
GCC 10.2    -O3           0.3765      0.3750       1.004

In any case, there is less than 4% difference in build time (BT) between the two versions, and in most cases it’s even under 1%. We can say that there is no effective difference.

Conclusion

There is no real advantage in binding an rvalue to a const ref instead of a plain value. Most of the time, you’ll prefer to use a const value just for the sake of saving a symbol (but there is no huge difference).

In case you are in a bottleneck, you might consider using const refs instead of values, but I suggest you do your own benchmarks, tied to your specific context, in order to determine which version is better.

Thanks for reading, and see you next week!

Author: Chloé Lourseyre

windows.h breaks the standard library (and my will to live)

Author: Chloé Lourseyre

While working on not-so-old code, I ended up with a strange compilation error in MS Visual Studio.

Here is the offending code:

//...

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::max() / 2;

//...

And here are the errors:

1>MyFile.cpp(42): error C2589: '(' : illegal token on the right side of '::'
1>MyFile.cpp(42): error C2059: syntax error : '::'
1>MyFile.cpp(42): error C2059: syntax error : ')'
1>MyFile.cpp(42): error C2059: syntax error : ')'

It came with a bunch of warnings I won’t even bother writing here.

It took me a hell of a lot of time to find out what was wrong. The type T_LongType was correctly defined (actually a typedef of long) and I didn’t forget to include <limits>.

Many of you may already know the culprit. It’s the following line:

#include <windows.h>

Indeed, if we look into this header, we can see a surprising piece of code:

#ifndef NOMINMAX

#ifndef max
#define max(a,b)            (((a) > (b)) ? (a) : (b))
#endif

#ifndef min
#define min(a,b)            (((a) < (b)) ? (a) : (b))
#endif

#endif  /* NOMINMAX */

In my defense, the file in which the error occurred did not include windows.h directly. It was included by another header.

Explanation

The fact that windows.h defines macros called min and max implies that, during the preprocessing phase of the compilation, every occurrence of min or max that is followed by an opening parenthesis will be treated as a macro invocation and replaced.

This means that, after the preprocessing, instead of seeing this:

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::max() / 2;

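The compiler will see something like this (the macro invocation swallows the parentheses and, since no arguments are supplied, a and b expand to nothing):

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::((() > ()) ? () : ()) / 2;

This makes no sense, hence the compilation errors mentioned above.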

Many reasons to not include windows.h

Here is a non-exhaustive list of reasons why it is bad practice to include this header:

  • It breaks the standard library just by being included. The way we use the functionality of the standard library should work whatever header files we include.
  • It forces you to define NOMINMAX at the beginning of each header that includes windows.h. And if you ever happen to forget this, every file that includes your header will have to define it.
  • Since it’s OS-dependent, it’s better to avoid it as long as you can. If you use it where you don’t need to, you won’t be able to port your code to other systems, you may pick up bad coding habits (by relying on it too much), and don’t forget that the more specific a library is, the less maintained it will be.

Conclusion

All in all there are ways to use windows.h safely, but you must be absolutely rigorous in how you include it to avoid side effects on other sources and headers.

As long as you can, don’t use it.

Author: Chloé Lourseyre