Yet another reason to not use printf (or write C code in general)

Author: Chloé Lourseyre

Recently, Joe Groff (@jckarter) tweeted a very interesting behavior inherited from C, in which passing double(2101253) to printf seems to actually double the value.

Obviously, it’s a joke, but we’re gonna talk more about what’s happening in the code itself.

So, what’s happening?

Just to be 100% clear, double(2101253) does not actually double the value of 2101253. It’s a cast from int to double.

If we write this differently, we can obtain this:

#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d\n", double(42));
}

On the x86_64 gcc 11.2 compiler, the output is as follows:

666
4202506

So we can see that the value 4202506 has nothing to do with either the 666 or the 42 value.

In fact, if we compile and run the same code with the x86_64 clang 12.0.1 compiler, the result is a little bit different:

666
4202514

You can see the live results here: https://godbolt.org/z/c6Me7a5ee

You may have guessed it already, but this comes from the second printf, where we print a double as an int. But this is not some kind of conversion error (your computer knows perfectly well how to convert a double to an int, and it would do so just fine if that were what was happening); the issue comes from somewhere else.

The truth

If we want to understand how it works that way, we’ll have to take a look at the assembly code (https://godbolt.org/z/5YKEdj73r):

.LC0:
        .string "%d\n"
main:
        push    rbp
        mov     rbp, rsp
        mov     esi, 666
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     rax, QWORD PTR .LC1[rip]
        movq    xmm0, rax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 1
        call    printf
        mov     eax, 0
        pop     rbp
        ret
.LC1:
        .long   0
        .long   1078263808

(use this Godbolt link to have a clearer matching between the C++ code and the assembly instructions: https://godbolt.org/z/5YKEdj73r)

In the first part of the assembly code (the equivalent of printf("%d\n", 666);), we can see that everything’s fine: the 666 value is put in the esi register, and then the printf function is called. So it’s an educated guess to say that when printf reads a %d in the string it is given, it will look in the esi register for what to print.

However, in the second part (the equivalent of printf("%d\n", double(42));), we can see that the value is put in another register: the xmm0 register. Since printf is given the same format string as before, we can guess it will look into the esi register again and print whatever happens to be in there.

We can prove that statement pretty easily. Take the following code:

#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d %d\n", double(42), 24);
}

It’s the same code, with an additional integer that is printed in the second printf instruction.

If we look at the assembly (https://godbolt.org/z/jjeca8qd7):

.LC0:
        .string "%d %d\n"
main:
        push    rbp
        mov     rbp, rsp
        mov     esi, 666
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     rax, QWORD PTR .LC1[rip]
        mov     esi, 24
        movq    xmm0, rax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 1
        call    printf
        mov     eax, 0
        pop     rbp
        ret
.LC1:
        .long   0
        .long   1078263808

The double(42) value still goes into the xmm0 register, and the integer 24, logically, ends up in the esi register. Thus, here is the output:

666
24 0

Why? Well, since we asked for two integers, the printf call will look into the first integer register (esi) and print its content (24, as we stated above), then look into the next integer register (edx) and print whatever is in it (which happens to be 0).

In the end, the behavior we see occurs because of the x86_64 calling convention: integer arguments are passed in general-purpose registers (rdi, rsi, rdx, and so on), while floating-point arguments are passed in the SSE registers (xmm0, xmm1, and so on), and printf only knows which registers to read from the format string you give it.

What does the doc say?

The truth is that according to the reference (printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s – cppreference.com):

If a conversion specification is invalid, the behavior is undefined.

And this same reference is unambiguous about the %d conversion specifier:

converts a signed integer into decimal representation [-]dddd.
Precision specifies the minimum number of digits to appear. The default precision is 1.
If both the converted value and the precision are ​0​ the conversion results in no characters.

So, giving a double to printf where it expects a signed integer is UB. It was our mistake to write this in the first place.

This actually generates a warning with clang by default. With gcc, you’ll have to enable -Wall to see a warning about it.
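For the record, the UB disappears as soon as the format specifier matches the argument. Here is a minimal corrected sketch, nothing more than the matching specifiers:

#include <cstdio>

int main() {
    printf("%d\n", 666);         // %d matches an int: fine
    printf("%f\n", double(42));  // %f matches a double: prints 42.000000
    printf("%d\n", int(42.0));   // or convert explicitly first: prints 42
}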

Wrapping up

The C language is a very, very old language. It’s older than C++ (obviously), which is itself very old. As a reminder, the first edition of K&R was printed in 1978. That’s thirteen years before my own birth. And unlike us humans, programming languages don’t age well.

I could have summarized this article with a classic “don’t perform UB”, but I think it’s a bit off-purpose this time. So I’ll go and say it: don’t use printf at all.

The problem is not with printf itself, it’s with using a feature from another language1 that was originally published forty-three years ago. In short: don’t write C code.
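If you are wondering what the C++-native alternative could look like, here is a sketch using iostreams, which pick the right output routine from the argument’s type, so there is no specifier to get wrong (since C++20 you could also reach for std::format):

#include <iostream>

int main() {
    std::cout << 666 << '\n';
    std::cout << double(42) << '\n';  // overload resolution picks the double printer: prints 42
}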

Thanks for reading and see you next week!

1. Yeah, like it or not, but C and C++ are different languages. Different purpose, different intentions, different meta. That is exactly why I always decline job offers that have the tag “C/C++”: they obviously can’t pick a side.

Author: Chloé Lourseyre

About sizes

Author: Chloé Lourseyre

If I came up with a pop quiz about sizes in C++, most C++ developers would fail it (I know I would), because sizes in C++ are complicated.

The sizes of the fundamental types are not fixed; they are implementation-defined.

Still, the standard defines constraints on these sizes. These constraints take one of two forms:

  • A comparison of the types’ sizeof.
  • The type’s minimum number of bits.

What is sizeof()?

One of the most widespread (and harmless) misconceptions about type sizes is that a byte holds 8 bits.

Although this is mostly true in practice, this is technically false.

A byte is actually defined as the size of a char. Though a char must have at least 8 bits, it can hold more. sizeof(N) returns the number of bytes of the type N, and every type size is expressed in multiples of the size of a char. Thus, a 4-byte int is at least 32 bits wide, but you may not assume more than that (it could be 128 bits wide if a char is 32 bits wide).

The actual number of bits in a byte is given by the macro CHAR_BIT, defined in <climits>.
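As a quick illustration, here is a small sketch that prints the byte width and a few sizes on whatever platform you run it on (the values are implementation-defined, so your output may differ):

#include <climits>   // CHAR_BIT
#include <iostream>

int main() {
    std::cout << "bits per byte: " << CHAR_BIT << '\n';
    std::cout << "sizeof(int):   " << sizeof(int) << " byte(s), "
              << sizeof(int) * CHAR_BIT << " bits\n";
    std::cout << "sizeof(long):  " << sizeof(long) << " byte(s)\n";
}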

Summary of sizes in C++

Here are all the definitions and restrictions of fundamental types’ sizes in C++:

  • 1 ≡ sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long)
  • 1 ≤ sizeof(bool) ≤ sizeof(long)
  • sizeof(char) ≤ sizeof(wchar_t) ≤ sizeof(long)
  • sizeof(float) ≤ sizeof(double) ≤ sizeof(long double)
  • sizeof(N) ≡ sizeof(unsigned N) ≡ sizeof(signed N)
  • A char has at least 8 bits
  • A short has at least 16 bits
  • A long has at least 32 bits

… and nothing more.

Fun fact: according to this definition, it is technically possible that all fundamentals have 32 bits.
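So if your code silently relies on a particular size, it is worth making the assumption explicit. Here is a minimal sketch using static_assert (the asserted values are assumptions chosen for the example, not guarantees of the standard):

#include <climits>

// These assertions document the assumptions; compilation fails on any
// platform where they do not hold.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(int) == 4, "this code assumes a 32-bit int");

int main() {}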

Two words of wisdom

Since the sizes of the fundamental types entirely depend on the architecture, it may sometimes be hard to write consistent code.

#include <limits>

The <limits> header of the standard library contains the upper and lower limits of every fundamental type, and lets you know whether a given type is signed or not.

Example:

#include <limits>
#include <iostream>

int main()
{
    std::cout << "largest double == " << std::numeric_limits<double>::max() << std::endl;
    std::cout << "char is signed == " << std::numeric_limits<char>::is_signed << std::endl;
}

More at std::numeric_limits – cppreference.com.

Reminder: signed integer overflow is undefined behavior. Using <limits> helps you prevent it from happening.
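For instance, here is a sketch of one way to guard an addition against signed overflow before it happens (checking after the fact would already be UB):

#include <limits>
#include <stdexcept>

// Adds two ints, refusing to overflow instead of invoking UB.
int checked_add(int a, int b)
{
    if (b > 0 && a > std::numeric_limits<int>::max() - b)
        throw std::overflow_error("addition would overflow");
    if (b < 0 && a < std::numeric_limits<int>::min() - b)
        throw std::overflow_error("addition would underflow");
    return a + b;
}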

#include <cstdint>

Sometimes you may want to deal with fixed-size types (in terms of bits) and not rely on implementation specifics. For instance, if you implement serialization, work with very limited memory, or develop cross-platform software, you may want to explicitly specify the bit width of the types you’re using.

You can do so with the <cstdint> header, which contains fixed-width types.

Here are a few of them:

  • int8_t, int16_t, int32_t, int64_t: signed integer types with a width of exactly 8, 16, 32 and 64 bits respectively, with no padding bits and using 2’s complement for negative values (provided only if the implementation directly supports the type)
  • int_least8_t, int_least16_t, int_least32_t, int_least64_t: smallest signed integer types with a width of at least 8, 16, 32 and 64 bits respectively
  • intmax_t: maximum-width signed integer type
  • uint8_t, uint16_t, uint32_t, uint64_t: unsigned integer types with a width of exactly 8, 16, 32 and 64 bits respectively (provided only if the implementation directly supports the type)
  • uint_least8_t, uint_least16_t, uint_least32_t, uint_least64_t: smallest unsigned integer types with a width of at least 8, 16, 32 and 64 bits respectively
  • uintmax_t: maximum-width unsigned integer type

More at Fixed width integer types (since C++11) – cppreference.com.
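As a quick illustration, here is a sketch of a serialization-style structure where the exact bit widths matter (the field layout is made up for the example):

#include <cstdint>

// A made-up packet header: every field has a known, platform-independent
// width, regardless of how wide int or long happen to be on the target.
struct PacketHeader {
    std::uint16_t version;    // exactly 16 bits
    std::uint32_t length;     // exactly 32 bits
    std::int64_t  timestamp;  // exactly 64 bits
};
// Note: the struct itself may still contain padding between fields,
// so serialization should still be done field by field.

int main() {
    PacketHeader h{1, 128, 0};
    return static_cast<int>(h.version);
}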

Wrapping up

If you want to read more about type sizes, refer to section §6.2.8 of The C++ Programming Language (Bjarne Stroustrup). More broadly, you can read about types and declarations in the whole of section §6 of the book.

You can also refer to Fundamental types – cppreference.com for online documentation.

Thanks for reading and see you next week!

Author: Chloé Lourseyre

Lambdas as const ref

Author: Chloé Lourseyre

Context

This week, I will present you a piece of code I bumped into not so long ago.

It looked like that:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int>& v)
{
    const auto& my_func = [] (int i)->bool
    {
        return i%2==0; 
    };

    return std::count_if(std::cbegin(v), std::cend(v), my_func);
}

Seeing this, Visual Studio was unsure whether it should compile or not.

And by unsure, I mean that a compilation error stating something like “my_func is used but it is already destroyed” periodically appeared and then disappeared during the compilation of the library. But in the end, it compiled.

When I saw this, I thought two things:

“Wait, we can bind a temporary to a const ref?”

and

“What’s the use of binding a lambda to a const ref?”

These are the questions I will answer today.

Binding a rvalue to a const reference

In short: yes, you can bind an rvalue to a const ref.

Intuitively, I would have said that trying to do so would only result in a dangling reference, but it does not.

This language is smarter than I tend to give it credit for: when you bind a const ref to a temporary object, the lifetime of the temporary is extended to the lifetime of the reference; the object is materialized on the stack and accessed through the const ref.

To illustrate this, here are two functions:

void foo()
{
    const int& i = 1;
    const int& j = i+1;
}

void bar()
{
    int iv = 1;
    const int& i = iv;
    int jv = i+1;
    const int& j = jv;
}

The function foo directly binds the rvalues to const refs, while the function bar stores the values in named variables before binding those to const refs (thus binding lvalues to const refs).

The assembly code that clang generates for this piece of code is the following:

foo():                                # @foo()
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 12], 1
        lea     rax, [rbp - 12]
        mov     qword ptr [rbp - 8], rax
        mov     rax, qword ptr [rbp - 8]
        mov     eax, dword ptr [rax]
        add     eax, 1
        mov     dword ptr [rbp - 28], eax
        lea     rax, [rbp - 28]
        mov     qword ptr [rbp - 24], rax
        pop     rbp
        ret
bar():                                # @bar()
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 1
        lea     rax, [rbp - 4]
        mov     qword ptr [rbp - 16], rax
        mov     rax, qword ptr [rbp - 16]
        mov     eax, dword ptr [rax]
        add     eax, 1
        mov     dword ptr [rbp - 20], eax
        lea     rax, [rbp - 20]
        mov     qword ptr [rbp - 32], rax
        pop     rbp
        ret

These two functions are (apart from the stack offsets, which are not really important) the same.

This fact is confirmed by the IBM documentation (Initialization of references (C++ only) – IBM Documentation) and, of course, by Bjarne Stroustrup’s The C++ Programming Language (§7.7.1).

This is a very simple fact that is clearly stated in the standard, but it is rarely used in code and rarely mentioned on the web.

The reason for that is that binding to a const ref instead of a const value – or even a plain value – seems useless.

But is it?

Binding a lambda to a const ref

Coming back to the initial context, the question was why it is legal to bind a lambda to a const ref, and whether it is useful.

As a reminder, here is the example I showed you earlier:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int>& v)
{
    const auto& my_func = [] (int i)->bool
    {
        return i%2==0; 
    };

    return std::count_if(std::cbegin(v), std::cend(v), my_func);
}

When we put it in C++ Insights (cppinsights.io), we obtain the following code:

#include <vector>
#include <algorithm>

int count_even(const std::vector<int, std::allocator<int> > & v)
{
    
  class __lambda_6_27
  {
    public: 
    inline /*constexpr */ bool operator()(int i) const
    {
      return (i % 2) == 0;
    }
    
    using retType_6_27 = auto (*)(int) -> bool;
    inline /*constexpr */ operator retType_6_27 () const noexcept
    {
      return __invoke;
    };
    
    private: 
    static inline bool __invoke(int i)
    {
      return (i % 2) == 0;
    }
    
    public: 
    // inline /*constexpr */ __lambda_6_27(const __lambda_6_27 &) noexcept = default;
    // inline /*constexpr */ __lambda_6_27(__lambda_6_27 &&) noexcept = default;
    // /*constexpr */ __lambda_6_27() = default;
    
  };
  
  const __lambda_6_27 & my_func = __lambda_6_27{};
  return static_cast<int>(std::count_if(std::cbegin(v), std::cend(v), __lambda_6_27(my_func)));
}

As you might have guessed, a lambda is in fact a functor (here called __lambda_6_27). Thus, the initialization constructs an instance of that functor, which is an rvalue (a temporary).

We just saw that we can bind an rvalue to a const ref; this is why binding a lambda to a const ref is legal.
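To make that concrete, here is a minimal hand-written sketch of the same situation, with an explicit functor standing in for the compiler-generated __lambda_6_27 (the name IsEven is illustrative):

struct IsEven
{
    bool operator()(int i) const { return i % 2 == 0; }
};

int main()
{
    // Binding the temporary functor to a const ref: its lifetime is extended
    // to the lifetime of the reference, exactly as with the lambda.
    const IsEven& my_func = IsEven{};
    return my_func(4) ? 0 : 1;
}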

Performance and optimization

To answer the question of whether we should bind a lambda to a const ref instead of a value, we have to evaluate whether one method is faster than the other.

Execution time

I’ll use Quick C++ Benchmarks (quick-bench.com) to evaluate execution time.

Here are the snippets I’ll use:

#include <vector>
#include <algorithm>
// (On quick-bench.com, the benchmark harness <benchmark/benchmark.h> is provided by the site.)

std::vector<int> v = {0,1,2,3,4};

static void ConstRef(benchmark::State& state)
{
  const auto& l = [](int i)->bool{ return i%2 == 0;};
  for (auto _ : state)
  {
    std::count_if(cbegin(v), cend(v), l);
  }
}
BENCHMARK(ConstRef);

static void Plain(benchmark::State& state)
{
  auto l = [](int i)->bool{ return i%2 == 0;};
  for (auto _ : state)
  {
    std::count_if(cbegin(v), cend(v), l);
  }
}
BENCHMARK(Plain);

I ran benchmarks using Clang 11.0 and GCC 10.2, with all optimization options between -O0 and -O3.

And here are the results:

Compiler   | Optimization option | ET with const ref | ET with plain value | const ref / plain value ratio
Clang 11.0 | -O0                 | 32.306            | 33.732              | 0.958
Clang 11.0 | -O1                 | 224.96            | 204.92              | 1.097
Clang 11.0 | -O2                 | 3.9982e-6         | 4.0088e-6           | 0.997
Clang 11.0 | -O3                 | 3.7273e-6         | 4.1281e-6           | 0.903
GCC 10.2   | -O0                 | 64.379            | 65.017              | 0.990
GCC 10.2   | -O1                 | 11.754            | 11.871              | 0.990
GCC 10.2   | -O2                 | 3.7470e-6         | 4.0196e-6           | 0.932
GCC 10.2   | -O3                 | 3.6523e-6         | 3.9021e-6           | 0.936

What you want to look at is the last column, which gives the ratio between the const-ref and plain-value execution times (the execution times themselves have unlabeled units, so it’s pointless to compare them directly across compilers).

A value greater than 1 means the const-ref version is slower than the plain-value version, while a value lower than 1 means the const-ref version is faster.

All in all, we can see that while the const-ref version is generally a little faster than the value version, the difference never goes above 10%. The largest differences appear at the highest optimization levels.

If the code is in a bottleneck (in the 20% of Pareto’s law), this 10% can make a small difference, but I wouldn’t expect more than one or two percent gain overall (including the other parts of your code).

However, if this is not a bottleneck, then neither version is to be preferred over the other.

Compilation time

Can the const-ref binding affect compilation time? To answer this, I used C++ Build Benchmarks (build-bench.com).

I compiled the following code under the same conditions, using Clang 11.0 and GCC 10.2, with all optimization options between -O0 and -O3:

#include <vector>
#include <algorithm>

int main() 
{
    const auto& l = [](int i)->bool{ return i%2 == 0;};
    std::vector<int> v= {0,1,2,3,4};
    std::count_if(cbegin(v), cend(v), l);
}

Then, without using const ref:

#include <vector>
#include <algorithm>

int main() 
{
    auto l = [](int i)->bool{ return i%2 == 0;};
    std::vector<int> v= {0,1,2,3,4};
    std::count_if(cbegin(v), cend(v), l);
}

And here are the results:

Compiler   | Optimization option | BT with const ref | BT with plain value | const ref / plain value ratio
Clang 11.0 | -O0                 | 0.3629            | 0.3510              | 1.034
Clang 11.0 | -O1                 | 0.4010            | 0.4035              | 0.994
Clang 11.0 | -O2                 | 0.3755            | 0.3765              | 0.997
Clang 11.0 | -O3                 | 0.3745            | 0.3735              | 1.003
GCC 10.2   | -O0                 | 0.3915            | 0.3900              | 1.004
GCC 10.2   | -O1                 | 0.3830            | 0.3810              | 1.005
GCC 10.2   | -O2                 | 0.3765            | 0.3775              | 0.997
GCC 10.2   | -O3                 | 0.3765            | 0.3750              | 1.004

In every case, there is less than 4% difference between the two versions, and in most cases it’s even under 1%. We can say that there is no effective difference.

Conclusion

There is no real advantage in binding an rvalue to a const ref instead of a plain value. Most of the time, you’ll prefer a plain const value, if only to save a character (but there is no huge difference).

In case you are in a bottleneck, you might consider using const refs instead of values, but I suggest you do your own benchmarks, tied to your specific context, in order to determine which version is better.

Thanks for reading, and see you next week!

Author: Chloé Lourseyre

windows.h breaks the standard library (and my will to live)

Author: Chloé Lourseyre

While working on some not-so-old code, I ended up with a strange compilation error in MS Visual Studio.

Here is the incriminating code:

//...

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::max() / 2;

//...

And here are the errors:

1>MyFile.cpp(42): error C2589: '(' : illegal token on the right side of '::'
1>MyFile.cpp(42): error C2059: syntax error : '::'
1>MyFile.cpp(42): error C2059: syntax error : ')'
1>MyFile.cpp(42): error C2059: syntax error : ')'

It came with a bunch of warnings I won’t even bother writing here.

It took me a hell of a lot of time to find out what was wrong. The type T_LongType was correctly defined (actually a typedef of long) and I didn’t forget to include <limits>.

Many of you may already know the culprit, and it’s the following line:

#include <windows.h>

Indeed, if we look into the code of this header, we can see a surprising piece of code:

#ifndef NOMINMAX

#ifndef max
#define max(a,b)            (((a) > (b)) ? (a) : (b))
#endif

#ifndef min
#define min(a,b)            (((a) < (b)) ? (a) : (b))
#endif

#endif  /* NOMINMAX */

In my defense, the file in which the error occurred did not include windows.h directly. It was included by another header.

Explanation

The fact that windows.h defines function-like macros called min and max means that, during the preprocessing phase of the compilation, every instance of min and max followed by an opening parenthesis will be replaced by the macro expansion.

This means that, after the preprocessing, instead of seeing this:

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::max() / 2;

The compiler will see this:

const T_LongType BIG_OFFSET = std::numeric_limits<T_LongType>::(((a) > (b)) ? (a) : (b))() / 2;

This makes no sense, hence the compilation errors mentioned above.

Many reasons to not include windows.h

Here is a non-exhaustive list of reasons why including this header is bad practice:

  • It breaks the standard library just by being included. The way we use the functionality of the standard library should work regardless of which other headers we include.
  • It forces you to define NOMINMAX at the beginning of each header that includes windows.h. And if you ever happen to forget it, every file that includes your header will have to define it instead.
  • Since it’s OS-dependent, it’s better to avoid it as long as you can. If you use it where you don’t need to, you won’t be able to port your code to other systems, you may pick up bad coding habits (by relying on it too much), and don’t forget that the more specific a library is, the less maintained it tends to be.

Conclusion

All in all, there are ways to use windows.h safely, but you must be absolutely rigorous about how you include it to avoid side effects in other sources and headers.
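For what it’s worth, here is a sketch of the two usual mitigations (NOMINMAX is an actual switch recognized by windows.h, and the extra parentheses work because a function-like macro is only expanded when its name is immediately followed by an opening parenthesis):

// Option 1: disable the min/max macros before any windows.h include.
#define NOMINMAX
#include <windows.h>

// Option 2: if you cannot control the include, the extra parentheses
// prevent max from being parsed as a function-like macro invocation.
#include <limits>
const long BIG_OFFSET = (std::numeric_limits<long>::max)() / 2;

int main() {}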

As long as you can, don’t use it.

Author: Chloé Lourseyre