Retrospective: The simplest error handler

Author: Chloé Lourseyre
Editor: Peter Fordham

This week, we’ll talk about last week’s article and try to be critical about it. There are a few things to say and it occurred to me (thanks to some feedback I got through social media) that it can be improved a lot.

If you don’t know about SimplestErrorHandler, go read the article about it: One of the simplest error handlers ever written | Belay the C++ (belaycpp.com).

The repo containing the feature is still available and is up-to-date with the changes I’ll present here: SenuaChloe/SimplestErrorHandler (github.com).

Version 2: recursion was useless

Indeed.

Sometimes, when you unpack a parameter pack, you may need to differentiate between the main loop of the recursion and the base case. This happens when you need to do more things in the main loop than in the base case.

For the ErrorHandler, the base case does the same thing as the main loop (plus some extras). So we actually don’t need any recursion and can directly unfold the parameter pack into the stream (just like it is done in the concept declaration):

template<typename TExceptionType = BasicException, typename ...TArgs>
requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
void raise_error(const TArgs & ...args)
{
    std::ostringstream oss;
    (oss << ... << args);
    const std::string error_str = oss.str();
    std::cerr << error_str << std::endl;
    throw TExceptionType(error_str);
}

Avoiding recursion when you can is preferable, because recursion accumulates stack frames, which are best avoided for several reasons1.

Thanks to this new form, we don’t need an auxiliary function (to pass down the std::ostringstream) anymore. Thus, we also don’t need private functions. Without private functions, there is no use for the class, so we can use a namespace instead. And since we now use a namespace, we can declare the concept inside it.

namespace ErrorHandler
{   
    template<typename TExceptionType, typename ...TArgs>
    concept TemplatedTypesConstraints = requires(std::string s, std::ostringstream oss, TArgs... args)
    {
        TExceptionType(s); // TExceptionType must be constructible using a std::string
        (oss << ... << args); // All args must be streamable
    };

    // ...

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires TemplatedTypesConstraints<TExceptionType, TArgs...>
    void raise_error(const TArgs & ...args)
    {
        // ...
    }

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires TemplatedTypesConstraints<TExceptionType, TArgs...>
    void assert(bool predicate, const TArgs & ...args)
    {
       // ...
    }
};

Discussion: Need for speed?

String streams are slow. That is a fact2. Plus, in our case, they aren’t especially practical to use (we need to declare a std::ostringstream locally, which means an additional #include, just for that). Is there a way to get rid of that?

The main reason string streams are slow is to-string conversions. However, to keep it as simple as we can (in our ErrorHandler), we want to let the std::ostringstream do its magic, even if it means slower code.

Time performance is rarely critical (we can safely say that 80% of the time it is not critical). What we are developing is an error thrower. The only reason an error thrower would be inside a performance-sensitive piece of code is if we use it as control flow.

But that would be a mistake. It is called ErrorHandler after all, not FlowControlHandler. By design, it is not meant to be used in critical code. The only good way to use it in critical code is to get out of that code when an error occurs.

So no, we won’t try to optimize the handler time-wise. We will keep it simple and short. We don’t need speed.

The full code

Here is the last version of the error handler:

#pragma once

#include <iostream>
#include <sstream>

namespace ErrorHandler
{   
    template<typename TExceptionType, typename ...TArgs>
    concept TemplatedTypesConstraints = requires(std::string s, std::ostringstream oss, TArgs... args)
    {
        TExceptionType(s); // TExceptionType must be constructible using a std::string
        (oss << ... << args); // All args must be streamable
    };

    class BasicException : public std::exception
    {
    protected:
        std::string m_what;
    public:
        BasicException(const std::string & what): m_what(what) {}
        BasicException(std::string && what): m_what(std::forward<std::string>(what)) {}
        const char * what() const noexcept override { return m_what.c_str(); };
    };

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires TemplatedTypesConstraints<TExceptionType, TArgs...>
    void raise_error(const TArgs & ...args)
    {
        std::ostringstream oss;
        (oss << ... << args);
        const std::string error_str = oss.str();
        std::cerr << error_str << std::endl;
        throw TExceptionType(error_str);
    }

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires TemplatedTypesConstraints<TExceptionType, TArgs...>
    void assert(bool predicate, const TArgs & ...args)
    {
        if (!predicate)
            raise_error<TExceptionType>(args...);
    }
};
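As a quick illustration, here is what a hypothetical call site could look like with this final version (the file name, function and variables are made up for the example, and are not part of the handler itself):

#include <fstream>
#include <stdexcept>
#include <string>
#include "ErrorHandler.h" // hypothetical name for the header above

void load_config(const std::string & path)
{
    std::ifstream file(path);

    // Throws ErrorHandler::BasicException (the default) if the predicate is false
    ErrorHandler::assert(file.is_open(), "Could not open config file: ", path);

    // Explicitly choosing the exception type to throw
    if (!path.ends_with(".cfg"))
        ErrorHandler::raise_error<std::runtime_error>("Unsupported config format: ", path);

    // ...
}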

Wrapping up

Version 2 of the ErrorHandler is even shorter and simpler than the first. This is a major improvement.

I want to give special thanks to the people who noticed that mistakes were made and suggested improvements.

Thanks for reading and see you next week!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Github repo

SenuaChloe/SimplestErrorHandler (github.com)

Notes

  1. Mainly to avoid stack overflows and to make debugging more readable. At the end of the day, recursion is not really bad, especially in a case like this where a stack overflow is unlikely to happen, but using a fold expression is shorter and clearer. Plus, there is also recursion at compile time, which slows down compilation and may even crash the compiler in improperly bounded cases. This cannot happen with a fold expression.
  2. This is pretty hard to source as a fact since everyone prefers to get rid of streams instead of proving they are slow. Since I’m bad at benchmarks (but I’m working on it), I won’t develop that here. If you want to share your own benchmarks (either to prove me wrong or right), please do in the comments.

One of the simplest error handlers ever written

Author: Chloé Lourseyre
Editor: Peter Fordham

This week, I’ll present a small device I wrote to handle basic errors, the most compact and generic one I could think of.

It is certainly not perfect (mainly because perfection is subjective) but it is very light-weight and easy to use.

If you want to skip the article and go directly to the source, here is the Github repo: SenuaChloe/SimplestErrorHandler (github.com)

Specifications

In terms of error handling, my needs are usually as follows:

  • The error handler must write a message on the error output (std::cerr).
  • The error handler must be able to take several arguments and stream them into the error message.
  • The error handler must raise an exception.
  • The what() of the exception must return the same message that is printed on std::cerr.
  • The specific type of exception raised must be configurable.
  • Raising an error must be a one-function call.
  • The error handler must not rely on macros.

Those will be the basic criteria I’ll rely on to design the error handler (we may add a few specifications along the way).

Step 0: Setup

To make it very light and simple, all the code will be in a single header file (it’s simpler to include a header into your project than a lib). But since there will probably be auxiliary functions (that won’t be part of the interface), we need a way to hide them.

That’s why we will put all the code in a full-static class1. There will be private static member functions (for the internal functions), public static member functions (the interface), and possibly types and such.

Step 1: Basic recursion and variadic templates

Starting with a simple recursion

To have a fully customizable error message, we need a variable number of arguments (and thus some variadic templates). We will recurse over the arguments2, streaming each of them, starting with head, into a stream.

template<typename THead>
static void raise_error_recursion(const THead & arg_head)
{
    std::cerr << arg_head << std::endl;
    throw;
}

template<typename THead, typename ...TTail>
static void raise_error_recursion(const THead & arg_head, const TTail & ...arg_tail)
{
    std::cerr << arg_head;
    raise_error_recursion(arg_tail...);
}

The first raise_error_recursion represents the base case of the recursion: if there is only one argument, we print it and then throw.

The second raise_error_recursion represents the recursion loop. As long as there is more than one argument left, we call the second raise_error_recursion, which prints arg_head into cerr and then calls itself with the arg_tail parameter pack. As soon as there is only one parameter left in the parameter pack, we end up in the first overload, which ends the recursion.
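To make the mechanism concrete, here is how a hypothetical three-argument call unfolds (the arguments are made up for the example):

// raise_error_recursion("Foo ", 42, '!');
//   -> variadic overload: streams "Foo " into std::cerr, then recurses on (42, '!')
//   -> variadic overload: streams 42 into std::cerr, then recurses on ('!')
//   -> single-argument overload: streams '!' and std::endl, then throws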

With a stream and a real exception

However, in the snippet just above, we don’t throw any exception, we just throw;. As a reminder, two of the specifications were:

  • The error handler must raise an exception.
  • The what() of the exception must return the same message that is printed on std::cerr.

So we need to throw a real exception, and that exception must be constructed with our error message.

As an example, we’ll use the std::runtime_error exception, which can be constructed with a std::string.

The problem is then that we can’t just stream the error message into cerr anymore: we need a way to memorize the message so that, at the end, we can stream it into cerr and construct our runtime_error.

A solution to that is to add a stringstream as a parameter of the recursive functions.

template<typename THead>
static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head)
{
    error_string_stream << arg_head;
    const std::string current_error_str = error_string_stream.str(); 

    std::cerr << current_error_str << std::endl;
    throw std::runtime_error(current_error_str);
}

template<typename THead, typename ...TTail>
static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head, const TTail & ...arg_tail)
{
    error_string_stream << arg_head;
    raise_error_recursion(error_string_stream, arg_tail...);
}

There, in the body of the recursion, we stream the error message into the stringstream instead of cerr. In the base case of the recursion, we convert this stream into a string that is then used to construct the exception and to print the error output.

Is that it?

These two functions are the backbone of the error handling. We don’t need anything more than that to fulfill most of the specifications.

But of course, it’d be great to add a few enhancements that’ll ease the use of the handler.

Step 2: Adding interface

This stringstream is a pain and should be invisible to the user. Thus, we’ll put the previous functions in the private part of the class and write a member function to be used as the interface:

template<typename ...TArgs>
static void raise_error(const TArgs & ...args)
{
    std::ostringstream error_string_stream;
    raise_error_recursion(error_string_stream, args...);
}

raise_error is now a very simple function to use.
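At this stage, a hypothetical call looks like this (assuming the full-static class ends up being named ErrorHandler, as it is later; the message parts are made up):

// Prints "Incorrect value: 42 (expected a positive value)" on std::cerr,
// then throws std::runtime_error carrying the same message.
ErrorHandler::raise_error("Incorrect value: ", 42, " (expected a positive value)");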

Step 3: Adding customizable exceptions

The exception as a template

The only spec that is not implemented is “The specific type of exception raised must be configurable”.

To do that, we will add a template parameter to each function. It represents the type of the exception that must be raised.

class ErrorHandler
{
    ErrorHandler(); // Private constructor -- this is a full-static class
    
    template<typename TExceptionType, typename THead>
    static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head)
    {
        error_string_stream << arg_head;
        const std::string current_error_str = error_string_stream.str();

        std::cerr << current_error_str << std::endl;
        throw TExceptionType(current_error_str);
    }

    template<typename TExceptionType, typename THead, typename ...TTail>
    static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head, const TTail & ...arg_tail)
    {
        error_string_stream << arg_head;
        raise_error_recursion<TExceptionType>(error_string_stream, arg_tail...);
    }

public:

    template<typename TExceptionType, typename ...TArgs>
    static void raise_error(const TArgs & ...args)
    {
        std::ostringstream error_string_stream;
        raise_error_recursion<TExceptionType>(error_string_stream, args...);
    }

    template<typename TExceptionType>
    static void raise_error()
    {
        raise_error<TExceptionType>("<Unknown error>");
    }
};

This way, you can call raise_error with any exception that is constructible with a std::string, like so:

ErrorHandler::raise_error<std::runtime_error>("Foo ", 42);

However, this is a bit heavy. Sometimes, you just want to raise a generic error and don’t mind whether it is a runtime_error, an invalid_argument, etc.

That’s why we’ll add a default value for the template parameter TExceptionType. Unfortunately, we can’t use std::exception for this default value because it can’t be constructed using a std::string.

What I suggest is to define our own generic exception, within the scope of the ErrorHandler class. This way, we’ll have a generic exception to be used as a default value, and users may use it as a base class to implement custom exceptions, all related to error handling (which can be useful in try-catches).

A custom generic exception for the error handler

class BasicException : public std::exception
{
protected:
    std::string m_what;
public:
    BasicException(const std::string & what): m_what(what) {}
    BasicException(std::string && what): m_what(std::forward<std::string>(what)) {}
    const char * what() const noexcept override { return m_what.c_str(); };
};

Of course, BasicException publicly inherits from std::exception so that it can be used like any other standard exception3.

I implemented two constructors, one that builds the error message using a constant reference string, and one that builds the error message using an r-value reference (to be able to move data into the constructor).

And, of course, the what() virtual overload that returns the error message.

Using this exception as default, the raise_error functions now look like this:

template<typename TExceptionType = BasicException, typename ...TArgs>
static void raise_error(const TArgs & ...args)
{
    std::ostringstream error_string_stream;
    raise_error_recursion<TExceptionType>(error_string_stream, args...);
}

template<typename TExceptionType = BasicException>
static void raise_error()
{
    raise_error<TExceptionType>("<Unknown error>");
}

Now you can raise an error without having to provide an exception:

ErrorHandler::raise_error("Foo ", 42);

This will throw an ErrorHandler::BasicException by default.
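As a hypothetical illustration of the “base class” idea, here is a custom exception derived from BasicException and a try-catch that handles both it and the default one (the class name ConfigError and the messages are made up):

// A user-defined exception, built on top of the handler's generic one
class ConfigError : public ErrorHandler::BasicException
{
public:
    using ErrorHandler::BasicException::BasicException; // inherit the string constructors
};

// ... somewhere inside a function:
try
{
    ErrorHandler::raise_error<ConfigError>("Missing field: ", "timeout");
}
catch (const ErrorHandler::BasicException & e) // catches ConfigError as well
{
    std::cerr << "Handled: " << e.what() << std::endl;
}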

Step 4: Adding an assert one-liner

The most common situation when you have to raise an error is if <something is wrong> then <raise an error>. It can also be seen as assert <expression>, if false <raise an error>.

This pattern is commonly encountered in unit testing, with functions of the form assert(expression, message_if_false);

That’s why I think it’s a good idea to add a single function that will take an expression and a parameter pack (the error message) and call raise_error if the expression is not true.

template<typename TExceptionType = BasicException, typename ...TArgs>
static void assert(bool predicate, const TArgs & ...args)
{
    if (!predicate)
        raise_error<TExceptionType>(args...);
}

Using this, instead of writing this:

const auto result = compute_data(data);
if (result != ErrorCode::NO_ERROR)
    ErrorHandler::raise_error("Error encountered while computing data. Error code is ", result);

You’ll be able to write something like this:

const auto result = compute_data(data);
ErrorHandler::assert(result == ErrorCode::NO_ERROR, "Error encountered while computing data. Error code is ", result);

Step 5: Concept and constraints

We use a lot of templates. Many templates mean that users are likely to misuse them, leading to compilation errors. And when we talk about template-related compilation errors, we talk about almost illegible error messages.

But, lucky us, there is a way in C++20 to make these errors more readable while protecting our functions better: concepts and constraints.

We currently have two constraints:

  • TExceptionType must be constructible using a std::string.
  • Every TArgs... must be streamable.

So we’ll implement these two constraints within a single concept4:

template<typename TExceptionType, typename ...TArgs>
concept ErrorHandlerTemplatedTypesConstraints = requires(std::string s, std::ostringstream oss, TArgs... args)
{
    TExceptionType(s); // TExceptionType must be constructible using a std::string
    (oss << ... << args); // All args must be streamable
};

We now only have to add this concept as a constraint on our interface member functions:

template<typename TExceptionType = BasicException, typename ...TArgs>
requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
static void raise_error(const TArgs & ...args)
{
    std::ostringstream error_string_stream;
    raise_error_recursion<TExceptionType>(error_string_stream, args...);
}

template<typename TExceptionType = BasicException, typename ...TArgs>
requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
static void assert(bool predicate, const TArgs & ...args)
{
    if (!predicate)
        raise_error<TExceptionType>(args...);
}

The complete code

If we put everything together, the resulting header file looks like this:

#pragma once

#include <iostream>
#include <sstream>

template<typename TExceptionType, typename ...TArgs>
concept ErrorHandlerTemplatedTypesConstraints = requires(std::string s, std::ostringstream oss, TArgs... args)
{
    TExceptionType(s); // TExceptionType must be constructible using a std::string
    (oss << ... << args); // All args must be streamable
};

class ErrorHandler
{
    ErrorHandler(); // Private constructor -- this is a full-static class
    
    template<typename TExceptionType, typename THead>
    static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head)
    {
        error_string_stream << arg_head;
        const std::string current_error_str = error_string_stream.str();

        std::cerr << current_error_str << std::endl;
        throw TExceptionType(current_error_str);
    }

    template<typename TExceptionType, typename THead, typename ...TTail>
    static void raise_error_recursion(std::ostringstream & error_string_stream, const THead & arg_head, const TTail & ...arg_tail)
    {
        error_string_stream << arg_head;
        raise_error_recursion<TExceptionType>(error_string_stream, arg_tail...);
    }

public:

    class BasicException : public std::exception
    {
    protected:
        std::string m_what;
    public:
        BasicException(const std::string & what): m_what(what) {}
        BasicException(std::string && what): m_what(std::forward<std::string>(what)) {}
        const char * what() const noexcept override { return m_what.c_str(); };
    };

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
    static void raise_error(const TArgs & ...args)
    {
        std::ostringstream error_string_stream;
        raise_error_recursion<TExceptionType>(error_string_stream, args...);
    }

    template<typename TExceptionType = BasicException, typename ...TArgs>
    requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
    static void assert(bool predicate, const TArgs & ...args)
    {
        if (!predicate)
            raise_error<TExceptionType>(args...);
    }
};

To go further

We could push the genericity of the handler a little further and try to replace the std::cerr output stream with a customizable output stream that defaults to std::cerr.

However, that would mean more functions and longer code, and the goal is to keep the header as short as possible.

It’s up to you now to stop here or go further and complete the implementation.
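For reference, here is a minimal sketch of what such an overload could look like. This is my own assumption of an API (the name raise_error_to is made up, and it uses a fold expression, as in the retrospective above, to keep it short); it is not part of the actual handler:

template<typename TExceptionType = BasicException, typename ...TArgs>
requires ErrorHandlerTemplatedTypesConstraints<TExceptionType, TArgs...>
static void raise_error_to(std::ostream & output, const TArgs & ...args)
{
    std::ostringstream error_string_stream;
    (error_string_stream << ... << args);
    const std::string error_str = error_string_stream.str();
    output << error_str << std::endl;   // std::cerr would simply be the default argument
    throw TExceptionType(error_str);
}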

Wrapping up

This is certainly not the most complete way to handle errors in your program, but this is, in my opinion, a simple and clean way to do it while fulfilling the established specifications.

Up to you now to define your own specifications and to write your own error handler if your needs are different from mine.

You can use this code (almost) as you wish, as it is under the CC0-1.0 License.

Thanks for reading and see you next week!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Github repo

SenuaChloe/SimplestErrorHandler (github.com)

Useful documentation

I used a lot of advanced features of C++. To learn more about them, follow the links:

Notes

  1. “Full-static” means that the class won’t be instantiable. All its member functions and member variables will be static, and the constructor will be private. That’s why we need a class and can’t use a namespace here: with a namespace, we couldn’t hide any auxiliary function.
    If you don’t want to use a full-static class and still want to hide the auxiliary functions, you’d have to put them into a cpp file. But if you do this, you need to compile the error handler as a lib in order to import it in other projects.
  2. To learn about recursion, read this page: Recursion – GeeksforGeeks. To learn about variadic parameters and templates, here you go: Variadic arguments – cppreference.com and Parameter pack(since C++11) – cppreference.com.
  3. See std::exception – cppreference.com to have some insight into how exceptions work in C++.
  4. The only small problem with that is that we have to implement the concept in the global namespace. That is why I used a pretty long name that begins with “ErrorHandler”: to avoid name collision as much as possible.

The three types of development

Author: Chloé Lourseyre
Editor: Peter Fordham

This week we’ll discuss a serious topic affecting the developer community. This touches several languages, but the C++ community is one of the most affected by it1.

There are several “ways” to write C++. I mean “way” as a collection of constraints and circumstances that will affect what you can do, what you should do, and how you can and should do it.

This may seem vague, but think of it as types of environments that can drastically change your approach to the code you are reading, editing, and writing.

Based on my experience, I can distinguish three types of development2.

The three categories

The (almost) solo development

This is the type of development that has the fewest constraints (if not none at all). When you are developing alone or with very few collaborators, you can freely choose what to do and how you want to do it.

The collaborative licensed development

When you are on a bigger project, you will see constraints arise. Most of the time, these constraints will consist of which libraries you can or cannot use.

For instance, if you want to sell your software, you can’t use JRL-licensed software, because that license prohibits commercial use.

This is generally a type of development that concerns small companies or freelance developers.

The industrial development

Some projects are launched by big companies or company groups. They can be developed over numerous years (even decades if you include the maintenance phase of these projects), but more importantly they have heavy constraints over which libraries you can use and in what environment the development takes place.

This is typically the type of development where the C++ version is the oldest (often prior to C++17, sometimes even C++03). This is because it is the management (not to say the salesmen) that pilots the budget of this kind of project and decides whether the environment can be migrated or not.

A lot of developers that work on this kind of project arrive in the middle of it and face heavy resistance when they try to improve the environment3.

In this kind of project, you often have to deal with legacy code or with a part of the codebase that you can’t edit4.

What is specific to C++?

C++ is a complicated language, not only because of its syntax and language specification, but also because there are hundreds (if not thousands) of different possible environments.

There are dozens of C++ compilers, ported on numerous operating systems. As of today, there are 5 different versions of the standard5 that are present in professional projects.

It is thus essential for each C++ developer to adapt their advice to the person they are talking to. Because depending on the situation, you may say the complete opposite of what you would have said otherwise.

Clashing grounds

There is one place where the three types of development can be represented at the same time: on the internet. When you lurk on dedicated forums, you’ll eventually find people that are currently working on the different types of projects.

Overall, it is a good thing that all kinds of developers can meet over the internet, but it can lead to communication issues.

Indeed, when a developer who has only ever practiced one type of development gives advice or feedback to a developer from another type of environment, a lot of that advice and feedback will not take the other developer’s constraints and circumstances into account and will not be useful.

Let’s take a few examples to illustrate that.

Example from r/cpp

Here is an example that comes from Reddit, specifically from the subreddit r/cpp:

This example is typical: while courteous, it misses the point and is based on two sophisms:

  • “Every modern C++ compiler produces warnings if […]”. It greatly depends on what “modern” means, but there are a lot of compilers that do not work like your standard compiler. I’m thinking about compilers for embedded systems, experimental compilers, and home-made compilers that you can sometimes encounter on very specific projects, or even older compilers that did not implement the said warning at the time. Trying to generalize in this context is somewhat of a fallacy, especially since “printf API […] doesn’t enforce it”.
  • “[…] honestly you should have compiler warnings enabled anyway.” I hear that a lot, and I think most of those who say it have never worked under project and environment constraints. When you arrive on a project, you don’t always have a say in how the environment works, especially if the project had already been ongoing for several years when you arrived. Our work (as C++ experts and such) is to try to change mentalities, but sometimes it doesn’t work, unfortunately. There are also situations where, when you arrive on the project, there are hundreds and hundreds of warnings, and the management won’t give you the time to correct them all. In this context, warning-hunting is a lost cause.

Of course, we should always try to change the world for the better and try to destroy improper environments, but denying the existence of these contexts is denying the reality of how the world of development sometimes works.

When that occurs, try to add nuance to your statements, and leave it open for people to explain what their constraints are.

Instead of

“Yes you have to enable -Wall but honestly you should have compiler warnings enabled anyway.”

say something like:

“If you can you should enable -Wall because it will help you prevent the issue and others as well.”

Example from SO

Here is another example, taken from Stack Overflow:

Pretty short, but a lot to say nonetheless.

“Best advice is not to write macros like that.” Okay, no problem, but why? Because of how macros work? Because you can’t do whatever you want to do with it? Because macros are bad design and there is a working alternative?

The question states the following constraint:

Is the question “Why do you need to use __LINE__?” really relevant? Since the question is based on the statement just above, whether you know why the user needs __LINE__ or not won’t help the original poster6.

Writing relevant advice is really easy when you put some thought into it. For instance:

This comment simply states that pointers are usually bad, while admitting that depending on the case they may be needed. It has been written to hint to the original poster about the dangers of pointers while remaining relevant.

Wrapping up

When you want to be helpful to other developers, you have to pay attention to their circumstances. Your answer won’t reach its target if it is irrelevant.

Plus you have to ask yourself: are you really helping anyone if your advice can be summarized by “You have to change your environment” to someone who can’t or won’t? You have to adapt to these situations, put your words into perspective, so the person you are talking to will acknowledge your advice, even if they can’t apply it directly.

It’s easy to fall into sophisms and arguments from authority. Always try to explain your arguments, even if they seem trivial to you. This will give them weight. Moreover, maybe it’s trivial to you but it may not be so for others. And if you don’t manage to simply explain your argument, there is a really good chance that it’s a fallacy.

Thanks for reading and see you next week!

Author: ChloΓ© Lourseyre
Editor: Peter Fordham

Addendum

Notes

  1. In this article, I’ll use C++ to illustrate, but everything that is said can be applied to any programming language. I explain why C++ is specifically affected by this later in the article.
  2. Depending on your own experience, you may discover other types of development. They supplement the existing ones.
  3. The definition of improve here is the key. What a new developer on a project might consider an improvement isn’t the same as what a senior developer, project manager, accountant, salesman or customer would consider an improvement. “It’s great that you’ve spent a year bringing the codebase up to C++20 with new GCC and Clang, but you haven’t fixed any of the reported bugs, implemented the new features we promised to the customer and now we don’t support our legacy platform anymore…”
  4. For instance: because it is owned by another team or company, because it has already been sold to the client, or because it has already been QA’d and it’d take weeks to be QA’d again.
  5. I’m only counting from C++03 (so C++03, 11, 14, 17 and 20) since C++98 is very similar to C++03.
  6. It may sometimes occur that the original poster states a constraint that they could avoid. But it is unconstructive to “babysit” the OP in that case; it would be better to propose alternatives with examples.

Prettier switch-cases

Author: Chloé Lourseyre
Editor: Peter Fordham

I learned this syntax during a talk at CppCon 2021 given by Herb Sutter, Extending and Simplifying C++: Thoughts on Pattern Matching using `is` and `as` – Herb Sutter – YouTube. You can also find this talk on Sutter’s blog, Sutter’s Mill – Herb Sutter on software development.

Context

Say you have a switch-case block with no fallthrough (this is important), like this one:

enum class Foo {
    Alpha,
    Beta,
    Gamma,
};

int main()
{
    std::string s;
    Foo f;

    // ...
    // Do things with s and f 
    // ...

    switch (f)
    {
        case Foo::Alpha:
            s += "is nothing";
            break;
        case Foo::Beta:
            s += "is important";
            f = Foo::Gamma;
            break;
        case Foo::Gamma:
            s += "is very important";
            f = Foo::Alpha;
    }
    
    // ...
}

Nothing fantastic to say about this code: it appends a suffix to the string depending on the value of f, sometimes changing f at the same time.

Now, let’s say we add a Delta to the enum class Foo, which does exactly the same as Gamma, but with a small difference in the text. There is a good chance this will be the result:

enum class Foo {
    Alpha,
    Beta,
    Gamma,
    Delta,
};

int main()
{
    std::string s;
    Foo f;

    // ...
    // Do things with s and f 
    // ...

    switch (f)
    {
        case Foo::Alpha:
            s += "is nothing";
            break;
        case Foo::Beta:
            s += "is important";
            f = Foo::Alpha;
            break;
        case Foo::Gamma:
            s += "is very important";
            f = Foo::Alpha;
        case Foo::Delta:
            s += "is not very important";
            f = Foo::Alpha;
    }

    // ...
}

The new case block is obviously copy-pasted. But did you notice the bug?

Since, in the first version, the developer of this code did not feel it necessary to put a break at the end of the last case, when we copy-pasted the Gamma case we left it without a break. So there will be an unwanted fallthrough in this switch.

New syntax

The new syntax presented in this article makes this kind of mistake less likely and makes the code a bit clearer.

Here it is:

    switch (f)
    {
        break; case Foo::Alpha:
            s += "is nothing";
        break; case Foo::Beta:
            s += "is important";
            f = Foo::Alpha;
        break; case Foo::Gamma:
            s += "is very important";
            f = Foo::Alpha;
        break; case Foo::Delta:
            s += "is not very important";
            f = Foo::Alpha;
    }

This is it: we put the break statement before the case.

This may look strange to you since the very first break is useless and there is no closing break in the last case block, but this is really functional and convenient.

If you begin each of your case blocks with break; case XXX:, you will never have a fallthrough bug ever again.

Benefits

The first benefit is the avoidance of the bug presented in the first section: forgetting to add a break when adding a case block. Even if you don’t copy-paste to create your new block, it’ll be visually obvious if you forget the break (your case statement won’t be aligned with the others).

But the real benefit (in my opinion) is that the syntax is, overall, nicer. For each case, you save a line by not putting the break within the case block, and everyone will notice at first sight that the switch-case has no fallthrough.

Of course, beauty is subjective. That includes the beauty of code. However, things like better alignment, clearer intentions, and line economy1 are, it seems to me, quite objective as benefits.

Disclaimer

The first time I saw this syntax, I quickly understood how it worked and why it was better than the “classic” syntax. However, I know that several people were confused and had to ask for an explanation.

But that’s almost always the case when introducing a new syntax.

So keep in mind that your team may be confused at first if you use it in a shared codebase. Be sure to explain (either in person or in the comments) so people can quickly adapt to this new form.

Wrapping up

This is certainly not a life-changing tip that is presented here, but I wanted to share it because I really like how it looks.

It’s yet another brick in the wall of making your code prettier.

Thanks for reading and see you next week.

Author: ChloΓ© Lourseyre
Editor: Peter Fordham

Addendum

Notes

  1. “Line economy” is beneficial when it discards non-informative statements, just like the break is in this context. I would never say huge one-liners are better than a more detailed block of code (because they aren’t). Reuniting the break and the case keywords lets your code breathe (you can put an empty line in place of the break if you want to keep space).

Is my cat Turing-complete?

Author: Chloé Lourseyre
Editor: Peter Fordham

This article is an adaptation of a Lightning Talk I gave at CppCon2021. A link to the video will be given here as soon as it’s available.

I’ll touch on a lighter subject this week, nonetheless quite important: is my cat Turing-complete?

Meet Peluche

Peluche (meaning “plush” in French) is a smooth cat that somehow lives in my house.

She will be our test subject today.

Is Peluche Turing-complete?

What is Turing-completeness

Turing-completeness is the notion that if a device can emulate a Turing machine, then it can perform any kind of computation1.

It means that any machine that implements the eight following instructions is a computer (and can thus execute any kind of computation):

  • . and ,: Inputting and outputting a value
  • + and -: Increase and decrease the value contained in a memory cell2.
  • > and <: Shift the current memory tape left or right.
  • [ and ]: Performing loops.

So, if Peluche can perform these eight instructions, we can consider her Turing-complete.

Proof of the Turing-completeness

Input and output

First, I poked Peluche to see if I could get a reaction.

She looked at me, then just turned around.

So here it is: I poked her, and I got a reaction. So she can process inputs and give outputs.

Input/output confirmed!

Increase and decrease memory value

The other day, I came back from work to this:

Kibbles everywhere…

But then I took a closer look and realized that the slabs could be numbered, like this:

This looks pretty much like a memory tape to me! Since she can spill kibbles on the tiles and then eat them directly from the floor, she can increase and decrease the values contained in a given memory cell.

Increase/decrease confirmed!

Shift the current memory cell left or right

Another time, I was doing the dishes and inadvertently spilled some water on Peluche. She began to run everywhere around the kitchen, making a huge mess.

If you look close (at the tip of the red arrow), you may notice that while making this mess, she displaced her bowl.

Displacing her bowl means she will spill her kibbles in another tile. This counts as shifting the memory head to edit another memory cell.

Shift of the memory tape confirmed!

Perform loops

So, after this mess, I (obviously) had to clean up.

No more than five minutes later, I went back to the kitchen to this:

Yeah… she can DEFINITELY perform loops…

Loops confirmed!

We have just proven that Peluche is, indeed, Turing-complete. So now, how can we use her to perform high-performance computations?

What to do with her?

Now that I’ve proven that Peluche is Turing-complete, I can literally do anything with her!

Thus, I tried to give her simple code to execute3:

😾😾😾😾😾😾😾😾
😿
🐈😾
🐈😾😾
🐈😾😾😾
🐈😾😾😾😾
🐈😾😾😾😾😾
🐈😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾😾😾😾😾
🐈😾😾😾😾😾😾😾😾😾😾😾😾😾😾😾😾
😻😻😻😻😻😻😻😻😻😻😻😻😻😻😻😻🐾
😸
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸ™€πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸ™€πŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸ™€πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸ™€πŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸ™€πŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸ˜ΎπŸ˜ΎπŸ™€πŸΎπŸΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜ΎπŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»
πŸˆπŸˆπŸˆπŸˆπŸˆπŸˆπŸΎπŸΎπŸ™€πŸ˜ΎπŸ˜Ύ
πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ˜»πŸ™€

The result was final: she wouldn’t do a thing.

Though they can, maybe cats are not designed to execute code after all?

About “cat-computing”

Jokes aside, cat-computing is the name I give to this generalized practice. In my experience, it happens quite often that when someone discovers a new feature of a language, they begin to use it everywhere, just because they can and they want to.

However, just like you can execute code using a cat4 but shouldn’t, the fact that you can use a feature does not mean that you should.

They were too busy wondering if they could to think about whether they should.

Dr Ian Malcolm, Jurassic Park

Wrapping up

Cat-computing seems to be a rookie mistake (and it is), but even the most experienced developers sometimes make rookie mistakes (and there’s no shame in that).

Every three years, a new version of C++ is published. Every time, it makes me want to use the new features in every possible situation. Though this is a good opportunity to build some experience around that (one of the best ways to avoid misuses of a feature is to perform these misuses once, in my opinion), this is also favorable ground for acquiring bad practices.

Always ask yourself if a feature is necessary5 before using it, or else you may do cat-computing.

Also, cat-computing is animal abuse, so don’t do it 😠.

Thanks for reading and see you next week!

(No cats were harmed during the writing of this article, but one was gently poked.)

Author: Chloé Lourseyre
Editor: Peter Fordham

Addendum

Notes

  1. This is a simplified definition, very inaccurate but accurate enough for this example. If you want the real definition, go there: Turing completeness – Wikipedia
  2. I did not state it explicitly, but a Turing machine has a “memory tape” with “memory cells” on it. The machine is always pointing to a memory cell, which is the mentioned “current” memory cell.
  3. You may not be able to read this sample of code — this is a fancy new language I designed called “braincat”.
  4. Actually, you can’t execute code using a cat, I know, but it’s for the sake of the metaphor that I assume you can.
  5. Of course, necessity occurs when there is a known benefit to the feature. I’m not talking about “absolute necessity” but about “practical necessity”.

Duff’s device in 2021

Author: Chloé Lourseyre
Editor: Peter Fordham

This year at the CppCon, Walter E. Brown made a Lightning Talk about Duff’s device (I’ll put a Youtube link here as soon as it’s available).

Duff’s device is a pretty old contraption and I was wondering “How useful can it be in 2021, with C++20 and all?”.

So here we go.

What is Duff’s device?

What we call Duff’s device (named after its creator Tom Duff) is a way to implement manual loop unrolling in the C language.

Loop unrolling is an execution time optimization technique in which we reduce the number of loop control evaluations by manually “unrolling” the loop content. It trades binary size for execution time (because your code is generally larger when you use this technique).

The principle of Duff’s device is to perform several computations (usually four to eight) in the same loop iteration, so the loop control is evaluated once every few computations instead of once per computation.

So, instead of doing this:

void execute_loop(int & data, const size_t loop_size)
{
    for (int i = 0 ; i < loop_size ; ++i)
    {
        computation(data);
    }
}

We do something that looks like this:

void execute_loop(int & data, const size_t loop_size)
{
    for (int i = 0 ; i < loop_size/4 ; ++i)
    {
        computation(data);
        computation(data);
        computation(data);
        computation(data);
    }
}

However, as you might have noticed, if loop_size is not a multiple of 4, the function performs the wrong number of calls to computation(). To rectify this, Duff’s device uses the C switch fall-through functionality, just like this:

void execute_loop(int & data, const size_t loop_size)
{
    size_t i = 0;
    switch(loop_size%4)
    {
        do{
            case 0: computation(data);
            case 3: computation(data);
            case 2: computation(data);
            case 1: computation(data);
            ++i;
        } while (i < (loop_size+3)/4);
    }
}

This is a bit more alien written that way, so I’ll explain it here:

At the beginning of the function, we enter the switch statement and immediately evaluate loop_size modulo 4. Depending on the result, we end up in one of the four cases. Then, because of the switch fallthrough, we end up doing a different number of computations depending on this modulo. This rectifies the problem of doing the wrong number of computations when loop_size is not a multiple of 4.

But then what happens? After falling through, the program encounters the while keyword. Thus, since it’s technically inside a do while() loop, the program goes back to the do statement and continues the loop as normal.

After the first iteration, the case N labels are simply ignored, so it is as if the code fell through every time.

You can check the numbers if you like: every time, we end up doing the correct number of computations.
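For instance, take loop_size = 6: since 6 % 4 == 2, the switch jumps to case 2, so the first pass performs 2 computations (the case 2 and case 1 bodies) and increments i to 1. Since 1 < (6+3)/4 == 2, the do-while runs one more full pass of 4 computations and increments i to 2; the condition 2 < 2 is now false and the loop stops, after 2 + 4 = 6 computations, as expected.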

Is Duff’s device worth it?

Duff’s device is from another time, another era (heck, it’s even from another language), so my first reaction about it in 2021 would be “This kind of device is probably counter-productive, I’d rather let the compiler do the optimization for me.”

But I want tangible proof of that. So what about a few benchmarks?

Benchmarks

To do the benchmarks, I used this code: Quick C++ Benchmarks – Duff’s device (quick-bench.com).

Here are the results1:

| Compiler   | Optimization option | Basic loop (cpu_time) | Duff’s device (cpu_time) | Duff’s device / Basic loop |
|------------|---------------------|-----------------------|--------------------------|----------------------------|
| Clang 12.0 | -Og                 | 7.5657e+4             | 7.2965e+4                | – 3.6%                     |
| Clang 12.0 | -O1                 | 7.0786e+4             | 7.3221e+4                | + 3.4%                     |
| Clang 12.0 | -O2                 | 1.2452e-1             | 1.2423e-1                | – 0.23%                    |
| Clang 12.0 | -O3                 | 1.2620e-1             | 1.2296e-1                | – 2.6%                     |
| GCC 10.3   | -Og                 | 4.7117e+4             | 4.7933e+4                | + 1.7%                     |
| GCC 10.3   | -O1                 | 7.0789e+4             | 7.2404e+4                | + 2.3%                     |
| GCC 10.3   | -O2                 | 4.1516e-6             | 4.1224e-6                | – 0.70%                    |
| GCC 10.3   | -O3                 | 4.0523e-6             | 4.0654e-6                | + 0.32%                    |

In this case, the difference is insignificant (3.5% on a benchmark really is not much; in live code this would be diluted into the rest of the codebase). Plus, whether it is the basic loop or Duff’s device that is the fastest depends on the optimization level and the compiler.

After that, I used a simpler version of computation() (one the compiler will optimize more easily), this one: Quick C++ Benchmarks – Duff’s device (quick-bench.com).

This gives the following results:

| Compiler   | Optimization option | Basic loop (cpu_time) | Duff’s device (cpu_time) | Duff’s device / Basic loop |
|------------|---------------------|-----------------------|--------------------------|----------------------------|
| Clang 12.0 | -Og                 | 5.9463e+4             | 5.9547e+4                | + 0.14%                    |
| Clang 12.0 | -O1                 | 5.9182e+4             | 5.9235e+4                | + 0.09%                    |
| Clang 12.0 | -O2                 | 4.0450e-6             | 1.2233e-1                | + 3 000 000%               |
| Clang 12.0 | -O3                 | 4.0398e-6             | 1.2502e-1                | + 3 000 000%               |
| GCC 10.3   | -Og                 | 4.2780e+4             | 4.0090e+4                | – 6.3%                     |
| GCC 10.3   | -O1                 | 1.1299e+4             | 5.9238e+4                | + 420%                     |
| GCC 10.3   | -O2                 | 3.8900e-6             | 3.8850e-6                | – 0.13%                    |
| GCC 10.3   | -O3                 | 5.3264e-6             | 4.1162e-6                | – 23%                      |

This is interesting, because we can see that Clang can, on its own, greatly optimize the basic loop without managing to optimize Duff’s device (with -O2 and -O3 the basic loop is 30 000 times faster than Duff’s device; this is because the compiler optimizes the basic loop into a single operation, but considers Duff’s device too complicated to be optimized).

On the other hand, GCC does not manage to optimize the basic loop much more than Duff’s device in the end: while at -O1 the basic loop is more than 5 times faster, at -O3 Duff’s device is almost 23% faster (which is significant)2.

Readability and semantics

At first glance, Duff’s device is a very odd contraption. However, it is relatively well known among C and C++ developers (especially the oldest ones). Plus, we already have a name for it and a pretty good Wikipedia page that explains how it works.

As long as you identify it as such in the comments, I think it is pretty safe to use Duff’s device, even if you know your coworkers don’t know about it (you can even put the Wikipedia link in the comments if you like!).

Trying to seek a very specific case

Principle

Loop unrolling specifically aims to reduce the number of loop control evaluations. So I set up a specific case where the loop control and the index increment are both heavier to evaluate.

So instead of using an integer as the loop index, I used this class:

struct MyIndex
{
  int index;
  
  MyIndex(int base_index): index(base_index) {}
  
  MyIndex& operator++() 
  {  
    if (index%2 == 0)
      index+=3;
    else
      index-=1;
    return *this;
  }

  bool operator<(const MyIndex& rhs)
  {
    if (index%3 == 0)
      return index < rhs.index;
    else if (index%3 == 1)
      return index < rhs.index+2;
    else
      return index < rhs.index+6;
  }
};

Each time we increment or compare the MyIndex, we perform at least one modulo operation (a pretty heavy arithmetic operation).

And I ran the benchmarks on it.

Benchmarks

So I used the following code: Quick C++ Benchmarks – Duff’s device with strange index (quick-bench.com)

This gives the following results:

| Compiler   | Optimization option | Basic loop (cpu_time) | Duff’s device (cpu_time) | Duff’s device / Basic loop |
|------------|---------------------|-----------------------|--------------------------|----------------------------|
| Clang 12.0 | -Og                 | 2.0694e+5             | 5.9710e+4                | – 71%                      |
| Clang 12.0 | -O1                 | 1.8356e+5             | 5.8805e+4                | – 68%                      |
| Clang 12.0 | -O2                 | 1.2318e-1             | 1.2582e-1                | + 2.1%                     |
| Clang 12.0 | -O3                 | 1.2955e-1             | 1.2553e-4                | – 3.1%                     |
| GCC 10.3   | -Og                 | 6.2676e+4             | 4.0014e+4                | – 36%                      |
| GCC 10.3   | -O1                 | 7.0324e+4             | 6.0959e+4                | – 13%                      |
| GCC 10.3   | -O2                 | 6.5143e+4             | 4.0898e-6                | – 100%                     |
| GCC 10.3   | -O3                 | 4.1155e-6             | 4.0917e-6                | – 0.58%                    |

Here, we can see that Duff’s device is always better at the low optimization levels, but never by a significant margin at -O3. This means that the compilers manage to optimize the basic loop as much as Duff’s device at the higher optimization levels. This is significantly different from the previous results.

Why are the results so inconsistent?

The benchmarks show very inconsistent results: for instance, how come that, in the context of the simple computation(), with GCC and -O1, the basic loop is more than five times faster than Duff’s device, whereas with -O3 it’s Duff’s device that is 23% faster? How come that, for the same code, Clang shows totally different results than GCC, with the basic loop thirty thousand times faster at -O2 and -O3?

This is because each compiler has its own ways of optimizing these kinds of loops at different levels of optimization.

If you want to look into it, you can compare the assembly code generated by each compiler, just like in this example: Compiler Explorer (godbolt.org), where the GCC and Clang versions at the -O3 level of optimization are put side by side.

I would have loved to detail that here, but unfortunately it would take more than one whole article to analyze them all. If you, reader of this article, want to take things into your own hands and perform the analysis yourself, I’ll be more than glad to publish your results on this blog (you can contact me here).

Wrapping up

I will summarize the results in the following chart, which indicates which device is best in each of the implementations we saw:

| Compiler   | Optimization option | Complex computation() | Trivial computation() | Heavy loop control |
|------------|---------------------|-----------------------|-----------------------|--------------------|
| Clang 12.0 | -Og                 | None                  | None                  | Duff’s device      |
| Clang 12.0 | -O1                 | None                  | None                  | Duff’s device      |
| Clang 12.0 | -O2                 | None                  | Basic loop            | None               |
| Clang 12.0 | -O3                 | None                  | Basic loop            | None               |
| GCC 10.3   | -Og                 | None                  | None                  | Duff’s device      |
| GCC 10.3   | -O1                 | None                  | Basic loop            | Duff’s device      |
| GCC 10.3   | -O2                 | None                  | None                  | Duff’s device      |
| GCC 10.3   | -O3                 | None                  | Duff’s device         | None               |

How to interpret these results?

First, when we have a complex computation and a trivial loop control, there is no significant difference between the two.

Second, when the computation is trivial, it’s often the basic loop that is better, but not always.

Third, as expected, it is Duff’s device that is preferred with a heavy loop control, but not systematically.

And finally, the results will almost always depend on your implementation. While doing my research for this article, I found myself trying several implementations of the code I used to illustrate Duff’s device, and I often ended up with pretty different benchmarks each time I made a tiny edit on the code.

My point here is that sometimes Duff’s device is better than a basic loop, and sometimes it’s the other way around (even if, most of the time, there is no major difference).

In conclusion, Duff’s device is still worth considering3, but you’ll have to do your own benchmarks to be sure where it is indeed useful. However, Duff’s device does add more verbosity to the code. Even if it’s easy to document (as stated before), you may not have the time (or not want to spend the time) to do benchmarks and consider Duff’s device. It’s up to you.

Thanks for reading and see you next week!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addendum

Notes

  1. The “cpu_time” mentioned in the chart is an abstract unit of measure, reported by quick-bench. It has no use on its own; it is only used to compare the two sides of each benchmark. That’s why the order of magnitude may vary from one line to another. You want to look at the last column.
  2. The results presented here also depend on the implementation of each compute_*(). For instance, if you evaluate (loop_size+3)/4 on each loop iteration instead of using a const size_t to store it, GCC results are very different and Duff’s device is no longer significantly the best with -O3.
  3. I’ll just add this note here to remind you one trivial fact: execution time optimization is only worth considering when your code is time-sensitive. If you work on a non-time-sensitive code, you shouldn’t even consider Duff’s device in the first place. When possible, keep it simple, and keep in mind the 80:20 rule.

Pragma: once or twice?

Author: Chloé Lourseyre
Editor: Peter Fordham

Context

Header guards

Every C++ developer has been taught header guards. Header guards are a way to prevent a header from being included multiple times, which would be problematic because it would mean that the variables, functions, and classes in that header would be defined several times, leading to a compilation error.

Example of a header guard:

#ifndef HEADER_FOOBAR
#define HEADER_FOOBAR

class FooBar
{
    // ...
};

#endif // HEADER_FOOBAR

For those who are not familiar with it, here is how it works: the first time the file is included, the macro HEADER_FOOBAR is not defined. Thus, we enter into the #ifndef control directive. In there, we define HEADER_FOOBAR and the class FooBar. Later, if we include the file again, since HEADER_FOOBAR is defined, we don’t enter into the #ifndef again, so the class FooBar is not defined a second time.

#pragma once

#pragma is a preprocessor directive providing additional information to the compiler, beyond what is conveyed in the language itself.

Any compiler is free to interpret any pragma directive as it wishes. However, over the years, some pragma directives have acquired significant popularity and are now almost-standard (such as #pragma once, which is the topic of this article, or #pragma pack).

#pragma once is a directive that indicates to the compiler to include the file only once. The compiler manages itself how it remembers which files are already included or not.

So, instinctively, we can think that the #pragma once directive does the job of a header guard, but with only one line and without having to think of a macro name.
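For comparison, here is the same FooBar header written with #pragma once instead of a guard:

#pragma once

class FooBar
{
    // ...
};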

Today?

In the past, the #pragma once directive was not implemented for every compiler, so it was less portable than header guards.

But today, in C++, I could not find a single compiler that does not implement this directive.

So why bother using header guards anymore? Answer: because of the issue I’m about to describe.

A strange issue with #pragma once

There is actually one kind of issue that can occur with #pragma once that cannot occur with header guards.

Say, for instance, that your header file is duplicated somewhere. This is a not-so-uncommon issue that may have multiple causes:

  • You messed up a merge and your version control system duplicated some files.
  • The version control system messed up the move of some files and they ended up duplicated.
  • Your filesystem has two separate mount points that give a path to the same files. The files appear as two different sets (since they are present on both disks).
  • Someone duplicates one of your files in another part of the project for their personal use, without renaming anything (this is bad manners, but it happens).

(please note that I encountered each of these four issues at some point in my career).

When this happens, when you have the same file duplicated, header guards and #pragma once do not behave the same way:

  • Since the macros that guard each file have the same name, the header guards will work perfectly fine and only include one file.
  • Since, from the FS point of view, the files are different, #pragma once will behave as if they were different files, and thus include each file separately. This leads to a compilation error.

Issues with header guards?

Header guards can have issues too. You can have typos in the macro names, rendering your guards useless; that can’t happen with #pragma once. It is also possible for macro names to clash if they are badly chosen, which, again, can’t happen with #pragma once.

However, these issues can be easily avoided (typos are easy to detect and name clashes are prevented if you have a good naming convention).

A huge benefit though!

There is also a usage of header guards that is very useful for testing and that is not possible with #pragma once.

Say you want to test the class Foo (in file Foo.h) that uses the class Bar (in file Bar.h). But, for testing purposes, you want to stub the class Bar.

One option header guards allow is to create your own mock of class Bar in a file BarMock.h. If the mock uses the same header guards as the original Bar, then in your test, when you include BarMock.h before Foo.h, the header Bar.h will not be included (because the mock is already included and defines the same guard), as sketched below.
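Here is a minimal sketch of that trick. The file names and the HEADER_BAR macro are illustrative; the only requirement is that the mock reuses the exact guard macro of the original Bar.h:

// BarMock.h -- a stub of Bar that reuses Bar.h's guard macro
#ifndef HEADER_BAR
#define HEADER_BAR

class Bar
{
public:
    int compute() const { return 42; } // fixed value, convenient for the test
};

#endif // HEADER_BAR

// FooTest.cpp
#include "BarMock.h" // defines HEADER_BAR and the stubbed Bar
#include "Foo.h"     // Foo.h includes Bar.h, but its content is skipped since HEADER_BAR is already defined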

So, should I use #pragma once or header guards?

This question is a bit difficult to answer. Let’s take a look at the cons of each method:

  • #pragma once is non-standard and becomes a major issue when you end up in a degraded environment (like the duplicated-file situations described above).
  • Header guards may have issues if handled improperly.

In my opinion, #pragma directives are to be avoided when possible. Even if, in practice, they work, they are not formally standard.

Dear C++20, what about Modules?

Modules, one of the “big four” features of C++20, change our vision of the “classical build process”. Instead of having source and header files, we can now have modules. They overcome the restrictions of header files and promise a lot: faster build times, fewer violations of the One-Definition-Rule, less usage of the preprocessor.

Thanks to modules, we can say that #pragma once and header guard issues are no more.

To learn more about modules, check these out:

Wrapping up

This article, talking about pragmas and header guards, targets projects that predate C++20. If you are in that case and hesitate between #pragma once and header guards, maybe it’s time to upgrade to C++20?

If you can’t upgrade to C++20 (few industrial projects can), then choose wisely between #pragma once and header guards.

Thanks for reading and see you next time!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addendum

Source

[History of C++] The genesis of casting.

Author: Chloé Lourseyre
Editor: Peter Fordham

C-style casts

First of all, to understand the rationale behind the design of C++ casts, I think it’s important to remind you how C-style casts work, both in C and in C++.

In C1

In C, you have two ways to cast:

  • You perform a value cast, an arithmetic conversion from one numeric type into another. You can have data loss if the target type is narrower than the origin type (for instance, when you cast a float into a long, or a long into an int).
  • You perform a pointer cast, which converts a pointer of one type into a pointer of another type. This can work well, as in this example (https://onlinegdb.com/sYFCGeZmH), but it can quickly be error-prone, like in this example (https://onlinegdb.com/pWovM17X4) where the types are not exactly the same, and in this example (https://onlinegdb.com/HHjNS9NSb) where one structure is bigger than the other2.

Though it is a C feature that has its uses and misuses, in the C++ language this is a behavior we want to avoid.

In C++

The C-style cast does not work in C++ exactly the same way it works in C (even though the final behaviors are similar).

When you perform a C-cast in C++, the compiler tries the following cast operations, in order, until it finds one that it can compile:

  1. const_cast
  2. static_cast
  3. static_cast followed by const_cast
  4. reinterpret_cast
  5. reinterpret_cast followed by const_cast

This is a process that is not appreciated by C++ developers (to say the least), because the cast that is performed is not explicit and does not catch potential errors at compile time.

You can find more information about this behavior on the following page: Explicit type conversion – cppreference.com
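As an illustration, here is a small sketch (the variables are hypothetical) of how the same C-style cast syntax silently resolves to very different operations:

int main()
{
    const int value = 42;

    double d = (double)value; // resolves to a static_cast
    int* p = (int*)&value;    // silently performs a const_cast
    char* c = (char*)&value;  // resolves to a reinterpret_cast followed by a const_cast

    // With the C++-style operators, the intent is explicit and checked:
    // static_cast<int*>(&value) would not compile, because it refuses to drop the const qualifier.
}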

Run-Time Type Information

Original idea and controversies

What we call Run-Time Type Information, often shortened to RTTI, is a mechanism that allows the type of an object to be determined during program execution.

This is used in polymorphism, where you can manipulate an object through its base class interface (thus, you don’t know at compile-time which derived class you are manipulating).

RTTI for C++ was drafted from the earliest version, but its development and implementation were postponed in hope that it would prove unnecessary.

Some people, at that time, raised their voice against the feature, saying that it would need too much support, was too heavy to implement, too expensive, complicated and confusing, “inherently evil” (against the spirit of the language), seen as the beginning of an avalanche of new features, etc.

However, Bjarne Stroustrup finally decided that it was worth implementing, for three reasons: it is important to some people, it is harmless to those who won’t use it, and without it libraries will develop their own RTTI anyway, leading to inconsistency.

In the end, RTTI was implemented in three parts:

  • The dynamic_cast operator, allowing a pointer to a derived class to be obtained from a pointer to the base class — only if the pointer is effectively of the derived class.
  • The typeid operator, allowing identification of the exact type of an object given an object of the base class.
  • The type_info structure, giving additional run-time info on a given type.

Early in the process (and the main reason he decided to wait until RTTI was needed before implementing it), Stroustrup detected numerous misuses of the feature, and some people even labelled it as a “Dangerous feature”.

However, there is a major difference between a feature that can be misused and a feature that will be misused. That difference resides in education, design, testing, etc. But this has a cost, and the real question is: are the benefits of such a dangerous feature worth the effort necessary to keep misuses at an anecdotal level?

The final decision was yes: it is worth the shot. But not all developers agreed at that time.

Syntax

Since casts couldn’t be inherently made safe, Stroustrup wanted to provide a syntax that both signaled the use of an unsafe feature and discouraged its use when there were alternatives.

The C++ crew originally considered either using Checked<T*>(p); for run-time checked conversion and Unchecked<T*>(p); for unchecked conversion, or using (virtual T*)p for dynamic cast only.

But in regard to these constraints and the fact that dynamic_cast and “standard” casts (which we now call static_cast) are two entirely different operations, the old syntax was abandoned in favor of more verbose unary operators. These are the operators we know today, dynamic_cast<T*>(p) and static_cast<T*>(p) (and, later, the other casts).

typeid() and type_info

The first implementation of RTTI only provided dynamic_cast. However, soon people wanted to know more about the types they were dynamically manipulating, leading to the creation of typeid() and type_info.

The typeid() operator can be called on any polymorphic object and returns a reference to a type_info that holds all the needed information. The reason it returns a reference and not a pointer is to prevent pointer comparison and arithmetic on it.

type_infos are uncopiable, polymorphic, comparable, sortable (so they can be used in hash maps and such) and hold the name of the type.
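As a small sketch (the Shape/Circle hierarchy is illustrative), here is what typeid and type_info look like in use:

#include <iostream>
#include <typeinfo>

struct Shape { virtual ~Shape() = default; }; // polymorphic: it has at least one virtual function
struct Circle : Shape {};

int main()
{
    Circle circle;
    const Shape& shape = circle;

    // Because Shape is polymorphic, typeid inspects the dynamic type of the object
    std::cout << typeid(shape).name() << std::endl;              // implementation-defined name for Circle
    std::cout << (typeid(shape) == typeid(Circle)) << std::endl; // prints 1
}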

Uses and Misuses

Now, there are two categories of types: those that have type information at run time and those that don’t. It was decided that only polymorphic classes, i.e. the classes that can be manipulated through base classes, would hold RTTI.

At first, people wondered if it would not cause some problems (and frustration), because sometimes it is hard to tell (as a developer) if a class is polymorphic or not. But this is not a big issue, because the compiler is able to tell at compile time if a use of typeid or type_info is illegal or not.

The main issue that was anticipated was the over-use of RTTI where it isn’t needed. For instance, we can expect such code:

void rotate(const Shape& r)
{
    if (typeid(r) == typeid(Circle)) 
    {
        // do nothing
    }
    else if (typeid(r) == typeid(Triangle)) 
    {
        // rotate triangle
    }
    else if (typeid(r) == typeid(Square)) 
    {
        // rotate square
    }
}

However, this is a broken use of RTTI because it does not correctly handle classes derived from the ones mentioned. A better way to do this would be via virtualization.

Another misuse would be with unnecessary type-checking, like in the following code:

Crate* foobar(Crate* crate, MyContainer* cont)
{
    cont->put(crate);

    // do things...

    Object* obj = cont->get();
    Crate* cr = dynamic_cast<Crate*>(obj);
    if (cr)
        return cr;
    // else, handle error
}

Here, we manually check the type of the object in MyContainer, although it would be better to use a templated version of the container, like so:

Crate* foobar(Crate* crate, MyContainer<Crate>* cont)
{
    cont->put(crate);

    // do things...

    return cont->get();
}

Here, no need to check for errors and, most of all, no use of RTTI.

These two misuses of C++ RTTI are most commonly performed by developers following the guidelines of other languages (like C, Pascal, etc.) where such code is accepted, even encouraged. But it doesn’t fit the C++ design.

Abandoned features

Here is a list of features that have been considered for the C++ RTTI, but not adopted in the end:

  • Meta-Objects: it replaces the type_info by a mechanism (the meta-object) that can accept (at run time) requests to perform any operation that can be requested of an object of the language. However, it embeds an interpreter for the complete language, which is a threat to its efficiency.
  • The Type-inquiry Operator: An alternative to the dynamic_cast was an operator that can say if an object is of a derived class or not. If so, it would allow us to then cast it (old-style) to the derived class in order to use it. However, dynamic_cast and static_cast can both be applied to pointers and yield different results, so we needed to make the distinction, because old-style-casting pointers would not always give us the result we expect. Plus, decorrelating the check and the cast can cause mismatches.
  • Type Relations: Using comparison operators (such as < and <=) was suggested, but it was judged “too cute” (meaning it is a non-mathematical interpretation of the operator, giving meaning to an operation that has no mathematical meaning). Plus, this has the same check/cast decorrelation as the type-inquiry operator.
  • Multi-methods: it is the ability to select a virtual function based on more than one object. Such a mechanism may be useful to people who develop binary operators. However, at that time, Stroustrup was not familiar with the concept and decided it would be implemented only if needed later.
  • Unconstrained Methods: this is the mechanism that allows a polymorphic object to call any method that could be called, checking at run time if it can effectively be called, handling errors accordingly. However, with the dynamic_cast we can check this ourselves, which is more efficient and type-safe.
  • Checked Initialization: this is the ability to initialize a derived class object from a polymorphic object, checking at run-time if the types actually match. However, there were syntax complications and error-handling uncertainties, and it can be done using, again, a dynamic_cast.

C++-style casts

Problems and consequences

The C-style cast is (quoting B.S.) “a sledgehammer”. It means that when you write (B)expr you say “make expr a B, and whatever happens happens.”. This can become very unfortunate when const or volatile qualifiers are involved.

In addition to that, the syntax is simplistic: hard to see, hard to parse, and you need to overuse parentheses when you want to call a derived method in a polymorphic context3.

Thus, it was decided to separate the different ways to cast into separate operators. This way, when you write a cast, you write how you want to cast. Plus, this adds some verbosity to the operation which makes parsing easier and warns the reader that a potentially dangerous operation is happening.

Since there are really bad behaviors (from the C++ point of view) in C-style casts, there are C++-specific cast operators that are meant not to be used (to be isolated from the “good” cast operators). These behaviors are not deprecated from the language because in some specific contexts they can be useful, but they need to be separated from the others so they cannot be used by accident and so it is obvious when they are used.

The different casting operators

dynamic_cast

I won’t talk much about dynamic_cast, since this operator is covered in the previous section (about RTTI). Just keep in mind that the keyword dynamic_cast is the one associated with the RTTI solution.

dynamic_cast makes a conversion that is checked dynamically, i.e. at run-time. If you want a static check, i.e. at compile-time, you would prefer static_cast.

static_cast

The static_cast can be described as the inverse operation to the implicit conversion. If A can be implicitly converted to B, then B can be static_casted to A. The operator can also perform any conversion that can be implicitly done.

This alone covers the majority of conversions that do not require dynamic type checking.

static_cast respects constness (making it safer than C-style casts) and is static (any error will be detected at compile time).

Whenever it is relevant, static_casting to a user-defined type seeks any single-parameter constructor that can match the conversion (if you try to statically cast a Foo into a Bar, the compiler will look for a Bar(Foo) constructor) or any relevant conversion operator. See user-defined conversion function – cppreference.com for more info.

Also, you cannot perform a static_cast to or from a pointer to an incomplete type (which can be done using another C++-style cast).
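Here is a minimal sketch of that “inverse of the implicit conversion” idea (the Base/Derived hierarchy and the variables are illustrative):

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

int main()
{
    Derived d;
    Base* b = &d;                           // implicit conversion: Derived* to Base*
    Derived* d2 = static_cast<Derived*>(b); // the inverse direction, checked at compile time only

    double pi = 3.14;
    int truncated = static_cast<int>(pi);   // a conversion that could also happen implicitly
}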

reinterpret_cast

The reinterpret_cast holds the “unsafe” part of the C-style cast. With it, you can cast pointers from one class to another, unrelated class, or from and to a pointer to an incomplete type.

This conversion basically reinterprets the argument it is given. You can even convert pointers to functions and pointers to members this way.

This is inherently unsafe and must be performed with great caution. Wherever you write or see a reinterpret_cast, you know you must be extra careful. Using reinterpret_cast is almost as unsafe as C-style casts.

reinterpret_cast can easily lead to undefined behavior if not used following a specific set of rules (which you can find on its documentation page: reinterpret_cast conversion – cppreference.com)

For instance: if you use reinterpret_cast to go from one pointer type to another and then dereference that pointer to access its content, that’s likely undefined behavior.

const_cast

The goal of this operator is to ensure that the const and volatile qualifiers are never silently cast away.

To perform this operation, the source and destination types must be the same, except the const and volatile qualifiers which can differ.

This is a very dangerous operation and must be used with great caution. Always remember that casting away const from an object originally defined as const, and then modifying it, is undefined behavior.
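Here is a minimal sketch of a legitimate use: a const-correct wrapper around a function that takes a non-const pointer but does not modify the object (all names are illustrative):

#include <iostream>

// Hypothetical legacy function: it takes a non-const pointer but only reads from it
void legacy_print(int* p) { std::cout << *p << std::endl; }

void print_value(const int& value)
{
    // Safe only because legacy_print never modifies *p
    legacy_print(const_cast<int*>(&value));
}

int main()
{
    int x = 42;
    print_value(x); // fine: x is not const

    const int y = 42;
    print_value(y); // still fine here, but if legacy_print wrote to *p, this would be undefined behavior
}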

bit_cast

Not really historical (it was introduced in C++20), but std::bit_cast was basically made to replace the manual conversions done with std::memcpy().

The bit_cast can be undefined if there is no value of the destination type corresponding to the value representation produced (just like with memcpy).

Unlike with reinterpret_cast, if you use bit_cast to go from one type to another and then access the result, it is not undefined behavior as long as you know for sure that those bits are a valid representation of the target type. The difference here is subtle, but it allows the compiler to safely make lots of cases work efficiently and do the right thing in more complex cases without invoking undefined behavior. The typical use case is serialization.
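Here is a short sketch of that serialization use case, reading the bits of a float as an integer and back (assuming the usual 32-bit float):

#include <bit>
#include <cstdint>
#include <iostream>

int main()
{
    float f = 3.14f;

    // Copy the value representation of the float into an integer of the same size
    std::uint32_t bits = std::bit_cast<std::uint32_t>(f);
    std::cout << std::hex << bits << std::endl;

    // And reconstruct the float from those bits
    float back = std::bit_cast<float>(bits);
    std::cout << back << std::endl;
}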

Wrapping up

Historically, the way the C-cast operator was split into four C++ operators follows three simple rules:

  • If you need to check the types dynamically, then use dynamic_cast.
  • If you can check the types statically, then use static_cast.
  • In any other case, it is reinterpret_cast or const_cast that you need, but this is very dangerous.

I’ll add to that that, in any situation, you should not perform a reinterpret_cast or a const_cast unless you know what you are doing. You should never, ever perform these casts only because the other ones did not work.

RTTI as a whole is a useful (but totally optional) feature. But it is not simple to master.

In modern C++, we want to perform checks as much as possible at compile time (for security and performance), so when we are able, we want to use static features instead of dynamic ones.

Of course, you should not force static code where dynamic code would be better, but you should always think of a static solution before a dynamic one.

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Sources

Notes

1. As much as I consider myself an expert in the C++ language, my knowledge of the C language is much more limited. There may be errors in this subsection. If so, please tell me in comments so I can edit the article.

2. You can also cast away the const qualifier through the pointer cast (https://onlinegdb.com/8HIJIeonA) but I don’t think it’s a whole different way to cast.

3. For instance, if px is a pointer to an object of type X (implemented as B) and B is a derived class of X that has a method g, you need to write ((B*)px)->g() to call g from px. A simpler syntax could have been px->B::g().

[History of C++] Templates: from C-style macros to concepts

Author: Chloé Lourseyre
Editor: Peter Fordham

Introduction: Parametrized types

Templates are the C++ feature (or group of features, as the word is used in several contexts) that implement parametrized types.

The notion of a parametrized type is very important in modern programming and consists of using a type as a parameter of a feature, which means you can use that feature with different types, the same way you use a feature with different values.

The simplest example is std::vector. When you declare a vector like this: std::vector<int> foo;, the type int is a parameter. You could have put another type, like double, void*, a user-defined class, or even another vector, instead of int.

It is a way to achieve metaprogramming, a programming technique in which programs treat other programs (or themselves) as data.

For the rest of the article I will use the word “template” to refer to either the notion of parametrized types or the C++ template implementation (unless I want to explicitly make the distinction).

Before Templates

Before the creation of templates, in early C++, people had to write C-style macros to emulate them.

One way of doing this was as follows:

foobar.h
void foobar(FOOBAR_TYPE my_val);
foobar.cpp
void foobar(FOOBAR_TYPE my_val)
{
    // do stuff
}
main.cpp
#define FOOBAR_TYPE int
#include "foobar.h"
#include "foobar.cpp" // Only do this in a source file
#undef FOOBAR_TYPE

#define FOOBAR_TYPE double
#include "foobar.h"
#include "foobar.cpp" // Only do this in a source file
#undef FOOBAR_TYPE

int main()
{
    int toto = 42;
    double tata = 84;
    foobar(toto);
    foobar(tata);
}

Don’t do this at home, though! This is not something we ought to do nowadays (especially the #include "foobar.cpp" part). Also note that this code takes advantage of function overloading and therefore does not compile in C.

With our modern eyes, this seems very limited and error-prone. But the interesting thing is that, even before C++ templates were implemented, the early C++ design teams could use the macro approach to gain experience with them.

Timing

Templates were introduced in Release 3.0 of the language, in October 1991. In The Design and Evolution of C++, Stroustrup reveals that it was a mistake to introduce this feature so late, and in retrospect it would have been better to do so in Release 2.0 (June, 1989) at the expense of less important features, like multiple inheritance:

Also, adding multiple inheritance in Release 2.0 was a mistake. Multiple inheritance belongs in C++, but is far less important than parameterized types — and to some people, parameterized types are again less important than exception handling.

Bjarne Stroustrup, The Design and Evolution of C++, chapter 12: Multiple Inheritance, Β§1 – Introduction

From today’s perspective, it is clear that Stroustrup was right and that templates have shaped the landscape of C++ much, much more than multiple inheritance.

This addition came late because it was really time-consuming to explore the design and implementation issues.

Needs and goals

The original need for templates was to express the parametrization of container classes. But for that job, macros were too limited: they fail to obey scope and type rules and don’t interact well with tools (especially debuggers). Before C++ templates, it was very hard to maintain code that used parametrized types; you needed the lowest level of abstraction and you needed to add each parametrized type manually.

The first concerns regarding templates were whether templates would be easy to use and templated objects as easy to use as hand-coded objects, whether compilation and linking speed would be significantly impacted, and whether they would be portable.

The build process of templates

Syntax

The angle brackets

Designing the syntax of a feature is not an easy job, and requires extensive discussion and refinement.

The choice of the brackets <...> for the template parameter was made because, even though parentheses would have been easier to parse, they are overused in C++ and brackets are (empirically) more pleasant for a human reader.

However, this causes a problem for nested brackets, such as:

List<List<int>> a;

In the code above, in earlier C++, you would get a compilation error. The closing >> is seen by the compiler as operator>> and not as two closing brackets.

A lexical trick was added later in the language1 (with C++11) so that this is not seen as a syntax error anymore.

The template argument

Initially, the template argument would have been placed just after the object name:

class Foo<class T>
{
    // ...
};

However, that caused two major issues:

  • It is a bit too hard to read for parsers and humans. Since the template syntax is nested within the syntax of the class, it is a bit tough to detect.
  • In the case of function templates, the templated type can be used before it is declared. For instance, in this declaration: T at<class T>(const std::vector<T>& v, size_t index) { return v[index]; }, since T is the return type, it is parsed before we even know it is a template parameter.

Both issues are resolved if we put the template argument before the declaration, and this is what was done:

template<class T> class Foo
{
    // ...
};

template<class T> T at(const std::vector<T>& v, size_t index) { return v[index]; }

Constraints of template parameters

In C++, the constraints on template arguments are implicit2.

The dilemma was whether the constraints should be explicit in the template argument (like below) or whether they should be deduced from usage. An example of such an explicit constraint would look like this:

template < class T {
        int operator==(const T&, const T&); 
        T& operator=(const T&);
        bool operator<(const T&, int);
    };
>
class Foo {
    // ...
};

But this was judged to be way too verbose to be readable, and it would require way more templates for the same number of features. Moreover, this kind of over-restricts the class you’re implementing, giving constraints that exclude some implementations that would have been perfectly fine and correct without them3.

However, having explicit constraints was not off the table; it is just that function signatures are too specific a way to express them.

This could have been achieved through derivation: by specifying that your templated type must derive from another class, you can have explicit constraint on this type.

template <class T>
class TBase {
    int operator==(const T&, const T&); 
    T& operator=(const T&);
    bool operator<(const T&, int);
};

template <class T : TBase>
class Foo {
    // ...
};

However, this generates more issues. Programmers are, because of that, encouraged to express constraints as classes, leading to an overuse of inheritance. There is a loss in expressivity and semantics, because “T must be comparable to an int” becomes “T must inherit from TBase”. In addition to that, you could not express constraints on types that can’t have a base class, like ints and doubles.

This is mainly why we did not have explicit constraints on template parameters in C++4 for a long time.

However, the discussion on template constraints was revived in the late 2010s and a new notion made its appearance in C++20: Concepts (c.f. Modern evolutions – Concepts below).

Templated object generation

How templates are compiled is very simple: for every set of template parameters that is used on a templated object, the compiler will generate an implementation of this object explicitly using those parameters.

So, writing this:

template <class T> class Foo { /* ... do things with T ... */ };
template <class T, class U> class Bar { /* ... do things with T  and U... */ };

Foo<int> foo1;
Foo<double> foo2;
Bar<int, int> bar1;
Bar<int, double> bar2;
Bar<double, double> bar3;
Bar< Foo<int>, Foo<long> > bar4;

Is the same thing as writing this:

class Foo_int { /* ... do things with int ... */ };
class Foo_double { /* ... do things with double ... */ };
class Foo_long { /* ... do things with long ... */ };
class Bar_int_int { /* ... do things with int  and int... */ };
class Bar_int_double { /* ... do things with int  and double... */ };
class Bar_double_double { /* ... do things with double  and double... */ };
class Bar_Foo_int_Foo_long { /* ... do things with Foo_int  and Foo_long... */ };

Foo_int foo1;
Foo_double foo2;
Bar_int_int bar1;
Bar_int_double bar2;
Bar_double_double bar3;
Bar_Foo_int_Foo_long bar4;

… only it is more verbose (even more in real code) and less readable.

Class templates

Templates were imagined and designed primarily for classes, mostly to allow for the implementation of standard containers. They were designed to be as simple to use as standard classes and as efficient as macros. These two facts were decided so that low-level arrays could be abandoned when they were not specifically needed (in low-level programming) and that templatized containers would be preferred in the higher levels.

In addition to type arguments, templates can take non-type arguments, like this:

template <class T, int Size>
class MyContainer {
    T m_collection[Size];
    int m_size;
public:
    MyContainer(): m_size(Size) {}
    // ...
};

This was introduced in order to allow for static sizing of containers. Carrying the size in the type information makes the implementations more efficient, because you don’t have to track it separately and you don’t lose it through pointer decay as you do with C-style arrays.

class Foo { /* ... */ };

int main()
{
    Foo fooTable[700]; // low-level container
    MyContainer<Foo, 700> fooCnt; // high-level container, as efficient as the previous one
}

Function templates

The idea of function templates comes from the need for templated class methods and from the idea that function templates are the logical extension of class templates.

Today, the most obvious examples we can provide are the STL algorithms (std::find(), std::find_first_of(), std::merge(), etc.). Though the STL algorithms did not exist at the time of this design, it was this kind of function that inspired function templates (the most emblematic being sort()).

The main issue with function templates was deducing the function template arguments and return type, so we don’t have to explicitly specify them at each function call-site.

In this context, it has been decided that template arguments could be deduced (when possible) and specified (when needed). This was extremely useful to specify return types, because they cannot always be deduced, such as in this example:

template <class TTo, class TFrom>
TTo convert(TFrom val)
{
    return val;
}

int main()
{
    int val = 4;
    convert(val);              // Error: TTo cannot be deduced
    convert<double, int>(val); // Correct: TTo is double; TFrom is int
    convert<double>(val);      // Correct: TTo is double; TFrom is int
}

As you can see on line 12, the trailing template arguments can be omitted as can be trailing function arguments (when they have a default value).

The way templates are generated (see the Templated object generation section above) works perfectly fine with function overloading. The only subtlety is when there are both non-templated and templated overloads of a function. In that case, the non-templated overload is called if there is a perfect match; otherwise, the templated overload is called if a perfect match is possible; otherwise, ordinary overload resolution applies (see the sketch below).
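Here is a minimal sketch of this subtlety (foo is an illustrative name):

#include <iostream>

template <class T>
void foo(T) { std::cout << "template" << std::endl; }

void foo(int) { std::cout << "non-template" << std::endl; }

int main()
{
    foo(42);  // both overloads are perfect matches: the non-templated one wins
    foo(4.2); // the template instantiated with T = double is a perfect match: prints "template"
    foo('c'); // the template with T = char is a perfect match, preferred over the int overload: prints "template"
}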

Template instantiation

At the beginning, explicit template instantiation was not intended. This was because it would create hard issues in some specific circumstances, for instance if two unrelated parts of a program both request the same instantiation of a templated object, which would have to be satisfied without code replication and without disturbing dynamic linking. This is why implicit template instantiation was preferred at first.

The first automatic implementation for template instantiation was as follows: when the linker is run, it searches for missing template instantiations. If found, the compiler is invoked again to produce the missing code. This process is repeated until there are no more missing template instantiations.

However, this process had several problems, one of them being very poor compile and link performance.

It is to mitigate this that optional explicit template instantiation is allowed.
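Here is a small sketch of what explicit instantiation looks like (Matrix is an illustrative name); the definitions are placed in a single translation unit so the code is generated exactly once:

// matrix.h
template <class T>
class Matrix { /* ... */ };

// matrix.cpp
template class Matrix<int>;    // explicit instantiation definition: Matrix<int> is generated here
template class Matrix<double>; // same for Matrix<double>

// other_file.cpp (C++11 and later)
extern template class Matrix<int>; // explicit instantiation declaration: do not instantiate it here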

The development of the template instantiation process had many more issues, such as the point of instantiation (the “name problem”, i.e. pinpointing which declaration the names used in a template definition refer to), dependencies problems, solving ambiguities, etc. Discussing all of these would require a dedicated article.

Modern evolutions

Templates are a feature that continued to evolve even as we entered the Modern C++ era (beginning with C++11).

Variadic templates

Variadic templates are templates that have at least one parameter pack. In C++, a parameter pack is a way to say that a function or a template has a variable number of arguments.

For example, the following function template declaration uses a parameter pack:

template<typename ...TArgs> void foobar(TArgs... args);

And can be called with any number of arguments.

foobar(1);
foobar(42, 666);
foobar(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);

Variadic templates allow you to have a variable number of arguments that can be of different types.

With that, we can write more generic functions. For instance:

#include <iostream>

struct MyLogger 
{
    static int log_counter;

    template<typename THead>
    static void log(const THead& h)
    {
        std::cout << "[" << (log_counter++) << "] " << h << std::endl;
    }

    template<typename THead, typename ...TTail>
    static void log(const THead& h, const TTail& ...t)
    {
        log(h);
        log(t...);
    }
};

int MyLogger::log_counter = 0;

int main()
{
    MyLogger::log(1,2,3,"FOO");
    MyLogger::log('f', 4.2);
}

This generates the following output:


[0] 1
[1] 2
[2] 3
[3] FOO
[4] f
[5] 4.2

It is safe to assume that a significant motivation for the addition of variadic templates and parameter packs was to allow the implementation of more generic functions, even if it may lead to voluminous generated code (for instance, in the previous example, the MyLogger class has 8 instantiations of the function log5).

Full details are available here: Parameter pack(since C++11) – cppreference.com.

Concepts

Concepts are a C++20 feature that aims to give the developer a way to declare constraints over template parameters. This leads to clearer code (with a higher level of abstraction) and clearer error messages (if any).

For instance, here is a concept declaration:

template<typename T_>
concept Addable = requires(T_ a, T_ b)
{
    a + b;
};

And here are examples of its usage:

template<typename T_>
requires Addable<T_>
T_ foo(T_ a, T_ b);

template<typename T_>
T_ bar(T_ a, T_ b) requires Addable<T_>;

auto l = []<typename T_> requires Addable<T_> (T_ a, T_ b) {};

Before that, template-related errors were barely readable. Concepts were a highly anticipated feature of C++20.

A good overview of concepts can be found on Oleksandr Koval’s blog: All C++20 core language features with examples | Oleksandr Koval’s blog (oleksandrkvl.github.io)

Deduction guides

Template deduction guides are a C++17 feature; they are patterns associated with a templated object that tell the compiler how to translate a set of parameters (and their types) into template arguments.

For instance:

template<typename T_>
struct Foo
{
  T_ t;
};
 
Foo(const char *) -> Foo<std::string>;
 
Foo foo{"A String"};

In this code, the object foo is a Foo<std::string> and not a Foo<const char*>, and thus foo.t is a std::string. Thanks to the deduction guide, the compiler understands that when we use a const char*, we want to use the std::string instantiation of the template.

This is particularly useful for objects such as vectors, which can have this kind of deduction guide (here, associated with the iterator-pair constructor):

template<typename Iterator> vector(Iterator b, Iterator e) -> vector<typename std::iterator_traits<Iterator>::value_type>;

This way, if we call the vector constructor with a pair of iterators, the compiler will be able to deduce the template parameter of the vector.
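Here is a short usage sketch: thanks to such a guide, the element type is deduced from the iterator pair:

#include <list>
#include <vector>

int main()
{
    std::list<int> source = {1, 2, 3};

    // The deduction guide turns the pair of iterators into a std::vector<int>
    std::vector v(source.begin(), source.end());
}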

Substitution Failure Is Not An Error

Substitution Failure Is Not An Error, SFINAE for short, is a rule that applies during template overloaded function resolution.

It basically means that when the substitution of the (deduced or explicitly specified) type for the template parameter fails, the specialization is discarded instead of causing a compile error.

For instance, take the following code:

struct Foo {};
struct Bar { Bar(Foo){} }; // Bar can be created from Foo
 
template <class T>
auto f(T a, T b) -> decltype(a+b); // 1st overload
 
Foo f(Bar, Bar);  // 2nd overload
 
Foo a, b;
Foo x3 = f(a, b);

Instinctively, we could think that this is the first overload that is called on the last line (because the template instantiation using Foo as T would be a better overload than the second one, which requires a conversion).

However, the expression (a+b) is ill-formed with Foo. Instead of generating an error, the overload auto f(Foo a, Foo b) -> decltype(a+b); is discarded. Thus, this is the other overload that is called, with an implicit conversion.

This kind of substitution occurs in all types used in the function type and in all types used in the template parameter declarations. Since C++11, it also occurs in all expressions used in the function type and all expressions used in a template parameter declaration. Since C++20, it also occurs in all expressions used in the explicit specifier.

The full documentation about SFINAE can be found here: SFINAE – cppreference.com.

Other features in C++20

Templates continue to evolve. Here is a small list of the C++20 template features I couldn’t fit in this article:

  • Template parameter list for generic lambdas. Sometimes generic lambdas are too generic. C++20 allows the use of the familiar template function syntax to introduce type names directly (a small sketch follows this list).
  • Class template argument deduction for aggregates. In C++17, to use aggregates with class template argument deduction, we needed explicit deduction guides; that’s unnecessary now.
  • Class types in non-type template parameters. Non-type template parameters now can be of literal class types.
  • Generalized non-type template parameters. Non-type template parameters are generalized to so-called structural types.
  • Class template argument deduction for alias templates. Class template argument deduction works with type aliases now.
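As a small sketch of the first item of this list, a generic lambda can now introduce its template parameters explicitly (the lambda below is illustrative):

#include <vector>

int main()
{
    // C++20: explicit template parameter list on a generic lambda
    auto first_element = []<typename T>(const std::vector<T>& v) { return v.front(); };

    std::vector<int> numbers = {1, 2, 3};
    int first = first_element(numbers); // T is deduced as int
}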

Exceptions and templates: two sides of the same coin

I did not talk about exceptions in this article, but for Stroustrup, exceptions and templates are complementary features:

To my mind, templates and exceptions are two sides of the same coin: templates allow a reduction in the number of run-time errors by extending the range of problems handled by static type checking; exceptions provide a mechanism for dealing with the remaining run-time errors. Templates make exception handling manageable by reducing the need for run-time error handling to the essential cases. Exceptions make general template-based libraries manageable by providing a way for such libraries to report errors.

Bjarne Stroustrup, The Design and Evolution of C++, chapter 15: Templates, Β§1 – Introduction

So, by design, templates and exceptions are closely intermingled, in addition to raising the level of abstraction for error handling.

However, exceptions and templates (especially templates) have evolved greatly since then, so I think this may not be true anymore.

Wrapping up

In my opinion, templates are the biggest fish in the C++ metaphorical pond. We will never talk enough about them, and I suspect they will continue to evolve for decades.

This is so because, in modern C++, one of the key ideas is to write intentions instead of actions. We want higher levels of abstraction and more metaprogramming. It is only normal that templates are at the heart of the modern evolutions of the language.

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Sources

Notes

1. I managed to locate this change in the default behavior of the GCC compiler at release 6 (https://godbolt.org/z/vndGdd7Wh) and of the clang compiler, also at release 6 (https://godbolt.org/z/ssfxvb4cM). Note that these releases are when the two compilers switched their default standard version; the rule itself was introduced with C++11 and is available in earlier releases when the standard is selected explicitly.

2. This is called duck-typing, from the saying if it looks like a duck, swims like a duck and it quacks like a duck then it probably is a duck.

3. I have no concrete example to provide and I’m pretty much paraphrasing Stroustrup in his retrospection, but the idea is that by having user-defined constraints, you close some doors that you didn’t even know existed and that others could have exploited. I’ve done and seen very interesting things using templates, and the fact that the only constraint we have is that the templated code makes sense with their given parameters opens up as many possibilities as we can imagine.

4. There were other tries to imagine a way to specify constraints, but to no avail. More details in section Β§15.4 of Stroustrup: The Design and Evolution of C++.

5. These instantiations are (according to the assembly – Compiler Explorer (godbolt.org)):

  • log(int);
  • log(char[4]);
  • log(char);
  • log(double);
  • log(int,int,int,char[4]);
  • log(int,int,char[4]);
  • log(int,char[4]);
  • log(char,double);

[History of C++] Explanation on why the keyword `class` has no more reason to exist

Author: Chloé Lourseyre
Editor: Peter Fordham

Introduction to the new concept : History of C++

A few months back (at the start of this blog) I was thinking about interesting things you can find in C++, then I realized one thing: the keyword class doesn’t really have a good reason to exist!

This may seem a bit harsh said that way, but I will explain my statement later in the article.

Anyway, studying the reason for the word class to exist led me to look into the history of the language. It was very interesting and I really enjoyed it.

So I decided to write a mini-series of articles about the history of C++ aiming to present concepts that may be outdated today, in C++20, and explain why some strange or debatable things were made that way in the past.

Sources

For this miniseries, I have three main sources:

I have a few things to say about them.

First, Sibling Rivalry: C and C++ and A History of C++: 1979-1991 are both fully available on Stroustrup’s website (just follow the links).

Second, I find it quite sad that I could only find sources from one author. Sure, Bjarne Stroustrup is probably the best individual to talk about his own work, but I would have liked the insight of other authors (if you know any, please tell me in the comments).

Why the keyword class could simply not exist in C++20?

Today, in C++20, we have two very similar keywords that work almost exactly the same: class and struct.

There is one and only one difference between class and struct: by default, the members and the inheritance of a struct are public, whereas those of a class are private.

Here are three reasons why this tiny difference is not worth a different keyword1:

  • In practice, the default access modifier is almost never used, in my experience. Most developers don’t use the default access modifier and prefer to specify it explicitly.
  • In 2021, good code is clear code. Explicitly writing the access specifier is -in that regard- better than leaving it implicit. This may be arguable on solo or small projects, but when you start to develop with many peers, it is better to write a few more characters to be sure that the code is clear for everyone.
  • Having two keywords is more ambiguous than having one. I have very often encountered developers that think there are more differences than that between class and struct, and they have sometimes even told me what they thought these additional differences were. If there was only one keyword, this confusion wouldn’t exist.

I can hear that some people have counter-arguments. The ones I heard a lot are:

  • This is syntactic sugar2.
  • They actually use the implicit access specifier3.
  • There is semantics behind the use of each keyword that go beyond the sole technical meaning4.

All in all, what I’m trying to say is that the language would practically be the same if the keyword class did not exist. So, in the mindset of C++20, we could ask ourselves “what is the purpose of adding a keyword that is neither needed nor useful?”.

I know one thing: that class is one of the oldest C++-specific keywords. Let’s dive into the history of C++ to understand why it exists.

History of the keyword class

Birth

The first official appearance of the keyword class was in Classes: An Abstract Data Type Facility for the C language (Bjarne Stroustrup, 1980), and was actually not talking about C++, but about what we call C with classes.

What is “C with classes”? I think I’ll dedicate a whole article to this subject, so I’ll keep it short here. It is C++’s immediate predecessor, started in 1979. The original goal was to add modularity to the C language, inspired by Simula5 classes. At first, it was a mere tool, but it soon evolved into a whole language. However, since C with Classes was only a mild success while needing continuous support, Bjarne decided to abandon it and create a new language, using his experience of C with Classes, that aimed to be more popular. He called this new language C++.

The choice of the word “class” directly comes from Simula, and the fact that Stroustrup dislikes inventing new terminology.

You’ll find more about C with Classes in the book The Design and Evolution Of C++ (Bjarne Stroustrup), where a whole section is dedicated to it.

So the keyword class was actually born within the predecessor of C++. In terms of design, it’s one of the oldest concepts and even the motivation behind the creation of C with classes.

Original difference between struct and class

In C with Classes, struct and class are quite different.

Structures work just like in C: they are simple data structures, whereas it is within classes that the concept of methods is introduced.

At that time, there was a real difference between structures and classes, thus the distinction6.

Into C++

Two of the greatest features of early C++ were virtual functions and function overloading.

In addition to that, namespace rules were introduced to define how scope names would behave. Among those rules:

  • Names are private unless they are explicitly declared public.
  • A class is a scope (implying that classes nest properly).
  • C structure names don’t nest (even when they are lexically nested).

These rules make structures and classes behave differently in terms of scopes and names.

For instance, this was legal:

struct outer {
    struct inner {
        int i;
    };
};

struct inner a = { 1 };

But replacing struct with class provoked a compilation error.

In later C++, the code above doesn’t compile (it needs to be outer::inner a = {1};).

“Fusion” with the keyword struct

It’s difficult to say when this occurred specifically, because none of the sources I found clearly state “this is when structures and classes became the same concept”, but we can investigate.

According to The C++ Programming Language – Reference Manual (Bjarne Stroustrup, 1984), the first published document specifically about C++:

(listing derived types)

classes containing a sequence of objects of various types, a set of functions for manipulating these objects, and a set of restrictions on the access of these objects and function;

structures which are classes without access restrictions;

Bjarne Stroustrup, The C++ Programming Language – Reference Manual, Β§4.4 Derived types

Moreover, if we take a look at the feedback Stroustrup gives about virtual functions and the object layout model (concepts introduced in 1986):

At this point, the object model becomes real in the sense that an object is more than the simple aggregation of the data members of a class. […] Then why did I not at this point choose to make structs and classes different notions?

My intent was to have a single concept: a single set of layout rules, a single set of lookup rules, a single set of resolution rules, etc. […]

Bjarne Stroustrup, The Design and Evolution of C++, Β§3.5.1 The Object Layout Model

Even though it seems that structures couldn’t hold private members at that time (the private keyword didn’t even exist then!), we can safely say this was the moment where structures and classes were “fused”.

It was at the creation of C++ that the structures from C and the classes from Simula merged together.

But when did they actually become the same?

But technically, structs and classes became what they are today the moment the keyword private was invented. It seems that happened the same time the keyword protected was introduced, for Release 1.2 in 1987.

From then to now

Despite all that, despite the fact that class is technically useless, there is a lot more to it.

The keyword class has acquired semantics.

Indeed, nowadays, writing the keyword class means that you are implementing a class that is not a data bucket, whereas the keyword struct is mainly used for data buckets and similar data structures. Technically, these words do not differ, but because of usage, because these keywords have history, they have acquired more meaning than the sole technical one.

Go see Jonathan Boccara’s very relevant article: The real difference between struct and class – Fluent C++ (fluentcpp.com) for more details. This article is inspired by the Core Guidelines.

The fact that nowadays’ class has lived more than forty years makes it very different from the class from 1980 and the class that it would be if it were introduced in 2020.

But the question that immediately pops up in my head is the following: should we continue to use class this way? Should we stick to the semantics it acquired or should we seek evolution towards a more modern meaning? The answer is simple: it’s up to each of us. We, C++ devs, are the ones that make the language evolve, every day, in every single line of code.

The Core Guidelines tells us how we should use each feature today, but maybe tomorrow someone (you?) will come up with a better, safer, clearer way to code. Understanding what were structs and classes in the past and what they are today is the first step to define what they will be tomorrow.

Wrapping up

The best way to summarize the history of these two keywords is this: “The structures from C and the classes from Simula merged together at the creation of C++.” But we can also say that, thanks to that history, despite representing the same feature, they both carry different meanings.

This article is not a pamphlet against class, and I will not conclude it with a half-educated, half-authoritative argument like I often do7. Instead, I will tell you that I realized how important it is to contextualize articles like the ones I publish on this blog with the version of C++ that is used and the articles and books that are used as examples and inspirations.

I think it’s important to understand history to be able to judge the practices we have today. Do we do something out of habit or is there a real advantage to it? It’s a question that needs to be asked every day, or else we’ll end up writing outdated code in an outdated mindset.

How C++ developers think8 evolves from decade to decade. Each eon, developers have different mindsets, different goals, different issues, a different education and so on… I don’t blame how people coded in the past, but I do blame those who code now like we coded a decade ago, and in the future I hope to see my peers blaming me when I code in “old C++”.

Thanks for reading and see you next week!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

To go further

Here are two thoughts that are a bit off-topic.

Namespaces in C

Something that wasn’t inherited from C into C++ was the struct namespace.

Indeed, in C, the namespace containing the struct names isn’t the same as the global C namespace. In C, struct foo and foo aren’t referring to the same entity. In C++, assuming that foo is a structure, struct foo and foo are the same name.

There is a way, in C, to link these two namespaces, using typedef. To learn more about that, read this article: How to use the typedef struct in C (educative.io)
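Here is a minimal sketch of that idiom (it is valid C, and harmless in C++ where the typedef is simply redundant):

struct foo { int x; };
typedef struct foo foo; /* now plain "foo" also refers to the structure */

foo a;        /* OK thanks to the typedef */
struct foo b; /* always OK */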

The class keyword in templates

Maybe have you already seen this syntax:

template <class C_>
void foo(C_ arg)
{
    // ...
}

What does class mean in this context?

It actually means nothing more than “a type, whichever one”.

This is a bit confusing, because we may think that, as a type, C_ is supposed to be a class. But it does not have to be. Later, the typename keyword was introduced to lift this confusion, but we can still use class if we want to.

Annotations

1. Since struct and class are so similar, I chose to consider class to be the keyword in excess, simply because struct exists in C and class does not, and because it is the evolution of the keyword class that brought them both so close.

2. This argument is not true. By essence, syntactic sugar is supposed to make the code easier to read or write. Since struct/class is just the substitution of one keyword for another, there is no gain in clarity whatsoever. The only reason the code may seem easier to read is because of the semantics these keywords hold, not the syntax itself.

3. Yes, I know, some people actually use the implicit access specifier. I do too. But what you forget when you say such a thing is that most of the software industry doesn’t think like you or write code like you. The statement of the struct/class duplicity comes from an empirical fact. Our own individual practices can never be an argument against that fact.

4. While this is factually true, the reasoning is upside-down. It’s because of their history that they have different semantic meaning. If they were created today, from nothing, they would not have that semantics, only their duplicity. I’ll talk about that near the end of the article.

5. Simula is the name of two simulation programming languages, Simula I and Simula 67, developed in the 1960s at the Norwegian Computing Center in Oslo, and is considered the first object-oriented programming language. Simula is largely unknown amongst the community of developers, but it has greatly influenced other famous languages. Simula-type objects are reimplemented in Object Pascal, Java, C#, and, of course, C++.

6. At that time, you could emulate any structure with a class, but it was still interesting, all the more in the mindset of the time, to make the distinction.

7. I tend to always agree with the C++ Core Guidelines (isocpp.github.io), even if I am always trying to be critical as of our habits and practices. But keep in mind that the guidelines we have today may be different from the one we’ll have tomorrow.

8. I think this statement is actually true for every language, but C++ is the perfect archetype, as it is one of the oldest widely-used languages on the market today, in 2021.

Sources

In order of appearance:

Off-topic: Feedspot’s Top 30 C++ Programming Blogs and Websites

Recently, Feedspot made a top-list of the best C++ programming blogs and websites, and Belay the C++ ended up in 9th position!

You can find the top-list here: Top 30 C++ Programming Blogs and Websites You Must Follow in 2021 (feedspot.com)