[History of C++] The genesis of casting.

Author: Chloé Lourseyre
Editor: Peter Fordham

C-style casts

First of all, to understand the rationale behind the design of C++ casts, I think it’s important to remind you how C-style casts work, both in C and in C++

In C1

In C, you have two ways to cast:

  • You perform a value cast, an arithmetic conversion from a numeral type into another numeral type. You can have data loss if the targeted type is narrower than the origin type (for instance, when you cast float into long, or if you cast long into int).
  • You perform a pointer cast, which converts a pointer of a type into a pointer of another type. This can work well, as in this example (https://onlinegdb.com/sYFCGeZmH), but it can quickly be error-prone, like in this example (https://onlinegdb.com/pWovM17X4) where types are not exactly the same and in this example (https://onlinegdb.com/HHjNS9NSb) where one structure in bigger than the other2.

Though it is a C feature that has its uses and misuses, in the C++ language this is a behavior we want to avoid.

In C++

The C-style cast does not work in exactly C++ the same way it works in C (even though the final behaviors are similar.

When you perform a C-cast in C++, the compiler tries the following cast operation, in order, until it finds something that it can compile:

  1. const_cast
  2. static_cast
  3. static_cast followed by const_cast
  4. reinterpret_cast
  5. reinterpret_cast followed by const_cast

This is a process that is not appreciated by C++ developer (to say the least) because the cast that is performed is not explicit and does not catch potential errors at compile time.

You can find more information about this behavior on the following page: Explicit type conversion – cppreference.com

Run-Time Type Information

Original idea and controversies

What we call Run-Time Type Information, often shortened to RTTI is a mechanism that allows the type of an object to be determined during program execution.

This is used in polymorphism, where you can manipulate an object through its base class interface (thus, you don’t know at compile-time which derived class you are manipulating).

RTTI for C++ was drafted from the earliest version, but its development and implementation were postponed in hope that it would prove unnecessary.

Some people, at that time, raised their voice against the feature, saying that this would need too much support, was to heavy to implement, too expensive, complicated and confusing, “inherently evil” (against the spirit of the language), seen as the beginning of an avalanche of new features, etc.

However, Bjarne Stroustrup finally decided that it was worth implementing, for three reasons: it is important to some people, it is harmless to those who won’t use it, and without it libraries will develop their own RTTI anyway, leading to inconsistency.

In the end, RTTI was implemented in three parts:

  • The dynamic_cast operator, allowing a pointer to a derived class to be obtained from a pointer to the base class — only if the pointer is effectively of the derived class.
  • The typeid operator, allowing identification of the exact type of an object given an object of the base class.
  • The type_info structure, giving additional run-time info on a given type.

Early in the process (and the main reason he decided to wait until RTTI was needed before implementing it), Stroustrup detected numerous misuses of the feature, and some people even labelled it as a “Dangerous feature”.

However, there is a major difference between a feature that can be misused and a feature that will be misused. That difference resides in education, design, testing, etc. But this has a cost, and the real question is: are the benefits of such a dangerous features worth the effort necessary to keep misuses at a anecdotal level?

The final decision was yes: it is worth the shot. But not all developers agreed at that time.

Syntax

Since casts couldn’t be inherently made safe, Stroustrup wanted to provide a syntax that both signaled the use of an unsafe feature and discouraged its use when there were alternatives.

The C++ crew originally considered either using Checked<T*>(p); for run-time checked conversion and Unchecked<T*>(p); for unchecked conversion, or using (virtual T*)p for dynamic cast only.

But in regard the constraints and the fact that dynamic_casts and “standard” casts (which we know call static_cast) are two whole different operations, the old syntax was abandoned in favor to more verbose unary operators. These are the operators we know today, dynamic_cast<T*>(p) and static_cast<T*>(p) (and, later, the other casts).

typeid() and type_info

The first implementation of RTTI only provided dynamic_cast. However, soon people wanted to know more about the types they were dynamically manipulating leading to the creation of typeid() and type_info.

The typeid() method can be called on any polymorphic object and returns a reference to a type_info that holds all the needed information. The reason that it returns a reference and not a pointer is to avoid pointer comparison and arithmetic on it.

type_infos are uncopiable, polymorphic, comparable, sortable (so it can be used in hashmaps and such) and hold the name of the type.

Uses and Misuses

Now, there are two categories of types: those that have type information at run time and those that don’t. It was decided that only be the polymorphic classes, i.e. the classes who can be manipulated though base classes, will hold RTTI.

At first, people wondered if it would not cause some problems (and frustration) because sometimes it is hard to tell (as a developer) if a function is polymorphic or not. But this is not a big issue, because the compiler is able to tell at compile time if the use of typeid and of type_of is illegal or not.

The main issue that was anticipated was the over-use of RTTI where it isn’t needed. For instance, we can expect such code:

void rotate(const Shape& r)
{
    if (typeid(r) == typeid(Circle)) 
    {
        // do nothing
    }
    else if (typeid(r) == typeid(Triangle)) 
    {
        // rotate triangle
    }
    else if (typeid(r) == typeid(Square)) 
    {
        // rotate square
    }
}

However, this is a broken use of RTTI because it does not correctly handle classes derived from the ones mentioned. A better way to do this would be via virtualization.

Another misuse would be with unnecessary type-checking, like in the following code:

Crate* foobar(Crate* crate, MyContainer* cont)
{
    cont->put(crate);

    // do things...

    Object* obj = cont->get();
    Crate* cr = dynamic_cast<Crate*>(obj)
    if (cr)
        return cr;
    // else, handle error
}

Here, we manually check the type of the object in MyContainer, although it would be better to use a templated version of the container, like so:

Crate* foobar(Crate* crate, MyContainer<Crate>* cont)
{
    cont->put(crate);

    // do things...

    return cont->get();
}

Here, no need to check for errors and, most of all, no use of RTTI.

Theses two misuses of C++ RTTI are most commonly performed by developers following the guidelines of other languages (like C, Pascal, etc.) where such code is accepted, even encouraged. But it doesn’t fit the C++ design.

Abandoned features

Here is a list of features that have been considered for the C++ RTTI, but not adopted in the end:

  • Meta-Objects: it replaces the type_info by a mecanism (the meta-object) that can accept (at run time) requests to perform any operation that can be requested of an object of the language. However it embeds an interpreter for the complete language, which is a threat to its efficiency.
  • The Type-inquiry Operator: An alternative to the dynamic_cast was an operator that can say if an object is of a derived class or not. If so, it would allow us to then cast it (old-style) to the derived class in order to use it. However, dynamic_cast and static_cast can both be applied to pointers and hold different result, so we needed to make the distinction, because old-style-casting pointers would not always give us the result we expect. Plus decorrelating the check and the cast can cause mismatch.
  • Type Relations: Using comparison operators (such as < and <=) was suggested, but it was judged “too cute” (meaning it is a non-mathematical interpretation of the operator, giving meaning to an operation that has no mathematical meaning). Plus, this has the same check/cast decorrelation as the type-inquiry operator.
  • Multi-methods: it is the ability to select a virtual function based on more than one object. Such mechanism may be useful to people who develop binary operators. However, at that time, Stroustrup was not familiar with the concept and decided it would be implemented only if needed later.
  • Unconstrained Methods: this is the mechanism that allows a polymorphic object to call any method that could be called, checking at run time if it can effectively be called, handling errors accordingly. However, with the dynamic_cast we can check this ourselves, which is more efficient and type-safe.
  • Checked Initialization: this is the ability to initialize a derived class object from a polymorphic object, checking at run-time if the type actually match. However there was syntax complication, error-handling uncertainties and it can be done using, again, a dynamic_cast.

C++-style casts

Problems and consequencies

The C-style cast is (quoting B.S.) “a sledgehammer”. It means that when you write (B)expr you say “make expr a B, and whatever happens happens.”. This can become very unfortunate when const or volatile qualifiers are involved.

In addition to that, the syntax is simplistic. Hard to see, hard to parse, and you need an overuse of parentheses when you want to use a derived method in a polymorphic context3.

Thus, it was decided to separate the different ways to cast into separate operators. This way, when you write a cast, you write how you want to cast. Plus, this adds some verbosity to the operation which makes parsing easier and warns the reader that a potentially dangerous operation is happening.

Since there are really bad behaviors (from the C++ point of view) in C-style casts, there are C++ specific cast operators that are meant to not be used (to be isolated from “good” cast operators). These behavior are not deprecated from the language because in some specific contexts they can be useful, but they need to be separated from the others so they can not be used by accident and it is obvious when they are used.

The different casting operators

dynamic_cast

I won’t talk much about dynamic_cast, since this operator is covered in the previous section (about RTTI). Just keep in mind that the keyword dynamic_cast is the one associated with the RTTI solution.

dynamic_cast makes a conversion that is checked dynamically, i.e. at run-time. If you want a static check, i.e. at compile-time, you would prefer static_cast.

static_cast

The static_cast can be described as the inverse operation to the implicit conversion. If A can be implicitly converted to B, then B can be static_casted to A. The operator can also perform any conversion that can be implicitly done.

This alone covers the majority of conversion that does not require dynamic type checking.

static_cast respect constness (making it safer than C-style casts) and is static (any error will be detected at compile time).

Whenever it is relevant, the static_casting to a user-defined type seeks any single-parameter constructor that can match the conversion (if you try to statically cast an Foo into a Bar, the compiler will look for the Bar(Foo) constructor) or any relevant cast operator. See user-defined conversion function – cppreference.com for more info.

Also, you cannot perform a static_cast to or from a pointer to an incomplete type (which can be done using another C++-style cast).

reinterpret_cast

The reinterpret_cast holds the “unsafe” part of the C-style cast. With it, you can cast values from a class to another unrelated class, or from and to a pointer to an incomplete type.

This conversion basically reinterprets the argument its given. You can thus convert from pointer to function and from pointer to member.

This is inherently unsafe and must be performed with great caution. Wherever you write or see a reinterpret_cast, you know you must be extra careful. Using reinterpret_cast is almost as unsafe as C-style casts.

reinterpret_cast can easily lead to undefined behavior if not used following a specific set of rules (which you can find on its documentation page: reinterpret_cast conversion – cppreference.com)

For instance: if you use reinterpret_cast to go from one pointer type to another and then dereference that pointer to access it’s content, that’s likely undefined behavior.

const_cast

The goal to this operator is that the const and volatile qualifiers are never silently casted away.

To perform this operation, the source and destination types must be the same, except the const and volatile qualifiers which can differ.

This is a very dangerous operation and must be use with great caution. Always remember that casting away const from an object originally defined as const is undefined behavior.

bit_cast

Not really historical (it was introduced in C++20) but std::bit_cast was basically made to replace the std::memcpy() manual conversion.

The bit_cast can be undefined if there is no value of the destination type corresponding to the value representation produced (just like with memcpy).

Unlike reinterpret_cast, if you to go from one pointer type to another and then dereference that pointer to access it’s content using bit_cast it’s not undefined behavior if you know for sure that those bits are a valid representation of the target type. The difference here is subtle but it allows the compiler to safely make lots of cases work efficiently and do the right thing in more complex cases without invoking undefined behavior. Typical use case is for serialization.

Wrapping up

Historically, the way C-cast operator was split into four C++ operators follows three simple rules:

  • If you need to check the types dynamically, then use dynamic_cast.
  • If you can check the types statically, then use static_cast.
  • In any other case, it is reinterpret_cast or const_cast that you need, but this is very dangerous.

I’ll add to that that, in any situation, do no perform reinterpret_cast or const_cast unless you know know what you are doing. You should never ever perform these cast only because the other ones did not work.

RTTI in its wholeness is a useful –but totally optional– feature. But it is not a simple to master.

In modern C++, we want to perform checks as much as possible at compile time (for security and performance), so when we are able, we want to use static features instead of dynamic ones.

Of course, you should not force static code where dynamic code would be better, but you should always think of a static solution before a dynamic one.

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Sources

Notes

1. As much as I consider myself an expert in the C++ language, my knowledge of the C language is much more limited. There may be errors in this subsection. If so, please tell me in comments so I can edit the article.

2. You can also cast away the const qualifier through the pointer cast (https://onlinegdb.com/8HIJIeonA) but I don’t think it’s a whole different way to cast.

3. For instance, if px is a pointer to an object of type X (implemented as B) and B is a derived class of X that has a method g. you need to write ((B*)px)->g() to call g from px. A simpler syntax could have been px->B::g().

[History of C++] Templates: from C-style macros to concepts

Author: Chloé Lourseyre
Editor: Peter Fordham

Introduction: Parametrized types

Templates are the C++ feature (or group of features, as the word is used in several contexts) that implement parametrized types.

The notion of a parametrized type is very important in modern programming and consists of using a type as a parameter of a feature, which means you can use that feature with different types, the same way you use a feature with different values.

The most simple example is with std::vector. When you declare a vector as such: std::vector<int> foo;, the type int is parametrized. You could have put another type, like double, void* a user-defined class or even another list instead of int.

It is a way to achieve metaprogramming, a programming technique that aims to apply programs on other program data.

For the rest of the article I will use the word “template” to refer to either the notion of parametrized types or the C++ template implementation (unless I want to explicitly make the distinction).

Before Templates

Before the creation of template, in early C++, people still had to writer C-style macros to emulate templates.

One way of doing this is like that:

foobar.h
void foobar(FOOBAR_TYPE my_val);
foobar.cpp
void foobar(FOOBAR_TYPE my_val)
{
    // do stuff
}
main.cpp
#define FOOBAR_TYPE int
#include "foobar.h"
#include "foobar.cpp" // Only do this in a source file
#undef FOOBAR_TYPE

#define FOOBAR_TYPE double
#include "foobar.h"
#include "foobar.cpp" // Only do this in a source file
#undef FOOBAR_TYPE

int main()
{
    int toto = 42;
    double tata = 84;
    foobar(toto);
    foobar(tata);
}

Don’t do this at home, though! This is not something we ought to do nowadays (especially the #include "foobar.cpp" part). Also note that that code takes advantage of function overloading and therefore does not compile in C.

With our modern eyes, this seems very limited and error-prone. But the interesting thing is that even before C++ templates were implemented in early C++ design teams could use the macro approach to gain experience with them.

Timing

Templates were introduced in Release 3.0 of the language, in October 1991. In The Design and Evolution of C++, Stroustrup reveals that it was a mistake to introduce this feature so late, and in retrospect it would have been better to do so in Release 2.0 (June, 1989) at the expense of less important features, like multiple inheritance:

Also, adding multiple inheritance in Release 2.0 was a mistake. Multiple inheritance belongs in C++, but is far less important than parameterized types — and to some people, parameterized types are again less important than exception handling.

Bjarne Stroustrup, The Design and Evolution of C++, chapter 12: Multiple Inheritance, §1 – Introduction

From today, it is clear that Stroustrup was right and that templates have impacted the scenery of C++ much, much more than multiple inheritance.

This addition came late because it was really time-consuming to explore the design and implementation issues.

Needs and goals

The original need for templates was to express parametrization of container classes. But to do that job, macros were too limited. They fail to obey scope and type rules and don’t interact well with tools (especially debuggers). Before C++ templates, it was very hard to maintain code that used parametrized types, you needed the lowest level of abstraction and you needed to add each parametrized type manually.

The first concerns regarding templates were whether templates would be easy to use and templated objects be as easy to use as hand-coded objects, whether the compilation and linking speed would be significantly impacted and whether they would be portable.

The build process of templates

Syntax

The angle brackets

Designing the syntax of a feature is not an easy job, and requires extensive discussion and refinement.

The choice of the brackets <...> for the template parameter was made because, even though parentheses would have been easier to parse, they are overused in C++ and brackets are (empirically) more pleasant for a human reader.

However, this causes a problem for nested brackets, such as:

List<List<int>> a;

In the code above, in earlier C++, you would get a compilation error. The closing >> are seen by the compiler as operator>> and not two closing brackets.

A lexical trick was added later in the language (in C++141) so that this was not seen as a syntax error anymore.

The template argument

Initially, the template argument would have been placed just after the object name:

class Foo<class T>
{
    // ...
};

However, that caused two major issues:

  • It is a bit too hard to read for parsers and humans. Since the template syntax in nested within the syntax of the class, it is a bit tough to detect.
  • In the case of function templates, the templated type can be used before it is declared. For instance, in this declaraction: T at<class T>(const std::vector<T>& v, size_t index) { return v[index]; }, since T is the return type it is parsed before we even know it is a template parameter.

Both issues are resolved if we put the template argument before the declaration, and this is what was done:

template<class T> class Foo
{
    // ...
};

template<class T> T at(const std::vector<T>& v, size_t index) { return v[index]; }

Constraints of template parameters

In C++, the constraints on template arguments are implicit2.

The dilemma over if the constraints should be explicit in the template argument (like below) or if they should be deduced from the usage occurred. An example of such explicit constraint would be like this:

template < class T {
        int operator==(const T&, const T&); 
        T& operator=(const T&);
        bool operator<(const T&, int);
    };
>
class Foo {
    // ...
};

But this was judged to be way to verbose to be readable and it would require way more templates for the same number of features. Moreover, this kind of over-restricts the class you’re implementing, giving constraints that excludes some implementations that would have been perfectly fine and correct without them3.

However, having explicit constraints was not off the table, but it is just that function type is a too specific way to express this.

This could have been achieved through derivation: by specifying that your templated type must derive from another class, you can have explicit constraint on this type.

template <class T>
class TBase {
    int operator==(const T&, const T&); 
    T& operator=(const T&);
    bool operator<(const T&, int);
};

template <class T : TBase>
class Foo {
    // ...
};

However this generates more issues. The programmers are, because of that, encouraged to express constraints as classes, leading to an overuse of inheritance. There is a loss in expressivity and semantics, because “T must have be comparable to an int” become “T must inherit from TBase”. In addition to that, you could not express constraints on type that can’t have a base class, like ints and doubles.

This is mainly why we did not have explicit constraints on template parameters in C++4 for a long time.

However, the discussion on template constraints was revived in the late 2010s and a new notion made its appearance in C++20: Concepts (c.f. Modern evolutions – Concepts below).

Templated object generation

How templates are compiled is very simple: for every set of template parameters that is used on the templated object, the compiler will generate a implementation of this object using explicitly those parameters.

So, writting this:

template <class T> class Foo { /* ... do things with T ... */ };
template <class T, class U> class Bar { /* ... do things with T  and U... */ };

Foo<int> foo1;
Foo<double> foo2;
Bar<int, int> bar1;
Bar<int, double> bar2;
Bar<double, double> bar3;
Bar< Foo<int>, Foo<long> > bar4;

Is the same thing as writing this:

class Foo_int { /* ... do things with int ... */ };
class Foo_double { /* ... do things with double ... */ };
class Foo_long { /* ... do things with long ... */ };
class Bar_int_int { /* ... do things with int  and int... */ };
class Bar_int_double { /* ... do things with int  and double... */ };
class Bar_double_double { /* ... do things with double  and double... */ };
class Bar_Foo_int_Foo_long { /* ... do things with Foo_int  and Foo_long... */ };

Foo_int foo1;
Foo_double foo2;
Bar_int_int bar1;
Bar_int_double bar2;
Bar_double_double bar3;
Bar_Foo_int_Foo_long bar4;

… only it is more verbose (even more in real code) and less readable.

Class templates

Templates were imagined and designed primarily for classes, mostly to allow for the implementation of standard containers. They were designed to be as simple to use as standard classes and as efficient as macros. These two facts were decided so that low-level arrays could be abandoned when they were not specifically needed (in low-level programming) and that templatized containers would be preferred in the higher levels.

In addition to type argument, template can take non-type argument, like this:

template <class T, int Size>
class MyContainer {
    T m_collection[Size];
    int m_size;
public:
    MyContainer(): m_size(Size) {}
    // ...
};

This was introduced in order to allow for static sizing of containers. Carrying the size in the type information make the implementations more efficient, because you don’t have to track it separately and you don’t loose it through pointer decay as you do with C-style arrays.

class Foo;

int main()
{
    Foo[700] fooTable; // low-level container
    MyContainer<Foo, 700> fooCnt; // high-level container, as efficient as the previous one
}

Function templates

The idea of function templates comes from the need of having templated class methods and from the idea that function templates are in the logical extension of class templates.

Today, the most obvious examples we can provide are the STL algorithms (std::find(), std::find_first_of() , std::merge(), etc.). Though at its creation, the STL algorithms did not exist, it was these kind of functions that inspired function templates (the most symbolic being sort()).

The main issue with function templates was deducing the function template arguments and return type, so we don’t have to explicitly specify them at each function call-site.

In this context, it has been decided that template argument could be deduced (when possible) and specified (when needed). This was extremely useful to specify return values, because they can not always be deduced, such as in this example:

template <class TTo, class TFrom>
TTo convert(TFrom val)
{
    return val;
}

int main()
{
    int val = 4;
    convert(val); // Error: TTo is ambiguous
    convert<double, int>(val) // Correct: TTo is double; TFrom is int
    convert<double>(val) // Correct: TTo is double; TFrom is int; 
}

As you can see on line 12, the trailing template arguments can be omitted as can be trailing function arguments (when they have a default value).

The way templates are generated (see section Templated objects generation section above) works perfectly fine with function overloading. The only subtlety is when there are both non-templated and templated overloads of a function. Then, the non-templated overload is called if there is a perfect match, else it will be the templated overload that will be called if there is a perfect match possible, else we apply ordinary overload resolution.

Template instantiation

At the beginning, explicit template instantiation was not intended. It was because it would create hard issues in some specific circumstances, like if two unrelated parts of a program both request the same instantiation of a templated object, which would have to be done without code replication and without disturbing dynamic linking. This is why implicit templates instantiation was preferred at first.

The first automatic implementation for template instantiation was as follows: when the linker is run, it searches for missing template instantiations. If found, the compiler is invoked again to produce the missing code. This process is repeated until there are no more missing template instantiations.

However, this process had several problems, one of them being very poor compile and link performance.

It is to mitigate this that optional explicit template instantiation is allowed.

The development of the template instantiation process had many more issues, such as the point of instantiation (the “name problem”, i.e. pinpointing which declaration the names used in a template definition refer to), dependencies problems, solving ambiguities, etc. Discussing all of these would require a dedicated article.

Modern evolutions

Templates are a feature that continued to evolve even as we entered the Modern C++ era (beginning with C++11).

Variadic templates

Variadic templates are templates that have at least one parameter pack. In C++, a parameter pack is a way to say that a function or a template has a variable number of arguments.

For example, the following function declaration uses a parameter pack:

void foobar(int args...);

And can be called with any number of argument (greater than one).

foobar(1);
foobar(42, 666);
foobar(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);

Variadic templates allows you to have a variable number of arguments, that can be of different types.

With that, we can write more generic functions. For instance:

#include <iostream>

struct MyLogger 
{
    static int log_counter;

    template<typename THead>
    static void log(const THead& h)
    {
        std::cout << "[" << (log_counter++) << "] " << h << std::endl;
    }

    template<typename THead, typename ...TTail>
    static void log(const THead& h, const TTail& ...t)
    {
        log(h);
        log(t...);
    }
};

int MyLogger::log_counter = 0;

int main()
{
    MyLogger::log(1,2,3,"FOO");
    MyLogger::log('f', 4.2);
}

This generates the following output:


[0] 1
[1] 2
[2] 3
[3] FOO
[4] f
[5] 4.2

It is safe to assume that a significant motivation for the addition variadic templates and parameter packs was to be able to allow the implementation more generic functions, even if it may lead to voluminous generated code (for instance, in the previous example, the MyLogger class has 8 instantiations of the function log5).

Full details are available here: Parameter pack(since C++11) – cppreference.com.

Concepts

Concepts are a C++20 feature that aims to give the developer a way to declare constraints over template parameters. This leads to clearer code (with a higher level of abstraction) and clearer error message (if any).

For instance, here is a concept declaration:

template<typename T_>
concept Addable = requires(T_ a, T_ b)
{
    a + b;
};

And here are example of its usage:

template<typename T_>
requires Addable<T_>
T_ foo(T_ a, T_ b);

template<typename T_>
T_ bar(T_ a, T_ b) requires Addable<T_>;

auto l = []<typename T_> requires Addable<T_> (T_ a, T_ b) {};

Before that, template-related error were barely readable. Concepts were a highly anticipated feature of C++20.

A good overview of concepts can be found on Oleksandr Koval’s blog: All C++20 core language features with examples | Oleksandr Koval’s blog (oleksandrkvl.github.io)

Deduction guides

Template deduction guides are a C++17 feature and are patterns associated with a templated object that tell the compiler how to translate a set of parameter (and their types) into template arguments.

For instance:

template<typename T_>
struct Foo
{
  T_ t;
};
 
Foo(const char *) -> Foo<std::string>;
 
Foo foo{"A String"};

In this code, the object foo is a Foo<std::string> and not a Foo<const char*>, and thus foo.t is a std::string. Thanks to the deduction guide, the compiler understand that3 when we use a const char*, we want to use the std::string instantiation of the template.

This is peculiarly useful for object such as vectors, which can have this kind of constructor:

template<typename Iterator> vector(Iterator b, Iterator e) -> vector<typename std::iterator_traits<Iterator>::value_type>;

This way, if we call the vector constructor with an iterator, the compiler will be able to deduce the templated parameter of the vector.

Substitution Failure Is Not An Error

Substitution Failure Is Not An Error, SFINAE for short, is a rule that applies during template overloaded function resolution.

It basically means that if the (deduced or explicitly specified) type for the template parameter fails, the specialization is discarded instead of causing a compile error.

For instance, take the following code:

struct Foo {};
struct Bar { Bar(Foo){} }; // Bar can be created from Foo
 
template <class T>
auto f(T a, T b) -> decltype(a+b); // 1st overload
 
Foo f(Bar, Bar);  // 2nd overload
 
Foo a, b;
Foo x3 = f(a, b);

Instinctively, we could think that this is the first overload that is called on the highlighted line (because the template instantiation using Foo as T is a better overload that the second one, which requires a conversion).

However, the expression (a+b) is ill-formed with Foo. Instead of generating an error, the overload auto f(Foo a, Foo b) -> decltype(a+b); is discarded. Thus, this is the other overload that is called, with an implicit conversion.

This kind of substitution occurs in all types used in the function type, all types used in the template parameter declarations. Since C++11, it also occurs in all expressions used in the function type and all expressions used in a template parameter declaration. Since C++20, it also occurs in all expressions used in the explicit specifier.

The full documentation about SFINAE can be found here: SFINAE – cppreference.com.

Other features in C++20

Templates continue to evolve. Here are a small list of the C++20 templates feature I couldn’t fit in this article:

  • Template parameter list for generic lambdas. Sometimes generic lambdas are too generic. C++20 allows to use familiar template function syntax to introduce type names directly.
  • Class template argument deduction for aggregates. In C++17 to use aggregates with class template argument deduction we need explicit deduction guides, that’s unnecessary now.
  • Class types in non-type template parameters. Non-type template parameters now can be of literal class types.
  • Generalized non-type template parameters. Non-type template parameters are generalized to so-called structural types.
  • Class template argument deduction for alias templates. Class template argument deduction works with type aliases now.

Exceptions and templates: two sides of the same coin

I did not talk about exceptions in this article, but for Stroustrup, exceptions and templates are complementary features:

To my mind, templates and exceptions are two sides of the same coin: templates allow a reduction in the number of run-time errors by extending the range of problems handled by static type checking; exceptions provide a mechanism for dealing with the remaining run-time errors. Templates make exception handling manageable by reducing the need for run-time error handling to the essential cases. Exceptions make general template-based libraries manageable by providing a way for such libraries to report errors.

Bjarne Stroustrup, The Design and Evolution of C++, chapter 15: Templates, §1 – Introduction

So, by design, templates and exception are closely intermingled, in addition to raising the level of abstraction for error-handling.

However, exception and templates (especially templates) have evolved greatly since then, so I think this may not be true anymore.

Wrapping up

In my opinion, templates are the biggest fish in the C++ metaphorical pond. We will never talk enough about them, and I suspect they will continue to evolve for decades.

This is so because in modern C++ one of the key idea is to write intentions instead of actions. We want higher levels of abstraction and more metaprogramming. It is only normal that template are at the hearts of the modern evolutions of the language.

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

Sources

Notes

1. I managed to locate this change in the GCC compiler at release 6 (https://godbolt.org/z/vndGdd7Wh), suggesting that this indeed occurred with C++14. I managed to see the same thing with the clang compiler at release 6 (https://godbolt.org/z/ssfxvb4cM), proving this right.

2. This is called duck-typing, from the saying if it looks like a duck, swims like a duck and it quacks like a duck then it probably is a duck.

3. I have no concrete example to provide and I’m pretty much paraphrasing Stroustrup in his retrospection, but the idea is that by having user-defined constraints, you close some doors that you didn’t even know existed and that others could have exploited. I’ve done and seen very interesting things using templates, and the fact that the only constraint we have is that the templated code makes sense with their given parameters opens up as many possibilities as we can imagine.

4. There were other tries to imagine a way to specify constraints, but to no avail. More details in section §15.4 of Stroustrup: The Design and Evolution of C++.

5. These instantiations are (and according to the assembly – Compiler Explorer (godbolt.org)):

  • log(int);
  • log(char[4]);
  • log(char);
  • log(double);
  • log(int,int,int,char[4]);
  • log(int,int,char[4]);
  • log(int,char[4]);
  • log(char,double);

[History of C++] Explanation on why the keyword `class` has no more reason to exist

Author: Chloé Lourseyre
Editor: Peter Fordham

Introduction to the new concept : History of C++

A few months back (at the start of this blog) I was thinking about interesting things you can find in C++, then I realized one thing: the keyword class doesn’t really have a good reason to exists!

This may seem a bit harsh said that way, but I will explain my statement later in the article.

Anyway, studying the reason for the word class to exist lead me to look into the history of the language. It was very interesting and I really enjoyed it.

So I decided to write a mini-series of articles about the history of C++ aiming to present concepts that may be outdated today, in C++20, and explain why some strange or debatable things were made that way in the past.

Sources

For this miniseries, I have three main sources:

I have a few things to say about them.

First, Sibling Rivalry: C and C++ and A History of C++: 1979-1991 are both fully available on Stroustrup’s website (just follow the links).

Second, I find it quite sad that I could only find sources from one author. Sure, Bjarne Stroustrup is probably the best individual to talk about his own work, but I would have like the insight of other authors (if you know any, please tell me in the comments).

Why the keyword class could simply not exist in C++20?

Today, in C++20, we have two very similar keywords that work almost exactly the same: class and struct.

There is one and only one difference between class and struct: by default, the members and the inheritance of struct are public, but in those of class are private.

Here are three reason as to why this tiny difference is not worth a different keyword1:

  • In practice, the default access modifier is almost never used, in my experience. Most developers don’t use default access modifier and prefer to specify it.
  • In 2021, a good code is a clear code. Explicitly writing the access specifier is -in that regard- better than leaving it implicit. This may be arguable on solo or small projects, but when you start to develop with many peers, it is better to write a few more characters to be sure that the code is clear for everyone.
  • Having two keywords is more ambiguous than having one. I have very often encountered developers that think there are more differences than that between class and struct, and they have sometimes even told me what they thought these additional differences were. If there was only one keyword, this confusion wouldn’t exist.

I can hear that some people have counter-arguments. The ones I heard a lot are:

  • This is syntactic sugar2.
  • They actually use the implicit access specifier3.
  • There is semantics behind the use of each keyword that go beyond the sole technical meaning4.

All in all, what I’m trying to say is that the language would practically be the same if the keyword class did not exist. So, in the mindset of C++20, we could ask ourselves “what is the purpose of adding a keyword that is neither needed nor useful?”.

I know one thing: that class is one of the oldest C++-specific keywords. Let’s dive into the history of C++ to understand why it exists.

History of the keyword class

Birth

The first official appearance of the keyword class was in Classes: An Abstract Data Type Facility for the C language (Bjarne Stroustrup, 1980), and was actually not talking about C++, but about what we call C with classes.

What is “C with classes”? I think I’ll dedicate a whole article to this subject so I’ll keep it short here. It is C++ immediate predecessor, started in 1979. The original goal was to add modularity to the C language, inspired by Simula5 classes. At first, it was a mere tool, but soon evolved to a whole language. However, since C with Classes was a mild success while needing continuous support, Bjarne decide to abandon it and create a new language, using his experience of C with Classes, and that aimed to be more popular. He called this new language C++.

The choice of the word “class” directly comes from Simula, and the fact that Stroustrup dislikes inventing new terminology.

You’ll find more about C with Classes in the book The Design and Evolution Of C++ (Bjarne Stroustrup), where a whole section is dedicated to it.

So the keyword class was actually born within the predecessor of C++. In terms of design, it’s one of the oldest concepts and even the motivation behind the creation of C with classes.

Original difference between struct and class

In C with Classes, struct and class and quite different.

Structures works just like in C, they are simple data structure, whereas it is within classes that the concept of methods are introduced.

At that time, there was a real difference between structures and classes, thus the distinction6.

Into C++

The two of the greatest features of early C++ were virtual functions and functions overloading.

In addition to that, namespaces rules where introduced to define on how scopes names would behave. Among those rules:

  • Names are private unless they are explicitly declared public.
  • A class is a scope (implying that classes nest properly).
  • C structure names don’t nest (even when they are lexically nested).

These rules make structures and classes behave differently in terms of scopes and names.

For instance, this was legal:

struct outer {
    struct inner {
        int i;
    };
};

struct inner a = { 1 };

But replacing struct with class provoked a compilation error.

In later C++, the code above doesn’t compile (it needs to be outer::inter a = {1};).

“Fusion” with the keyword struct

It’s difficult to say when this occurred specifically, because none of the sources I found clearly state “This is when structures and classes became the same concept, but we can investigate.

According to The C++ Programming Language – Reference Manual (Bjarne Stroustrup, 1984), the first published document specifically about C++:

(listing derived types)

classes containing a sequence of objects of various types, a set of functions for manipulating these objects, and a set of restrictions on the access of these objects and function;

structures which are classes without access restrictions;

Bjarne Stroustrup, The C++ Programming Language – Reference Manual, §4.4 Derived types

Moreover, if we take a look at the feedback Stroustrup gives about virtual functions and the object layout model (concepts introduced in 1986):

At his point, the object model becomes real in the sense that an object is more than the simple aggregation of the data members of a class. […] Then why did I not at this point choose to make structs and classes different notions?

My intent was to have a single concept: a single set of layout rules, a single set of lookup rules, a single set of resolution rules, etc. […]

Bjarne Stroustrup, The Design and Evolution of C++, §3.5.1 The Object Layout Model

Even though it seems that structures couldn’t hold private members at that time (the private keyword didn’t even exist then!), we can safely say this was the moment where structures and classes were “fused”.

It was at the creation of C++ that the structures from C and the classes from Simula merged together.

But when did they actually become the same?

But technically, structs and classes became what they are today the moment the keyword private was invented. It seems that happened the same time the keyword protected was introduced, for Release 1.2 in 1987.

From then to now

Despite all that, despite the fact that class is technically useless, there is a lot more to it.

The keyword class has acquired semantics.

Indeed, nowadays, writing the keyword class means that you are implementing a class that is not a data bucket, whereas the keyword struct is mainly used for data buckets and similar data structures. In their technicity, these words do not differ, but because of usage, because these keywords have history, they acquired more meaning than the sole technical meaning.

Go see Jonathan Boccara’s very relevant article: The real difference between struct and class – Fluent C++ (fluentcpp.com) for more details. This article is inspired from the Core Guidelines.

The fact that nowadays’ class has lived more than forty years makes it very different from the class from 1980 and the class that it would be if it were introduced in 2020.

But the question that immediately pops up in my head is the following: should we continue to use class this way? Should we stick to the semantics it acquired or should we seek evolution towards a more modern meaning? The answer is simple: it’s up to each of us. We, C++ devs, are the ones that make the language evolve, every day, in every single line of code.

The Core Guidelines tells us how we should use each feature today, but maybe tomorrow someone (you?) will come up with a better, safer, clearer way to code. Understanding what were structs and classes in the past and what they are today is the first step to define what they will be tomorrow.

Wrapping up

The best way to summarize the history of these two keyword is this way: “The structures from C and the classes from Simula merged together at the creation of C++.”, but we can also say that thanks to that, despite representing the same feature, the both have different meaning.

This article is not a pamphlet against class and I will not conclude this article with an half-educated half-authoritative argument like I often do7. Instead, I will tell you that I realized how important it is to contextualize articles like the ones I publish on this blog with the version of C++ that is used and the articles and books that are used as examples and inspirations.

I think it s important to understand history to be able to judge the practices we have today. Do we do something out of habit or is there a real advantage to it? It s a question that needs to be asked every day, or else we’ll end up writing outdated code in an outdated mindset.

How C++ developers think8 evolves from decade to decade. Each eon, developers have different mindsets, different goals, different issues, a different education and so on… I don’t blame how people coded in the past, but I do blame those who code now like we coded a decade ago, and in the future I hope to see my peers blaming me when I code in “old C++”.

Thanks for reading and see you next week!

Author: Chloé Lourseyre
Editor: Peter Fordham

Addenda

To go further

Here are two thoughts that are a bit off-topic.

Namespaces in C

Something that wasn’t inherited from C into C++ was the struct namespace.

Indeed, in C, the namespace containing the struct names isn’t the same as the gobal C namespace. In C struct foo and foo aren’t refereeing to the same object. In C++, assuming that foo is a structure, struct foo and foo are the same name.

There is a way, in C, to link these two namespace, using typedef. To learn more about that, read this article: How to use the typedef struct in C (educative.io)

The class keyword in templates

Maybe have you already seen this syntax:

template <class C_>
void foo(C_ arg)
{
    // ...
}

What does class mean in this context?

It actually means nothing more that “a type, whatever which one”.

This is a bit confusing, because we may think that, as type, C_ is supposed to be a class. But it is not. Later, the typename keyword was introduced to lift this confusion, but we can still use class if we want to.

Annotations

1. Since struct and class are so similar, I choose to consider class to be the keyword in excess, simply because struct exists in C and not class, and that it is the process of the keyword class that brought them both so close.

2. This argument is not true. By essence, syntactic sugar is supposed to make the code easier to read or write. Since struct/class is just a substitute of a keyword for another, there is no gain in clarity whatsoever. The only reason the code may seem easier to read is because of the semantics these keyword hold, but not the syntax itself.

3. Yes, I know, some people actually use the implicit. I do too. But what you forget when you say such a thing is that most of the software industry doesn’t think like you nor write code like you. The statement of the struct/class duplicity comes from an empiric fact. Our own individual practices can never be an argument against that fact.

4. While this is factually true, the reasoning is upside-down. It’s because of their history that they have different semantic meaning. If they were created today, from nothing, they would not have that semantics, only their duplicity. I’ll talk about that near the end of the article.

5. Simula is the name of two simulation programming languages, Simula I and Simula 67, developed in the 1960s at the Norwegian Computing Center in Oslo, and is considered the first object-oriented programming language. Simula is very unknown amongst the community of developers, but has greatly influenced other famous languages. Simula-type objects are reimplemented in Object Pascal, Java, C#, and, of course, C++.

6. At that time, you could emulate any structure with a class, but it was still interesting, all the more in the mindset of then, to make the distinction.

7. I tend to always agree with the C++ Core Guidelines (isocpp.github.io), even if I am always trying to be critical as of our habits and practices. But keep in mind that the guidelines we have today may be different from the one we’ll have tomorrow.

8. I think this statement is actually true for every languages, but C++ is the perfect archetype as it is one of the oldest most-used language on the market, today in 2021.

Sources

In order of appearance:

Off-topic: Feedspot’s Top 30 C++ Programming Blogs and Websites

Recently, Feedspot made a top-list of the best C++ programming blogs and website, and Belay the C++ ended up in 9th position!

You can find the top-list here: Top 30 C++ Programming Blogs and Websites You Must Follow in 2021 (feedspot.com)