Don’t use raw loops

Author: Chloé Lourseyre

Intentions

Sean Parent once said “No raw loops“. This was eight years ago.

Today, among the people who don’t have a strong C++ background, almost nobody knows this statement (let alone to know who Sean Parent is). As a result, in 2021, many C++ projects use tons of raw loops and almost no algorithms.

This article intends to teach you, the people who still use raw loops in 2021, why and how to use algorithms instead.

Resources

This subject has been covered by many C++ experts already, so if you feel technical and have some free time, check the following resources:

Sean Parent’s talk that contains the no raw loops statement:
C++ Seasoning | GoingNative 2013 | Channel 9 (msdn.com)

A Jason Turner’s talk about code smells (including the use of raw loops):
C++ Code Smells – Jason Turner – CppCon 2019 – YouTube

Jonathan Boccara’s talk about STL algorithms:
CppCon 2018: Jonathan Boccara “105 STL Algorithms in Less Than an Hour” – YouTube.

As a bonus, here is a map representation of the STL algorithms:
The World Map of C++ STL Algorithms – Fluent C++ (fluentcpp.com).

What are raw loops?

What we call raw loops are you most basic design of loops. It is usually the use of the keywords for, while, or do while accompanied by a block of code.

Raw loops are opposed to algorithms, that are encapsulated loops (or non-loops, how could you know since it’s encapsulated?) serving a specific or generic purpose and that can be called using dedicated functions.

Why you should not use raw loops

It’s a matter of semantics. A raw loop can only express the technical fact that it’s a loop, you write how your algorithm proceeds.

When you call an algorithm, you write what you intend, you write what you want to obtain.

For example, let’s look at this sample of code:

//...

for (size_t i = 0; i < my_vect.size() && predicate(my_vect, i) ; ++i)
{
    //...
}

//...

This tells you that you perform a loop, indexed over a vector and controlled by a custom predicate, but no more. It doesn’t tell you what the loop does, and what result is expected from it. You’ll have to study the body of the loop to know that.

Now let’s look at this algorithm call:

//...

auto it_found = std::find_if(cbegin(my_vect), cend(my_vect), predicate);

//...

Even if you don’t know how find_if() proceeds, you understand it will return an element of my_vect that matches the condition predicate(). You understand what it does, not how it is done.

From there, there are several points to consider:

  • Algorithms raise the level of abstraction and help you understand the intention behind the code.
  • Good semantics leads to good readability, good readability leads to better maintainability, better maintainability leads to fewer regressions.
  • Calling an algorithm is less verbose than re-writing it.
  • Raw loops are prone to several common mistakes, like off-by-one, empty loops, naive complexity, etc.

What to do when there is no existing algorithm?

There are times when you need an algorithm to perform a specific task, but this is so specific that none of the existing algorithms is suitable. What to do in this case?

Combine algorithms into a new one

Often, your specific algorithm can be resolved by doing a combination of existing algorithms, or by implementing a specific version of an existing one. To do so, just implement a function that does the combination for you and give it an explicit name. Then anyone can call it like any algorithm.

For example, you need to verify if, in a vector, all elements that match condition A also match condition B. To do this, you can use the algorithm std::all_of with a custom predicate:

template< typename Iterator, typename PredA, typename PredB >
bool both_or_none( Iterator first, Iterator last, PredA & predA, PredB & predB )
{
    auto pred = [&predA,&predB](const auto& elt)
    {
        return predA(elt) == predB(elt); 
    };
    return all_of(first, last, pred);
}

The body of the algorithm is pretty short: it creates a function that combines both predicates to implement our specific condition, then applies std::all_of(), the algorithm that verifies that the condition is true on every element of the collection.

Write a raw loop inside a function

There are some times when trying to combine existing algorithms is futile and may feel forced and artificial.

What you need to do when this occurs is to write your own raw loop in a dedicated function, that will act as your algorithm. Be sure to give it an explicit name.

For example, you have a collection and you need to get the maximum element that matches a condition. This is what the algorithm max_if() would be if it existed.

However, you can’t pull it out easily by combining existing algorithms. You would first need to get the subset of your collection that matches the condition, then calling std::max() on it. But the only way1 to get that subset is to use a std::copy_if, which copies elements. Copies may be expensive so you don’t want that.

What to do then? Write the raw loop that implements the max_if() yourself, and encapsulate it in a function:

template< typename Iterator, typename Pred >
constexpr Iterator max_if( Iterator first, Iterator last, Pred & pred )
{
    Iterator max_element = last;
    for (auto it = first ; it != last ; ++it)
    {
        if (pred(*it) && (max_element == last || *it > *max_element))
            max_element = it;
    }
    return max_element;
}

Then, the use of max_if will be semantically explicit, matching all the upsides of a true algorithm.

1There actually are other ways, like using find_if then remove in successions, but they are pretty much worse.

STL algorithms examples

There are plenty of algorithms in the STL. I suggest you be curious and explore by yourself: Algorithms library – cppreference.com.

As an appetizer, here are a few very basic and common algorithms that, if you don’t already know them, should learn about.

  • std::find(): Searches for an element equal to a given value.
  • std::find_if(): Searches for an element for which a given predicate returns true.
  • std::for_each(): Applies the given function object to the result of dereferencing every iterator in the range, order.
  • std::transform(): Applies the given function to a range and stores the result in another range, keeping the original elements order.
  • std::all_of(): Checks if the given unary predicate returns true for all elements in the range.
  • std::any_of(): Checks if the given unary predicate returns true for at least one element in the range.
  • std::copy_if(): Copies the elements for which the given predicate returns true.
  • std::remove_if(): Removes the elements for which the given predicate returns true.
  • std::reverse(): Reverses the order of the elements in the range.
  • And many more…

If you want to go further, here is a one-hour talk presenting more than a hundred algorithms: CppCon 2018: Jonathan Boccara “105 STL Algorithms in Less Than an Hour” – YouTube

Wrapping up

Many C++ experts agree that loops are to disappear in the higher abstraction levels, only used to write lower-level algorithms. This statement is not an absolute, but an ideal to keep in mind when you code.

If like many C++ developers, you tend to use raw loops instead of algorithms, you certainly should check the resources provided in this article. As you get familiar with the most basic algorithm and begin to use them in practice, you’ll find them more and more convenient.

Thanks for reading and see you next week!

Author: Chloé Lourseyre

One thought on “Don’t use raw loops

  1. I think you’re presenting a false dichotomy here. The “raw loops” version would be more like this in modern C++:

    for (const auto& x : my_vect)
    {
    if (predicate(x)) {
    // do something with x
    break;
    }
    }

    Even though with this code “You’ll have to study the body of the loop” to find out what result is expected from it, the loop is so trivial it’s immediately obvious even to a beginner. Basically the only mistake you can make here is an accidental copy of “x” if you forget to take a reference to it.

    And for the claim that “Calling an algorithm is less verbose than re-writing it”, well, let’s have a look at the “both_or_none” example. If we assume that we are operating on the whole container and not on range (as is almost always the case), we end up with this:

    bool allSame = true;
    for (const auto& x : my_vect)
    {
    if (predicateA(x) != predicateB(x)) {
    allSame = false;
    break;
    }
    }

    That’s 131 characters vs 296 in the implementation of “both_or_none”, so the extra complexity of following the STL algorithm format more than doubles the amount of code in this case.

    Personally I’d really like to use the standard “algorithms” more but the design mistake of always requiring a iterator range and not just a reference to a container makes them unwieldy to work with. Of course with more complex algorithms such as sorting and even std::reverse() it’s worth the trouble but the examples in this post don’t really make a convincing case.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s