You shouldn’t assume accessors are fast

Author: Chloé Lourseyre

“Accessors are fast.” I heard this leitmotiv so many times over the years that I couldn’t not write about it.

Explanation

What’s the use for accessors?

During the researches I made for this article, I was surprised that there actually is a formal definition of accessors¹.

You can find this definition in section § 18.3.5 Accessor functions of the fourth edition of The C++ Programming Language (Bjarne Stroustrup).

Long story short, the definition says that, though is unadvised to abuse that feature, you can design a method to read and edit private attributes of a class to write a clean algorithm (I’m paraphrasing).

So, in short, accessors are used as an interface between the private members of a class and the user of this class.

Like any kind of interface, accessors can thus obfuscate how the actual access of the private members is done, to encapsulate more complex code and ease how the class is used.

1. I’m not discussing the fact that there is a pseudo-consensual definition of accessors. However, since this article aims to deconstruct that definition, I won’t rely on it to build my arguments. You can find this pseudo-consensual definition in the following paragraph, where I describe why it is a bad one.

People don’t use accessors properly…

Every day I see people using accessors improperly. Here are a few examples:

Writing the simplest get/set for every member of a class without thinking. If your class makes every attributes available with no subtlety and nothing else is private, then it’s not a class, it’s a struct².
Writing accessors that are never used. Don’t write something you don’t need, it’s just polluting the codebase.
Realizing that there are not accessors for a given attribute and implementing them without questioning. Sometimes (often), there is a good reason that an attribute doesn’t have accessors.
Declaring a getter that is not const (even though it should be). By default, make everything const unless you need it non-const.
Declaring a getter that is const even though it shouldn’t be. This one is very rare, but I did see several times writing a dangerous const-cast just to make a getter const. You should avoid writing const-casts at all costs (and the examples I recall there wasn’t a good reason for it).

There are a lot of other bad practices that are too verbose to describe in this article (which purpose is not to teach you to use accessors properly).

2. Here, I abusively use the word class to name a data structure with private members and the struct to name a data structure that is nothing more than a data bucket. This is done on purpose to smooth the read of the article.

…leading to false assumptions

Bad practices lead to a bad way of thinking. And a bad way of thinking leads to false assumptions.

The most widespread false assumption over accessors is the following:

All accessors should be fast to execute.

Assuming accessors are fast is a useless limitation

Without talking about the fact that it can be harmful (if you use a function thinking it is fast, the day it is not you may have a bad surprise), by making this assumption you put a limitation on yourself.

This limitation is not effectively useful (at best, it saves you the time to read the documentation of the accessors — which should not be overlooked anyway), it restrains your ability to innovate and do something useful with your accessor.

Demonstration with a useful pattern

I’m going to show you a pattern that I love to use because it is really useful and saves a lot of lines of code.

Consider this: you have to implement a class that, given several input parameters, provides several outputs (two in our example). You need to use this class to fill two arrays with the outputs, but the parameters don’t always change (so you don’t always have to perform the computation). Since the computation is time-consuming, you don’t want to re-compute when you don’t have to³.

One (naïve) way to do this would be that (full code here https://godbolt.org/z/a4x769jra) :

class FooBarComputer
{
public:
    FooBarComputer();

    // These update the m_has_changed attribute if relevant
    void set_alpha(int alpha);
    void set_beta(int beta);
    void set_gamma(int gamma);

    int get_foo() const;
    int get_bar() const;

    bool has_changed() const;
    void reset_changed();

    void compute();

private:
    int  m_alpha, m_beta, m_gamma;
    int m_foo, m_bar;
    bool m_has_changed;
};


//  ...


//==============================
bool FooBarComputer::has_changed() const
{
    return m_has_changed;
}

void FooBarComputer::reset_changed()
{
    m_has_changed = false;
}


//==============================
// Output getters
int FooBarComputer::get_foo() const
{
    return m_foo;
}

int FooBarComputer::get_bar() const
{
    return m_bar;
}


//  ...


//==============================
// main loop
int main()
{
    std::vector<int> foo_vect, bar_vect;
    FooBarComputer fbc;

    for (int i = 0 ; i < LOOP_SIZE ; ++i)
    {
        fbc.set_alpha( generate_alpha() );
        fbc.set_beta( generate_beta() );
        fbc.set_gamma( generate_gamma() );

        if ( fbc.has_changed() )
        {
            fbc.compute();
            fbc.reset_changed();
        }

        foo_vect.push_back( fbc.get_foo() );
        bar_vect.push_back( fbc.get_bar() );
    }
}

However, it is possible to write a better version of this class by tweaking the getter, like so (full code here https://godbolt.org/z/aqznsr6KP):

class FooBarComputer
{
public:
    FooBarComputer();

    // These update the m_has_changed attribute if relevant
    void set_alpha(int alpha);
    void set_beta(int beta);
    void set_gamma(int gamma);

    int get_foo();
    int get_bar();

private:
    void check_change();
    void compute();

    int  m_alpha, m_beta, m_gamma;
    int m_foo, m_bar;
    bool m_has_changed;
};


//  ...


//==============================
void FooBarComputer::check_change()
{
    if (m_has_changed)
    {
        compute();
        m_has_changed = false;
    }
}


//==============================
// Output getters
int FooBarComputer::get_foo()
{
    check_change();
    return m_foo;
}

int FooBarComputer::get_bar()
{
    check_change();
    return m_bar;
}


//  ...


//==============================
// main loop
int main()
{
    std::vector<int> foo_vect, bar_vect;
    FooBarComputer fbc;

    for (int i = 0 ; i < LOOP_SIZE ; ++i)
    {
        fbc.set_alpha( generate_alpha() );
        fbc.set_beta( generate_beta() );
        fbc.set_gamma( generate_gamma() );

        foo_vect.push_back( fbc.get_foo() );
        bar_vect.push_back( fbc.get_bar() );
    }
}

Here are the benefits:

The main code is shorter and clearer.
You don’t need to make the compute() public anymore.
You don’t need has_changed(), instead, you have a check_change() that is private.
Users of your class will be less likely to misuse it. Since the compute() is now lazy, you are sure that no one will call the compute() recklessly.

This is that last point that is the most important. Any user could have made the mistake to omit the if conditional in the first version of the loop, rendering it inefficient.

3. If you want, you can imagine that the parameters change only once a hundred times. Thus, this is important to not call the compute() every time.

Isn’t it possible to do otherwise?

The most recurring argument I hear about this practice is the following: “Well, why don’t you just name your method otherwise? Like computeFoo()?”.

Well, the method does not always perform the computation so it’s wrong to call it that way. Plus, semantically, compute refers to a computation and does not imply that the value is returned. Even if some people do it, I don’t like to use the word compute that way.

“Then call it computeAndGetFoo() and be done with it!”

Except, again, it’s not what it does. It does not always compute, so a more appropriate name would be sometimesComputeAndAlwaysGet(), which is plain ridiculous for such a simple method.

“Then find a proper name!”

I did. It’s getFoo(). It’s what it does, it gets the foo. The fact that the compute() is lazy does not change the fact that it’s a getter. Plus, this is mentioned in the comments that the get may be costly, so read the documentation and everything will be fine.

“Couldn’t you put the check into the compute() instead of the getter?”

I could have, but to what end? Making the compute() public is useless since you need the getters to access the data.

There is a way to do otherwise…

There actually is a way to do otherwise, which is especially useful when you also need a getter than you are sure will never call the compute (this is useful if you need a const getter for instance).

In that case, I advise that you don’t use the same name for both getters, because they don’t do the same thing. Using the same name (with the only difference being the constness) will be confusing.

I personally use the following names:

getActiveFoo()
getPassiveFoo()

I like this way of writing it because you explicitly specify if the getter is potentially costly or not. Moreover, it also implies that the passive getter may give you an outdated value.

Plus, anyone who tries to call getFoo() will not succeed and have to choose between both versions. Forcing the user to think about what they write is always a good thing.

Most importantly: spread the word

Since it is useful to put performance-sensitive code in getters, you will find developers who do it.

The most dangerous behavior is ignoring that fact and letting other people assume that accessors are fast to execute.

You should spread the word around you that accessors are, at the end of the day, like any other method: they can be long to execute, and everyone must read the associated documentation before using them.

In this article, I talked about execution time performance, but this also applies to any other form of performance: memory consumption, disk access, network access, etc.

Thank you for reading and see you next week!