C++

[Introduction] [Literature recommandations] [Table of content]

Introduction

C++ is a programming language for serious coding. It is flexible, fast and has an high abstraction level. Of course, it's not a language you learn within a few weeks or months, but once you get a good understanding of it there's almost nothing you couldn't do with it. So IMHO it's worth every effort you make to know C++ a little bit better and the purpose of this page is to help you with that.

Literature recommandations

Before I'll give some information myself, I would like to point out some books you should read to gain some better understanding of C++ and the Standard Template Library. Unlike most other C++ books I've read, these are fairly up-to-date books and stick to the language standard. You should definitly NOT read books which are written prior to 1998 and/or mix up C and C++ code.

At first a list of books you SHOULD read:

  1. Scott Meyers, 'Effective C++' and Scott Meyers, 'More Effective C++'
    Although you should know a bit about C++ prior to reading these books, you will learn a lot about how C++ works and how to actually think in a C++ way. You'll learn to know the common (and some not so common) pitfalls of the language and some really nifty stuff. In the end you will have a lot of those moments where you think: 'oh, that's because this was not working!'
  2. Nicolai Josuttis, 'The Standard Template Library'
    This book holds what it promises. It is a tutorial, overview and a reference of the STL. I use it on a daily basis because this book explains almost everything in detail and gives lots of examples for frequently used code.

If you want to learn proper C++ DO NOT take a book that contains:

Most of this stuff will only screw up your coding style and confuse you in the long run. Be careful with free online tutorials, too! Many of them are outdated (the language standard has been fixed in 1999) and most compiler vendors implemented the features requested by the standard only recently.

Table of content

Whenever I find myself looking up a tricky bit of code more then twice I'll add it here. Right now you find something about:


Templates

Templates are functions with variable types for arguments and/or return values. 'Variable' means that they are specified at compile time (either explicitly by yourself or deduced by the compiler). Let's give a typical example, a min function.

inline
int min(int lhs, int rhs) {
    return (lhs < rhs ? lhs : rhs);
};

This function works for integers. Now what happens when you use the function for doubles?

double a = 5.4;
double b = 5.2;
double c = min(a,b);

Clearly the doubles will be converted to ints and loose information. Now you would need to write an additional min function that works for doubles (in C times you probably would have called that fmin).

Now having lots of data types you would end up with lots of different functions. This is one reason why MIN and MAX defines were widely used in C programs. However, C++ provides templates, which are much much better than defines (they are type-safe to name only one advantage).

You create a template by replacing fixed data types in the function with a symbol, for instance T. Then you add the template declaration and from now on the special functions for different data types will be generated automatically by the compiler.

template <typename T>
inline 
T min(const T& lhs, const T& rhs) {
    return (lhs < rhs ? lhs : rhs);
};

You still can use the min function as you used it before. Because we have used our template type in the arguments the compiler can deduce the type from the arguments we pass to the function. Now what happens with the template at compile time? If you use ints, then the compiler will silently insert 'int' for all the Ts in the function and compile the result. Later in the same program you might want to use doubles, so the compiler will compile an _additional_ version for doubles. So in fact there will be for each type you've used a compiled min function. Keep that in mind: templates reduce and clearify your source code, but do NOT reduce the compiled code size.

As written above the compiler can deduce the type T by examining the arguments. But what happens if you pass an int as first and an unsigned int as second parameter? The compiler will complain and you have to say explicitly which type he has to use. For templates without arguments you'll have to do that always. In the following call to min all parameters will be treated as integers:

int c=min<int>(a,b);

Templates are declarations and definitions in one, so templates should always reside in a header. In each implementation file (cpp file) which uses the template the compiler will create a specialisation of the template. Now often you will use just few types but use the templated function in many files. Instead of letting the compiler create a specialisation of it in each file, you can create explicit template specialisations for some frequently used types. If you use the template for doubles often, you might create a double specialisation:

double min(const double& lhs, const double& rhs);
This declaration should be placed in the header as well, and the actual definition can be put in a cpp file. Now the compiler first checks, whether an overloaded min function can be found and only if the types are different a new specialisation from the template is created.

NOTE: A templated function might perform better for certain data types. So you can use template specialisation to improve the performance in these cases. The copy-algorithm from the STL is one example for that. In many implementations of the STL copy is optimized for build-in data types so that it can copy chunks of data at a time (instead of copying element by element).


Function objects

STL algorithms often allow to pass a function object. This part will show how to use and write function objects and give you some ideas how you can clearify and improve your code with function objects.

At first some short definitions so that you know what I'm talking about:

  1. A pure function evaluates its arguments and returns a result that only relies on the arguments.
  2. A predicate is a function that returns bool. Predicates that take only one argument are called unary predicate, a predicate with two arguments is a binary predicate.
  3. A function object or functor is a class that has an operator() defined.
  4. A predicate class is a function object whose operator() returns bool.

So, that's it. Nothing more to memorize. Let's get started with an easy example: You have a container (well let's say a std::vector) with some numbers in it and you want to multiply each element with a constant factor.
This cries out for an algorithm - modifications of all elements in a certain range and storing the values in a range can be done by either for_each or transform (if you don't know them consult your favourite STL reference).

Besides the algoritm approach there are some other solutions for the problem (which we will discuss at first). We have std::vector (Note: I will neglect the namespace qualifier std in the examples).

vector<double>  vec;
which we fill with some numbers (however you would do that) and we now want to multiply all elements in the vector with the factor
double factor = 42;
Now how would you do that?

1. the hand-written loop #1

for (size_t i=0; i<vec.size(); ++i)
    vec[i] *= factor;

2. the hand-written loop #2

for (std::vector<double> it=vec.begin(); it!=vec.end(); ++it) 
    *it *= factor;

Both hand written loops require each loop a call to a member function of the container (#1 size, #2 end). Obviously there could be some better (meaning faster) way to do that (especially if we would use a std::list as container where the member function size() takes linear time).

3. using the transform algorithm with a predefined function object while binding the factor as second argument to the function object

transform(vec.begin(), vec.end(), vec.begin(),
    bind_2nd(multiplies<double>(), factor) );

multiplies is a function object provided by the STL (include <functional>). Its operator() requires two arguments, one is a reference to a value and one is the factor. Now for each element in the range the algorithm passes the currently processed value to the functor, but we still need the additional argument 'factor'. And bind_2nd simply causes the algorithm to always pass the factor as second argument. The result of the function object is then stored in the target range (in our case the original range).
Both member functions of the container (begin and end) will only be called once and the whole of bind_2nd(multiplies<double>(), factor) can be reduced at compile time to current_value *= factor (this requires a good compiler of course).

4. using a selfmade function object

First the function object

// function object for multiplications with a constant value
class multiply_with {
    const double factor_;
  public:
    // constructor takes the factor as argument
    multiply_with(double factor) : factor_(factor) {};
    // operator() multiplies the value with the factor
    void operator() (double& val) { val*=factor_; }
};
and then the call
for_each(vec.begin(), vec.end(), 
         multiply_with(factor) );

This is of course overkill (just count the lines) but it illustrates a simple matter: how to write a _very_ simple function object that is most likely to produce faster code than both hand written loops (just try it out). And, I wonder whether you noticed, but if you encounter such a call to for_each in the code somewhere you know what's going on right away. So this solution definitely scores with respect to 'clearity'.

5. A complete function object with an operator() that takes one parameter should be derived from unary_function to make it adaptable and of course it should be a template (to support other types):

template <typename T>
class multiply_with : public unary_function<T, void> {
    const T factor_;
  public:
    multiply_with(T factor) : factor_(factor) {};
    void operator() (T& val) { val*=factor_; }
};

for_each(vec.begin(), vec.end(), multiply_with<double>(factor) );

Let's talk about performance now. I suggest, that you create some simple evaluation programs to see, which version evaluates fastest on your computer. Usually it will be version 2 and - if you have a good compiler - version 4/5 will be equally fast.

This was a short introduction towards function objects, now let's take a look at predicates. For simplicity we create a vector that contains several numbers (integers).

vector<int>  vec;
for (int i=0; i<20; ++i)
    vec.push_back(i);

And now we want to print a filtered list, to keep it simple we want to print even numbers only. Because the STL doesn't have a copy_if algorithm (see next chapter), we will first remove all odd numbers using the remove_if algorithm and print out the remaining elements in the vector.

The remove_if algoritm requires an unary predicate which we can provide in two different ways. The first one is a function that returns true (=predicate). The second is a function object with operator() returning bool (=predicate class).

The function and the application may look like (its a template but obviously it only makes sense for integer data types)

template<typename T>
inline bool is_odd_number(T value) {
    return (value % 2 != 0);
};
and it would be used like
vec.erase(remove_if(vec.begin(), vec.end(), is_odd<int>), vec.end() );

Remember that remove_if moves all elements that should remain to the front of the container and returns the new logical end of the container. The actual erasing is done with the erase member function which basically just truncates the container to the new logical end (this is widely known as the erase-remove idiom).
Apart from that both the function template and the call to remove_if should be clear. And now we can simply copy the content of the vector into a stream:

copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, " ") );

With our selfmade algorithm copy_if (see next chapter) we can simplify this whole procedure and write

copy_if(vec.begin(), vec.end(), 
        ostream_iterator<int>(cout, " "), 
        is_even<int> );
whereas the function is_even is implemented similarly to is_odd.

Now you will have probably heard of 'Negators' already. Why can't we use not1(is_odd<int>) instead of writing another predicate? This is simply because not1() and the other function adaptors of the STL only work with function objects. So let's go for variant 2 and create a predicate class. This class has to be derived from std::unary_function so that we can adapt it with, for instance, the negator not1.

The predicate class could be implemented like

// unary predicate to evaluate the 
// question: 'is this number an odd number'
template<typename T>
struct is_odd : unary_function<T, bool> {
    bool operator() (T number) const { return (number % 2 != 0); }
};
and it would be used like
copy_if(vec.begin(), vec.end(), 
        ostream_iterator<int>(cout, " "), 
        not1(is_odd<int>()) );

Well, at the first sight this looks really hard to understand but if you look at it for a while and work out the different parts of the algorithm call, it will all become clear.
Note that the operator() is declared const. Remember that as rule of thumb: Make the operator() of predicate classes constant! (see Scott Meyers 'Effective STL' for explaination).


STL-like algorithms

To illustrate the process of creating an algorithm I will use a small example: the check whether a series of numerical values are monotonic increasing. You can also check this using the STL algorithm adjacent_find but this example will show an example how to write such an algorithm.

Legacy C-style code would look like:

bool monotonic(double * data, int n) {
    while (--n > 0) {
        if (*data > *(data+1)) return false;
        ++data;
    }
    return true;
}
and we would call that like
double arr[5] = { 1, 2, 2, 3, 4 };
cout << boolalpha << monotonic(arr, 5) << endl;
Now here we have several restrictions: An algorithm can provide flexibility in all issues, so let's introduce them one by one. At first we introduce a small change in the algorithm style to allow the use of iterators. However, in the first step we will still use double pointers:
bool monotonic(double * first, double * last) {
    while (++first < last)
        if (*(first-1) > *first)  return false;
    return true;
}
and we would call that like
cout << boolalpha << monotonic(arr, arr+5) << endl;
Here we see typical algorithm style parameters:
We pass an iterator (a pointer in this case) to the first element in the range and an iterator to the element after the last.

In the next step we can replace the pointers with random access iterators. Only random access iterators provide operator < and operator -

template <typename RAIt>
bool monotonic(RAIt first, RAIt last) {
    while (++first < last)
        if (*(first-1) > *first)  return false;
    return true;
}
If we want to generalise this even more and allow input iterators (from streams etc), we have to replace the expressions first < last and first-1. Input iterators should not be decrimented. We have to introduce a temporary variable.
template <typename InputIt>
bool monotonic(InputIt first, InputIt last) {
    if (first==last) return true;
    iterator_traits<InputIt>::value_type lastValue = *first;
    while (++first != last) {
        if (lastValue > *first) return false;
        lastValue = *first;
    };
    return true;
}
The last read value must be stored in a temporary variable. But first we have to know the value type of the iterator. The template iterator_traits helps us by providing the necessary type defs.

Now we can use regular input iterators, for example:

cout << boolalpha << monotonic(istream_iterator<double>(cin),
                               istream_iterator<double>() ) << endl;

Now this is already a working algorithm, but we would need three others for

Using predicates as template arguments we can incorporate all that in a single algorithm and to comply with STL conventions we should rename this predicate algorithm to monotonic_if

template <typename InputIt, typename Predicate>
bool monotonic_if(InputIt first, InputIt last, Predicate pred) {
    if (first==last) return true;
    iterator_traits<InputIt>::value_type lastValue = *first;
    while (++first != last) {
        if (!pred(lastValue, *first)) return false;
        lastValue = *first;
    };
    return true;
}
so now we can call the algorithm like
cout << boolalpha 
     << monotonic_if(arr, arr+5, std::less_equal<double>()) << endl;
meaning that for all pairs of consecutive numbers the predicate less_equal must return true to have a monotonic range. If we want to test whether the range is strictly monotonic decreasing we need to check
cout << boolalpha 
     << monotonic_if(arr, arr+5, std::greater<double>()) << endl;
Now you should have a fair idea of how to write an algorithm. But before you start writing algorithms all over the place take a good look at what is already available in the STL.



Another algoritm I've used already in the previous chapter is copy_if. It is not provided by the STL. So here's a possible implementation of this algorithm:
template <typename InputIt, typename OutputIt, typename Predicate>
OutputIt copy_if (InputIt first, InputIt last, 
                  OutputIt result, Predicate pred) 
{
    while (first != last) {
        if (pred(*first))  *result++ = *first;
        ++first;
    }
    return result;
}

As simple as that. An application of the algorithm has been given already in the previous chapter.


Back to home