Space and Time Implications of Common C++ Programming Idioms

Research Plan

Jeffrey Yasskin

University of Texas at Austin
Plan II

Table of Contents

Tests
Compilers
Optimization Flags
Glossary
Bibliography

Tests

  • Existing research has looked at the performance improvements from using templated code. I plan to look at the tradeoff between this and any code bloat caused by templates. For example, [Veldhuizen95b] contains two different uses of Expression Templates, one of which I expect to speed up the program but bloat it, the other I expect to make it both smaller and faster. I plan to use Blitz++ for a real-world example of heavy template use.
  • I need to write several pairs of programs. A pair that uses templates vs inheritance to do the same thing, a pair that uses templates vs selects, a pair that uses reference vs value argument passing, etc.
  • I'll measure the size of all of the STL containers and algorithms. This question has several dimensions: optimization, instantiation type, first vs. subsequent instantiation

    Check the size of using a functor adapter rather than writing a simple function. Probably use bind() rather than a standard STL adapter since it's more flexible and will be in the next version of the standard.

  • [Muller2000] shows how performance can be improved by adding specializations. I wonder how that affects the size of the executable.
  • I’ll also look at the performance of different argument passing conventions. Conventional wisdom is that one should pass arguments smaller than pointers by value, and larger arguments by reference. However, I suspect this fails to take into account the dereferencing penalty. The size over which an object should be passed by reference should be a function of the number of references to that object.

    Note

    I can probably write this experiment using templates. something like:

    template <int Size>
    struct sized_object {
      char data[Size];1
    };
    template <int References, int Size>
    void referring_function(const2 sized_object<Size> &3obj) {
      int dummy;
      for(unsigned int i=0; i<References; ++i) 4{
        dummy = obj.data[0]; 5
      }
    }
          

    1 An array is stored inside the object, unlike a dynamically allocated pointer. So a sized_object<N> is N bytes long.
    2 I don't expect that the const will affect code generation, but it could allow the compiler to get rid of something, so I'll try including and not including it.
    3 This is what I'm primarily testing. I need one version of this function that passes by reference and one that passes by value.
    4 This loop must either be unrolled for every instance of referring_function or never unrolled. I don't think it matters which, but it must be consistent.
    5

    I have to refer to the argument in such a way that it doesn't get optimized out. Some things to try:

    sum the members of the array
    Refer to each member, using % to distribute accesses

Compilers

I will run my benchmarks on several different compilers so that I avoid compiler-specific effects. I do not plan to benchmark these compilers against each other. (That way I avoid problems with running them all on the same platform.) I'm just checking if different tradeoffs need to be made depending on the compiler used. Some of these compilers are not free, so I may not be able to use them.

Optimization Flags

I don't yet know what optimization flags I'll test against. I expect that I need to play around with it a bit. I expect that the flags with the most effect on all compilers will control inlining. Without inlining, I expect using templates will always make code bigger, although I don't know how much. With inlining, templates could sometimes make the code both smaller and faster.

Glossary

Generic Programming

Template Metaprogramming

Bibliography

[Veldhuizen:1997:ISCOPE] Todd L. Veldhuizen and M. E. Jernigan. Will C++ be faster than Fortran?. Springer-Verlag. 1997. Marina del Rey, California. International Scientific Computing in Object-Oriented Parallel Environments. . http://osl.iu.edu/~tveldhui/papers/iscope97/index.html.

[Veldhuizen95b] Todd L. Veldhuizen. “Expression templates”. 26-31. http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html. C++ Report. 7. 5. June 1995. 1040-6042.

[Muller2000] Matthias Müller. Abstraction benchmarks and performance of C++ applications. The Fourth International Conference on Supercomputing in Nuclear Applications. September 4-7, 2000. Tokyo, Japan. . [http://www.hlrs.de/people/mueller/papers/sna2000/sna2000.ps.gz].