Existing Research on C++ Template Performance

Jeffrey Yasskin

University of Texas at Austin
Plan II

Table of Contents

Runtime speed improvement
Compilation time penalty
Object code bloat and executable bloat
Runtime memory bloat
Promising Links
Glossary
Bibliography

The problem I plan to study is C++ template performance in both time and space. Existing research in this area has focused on the speed benefits of using templates, but have almost entirely ignored any space penalty that might be associated. They do examine increases in compilation time, although most evidence is anecdotal.

Runtime speed improvement

Compilation time penalty

Object code bloat and executable bloat

Runtime memory bloat

Promising Links

These sites contain promising information. It seems people are calling this subject optimization rather than performance.

http://www.tantalon.com/pete/cppopt/main.htm
This site talks about design considerations, as you go guidelines, and optimizations to perform only when the profiler says to. It seems to be weighted towards Microsoft Visual C++, but I don't expect that to skew the results much. He has speed benchmarks but not size ones for almost everything he talks about and covers int, std::complex, std::string, and a custom bitmap. I expect this would be a good place to get ideas for writing benchmarks.
http://cgi.cse.unsw.edu.au/~ideas/index.php?module=articles&func=display&ptid=1&aid=279 or http://www.cs.ucsd.edu/users/wgg/Statements/mburton-ifip-gw-07-2002.pdf
This paper describes difficulties switching between Generic Programming (GP) and Template Metaprogramming (TMP). Then it presents some techniques to make TMP code look more like GP code. In section 7, they say that their TMP solution is 3.5 times faster than a solution using function pointers and 11 times faster than one using virtual functions. On the other hand, they report that compilation times grew from 3.5 minutes to 3 hours. However, they do not report how large the resulting executables were.
Heterogeneous, Nested STL Containers in C++
This has performance numbers for its implementation but doesn't compare it to a fully inherited approach.
Todd Veldhuizen

He studies C++ templates as a partial evaluation system. He began by exploring the speed of templates, and then continued to discuss redesigning C++ compilers to do partial evaluation.

His papers relevant to my work include

[Veldhuizen:1997:ISCOPE]
Describes how C++ performance has caught up to Fortran performance and may even surpass it. He uses Blitz++ to demonstrate template performance.
[Veldhuizen95b]
Describes the technique of expression templates to replace callback functions in functions like sort() and to optimize calculations involving large objects like matrices or vectors. These may have opposite space implications.
http://osl.iu.edu/~tveldhui/papers/2000/tmpw00/index.html
He's implemented all of the spectrum between templates and dynamic typing (like inheritance), so the tradeoffs I'm thinking of measuring could be done at optimization-time rather than as part of designing the program. If his compiler gets widely used, my paper will be mostly irrelevant.

http://www.hlrs.de/people/mueller/papers/sna2000/sna2000.ps.gz
He shows how template specialization can be used to improve the performance of numerical algorithms in the presence of badly optimizing compilers. I should look at this too: how does adding a specialization affect the size of the generated code?
Empty Base Class Optimization
This isn't exactly a link, but this is a place to note it. This can be used to reduce the extra space used by empty (usually policy) members
http://www.math.utah.edu/pub/tex/bib/cppreport.html#Hansen:1997:HRC, January 1997 of C++ Report
http://groups.google.com/groups?selm=slrn9mokna.d4k.elflord%40panix6.panix.com

> I guess one big question: you refer to code bloat. Is this code bloat actually causing you problems?

Yes, it is. A map class on my implementation costs 90kb *per instance*. Instances of the list class are about 45kb *per instance*. That means that a large program could produce an enormous executable.

http://groups.google.com/groups?threadm=MPG.174cf495d64808629896d4%40news.hevanet.com
This is a thread from 2002 about what people mean by "code bloat".

Glossary

Generic Programming

Template Metaprogramming

Bibliography

[Veldhuizen:1997:ISCOPE] Todd L. Veldhuizen and M. E. Jernigan. Will C++ be faster than Fortran?. Springer-Verlag. 1997. Marina del Rey, California. International Scientific Computing in Object-Oriented Parallel Environments. . http://osl.iu.edu/~tveldhui/papers/iscope97/index.html.

[Veldhuizen95b] Todd L. Veldhuizen. “Expression templates”. 26-31. http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html. C++ Report. 7. 5. June 1995. 1040-6042.

[Muller2000] Matthias Müller. Abstraction benchmarks and performance of C++ applications. The Fourth International Conference on Supercomputing in Nuclear Applications. September 4-7, 2000. Tokyo, Japan. . [http://www.hlrs.de/people/mueller/papers/sna2000/sna2000.ps.gz].