gcc - Do C++ compilers perform compile-time optimizations on lambda closures?
Suppose we have the following (nonsensical) code:
    const int a = 0;
    int c = 0;
    for(int b = 0; b < 10000000; b++) {
        if(a) c++;
        c += 7;
    }
Since the variable 'a' equals zero, the compiler can deduce at compile time that the instruction 'if(a) c++;' will never be executed, and will optimize it away.
My question: does the same happen with lambda closures?
Check out this piece of code:
    const int a = 0;
    function<int()> lambda = [a]() {
        int c = 0;
        for(int b = 0; b < 10000000; b++) {
            if(a) c++;
            c += 7;
        }
        return c;
    };
Will the compiler know that 'a' is 0 and optimize the lambda?
An even more sophisticated example:
    function<int()> generate_lambda(const int a) {
        return [a]() {
            int c = 0;
            for(int b = 0; b < 10000000; b++) {
                if(a) c++;
                c += 7;
            }
            return c;
        };
    }

    function<int()> a_is_zero = generate_lambda(0);
    function<int()> a_is_one = generate_lambda(1);
Will the compiler be smart enough to optimize the first lambda when it knows that 'a' is 0 at generation time?
Does gcc or LLVM have this kind of optimization?
I'm asking because I wonder whether I should make such optimizations manually when I know that certain assumptions are satisfied at lambda generation time, or whether the compiler will do that for me.
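For reference, this is roughly what "making the optimization manually" could look like: branch once on 'a' when the closure is created and return a loop body that no longer tests it. This is only a sketch; the name generate_lambda_specialized is made up for illustration, and whether it is actually faster than the original has to be measured.

```cpp
#include <functional>

// Hypothetical manual specialization (not from the original post): the test
// on 'a' happens once, at generation time, instead of on every iteration.
std::function<int()> generate_lambda_specialized(const int a) {
    if (a == 0) {
        // 'if(a) c++;' can never fire, so it is dropped from the loop.
        return []() {
            int c = 0;
            for (int b = 0; b < 10000000; b++) {
                c += 7;
            }
            return c;
        };
    }
    // 'a' is non-zero, so the increment always happens.
    return []() {
        int c = 0;
        for (int b = 0; b < 10000000; b++) {
            c++;
            c += 7;
        }
        return c;
    };
}
```

Both closures capture nothing, so the only remaining cost is the std::function wrapper itself.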
Looking at the assembly generated by gcc5.2 -O2 shows that the optimization does not happen when using std::function:
    #include <functional>

    int main() {
        const int a = 0;
        std::function<int()> lambda = [a]() {
            int c = 0;
            for(int b = 0; b < 10000000; b++) {
                if(a) c++;
                c += 7;
            }
            return c;
        };
        return lambda();
    }
compiles to some boilerplate and
        movl    (%rdi), %ecx
        movl    $10000000, %edx
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
    .L3:
        cmpl    $1, %ecx
        sbbl    $-1, %eax
        addl    $7, %eax
        subl    $1, %edx
        jne     .L3
        rep; ret
which is the loop you wanted to see optimized away. (live)
If you use the lambda directly (and not std::function), the optimization does happen:
    int main() {
        const int a = 0;
        auto lambda = [a]() {
            int c = 0;
            for(int b = 0; b < 10000000; b++) {
                if(a) c++;
                c += 7;
            }
            return c;
        };
        return lambda();
    }
compiles to
        movl    $70000000, %eax
        ret
i.e. the loop was removed completely. (live)
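The same idea can be applied to the generate_lambda example from the question: if the generator returns the closure itself rather than a std::function, the closure type stays visible to the optimizer. A minimal sketch, assuming C++14 return type deduction is available; the name generate_lambda_auto is made up here, and whether a given compiler really folds the loop still has to be checked in the generated assembly.

```cpp
// Hypothetical variant of generate_lambda (not from the original post):
// 'auto' return type deduction (C++14) hands back the closure type itself,
// so no type erasure through std::function is involved.
inline auto generate_lambda_auto(const int a) {
    return [a]() {
        int c = 0;
        for (int b = 0; b < 10000000; b++) {
            if (a) c++;
            c += 7;
        }
        return c;
    };
}
```

Usage would look like 'auto a_is_zero = generate_lambda_auto(0);'. The price is that the result can no longer be stored in a variable of a fixed, named type without wrapping it again.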
AFAIK, you can expect a lambda to have zero overhead, but std::function is different and comes with a cost (at least at the current state of the optimizers, although people apparently work on this), even if the code "inside the std::function" would have been optimized. (Take that with a grain of salt and try it if in doubt, since this will probably vary between compilers and versions. std::function's overhead can certainly be optimized away.)
As @MarcGlisse correctly pointed out, clang3.6 performs the desired optimization (equivalent to the second case above) even with std::function. (live)
Bonus edit, thanks to @MarcGlisse again: if the function that contains the std::function is not called main, the optimization happening with gcc5.2 is somewhere between gcc+main and clang, i.e. the function gets reduced to return 70000000; plus some extra code. (live)
Bonus edit 2, this time mine: if you use -O3, gcc will (for some reason), as explained in Marco's answer, optimize the std::function to
        cmpl    $1, (%rdi)
        sbbl    %eax, %eax
        andl    $-10000000, %eax
        addl    $80000000, %eax
        ret
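and keep the rest as in the not_main case. The snippet above is a loop-free computation that yields 70000000 when the captured 'a' is 0 and 80000000 otherwise. So I guess at the bottom of the line, one just has to measure when using std::function.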
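To make "measure" a bit more concrete, here is a rough timing sketch using std::chrono::steady_clock. The helper names (time_call, measure_plain_lambda, measure_std_function) are made up for illustration and are not from the post. A single run like this is noisy, so treat the numbers only as a hint and look at the assembly as well.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: calls f once and returns its result together with the
// elapsed wall-clock time in microseconds.
template <typename F>
std::pair<int, long long> time_call(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    const int result = f();
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return {result, static_cast<long long>(elapsed.count())};
}

// The loop from the question, called through the plain closure type.
inline std::pair<int, long long> measure_plain_lambda(const int a) {
    auto lambda = [a]() {
        int c = 0;
        for (int b = 0; b < 10000000; b++) {
            if (a) c++;
            c += 7;
        }
        return c;
    };
    return time_call(lambda);
}

// The same loop, called through a std::function wrapper.
inline std::pair<int, long long> measure_std_function(const int a) {
    std::function<int()> wrapped = [a]() {
        int c = 0;
        for (int b = 0; b < 10000000; b++) {
            if (a) c++;
            c += 7;
        }
        return c;
    };
    return time_call(wrapped);
}
```

If the compiler folds the loop into a constant, the elapsed time will be close to zero, which is itself the answer you are looking for.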