C++ - Accessing struct members and arrays of structs from LLVM IR
If I have a C++ program that declares a struct, say:
    struct S {
      short s;
      union U {
        bool b;
        void *v;
      };
      U u;
    };
and I generate some LLVM IR via the LLVM C++ API to mirror the C++ declaration:
    vector<Type*> members;
    members.push_back( IntegerType::get( ctx, sizeof( short ) * 8 ) );
    // since LLVM doesn't support unions, use an ArrayType that's the same size
    members.push_back( ArrayType::get( IntegerType::get( ctx, 8 ), sizeof( S::U ) ) );

    StructType *const llvm_S = StructType::create( ctx, "S" );
    llvm_S->setBody( members );
How can I ensure that sizeof(S) in the C++ code is the same size as the StructType in the LLVM IR code? The same goes for the offsets of individual members, e.g., u.b.
It's also the case that I have an array of S allocated in C++:
    S *s_array = new S[10];
and I pass s_array to LLVM IR code in which I access individual elements of the array. In order for this to work, sizeof(S) has to be the same in both C++ and LLVM IR so that this:
    %elt = getelementptr %S* %ptr_to_start, i64 1
will access s_array[1] properly.
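To make the stride issue concrete, here is a minimal sketch in plain C++ (no LLVM needed). The helper name `elementByteOffset` is made up for illustration, and the value 10 is simply the IR allocation size reported in the output further down. A `getelementptr` with a single index steps by the allocation size of the pointee type, so the two sides only agree if the two sizes agree.

```cpp
#include <cstddef>

struct S {
    short s;
    union U { bool b; void *v; };
    U u;
};

// Hypothetical helper: byte offset of element `index` when consecutive
// elements are `stride` bytes apart.
constexpr std::size_t elementByteOffset(std::size_t index, std::size_t stride) {
    return index * stride;
}

// Where the C++ compiler places s_array[1].
constexpr std::size_t cppOffset = elementByteOffset(1, sizeof(S));

// Where IR code would look if its struct type only occupied 10 bytes.
constexpr std::size_t irOffset = elementByteOffset(1, 10);

// True only when both sides step through the array in the same way.
constexpr bool stridesAgree = (cppOffset == irOffset);
```

If `stridesAgree` is false, every element after the first is read from the wrong address, and the error grows with the index.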
When I compile and run the program below, it outputs:
    sizeof(S) = 16
    allocSize(S) = 10
The problem is that LLVM is missing 6 bytes of padding between S::s and S::u. The C++ compiler makes the union start on an 8-byte-aligned boundary whereas LLVM does not.
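The padding can be inspected directly on the C++ side. This is only a sketch; the constant names `paddingBeforeU` and `unionAlign` are mine. On a typical 64-bit target I would expect the union to sit at offset 8, giving 6 bytes of padding, but the exact values depend on the ABI.

```cpp
#include <cstddef>

struct S {
    short s;
    union U { bool b; void *v; };
    U u;
};

// Alignment the compiler requires for the union (that of its strictest member).
constexpr std::size_t unionAlign = alignof(S::U);

// Bytes the compiler inserts between the end of S::s and the start of S::u.
constexpr std::size_t paddingBeforeU = offsetof(S, u) - sizeof(short);
```

Printing `offsetof(S, u)`, `paddingBeforeU` and `sizeof(S)` next to LLVM's `getTypeAllocSize` result makes it easy to see where the two layouts diverge.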
I was playing around with DataLayout. For my machine [Mac OS X 10.9.5, g++ Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)], if I print the data layout string, I get:
    e-m:o-i64:64-f80:128-n8:16:32:64-S128
If I force-set the data layout to:
    e-m:o-i64:64-f80:128-n8:16:32:64-S128-a:64
where the addition of a:64 means that an object of aggregate type aligns on a 64-bit boundary, then I get the same size. So why isn't the default data layout correct?
The complete working program is below:
    // LLVM
    #include <llvm/ExecutionEngine/ExecutionEngine.h>
    #include <llvm/ExecutionEngine/MCJIT.h>
    #include <llvm/IR/DerivedTypes.h>
    #include <llvm/IR/LLVMContext.h>
    #include <llvm/IR/Module.h>
    #include <llvm/IR/Type.h>
    #include <llvm/Support/TargetSelect.h>

    // standard
    #include <cstdlib>
    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    using namespace std;
    using namespace llvm;

    struct S {
      short s;
      union U {
        bool b;
        void *v;
      };
      U u;
    };

    ExecutionEngine* createEngine( Module *module ) {
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      unique_ptr<Module> u( module );
      EngineBuilder eb( move( u ) );
      string errStr;
      eb.setErrorStr( &errStr );
      eb.setEngineKind( EngineKind::JIT );
      ExecutionEngine *const exec = eb.create();
      if ( !exec ) {
        cerr << "Could not create ExecutionEngine: " << errStr << endl;
        exit( 1 );
      }
      return exec;
    }

    int main() {
      LLVMContext ctx;

      vector<Type*> members;
      members.push_back( IntegerType::get( ctx, sizeof( short ) * 8 ) );
      members.push_back( ArrayType::get( IntegerType::get( ctx, 8 ), sizeof( S::U ) ) );

      StructType *const llvm_S = StructType::create( ctx, "S" );
      llvm_S->setBody( members );

      Module *const module = new Module( "size_test", ctx );
      ExecutionEngine *const exec = createEngine( module );
      DataLayout const *const layout = exec->getDataLayout();
      module->setDataLayout( layout );

      cout << "sizeof(S) = " << sizeof( S ) << endl;
      cout << "allocSize(S) = " << layout->getTypeAllocSize( llvm_S ) << endl;

      delete exec;
      return 0;
    }
Since the original answer is the correct answer to the "pre-edit" question, I'm writing a new answer to the new question (and my guess that the structs are not the same was pretty good).
The problem isn't the DataLayout as such [but you will need the DataLayout to solve the problem, so you need to update the code to create the module before starting to make the LLVM-IR], but the fact that you are combining a union that has alignment restrictions in a struct with lesser alignment restrictions:
    struct S {
      short s;      // alignment = 2
      union U {
        bool b;     // alignment = 1
        void *v;    // alignment = 4 or 8
      };
      U u;          // alignment = 4 or 8
    };
Now in your LLVM code-gen:
    members.push_back( IntegerType::get( ctx, sizeof( short ) * 8 ) );
    members.push_back( ArrayType::get( IntegerType::get( ctx, 8 ), sizeof( S::U ) ) );
The second element in your struct is effectively char dummy[sizeof(S::U)], which has an alignment requirement of 1. So, of course, LLVM will lay out the struct differently from the C++ compiler, which has stricter alignment criteria for the union.
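You can reproduce the same effect without LLVM at all. The sketch below writes the byte-array layout as an ordinary C++ struct (the name `ByteArrayVersion` is made up for this illustration) and compares it with the real one.

```cpp
#include <cstddef>

struct S {
    short s;
    union U { bool b; void *v; };
    U u;
};

// C++ counterpart of the IR struct { i16, [N x i8] } built in the question:
// the second member is a plain byte array, so it only needs 1-byte alignment.
struct ByteArrayVersion {
    short s;
    unsigned char dummy[sizeof(S::U)];
};

// The byte array starts right after the short, with no padding in between.
constexpr std::size_t byteArrayOffset = offsetof(ByteArrayVersion, dummy);

// The real union has to wait for a suitably aligned offset.
constexpr std::size_t unionOffset = offsetof(S, u);
```

On a typical 64-bit target I would expect `sizeof(ByteArrayVersion)` to be 10 and `sizeof(S)` to be 16, which lines up with the numbers in the question, but treat that as an expectation for common ABIs rather than a guarantee.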
In this particular case, using an i8 * (aka void *) in place of the array of i8 would do the trick [obviously with the relevant bitcast to translate to other types as necessary when accessing the value of b].
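As a rough C++ analogue of that idea (a sketch only; `PointerVersion`, `storeB` and `loadB` are names I made up), a pointer-typed slot gives the struct the same size and member offset as the original, and the `memcpy` calls play the role that the bitcast plus load/store would play in IR.

```cpp
#include <cstddef>
#include <cstring>

struct S {
    short s;
    union U { bool b; void *v; };
    U u;
};

// C++ counterpart of the IR struct { i16, i8* }: a pointer occupies the
// union's slot, so it carries the pointer's alignment with it.
struct PointerVersion {
    short s;
    void *slot;
};

static_assert(sizeof(PointerVersion) == sizeof(S), "sizes should match");
static_assert(offsetof(PointerVersion, slot) == offsetof(S, u), "offsets should match");

// Write a bool into the first byte(s) of the slot, where S::u.b lives.
inline void storeB(PointerVersion &obj, bool value) {
    std::memcpy(&obj.slot, &value, sizeof value);
}

// Read the bool back from the same bytes.
inline bool loadB(const PointerVersion &obj) {
    bool value;
    std::memcpy(&value, &obj.slot, sizeof value);
    return value;
}
```

This only works here because the pointer happens to be both the largest and the most strictly aligned member of the union.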
To fix this in a generic way, you need to produce a struct consisting of the element with the largest alignment requirement in the union, and then pad it with enough char elements to make up the largest size.
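The arithmetic behind that rule can be sketched without any LLVM types. Everything here (`MemberInfo`, `UnionPlan`, `planUnion`) is a hypothetical illustration that works on plain size/alignment pairs; it needs C++14 or later for the loop in a constexpr function.

```cpp
#include <cstddef>

// Size and alignment of one union member, in bytes.
struct MemberInfo {
    std::size_t size;
    std::size_t align;
};

// Result: which member goes first, and how many i8 elements follow it.
struct UnionPlan {
    std::size_t leadIndex;  // member with the strictest alignment
    std::size_t padBytes;   // bytes needed to reach the largest member size
};

template <std::size_t N>
constexpr UnionPlan planUnion(const MemberInfo (&members)[N]) {
    std::size_t maxSize = 0, maxAlign = 0, lead = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (members[i].size > maxSize) maxSize = members[i].size;
        if (members[i].align > maxAlign) {
            maxAlign = members[i].align;
            lead = i;
        }
    }
    const std::size_t leadSize = members[lead].size;
    return UnionPlan{lead, maxSize > leadSize ? maxSize - leadSize : 0};
}

// The union from the question: bool and void*.
constexpr MemberInfo exampleMembers[] = {
    {sizeof(bool), alignof(bool)},
    {sizeof(void *), alignof(void *)},
};
constexpr UnionPlan examplePlan = planUnion(exampleMembers);

// A case where the largest member is not the most aligned one:
// 12 bytes at alignment 4 next to 8 bytes at alignment 8.
constexpr MemberInfo mixedMembers[] = {{12, 4}, {8, 8}};
constexpr UnionPlan mixedPlan = planUnion(mixedMembers);
```

In the mixed case the plan is the 8-byte member followed by 4 padding bytes; the enclosing struct's own alignment then rounds the total size up.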
I'm going to have to go and eat now, but there will be code that solves this properly; it's a bit more complex than I thought.
Here is the main posted above, modified to use a pointer instead of the char array:
    int main() {
      LLVMContext ctx;

      vector<Type*> members;
      members.push_back( IntegerType::get( ctx, sizeof( short ) * 8 ) );
      members.push_back( PointerType::getUnqual( IntegerType::get( ctx, 8 ) ) );

      StructType *const llvm_S = StructType::create( ctx, "S" );
      llvm_S->setBody( members );

      Module *const module = new Module( "size_test", ctx );
      ExecutionEngine *const exec = createEngine( module );
      DataLayout const *const layout = exec->getDataLayout();
      module->setDataLayout( *layout );

      cout << "sizeof(S) = " << sizeof( S ) << endl;
      cout << "allocSize(S) = " << layout->getTypeAllocSize( llvm_S ) << endl;

      delete exec;
      return 0;
    }
There are some tiny changes to cover the fact that setDataLayout has changed between your version of LLVM and the one I'm using.
And here is a generic version that allows any type to be used:
    Type* makeUnionType( Module* module, LLVMContext& ctx, vector<Type*> um ) {
      const DataLayout dl( module );
      size_t maxSize = 0;
      size_t maxAlign = 0;
      Type* maxAlignTy = 0;

      // find the largest size and the member with the largest alignment
      for( auto m : um ) {
        size_t sz = dl.getTypeAllocSize( m );
        size_t al = dl.getPrefTypeAlignment( m );
        if( sz > maxSize )
          maxSize = sz;
        if( al > maxAlign ) {
          maxAlign = al;
          maxAlignTy = m;
        }
      }

      // most-aligned member first, then i8 padding up to the largest size
      vector<Type*> sv = { maxAlignTy };
      size_t mas = dl.getTypeAllocSize( maxAlignTy );
      if( mas < maxSize ) {
        size_t n = maxSize - mas;
        sv.push_back( ArrayType::get( IntegerType::get( ctx, 8 ), n ) );
      }

      StructType* u = StructType::create( ctx, "U" );
      u->setBody( sv );
      return u;
    }

    int main() {
      LLVMContext ctx;

      Module *const module = new Module( "size_test", ctx );
      ExecutionEngine *const exec = createEngine( module );
      DataLayout const *const layout = exec->getDataLayout();
      module->setDataLayout( *layout );

      vector<Type*> members;
      members.push_back( IntegerType::get( ctx, sizeof( short ) * 8 ) );
      vector<Type*> unionMembers = {
        PointerType::getUnqual( IntegerType::get( ctx, 8 ) ),
        IntegerType::get( ctx, 1 )
      };
      members.push_back( makeUnionType( module, ctx, unionMembers ) );

      StructType *const llvm_S = StructType::create( ctx, "S" );
      llvm_S->setBody( members );

      cout << "sizeof(S) = " << sizeof( S ) << endl;
      cout << "allocSize(S) = " << layout->getTypeAllocSize( llvm_S ) << endl;

      delete exec;
      return 0;
    }
Note that in both cases, you need a bitcast operation to convert the address of b, and in the second case, you need a bitcast to convert the struct into a void *, but assuming you want generic union support, that's how you'd have to do it anyway.
A complete piece of code to generate a union type can be found here, for my Pascal compiler's variant [which is Pascal's way to make a union]:
https://github.com/leporacanthicus/lacsap/blob/master/types.cpp#l525
and the code generation including the bitcast:
https://github.com/leporacanthicus/lacsap/blob/master/expr.cpp#l520