python - Passing objects to Spark


I'm trying to understand the capabilities of Spark, and I can't work out whether the following is possible in Python.

I have objects that are not picklable (they are C++ classes wrapped with SWIG). I have a list of those objects, obj_list = [obj1, obj2, ...], and each object has a member function called .dostuff.

I'd like to parallelize the following loop with Spark, in order to run it on AWS (I don't have a big cluster internally, and I don't think I can use multiprocessing because I can't send the objects over the network):

[x.dostuff() for x in obj_list]

Any pointers would be appreciated.

If the objects aren't picklable, your options are pretty limited. If you can create them on the executor side, though (frequently a useful option for things like database connections), you can parallelize a regular list (e.g. maybe a list of constructor parameters) and then use map if your dostuff function returns (picklable) values you want to use, or foreach if dostuff is called for its side effects (like updating a database or similar).
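A minimal sketch of that construct-on-executor pattern. `HeavyObject`, its constructor parameter, and `dostuff` are hypothetical stand-ins for the SWIG-wrapped class; the key point is that only the picklable constructor parameters are shipped over the network, and the object itself is built inside the function that runs on the executor:

```python
class HeavyObject:
    """Placeholder for the non-picklable SWIG-wrapped C++ class."""
    def __init__(self, param):
        self.param = param

    def dostuff(self):
        # Pretend this does the real C++ work and returns a picklable value.
        return self.param * 2

def dostuff_from_params(param):
    # Runs entirely on the executor: the heavy object is constructed here,
    # so it never has to be pickled and sent over the network.
    obj = HeavyObject(param)
    return obj.dostuff()

# With a live SparkContext `sc`, the driver-side call would look like:
#   results = sc.parallelize(param_list).map(dostuff_from_params).collect()
# (or .foreach(dostuff_from_params) if dostuff is only run for side effects).
# Locally, the same logic is just an ordinary map over the parameter list:
param_list = [1, 2, 3]
results = list(map(dostuff_from_params, param_list))
```

Only `param_list` and the function itself cross the network; each executor pays the construction cost for its own objects, which is why this pattern also suits database connections and similar non-serializable resources.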

