Apache Spark - Exception: could not open socket in PySpark
Whenever I try to execute some simple processing in PySpark, it fails to open a socket.
>>> myrdd = sc.parallelize(range(6), 3)
>>> sc.runJob(myrdd, lambda part: [x * x for x in part])
The above throws the following exception:
port 53554 , proto 6 , sa ('127.0.0.1', 53554)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Volumes/work/bigdata/spark-custom/python/pyspark/context.py", line 917, in runJob
    return list(_load_from_socket(port, mappedRDD._jrdd_deserializer))
  File "/Volumes/work/bigdata/spark-custom/python/pyspark/rdd.py", line 143, in _load_from_socket
    raise Exception("could not open socket")
Exception: could not open socket
>>> 15/08/30 19:03:05 ERROR PythonRDD: Error while sending iterator
java.net.SocketTimeoutException: Accept timed out
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:404)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:613)
I checked through _load_from_socket in rdd.py and realised it gets the port, but either the server is never started or sc.runJob might be the issue:
port = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, partitions)
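For context, here is a simplified sketch of what _load_from_socket appears to be doing in this Spark build, reconstructed from the traceback and the debug line above (the actual source in rdd.py may differ): it resolves localhost via getaddrinfo, tries to connect to the port returned by PythonRDD.runJob, and raises the "could not open socket" exception if no connection succeeds.

import socket

def _load_from_socket_sketch(port, serializer):
    # Try every address family returned for localhost (IPv4 and IPv6)
    sock = None
    for af, socktype, proto, canonname, sa in socket.getaddrinfo(
            "localhost", port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        try:
            sock = socket.socket(af, socktype, proto)
            sock.settimeout(3)
            sock.connect(sa)
        except socket.error:
            sock = None
            continue
        break
    if not sock:
        # This is the path hit in the traceback above
        raise Exception("could not open socket")
    try:
        # Stream deserialized results back from the JVM side
        rf = sock.makefile("rb", 65536)
        for item in serializer.load_stream(rf):
            yield item
    finally:
        sock.close()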
It's not an ideal solution, but I'm now aware of the cause: PySpark was unable to create the JVM socket with JDK 1.8 (64-bit). I set my Java path to JDK 1.7 and it worked.
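In case it helps anyone else, a minimal sketch of pointing the driver at JDK 1.7 without changing the system default. The JDK path and the local[*] master below are assumptions for illustration, and JAVA_HOME has to be set before the JVM gateway is launched (i.e. before the SparkContext is created, so this does not apply inside the bin/pyspark shell where sc already exists):

import os

# Hypothetical JDK 1.7 location; adjust for your machine.
os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home"

from pyspark import SparkContext

sc = SparkContext("local[*]", "jdk7-workaround")   # placeholder master/app name
myrdd = sc.parallelize(range(6), 3)
print(sc.runJob(myrdd, lambda part: [x * x for x in part]))
sc.stop()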