Apache Spark - Issue running Pi example on standalone cluster
First, let me say that I am a newbie to Spark, SparkR, Hadoop, etc. I am a .NET developer who has been tasked with integrating our .NET applications with Apache Spark and Apache SparkR. I am able to run the samples locally, but when pointing to my Linux cluster (master: spark01, slaves: spark02-spark05), I am unable to run the Pi sample. When I use the following script, I get the following errors.
My client mode command:
c:\mydata\apache_spark\sparkclr-master\build\runtime>scripts\sparkclr-submit.cmd --proxy-user miadmin --total-executor-cores 2 --master spark://spark01:7077 --exe pi.exe c:\mydata\apache_spark\sparkclr-master\examples\pi\bin\debug spark.local.dir %temp%
Errors:
"c:\mydata\apache_spark\sparkclr-master\build\tools\spark-1.6.0-bin-hadoop2.6\conf\spark-env.cmd" sparkclr_jar=spark-clr_2.10-1.6.0-snapshot.jar zip driver directory c:\mydata\apache_spark\sparkclr-master\examples\pi\bin\debug c:\users\shunley\appdata\local\temp\debug_1453925538545.zip [sparkclr-submit.cmd] command run --proxy-user miadmin --total-executor-cores 2 --master spark://spark01:7077 --name pi --files c:\users\shunley\appdata\local\temp\debug_1453925538545.zip --class org.apache.spark.deploy.csharp.csharprunner c:\mydata\apache_spark\sparkclr-master\build\runtime\lib\spark-clr_2.10-1.6.0-snapshot.jar c:\mydata\apache_spark\sparkclr-master\examples\pi\bin\debug c:\mydata\apache_spark\sparkclr-master\examples\pi\bin\debug\pi.exe spark.local.dir c:\users\shunley\appdata\local\temp [csharprunner.main] starting csharpbackend! [csharprunner.main] port number used csharpbackend 4485 [csharprunner.main] adding key=spark.jars , value=file:/c:/mydata/apache_spark/sparkclr-master/build/runtime/lib/spark-clr_2.10-1.6.0-snapshot.jar environment [csharprunner.main] adding key=spark.app.name , value=pi environment [csharprunner.main] adding key=spark.cores.max , value=2 environment [csharprunner.main] adding key=spark.files , value=file:/c:/users/shunley/appdata/local/temp/debug_1453925538545.zip environment [csharprunner.main] adding key=spark.submit.deploymode , value=client environment [csharprunner.main] adding key=spark.master , value=spark://spark01:7077 environment [2016-01-27t20:12:19.7218665z] [shunley10] [info] [configurationservice] configurationservice runmode cluster [2016-01-27t20:12:19.7228674z] [shunley10] [info] [sparkclrconfiguration] csharpbackend read environment variable csharpbackend_port [2016-01-27t20:12:19.7228674z] [shunley10] [info] [sparkclripcproxy] csharpbackend port number used in jvmbridge 4485 [2016-01-27 15:12:19,866] [1] [debug] [microsoft.spark.csharp.examples.piexample] - spark.local.dir set c:\users\shunley\appdata\local\temp\ 
[2016-01-27 15:12:21,467] [1] [info ] [microsoft.spark.csharp.examples.piexample] - ----- running pi example ----- collectandserve on object of type nullobject failed null java.lang.reflect.invocationtargetexception @ sun.reflect.nativemethodaccessorimpl.invoke0(native method) @ sun.reflect.nativemethodaccessorimpl.invoke(nativemethodaccessorimpl.java:62) @ sun.reflect.delegatingmethodaccessorimpl.invoke(delegatingmethodaccessorimpl.java:43) @ java.lang.reflect.method.invoke(method.java:498) @ org.apache.spark.api.csharp.csharpbackendhandler.handlemethodcall(csharpbackendhandler.scala:153) @ org.apache.spark.api.csharp.csharpbackendhandler.channelread0(csharpbackendhandler.scala:94) @ org.apache.spark.api.csharp.csharpbackendhandler.channelread0(csharpbackendhandler.scala:27) @ io.netty.channel.simplechannelinboundhandler.channelread(simplechannelinboundhandler.java:105) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.handler.codec.messagetomessagedecoder.channelread(messagetomessagedecoder.java:103) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.handler.codec.bytetomessagedecoder.channelread(bytetomessagedecoder.java:244) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.channel.defaultchannelpipeline.firechannelread(defaultchannelpipeline.java:846) @ io.netty.channel.nio.abstractniobytechannel$niobyteunsafe.read(abstractniobytechannel.java:131) @ io.netty.channel.nio.nioeventloop.processselectedkey(nioeventloop.java:511) @ 
io.netty.channel.nio.nioeventloop.processselectedkeysoptimized(nioeventloop.java:468) @ io.netty.channel.nio.nioeventloop.processselectedkeys(nioeventloop.java:382) @ io.netty.channel.nio.nioeventloop.run(nioeventloop.java:354) @ io.netty.util.concurrent.singlethreadeventexecutor$2.run(singlethreadeventexecutor.java:111) @ io.netty.util.concurrent.defaultthreadfactory$defaultrunnabledecorator.run(defaultthreadfactory.java:137) @ java.lang.thread.run(thread.java:745) caused by: org.apache.spark.sparkexception: job aborted due stage failure: task 0 in stage 0.0 failed 4 times, recent failure: lost task 0.3 in stage 0.0 (tid 9, spark02): java.io.ioexception: cannot run program "csharpworker.exe": error=2, no such file or directory @ java.lang.processbuilder.start(processbuilder.java:1047) @ org.apache.spark.api.python.pythonworkerfactory.startdaemon(pythonworkerfactory.scala:161) @ org.apache.spark.api.python.pythonworkerfactory.createthroughdaemon(pythonworkerfactory.scala:87) @ org.apache.spark.api.python.pythonworkerfactory.create(pythonworkerfactory.scala:63) @ org.apache.spark.sparkenv.createpythonworker(sparkenv.scala:134) @ org.apache.spark.api.python.pythonrunner.compute(pythonrdd.scala:101) @ org.apache.spark.api.python.pythonrdd.compute(pythonrdd.scala:70) @ org.apache.spark.api.csharp.csharprdd.compute(csharprdd.scala:62) @ org.apache.spark.rdd.rdd.computeorreadcheckpoint(rdd.scala:306) @ org.apache.spark.rdd.rdd.iterator(rdd.scala:270) @ org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:66) @ org.apache.spark.scheduler.task.run(task.scala:89) @ org.apache.spark.executor.executor$taskrunner.run(executor.scala:213) @ java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145) @ java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615) @ java.lang.thread.run(thread.java:745) caused by: java.io.ioexception: error=2, no such file or directory @ java.lang.unixprocess.forkandexec(native method) @ 
java.lang.unixprocess.(unixprocess.java:187) @ java.lang.processimpl.start(processimpl.java:130) @ java.lang.processbuilder.start(processbuilder.java:1028) ... 15 more driver stacktrace: @ org.apache.spark.scheduler.dagscheduler.org$apache$spark$scheduler$dagscheduler$$failjobandindependentstages(dagscheduler.scala:1431) @ org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1419) @ org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1418) @ scala.collection.mutable.resizablearray$class.foreach(resizablearray.scala:59) @ scala.collection.mutable.arraybuffer.foreach(arraybuffer.scala:47) @ org.apache.spark.scheduler.dagscheduler.abortstage(dagscheduler.scala:1418) @ org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:799) @ org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:799) @ scala.option.foreach(option.scala:236) @ org.apache.spark.scheduler.dagscheduler.handletasksetfailed(dagscheduler.scala:799) @ org.apache.spark.scheduler.dagschedulereventprocessloop.doonreceive(dagscheduler.scala:1640) @ org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1599) @ org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1588) @ org.apache.spark.util.eventloop$$anon$1.run(eventloop.scala:48) @ org.apache.spark.scheduler.dagscheduler.runjob(dagscheduler.scala:620) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1832) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1845) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1858) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1929) @ org.apache.spark.rdd.rdd$$anonfun$collect$1.apply(rdd.scala:927) @ org.apache.spark.rdd.rddoperationscope$.withscope(rddoperationscope.scala:150) @ org.apache.spark.rdd.rddoperationscope$.withscope(rddoperationscope.scala:111) @ 
org.apache.spark.rdd.rdd.withscope(rdd.scala:316) @ org.apache.spark.rdd.rdd.collect(rdd.scala:926) @ org.apache.spark.api.python.pythonrdd$.collectandserve(pythonrdd.scala:405) @ org.apache.spark.api.python.pythonrdd.collectandserve(pythonrdd.scala) ... 25 more caused by: java.io.ioexception: cannot run program "csharpworker.exe": error=2, no such file or directory @ java.lang.processbuilder.start(processbuilder.java:1047) @ org.apache.spark.api.python.pythonworkerfactory.startdaemon(pythonworkerfactory.scala:161) @ org.apache.spark.api.python.pythonworkerfactory.createthroughdaemon(pythonworkerfactory.scala:87) @ org.apache.spark.api.python.pythonworkerfactory.create(pythonworkerfactory.scala:63) @ org.apache.spark.sparkenv.createpythonworker(sparkenv.scala:134) @ org.apache.spark.api.python.pythonrunner.compute(pythonrdd.scala:101) @ org.apache.spark.api.python.pythonrdd.compute(pythonrdd.scala:70) @ org.apache.spark.api.csharp.csharprdd.compute(csharprdd.scala:62) @ org.apache.spark.rdd.rdd.computeorreadcheckpoint(rdd.scala:306) @ org.apache.spark.rdd.rdd.iterator(rdd.scala:270) @ org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:66) @ org.apache.spark.scheduler.task.run(task.scala:89) @ org.apache.spark.executor.executor$taskrunner.run(executor.scala:213) @ java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145) @ java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615) ... 1 more caused by: java.io.ioexception: error=2, no such file or directory @ java.lang.unixprocess.forkandexec(native method) @ java.lang.unixprocess.(unixprocess.java:187) @ java.lang.processimpl.start(processimpl.java:130) @ java.lang.processbuilder.start(processbuilder.java:1028) ... 
15 more () methods: public static int org.apache.spark.api.python.pythonrdd.collectandserve(org.apache.spark.rdd.rdd) args: argtype: org.apache.spark.api.csharp.csharprdd, argvalue: csharprdd[1] @ rdd @ pythonrdd.scala:43 [2016-01-27t20:12:28.0995397z] [shunley10] [error] [jvmbridge] jvm method execution failed: static method collectandserve failed class org.apache.spark.api.python.pythonrdd when called 1 parameters ([index=1, type=jvmobjectreference, value=12], ) [2016-01-27t20:12:28.0995397z] [shunley10] [error] [jvmbridge] org.apache.spark.sparkexception: job aborted due stage failure: task 0 in stage 0.0 failed 4 times, recent failure: lost task 0.3 in stage 0.0 (tid 9, spark02): java.io.ioexception: cannot run program "csharpworker.exe": error=2, no such file or directory @ java.lang.processbuilder.start(processbuilder.java:1047) @ org.apache.spark.api.python.pythonworkerfactory.startdaemon(pythonworkerfactory.scala:161) @ org.apache.spark.api.python.pythonworkerfactory.createthroughdaemon(pythonworkerfactory.scala:87) @ org.apache.spark.api.python.pythonworkerfactory.create(pythonworkerfactory.scala:63) @ org.apache.spark.sparkenv.createpythonworker(sparkenv.scala:134) @ org.apache.spark.api.python.pythonrunner.compute(pythonrdd.scala:101) @ org.apache.spark.api.python.pythonrdd.compute(pythonrdd.scala:70) @ org.apache.spark.api.csharp.csharprdd.compute(csharprdd.scala:62) @ org.apache.spark.rdd.rdd.computeorreadcheckpoint(rdd.scala:306) @ org.apache.spark.rdd.rdd.iterator(rdd.scala:270) @ org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:66) @ org.apache.spark.scheduler.task.run(task.scala:89) @ org.apache.spark.executor.executor$taskrunner.run(executor.scala:213) @ java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145) @ java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615) @ java.lang.thread.run(thread.java:745) caused by: java.io.ioexception: error=2, no such file or directory @ 
java.lang.unixprocess.forkandexec(native method) @ java.lang.unixprocess.(unixprocess.java:187) @ java.lang.processimpl.start(processimpl.java:130) @ java.lang.processbuilder.start(processbuilder.java:1028) ... 15 more driver stacktrace: @ org.apache.spark.scheduler.dagscheduler.org$apache$spark$scheduler$dagscheduler$$failjobandindependentstages(dagscheduler.scala:1431) @ org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1419) @ org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1418) @ scala.collection.mutable.resizablearray$class.foreach(resizablearray.scala:59) @ scala.collection.mutable.arraybuffer.foreach(arraybuffer.scala:47) @ org.apache.spark.scheduler.dagscheduler.abortstage(dagscheduler.scala:1418) @ org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:799) @ org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:799) @ scala.option.foreach(option.scala:236) @ org.apache.spark.scheduler.dagscheduler.handletasksetfailed(dagscheduler.scala:799) @ org.apache.spark.scheduler.dagschedulereventprocessloop.doonreceive(dagscheduler.scala:1640) @ org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1599) @ org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1588) @ org.apache.spark.util.eventloop$$anon$1.run(eventloop.scala:48) @ org.apache.spark.scheduler.dagscheduler.runjob(dagscheduler.scala:620) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1832) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1845) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1858) @ org.apache.spark.sparkcontext.runjob(sparkcontext.scala:1929) @ org.apache.spark.rdd.rdd$$anonfun$collect$1.apply(rdd.scala:927) @ org.apache.spark.rdd.rddoperationscope$.withscope(rddoperationscope.scala:150) @ 
org.apache.spark.rdd.rddoperationscope$.withscope(rddoperationscope.scala:111) @ org.apache.spark.rdd.rdd.withscope(rdd.scala:316) @ org.apache.spark.rdd.rdd.collect(rdd.scala:926) @ org.apache.spark.api.python.pythonrdd$.collectandserve(pythonrdd.scala:405) @ org.apache.spark.api.python.pythonrdd.collectandserve(pythonrdd.scala) @ sun.reflect.nativemethodaccessorimpl.invoke0(native method) @ sun.reflect.nativemethodaccessorimpl.invoke(nativemethodaccessorimpl.java:62) @ sun.reflect.delegatingmethodaccessorimpl.invoke(delegatingmethodaccessorimpl.java:43) @ java.lang.reflect.method.invoke(method.java:498) @ org.apache.spark.api.csharp.csharpbackendhandler.handlemethodcall(csharpbackendhandler.scala:153) @ org.apache.spark.api.csharp.csharpbackendhandler.channelread0(csharpbackendhandler.scala:94) @ org.apache.spark.api.csharp.csharpbackendhandler.channelread0(csharpbackendhandler.scala:27) @ io.netty.channel.simplechannelinboundhandler.channelread(simplechannelinboundhandler.java:105) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.handler.codec.messagetomessagedecoder.channelread(messagetomessagedecoder.java:103) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.handler.codec.bytetomessagedecoder.channelread(bytetomessagedecoder.java:244) @ io.netty.channel.abstractchannelhandlercontext.invokechannelread(abstractchannelhandlercontext.java:308) @ io.netty.channel.abstractchannelhandlercontext.firechannelread(abstractchannelhandlercontext.java:294) @ io.netty.channel.defaultchannelpipeline.firechannelread(defaultchannelpipeline.java:846) @ 
io.netty.channel.nio.abstractniobytechannel$niobyteunsafe.read(abstractniobytechannel.java:131) @ io.netty.channel.nio.nioeventloop.processselectedkey(nioeventloop.java:511) @ io.netty.channel.nio.nioeventloop.processselectedkeysoptimized(nioeventloop.java:468) @ io.netty.channel.nio.nioeventloop.processselectedkeys(nioeventloop.java:382) @ io.netty.channel.nio.nioeventloop.run(nioeventloop.java:354) @ io.netty.util.concurrent.singlethreadeventexecutor$2.run(singlethreadeventexecutor.java:111) @ io.netty.util.concurrent.defaultthreadfactory$defaultrunnabledecorator.run(defaultthreadfactory.java:137) @ java.lang.thread.run(thread.java:745) caused by: java.io.ioexception: cannot run program "csharpworker.exe": error=2, no such file or directory @ java.lang.processbuilder.start(processbuilder.java:1047) @ org.apache.spark.api.python.pythonworkerfactory.startdaemon(pythonworkerfactory.scala:161) @ org.apache.spark.api.python.pythonworkerfactory.createthroughdaemon(pythonworkerfactory.scala:87) @ org.apache.spark.api.python.pythonworkerfactory.create(pythonworkerfactory.scala:63) @ org.apache.spark.sparkenv.createpythonworker(sparkenv.scala:134) @ org.apache.spark.api.python.pythonrunner.compute(pythonrdd.scala:101) @ org.apache.spark.api.python.pythonrdd.compute(pythonrdd.scala:70) @ org.apache.spark.api.csharp.csharprdd.compute(csharprdd.scala:62) @ org.apache.spark.rdd.rdd.computeorreadcheckpoint(rdd.scala:306) @ org.apache.spark.rdd.rdd.iterator(rdd.scala:270) @ org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:66) @ org.apache.spark.scheduler.task.run(task.scala:89) @ org.apache.spark.executor.executor$taskrunner.run(executor.scala:213) @ java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145) @ java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615) ... 
1 more caused by: java.io.ioexception: error=2, no such file or directory @ java.lang.unixprocess.forkandexec(native method) @ java.lang.unixprocess.(unixprocess.java:187) @ java.lang.processimpl.start(processimpl.java:130) @ java.lang.processbuilder.start(processbuilder.java:1028) ... 15 more [2016-01-27t20:12:28.1296129z] [shunley10] [exception] [jvmbridge] jvm method execution failed: static method collectandserve failed class org.apache.spark.api.python.pythonrdd when called 1 parameters ([index=1, type=jvmobjectreference, value=12], ) @ microsoft.spark.csharp.interop.ipc.jvmbridge.calljavamethod(boolean isstatic, object classnameorjvmobjectreference, string methodname, object[] parameters) [2016-01-27 15:12:28,130] [1] [info ] [microsoft.spark.csharp.examples.piexample] - ----- error running pi example (duration=00:00:06.6599877) ----- system.exception: jvm method execution failed: static method collectandserve failed class org.apache.spark.api.python.pythonrdd when called 1 parameters ([index=1, type=jvmobjectreference, value=12], ) @ microsoft.spark.csharp.interop.ipc.jvmbridge.calljavamethod(boolean isstatic, object classnameorjvmobjectreference, string methodname, object[] parameters) @ microsoft.spark.csharp.interop.ipc.jvmbridge.callstaticjavamethod(string classname, string methodname, object[] parameters) @ microsoft.spark.csharp.proxy.ipc.rddipcproxy.collectandserve() @ microsoft.spark.csharp.core.rdd1.collect() @ microsoft.spark.csharp.core.rdd1.reduce(func`3 f) @ microsoft.spark.csharp.examples.piexample.pi() in c:\mydata\apache_spark\sparkclr-master\examples\pi\program.cs:line 76 @ microsoft.spark.csharp.examples.piexample.main(string[] args) in c:\mydata\apache_spark\sparkclr-master\examples\pi\program.cs:line 35 [2016-01-27 15:12:28,131] [1] [info ] [microsoft.spark.csharp.examples.piexample] - completed running examples. calling sparkcontext.stop() tear down ... 
[2016-01-27 15:12:28,131] [1] [info ] [microsoft.spark.csharp.examples.piexample] - if program (sparkclrexamples.exe) not terminate in 10 seconds, please manually terminate java process launched program!!!
requesting close call sockets.
[csharprunner.main] closing csharpbackend
requesting close call sockets.
[csharprunner.main] return csharpbackend code 1
utils.exit() status: 1, maxdelaymillis: 1000

I have a couple of questions, as the documentation and the quick start here: https://github.com/microsoft/sparkclr/wiki/quick-start didn't talk about this.
The quick start says to use the following command for a standalone cluster environment:
cd \path\to\runtime
scripts\sparkclr-submit.cmd ^
--total-executor-cores 2 ^
--master spark://host:port ^
--exe pi.exe ^
\path\to\pi\bin[debug|release] ^
spark.local.dir %temp%

I understand navigating to the runtime folder (locally or on the submitting server) on the first line. Specifying the master makes sense, so that it knows which Spark cluster to run on (this is the remote Spark cluster). Now, what is confusing here is: are we still pointing to the local (Windows) file system for the Pi executable and the temp directory? Can we specify a data directory? If we're specifying a Linux directory on the cluster for our data, what's the format (especially if we're not using Hadoop)? user@spark url:/path/to/sparkclr/runtime/samples/pi/bin?
We're looking to use Spark and SparkR for the processing in our application, and I am trying to understand how the API interacts with Spark: submitting work, retrieving results, etc.
Any help getting the cluster samples up and running (in both client and cluster mode) would be appreciated.
Thanks,
Scott
According to the given error message, it seems that csharpworker.exe is missing. Please double check whether it is present under the directory c:\mydata\apache_spark\sparkclr-master\examples\pi\bin\debug.
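That check can be scripted. The following is only a sketch in Python: the list of required file names is an assumption taken from the typical Pi output folder shown in this answer, and `missing_files` is a hypothetical helper, not part of SparkCLR.

```python
import os
import tempfile

# Assumption: names taken from the typical Pi output folder listed in this answer.
REQUIRED_FILES = [
    "csharpworker.exe",
    "csharpworker.exe.config",
    "microsoft.spark.csharp.adapter.dll",
    "pi.exe",
]


def missing_files(driver_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in driver_dir."""
    # Compare case-insensitively, since Windows file names ignore case.
    present = {name.lower() for name in os.listdir(driver_dir)}
    return [name for name in required if name.lower() not in present]


if __name__ == "__main__":
    # Demo against a throwaway directory; pass your own driver folder instead.
    with tempfile.TemporaryDirectory() as driver_dir:
        open(os.path.join(driver_dir, "pi.exe"), "w").close()
        print("Missing:", missing_files(driver_dir))
```

An empty result means every file in the assumed list was found; anything it returns is worth rebuilding or copying before resubmitting.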
Below is a typical file list for the Pi example, FYI:
01/25/2016 02:36 pm <dir> .
01/25/2016 02:36 pm <dir> ..
01/21/2016 11:58 16,384 csharpworker.exe
01/21/2016 11:58 1,737 csharpworker.exe.config
01/13/2016 09:55 pm 304,640 log4net.dll
01/13/2016 09:55 pm 1,533,153 log4net.xml
01/21/2016 11:58 233,472 microsoft.spark.csharp.adapter.dll
01/13/2016 09:55 pm 520,192 newtonsoft.json.dll
01/13/2016 09:55 pm 501,178 newtonsoft.json.xml
01/21/2016 12:42 pm 8,704 pi.exe
01/13/2016 10:00 pm 1,673 pi.exe.config
01/21/2016 12:42 pm 17,920 pi.pdb
01/25/2016 02:36 pm 24,216 pi.vshost.exe
01/13/2016 10:00 pm 1,673 pi.vshost.exe.config
07/10/2015 07:01 pm 490 pi.vshost.exe.manifest
01/13/2016 09:55 pm 74,240 razorvine.pyrolite.dll
01/13/2016 09:55 pm 40,960 razorvine.serpent.dll

Answers to your other questions:
Question 1: What is confusing here is: are we still pointing to the local (Windows) file system for the Pi executable and the temp directory?
It depends on the deploy mode you use. In client mode, the driver program runs locally, so you only need to put the Pi executable and its dependencies on the local file system. In cluster mode, you need to package the executable and its dependencies into a zip file and upload it to HDFS. You also need to put spark-clr_2.10-1.5.200.jar on HDFS, and then use the command below to submit the application.
sparkclr-submit.cmd --proxy-user miadmin --total-executor-cores 20 --master spark://spark01:7077 --remote-sparkclr-jar hdfs://path/to/spark-clr_2.10-1.5.200.jar --exe pi.exe hdfs://path/to/pi.zip

Question 2: Can we specify a data directory? If we're specifying a Linux directory on the cluster for our data, what's the format (especially if we're not using Hadoop)? user@spark url:/path/to/sparkclr/runtime/samples/pi/bin
If I don't misunderstand, the data directory mentioned here would be used by the driver program. If so, it is entirely up to the driver whether it can handle the format. All arguments specified after the driver directory or zip in the submission command are passed directly to the driver program as program arguments.
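To illustrate that last point, the sketch below shows one way a driver could interpret trailing arguments such as `spark.local.dir %temp%` as key/value pairs. It is written in Python purely for illustration; `parse_driver_args` is a hypothetical helper, and the real Pi example may process its arguments differently.

```python
def parse_driver_args(args):
    """Group trailing command-line arguments into key/value pairs."""
    if len(args) % 2 != 0:
        raise ValueError("expected an even number of arguments (key value ...)")
    # args[0::2] are the keys, args[1::2] are the matching values.
    return dict(zip(args[0::2], args[1::2]))


if __name__ == "__main__":
    # Sample pair modeled on the submission command in the question;
    # the directory value here is made up for the demo.
    sample = ["spark.local.dir", r"c:\temp"]
    for key, value in parse_driver_args(sample).items():
        print(f"{key} = {value}")
```

Because the submit script hands these values over untouched, any path format works as long as the driver code itself knows how to open it.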