apache spark - Tool for distributed task execution
Is it beneficial to use Spark for distributed task execution? I have a requirement to process huge datasets (read from a database, process, write back to a database), and the processing is done at the row level. That means I do not need a reduce step or machine learning.
Would it be overkill to use Spark for this kind of requirement? What would best suit it? I do not want to write the software infrastructure myself to distribute the work optimally, handle failures, retries, etc.
Spark is meant more for processing (really) large data sets, in memory. One option is to use an open source IMDG (in-memory data grid) and process the data in a similar fashion, with (maybe) less complexity.
You can choose the IMDG engine based on the language you want to use it with. For .NET you can use NCache; for Java there are many, and you can use TayzGrid.
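To make the shape of the workload concrete, here is a minimal single-machine sketch of the row-level pattern from the question: split the rows into batches, process each batch independently, and retry a batch when it fails. It uses only the Python standard library, not Spark or any IMDG API. The names `chunked`, `process_with_retry`, and `run_job`, and the batch size and retry count, are assumptions made for illustration. Spark or an IMDG would do roughly this across machines, adding the scheduling and fault handling on top.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` rows."""
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def process_with_retry(
    batch: List[T], fn: Callable[[T], R], max_attempts: int = 3
) -> List[R]:
    """Apply `fn` to every row; on any error, retry the whole batch."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return [fn(row) for row in batch]
        except Exception:
            # Give up and surface the error once the attempts are used up.
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")


def run_job(
    rows: Iterable[T],
    fn: Callable[[T], R],
    batch_size: int = 1000,
    workers: int = 4,
) -> List[R]:
    """Process rows batch by batch on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(process_with_retry, batch, fn)
            for batch in chunked(rows, batch_size)
        ]
        results: List[R] = []
        for future in futures:
            results.extend(future.result())
    return results
```

In a real job the `rows` iterable would come from a database cursor and the results would be written back in batches. Because each row is processed independently and nothing is reduced, the batches never need to talk to each other, which is why a lighter tool than Spark may be enough.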