apache spark - Tool for distributed task execution


Is it beneficial to use Spark for distributed task execution? I have a requirement to process huge datasets (read from a database, process, write back to a database), and the processing is done at the row level. That means I do not need a reduce step or machine learning.

Would it be overkill to use Spark for this kind of requirement? What would best suit it? I do not want to write the software infrastructure myself to distribute the work optimally, handle failures, retries, etc.
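To make the shape of the workload concrete, here is a minimal single-machine sketch of a read, per-row process, write job with partitioning and retries. It uses only the Python standard library, with an in-memory SQLite database standing in for the real one. The table names `source` and `target`, the `transform` step, and the partition and retry settings are all hypothetical and only illustrate the pattern. It is not how Spark or any IMDG implements this.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def transform(row):
    """Hypothetical row-level step: the result depends only on this row."""
    row_id, value = row
    return row_id, value * 2


def process_partition(rows, fn=transform, max_retries=3):
    """Apply fn to every row of one partition, retrying the partition on failure."""
    for attempt in range(1, max_retries + 1):
        try:
            return [fn(r) for r in rows]
        except Exception:
            # Give up only after the last allowed attempt.
            if attempt == max_retries:
                raise


def run_job(conn, partition_size=100, workers=4):
    """Read all rows, process partitions in parallel, write the results back."""
    rows = conn.execute("SELECT id, value FROM source ORDER BY id").fetchall()
    partitions = [rows[i:i + partition_size]
                  for i in range(0, len(rows), partition_size)]
    # No reduce step: each partition is processed independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_partition, partitions))
    with conn:  # commit all writes in one transaction
        for part in results:
            conn.executemany("INSERT INTO target (id, value) VALUES (?, ?)", part)
    return sum(len(part) for part in results)


def demo():
    """Build a small in-memory database and run the job against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany("INSERT INTO source VALUES (?, ?)",
                     [(i, i) for i in range(1000)])
    written = run_job(conn)
    total = conn.execute("SELECT SUM(value) FROM target").fetchone()[0]
    return written, total
```

Calling `demo()` runs the whole pipeline on 1000 sample rows. Everything a distributed engine would add on top (spreading partitions across machines, rescheduling work from a failed node) is exactly the infrastructure I would rather not write myself.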

Spark is meant more for processing (really) large data sets in memory. One option is to use an open source IMDG (in-memory data grid) and process the data in a similar fashion with (maybe) less complexity.

You can choose an IMDG engine based on the language you want to use it with. For .NET you can use NCache; for Java there are many, and you could use TayzGrid.

