c++ - OpenCL: Running parallel tasks on a data-parallel kernel
I'm reading up on the OpenCL framework for my thesis work. From what I've come across so far, you can run kernels either in data parallel or in task parallel. I've got a question that I can't manage to find an answer to.
Q: Say I have a vector that I want to sum up. I can do that in OpenCL by writing a kernel for a data-parallel process and running it. Simple.
However, now I have 10+ different vectors that need to be summed as well. Is it possible to run these 10+ different vectors in task parallel, while still using a kernel that processes each of them "data parallel"?
So I would parallelize the tasks, which are themselves, in a sense, run in parallel? I ask because what I've come to understand is that you can either run several tasks in parallel, or run one task itself in parallel.
The whole task-parallel/data-parallel distinction in OpenCL was a mistake. clEnqueueTask was deprecated in OpenCL 2.0 because it had no meaning.
All enqueued entities in OpenCL can be viewed as tasks. Those tasks may run concurrently, they may run in parallel, or they may be serialized. You may need multiple queues to run them concurrently, or a single out-of-order queue; this is all implementation-defined, if flexible.
Those tasks may be data-parallel, if they are made of multiple work-items working on different data elements within the same task. They may not be, consisting of only one work-item. That last definition is what clEnqueueTask used to provide. However, because it had no meaning whatsoever compared with clEnqueueNDRangeKernel with a global size of (1,1,1), and was not checked against anything in the kernel code, deprecating it was the safer option.
So yes, if you enqueue multiple NDRanges, you can have multiple tasks in parallel, each one of which is data-parallel.
You can also copy all of the vectors at once inside one data-parallel kernel, if you are careful with the way you pass them in. One option is to launch a range of work-groups, where each one iterates through a single vector, copying it (that might well be the fastest way on a CPU, for cache prefetching reasons). You could instead have each work-item copy one element, using a more complex lookup to see which vector to copy from, but that would likely have high overhead. Or you can just launch multiple parallel kernels, one per vector, and have the runtime decide whether it can run them together.
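To make the two single-kernel layouts above more concrete, here is a rough sketch in plain C++ that emulates the indexing on the CPU. It is not OpenCL code and makes no claim about how a real kernel should be written. It uses the summation from the question rather than a copy, and the packed layout (`PackedVectors`, `pack`, `sum_per_group`, `vector_of`) is a hypothetical illustration: all vectors are stored back to back in one flat buffer, with an offsets table marking where each one starts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: every vector packed back to back in one flat buffer.
// Vector g occupies data[offsets[g]] up to (not including) data[offsets[g + 1]].
struct PackedVectors {
    std::vector<float> data;
    std::vector<std::size_t> offsets;  // size is (number of vectors + 1)
};

// Flatten the input vectors and record where each one begins.
PackedVectors pack(const std::vector<std::vector<float>>& vectors) {
    PackedVectors p;
    p.offsets.push_back(0);
    for (const auto& v : vectors) {
        p.data.insert(p.data.end(), v.begin(), v.end());
        p.offsets.push_back(p.data.size());
    }
    return p;
}

// Option 1: one "work-group" per vector. The outer loop stands in for the
// range of work-groups; each group walks its own vector from start to end.
std::vector<float> sum_per_group(const PackedVectors& p) {
    assert(!p.offsets.empty());
    const std::size_t num_groups = p.offsets.size() - 1;
    std::vector<float> sums(num_groups, 0.0f);
    for (std::size_t g = 0; g < num_groups; ++g) {
        for (std::size_t i = p.offsets[g]; i < p.offsets[g + 1]; ++i) {
            sums[g] += p.data[i];
        }
    }
    return sums;
}

// Option 2: one "work-item" per element. Each item must first find out which
// vector its global id belongs to; this search is the per-item lookup cost.
// Precondition: gid < p.data.size().
std::size_t vector_of(const PackedVectors& p, std::size_t gid) {
    // upper_bound finds the first offset greater than gid; the owning
    // vector is the one whose start offset comes just before it.
    auto it = std::upper_bound(p.offsets.begin(), p.offsets.end(), gid);
    return static_cast<std::size_t>(it - p.offsets.begin()) - 1;
}
```

In this emulation, `sum_per_group(pack(vectors))` returns one sum per input vector, while `vector_of` shows the extra search every element would have to perform in the one-work-item-per-element variant.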