How to use VL to split data between threads?

Hi there,

We are working on a project that involves a massive amount of computation for a big collection of point clouds, and we need a solution for splitting that computation between as many cores as possible. We use 5 x dual 8-core Xeon processors, for a total of 160 logical processors at 3 GHz each.
Now, I am aware of posts like "Ultimate way of sharing data among instances" as well as similar ones, and we tried different scenarios with moderate success. The most promising seems to be the ZeroMQ approach, but this one also has its drawbacks, among them the pain in the B of implementing it manually and the fact that there is no sync option.

I was reading this introduction in The Gray Book, I quote:

"You want to offload parts of your patch to separate threads

Large patches can become computationally expensive and vvvv does not allow you to use the full power of your PC by being inherently single-threaded. Using vl you can define regions of your program that you want to run asynchronously to the main patch, thus using multiple CPUs in parallel."

Well yeah, that’s precisely what we need! The question is, how to do that?

radoo

hei radoo,

can you describe in a bit more detail what you want to do?

  • where do you get the pointcloud data from? do you load it from files, or does it come in via the network…?
  • what kind of computation are you talking about?
  • where is the data supposed to go then? back to files? out over the network? out for display?

Hey,
thank you for the quick reply!

The point clouds are algorithmically generated, and their movement (driven by intricate algorithms as well) creates long trails. That is the computational problem I talked about. The data goes to a renderer.
I would like to split those point clouds into chunks and distribute that over as many threads as possible.
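
For illustration only, here is a minimal Python sketch of that chunk-and-distribute idea. It is not VL and not our actual code: `split_into_chunks`, `advance_chunk` and `advance_all` are hypothetical names, and the per-point update is a placeholder for the real movement algorithm. It uses a thread pool to show the structure; in CPython, pure-Python work on threads does not run in parallel, so a process pool would be the swap for real CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(points, n_chunks):
    """Split a list into n_chunks contiguous slices of near-equal size."""
    size, rest = divmod(len(points), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `rest` chunks take one extra element each.
        end = start + size + (1 if i < rest else 0)
        chunks.append(points[start:end])
        start = end
    return chunks


def advance_chunk(chunk, dt):
    """Placeholder update (assumption): shift every point along +x by dt."""
    return [(x + dt, y, z) for (x, y, z) in chunk]


def advance_all(points, dt, workers):
    """Update all points by handing one chunk to each worker."""
    chunks = split_into_chunks(points, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so point order is preserved.
        results = pool.map(advance_chunk, chunks, [dt] * len(chunks))
        return [p for chunk in results for p in chunk]


if __name__ == "__main__":
    cloud = [(float(i), 0.0, 0.0) for i in range(10)]
    print(advance_all(cloud, 1.0, workers=4))
```

This only works this simply when each point can be updated independently of the others; anything that needs neighbouring points or shared trail state would need extra synchronisation.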

How many points are you talking about? And how do you draw them?

8 to 12 million, rendered as buffered sprites and lines

How important is it that the particles are moved via CPU? Is it a very complex algorithm?

I think with this many points, no CPU would give you great frame rates. Also, if you do it on the CPU, you have to upload the whole data set to the GPU every frame.

The ideal solution for this scenario is to upload data once to the GPU and then animate it with compute shaders… But that depends on what exactly you want to do with the points.
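
To put the per-frame upload in rough numbers, here is a back-of-the-envelope sketch in Python. The point count is the upper figure from this thread; the 12 bytes per point (three 32-bit floats for x, y, z) and the 60 fps target are assumptions, and trail history is not counted.

```python
POINTS = 12_000_000       # upper figure mentioned in the thread
BYTES_PER_POINT = 3 * 4   # assumption: float32 x, y, z per point
FPS = 60                  # assumption: target frame rate


def upload_bytes_per_second(points, bytes_per_point, fps):
    """Bytes sent to the GPU per second if everything is re-uploaded each frame."""
    return points * bytes_per_point * fps


per_frame_mb = POINTS * BYTES_PER_POINT / 1e6
per_second_gb = upload_bytes_per_second(POINTS, BYTES_PER_POINT, FPS) / 1e9
print(f"{per_frame_mb:.0f} MB per frame, {per_second_gb:.2f} GB/s at {FPS} fps")
```

Under those assumptions that is 144 MB per frame and 8.64 GB/s for positions alone, which is why keeping the data resident on the GPU is attractive.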

Sure, look, we are aware of this and we ported everything we could to the GPU (compute shaders, like you said; creation, animation etc. happen on the GPUs), which is a bunch of Titan XPs that eat that amount on bread and ask for dessert afterwards. However, we reached a scenario where it is not possible to use the GPUs, for two reasons. The first is that the calculation of the trails gets screwed up after a certain threshold; the thread group size seems to be the culprit there, but maybe I’m wrong. The end of the queue gets screwed up, and the longer the trails, the bigger the problem.
The second reason is that with different point clouds coming from different GPUs we are forced to do a readback of that data in order to send them over and render them in one place. But if you have a suggestion here, I am going to make your portrait in dough, vegetables or chewing gum!