Help needed with GPU Buffers in Fuse (For and/or ComputeBuffer)

TremensS · December 25, 2023, 6:32pm

Hey there!

I’m trying to optimize my LED mapping toolkit for an installation with quite a lot of points (~20k).
While I have managed to get everything pretty much as light as possible upstream of the almighty Pipet step, I’m trying to enhance performance downstream of Pipet.
With just a few hundred LEDs, I would make operations on the individual R/G/B channels of the color spread out of Pipet using a regular ForEach approach to post-process and prepare all the data for both on-screen previewing and ArtNet output, but now that we have thousands it is struggling quite a bit.
I guess it is time to move this part to buffers (if I remember well, back in the Beta days I was using StructuredBuffers and it was great), but I’m struggling a bit in FUSE land.

I see 2 different approaches:

converting my color spread out of Pipet to a DynamicBuffer, and looping over this buffer in a For (Fuse) region to do my operations in GPU land;
or rather processing this DynamicBuffer with a ComputeBuffer approach.

First question is: is one of these two approaches more relevant than the other for this context?

Second question being: I’m struggling a bit to get either one working anyway 😅

Attached is a patch of my current WIP with both approaches, with a simplistic pipeline: basically my Pipet looks up a black/white texture that is my LEDs’ dimmer, and I want to multiply it with a color that I want to both use as a color spread for my scene preview (basically sent to an InstancingSpread in Stride), and my ArtNet output (R/G/B/W channels serialized).

with For: I guess I’m good with translating my operations to GPU but do not see how to get my ShaderNode back to regular Spreads?
with ComputeBuffer: I have a compilation error reported by ComputeGraph about a signed/unsigned mismatch, but do not really understand where that would come from…
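
For reference, the pipeline described above (one dimmer value per LED, multiplied by a color, then used for both preview and ArtNet) can be sketched on the CPU in plain Python. This is not FUSE or vvvv code; the function names, the sample values and the RGBW channel order are assumptions made only for illustration.

```python
def apply_dimmer(dimmers, color):
    """Scale one RGBW color by each LED's dimmer value (all floats in 0..1)."""
    return [tuple(d * c for c in color) for d in dimmers]


def to_bytes(colors):
    """Flatten float colors into 8-bit channel values, clamped to 0..255."""
    out = bytearray()
    for color in colors:
        for channel in color:
            out.append(max(0, min(255, round(channel * 255))))
    return bytes(out)


# Hypothetical values, standing in for what the dimmer texture lookup returns.
dimmers = [0.0, 0.5, 1.0]
tint = (1.0, 0.5, 0.0, 0.0)            # assumed R, G, B, W order

preview = apply_dimmer(dimmers, tint)  # float colors for the on-screen preview
payload = to_bytes(preview)            # serialized channels for the ArtNet output
```

The multiplication is independent per LED, which is why it maps well to a GPU buffer: each thread only needs its own dimmer value and the shared color.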

Could you give me any pointers to sort this out?
Thanks a lot!

xp-GPU-Buffers_20231225.vl (68.9 KB)

tonfilm · December 25, 2023, 8:50pm

@readme gave the correct answer, pipet (GPU → CPU transfer) should be the last step, immediately before sending the data. Everything before should be on the GPU, if you are using textures, fuse and so on. To use Fuse nodes, simply use fuse texture sampling, to process the texture pixels with fuse nodes.

Hey, not at my computer right now. But as a first optimisation: the pipet already operates on the GPU and creates a buffer for you. It has an internal readback which you don’t need (and should avoid) if you plan on operating on the color values on the GPU further. I think the node lets you disable the readback on one of the input pins, and its second output is the buffer, so you can skip the spread.

When you’re done with your gpu operation, you can have one single readback before outputting to artnet

Depending on the amount of values, reading back values should potentially be avoided altogether. You could also output everything to a texture and use a LED controller which uses a video signal instead if it’s in your budget

Something like this

Schnick-Schnack-Systems: Pixel-Gate

TremensS · December 26, 2023, 5:57pm

Hey there!

@readme

the pipet already operates on the GPU and creates a buffer for you. It has an internal readback which you don’t need (and should avoid) if you plan on operating on the color values on the GPU further. I think the node lets you disable the readback on one of the input pins, and its second output is the buffer, so you can skip the spread.

Indeed I didn’t know there was a hidden output pin on Pipet for a Buffer, thanks for pointing this out!
There doesn’t seem to be a similar option to enable/disable the ColorSpread output though.
If I don’t connect it, is that enough for it not to be processed?

You could also output everything to a texture and use a LED controller which uses a video signal instead if it’s in your budget

Yes this was an option considered, but for “the challenge” I’m trying to optimize my pipeline as much as possible so I can use the same app and hardware (with ideally no additional hardware…) whatever the LED count is. And I should not be too far from done actually!
(also these things are dead expensive, if I had ever needed to go this route I was wondering whether another dedicated PC with a USB 3.0 capture card and a vvvv instance dedicated to a Pipet + ArtNet output would be really that different in perfs? Because definitely half the price!)

@tonfilm

pipet (GPU → CPU transfer) should be the last step, immediately before sending the data. Everything before should be on the GPU, if you are using textures, fuse and so on.

I already do most of the processing before the Pipet, but I still have some operations post-Pipet, for instance to rearrange the color channels before ArtNet output (some LEDs are RGBW, some RGBWA, some RGB, etc.).
While I could also do that by rearranging the channels on the input Texture, now that I know that Pipet can directly output a Buffer without the need of DynamicBuffer, it is already making me stay in GPU land for these last operations without an additional Readback.
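
To make the channel-rearranging step concrete, here is a minimal CPU-side sketch in plain Python (not FUSE or vvvv code). The layout table, the function names and the reading of "A" as amber are assumptions for illustration. The only fixed fact used is that a DMX universe carries 512 channels.

```python
# Hypothetical per-fixture channel orders; "a" is assumed to mean amber.
LAYOUTS = {
    "RGB": ("r", "g", "b"),
    "RGBW": ("r", "g", "b", "w"),
    "RGBWA": ("r", "g", "b", "w", "a"),
}


def serialize(fixtures):
    """Concatenate channel bytes for (layout_name, color_dict) pairs."""
    data = bytearray()
    for layout_name, color in fixtures:
        for channel in LAYOUTS[layout_name]:
            # Channels missing from the color default to 0.
            data.append(color.get(channel, 0))
    return bytes(data)


def split_universes(data, size=512):
    """Cut the byte stream into chunks of at most `size` channels."""
    return [data[i:i + size] for i in range(0, len(data), size)]


fixtures = [
    ("RGB", {"r": 1, "g": 2, "b": 3}),
    ("RGBW", {"r": 4, "g": 5, "b": 6, "w": 7}),
]
stream = serialize(fixtures)
universes = split_universes(stream)
```

Note that this naive split can cut a fixture in half at a universe boundary; a real patch would probably pad each universe so that fixtures stay whole.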

And while I agree I could/will probably work my needs around Textures before Pipet, I’m actually still interested in understanding how to properly wire these two approaches in my patch, as I’m getting used to FUSE logic and this might come in handy in other situations too ;)

I guess my real trouble is the actual readback phase, in both approaches drafted in the patch:

with the For region approach (middle panel), how do I actually Readback the Shadernode to a regular Spread?
with the ComputeBuffer approach (right panel), any idea on how to get rid of the compilation error?

Cheers all for your help!

tonfilm · December 27, 2023, 10:49pm

the point is that you do not need the pipet at all. the pipet is a texture sample + readback, the only thing you need here is a texture sample, which exists in fuse.

If you want to rearrange pixels, you can create a static immutable texture with the coordinates in it and sample the texture with these coordinates. Rearranging the channels per pixel should then be just a split and join node.
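
The idea of a coordinate lookup followed by a split and join can be illustrated on the CPU with plain Python lists. This is only a sketch of the concept, not how FUSE implements texture sampling; all names and values below are hypothetical.

```python
def sample(texture, x, y):
    """Nearest-pixel lookup in a texture stored as a list of rows."""
    return texture[y][x]


def gather(texture, lookup):
    """Read one pixel per LED, in LED order, using precomputed coordinates."""
    return [sample(texture, x, y) for (x, y) in lookup]


def swizzle(pixel, order):
    """Split a pixel into channels and join them back in a new order."""
    return tuple(pixel[i] for i in order)


# A 2x2 source image of (R, G, B) pixels.
texture = [
    [(10, 20, 30), (40, 50, 60)],
    [(70, 80, 90), (100, 110, 120)],
]

# Static lookup: LED 0 reads pixel (x=1, y=0), LED 1 reads pixel (x=0, y=1).
lookup = [(1, 0), (0, 1)]

leds = gather(texture, lookup)
leds_grb = [swizzle(p, (1, 0, 2)) for p in leds]  # reorder RGB to GRB
```

Because the lookup never changes, it only has to be built once; each frame is then a pure gather, with no per-LED logic left on the CPU.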

texone · December 31, 2023, 9:32am

I guess this is how I would do it

You need a computeBuffer to write the values
This also includes reading values from texture directly as already suggested

texone · December 31, 2023, 9:32am

xp-GPU-Buffers_2.vl (27.7 KB)

TremensS · December 31, 2023, 5:15pm

Hey @texone and others

Thanks for all the hints you gave here and on Element.

Just as a personal memo, I put together a patch gathering snippets for the different approaches to readback data or textures from FUSE.
I’m attaching it here in case it might help other FUSE newbies struggling on the same basic workflows.
(Maybe it could make it to FUSE Help patches?)

Have a great NYE everyone! 🥳🖤

[EDIT 2024.01.13]
Updated the attached patch, fixed a few mistakes in the "Compute Buffer with <> type Spread" part

ExsS.snippets.FUSE.readback_20240113.vl (170.1 KB)

Noir · April 22, 2024, 1:39pm

Thanks @TremensS for sharing!