I am using an append buffer in a compute shader, and the amount of data being appended depends on processing the incoming image. I have “Reset Counter” and “Appendable” set on the Renderer, but the count of the data output sticks at the highest level seen.
I am using DX11.BufferReadbackDynamic to get the data from the Renderer, which is what is reporting the data count.
It is my understanding that setting “Reset Counter” to true should reset the buffer count to the “Reset Counter Value” each frame, which I have at the default of zero. Is that not the case?
However, I noticed that if the “Element Count” pin is merely disconnected and reconnected (so the value does not change), the output count resets to zero and then starts sticking at the highest value seen again.
I tried a hack workaround of toggling “Element Count” by one each frame to force a count reset, only to get no data output at all, apparently because of an “Element Count” bug described elsewhere.
I poked around a bit more, and what seems to be happening is that the buffer is cleared initially and not afterward. The actual element count is always the number specified on the Renderer. The count I was seeing stuck at the high point was the number of elements that had actually been written to at some point.
The Appends in the compute shader start at the beginning of the buffer each frame, which is cool, but how on earth to know how many appends were actually done each frame?
Append buffers are fixed size, so the real (or maximum) buffer size is always the one specified in the Element Count pin.
Append buffers hold an internal counter which is then used for processing only “valid elements”.
To retrieve that number -> CopyCounter + ReadBack (DX11Buffer Raw), with a size of 16, will give you how many elements got appended.
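To make the difference between the fixed buffer size and the internal counter concrete, here is a small CPU-side toy model in C++. It is only a sketch and does not use the D3D11 API: `AppendBufferModel`, `resetCounter`, `copyCounter` and `countEverWritten` are hypothetical names made up for illustration, and rejecting appends past capacity is a simplification, not a statement about what the hardware does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of an append buffer (illustration only, NOT the D3D11 API).
struct AppendBufferModel {
    std::vector<std::uint32_t> storage; // fixed size, like the "Element Count" pin
    std::uint32_t counter = 0;          // internal counter of valid elements

    explicit AppendBufferModel(std::size_t elementCount)
        : storage(elementCount, 0) {}

    // Assumed behaviour of "Reset Counter": only the counter is reset,
    // the old contents of the storage are left untouched.
    void resetCounter(std::uint32_t value = 0) {
        assert(value <= storage.size());
        counter = value;
    }

    // Stand-in for Append() in the shader. Appends past capacity are
    // simply rejected here (a simplification of this model).
    bool append(std::uint32_t v) {
        if (counter >= storage.size()) return false;
        storage[counter++] = v;
        return true;
    }

    // Stand-in for the CopyCounter + ReadBack step.
    std::uint32_t copyCounter() const { return counter; }
};

// Counts non-zero slots, i.e. elements written at some point in any frame.
// This mimics reading back the whole buffer and counting non-empty entries.
std::size_t countEverWritten(const AppendBufferModel& b) {
    std::size_t n = 0;
    for (std::uint32_t v : b.storage) {
        if (v != 0) ++n;
    }
    return n;
}
```

In this model, appending 5 values in one frame, resetting the counter, and appending 2 values in the next frame leaves the counter at 2, while 5 slots still hold data. That matches the "sticks at the highest value seen" symptom when the whole buffer is read back instead of the counter.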
Changing the element count every frame is a pretty bad idea (actually it should not change at all for your whole application lifetime, except at design time of course), since you recreate the resource every frame, which will at some point lead to memory fragmentation and can cause slowdowns.
I’m a tad confused. I hooked up a CopyCount node, and then a ReadBack (Raw) node, but the ReadBack node has no output pins! What am I missing?
Is there a way to access that counter inside a dynamic plugin? I checked the available members of an IDX11RWStructureBuffer and only see ElementCount, which is why I thought it must be storing the count. Should another buffer data type be used? I tried some names to no avail (IDX11AppendStructuredBuffer, etc.).
I am using B32, and the latest DX11 pack. Thanks vux!
Thanks vux for the detailed explanation. Duly noted about the GPU stall - is that also happening with CopyCounter?
I need this as the shader can have a variable size output. I’m converting a dynamic plugin I made for taking a depth camera depth image, converting the depth to XYZ, applying a camera transform to it to get world-relative data, then applying bounding boxes for the interaction areas (with per-BBox sub sampling). This allows me to dynamically focus on areas of interest (such as moving hands) and easily combine point clouds from multiple cameras.
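For readers following along, the per-pixel work described above (depth to XYZ, camera transform, bounding box test) can be sketched on the CPU roughly as below. This is not the actual plugin or shader code: it assumes a simple pinhole camera model, and `Intrinsics`, `unproject`, `transformPoint` and `insideBox` are hypothetical names. Per-box subsampling is left out.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Assumed pinhole intrinsics: focal lengths and principal point in pixels.
struct Intrinsics { float fx, fy, cx, cy; };

// Row-major 4x4 camera-to-world transform; last row assumed to be 0,0,0,1.
using Mat4 = std::array<float, 16>;

// Axis-aligned bounding box in world space.
struct Box { Vec3 lo, hi; };

// Pixel (u, v) with depth along the camera Z axis -> camera-space XYZ.
Vec3 unproject(int u, int v, float depth, const Intrinsics& k) {
    assert(k.fx != 0.0f && k.fy != 0.0f);
    return { (u - k.cx) * depth / k.fx,
             (v - k.cy) * depth / k.fy,
             depth };
}

// Applies rotation and translation of the row-major matrix to a point.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
             m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
             m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11] };
}

// True if the point lies inside the box (bounds inclusive).
bool insideBox(const Box& b, const Vec3& p) {
    return p.x >= b.lo.x && p.x <= b.hi.x &&
           p.y >= b.lo.y && p.y <= b.hi.y &&
           p.z >= b.lo.z && p.z <= b.hi.z;
}
```

In the GPU version, only points for which the box test succeeds would be appended, which is why the number of output elements varies from frame to frame.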
I’m seeing conservatively a 10x speed up moving this code to the GPU. I’ll roll this into the Kinect nodes when I get it all working. Oh, and as I still use Primesense cameras, my next move is to convert the OpenNI nodes to DX11 to save the texture conversion, which appears to introduce a frame lag.
I am writing some indices within a compute shader into an AppendStructuredBuffer.
Now I want to access these indices in a further compute shader. At the moment I am using the CopyCounter/ReadBack solution you mentioned above to get the count of appended indices. Then I use this count to set up the Dispatcher.
Unfortunately this performs very poorly because of the readback.
Is there a way to access the AppendStructuredBuffer without dynamically setting the Dispatcher Threads? Do I have to utilize DispatchIndirect? And if yes - how?
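As far as I know, the usual Direct3D 11 route is to keep the count on the GPU: `ID3D11DeviceContext::CopyStructureCount` copies the hidden counter into another buffer, and `ID3D11DeviceContext::DispatchIndirect` reads three unsigned integers (the thread group counts in X, Y, Z) from an arguments buffer. Whether and how the DX11 pack exposes this as nodes is not something this sketch confirms. What still has to be done somewhere, typically in a tiny compute shader, is turning the element count into a group count. The C++ below only illustrates that arithmetic on the CPU; `DispatchArgs`, `makeDispatchArgs` and `threadIsValid` are hypothetical names, and a one-dimensional dispatch is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Layout expected by an indirect dispatch: three unsigned group counts.
struct DispatchArgs { std::uint32_t x, y, z; };

// Converts the number of appended elements into thread group counts.
// threadsPerGroup is assumed to match the X value of [numthreads(...)]
// in the consuming shader.
DispatchArgs makeDispatchArgs(std::uint32_t appendedCount,
                              std::uint32_t threadsPerGroup) {
    assert(threadsPerGroup > 0);
    // Ceiling division written so that it cannot overflow.
    std::uint32_t groups = appendedCount / threadsPerGroup +
                           (appendedCount % threadsPerGroup != 0 ? 1u : 0u);
    return { groups, 1u, 1u };
}

// The last group is usually only partly filled, so the consuming shader
// needs a guard like this before it touches the buffer.
bool threadIsValid(std::uint32_t globalThreadId, std::uint32_t appendedCount) {
    return globalThreadId < appendedCount;
}
```

With 64 threads per group, 1000 appended indices give 16 groups, and threads 1000 to 1023 of the last group are skipped by the guard. Since nothing is read back to the CPU, the readback cost described above is avoided in this scheme.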