CS count white pixels, readback one integer

hi all,

I want to use InterlockedAdd to count my bright pixels. Several examples use a second shader that generates an empty buffer of the same dimensions. Is this mandatory for InterlockedAdd to work? Does it need a pre-existing buffer to work on?

I found the node AddSpectral(ValueBuffer) which should do something very similar. Did a quick test but now I am unsure if the node works at all. I am unable to read back the sum. stride4 should be an integer. callmenames-2021-02-13.v4p (6.0 KB)

I guess I just want to know if I am on the right track.

This is normally how it’s done: you need as many threads as there are elements in the incoming buffer, so the only way is to dispatch the same amount…

You can look at this thread for a solution without InterlockedAdd…

that’s my interlocked version (7.0 KB)

perfect, thx! some questions though :)

Is this about the group shared memory? I don’t get how the different thread counts work together to make InterlockedAdd() work.

ClearCounter writes zeros to tid[x]
dispatch 1,1,1 of threadgroup 8,8,1 = 64 threads
so buffer[0] to buffer[63] = 0

for the main pass let’s say the texture is 16x16, so
dispatch 2,2,1 of threadgroup 8,8,1 = 256 threads.

InterlockedAdd() then runs in 256 threads in parallel, adding to buffer[0] if the condition is met. It only needs to read and write the first position of the buffer. Do you just generate the first buffer value so that the main pass can work on it? And do the threadgroups have to match so that they can access GSM?
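The thread counts above can be checked with a bit of arithmetic. This is only a minimal Python sketch of the numbers, not shader code; `dispatch_size` and `total_threads` are hypothetical helper names, and the 8,8,1 group size is taken from the example in this thread.

```python
def dispatch_size(width, height, group=(8, 8, 1)):
    """Number of thread groups needed to cover a width x height texture.

    Rounds up, so textures that are not a multiple of the group size
    are still fully covered (the shader then needs a bounds check).
    """
    groups_x = (width + group[0] - 1) // group[0]
    groups_y = (height + group[1] - 1) // group[1]
    return (groups_x, groups_y, 1)


def total_threads(dispatch, group=(8, 8, 1)):
    """Total threads launched: groups per axis times threads per group."""
    groups = dispatch[0] * dispatch[1] * dispatch[2]
    threads_per_group = group[0] * group[1] * group[2]
    return groups * threads_per_group


# Clear pass from the thread: one group of 8x8x1 threads
clear_threads = total_threads((1, 1, 1))

# Main pass: a 16x16 texture needs 2x2 groups of 8x8 threads
main_dispatch = dispatch_size(16, 16)
main_threads = total_threads(main_dispatch)
```

The two passes are separate dispatches, so their thread counts do not have to match. The main pass needs at least one thread per pixel; the clear pass only needs enough threads to reset the slots that are actually used.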

I would like to understand this black magic

Group shared memory is a somewhat different thing. It’s hard for me to describe because I haven’t really tested it (I never found a nice example), so I guess I need to read some docs. Anyway, you dispatch threads in groups, so you can write a per-group result to group shared memory (with an interlocked min or max, for example). Then there is a memory barrier that waits until all the threads in the group have finished, so you can do something like: OK, if the group result is this, then this thread’s result is that…
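To make the group shared idea a bit more concrete, here is a rough sequential model in Python, not tested on a GPU. It assumes the usual pattern: each group counts into its own local variable (a `groupshared` uint in HLSL), waits at a barrier (`GroupMemoryBarrierWithGroupSync()` in HLSL), and then a single thread per group adds the group total to the global buffer. The function name `count_bright_grouped` and the threshold parameter are made up for illustration.

```python
def count_bright_grouped(image, threshold, group=(8, 8)):
    """Count pixels above threshold, one global add per thread group.

    image is a list of rows of brightness values. Returns
    (bright_pixel_count, number_of_global_adds).
    """
    height, width = len(image), len(image[0])
    group_w, group_h = group
    groups_x = (width + group_w - 1) // group_w
    groups_y = (height + group_h - 1) // group_h

    global_counter = 0  # stands in for buffer[0]
    global_adds = 0     # how often the global buffer is touched

    for gy in range(groups_y):
        for gx in range(groups_x):
            group_count = 0  # stands in for a groupshared counter
            for ty in range(group_h):
                for tx in range(group_w):
                    x, y = gx * group_w + tx, gy * group_h + ty
                    # Bounds check: edge groups may overhang the texture
                    if x < width and y < height and image[y][x] > threshold:
                        group_count += 1  # atomic add on shared memory on a GPU
            # On a GPU a group barrier would sit here, then one thread
            # of the group adds the group total to the global buffer.
            global_counter += group_count
            global_adds += 1

    return global_counter, global_adds


white = [[1.0] * 16 for _ in range(16)]
result = count_bright_grouped(white, 0.5)
```

For the 16x16 example this touches the global buffer 4 times instead of 256 times, which is the main reason to bother with group shared memory. The plain InterlockedAdd version does not need it at all.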

So… In the case of example:
ClearCounter makes sure that at the beginning of the calculation the first buffer value is 0. The second shader, using InterlockedAdd, then adds to this first buffer value from every thread you dispatch.
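The two passes can be modelled on the CPU like this. It is a hedged sketch in Python, not the patch from the attachment: `CounterBuffer`, `clear_pass` and `count_pass` are hypothetical names, and a lock stands in for the atomicity that InterlockedAdd provides on the GPU.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CounterBuffer:
    """Tiny stand-in for a RW buffer whose slot 0 holds the count."""

    def __init__(self, size=1):
        self.data = [0] * size
        self._lock = threading.Lock()

    def interlocked_add(self, index, value):
        # The lock makes read-modify-write indivisible, like InterlockedAdd
        with self._lock:
            self.data[index] += value


def clear_pass(counter):
    """Pass 1: reset the buffer so the count starts from zero."""
    for i in range(len(counter.data)):
        counter.data[i] = 0


def count_pass(pixels, threshold, counter):
    """Pass 2: one 'thread' per pixel, all adding to slot 0."""

    def thread(brightness):
        if brightness > threshold:
            counter.interlocked_add(0, 1)

    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(thread, pixels))
    return counter.data[0]


pixels = [0.1, 0.9, 0.95, 0.2] * 64  # 256 pixels, half of them bright
counter = CounterBuffer()

clear_pass(counter)
first = count_pass(pixels, 0.5, counter)

# Without clearing, the next frame keeps adding to the old value
second = count_pass(pixels, 0.5, counter)
```

Here `first` is the real count and `second` is twice that, because the buffer keeps its contents between dispatches. That is the whole job of the clear pass: the buffer only has to exist and start at zero, and the main pass does the rest.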
