Instancing vs instance noodles

ggml · August 16, 2018, 10:53pm

just finding out that instances in Instance Noodles are not the GPU-efficient kind one might have been expecting,
as seems to be demonstrated in this patch:
instancing question.zip (66.1 KB)
[edit: this is because MaxElements was left to 0]

is it possible to patch vertex instancing with transform buffers without readback?
(as in, using the Superphysical instanced version with Noodles transform buffers?)
or if someone is able to link the two in code, would they be interested in a collab?

tekcor · August 17, 2018, 7:25am

To use instancing with transform buffers, you have to add this to your shader:

Variable Declaration:

StructuredBuffer<float4x4> Transform;

Add the instance ID to your input structure:

struct vsInput
{
    ....
    uint ii : SV_InstanceID;
};

In the vertex shader, read the buffer based on this ID:

float4x4 tW = Transform[In.ii];

Apply it to the vertex positions of your geometry:

float4 PosW = mul(In.PosO, tW);

done.
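
Put together, the steps above might look like the following minimal sketch. It is not taken from the pack or from Superphysical. The `tVP : VIEWPROJECTION` semantic, the `psInput` struct, the flat white pixel shader and the `technique11` block are assumptions added only to make the fragment complete; `Transform`, `vsInput`, `ii`, `tW`, `PosO` and `PosW` are the names from the post.

```hlsl
// Per-instance world transforms, one float4x4 per instance.
StructuredBuffer<float4x4> Transform;

// View-projection matrix. The semantic name is an assumption
// (the usual vvvv DX11 convention), not something stated in the thread.
float4x4 tVP : VIEWPROJECTION;

struct vsInput
{
    float4 PosO : POSITION;      // object-space vertex position
    uint ii : SV_InstanceID;     // index of the instance being drawn
};

// Hypothetical output struct, added for illustration.
struct psInput
{
    float4 PosWVP : SV_Position;
};

psInput VS(vsInput In)
{
    psInput Out;

    // Fetch this instance's transform straight from the GPU buffer.
    float4x4 tW = Transform[In.ii];

    // Object space -> world space, then world space -> clip space.
    float4 PosW = mul(In.PosO, tW);
    Out.PosWVP = mul(PosW, tVP);

    return Out;
}

// Placeholder pixel shader: flat white.
float4 PS(psInput In) : SV_Target
{
    return float4(1, 1, 1, 1);
}

technique11 Instanced
{
    pass P0
    {
        SetVertexShader(CompileShader(vs_5_0, VS()));
        SetPixelShader(CompileShader(ps_5_0, PS()));
    }
}
```

Because the transform is looked up by `SV_InstanceID` inside the vertex shader, the buffer never has to be read back to the CPU.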

Knowing this, you can add whatever buffers you like to your shader, for example to pass colors to the pixel shader. In that case you read the color buffer in the vertex shader and pass the value on to the pixel shader, just like any other attribute data.
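
As a sketch of the color case, assuming a second structured buffer with one color per instance: the buffer name `Color`, the `COLOR0` interpolator, the `psInput` struct and the `tVP : VIEWPROJECTION` semantic are hypothetical choices for illustration, not names from the pack.

```hlsl
StructuredBuffer<float4x4> Transform;   // per-instance transforms
StructuredBuffer<float4> Color;         // per-instance colors (hypothetical name)

// Assumed view-projection semantic, following the usual vvvv DX11 convention.
float4x4 tVP : VIEWPROJECTION;

struct vsInput
{
    float4 PosO : POSITION;
    uint ii : SV_InstanceID;
};

struct psInput
{
    float4 PosWVP : SV_Position;
    float4 Col : COLOR0;         // carried from the vertex to the pixel shader
};

psInput VS(vsInput In)
{
    psInput Out;

    float4x4 tW = Transform[In.ii];
    Out.PosWVP = mul(mul(In.PosO, tW), tVP);

    // Read the color with the same instance ID and hand it on.
    Out.Col = Color[In.ii];

    return Out;
}

float4 PS(psInput In) : SV_Target
{
    // Every pixel of an instance receives that instance's color.
    return In.Col;
}

technique11 InstancedColor
{
    pass P0
    {
        SetVertexShader(CompileShader(vs_5_0, VS()));
        SetPixelShader(CompileShader(ps_5_0, PS()));
    }
}
```

Both buffers are indexed by the same instance ID, so they should hold at least as many elements as there are instances.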

everyoneishappy · August 17, 2018, 8:57am

You should set your max elements pins to the total number of vertices. Otherwise the nodes do a small readback via pipeline statistics to set it for you automatically. Please don’t trash talk the pack.

ggml · August 17, 2018, 9:27am

"You should set your max elements pins to the total number of vertices."

thank you, this seems key to performance
it should be mentioned in the instancing demo from the pack/workshop

as for trash talking, i could never; i am basing most of my v4 practice on noodles at the moment
i was asking for a comment on the performance difference

why does the attached example perform better in non-instanced than in instanced mode?

ggml · August 17, 2018, 11:47am

attached are two instancing scenarios
instancing question 2.zip (833.0 KB)

are the max elements being calculated correctly?
is the performance difference between the two examples a normal consequence of operation complexity, or is there room for more patching efficiency?
