Need some hints for research - Stride - System.OutOfMemoryException

Hi folks,

I use Gamma preview 2020.3.0-176 with Stride, plus Skia, Elementa, MQTTnet, Kinect and XML. There are many images involved and I have a memory problem:
When the app runs, the memory in Task Manager goes up very slowly, and after 12-16 hours it always crashes with a System.OutOfMemoryException.

How can I see what’s going on? How can I optimize? What can I try?

I suspected the file loading. I have 3000 DDS files and loaded them only when they were needed, so the loaded set changes all the time, which performs very well. I also switched to loading all DDS files with FileTexture and picking them via GetSpread, but in both cases the memory and the “paged pool” went up.
Does this process fill the memory, and if so, is there a way to load and delete things? Or what else could be filling the memory? When I use Skia’s ImageReader with PNG files I don’t get this memory effect, but there are other problems.

Thanks in advance for any hints on researching memory problems.

My main question here would be whether the 3000 textures can actually stay in memory all at once. What happens if you build a little test patch where you load all of them like you say, and then switch through them one by one each frame so all of them get displayed at least once?

Or did I misunderstand and you are only loading different subsets of those 3000 textures? If so, then you’ll also have to patch some logic which unloads those textures. If you need help with that feel free to ask.
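In case it helps, here is a minimal C# sketch of what such unload logic boils down to (an illustration with made-up names, not actual VL.Stride nodes; in a patch you’d express the same with a Cache region and Dispose): keep only the currently requested subset alive and dispose everything that dropped out of it.

using System;
using System.Collections.Generic;
using System.Linq;
using Stride.Graphics;

public class TextureSubsetCache
{
    private readonly Dictionary<string, Texture> loaded = new Dictionary<string, Texture>();

    // 'load' stands in for whatever actually creates a texture from a file.
    public List<Texture> Update(IEnumerable<string> neededPaths, Func<string, Texture> load)
    {
        var needed = new HashSet<string>(neededPaths);

        // Unload textures that are no longer part of the requested subset.
        foreach (var path in loaded.Keys.Where(p => !needed.Contains(p)).ToList())
        {
            loaded[path].Dispose();
            loaded.Remove(path);
        }

        // Load newly requested files and return the current subset.
        var result = new List<Texture>();
        foreach (var path in needed)
        {
            if (!loaded.TryGetValue(path, out var texture))
                loaded[path] = texture = load(path);
            result.Add(texture);
        }
        return result;
    }
}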

Hi Elias, thanks for the hint

I had already tried loading all textures with a Cache/ForEach/FileTexture combination (also without Cache), but I didn’t try to get them all displayed. So now I load all of them in the MainRenderer with alpha set to 0 (just for testing). Memory is very full, but it still works. The rest of the logic is like before: selecting different subsets of the loaded textures via GetSpread. After 3 hours the memory shown in Task Manager (committed, paged pool, non-paged pool) had gone up again.

Yes, I’m only loading different subsets of those 3000 textures.
Yes, I think I need help understanding how to load and unload FileTextures in Stride.

I found the VL.Stride.TextureArray stuff from bjoern, but is this what I should use in my case?
I’m used to the FileTexture (DX11) and the nice Player (DX11) nodes from beta, but I have to rethink things in Gamma.

Thanks, greets

(Edit, a day later:)
BUT loading all textures by connecting them to the Renderer does make a difference: the patch has now been running for 18 hours and so far there’s been no System.OutOfMemoryException crash (the memory in Task Manager (committed, paged pool, non-paged pool) is 20 times higher, but it’s still running).
AND this time it crashed after 24 hours.

@bjoern Did I dream this or did you post a VL.Stride patch the other day which was similar to the old Player node?

@CeeYaa Do you need to display several textures at once, or is it like the stack player, one after the other?

I need to display several (around 25) textures at once. But having a stack player like the good old players would be awesome; isn’t someone working on one? As I said, I found VL.Stride.TextureArray, which goes in that direction.

I tried to patch a stack-player node, but all my attempts were somehow flawed: either performance was bad or there were memory leaks. In the end it wasn’t really needed and I stopped my pursuit. Maybe I’ll give it another go.
The “player” I posted in the chat was just loading everything at once and playing it back from VRAM, basically FileTexture + GetSlice. During my (very limited) testing with this approach I didn’t encounter any memory leaks, though.

@CeeYaa
I don’t think TextureArray is what you are looking for. It just copies all files/textures into one texture, and the textures all have to have the same size, format and mipmap count. Also, memory consumption should be about the same as loading all textures separately and keeping them in a spread. Additionally, something like GetSlice (DX11.TextureArray), to conveniently access the different slices of a TextureArray, is still missing in Stride. I guess one could do something like this in a TextureFX (haven’t tried):

shader GetTexture_TextureFX : TextureFX
{
    // Index of the array slice to output.
    int Index = 0;

    stage override float4 Shading()
    {
        // Sample the input (assuming Texture0 is bound as a Texture2DArray)
        // at the current UV, using the third coordinate to select the slice.
        float2 uv = streams.TexCoord;
        return Texture0.Sample(LinearSampler, float3(uv, (float)Index));
    }
}
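A GetSlice-like node could also be approximated on the copy side instead of in a shader. Here is a rough, untested C# sketch; it assumes Stride’s CommandList.CopyRegion and the usual D3D11 subresource layout (mipLevel + arraySlice * mipLevelCount), and the names are illustrative:

using Stride.Graphics;

public static class TextureArraySlice
{
    // Copies one array slice (all of its mips) into a plain Texture2D of the
    // same size/format so nodes that expect a single texture can consume it.
    public static void CopySlice(CommandList commandList, Texture textureArray, int slice, Texture target)
    {
        for (var mip = 0; mip < textureArray.MipLevels; mip++)
        {
            // D3D11 subresource index = mipLevel + arraySlice * mipLevelCount.
            var sourceSubresource = mip + slice * textureArray.MipLevels;
            commandList.CopyRegion(textureArray, sourceSubresource, null, target, mip);
        }
    }
}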

Indeed, the texture loading function in Stride causes a lot of memory allocations. I’ve added a workaround/overload to VL.Stride which I used in a patch that I’ll upload here as soon as it is available. I had good results: I was able to play back a BMP image stack at 60 fps and memory was completely stable.

Here we go. Inside you’ll find a little demo patch showing how to load (and unload) a set of textures. I tested it with 1280p *.bmp files I had generated with ffmpeg. It should also allow you to do what you need, having 25 out of 3000 loaded at a time. You’ll need at least version 2021.3.0-21 for it to run.
TextureLoader.vl (57.2 KB)


@Elias this is pretty impressive and should definitely go into the core lib / VL.Stride.
I tried with 4096x2160 DDS/BC7 files (~8 MB on disk) and reached a stable 120 FPS on a three-year-old, really small form factor desktop PC.
Specs:
Intel i7-8700 @ 3.20 GHz
NVIDIA GeForce GTX 1070 Ti / 8 GB VRAM
Samsung 960 Pro NVMe SSD (1 TB)


Once I went to higher FPS and hit a bottleneck (I did not figure out what exactly), the textures didn’t get properly disposed anymore, VRAM was filling up, and eventually vvvv crashed.


Can we merge that back and improve the Stride texture loading itself? That seems to be the root cause of this post. Having our own solution for fundamental functionality of the main library doesn’t seem like the best overall approach to me, and it would benefit the Stride community as well.

@bjoern Thanks! Regarding the crash when pushing the FPS too high: the culprit could be the quickly hacked-together AsyncTexture node in there. It could be that the texture never arrives on the main thread and therefore won’t get disposed. Good to know that a potential future node provided by VL.Stride will need to deal with that case.
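For illustration, the pattern such a node needs could look like this (a hypothetical sketch, not the code in the patch): whoever abandons the async load is still responsible for disposing the texture once it eventually arrives.

using System;
using System.Threading;
using System.Threading.Tasks;
using Stride.Graphics;

public sealed class AsyncTextureLoad : IDisposable
{
    private readonly CancellationTokenSource cts = new CancellationTokenSource();
    private readonly Task<Texture> task;

    public AsyncTextureLoad(Func<CancellationToken, Texture> load)
    {
        task = Task.Run(() => load(cts.Token), cts.Token);
    }

    // Polled from the main thread; returns null while still loading.
    public Texture TryGetResult()
        => task.Status == TaskStatus.RanToCompletion ? task.Result : null;

    public void Dispose()
    {
        cts.Cancel();
        // Even if the consumer gives up before the load finishes, make sure
        // the texture gets disposed when it eventually arrives.
        task.ContinueWith(t =>
        {
            if (t.Status == TaskStatus.RanToCompletion)
                t.Result?.Dispose();
        }, TaskScheduler.Default);
    }
}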

@tebjan Indeed, we should provide a fix for Stride itself. I first tried adding an overload there taking ReadOnlyMemory<byte>, so the memory needed here (https://github.com/stride3d/stride/blob/9219cbbb9de7a238060d97cf1cfbbf759120aa23/sources/engine/Stride/Graphics/Image.cs#L541) could be allocated from a pool (using ArrayPool from System.Memory). However, that reference caused the C++ projects in the Stride solution to complain and I had no idea what to do. So I went a different route and added an overload to our project that takes a string instead of a Stream and uses the unmanaged heap to allocate the memory for the image. That solution should be easy to backport as it doesn’t need System.Memory.
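Roughly, that string-based overload amounts to the following sketch (my reconstruction of the idea, not the actual VL.Stride code; it assumes Stride’s Image.Load(IntPtr, int, bool) and Texture.New(device, image, ...) overloads):

using System;
using System.IO;
using System.Runtime.InteropServices;
using Stride.Graphics;

public static class UnmanagedTextureLoad
{
    // Reads the file into a buffer on the unmanaged heap, so the large
    // temporary byte[] of Image.Load(Stream) never touches the managed heap.
    public static unsafe Texture LoadTexture(GraphicsDevice device, string path)
    {
        var size = checked((int)new FileInfo(path).Length);
        var buffer = Marshal.AllocHGlobal(size);
        try
        {
            // Copy the file contents directly into the unmanaged buffer.
            using (var target = new UnmanagedMemoryStream((byte*)buffer, 0, size, FileAccess.Write))
            using (var file = File.OpenRead(path))
                file.CopyTo(target);

            // makeACopy: false wraps our buffer in place; Image.Load does not
            // take ownership of the pointer, so we free it ourselves below.
            using (var image = Image.Load(buffer, size, makeACopy: false))
                return Texture.New(device, image, TextureFlags.ShaderResource, GraphicsResourceUsage.Immutable);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}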

@Elias WOW, again amazing stuff, THANK YOU. I had the same issue bjoern described: when it’s too fast or when there are too many textures (the white flash of unloaded textures is a good marker), memory goes up again. But I can manage this by delaying the loading of the different stacks, so I can use this in my case so far. After some hours memory went up just a little bit, but far less than before the “AsyncTexture”.

@Elias I made a comparison with woei’s player. It can go up to 180 FPS without hiccups.
Maybe take a look at the way he does the texture loading / creation?


Thank you for this thorough benchmarking, Björn!


The major difference I can spot is that the node from @woei uses the D3DX11CreateTextureFromFile function (D3DX11.h, see Microsoft Docs) to do the actual DDS load, while in Stride this is done by these lines: https://github.com/stride3d/stride/blob/9219cbbb9de7a238060d97cf1cfbbf759120aa23/sources/engine/Stride/Graphics/DDSHelper.cs#L988

As you can see, the original function has been deprecated by Microsoft and is also no longer available in the SharpDX bindings (the DX11 pack uses the SlimDX bindings, where this function still exists). So doing a quick comparison by moving over to that function doesn’t seem to be straightforward: we’d need to pipe the COM pointers from Stride → SharpDX → SlimDX, call the function, and then go all the way back.

In any case, I think I tackled the disposal topic mentioned earlier, so memory should be stable even when going over the hardware limits. Here’s the updated patch:
TextureLoader.vl (71.2 KB)


Like so?

It seems to work and isn’t leaking, but looks a bit hacky. I couldn’t compare performance yet because I currently don’t have access to the desktop and my laptop isn’t up to the task.
I tried to use the VVVV.SlimDX NuGet but had to reference the DLL manually because Gamma didn’t pick it up.

TextureLoader_SlimDX.vl (156.7 KB)

Hehe, yes, like so. Regarding sRGB: https://www.gamedev.net/forums/topic/666668-shaderresourceview-srgb-format-having-no-effect-on-sampler-reads/ - there seem to be known issues with that function. Maybe that’s a reason why it was deprecated? But yeah, let’s wait until you can do some tests.

TextureLoader_SlimDX.vl (130.7 KB)


So some more testing.
I tried to make it as comparable as possible: I started the patches and let them sit for a minute, then started the playback, let it run for 1:30 min, and then made the screenshots. The patches without the “benchmarking” seem to run a bit smoother.

First of all, the SlimDX version I added performs abysmally; it reaches at most 12 FPS :) Comparing the CPU load to the other approaches, I assume some of the copying back and forth happens on the CPU. There is also some error at startup.

Beta, using woei’s player: again 180 FPS, but this time I had some dropouts here and there.

Gamma, using @Elias’ improved version: I also got to 180 FPS, with some dropouts. After I made the screenshots I realized I had set GraphicsResourceUsage to Default, but trying Immutable yielded the same results. So this seems to be the way to go.

Here again the patches with the settings I used in case someone else wants to give it a go:
Player.DX11Texture.7z (15.4 KB)
TextureLoader.vl (137.0 KB)
