I need to feed a live video stream from an IP camera to the DetectPedestrian CV.image node for an art installation. The video is captured from an RTMP stream using FileStreamVlc2 (with DX11 texture output).
The detection works fine when routed directly from FileImage; however, if the source is a converted DX11 texture, the detection fails.
I’ve suspected the format difference RGB8 vs RGBA8 to be the cause, but looking at the code, it seems it should be converted implicitly:
Changing texture formats does not help as the AsImage node can never output RGB8 (it only performs memory copy and no conversion from what I see). I wanted to try and make an explicit conversion node but all my CV.Image compilation attempts end up with most nodes missing (despite VVVV.CV.Core/Nodes/DX11 compiling fine).
Does the implicit conversion not work and is there a way to solve it? Or could it be something else causing this issue?
compiling the whole imagepack needs some dependencies in the right folders. try with this one: all the necessary things are there, but the dx11 dlls are quite outdated and should be replaced with the latest ones
the folder structure should look like this:
i always compile without FlyCapture, OpenNI, OptiTrack and VideoInput (unloaded in solution explorer)
Hmm, I am still not clear about what Image/Channel format you are trying to feed into VL.OpenCV. Also, what options are you using on the ConvertColor node? Please elaborate more.
Next, what exactly do you mean here: “The detection works fine when routed directly from FileImage, however if the source is a converted DX11 texture, the detection fails.”? Which format is it in when routed directly and what changes when you “convert to DX11”?
As for the sluggishness, yes, VL.OpenCV is running on your CPU, not the GPU, so performance will not be as high, but you should use AsyncLoop regions around the Detectors, combined with Trackers (which are less heavy), to improve performance. Your current setup is going to kill your application. An example on how to go about this (more or less) can be found in \VL.OpenCV\demos\08_FaceSwap.
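To illustrate the idea outside of a patch, here is a minimal Python sketch of the "slow detector in the background, main loop never waits" pattern. It is not how AsyncLoop is implemented; `AsyncDetector` and the `detect` callable are hypothetical names made up for this example, and the tracker that would refine the boxes between detections is left out.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncDetector:
    """Runs a slow detector on a worker thread (hypothetical helper).

    update() is meant to be called once per frame. It never waits for the
    detector; it returns the most recent finished result instead.
    """

    def __init__(self, detect):
        self._detect = detect                      # slow function: frame -> list of boxes
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None                       # detection currently in flight, if any
        self._latest = []                          # last finished result

    def update(self, frame):
        # Collect the result of a finished detection, if there is one.
        if self._pending is not None and self._pending.done():
            self._latest = self._pending.result()
            self._pending = None
        # Start a new detection only when the worker is idle, so frames
        # that arrive while it is busy are simply skipped.
        if self._pending is None:
            self._pending = self._pool.submit(self._detect, frame)
        return self._latest

    def close(self):
        self._pool.shutdown(wait=True)
```

The render loop calls `update(frame)` every frame and draws whatever it returns. The result lags behind by however long one detection takes, which is the gap a lightweight tracker is supposed to cover.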
@ravazquez I meant ImageReader (FileImage is the CV.Image pack equivalent, was writing it a bit late)
Anyways the point is I’m trying to run the detection on a video source VL.OpenCV doesn’t support on its own (I believe) - RTMP stream from an IP camera. I am able to get this source with a node that produces a DX11 Texture. However the detection nodes do not work when I feed them CV Images converted from DX11 textures (DX11Texture2D R8G8B8A8_UNorm format > AsImage > VL > FromImage).
I attempted using ConvertColor RGBA->RGB and BGRA->BGR to get an image with the same properties as the working one from ImageReader, to no avail.
Well, as you might know, OpenCV is usually BGR or BGRA if 4 channels are used. Have you tried ConvertColor from RGBA to BGR or from RGBA to BGRA?
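For reference, this is roughly what an RGBA to BGR conversion does to the raw pixel data, as a plain Python sketch. It assumes tightly packed 8-bit RGBA with no row padding, and `rgba_to_bgr` is a name made up for this example, not the ConvertColor implementation. In OpenCV's own Python bindings the equivalent call would be `cv2.cvtColor(img, cv2.COLOR_RGBA2BGR)`.

```python
def rgba_to_bgr(pixels: bytes) -> bytes:
    """Reorder tightly packed RGBA8 pixels into BGR8, dropping alpha."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel (RGBA8)")
    out = bytearray(len(pixels) // 4 * 3)
    out[0::3] = pixels[2::4]  # blue comes first in BGR
    out[1::3] = pixels[1::4]  # green stays in the middle
    out[2::3] = pixels[0::4]  # red goes last
    return bytes(out)         # the alpha bytes (pixels[3::4]) are not copied
```

Note that the buffer shrinks from 4 to 3 bytes per pixel, so a plain memory copy of an R8G8B8A8 texture can never end up as a 3 channel image without a step like this.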
Also, have you tried to display the image you are passing in with a VL.OpenCV Renderer? Does it display properly or are the channels mixed?
In your screenshot it seems as though it is working for the ImageReader but not for the FromImage, maybe compare the output of both using the Info node to get a better understanding of what is going on?
Lastly, if you hover over the PedestrianDetector node you will see the remarks indicating the provided image has to be 1 or 3 channels. Maybe that is the source of your issue. Note that 1 channel images will be processed faster.
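If you go the 1 channel route, the conversion boils down to something like the following Python sketch. It assumes tightly packed 8-bit RGBA input and uses the common 0.299/0.587/0.114 luma weights in integer form; `rgba_to_gray` is a hypothetical helper for illustration, not a VL.OpenCV node.

```python
def rgba_to_gray(pixels: bytes) -> bytes:
    """Convert tightly packed RGBA8 pixels to 8-bit grayscale (1 channel)."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel (RGBA8)")
    out = bytearray(len(pixels) // 4)
    for i in range(len(out)):
        r, g, b = pixels[4 * i], pixels[4 * i + 1], pixels[4 * i + 2]
        # Weighted sum of the colour channels, rounded to the nearest integer.
        # Alpha (pixels[4 * i + 3]) is ignored.
        out[i] = (299 * r + 587 * g + 114 * b + 500) // 1000
    return bytes(out)
```

The output has one byte per pixel, which satisfies the "1 or 3 channels" remark and gives the detector a quarter of the data to look at.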