CV.image, Pedestrian Detection with DX texture source


#1

Hi,

I need to feed a live video stream from an IP camera to the DetectPedestrian CV.image node for an art installation. The video is captured from an RTMP stream using FileStreamVlc2 (with DX11 texture output).

The detection works fine when routed directly from FileImage; however, if the source is a converted DX11 texture, the detection fails.

I suspected the format difference (RGB8 vs RGBA8) to be the cause, but looking at the code, it seems it should be converted implicitly:

DetectPedestrian.cs:127
FInput.Image.GetImage(TColorFormat.RGB8, FBgrImage);

Changing texture formats does not help, as the AsImage node can never output RGB8 (it only performs a memory copy and no conversion, from what I can see). I wanted to try making an explicit conversion node, but all my CV.Image compilation attempts end up with most nodes missing (despite VVVV.CV.Core/Nodes/DX11 compiling fine).

Does the implicit conversion not work and is there a way to solve it? Or could it be something else causing this issue?

Any help will be greatly appreciated, thanks!


#2

here’s that conversion that seems to not work for you: https://github.com/sebllll/VVVV.Packs.Image/blob/b35/src/nodes/plugins/Image/OpenCV/src/Tracking/DetectPedestrian.cs#L131

maybe you can change that RGB8 to RGBA8?

compiling the whole imagepack needs some dependencies in the right folders. try with this one: it has all the necessary things, but the dx11 dlls are quite outdated and should be replaced with the latest ones.

the folder structure should look like this: [image]

i always compile without FlyCapture, OpenNI, OptiTrack and VideoInput (unloaded in solution explorer)

this should be the newest branch: https://github.com/sebllll/VVVV.Packs.Image/tree/newNodes


Another way (if compilation is not an option) would be to use the new vl.opencv thing, which has the pedestrian-tracker already implemented, but is running only on the cpu and is therefore slower.


#3

Thanks for the quick reply!
Everything compiles fine (except for Ximea) but unfortunately I’m still having issues, even with the Dependencies inserted.
These are all the nodes I am getting:

Screenshot_28

Just a side note: to compile without issues I had to install the older 2.4 version of EmguCV, using the VDK.EmguCV.x64 NuGet package.


#4

VL.OpenCV seems to be plagued by the same issue (even when using the available ConvertColor node to reduce the number of channels) and yes, it is quite sluggish, even for a ~500x400 image.


#5

Hmm, I am still not clear about what Image/Channel format you are trying to feed into VL.OpenCV. Also, what options are you using on the ConvertColor node? Please elaborate more.

Next, what exactly do you mean here: “The detection works fine when routed directly from FileImage, however if the source is a converted DX11 texture, the detection fails.”? Which format is it in when routed directly and what changes when you “convert to DX11”?

As for the sluggishness: yes, VL.OpenCV runs on your CPU, not your GPU, so performance will not be as high. You should wrap the Detectors in AsyncLoop regions and combine them with Trackers, which are less heavy, to improve performance. Your current setup is going to kill your application. An example of how to go about this (more or less) can be found in \VL.OpenCV\demos\08_FaceSwap.
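To make the idea concrete outside of a patch, here is a minimal Python sketch of the pattern only. It is not the AsyncLoop region or any VL.OpenCV API. `slow_detect` and `AsyncDetector` are hypothetical names made up for illustration: the heavy detector runs on a background worker, and the main loop keeps going with the last known boxes (which a cheaper tracker could refine in between).

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_detect(frame):
    """Hypothetical stand-in for an expensive detector."""
    time.sleep(0.05)  # simulate heavy work
    # Return one fake box tagged with the frame it came from: (id, x, y, w, h)
    return [(frame["id"], 10, 10, 40, 80)]


class AsyncDetector:
    """Runs at most one detection at a time and never blocks the caller."""

    def __init__(self, detect):
        self._detect = detect
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None  # future of the detection in flight, if any
        self._latest = []     # most recent finished result

    def update(self, frame):
        # Pick up a finished result, if there is one.
        if self._pending is not None and self._pending.done():
            self._latest = self._pending.result()
            self._pending = None
        # Start a new detection only when the worker is idle;
        # frames that arrive while it is busy are simply skipped.
        if self._pending is None:
            self._pending = self._pool.submit(self._detect, frame)
        return self._latest

    def close(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    detector = AsyncDetector(slow_detect)
    for i in range(20):
        boxes = detector.update({"id": i})  # returns immediately
        # A lightweight tracker would update `boxes` here on every frame.
        time.sleep(0.01)                    # simulated frame interval
    detector.close()
```

The main loop never waits for the detector, so the frame rate stays stable; the cost is that the boxes lag behind by however long one detection takes.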

Hope that helps.


#6

Also, for more on AsyncLoop examples, check out this video around minute 57:


#7

@ravazquez I meant ImageReader (FileImage is the CV.Image pack equivalent, was writing it a bit late)

Anyway, the point is that I’m trying to run the detection on a video source that VL.OpenCV doesn’t support on its own (I believe): an RTMP stream from an IP camera. I am able to get this source with a node that produces a DX11 texture. However, the detection nodes do not work when I feed them CV Images converted from DX11 textures (DX11Texture2D R8G8B8A8_UNorm format > AsImage > VL > FromImage).

I attempted using ConvertColor RGBA->RGB and BGRA->BGR to get an image with the same properties as the working one from ImageReader, to no avail.

Thanks for the optimization tips! Running the detection asynchronously and interleaving it with trackers makes a lot of sense; I will look into it.


#8

Well, as you might know, images in OpenCV are usually BGR, or BGRA if 4 channels are used. Have you tried ConvertColor from RGBA to BGR or from RGBA to BGRA?

Also, have you tried to display the image you are passing in with a VL.OpenCV Renderer? Does it display properly or are the channels mixed?

In your screenshot it seems as though it is working for the ImageReader but not for the FromImage, maybe compare the output of both using the Info node to get a better understanding of what is going on?

Lastly, if you hover over the PedestrianDetector node you will see the remarks indicating the provided image has to be 1 or 3 channels. Maybe that is the source of your issue. Note that 1 channel images will be processed faster.
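For reference, this is roughly what an RGBA to BGR conversion has to do to the raw bytes. It is a plain Python sketch, not the ConvertColor implementation. The function names are hypothetical, and it assumes tightly packed 8-bit pixels with no row padding, which may not hold for a buffer copied straight from a texture.

```python
def channel_count(pixels: bytes, width: int, height: int) -> int:
    """Infer the number of 8-bit channels from the buffer size (assumes no row padding)."""
    total = width * height
    if total == 0 or len(pixels) % total != 0:
        raise ValueError("buffer size does not match the image dimensions")
    return len(pixels) // total


def rgba_to_bgr(pixels: bytes) -> bytes:
    """Drop the alpha channel and swap red and blue."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(len(pixels) // 4 * 3)
    out[0::3] = pixels[2::4]  # blue
    out[1::3] = pixels[1::4]  # green
    out[2::3] = pixels[0::4]  # red
    return bytes(out)


if __name__ == "__main__":
    # Two pixels: opaque red, half-transparent blue
    rgba = bytes([255, 0, 0, 255, 0, 0, 255, 128])
    print(channel_count(rgba, 2, 1))  # 4 channels going in
    bgr = rgba_to_bgr(rgba)
    print(channel_count(bgr, 2, 1))   # 3 channels coming out
```

If the Info node reports 4 channels on the FromImage path and 3 on the ImageReader path, that difference alone would explain why only one of them is accepted by the detector.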


#9

Quick note: to get a 1-channel image out of the ConvertColor node, use the GRAY enum item.
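In case it helps to see what a 3-channel to 1-channel conversion amounts to, here is a small Python sketch using the common 0.299 R + 0.587 G + 0.114 B luma weights. This is an illustration with a hypothetical function name, not the code behind the GRAY enum item, and rounding may differ slightly from what the library produces.

```python
def bgr_to_gray(pixels: bytes) -> bytes:
    """Convert tightly packed 8-bit BGR pixels to single-channel luminance."""
    if len(pixels) % 3 != 0:
        raise ValueError("expected 3 bytes per pixel")
    blue, green, red = pixels[0::3], pixels[1::3], pixels[2::3]
    # Integer arithmetic with rounding: weights are scaled by 1000.
    return bytes(
        (299 * r + 587 * g + 114 * b + 500) // 1000
        for b, g, r in zip(blue, green, red)
    )


if __name__ == "__main__":
    # Three pixels: white, black, pure blue
    bgr = bytes([255, 255, 255, 0, 0, 0, 255, 0, 0])
    print(list(bgr_to_gray(bgr)))
```

The output buffer is a third of the size of the input, which is one reason the 1-channel path is cheaper for the detector.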