I’ve done something similar by using Rulr from Elliot Woods and editing a dx11pointcloud node. First use Rulr to get the camera’s intrinsics/extrinsics relative to the Kinect2. Then use this patch to: project the camera texture onto the pointcloud, group the 3D points that correspond to dots on the “detection texture”, and take the center of each group. You can adjust the dot size depending on the camera’s angle/resolution.
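The group-and-center step can be sketched like this (a minimal illustration, not the patch itself; `world_points` and `dot_ids` are hypothetical names for the Kinect point cloud and the per-point dot index you'd get from the projected detection texture):

```python
import numpy as np

def group_centers(world_points, dot_ids):
    """Average the 3D points that fall under each detected dot.

    world_points: (N, 3) array of Kinect point-cloud positions.
    dot_ids: (N,) array giving, for each point, the index of the
             detection-texture dot it projects into (-1 = no dot).
    Returns {dot_index: 3D center}.
    """
    centers = {}
    for dot in np.unique(dot_ids):
        if dot < 0:
            continue  # point doesn't land on any dot
        group = world_points[dot_ids == dot]
        centers[int(dot)] = group.mean(axis=0)  # center of the group
    return centers
```

On the GPU side the same thing would typically be done in a compute shader, but the idea is identical: one centroid per dot.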
kinect_detection_from_external_camera.zip (775.3 KB) (patch untested with a real setup)
Points of the same group are sometimes too far from each other, like in the image, but I guess that’s less likely to happen with an actually calibrated camera. Depending on the tracked object’s size and shape and the camera’s fps, you may have to add some filtering.
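For the filtering, a simple per-object exponential moving average on the group centers is often enough to damp the jitter (just a sketch; the `alpha` value is made up and should be tuned to your camera’s fps and how fast the object moves):

```python
import numpy as np

def smooth(prev_center, new_center, alpha=0.3):
    """Exponential moving average of a tracked center.

    prev_center: previous smoothed position (None on the first frame).
    new_center:  center measured this frame.
    alpha:       0..1, higher = more responsive, lower = smoother.
    """
    if prev_center is None:
        return new_center  # nothing to blend with yet
    return alpha * np.asarray(new_center) + (1 - alpha) * np.asarray(prev_center)
```

You could also reject outlier points before averaging (e.g. drop points farther than some radius from the group median) if stray points keep pulling the center off.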
AFAIK there’s no way to do it with only a single pixel per tracked object, because you can’t predict whether a given pixel of the camera texture will actually land on the pointcloud, since the camera and the Kinect aren’t aligned. Good luck!