VL.PythonNET and AI worflows like StreamDiffusion in vvvv gamma

tonfilm · April 26, 2024, 11:29am

Hello vvvv community,

I’ve been working on integrating Python into vvvv to leverage the explosion of AI and GenAI projects out there. These mind-blowing Python-based GitHub repositories are popping up day by day. It’s about transforming these repositories from neat, standalone proofs of concept into interactive, usable components within the vvvv ecosystem.

VL.PythonNET

The heart of this development is embedding the Python runtime within a vvvv process, allowing for direct interaction with Python code. This enables the use of libraries, such as PyTorch, TensorFlow, HuggingFace transformers, as well as the the usual suspects like NumPy and Pandas, natively in the vvvv environment.

StreamDiffusion

As a first example, I’ve applied this to StreamDiffusion. After a lot of optimization work, we now have what I believe to be the fastest implementation available. Additionally, by achieving direct texture input and output, latency is reduced further as the data never leaves the GPU, creating a truly interactive experience.

Current Status and Early Access

This isn’t quite ready for prime time; the setup for StreamDiffusion with Cuda and TensorRT acceleration is complex, and I want to improve on that. But I’ve started a super early access program for those who can contribute to its development. A donation to support this project will get you early access, my support in setting it up, and a mention on the forthcoming project website.

If you’re interested in getting ahead of the curve and are in a position to support this project, drop me a line at my forename at gmail dot com or:

Element Chat
Instagram (some more videos there)
LinkedIn
Twitter
Facebook

Live Demo

Introduction and demo at the 24th vvvv worldwide meetup:

Outlook and further possibilities

The horizon for this integration is vast and with more development time, this can get really big.

ComfyUI

One particularly exciting potential is to integrate ComfyUI, enabling the auto-import of ComfyUI workflows. As well as potentially being able to use ComfyUI nodes as vvvv nodes seamlessly. While ComfyUI is not geared towards real-time, it is a flexible and powerful GenAI toolkit.

Large Language Models

Already in the works, incorporating local LLMs, like the new LLaMA3 or Mistral to integrate text or code generators.

Music and Audio Generation

Lately there are better and better music generation models and they could be used to generate endless music streams that are interactively influenced.

Training and Fine-Tuning Models

While more complex than just running a model, it opens the door for real-time live training for interactive projects that could learn over time.

Usability

Exploring multithreading and running Python in a background thread could improve the experience and will make it possible to run vvvv for visuals in a different framerate.

Also, vvvv’s node factory feature could be used to automatically import Python scripts or libraries and build a node set for it. For example, get the complete PyTorch library as nodes for high-performance data manipulation on the GPU.

Licensing

Currently, I do not intend to offer it for free or as open-source. The library will be available under a commercial license. However, an affordable hobbyist/personal use license will be available in a few months.

That’s it for now, I’ll update here if something new happens. If you have any questions or ideas, add them here.

yar · April 26, 2024, 3:02pm

Guys, I will never tire of saying that this is fascinatingly awesome.
This is one of the best things to happen to VVVV in years.

But I have a big request. Although I find neural networks and especially ComfyUI interesting and community is interested in them, can you specifically beta test VL.PythonNET? I have some experimental Python scripts that I cannot reproduce in VVVV and I would like to try to run them.

tonfilm · April 26, 2024, 4:23pm

Yes, of course, VL.PythonNET is the core of this development. You do not need to use any neural network, you can just run any Python code, as long as you create a venv with the right dependencies or your Python installation or the machine has everything installed to run the script.

yar · April 26, 2024, 4:25pm

To be clear: VL.PythonNET in early access or will it be publicly available for beta testing?

tonfilm · April 26, 2024, 4:34pm

No, currently, I do not intend to offer it for free or as open-source. The library will be available under a commercial license. However, an affordable hobbyist/personal use license will be available in a few months.

EDIT: I’ve added a licensing section in the text above.

yar · April 26, 2024, 4:37pm

@tonfilm Thanks!

tonfilm · April 28, 2024, 6:21pm

Local inference for llama3 with 8B parameters. Using llamacpp-python with cuda backed and a quantized gguf model returns an answer in 2-3 seconds.

m4d · April 28, 2024, 7:16pm

This is massive!

nissidis · April 28, 2024, 8:25pm

I can officially confirm that this is a game changer and an exceptional addition to vvvv armada.

@tonfilm if there is any way I can help you with or if you need me to provide you content feel free to ask!

Thanks again for all the hard work!

<3

schlonzo · April 29, 2024, 8:32am

awsome! really looking forward to this one :)

tonfilm · May 5, 2024, 1:25pm

StreamDiffusion can now use all sd21 control nets with the sd-turbo model, including TensorRT acceleration:

As ControlNet is another network that needs to be evaluated, the performance impact is about 40%, it went from 45fps to about 25fps on my laptop 4090 GPU. A desktop 4090 GPU could reach 40-60fps.

schlonzo · May 13, 2024, 3:00pm

will this also support ip adapter?

tonfilm · May 13, 2024, 5:56pm

Yes, IP-Adapter should be faster than ControlNet but so far, no one has integrated it into StreamDiffusion. When I have time to work on the project again, it would be one of the first things to look into. But the most important one would be to get SDXL running with TensorRT in StreamDiffusion.

manuel · May 24, 2024, 2:51pm

Have you seen this ?
maybe usefull to simplify the setup/installation ?

tonfilm · May 24, 2024, 3:19pm

Yes, I have looked at it, it is nice to get the app set up, but it would need to have some modification to work with vvvv. I am already on their discord and will talk with the developers to evaluate options when I have more time.

But the apps on there are mostly not real-time and my interest is mainly in high-performance real-time AI projects. I don’t see a big benefit in having non-realtime standalone apps in vvvv. But if you have use cases that you couldn’t do otherwise, please let me know.

tonfilm · May 27, 2024, 5:05pm

Small update:
As a commissioned work the Wav2Vec audio AI model family have been added and the inference works well in real-time:

In this picture: ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition · Hugging Face

tonfilm · July 16, 2024, 5:07pm

Now also with Emotion2Vec in real-time with live audio input which seems to work better than Wav2Vec:

And AST audio classification in real-time with live audio input:

It uses the MIT/ast-finetuned-audioset-10-10-0.4593 which can detect over 500 audio classes.

domj · July 17, 2024, 5:58pm

This work is absolutely stunning!

As mentioned already, this is bringing gamma to absolutely new heights, tapping into this huge ecosystem.

👏👏👏