VL.Audio FFT

You all know this FFT node, and it has an input called “Buffer Size”, with which one can specify how many frequency bands should be analyzed.


I have been asking myself: why is it an integer, although the node will always just output a spread whose count is a power of 2? Also, it always needs to be one above the output count you want to have - for example, if I want 512 bands, I need to set the buffer size to at least 513. What is the logic behind all of that?

Would it be desirable to have the buffer size available as an enum with options 64, 128… up to, let’s say, 16384, and add 1 to it internally? Or is it necessary to be able to go lower, higher or in between these numbers?

If this makes sense, I could give it a try with a PR, but I would need an example of how to define enums, because IIRC this is only possible in C#, right?

This question is especially for people who have worked with the FFT node in the past. It would be nice to get some feedback about the use cases and whether it makes sense to switch to an enum with pre-defined block counts.

  • For the Fourier Transform, an input of n samples will yield an output of n/2 frequency bands (e.g. 512 samples → 256 frequency bands).
    • The reason is that the algorithm also outputs 256 negative frequencies that mirror the frequencies you are actually interested in. In total you get 512 bands, but put simply they contain the same info twice (sorry, a deeper explanation would need much more space).
  • Some info about the Fast Fourier Transform: this is a nice algorithm that calculates the Fourier Transform very efficiently (hence the ‘fast’ in the name). However, it requires a sample count that is a power of 2.
    • If this is not the case, the node will automagically fill up the samples to the next power of 2, which is why in this example you see the same output spread count of 512 for any input spread count from 513 to 1024.
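
To make the arithmetic above concrete, here is a minimal Python sketch of how a buffer size maps to an output spread count under the two rules just described (fill up to the next power of 2, then keep half). The function names are made up for illustration and this is not the node's actual code.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of 2 that is >= n (hypothetical helper)."""
    if n < 1:
        raise ValueError("sample count must be at least 1")
    return 1 << (n - 1).bit_length()


def output_band_count(buffer_size: int) -> int:
    """Bands you get back: pad to the next power of 2, then keep half."""
    return next_power_of_two(buffer_size) // 2


for size in (512, 513, 1000, 1024):
    print(size, "->", output_band_count(size))
```

Under these assumptions every buffer size from 513 to 1024 lands on the same 512 bands, while exactly 512 gives 256.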

so don’t worry about only getting half the frequencies when specifying a sample count with a power of 2 - that’s exactly how it’s meant to be used.
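
If you want to see the mirroring for yourself, the following small Python sketch uses a plain (slow) DFT rather than the fast algorithm, since the output is the same and the code is shorter. The test signal and all names are assumptions chosen for illustration.

```python
import cmath
import math


def dft(samples):
    """Plain discrete Fourier transform, O(n^2); same result as an FFT."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


n = 16
# Real-valued test signal: a sine in band 3 plus a quieter cosine in band 5.
signal = [
    math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
    for t in range(n)
]
magnitudes = [abs(c) for c in dft(signal)]

# For real input, band k and band n - k always have the same magnitude,
# so the upper half of the output adds no new information.
mirrored = all(
    math.isclose(magnitudes[k], magnitudes[n - k], abs_tol=1e-9)
    for k in range(1, n // 2)
)
print(mirrored)
```

This symmetry only holds for real-valued input such as audio samples, which is the case discussed here.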

but i agree, instead of having an input called “Buffer Size” and having to know that the output spread will only contain half that many values, we could also have an input called “Frequency Bin Count” where you specify the spread count for the output directly. and yes, then let’s have it as an enum, because by definition it can only have power-of-2 values. thoughts?
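
As a rough sketch of that idea (in Python rather than C#, and not the actual implementation), an enum restricted to power-of-2 values could look like this. The name `FrequencyBinCount`, the member names and the 64 to 16384 range are assumptions taken from the proposal above.

```python
from enum import IntEnum


class FrequencyBinCount(IntEnum):
    """Hypothetical enum: each member is the desired output spread count."""
    BINS_64 = 64
    BINS_128 = 128
    BINS_256 = 256
    BINS_512 = 512
    BINS_1024 = 1024
    BINS_2048 = 2048
    BINS_4096 = 4096
    BINS_8192 = 8192
    BINS_16384 = 16384


def required_sample_count(bin_count: FrequencyBinCount) -> int:
    """Samples the FFT needs internally: twice the requested bin count."""
    return 2 * int(bin_count)


print(required_sample_count(FrequencyBinCount.BINS_512))
```

The user picks the output count directly and the doubling happens internally, so nobody has to remember the "half the buffer size" rule.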


Makes sense to me.
I know how FFT works, but this change would probably cause less confusion.

I found the FFT workflow in VVVV a bit confusing in other areas too, since the node outputs an IReadOnlyList while most FFT-consuming nodes want a Spread.
If that isn’t mandatory for some higher reason, maybe a bit of streamlining could make things more straightforward to set up?

Cheers,

Tom

the latest vvvv 6.0 preview now has this reworked as mentioned: the FFT node has an enum input called “Bin Count” (instead of “Buffer Size”), so the value you set there corresponds to the output spread count. In addition, there are two new nodes, PickFFTBinBand and PickFFTFrequencyBand, for quickly accessing FFT values.
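
For anyone wondering how a frequency range relates to bin indices in general, here is a small Python sketch of the usual mapping (bin k sits at k × sample rate / FFT size). It is only an illustration under the assumption that the FFT size is twice the bin count; the function name is made up and this is not how PickFFTFrequencyBand is actually implemented.

```python
import math


def bins_for_frequency_band(low_hz, high_hz, bin_count, sample_rate):
    """Indices of the bins whose center frequency lies in [low_hz, high_hz].

    Hypothetical helper: assumes bin k is centered at
    k * sample_rate / (2 * bin_count).
    """
    bin_width = sample_rate / (2 * bin_count)
    first = max(0, math.ceil(low_hz / bin_width))
    last = min(bin_count - 1, math.floor(high_hz / bin_width))
    return range(first, last + 1)


# 512 bins at 48 kHz: each bin is 46.875 Hz wide.
band = bins_for_frequency_band(100.0, 1000.0, bin_count=512, sample_rate=48000)
print(band.start, band.stop - 1)
```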

note that this is a breaking change, but ideally the advantages of less confusion should outweigh the minor hoops you have to go through, adapting existing patches.
