Smooth transitions between frequency amplitudes over time using FFT analysis

I’ve done a few experiments with sound visualization in Processing using the Sound and Minim libraries, but I always get stuck on the same problem: the amplitude of each frequency varies a lot from one frame to the next, and the result is a rough, jumpy transition that isn’t easy on the eyes.

I’ve been wondering if there’s a better approach to getting smoothed values out of an FFT analysis, so that the transitions aren’t so abrupt.

Sure, one possibility is to read the song file once beforehand just to get the values, smooth them, and then draw the smoothed values while the song plays. But what if I need smoothed values from a real-time analysis?

Following the same logic, maybe what I need is a kind of buffer holding something like a second of the sound, and then the same process: read, smooth the values, and draw. It wouldn’t be truly real time, but maybe there’s a way to fine-tune the buffer size to get smoothed values while keeping an apparent synchrony with the sound.

I’m not too familiar with those libraries, so I’m wondering if they have a proper method to do this or if someone here can suggest a better approach.


Consider a low-pass filter (LPF).


These are random points (yellow) above and below a line that are filtered (red):


This one uses the FFT from the Processing Sound library. I low-pass filtered the amplitudes for each frequency (using concepts from the links provided) and plotted them in real time:

I did buffer the amplitudes for filtering and plotting, and the response still looks real-time… at least visually.

I will see if I can scale the code down to something that helps you get started… I’m much too busy these days, so try it yourself first.

I started with the FFTSpectrum example in the Processing Sound library and went from there.
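
In the meantime, here is a minimal sketch of the per-frequency low-pass idea described above. It is not the code behind the plots in this post, just one common way to do it: a one-pole low-pass (exponential moving average) applied to each FFT bin separately. The class name `SpectrumSmoother` and the `factor` parameter are made up for illustration.

```java
/**
 * One-pole low-pass (exponential moving average) applied independently
 * to every FFT bin. Hypothetical helper, not part of any library.
 */
public class SpectrumSmoother {
    private final float[] smoothed; // running smoothed amplitude per bin
    private final float factor;     // in (0, 1]: smaller = smoother but slower to react

    public SpectrumSmoother(int bands, float factor) {
        if (bands <= 0) {
            throw new IllegalArgumentException("bands must be positive");
        }
        if (factor <= 0f || factor > 1f) {
            throw new IllegalArgumentException("factor must be in (0, 1]");
        }
        this.smoothed = new float[bands];
        this.factor = factor;
    }

    /**
     * Feed one frame of raw amplitudes and get the smoothed amplitudes back.
     * The returned array is the internal state, so copy it if you need to keep it.
     */
    public float[] update(float[] spectrum) {
        if (spectrum.length != smoothed.length) {
            throw new IllegalArgumentException("spectrum length must match bands");
        }
        for (int i = 0; i < smoothed.length; i++) {
            // Move a fraction of the way from the old value toward the new one.
            smoothed[i] += (spectrum[i] - smoothed[i]) * factor;
        }
        return smoothed;
    }
}
```

In a Processing sketch you would create one `SpectrumSmoother` in `setup()` and call `update()` once per frame in `draw()` with the array the FFT analysis fills (I am assuming something like `fft.analyze(spectrum)` from the Sound library here), then draw the returned values instead of the raw ones. This works in real time because it only needs the previous frame, not a long buffer. A `factor` around 0.1 to 0.3 is a reasonable place to start experimenting.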


Hi @werls. Welcome to the Processing community. FFT is not an easy data stream to visualize because it is contingent on the window size. The smaller the window size, the faster the analysis, but the lower the frequency resolution. Larger window sizes are the opposite: slower but more accurate.
If you need to use FFT, for example to resynthesize the sound, then you can take the magnitude of the real and imaginary parts, sqrt(real^2 + imag^2). This gives you a larger number for each bin, based on the window size. That will only work, however, if your audio library provides both the real and imaginary parts of the FFT.
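
As a rough illustration of that magnitude formula, assuming your library hands you the real and imaginary parts as two arrays of equal length (the class and method names here are hypothetical):

```java
/** Hypothetical helper: per-bin magnitude from separate real and imaginary arrays. */
public class FftMagnitude {
    public static double[] magnitudes(double[] real, double[] imag) {
        if (real.length != imag.length) {
            throw new IllegalArgumentException("real and imag must have the same length");
        }
        double[] mags = new double[real.length];
        for (int i = 0; i < real.length; i++) {
            // Math.hypot computes sqrt(real^2 + imag^2) without intermediate overflow.
            mags[i] = Math.hypot(real[i], imag[i]);
        }
        return mags;
    }
}
```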
But if you are not interested in resynthesizing the sound, just displaying the analysis, I would recommend a different approach. If you use a series of bandpass filters across a defined spectrum (50-5000Hz for example), then you can analyze just the amplitude of the output of each filter, which can be smoothed much more easily and works better for visualization. This would give you only the frequency information you want, within a numeric range that is more predictable than what FFT will provide. You can customize the frequencies and overlap of each filter and apply whatever smoothing algorithm you want. I do something similar in a vocoder example using my own library here: Pd4P3/examples/Filters/VocoderDemo at main · robertesler/Pd4P3 · GitHub.
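
To make the filter-bank idea a bit more concrete, here is a minimal sketch in plain Java. It is not how Pd4P3 or the vocoder example does it. It assumes a standard second-order (biquad) band-pass per band, with coefficients from the widely used audio EQ cookbook formulas, followed by a simple envelope follower that does the smoothing. The class name `BandpassBank` and all parameter names are hypothetical.

```java
/**
 * A small bank of biquad band-pass filters, each followed by an envelope
 * follower (rectify, then one-pole low-pass). Illustrative only.
 */
public class BandpassBank {
    private final double[] b0, a1, a2;     // normalized coefficients (b1 = 0, b2 = -b0)
    private final double[] x1, x2, y1, y2; // previous inputs and outputs per band
    private final double[] envelope;       // smoothed amplitude per band
    private final double envCoeff;         // in (0, 1]: smaller = smoother envelope

    public BandpassBank(double[] centerHz, double q, double sampleRate, double envCoeff) {
        int n = centerHz.length;
        b0 = new double[n]; a1 = new double[n]; a2 = new double[n];
        x1 = new double[n]; x2 = new double[n];
        y1 = new double[n]; y2 = new double[n];
        envelope = new double[n];
        this.envCoeff = envCoeff;
        for (int i = 0; i < n; i++) {
            double w0 = 2.0 * Math.PI * centerHz[i] / sampleRate;
            double alpha = Math.sin(w0) / (2.0 * q);
            double a0 = 1.0 + alpha;
            b0[i] = alpha / a0;                 // unity gain at the center frequency
            a1[i] = -2.0 * Math.cos(w0) / a0;
            a2[i] = (1.0 - alpha) / a0;
        }
    }

    /**
     * Push one audio sample through every band and return the envelopes.
     * The returned array is the internal state, so copy it if you need to keep it.
     */
    public double[] process(double sample) {
        for (int i = 0; i < envelope.length; i++) {
            // Direct form I: y[n] = b0*(x[n] - x[n-2]) - a1*y[n-1] - a2*y[n-2]
            double y = b0[i] * (sample - x2[i]) - a1[i] * y1[i] - a2[i] * y2[i];
            x2[i] = x1[i];
            x1[i] = sample;
            y2[i] = y1[i];
            y1[i] = y;
            // Envelope follower: low-pass the rectified band output.
            envelope[i] += (Math.abs(y) - envelope[i]) * envCoeff;
        }
        return envelope;
    }
}
```

You would feed this raw audio samples rather than FFT frames, then read the envelopes once per drawn frame. The center frequencies, `q` (which controls how much neighboring bands overlap), and `envCoeff` are all things to tune by eye for your visualization.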