Hello, I am attempting to get audio/video sync working for exporting a music visualization. I'm working with the withAudioViz.pde demo script from the VideoExport library.
I am able to generate the analysis .txt file and get the synced example working. However, I am now attempting to integrate it with the ProcessingCubes visualizer and am having some issues.
The crux of my issue is that the VideoExport example buckets the FFT output from the AudioSample, while the visualization script reads the fft.getBand() data directly. Is it possible to do something similar to fft.forward() with an AudioSample? It doesn't appear to have this capability. Sorry if this is a bit hard to follow, happy to give additional information or follow up. Thank you!
Sure, I will try to write some notes on the current state. Mostly I've been using print statements to try to find parity between the bucketing procedure done in withAudioViz.pde and cubes.pde.
For cubes:
```java
Minim minim;
AudioPlayer song;
FFT fft;

minim = new Minim(this);
song = minim.loadFile(song_path, 1024);               // 1024-sample buffer
fft = new FFT(song.bufferSize(), song.sampleRate());  // FFT sized to match the player
song.play(0);
```
Then in the draw() function the following is called:
```java
// Analyze the song's current mix buffer (the AudioPlayer advances the buffer as it plays)
fft.forward(song.mix);
```
fft.getBand() calls are then used to extract the magnitudes of the different frequency bands. Our buffer size for the AudioPlayer is 1024, so there are 513 frequency bands (specSize() = 1024/2 + 1).
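In draw() that looks roughly like this (my paraphrase, not the exact ProcessingCubes source):

```java
void draw() {
  // analyze whatever the player is currently playing
  fft.forward(song.mix);
  for (int i = 0; i < fft.specSize(); i++) {   // specSize() == 1024/2 + 1 == 513
    float magnitude = fft.getBand(i);
    // ... drive the cube heights / colors from magnitude ...
  }
}
```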
For withAudioViz:
```java
// Load an AudioSample of the track with the same buffer size as in cubes
AudioSample track = minim.loadSample(fileName, 1024);
// Create a new FFT object with the track's parameters
fft = new FFT(track.bufferSize(), track.sampleRate());
```
At this point I'm a bit stuck because there's no equivalent .forward() method for AudioSample. I need to figure out how to move the analysis window so I can do the cubes-style processing on the AudioSample's FFT data.
I'll tag @hamoid to say this is an awesome library and the example is super helpful, thank you! The issue I'm having is that the bucketing in withAudioViz doesn't quite match the spectrum processing I'm hoping to achieve with the synced cubes.pde visualization.
Cool, but please use the </> button to format your code! (even if it's a snippet)
I haven't tried any code, but basically what you want to do is move forward at a fixed rate so that you can render sequential images, is that correct? Assuming that is the case, I checked some examples, and with AudioSample you can extract the raw samples as a float array,
and you can pass that array to the FFT with a specified offset (startAt),
so I'm guessing that you need to advance startAt at a fixed rate, which is perhaps floor(sampleRate / fps), where fps is the frame rate of the output video.
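Untested, but if I'm reading the Minim docs right, the offline analysis could look something like this (the file name, fps value, and output handling are placeholders):

```java
import ddf.minim.*;
import ddf.minim.analysis.*;

Minim minim;

void setup() {
  minim = new Minim(this);
  AudioSample track = minim.loadSample("song.mp3", 1024);    // placeholder file name
  FFT fft = new FFT(track.bufferSize(), track.sampleRate());

  // raw samples of one channel as a float array
  float[] samples = track.getChannel(AudioSample.LEFT);

  float fps = 30;                                            // frame rate of the output video
  int samplesPerFrame = floor(track.sampleRate() / fps);

  // analyze one 1024-sample window per video frame
  for (int startAt = 0; startAt + fft.timeSize() <= samples.length; startAt += samplesPerFrame) {
    fft.forward(samples, startAt);
    for (int band = 0; band < fft.specSize(); band++) {
      float magnitude = fft.getBand(band);
      // ... write magnitude out (e.g. one row per video frame in the analysis .txt) ...
    }
  }
}
```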
Thank you for the formatting guidance. I will be sure to format code in my posts going forward.
I think you're on the right track with the startAt offset read windows. I've been looking at AudioStream as a possibility too. Do you know if there's a getChannel() equivalent for the mix, rather than LEFT or RIGHT? Thank you again for working through this with me; I'm definitely learning a lot about Minim & FFT.
I don't know. But since you are not doing it in real time, I would simply edit the sound with an external tool (like Audacity) or compute it within Processing (simply take the two channel arrays and generate an averaged array).
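Something like this (untested; assuming the AudioSample is called track as above):

```java
// average the left and right channels into a mono "mix" array
float[] left  = track.getChannel(AudioSample.LEFT);
float[] right = track.getChannel(AudioSample.RIGHT);
float[] mix   = new float[left.length];
for (int i = 0; i < mix.length; i++) {
  mix[i] = (left[i] + right[i]) / 2.0f;
}
// then run fft.forward(mix, startAt) on this array instead of a single channel
```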
I tried reading the thread but I'm not sure I understand the issue. Is one issue that the two FFT implementations give different values? In the example I use logAverages, but there's also linAverages, in case that makes a difference: FFT
Ah… Maybe I get it now: you prefer the values you are getting from the standard Processing audio library in real time, and you want a way to get those values not-in-real-time, the same way you get them when using Minim. Right?
In recent years I haven't used this approach, but have used https://sonicvisualiser.org/ instead. It has tons of plugins for different types of analysis, and it also has a command-line version you could use to automate converting audio files to text files.
Unfortunately I can’t help much with the standard audio library.
After many print statements and much reading of the Minim documentation, I have figured it out! For ProcessingCubes, the visualization iterates over the specSize() bands (513 because of the 1024 buffer size).
In your example, logAverages causes the FFT array to contain 30 bands; by setting fftSlices to 513 and averaging the right and left channel values, each slice becomes an array of 513 values instead. I tried linAverages but it didn't agree with the code (more testing might help here), so I just used getBand() once I had the desired parameters of the .txt sorted into chunks and slices.
By modifying the ProcessingCubes code to iterate over the array values rather than calling getBand() on a live FFT, I can parse the .txt into the visualizer. With your while-loop logic I'm able to generate synced A/V.
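For anyone finding this later, the analysis side of my modification looks roughly like this (a sketch, not the exact withAudioViz code; the output format and fftSlices-related names are just what I happened to use):

```java
// write one row of specSize() (513) averaged L/R magnitudes per analysis chunk
float[] left  = track.getChannel(AudioSample.LEFT);
float[] right = track.getChannel(AudioSample.RIGHT);

FFT fftL = new FFT(track.bufferSize(), track.sampleRate());
FFT fftR = new FFT(track.bufferSize(), track.sampleRate());

float fps = 30;                                               // output video frame rate
int samplesPerFrame = floor(track.sampleRate() / fps);
PrintWriter output = createWriter("analysis.txt");           // placeholder path

for (int startAt = 0; startAt + fftL.timeSize() <= left.length; startAt += samplesPerFrame) {
  fftL.forward(left, startAt);
  fftR.forward(right, startAt);

  StringBuilder row = new StringBuilder();
  for (int band = 0; band < fftL.specSize(); band++) {        // 513 values per row
    float avg = (fftL.getBand(band) + fftR.getBand(band)) / 2.0f;
    row.append(avg);
    if (band < fftL.specSize() - 1) row.append(' ');
  }
  output.println(row.toString());
}
output.flush();
output.close();
```

On the visualizer side, cubes then reads each row into a float array of 513 values and indexes into that array wherever it previously called fft.getBand(i).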
I definitely learned a lot through this process and am pleased with the results. I'll write up some comments on the withAudioViz example covering how it can be modified to generate specSize arrays rather than logAverage ones, and begin working on a de novo viz for my tunes.