Sound Library: What values to use in sample?

Just started using the Sound Library and have a quick question. I wasn’t able to find any reference to whether the value in each frame of a sample should be a float from -1.0 to 1.0, a float from 0.0 to 1.0, or any float value at all, positive or negative.

Edit: The examples found in the library vary from -1.0 to 1.0 (AudioSampleManipulation) to -100 to 100 (Example in AudioSample read() reference).

My assumption is that the player simply normalizes to the min max numbers in one sample, but I’d like to get some clarity on how this works, how to optimize, and ultimately what to standardize on in my projects.

Thanks!
Ryan M


Hi @LDB477,
If you really want to find the true answer, the Processing Sound Library uses JSyn as its audio backend. It can be a bit thick to parse through, but the source is on GitHub at philburk/jsyn.
However, as for the values to write with the AudioSample.write() method, I would just default to -1 to 1. The example you mention in the AudioSample read() reference on Processing.org does go well beyond that range, and I have no good explanation for that, but the other example you mention will start to clip if you push it beyond the range.
The code below will start to clip if you go above or below the -1 to 1 range.
If you change the multiplier on line 13 (the sinewave[i] assignment) from 3 to 10, for example, you’ll get ear-bleeding distortion, which indicates that the desired numerical range is -1 to 1.
This is consistent with most audio libraries and hardware. Each one deals with out-of-range values differently; some have limiters or other protections. Since the Processing Sound Library uses Java Sound (Java Sound API, oracle.com), it ultimately depends on how Java Sound deals with out-of-range numbers on a given operating system, and then on how the hardware deals with them.
So, to be consistent I would stick with -1.0 to 1.0 float values.

import processing.sound.*;

AudioSample sample;

void setup() {
  size(640, 360);
  background(255);

  // Manually write one sine wave oscillation into an array.
  int resolution = 1000;
  float[] sinewave = new float[resolution];
  for (int i = 0; i < resolution; i++) {
    // An amplitude of 3 is outside the -1..1 range, so this should clip.
    sinewave[i] = 3*sin(TWO_PI*i/resolution);
  }

  // Initialize the audiosample, set framerate to play 500 oscillations/second
  sample = new AudioSample(this, sinewave, 500 * resolution);

  sample.amp(1.0);
  sample.loop();
}

void draw() {
}
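
If you generate or compute your own frames and they may land outside -1 to 1, one option is to rescale or clamp them before handing them to AudioSample. The following is only a sketch in plain Java: the class and method names (SampleRange, peak, normalize, clamp) are hypothetical helpers and not part of the Sound library, and the sine wave simply mirrors the one in the sketch above.

```java
// Hypothetical helpers (not part of processing.sound) for keeping
// sample frames inside the -1.0..1.0 range before playback.
final class SampleRange {

    // Largest absolute value found in the frames.
    static float peak(float[] frames) {
        float max = 0f;
        for (float f : frames) {
            max = Math.max(max, Math.abs(f));
        }
        return max;
    }

    // Returns a copy scaled so that the peak is exactly 1.0.
    // Silent input (peak 0) is returned as an unchanged copy.
    static float[] normalize(float[] frames) {
        float[] out = frames.clone();
        float p = peak(frames);
        if (p == 0f) {
            return out;
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = out[i] / p;
        }
        return out;
    }

    // Returns a copy with every frame hard-limited to -1.0..1.0.
    // This avoids out-of-range values but still distorts the waveform.
    static float[] clamp(float[] frames) {
        float[] out = frames.clone();
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(-1f, Math.min(1f, out[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        // Same over-range sine wave as in the sketch above (amplitude 3).
        int resolution = 1000;
        float[] sinewave = new float[resolution];
        for (int i = 0; i < resolution; i++) {
            sinewave[i] = (float) (3 * Math.sin(2 * Math.PI * i / resolution));
        }
        System.out.println("peak before: " + peak(sinewave));
        System.out.println("peak after:  " + peak(normalize(sinewave)));
    }
}
```

Normalizing keeps the shape of the waveform and only changes its level, while clamping keeps the level but flattens the peaks, so normalizing is usually the better fit when the goal is to standardize on -1.0 to 1.0.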

Hello @robertesler and thank you for the detailed explanation. I think I will stick to -1 to 1 range as you’ve suggested.

While I have you, I’m also working with the FFT built into the library. What I’m missing, though, is which frequency each band represents. My assumption is that if you choose the default of 512 bands, the resulting spectrum output runs from 0 Hz through some very high frequency, in steps of (max Hz/512).

Any insight would be appreciated!

Thanks

When using FFT, the output is in frequency bins. Each bin has a bandwidth of the sample rate divided by the window size (the number of samples analyzed). So if your sample rate is 44100 Hz and your window size is 512, then each bin is about 86 Hz wide.
The output of the FFT is then the average energy across each band for the current analysis window. The nth bin frequency would be n * SR/windowSize. For a real-valued signal such as audio, only the bins up to SR/2 (the Nyquist frequency) carry unique information.
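
The bin arithmetic above can be sketched in a few lines of plain Java. This assumes a window size of 512 at 44100 Hz, as in the example; the class and method names are made up for illustration, and whether the library's bands argument equals the window size or only half of it is not confirmed here, so check the FFT reference before relying on the exact numbers.

```java
// Hypothetical helper (not part of processing.sound) that applies the
// formula binFrequency = n * sampleRate / windowSize.
final class FftBins {

    // Width of a single bin in Hz.
    static double binWidth(double sampleRate, int windowSize) {
        return sampleRate / windowSize;
    }

    // Frequency in Hz associated with bin n.
    static double binFrequency(int n, double sampleRate, int windowSize) {
        return n * sampleRate / windowSize;
    }

    // Index of the bin whose frequency is closest to the given frequency.
    static int nearestBin(double hz, double sampleRate, int windowSize) {
        return (int) Math.round(hz * windowSize / sampleRate);
    }

    public static void main(String[] args) {
        double sampleRate = 44100.0; // assumed sample rate
        int windowSize = 512;        // assumed FFT window size

        System.out.println("bin width (Hz): " + binWidth(sampleRate, windowSize));
        // Only bins 0..windowSize/2 are unique for real-valued audio.
        for (int n = 0; n <= windowSize / 2; n += 64) {
            System.out.println("bin " + n + " -> "
                + binFrequency(n, sampleRate, windowSize) + " Hz");
        }
        System.out.println("440 Hz falls nearest to bin "
            + nearestBin(440.0, sampleRate, windowSize));
    }
}
```

To label a spectrum display, call binFrequency for each index you draw; nearestBin goes the other way when you want to watch one particular frequency.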
