Beat Detection Algorithm help / feedback

I am trying to recreate an algorithm from this link - specifically the one for frequency selected sound energy:
http://archive.gamedev.net/archive/reference/programming/features/beatdetection/

In the author’s work he uses a constant C set to 250. That value does not work in my program, and I’m curious whether anyone knows why, or could offer guidance on how I might go about choosing a constant for my own version.

Additionally, any feedback on how many samples I should hold in the history buffer would be super helpful. I’ve never programmed anything like this, so I’m not entirely sure how it should behave. It looks OK when the circle is scaled by the beat’s size, but it does not look right without the scaling (unless you use a very simple audio file, like just a kick drum or hi-hats or other strongly peaked audio) - the circle basically just flickers on and off at a static size. I wasn’t sure if that is normal for this style of algorithm. If anyone can offer feedback or ways to tweak it, it would really help me out.
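
One idea I’ve been experimenting with to tame the flicker (just a sketch, assuming the same beatDetect object and size property as in the code below - I’m not sure it matches the author’s intent) is to ease the circle back down between beats instead of only drawing it on beat frames:

let displaySize = 0; // persists between frames

function draw() {

  background(0);

  if( beatDetect.detectBeat() ) {
    // jump up when a beat is detected...
    displaySize = 300 + beatDetect.size;
  }

  // ...then ease back toward zero on the frames in between
  displaySize = lerp(displaySize, 0, 0.1);

  fill(255,0,0);
  ellipse(width/2, height/2, displaySize, displaySize);

}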

Without seeing a video of the author’s implementation, I have a difficult time imagining the intended behavior, so I’m hoping someone out there has done this type of work and knows how it’s supposed to function. Any code tweaks are also welcome. I tried to keep this simple - I know it’s probably not the most efficient code; I’m literally just trying to recreate his work to learn about this.

Code:

let sound;
let beatDetect;

function preload() {

  //put your sound file here
  sound = loadSound("../audio/17.mp3");

}

function setup() {

  createCanvas(500,500);
  frameRate(60);
  beatDetect = new BeatDetect();
  sound.loop();

}


function draw() {

  background(0);

  if( beatDetect.detectBeat() ) {
    fill(255,0,0);
    //console.log(size);
    ellipse(width/2,height/2,300+beatDetect.size,300+beatDetect.size);
  }

}

//BeatDetect constructor
function BeatDetect() {

  //http://archive.gamedev.net/archive/reference/programming/features/beatdetection/
  //algorithm #2 - Frequency selected sound energy

  let sampleBuffer=[]; //sample history buffer
  let bandsAverage=[]; //array of averages of the bands
  let energyBuffer=[]; //array of current sound energy

  let fourier = new p5.FFT();
  this.size=0;

  this.detectBeat = function() {

    //analyze fourier
    let spectrum=fourier.analyze();

      let count=0; //reset count every frame
      let isBeat=false; //reset isBeat every frame

      let bandsEnergy=[]; //start a new array each frame, so we don't pass in an array reference :)
      let bands_temp=[]; //temp array to split the frequency bands

      for(let i=0; i < spectrum.length; i++) {

        energyBuffer[i] = spectrum[i] * spectrum[i]; //calculate energy
        bands_temp.push(energyBuffer[i]); // temp array to split into grouped bands

        //collect 32 bins per band (1024 bins / 32 = 32 bands)
        if(bands_temp.length == 32) {

           let sum=0; // init the sum to zero
           for(let j=0; j < bands_temp.length; j++) {
             sum+=bands_temp[j]; // sum the energy of this band's bins
           }
           bandsEnergy[count] = sum; // store the total energy of this band

           bands_temp=[]; //clear the temp array for the next set of 32 bins
           count++; //increment the band index; reset every frame

         }

      }

      sampleBuffer.push(bandsEnergy);

      //wait until the buffer holds 30 previous frames plus the one just pushed
      if(sampleBuffer.length == 30 + 1) {

        for(let j = 0; j < bandsEnergy.length; j++) {

          let bandSum=0; //reset bandSum when we get to a new band

          //loop through all arrays in the history buffer
          for(let k=0; k < sampleBuffer.length; k++) {
            bandSum += sampleBuffer[k][j]; // get the sum for each freq. band in the buffer
          }
          //calc avg from sum in each band of the buffer
          bandsAverage[j] = bandSum/sampleBuffer.length;
        }

        //remove the oldest sample from the array
        sampleBuffer.splice(0,1);

      }

      let c = 1.095; //increase to lower sensitivity

      //compare each band's current energy against its running average
      //(bandsAverage is empty until the history buffer fills, so no beats
      //fire during the first 30 frames)
      for(let k=0; k < bandsEnergy.length; k++) {

         if(bandsEnergy[k] > bandsAverage[k]*c) {
           isBeat=true;
           this.size=(bandsEnergy[k])*.0001;
           //this.size=300;
           return isBeat;
         }
       }

       //console.log(bandsEnergy.length);
       //console.log(sampleBuffer.length);
      //console.log( bandsAverage[14] + "bandAverage");
      //console.log( bandsEnergy[14] +  "bandEnergy");

      return isBeat; //no band exceeded its average this frame

    }; // end DETECT

  } // end CONSTRUCTOR

It’s kind of confusing what C means on that linked page:

Now the ‘C’ constant of this algorithm has nothing to do with the ‘C’ of the first algorithm, because we deal here with separated subbands the energy varies globally much more than with colorblind algorithms. Thus ‘C’ must be about 250.

It’s possible that the 250 value for C was referring to a different algorithm than the one you’ve implemented.
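
If a hard-coded C keeps misfiring, one thing you could try is deriving C from how spread out the recent energies are instead of fixing it. This is just a sketch based on my reading of the variance refinement the article describes for its first algorithm, and those constants assume a different energy scale than p5’s 0-255 spectrum values, so they would almost certainly need re-tuning:

//Sketch: derive the sensitivity constant from the variance of one band's
//energy history instead of hard-coding it. The linear mapping is the one
//the article suggests for its first algorithm; the numbers will need
//re-tuning for the energy scale used in your sketch.
function adaptiveC(history, average) {

  if(history.length === 0) { return 1.3; } //fallback before the buffer fills

  let variance = 0;
  for(let i = 0; i < history.length; i++) {
    let d = history[i] - average;
    variance += d * d;
  }
  variance /= history.length;

  return (-0.0025714 * variance) + 1.5142857;
}

Here history would be one band’s energies across your sampleBuffer (sampleBuffer[k][j] over all k for a fixed band j), and average the bandsAverage[j] you already compute, so you would call it once per band inside the comparison loop.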

Pretty cool sketch, although it does get fooled quite a bit by non-percussive sounds. Without more research I don’t have any specific advice for how to improve it, but here is a version that visualizes the spectrum as well: Beat Detection and Spectrum Viz - OpenProcessing


Thank you for your feedback. I think without some sort of extensive testing and a table of frequencies it’s really hard to figure out. I was having the same issue with non-percussive sounds fooling it. Playing with the size of the bands (e.g. 16 or 32 bins) and the sample history size seems to have a big impact. For some reason 30 seems to work best; I’m not sure if this has something to do with the FFT size - the p5 docs say the spectrum has 1024 bins, but the underlying FFT size is twice that - so it would sort of make sense that, rather than covering a full second, 30 elements in the history buffer “might” be about one second’s worth of data. I also tried moving the C value up and down, but mine ends up more or less around 1, plus or minus something like 0.01 or 0.02.
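
For what it’s worth, here is how I’m thinking about sizing the buffer now - just a sketch, assuming detectBeat() keeps being called once per draw() frame:

//If detectBeat() runs once per draw() frame, then roughly one frame rate's
//worth of entries covers about one second of history, regardless of FFT size.
let historyLength = round(frameRate()); //about 60 entries at 60 fps

sampleBuffer.push(bandsEnergy);
if(sampleBuffer.length > historyLength) {
  sampleBuffer.splice(0, 1); //drop the oldest frame of band energies
}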

The other algorithm (the simple beat detection with variance) has fewer loops. Maybe it’s a better way to go, or maybe I bit off more than I can chew at this level. Your feedback about the array references really helped me. THANK YOU.
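
In case it helps future readers, here is roughly the structure I have in mind for that simpler version - an untested sketch, reusing fourier.analyze() the same way as above:

//Untested sketch of the simpler algorithm: compare the instant energy of
//the whole spectrum against its average over roughly the last second.
function SimpleBeatDetect() {

  let fourier = new p5.FFT();
  let history = []; //instant energy per frame, about one second's worth

  this.detectBeat = function() {

    let spectrum = fourier.analyze();

    //instant energy = sum of squared bin values for this frame
    let instant = 0;
    for(let i = 0; i < spectrum.length; i++) {
      instant += spectrum[i] * spectrum[i];
    }

    //average energy over the history buffer
    let average = 0;
    for(let i = 0; i < history.length; i++) {
      average += history[i];
    }
    if(history.length > 0) { average /= history.length; }

    history.push(instant);
    if(history.length > 60) { history.splice(0, 1); } //about one second at 60 fps

    //beat if the instant energy jumps well above the recent average
    let c = 1.3; //needs tuning, same problem as the band version
    return history.length > 30 && instant > average * c;
  };

}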