Speech Recognition

Hello!

I’m a student working on a small project for class and I’m having some trouble (I’m pretty new to p5.js). I’m using the p5.speech library to get speech recognition into my project; basically, what I want to do is display the words that are being said in the sketch.

In the sketch, it seems like the recognized words are being turned into strings (or a string), but I just can’t figure out how to get them to actually draw. The words show up in the console perfectly via console.log as you speak, which should mean that it would be simple (right?). I’ve also tried a bunch of things to find out how the variable that stores the words behaves, but I can’t get anything to work. I’m pretty stumped. If anyone can figure out how to get the words displayed in the sketch, or at least how to use them like any other string, that would be great.

Here’s a link to the project:
https://editor.p5js.org/felixd1@tcnj.edu/sketches/1E3Vv2zTA

Hi,

Welcome to the Processing community!
When you don’t know how to do something, it’s worth checking the p5.js reference first, because almost everything is covered there.

For drawing text on the canvas, what you need is the text() function.

Oh, OK, sorry, I didn’t see you were already using the text() function.

Yeah, I used the text() function to test a bunch of things, and to try to display the text, of course. Thanks though. My trouble lies in the variable that stores the words, since it really seems like it should work like any other string, but it just doesn’t.

@dannfelixx you have to put anything that moves or changes inside the draw() function. Read the wiki to get a better idea.

Put these lines in the draw() function to render the words in the sketch.

  if(String(mostrecentword) !== ""){
    text(String(mostrecentword), 20, 20);  
  }

I tried a few different combinations of stuff in draw(), and that line doesn’t seem to work :frowning: That’s why it’s so confusing: the mostrecentword variable doesn’t act like a string even though it looks like it should.

@dannfelixx when I run your sketch, I get a “/api/session:1 Failed to load resource: the server responded with a status of 404 ()” error in the console. Can you confirm this? You can open the console by pressing F12 or Ctrl+Shift+I.

Looking at your code, the first thing you need to be aware of is when p5 is available, which is within the setup() function. I did a search and found this

Below is a working demo. If it shows an error, just start speaking and you will see it start processing the audio. Tested in Chrome. It doesn’t seem to be supported in Firefox.

Kf

var myRec;
var mostrecentword = "";

var x, y;
var dx, dy;
var step;

function setup() {
  createCanvas(300,400);  
  
  
  myRec = new p5.SpeechRec('en-US'); // new P5.SpeechRec object
  myRec.continuous = true; // do continuous recognition
  myRec.interimResults = true; // allow partial recognition (faster, less accurate)
  
  myRec.onResult=parseResult;
  myRec.onError= showError;
  myRec.start(); 
  
  // graphics stuff:
  textFont('Georgia');
  textSize(12);
  textAlign(LEFT);
  fill(0, 0, 0, 255);
  x = width / 2;
  y = height / 2;
  dx = 0;
  dy = 0;
  
  step=10;

}

function draw() {
  background(200);

  text("Hello", 12, 30);
  
  if (mostrecentword != "") {
    x=x+dx;
    y=y+dy;
    text(String(mostrecentword),x ,y);
    dx=0;
    dy=0;
  }
  
}

function parseResult() {
  
  // recognition system will often append words into phrases.
  // so hack here is to only use the last word:
  mostrecentword = myRec.resultString.split(' ').pop();
  
  if (mostrecentword.indexOf("left") !== -1) {
    dx = -step;
    dy = 0;
  } else if (mostrecentword.indexOf("right") !== -1) {
    dx = step;
    dy = 0;
  } else if (mostrecentword.indexOf("up") !== -1) {
    dx = 0;
    dy = -step;
  } else if (mostrecentword.indexOf("down") !== -1) {
    dx = 0;
    dy = step;
  } else if (mostrecentword.indexOf("clear") !== -1) {
    //background(255);
  }
  console.log(myRec.resultString);
  console.log(myRec.resultString.split(' ').length);
}

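function showError() {
  console.log('There is an error');
  // Note: draw() repaints the background every frame, so this message
  // is erased from the canvas on the next frame.
  // Use the canvas size (width/height), not the window size, so the
  // text lands inside the 300x400 canvas.
  text('There is an error', width / 2, height / 2);
}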
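
The if/else chain in parseResult() can also be pulled out into a plain function, which makes it possible to check the word matching without a microphone. This is only a sketch: directionFor is a made-up helper name, not something p5.speech provides, and it assumes the same substring matching as the demo above.

```javascript
// Hypothetical helper (not part of p5.speech): maps a recognized word
// to a movement delta, mirroring the if/else chain in parseResult().
// Returns null when the word is not a direction, so the caller can
// leave dx and dy unchanged.
function directionFor(word, step) {
  if (word.indexOf("left") !== -1) return { dx: -step, dy: 0 };
  if (word.indexOf("right") !== -1) return { dx: step, dy: 0 };
  if (word.indexOf("up") !== -1) return { dx: 0, dy: -step };
  if (word.indexOf("down") !== -1) return { dx: 0, dy: step };
  return null;
}
```

Inside parseResult() you would then call directionFor(mostrecentword, step) and copy dx and dy from the result only when it is not null.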

  • p5 is a variable, every time and everywhere!
  • p5 is a global variable that stores the reference to the “p5js” constructor function (a.k.a. class).
  • SpeechRec is a property of class p5, which the library “p5.speech” has appended to it.
  • Like variable p5, property SpeechRec also refers to a class (constructor function).
  • We can say that SpeechRec is a nested static class of class p5.
  • BTW, setup() is a p5 callback function.

@GoToLoop Autocorrect changed what I meant to say: not “variable” but “available”, so my mistake. I amended my post.

I did some more testing, and doing myRec = new p5.SpeechRec('en-US'); doesn’t seem to work in the global context. I don’t have a definite answer, as I get errors from the library from time to time. I am guessing it is because the library object is not properly disposed of. From my limited experience with p5, SpeechRec should be instantiated in setup(), right? Is this a requirement given that “SpeechRec is a nested static class of class p5”? When is setup() called during initialization? I assume the p5 constructor is called when the p5.js library is loaded, but setup() must be called in window.onload() or a similar method?

Kf

  • Variable p5 is globally declared and initialized right after the library is loaded.
  • An instance of p5 is created right after p5js finds a setup() or draw() in the global context.
  • Otherwise, p5js rechecks for their existence on the onload() event.
  • Then preload() & setup() are called back once everything is properly initialized.

Yes, I’m a bit rusty on this life cycle, @GoToLoop. So based on this, is initializing myRec in the global context legal?

Kf

As long as “p5.speech” is already loaded, I believe it should work globally. :thinking:
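
One way to test that assumption is to check for the constructor before using it. This is just a sketch: speechRecAvailable is a made-up helper name, and it only relies on what was said above, namely that p5 is a global constructor function and that p5.speech appends SpeechRec to it.

```javascript
// Hypothetical helper: reports whether the given p5 global exists and
// already carries the SpeechRec constructor added by p5.speech.
function speechRecAvailable(p5Global) {
  return typeof p5Global === "function" &&
         typeof p5Global.SpeechRec === "function";
}
```

In a sketch you could call speechRecAvailable(window.p5) at the top of the file, and only run new p5.SpeechRec('en-US') in the global context when it returns true.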

Did a bit more testing, looks like the continuous speech recognition isn’t compatible with the draw function if you want to continuously draw the words. Logging the frame rate shows that it gets stuck on the first frame. I’m guessing that since it’s a continuous process, p5 technically can’t continue to run the code past it (at 60 times a second) like it normally would. In the example on the p5.speech site, it’s not drawing the words it recognizes as text - it only uses them to move the dot around.

Did you try my code above? Because I had a draw(). Also, what browser and OS are you using?

From my limited testing, it works continuously most of the time. If you are referring to the code you posted, it only pops the last recognized word. In my code, I was printing all the words in the stack. If you stop speaking, I think it stops recognizing. If you start speaking, it activates again. Were you watching the console log messages? I did notice the sketch would get stuck when restarting. I am guessing the system was trying to load the rec object and hung there. I suggest you do a broader search in the JS community to see whether this has been observed in other projects, or for suggestions on how to resolve your current issue. But first, become familiar with the library documentation.
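
If you want to draw more than the last word, one option is to keep the tail of the result string instead of calling pop(). A minimal sketch, where lastWords is a made-up helper name and the input is assumed to be a space-separated string like the one myRec.resultString holds:

```javascript
// Hypothetical helper: returns the last `count` words of a
// space-separated result string, joined back into one string.
function lastWords(resultString, count) {
  if (count <= 0) return "";
  // Split on any run of whitespace and drop empty pieces.
  const words = resultString.trim().split(/\s+/).filter((w) => w.length > 0);
  return words.slice(-count).join(" ");
}
```

In parseResult() you could then set mostrecentword = lastWords(myRec.resultString, 5); and draw it with text() as before.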

Kf

I just got it to work on Chrome :slight_smile: I used your code and it works very well, now I can finally finish the rest of the project. Many thanks to you and everyone who helped!