AI reinforcement learning

ctremblay · May 30, 2021, 3:39pm

I’m trying to learn reinforcement learning for AI. So far I’ve been told that the way to program it is to make a table with the actions I want the player to do, but the reinforcement learning AIs I see on YouTube are getting way more info than the way I was told to code it.
When I do machine learning AI, I give it several distances to stuff and let it run. How can I give my reinforcement learning AI the distances?

josephh · May 31, 2021, 8:35am

Hi @ctremblay ,

Can you be more explicit about your project? What are you trying to do? Is it a game? Do you have some code already? Are you doing the whole thing in Processing Java?

Chrisir · May 31, 2021, 8:57am

I imagine that this is an assignment from your teacher?

I think that he wants you to work with a table to store moves for tic tac toe or whatever so your small AI acts based on that data and gets better.

When you google the AI stuff you come across approaches with neural networks. That confuses you. Stick with the concept of your teacher and forget YouTube/google for now.

Neural networks are much harder to do.

To get real help, tell us about the project and show some code.

Example: Tic Tac Toe

The way I would do it for tic tac toe: (Problem is that you first need the game itself. How to make a move there etc.)

My idea would be that the AI stores the game in a table on the hard drive:

When it loses, all the moves it made are noted with score -1, when it wins +1, when it’s a draw 0.

Now when playing check the table:

When you find the board (also by rotation or mirroring, but that’s hard) and the score is -1 or 0, avoid the move; otherwise make the move.
When you don’t find the board, make a random move.

Add moves/game to the table when new.

To train this let the AI play against itself as fast as possible for a night.

Then play against the AI.
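
A rough sketch of such a table in plain Java, in case it helps. It is a variation on the idea above, not the only way to do it: instead of storing a single -1/0/+1, it sums the outcomes per (board, move) pair and picks the legal move with the best sum. The class name `MoveTable`, the 9-character board string ("X", "O", "-") and the tie-breaking rule are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers the summed game outcome per (board, move) pair.
class MoveTable {
  // key: 9-character board string ("X", "O", "-"), then ":", then the move index 0..8
  private final Map<String, Integer> scores = new HashMap<>();

  static String key(String board, int move) {
    return board + ":" + move;
  }

  // After a game: add +1 (win), 0 (draw) or -1 (loss) to every pair the AI played.
  void recordGame(String[] boards, int[] moves, int outcome) {
    for (int i = 0; i < boards.length; i++) {
      scores.merge(key(boards[i], moves[i]), outcome, Integer::sum);
    }
  }

  // Pairs that were never played score 0.
  int score(String board, int move) {
    return scores.getOrDefault(key(board, move), 0);
  }

  // Returns the free cell with the highest score (first one wins ties), or -1 if the board is full.
  int bestMove(String board) {
    int best = -1;
    int bestScore = Integer.MIN_VALUE;
    for (int m = 0; m < 9; m++) {
      if (board.charAt(m) != '-') continue; // cell already taken
      int sc = score(board, m);
      if (sc > bestScore) {
        bestScore = sc;
        best = m;
      }
    }
    return best;
  }
}
```

While training you would probably want to pick randomly among equally scored moves instead of always taking the first one, so the AI keeps trying new lines of play.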

(check out the forum and the processing YouTube channel for more)

Chrisir

ctremblay · May 31, 2021, 3:05pm

It’s not for school. I just think what I see with reinforcement learning looks cooler than what I’m doing now. I never did learn how to choose the win or lose part in reinforcement learning. I just found a basic guide and have been copying it.

Chrisir · May 31, 2021, 3:07pm

To get real help, tell us about the project and show some code.

ctremblay · May 31, 2021, 3:07pm

OK, so here’s the code for my machine learning. It works well but can be dumb. But it did make it to the end. It has the room variable because I was going to add a menu, a high score, etc. to it.
mazeRunner

int s = 25;
ArrayList<Wall> wall = new ArrayList<Wall>();
ArrayList<Enemy> enemy = new ArrayList<Enemy>();
ArrayList<Enemy> saved = new ArrayList<Enemy>();
ArrayList<endPoint> end = new ArrayList<endPoint>();
boolean created = false;
Table wallTable;
TableRow tr;
int room = 1;
int total = 500;
int[] bs = new int[5];
int[] be = new int[5];
int gen = 0;
int hs = 0;

void setup(){
  size(625,625);
  wallTable = loadTable("wall.csv");
  tr = wallTable.getRow(0);
  int tn = tr.getInt(0);
  for (int i = 0; i < tn; i++){
    int tc = tr.getInt(i + 1);
    wall.add(new Wall(tc));
  }
  end.add(new endPoint(552));
  saved.add(loadSnake());
}

Enemy loadSnake() {
  Enemy load = new Enemy(52);
  Table t = loadTable("data/Snake.csv");
  load.brain.TableToNet(t);
  return load;
}

void draw(){
  background(0);
  if (room == 1){
    for (int i = wall.size()-1; i >= 0; i--){
        Wall w = wall.get(i);
        w.show();
        w.mouseCheck();
      }
      if (enemy.size() > 0){
        Enemy e = enemy.get(0);
        fill(255,0,0);
        textSize(20);
        text("gen: " + gen, 50, 150);
      }
      for (int i = enemy.size()-1; i >= 0; i--){
        Enemy e = enemy.get(i);
        e.show();
        e.checkD();
        e.checkW();
        if (e.alive == false){
          enemy.remove(i);
          saved.add(e);
        }
        else if (e.destroy == true){
          // else-if so an enemy that is both dead and stuck is removed only once
          enemy.remove(i);
        }
      }
      for (int i = end.size()-1; i >= 0; i--){
        endPoint e = end.get(i);
        e.show();
      }
      if (enemy.size() == 0){
        nextGen();
      }
  }
  if (created){
    tr = wallTable.addRow();
    tr.setInt(0,wall.size()+1);
    for (int i = 0; i < wall.size(); i++){
      Wall w = wall.get(i);
      tr.setInt(i+1,w.num);
    }
    saveTable(wallTable,"wall.csv");
    exit();
  }// created
}// draw

Enemy

class Enemy extends Rect{
  int rd, ld, ud, dd;
  float[] input = new float[5];
  NeuralNet brain;
  boolean alive = true;
  boolean destroy = false;
  PVector opos;
  int timeBetween = 128;
  int timer = timeBetween;
  float d;
  int score = 0;
  int rc, lc, uc, dc;
  
  Enemy(int n){
    super(n);
    opos = new PVector(pos.x,pos.y);
    brain = new NeuralNet(input.length,8,4);
  }
  
  Enemy(int n, NeuralNet b){
    super(n);
    opos = new PVector(pos.x,pos.y);
    brain = b.clone();
    brain.mutate(0.1);
  }
  
  void saveSnake() {
    //save snakes brain
    saveTable(brain.NetToTable(), "data/Snake.csv");
  }
  
  void think(){
    input[0] = rd;
    input[1] = ld;
    input[2] = ud;
    input[3] = dd;
    endPoint e = end.get(0);
    input[4] = e.num;
    float[] guess = brain.output(input);
    if (guess[0] > guess[1] && guess[0] > guess[2] && guess[0] > guess[3]){
      pos.x += s;
      rc = 0;
      lc = 2;
      uc = 0;
      dc = 0;
    }
    if (guess[1] > guess[0] && guess[1] > guess[2] && guess[1] > guess[3]){
      pos.x -= s;
      rc = 2;
      lc = 0;
      uc = 0;
      dc = 0;
    }
    if (guess[2] > guess[1] && guess[2] > guess[0] && guess[2] > guess[3] && dc != 2){
      pos.y += s;
      rc = 0;
      lc = 0;
      uc = 2;
      dc = 0;
    }
    if (guess[3] > guess[1] && guess[3] > guess[2] && guess[3] > guess[0] && uc != 2){
      pos.y -= s;
      rc = 0;
      lc = 0;
      uc = 0;
      dc = 2;
    }
    checkW();
  }
  
  void checkW(){
    for (int i = wall.size()-1; i >= 0; i--){
      Wall w = wall.get(i);
      if (pos.x == w.pos.x && pos.y == w.pos.y){
        alive = false;
      }
    }
    if (timer > -1){timer -= 1;}
    if (timer < 0){
      d = dist(pos.x,pos.y,opos.x,opos.y);
      if (d < 64){
        destroy = true;
      }
      timer = timeBetween;
      opos = new PVector(pos.x,pos.y);
    }
    score += 1;
  }
  
  void checkD(){
    rd = width;
    ld = width;
    ud = height;
    dd = height;
    for (int i = 0; i < wall.size(); i++){
      Wall w = wall.get(i);
      if (w.pos.y <= pos.y && w.pos.y + s >= pos.y){
        d = int(w.pos.x - pos.x);
        if (d > 0 && d < rd){
          rd = int(d-25);
        }
        d = int(pos.x - w.pos.x);
        if (d > 0 && d < ld){
          ld = int(d-25);
        }
      }// y
      if (w.pos.x <= pos.x && w.pos.x + s >= pos.x){
        d = int(w.pos.y - pos.y);
        if (d > 0 && d < dd){
          dd = int(d-25);
        }
        d = int(pos.y - w.pos.y);
        if (d > 0 && d < ud){
          ud = int(d-25);
        }
      }// x
    }// wall size
    think();
  }/// checkD
  
  void show(){
    fill(0,255,0);
    rect(pos.x,pos.y,s,s);
  }
}

Matrix

class Matrix {
  
  //local variables
  int rows;
  int cols;
  float[][] matrix;
  
  //---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //constructor
  Matrix(int r, int c) {
    rows = r;
    cols = c;
    matrix = new float[rows][cols];
  }
  
  //---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //constructor from 2D array
  Matrix(float[][] m) {
    matrix = m;
    rows = m.length;      //first index is the row, matching matrix[i][j] elsewhere
    cols = m[0].length;
  }
  
  //---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //print matrix
  void output() {
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        print(matrix[i][j] + "  ");
      }
      println(" ");
    }
    println();
  }
  //---------------------------------------------------------------------------------------------------------------------------------------------------------  
  
  //multiply by scalar
  void multiply(float n ) {

    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        matrix[i][j] *= n;
      }
    }
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a matrix which is this matrix dot product parameter matrix 
  Matrix dot(Matrix n) {
    Matrix result = new Matrix(rows, n.cols);
   
    if (cols == n.rows) {
      //for each spot in the new matrix
      for (int i =0; i<rows; i++) {
        for (int j = 0; j<n.cols; j++) {
          float sum = 0;
          for (int k = 0; k<cols; k++) {
            sum+= matrix[i][k]*n.matrix[k][j];
          }
          result.matrix[i][j] = sum;
        }
      }
    }

    return result;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //set the matrix to random floats between -1 and 1
  void randomize() {
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        matrix[i][j] = random(-1, 1);
      }
    }
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //add a scalar to the matrix
  void Add(float n ) {
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        matrix[i][j] += n;
      }
    }
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  ///return a matrix which is this matrix + parameter matrix
  Matrix add(Matrix n ) {
    Matrix newMatrix = new Matrix(rows, cols);
    if (cols == n.cols && rows == n.rows) {
      for (int i =0; i<rows; i++) {
        for (int j = 0; j<cols; j++) {
          newMatrix.matrix[i][j] = matrix[i][j] + n.matrix[i][j];
        }
      }
    }
    return newMatrix;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a matrix which is this matrix - parameter matrix
  Matrix subtract(Matrix n ) {
    Matrix newMatrix = new Matrix(rows, cols);
    if (cols == n.cols && rows == n.rows) {
      for (int i =0; i<rows; i++) {
        for (int j = 0; j<cols; j++) {
          newMatrix.matrix[i][j] = matrix[i][j] - n.matrix[i][j];
        }
      }
    }
    return newMatrix;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a matrix which is this matrix * parameter matrix (element wise multiplication)
  Matrix multiply(Matrix n ) {
    Matrix newMatrix = new Matrix(rows, cols);
    if (cols == n.cols && rows == n.rows) {
      for (int i =0; i<rows; i++) {
        for (int j = 0; j<cols; j++) {
          newMatrix.matrix[i][j] = matrix[i][j] * n.matrix[i][j];
        }
      }
    }
    return newMatrix;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a matrix which is the transpose of this matrix
  Matrix transpose() {
    Matrix n = new Matrix(cols, rows);
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        n.matrix[j][i] = matrix[i][j];
      }
    }
    return n;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //Creates a single column matrix from the parameter array
  Matrix singleColumnMatrixFromArray(float[] arr) {
    Matrix n = new Matrix(arr.length, 1);
    for (int i = 0; i< arr.length; i++) {
      n.matrix[i][0] = arr[i];
    }
    return n;
  }
  //---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //sets this matrix from an array
  void fromArray(float[] arr) {
    for (int i = 0; i< rows; i++) {
      for (int j = 0; j< cols; j++) {
        matrix[i][j] =  arr[j+i*cols];
      }
    }
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------    
  //returns an array which represents this matrix
  float[] toArray() {
    float[] arr = new float[rows*cols];
    for (int i = 0; i< rows; i++) {
      for (int j = 0; j< cols; j++) {
        arr[j+i*cols] = matrix[i][j];
      }
    }
    return arr;
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //for n x 1 matrices: appends a 1 (the bias) at the bottom
  Matrix addBias() {
    Matrix n = new Matrix(rows+1, 1);
    for (int i =0; i<rows; i++) {
      n.matrix[i][0] = matrix[i][0];
    }
    n.matrix[rows][0] = 1;
    return n;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //applies the activation function(sigmoid) to each element of the matrix
  Matrix activate() {
    Matrix n = new Matrix(rows, cols);
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        n.matrix[i][j] = sigmoid(matrix[i][j]);
      }
    }
    return n;
  }
  
//---------------------------------------------------------------------------------------------------------------------------------------------------------    
  //sigmoid activation function
  float sigmoid(float x) {
    float y = 1 / (1 + pow((float)Math.E, -x));
    return y;
  }
  //returns the matrix that is the derived sigmoid function of the current matrix
  Matrix sigmoidDerived() {
    Matrix n = new Matrix(rows, cols);
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        n.matrix[i][j] = (matrix[i][j] * (1- matrix[i][j]));
      }
    }
    return n;
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //returns the matrix which is this matrix with the bottom layer removed
  Matrix removeBottomLayer() {
    Matrix n = new Matrix(rows-1, cols);      
    for (int i =0; i<n.rows; i++) {
      for (int j = 0; j<cols; j++) {
        n.matrix[i][j] = matrix[i][j];
      }
    }
    return n;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //Mutation function for genetic algorithm 
  
  void mutate(float mutationRate) {
    
    //for each element in the matrix
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        float rand = random(1);
        if (rand<mutationRate) {//if chosen to be mutated
          matrix[i][j] += randomGaussian()/5;//add a random value to it(can be negative)
          
          //set the boundaries to 1 and -1
          if (matrix[i][j]>1) {
            matrix[i][j] = 1;
          }
          if (matrix[i][j] <-1) {
            matrix[i][j] = -1;
          }
        }
      }
    }
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //returns a matrix which has a random number of values from this matrix and the rest from the parameter matrix
  Matrix crossover(Matrix partner) {
    Matrix child = new Matrix(rows, cols);
    
    //pick a random point in the matrix
    int randC = floor(random(cols));
    int randR = floor(random(rows));
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {

        if ((i< randR)|| (i==randR && j<=randC)) { //if before the random point then copy from this matrix
          child.matrix[i][j] = matrix[i][j];
        } else { //if after the random point then copy from the parameter matrix
          child.matrix[i][j] = partner.matrix[i][j];
        }
      }
    }
    return child;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a copy of this matrix
  Matrix clone() {
    Matrix clone = new  Matrix(rows, cols);
    for (int i =0; i<rows; i++) {
      for (int j = 0; j<cols; j++) {
        clone.matrix[i][j] = matrix[i][j];
      }
    }
    return clone;
  }
}

NG

void nextGen(){
  for (int i = 0; i < 5; i++){
    bs[i] = 0;
    be[i] = 0;
  }
  checkBest();
  for (int i = 0; i < total; i++){
    enemy.add(PickOne());
  }
  for (int i = 0; i < 10; i++){
    enemy.add(new Enemy(52));
  }
  checkHigh();
  saved.clear();
  gen += 1;
}

Enemy PickOne(){
  // int(random(5)) picks each of the five best slots 0..4 with equal chance
  int r = int(random(5));
  Enemy p = saved.get(be[r]);
  Enemy child = new Enemy(52,p.brain);
  return child;
}

void checkHigh(){
  if (bs[0] > hs){
    hs = bs[0];
    // bs[0] is the best score; be[0] is that enemy's index in saved
    Enemy e = saved.get(be[0]);
    e.saveSnake();
  }
}

void checkBest(){
  for (int i = saved.size()-1; i >= 0; i--){
    Enemy e = saved.get(i);
    if (e.score > bs[0]){
      bs[4] = bs[3];
      bs[3] = bs[2];
      bs[2] = bs[1];
      bs[1] = bs[0];
      bs[0] = e.score;
      be[4] = be[3];
      be[3] = be[2];
      be[2] = be[1];
      be[1] = be[0];
      be[0] = i;
    }
    else if (e.score > bs[1]){
      bs[4] = bs[3];
      bs[3] = bs[2];
      bs[2] = bs[1];
      bs[1] = e.score;
      be[4] = be[3];
      be[3] = be[2];
      be[2] = be[1];
      be[1] = i;
    }
    else if (e.score > bs[2]){
      bs[4] = bs[3];
      bs[3] = bs[2];
      bs[2] = e.score;
      be[4] = be[3];
      be[3] = be[2];
      be[2] = i;
    }
    else if (e.score > bs[3]){
      bs[4] = bs[3];
      bs[3] = e.score;
      be[4] = be[3];
      be[3] = i;
    }
    else if (e.score > bs[4]){
      bs[4] = e.score;
      be[4] = i;
    }
  }
}

Rect I got tired of coding the position over and over

class Rect{
  PVector pos;
  
  Rect(int n){
    int jj = 0;
    while(n > 24){
      n -= 25;
      jj += 1;
    }
    pos = new PVector(n * s, jj * s);
  }// rect
  
}

Wall

class Wall extends Rect{
  int num = 0;
  boolean check = false;
  boolean clicked = false;
  
  Wall(int n){
    super(n);
    num = n;
  }// wall
  
  void show(){
    fill(255);
    if (check){fill(255,255,0);}
    rect(pos.x,pos.y,s,s);
    fill(0);
    textSize(10);
    text(num, pos.x + 2, pos.y + 12);
    if (clicked){
      fill(255,0,0);
      ellipse(pos.x+12,pos.y+12,12,12);
    }
  }// show
  
  void mouseCheck(){
    if (mouseX > pos.x && mouseX < pos.x + s && 
        mouseY > pos.y && mouseY < pos.y + s){
          check = true;
        }
        else {check = false;}
  }
}

end point, the way to win (or it’s supposed to be)

class endPoint extends Rect{
  int num;
  
  endPoint(int n){
    super(n);
    num = n;
  }
  
  void show(){
    fill(255,255,0);
    rect(pos.x,pos.y,s,s);
    fill(0);
    text("E",pos.x+5,pos.y+20);
  }
}

NN

class NeuralNet {

  int iNodes;//No. of input nodes
  int hNodes;//No. of hidden nodes
  int oNodes;//No. of output nodes

  Matrix whi;//matrix containing weights between the input nodes and the hidden nodes
  Matrix whh;//matrix containing weights between the hidden nodes and the second layer hidden nodes
  Matrix woh;//matrix containing weights between the second hidden layer nodes and the output nodes
//---------------------------------------------------------------------------------------------------------------------------------------------------------  

  //constructor
  NeuralNet(int inputs, int hiddenNo, int outputNo) {

    //set dimensions from parameters
    iNodes = inputs;
    oNodes = outputNo;
    hNodes = hiddenNo;


    //create first layer weights 
    //included bias weight
    whi = new Matrix(hNodes, iNodes +1);

    //create second layer weights
    //include bias weight
    whh = new Matrix(hNodes, hNodes +1);

    //create third layer (output) weights
    //include bias weight
    woh = new Matrix(oNodes, hNodes +1);  

    //set the matrices to random values
    whi.randomize();
    whh.randomize();
    woh.randomize();
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  

  //mutation function for genetic algorithm
  void mutate(float mr) {
    //mutates each weight matrix
    whi.mutate(mr);
    whh.mutate(mr);
    woh.mutate(mr);
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //calculate the output values by feeding forward through the neural network
  float[] output(float[] inputsArr) {

    //convert array to matrix
    //Note woh has nothing to do with it its just a function in the Matrix class
    Matrix inputs = woh.singleColumnMatrixFromArray(inputsArr);

    //add bias 
    Matrix inputsBias = inputs.addBias();


    //-----------------------calculate the guessed output

    //apply layer one weights to the inputs
    Matrix hiddenInputs = whi.dot(inputsBias);

    //pass through activation function(sigmoid)
    Matrix hiddenOutputs = hiddenInputs.activate();

    //add bias
    Matrix hiddenOutputsBias = hiddenOutputs.addBias();

    //apply layer two weights
    Matrix hiddenInputs2 = whh.dot(hiddenOutputsBias);
    Matrix hiddenOutputs2 = hiddenInputs2.activate();
    Matrix hiddenOutputsBias2 = hiddenOutputs2.addBias();

    //apply level three weights
    Matrix outputInputs = woh.dot(hiddenOutputsBias2);
    //pass through activation function(sigmoid)
    Matrix outputs = outputInputs.activate();

    //convert to an array and return
    return outputs.toArray();
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //crossover function for genetic algorithm
  NeuralNet crossover(NeuralNet partner) {

    //creates a new child with layer matrices from both parents
    NeuralNet child = new NeuralNet(iNodes, hNodes, oNodes);
    child.whi = whi.crossover(partner.whi);
    child.whh = whh.crossover(partner.whh);
    child.woh = woh.crossover(partner.woh);
    return child;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //return a neural net which is a clone of this Neural net
  NeuralNet clone() {
    NeuralNet clone  = new NeuralNet(iNodes, hNodes, oNodes); 
    clone.whi = whi.clone();
    clone.whh = whh.clone();
    clone.woh = woh.clone();

    return clone;
  }
//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //converts the weights matrices to a single table 
  //used for storing the snakes brain in a file
  Table NetToTable() {

    //create table
    Table t = new Table();


    //convert the matrices to arrays
    float[] whiArr = whi.toArray();
    float[] whhArr = whh.toArray();
    float[] wohArr = woh.toArray();

    //set the amount of columns in the table
    for (int i = 0; i< max(whiArr.length, whhArr.length, wohArr.length); i++) {
      t.addColumn();
    }

    //set the first row as whi
    TableRow tr = t.addRow();

    for (int i = 0; i< whiArr.length; i++) {
      tr.setFloat(i, whiArr[i]);
    }


    //set the second row as whh
    tr = t.addRow();

    for (int i = 0; i< whhArr.length; i++) {
      tr.setFloat(i, whhArr[i]);
    }

    //set the third row as woh
    tr = t.addRow();

    for (int i = 0; i< wohArr.length; i++) {
      tr.setFloat(i, wohArr[i]);
    }

    //return table
    return t;
  }

//---------------------------------------------------------------------------------------------------------------------------------------------------------  
  //takes in table as parameter and overwrites the matrices data for this neural network
  //used to load snakes from file
  void TableToNet(Table t) {

    //create arrays to temporarily store the data for each matrix
    float[] whiArr = new float[whi.rows * whi.cols];
    float[] whhArr = new float[whh.rows * whh.cols];
    float[] wohArr = new float[woh.rows * woh.cols];

    //set the whi array as the first row of the table
    TableRow tr = t.getRow(0);

    for (int i = 0; i< whiArr.length; i++) {
      whiArr[i] = tr.getFloat(i);
    }


    //set the whh array as the second row of the table
    tr = t.getRow(1);

    for (int i = 0; i< whhArr.length; i++) {
      whhArr[i] = tr.getFloat(i);
    }

    //set the woh array as the third row of the table

    tr = t.getRow(2);

    for (int i = 0; i< wohArr.length; i++) {
      wohArr[i] = tr.getFloat(i);
    }


    //convert the arrays to matrices and set them as the layer matrices 
    whi.fromArray(whiArr);
    whh.fromArray(whhArr);
    woh.fromArray(wohArr);
  }
}

I’ll show the same thing, but in my try at reinforcement learning, in a second.

ctremblay · May 31, 2021, 3:25pm

This is my try at reinforcement learning

wall.csv gotta have my path

197,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,49,50,74,75,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,124,125,149,150,174,175,199,200,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,249,250,274,275,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,324,325,349,350,374,375,399,400,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,449,450,474,475,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,524,525,549,550,574,575,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624

sketch

Table enemyThink,wallTable;
TableRow tr;
int s = 25;
ArrayList <Wall> wall = new ArrayList<Wall>();
Enemy e;
Finish f;

void setup(){
  size(625,625);
  e = new Enemy(52);
  f = new Finish(121);
  //enemyThink = loadTable("enemy.csv","header");
  enemyThink = new Table();
  enemyThink.addColumn("Right");
  enemyThink.addColumn("Left");
  enemyThink.addColumn("Up");
  enemyThink.addColumn("Down");
  for (int i = 0; i < 125; i++){
    tr = enemyThink.addRow();
    for (int j = 0; j < 4; j++){
      tr.setFloat(j,0);
    }
  }
  saveTable(enemyThink,"enemy.csv");
  wallTable = loadTable("wall.csv");
  tr = wallTable.getRow(0);
  for (int i = 1; i < wallTable.getColumnCount(); i++){
    int tn = tr.getInt(i);
    wall.add(new Wall(tn));
  }
}

void draw(){
  background(0);
  for (Wall w: wall){
    w.show();
  }
  e.show();
  e.think();
  if (e.death == true || e.win == true){
    e.pos = new PVector(2 * s, 2 * s);
    e.state = 52;
    e.death = false;
  }
  f.show();
  text(e.state,200,50);
}

Enemy where all my Q learning is

class Enemy{
  int j = 0;
  int jj =0;
  int num = 0;
  float learning_rate = 0.1;
  PVector pos;
  int state = 0;
  float explore = 1;
  float minexplore = 0.01;
  float decayRate = 0.001;
  float maxSteps = 99;
  int currentMove = 0;
  ///////////////////////////////////////////////
  boolean death = false;
  boolean win = false;
  int a = 0;
  String movement = " ";
  
  Enemy(int n){
    state = n;
    j = n;
    num = 0;
    while(j > 24){
      jj += 1;
      j -= 25;
    }
    pos = new PVector(j * s, jj * s);
  }// Enemy
  
  void show(){
    fill(255,0,0);
    rect(pos.x,pos.y,s,s);
  }
  
  void think(){
    if (currentMove < maxSteps){
      tr = enemyThink.getRow(state);
      float r = random(1);
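      if (r > explore){
        // exploit: pick the action with the largest table value
        float ln = tr.getFloat(0);
        a = 0;
        for (int j = 1; j < 4; j++){
          float tn = tr.getFloat(j);
          if (tn > ln){
            ln = tn;
            a = j;
          }// n > large number
        }// j
      }// r > explore
      if (r <= explore){
        // explore: int(random(4)) gives 0..3, so "Down" can be chosen too
        a = int(random(4));
      }// do random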
      switch(a){
        case 0:
          movement = "Right";
        break;
        case 1:
          movement = "Left";
        break;
        case 2:
          movement = "Up";
        break;
        case 3:
          movement = "Down";
        break;
      }
      move();
      check();
      float t = tr.getFloat(a);
      tr.setFloat(a, t - learning_rate);
      saveTable(enemyThink,"enemy.csv");
      explore = max(minexplore, explore - decayRate);
      currentMove += 1;
    } else if (currentMove == maxSteps){
      currentMove = 0;
    }
  }// think
  
  void move(){
    // compare Strings with equals(); == only compares object references
    if (movement.equals("Right")){
      pos.x += s;
      state += 1;
    }
    if (movement.equals("Left")){
      pos.x -= s;
      state -= 1;
    }
    if (movement.equals("Up")){
      pos.y -= s;
      state -= 25;
    }
    if (movement.equals("Down")){
      pos.y += s;
      state += 25;
    }
  }
  
  void check(){
    death = false;
    win = false;
    for (int i = 0; i < wall.size(); i++){
      Wall w = wall.get(i);
      if (w.pos.x == pos.x && w.pos.y == pos.y){
        death = true;
      }
    }
    if (f.pos.x <= pos.x && f.pos.x + (3 * s) > pos.x && f.pos.y == pos.y){
      win = true;
    }
  }// check
}
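
For comparison with the `t - learning_rate` line in think() above, here is a minimal sketch of the usual tabular Q-learning update in plain Java. The reward values (+1 for the finish, -1 for a wall, a small step cost otherwise), the discount factor and the name `QTableSketch` are assumptions for illustration, not something taken from the guide I copied.

```java
// Hypothetical sketch of a tabular Q-learning update; names and reward values are assumptions.
class QTableSketch {
  final float[][] q;         // q[state][action]
  final float learningRate;  // alpha
  final float discount;      // gamma

  QTableSketch(int states, int actions, float learningRate, float discount) {
    this.q = new float[states][actions];
    this.learningRate = learningRate;
    this.discount = discount;
  }

  // Assumed reward scheme: this is the "win or lose" part.
  static float reward(boolean win, boolean death) {
    if (win) return 1.0f;
    if (death) return -1.0f;
    return -0.01f; // small cost per step so shorter paths score higher
  }

  // Largest table value in one row.
  float maxQ(int state) {
    float best = q[state][0];
    for (int a = 1; a < q[state].length; a++) {
      best = Math.max(best, q[state][a]);
    }
    return best;
  }

  // Q(s,a) += alpha * (r + gamma * max Q(s', .) - Q(s,a)); no future term once the episode ended.
  void update(int state, int action, float r, int nextState, boolean episodeOver) {
    float future = episodeOver ? 0.0f : discount * maxQ(nextState);
    q[state][action] += learningRate * (r + future - q[state][action]);
  }
}
```

In the sketch above this would mean remembering the state before move(), calling check(), and then updating the old state's row with the reward and the new state, instead of always subtracting the learning rate.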

wall

class Wall{
  int j = 0;
  int jj = 0;
  int num = 0;
  PVector pos;
  
  Wall(int n){
    num = n;
    j = n;
    while(j > 24){
      jj += 1;
      j -= 25;
    }
    pos = new PVector(j * s, jj * s);
  }// wall
  
  void show(){
    fill(255);
    rect(pos.x,pos.y,s,s);
    fill(0);
    textSize(10);
    text(num,pos.x + 2, pos.y + 20);
  }
}

finish reach it to win

class Finish{
  PVector pos;
  int j , jj, num;
  
  Finish(int n){
    j = n;
    jj = 0;
    num = n;
    while (j > 24){
      jj += 1;
      j -= 25;
    }
    pos = new PVector(j * s,jj * s);
  }
  
  void show(){
    fill(0,255,0);
    rect(pos.x,pos.y,3*s,s);
  }
}

So how do I just give this code the distances instead of checking what block it’s in?
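
One possible direction, sketched in plain Java: a Q-table needs a row number, so the four wall distances (like rd, ld, ud, dd in my first sketch) could be bucketed into a few bins each and combined into one index. The class name `DistanceState`, the choice of 4 bins and the cap at "3 or more free cells" are assumptions for this example, not something from a guide.

```java
// Hypothetical encoder: turns four wall distances (in pixels) into one Q-table row index.
class DistanceState {
  static final int CELL = 25; // cell size in pixels, the `s` of the sketch
  static final int BINS = 4;  // assumed buckets: 0, 1, 2, or 3+ free cells

  // Bucket one distance: number of whole free cells, capped at BINS - 1.
  static int bin(int distancePx) {
    int cells = Math.max(0, distancePx / CELL);
    return Math.min(cells, BINS - 1);
  }

  // Mixed-radix encoding: each direction contributes one base-BINS digit.
  static int encode(int right, int left, int up, int down) {
    return ((bin(right) * BINS + bin(left)) * BINS + bin(up)) * BINS + bin(down);
  }

  // Number of rows the Q-table needs: BINS^4.
  static int stateCount() {
    return BINS * BINS * BINS * BINS;
  }
}
```

With this, the table would get `DistanceState.stateCount()` rows instead of one row per block, and the row would be `DistanceState.encode(rd, ld, ud, dd)`. One caveat I can see: two different spots in the maze with the same distances would share a row, so some extra input (for example a bucketed offset to the finish) is probably needed as well.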

Chrisir · May 31, 2021, 3:29pm

I don’t have time at the moment.

What is the AI learning please?

ctremblay · May 31, 2021, 3:31pm

It’s supposed to be an enemy in my tower defense, so it needs to make it to the finish line so the game will end. In simple terms, that is what I want, but at the moment I just want it to get to the finish line. I thought coding enemies was getting boring, so I’ll just give them an AI so I have more exciting enemies and learn AI.
