How I Tackled Multithreading for a Performance Gain

Hi everyone, I wanted to share how I managed to speed up my code considerably with multithreading. Hopefully someone can use it to their benefit.
It turns out that you can call a function from different concurrent threads even if the threads work on the same array. As long as no two threads write to the same array index, all goes fine.
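To make the "different indices" rule concrete, here is a minimal sketch in plain Java (not a Processing sketch, and not the code from this post). The class and method names are made up for illustration. Two threads fill the two halves of one shared array, so no index is ever written by both.

```java
// Plain-Java illustration; DisjointWrites and its methods are hypothetical names.
class DisjointWrites {
  // Fills data[start..end) with the square of each index.
  static void fillSquares(long[] data, int start, int end) {
    for (int i = start; i < end; i++) {
      data[i] = (long) i * i;
    }
  }

  // Lets two threads fill the lower and upper half of the same array.
  static long[] fillWithTwoThreads(int size) throws InterruptedException {
    long[] data = new long[size];
    int mid = size / 2;
    Thread lower = new Thread(() -> fillSquares(data, 0, mid));
    Thread upper = new Thread(() -> fillSquares(data, mid, size));
    lower.start();
    upper.start();
    // join() blocks until the thread has ended; its writes are visible afterwards.
    lower.join();
    upper.join();
    return data;
  }

  public static void main(String[] args) throws InterruptedException {
    long[] data = fillWithTwoThreads(1_000);
    System.out.println("first: " + data[0] + ", last: " + data[data.length - 1]);
  }
}
```

Because the two index ranges never overlap, the result is the same as filling the whole array from a single thread.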

The code below fills two arrays with a billion values each, one with sines and one with cosines. Calculation time goes down from about 19 seconds (single thread) to about 2.6 seconds (16 threads) on a mid-range laptop. I am a bit of a speed freak and was quite happy with the performance gain.
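For anyone who wants to see the gain as a ratio, here is a small plain-Java sketch that turns the timings from the comment table in the sketch into speedup and efficiency figures. The class and method names are made up; the numbers are copied from that table.

```java
// Plain-Java illustration; SpeedupTable and its methods are hypothetical names.
class SpeedupTable {
  // Thread counts and timings (msec) copied from the comment table in the sketch.
  static final int[] THREADS = {1, 2, 4, 8, 14, 16};
  static final int[] MILLIS = {19055, 11297, 6414, 3676, 2747, 2653};

  // Speedup relative to the single-thread run (index 0).
  static double speedup(int index) {
    return (double) MILLIS[0] / MILLIS[index];
  }

  // Efficiency: speedup divided by the thread count (1.0 would be ideal scaling).
  static double efficiency(int index) {
    return speedup(index) / THREADS[index];
  }

  public static void main(String[] args) {
    for (int i = 0; i < THREADS.length; i++) {
      System.out.printf("%2d threads: speedup %.2f, efficiency %.2f%n",
          THREADS[i], speedup(i), efficiency(i));
    }
  }
}
```

With these numbers the 16-thread run is roughly 7 times faster than the single-thread run, so each extra thread adds less as the count grows.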
Each thread calls the same function but with different start and end values so that they all fill a different part of the arrays.
The code first starts the threads and then waits until they all have finished.
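The same start-then-wait pattern can also be written with Thread.join(), which blocks until a thread has ended, instead of checking a flag array. This is a minimal plain-Java sketch under that assumption, not the code used in this post; `ChunkedFill` and `sinTable` are made-up names, and the ranges are computed at runtime rather than pasted in as generated functions.

```java
// Plain-Java illustration; ChunkedFill and sinTable are hypothetical names.
class ChunkedFill {
  // Fills a sine table of the given size using numThreads worker threads.
  static float[] sinTable(int size, int numThreads) throws InterruptedException {
    float[] table = new float[size];
    Thread[] workers = new Thread[numThreads];
    for (int t = 0; t < numThreads; t++) {
      // long arithmetic keeps the chunk boundaries exact, even for very large sizes
      final int start = (int) ((long) t * size / numThreads);
      final int end = (int) ((long) (t + 1) * size / numThreads);
      workers[t] = new Thread(() -> {
        for (int i = start; i < end; i++) {
          table[i] = (float) Math.sin(2 * Math.PI * i / size);
        }
      });
      workers[t].start();
    }
    // Wait for every worker; after join() returns, its writes are visible here.
    for (Thread w : workers) {
      w.join();
    }
    return table;
  }

  public static void main(String[] args) throws InterruptedException {
    float[] table = sinTable(1_000_000, 4);
    System.out.println("quarter turn: " + table[250_000]);
  }
}
```

The waiting thread sleeps inside join() until the workers are done, so it does not compete with them for CPU time.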
I also included the function printThreads in case you want to change the number of threads or the size of the arrays. To do that, uncomment the call to it in the setup function and run the program with a different value for numThreads or arraySize. Then replace the thread functions at the bottom of the sketch with the printed output and comment out the printThreads call again. I hope you like it and someone can put it to good use.
Let me know what you think.
Cheers, Adrian.

// Calculation time with 16 threads :  2653 msec
// Calculation time with 14 threads :  2747 msec
// Calculation time with  8 threads :  3676 msec
// Calculation time with  4 threads :  6414 msec
// Calculation time with  2 threads : 11297 msec
// Calculation time with  1 thread  : 19055 msec

int numThreads = 4; // number of threads
boolean[] hasFinished = new boolean[numThreads];
int arraySize = 1_000_000_000;

float[] Sin = new float[arraySize];
float[] Cos = new float[arraySize];

void setup() {
  surface.setVisible(false); // avoids the canvas popping up
  int t0 = millis();
  //printThreads(); // prints the threads, to be pasted at the bottom
  startThreads();
  int t1 = millis();
  println("Calculation Time : with " + numThreads + " threads : " + (t1-t0) + " msec.");
}

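// Fills Sin and Cos for the indices start (inclusive) to end (exclusive),
// then marks this thread as finished.
void calculateTrigonometryArray(int start, int end, int threadNumber) {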
  float step = 1.0 / arraySize;
  for ( int alfa = start; alfa < end; alfa++) {
    Sin[alfa] = sin(2 * PI * alfa * step);
    Cos[alfa] = cos(2 * PI * alfa * step);
  }
  hasFinished[threadNumber] = true;
}

void startThreads() {
  for (int i = 0; i < numThreads; i++) {
    hasFinished[i] = false;
  }
  // start threads
  for (int i = 0; i < numThreads; i++) {
    thread("thread"+i);
  }
  // wait until all threads have finished
  boolean threadsReady = false;
  while (threadsReady == false) {
    threadsReady = true;
    for (int i = 0; i < numThreads; i++) {
      if (hasFinished[i] == false) threadsReady = false;
    }
    delay(1); // sleep briefly so this loop does not occupy a core that a worker thread could use
  }
}

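void printThreads() {
  // Boundaries are computed with long arithmetic: a float has only about 7
  // significant digits, so near a billion a rounded boundary can miss arraySize.
  for (int i = 0; i < numThreads; i++) {
    long start = (long)i * arraySize / numThreads;
    long end = (long)(i+1) * arraySize / numThreads;
    print("void thread" + i + "(){");
    print("calculateTrigonometryArray(" + start + ", " + end + ", " + i + ");");
    println("}");
  }
}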
void thread0(){calculateTrigonometryArray(0, 250000000, 0);}
void thread1(){calculateTrigonometryArray(250000000, 500000000, 1);}
void thread2(){calculateTrigonometryArray(500000000, 750000000, 2);}
void thread3(){calculateTrigonometryArray(750000000, 1000000000, 3);}