# Detecting large data-sets

Could you guide me on how to start implementing this in Processing? I have attached the link below: https://youtu.be/EZgiYYxUT74

This is a force-directed graph, which is an important keyword when you start searching for resources. For instance, check:

https://forum.processing.org/one/topic/force-directed-graph-in-toxiclibs.html
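
For orientation, a force-directed layout usually comes down to two forces applied over and over: every pair of nodes repels, and every edge pulls its two nodes together like a spring. Below is a minimal sketch of one such step in plain Java (no Processing or toxiclibs calls). The names `ForceLayoutSketch`, `step`, `restLength`, `springK` and `repelK` are my own assumptions for illustration and do not come from the linked thread or from any library.

```java
// Minimal 2D force-directed layout step. Illustrative only:
// all names and constants are assumptions, not a library API.
class ForceLayoutSketch {
  // Advance the layout by one step.
  // pos[i] = {x, y} of node i; edges[k] = {a, b} are indices of linked nodes.
  static void step(float[][] pos, int[][] edges,
                   float restLength, float springK, float repelK) {
    int n = pos.length;
    float[][] force = new float[n][2];

    // Repulsion: every pair of nodes pushes apart (inverse-square falloff).
    for (int i = 0; i < n; i++) {
      for (int j = i + 1; j < n; j++) {
        float dx = pos[i][0] - pos[j][0];
        float dy = pos[i][1] - pos[j][1];
        float d = Math.max((float) Math.sqrt(dx * dx + dy * dy), 0.01f);
        float f = repelK / (d * d);
        force[i][0] += f * dx / d;
        force[i][1] += f * dy / d;
        force[j][0] -= f * dx / d;
        force[j][1] -= f * dy / d;
      }
    }

    // Attraction: each edge acts as a spring toward restLength (Hooke's law).
    for (int[] e : edges) {
      float dx = pos[e[1]][0] - pos[e[0]][0];
      float dy = pos[e[1]][1] - pos[e[0]][1];
      float d = Math.max((float) Math.sqrt(dx * dx + dy * dy), 0.01f);
      float f = springK * (d - restLength);
      force[e[0]][0] += f * dx / d;
      force[e[0]][1] += f * dy / d;
      force[e[1]][0] -= f * dx / d;
      force[e[1]][1] -= f * dy / d;
    }

    // Move each node along its net force.
    for (int i = 0; i < n; i++) {
      pos[i][0] += force[i][0];
      pos[i][1] += force[i][1];
    }
  }
}
```

In a Processing sketch you would call something like `step(...)` once per `draw()` frame and then draw the nodes and edges at their new positions. A physics library such as toxiclibs does the same job with more stable integration, so treat this only as a way to see the idea.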

Kf


Thank you for the guidance and support. I have a question: why can’t the force-directed graph run in Processing 3?

This is just a resource where you could start. Your statement:

why force directed graph cannot execute in processing 3

You tried to execute it online and it didn’t work? If you are referring to the first link, what I would do is create a new sketch and load all the code there. Check whether any external resources are required. I am guessing you need a data set, although they should provide one. The first challenge is to get it running. It would then serve as a demonstration and as code you can use as a guide when designing your own. I am not sure I am answering your question, as your question is not clear.

Kf

You should also explain what you mean in your title by “detecting” large data-sets.

A graph doesn’t detect large data, it visualizes large data.

Right now I have the data set in a CSV file, which I read in Processing 3 using the `loadStrings` and `split` functions. Next I need to feed it into a force-directed graph like the one in https://youtu.be/EZgiYYxUT74, and that is where I’m stuck.

Detecting large data sets: identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R²) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
As I saw in the YouTube video, it shows detecting large data.
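
The abstract compares MIC to the coefficient of determination, so for reference here is a minimal sketch of how R² is computed for observed values against the values predicted by a regression function. This is not MIC itself (MIC needs a grid search over mutual information), and the class and method names are assumptions made up for this example.

```java
// Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
// Illustrative helper; the names are not from the MINE software.
class RSquaredSketch {
  // y = observed values, f = values predicted by the regression function.
  static double rSquared(double[] y, double[] f) {
    double mean = 0;
    for (double v : y) mean += v;
    mean /= y.length;

    double ssRes = 0; // squared error left over after the fit
    double ssTot = 0; // squared spread of the data around its mean
    for (int i = 0; i < y.length; i++) {
      ssRes += (y[i] - f[i]) * (y[i] - f[i]);
      ssTot += (y[i] - mean) * (y[i] - mean);
    }
    return 1.0 - ssRes / ssTot;
  }
}
```

A perfect fit gives 1, and a fit that is no better than always predicting the mean gives 0.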

Thanks a lot, but I have already written the `split` part in Processing:

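``````
void setup() {
  size(600, 600);
  // Load the CSV file (one String per line) from the sketch's data folder.
  // Spellman.csv is the data set named in this thread.
  String[] data = loadStrings("Spellman.csv");

  println("There are " + data.length + " lines of data.");
  for (int i = 0; i < data.length; i++) {
    println("Now looking at line -> " + i + ", which is " + data[i]);

    // Split the line into its comma-separated fields.
    String[] splitNums = split(data[i], ',');

    for (int j = 0; j < splitNums.length; j++) {
      println("" + i + "," + j + " => " + splitNums[j]);
    }

    // Parse the first two fields once per line (needs at least two columns).
    if (splitNums.length >= 2) {
      float a = float(splitNums[0]);
      float b = float(splitNums[1]);
    }
  }
}
``````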

For the next part I need to write the force-directed graph code in Processing 3, so that the data set from the Spellman.csv file is loaded into the force-directed graph. That is my idea, but I’m not sure whether it can work. Any ideas from your side?
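
One way to get from CSV lines to something a force-directed graph can use is to turn each row into an edge between two named nodes. The sketch below is plain Java and assumes a hypothetical row layout of `nameA,nameB,strength`; the real column layout of Spellman.csv may differ (the code in this thread reads the first two columns as numbers), so the column indices, the class name `GraphFromRows` and the `minStrength` cut-off are all assumptions to adjust.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Builds a node index and an edge list from CSV-like rows.
// Assumed (hypothetical) row layout: nameA,nameB,strength
class GraphFromRows {
  final Map<String, Integer> nodeIndex = new LinkedHashMap<>();
  final List<int[]> edges = new ArrayList<>();     // {indexA, indexB}
  final List<Float> strengths = new ArrayList<>(); // one value per edge

  // Return the index of a node name, registering it on first use.
  private int indexOf(String name) {
    Integer idx = nodeIndex.get(name);
    if (idx == null) {
      idx = nodeIndex.size();
      nodeIndex.put(name, idx);
    }
    return idx;
  }

  // Keep only rows whose strength is at least minStrength.
  // Nodes that appear only in weaker rows are not registered.
  static GraphFromRows parse(String[] rows, float minStrength) {
    GraphFromRows g = new GraphFromRows();
    for (String row : rows) {
      String[] cols = row.split(",");
      if (cols.length < 3) continue; // too few columns
      float s;
      try {
        s = Float.parseFloat(cols[2].trim());
      } catch (NumberFormatException e) {
        continue; // header line or malformed number
      }
      if (s < minStrength) continue;
      int a = g.indexOf(cols[0].trim());
      int b = g.indexOf(cols[1].trim());
      g.edges.add(new int[] { a, b });
      g.strengths.add(s);
    }
    return g;
  }
}
```

In a sketch, the array returned by `loadStrings` could be passed as `rows`; you would then create one ball per entry in `nodeIndex` and draw lines only for the pairs in `edges`, instead of for every pair that happens to be close together.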

You can implement this in Processing. The challenge is how to do it. For instance, the definition above is not sufficient to implement this. If you are implementing this from scratch, you need to have these association concepts clear. Then there is a design phase before implementation. Do you have a test data set, clear concepts, and a design ready to go?

Check this video and the description of it right below: https://www.youtube.com/watch?v=dlnozul7ek0

Kf

That paper is called “Detecting Novel Associations in Large Data Sets.” You are detecting associations; you aren’t detecting the large data set. To detect is “to identify that it exists.”

1. Detecting stolen cars in large cities: yes, they are hard to find.
2. Detecting large cities: no, nobody needs to detect them, they are just there.

Do you need to use Processing (Java), or could you use p5.js (JavaScript)?

In Processing (Java) you can do this with the toxiclibs library. Dan Shiffman walks through examples in Ch5 of The Nature of Code.

Read the book, and check out the example here:

I am using Processing (Java). I tried to make something like this code, but it is not exactly what is shown in https://youtu.be/EZgiYYxUT74:

``````
import toxi.geom.*;

//DECLARE - store all the balls/global variable
ArrayList ballCollection;

// Setup the Processing Canvas
void setup() {
size(600,600);
smooth();
String [] data; //d

//INITIALIZE
ballCollection=new ArrayList();

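for(int i = 0; i < 100; i++) {
Vec3D origin = new Vec3D (random(width),random(height), 0);
Ball myBall = new Ball(origin);
// Store the ball; otherwise the collection stays empty and draw() has nothing to run.
ballCollection.add(myBall);
}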
}

// Main draw loop
void draw() {
background(0);

//CALL FUNCTIONALITY

for(int i = 0; i < ballCollection.size(); i++){
Ball mb = (Ball) ballCollection.get(i);
mb.run();

}
}

class Ball{
// GLOBAL VARIABLES - LOCATION SPEED
Vec3D loc = new Vec3D (0,0,0);
Vec3D speed = new Vec3D(random(-2,2),random(-2,2),0);
Vec3D acc = new Vec3D();
Vec3D grav = new Vec3D(0,0.2,0);

//CONSTRUCTOR - HOW DO YOU BUILD THE CLASS - GIVE VARIABLES A VALUE
Ball(Vec3D _loc){
loc = _loc;
}

//FUNCTIONS - BREAK DOWN COMPLEX BEHAVIOUR INTO DIFFERENT MODULES

void run(){
display();
move();
bounce();
//gravity();

//Create a line between the balls
lineBetween();
//flock = complex behaviour. Craig Reynolds Boids
//flock();

}

/*
void flock(){

//3 functions of flock : based on vector maths
separate(5);
cohesion(0.001);
align(1);

}

void align(float magnitude){
Vec3D steer = new Vec3D();
int count = 0;

for(int i = 0; i < ballCollection.size();i++) {
Ball other = (Ball) ballCollection.get(i);

float distance = loc.distanceTo(other.loc);

if(distance > 0 && distance < 40) {

count ++;

}
}

if(count > 0){
steer.scaleSelf(1.0/count);
}
steer.scaleSelf(magnitude);

}

//cohesion =opposite of seperate - keep together -

void cohesion(float magnitude){

Vec3D sum = new Vec3D();
int count = 0;

for(int i = 0; i < ballCollection.size();i++) {
Ball other = (Ball) ballCollection.get(i);

float distance = loc.distanceTo(other.loc);

if(distance > 0 && distance < 40) {

count++;
}
}
if (count > 0){
sum.scaleSelf(1.0/count);
}

Vec3D steer = sum.sub(loc);

steer.scaleSelf(magnitude);

}

void separate(float magnitude){

Vec3D steer = new Vec3D();
int count = 0;

for(int i = 0; i < ballCollection.size();i++){

Ball other = (Ball) ballCollection.get(i);

float distance = loc.distanceTo(other.loc);
if(distance > 0 && distance <  30){

//move away from another ball // calculate a vector of difference

Vec3D diff = loc.sub(other.loc);
//increases smoothness of the steer
diff.normalizeTo(1.0/distance);

count++;

}

}
if (count > 0){
steer.scaleSelf(1.0/count);

}
steer.scaleSelf(magnitude);

}
*/
void lineBetween(){
//BallCollection
for(int i = 0; i < ballCollection.size();i++){

Ball other = (Ball) ballCollection.get(i);

float distance = loc.distanceTo(other.loc);
if(distance > 0 && distance <  100){
stroke(255,0,0);
strokeWeight(0.4);
line(loc.x,loc.y,other.loc.x,other.loc.y);
}
}
}

void gravity(){
}

void bounce (){
if (loc.x > width){
speed.x = speed.x * -1;
}
if (loc.x < 0){
speed.x = speed.x * -1;
}
if (loc.y > height){
speed.y = speed.y * -1;
}

if (loc.y < 0){
speed.y = speed.y * -1;
}

}

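void move() {
//steering behaviours need acceleration: it stores the movements for the ball
speed.addSelf(acc);
speed.limit(2);
// Advance the position; without this the balls never move.
loc.addSelf(speed);
acc.clear();

}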
void display(){
stroke(0);
ellipse(loc.x,loc.y,5,5);
}
}
``````

And here is the other piece of code, written separately, which loads the data set and uses the `split` function:

``````
import toxi.geom.*;

void setup() {
  size(600, 600);
  smooth();
  // Load the CSV file (one String per line) from the sketch's data folder.
  // Spellman.csv is the data set named in this thread.
  String[] data = loadStrings("Spellman.csv");

  println("There are " + data.length + " lines of data.");
  for (int i = 0; i < data.length; i++) {

    // Tell which line we are looking at now.
    println("Now looking at line -> " + i + ", which is " + data[i]);

    // Parse this line using split().
    String[] splitNums = split(data[i], ',');

    // Show every number we found on this line.
    for (int j = 0; j < splitNums.length; j++) {
      println("" + i + "," + j + " => " + splitNums[j]);
    }

    // a equals MICe (strength) and b equals TICe (presence of relationship).
    // Parsed once per line; needs at least two columns.
    if (splitNums.length >= 2) {
      float a = float(splitNums[0]);
      float b = float(splitNums[1]);
    }
  }
}
``````

I don’t know how to merge the two pieces of code. Can you help me?

Are you trying to create a 2D or 3D force directed graph?

2D or 3D doesn’t matter, but I’m trying it in 3D.