Creating and deleting files

Th3cG · April 18, 2021, 8:49am

I’m trying to make a little app for personal use that gets weather data from a website. To do this, I need a workaround to get past a 403 error; with it I’m able to create three files (one for each day whose forecast I want to analyze). Then a function analyzes these files and puts the data into three tables. Here is the code I wrote:


//-----------------------Importing libraries and other stuff-----------------------

// importing the libraries that htmlWorkaround() needs
import java.net.*;
import java.io.*;

//creating Tables
Table table0;
Table table1;
Table table2;

Table[] table = {table0, table1, table2};

//defining variables to be used in the tables
int time;
String weatherCondition;
int speedMin;
int speedMax;
String direction;


//-----------------------SETUP-----------------------
void setup() {

  //setting up the Tables
  for (int i = 0; i < 3; i++) {
    table[i] = new Table();

    table[i].addColumn("time");
    table[i].addColumn("weatherCondition");
    table[i].addColumn("speedMin");
    table[i].addColumn("speedMax");
    table[i].addColumn("direction");
  }

  //calling htmlWorkaround() to generate files to be read 
  htmlWorkaround();

  //calling loadData() to analyze the files
  loadData();
}


//-----------------------DRAW-----------------------
void draw() {

}


//-----------------------htmlWorkaround() >> avoids 403 error-----------------------
void htmlWorkaround() {
  String[] weatherURLs = {"WEBSITE-URL-0", "WEBSITE-URL-1", "WEBSITE-URL-2"};

  for (int i = 0; i < weatherURLs.length; i++) {
    URL url;

    //Create a writer to print the console out
    PrintWriter output = createWriter("weatherText" + i +".txt");

    //
    try {
      // Create a URL object
      url = new URL(weatherURLs[i]);
      URLConnection conn = url.openConnection();
      conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
      // Read all of the text returned by the HTTP server
      BufferedReader in = new BufferedReader
        (new InputStreamReader(conn.getInputStream(), "UTF-8"));

      String htmlText;

      while ( (htmlText = in.readLine ()) != null) {
        output.println(htmlText);
      }
      in.close();
    } 

    //Should be called once the website throws 403 back, case 1
    catch (MalformedURLException e) {
      e.printStackTrace(output);
      System.out.println(output.toString());
    } 

    //Should be called once the website throws 403 back, case 2
    catch (IOException e) {
      e.printStackTrace(output);
      System.out.println(output.toString());
    }

    //Clearing and closing after the job is done 
    finally {
      output.flush(); 
      output.close(); 
      exit();
    }
  }
}


//-----------------------loadData()-----------------------
void loadData() {
  for (int j = 0; j < 3; j++) {
    //loading htmlWorkaround() output as an array of string
    String[] testoDaAnalizzare = loadStrings("weatherText" + j + ".txt");

    //joining the array to get a String
    String testoDaAnalizzareUnito = join(testoDaAnalizzare, "");

    //cleaning
    String testoDaAnalizzareUnitoRipulito = testoCompreso(testoDaAnalizzareUnito, "<ul class=\"mb-24\">", "</ul>");

    //making another array to get infos related to each hour
    String[] slotOrari = split(testoDaAnalizzareUnitoRipulito, "</li>");

    //hour (time), weather condition (weatherCondition), 
    //wind speed min and max (speedMin, speedMax), wind direction (direction)
    for (int i = 0; i < slotOrari.length - 2; i++) {

      //finding the hour
      String delimitatoreInizio = "<time>";
      String delimitatoreFine = "</time>";
      //converting to int
      time = int(testoCompreso(slotOrari[i], delimitatoreInizio, delimitatoreFine));
      //adding a row to the table
      TableRow newRow = table[j].addRow();

      //setting the data found
      newRow.setInt("time", time);

      /*
      repeating the process for other data needed
      */
    }

    saveTable(table[j], "tabella" + j + ".csv");

  }
}

//-----------------------testoCompreso >> feeds loadString()-----------------------
String testoCompreso(String s, String inizio, String fine) {
  int start = s.indexOf(inizio);
  if (start == -1) {
    return "";
  }

  start += inizio.length();
  int end = s.indexOf(fine, start);
  if (end == -1) {
    return "";
  }

  return s.substring(start, end);
}

This process successfully creates the files I need to analyze. The question is:
how can I delete the text files generated by htmlWorkaround()?
I tried creating new files named "weatherText" + j + ".txt" after saveTable and then deleting them, but they remain in the sketch folder. How can I fix this problem?

th75 · April 18, 2021, 10:05am

Hi,
you can delete files like this:

File f2delete = new File(sketchPath("weatherText" + j + ".txt"));
boolean success = f2delete.delete();
println(success);

paste it just after saveTable(table[j], "tabella" + j + ".csv");

Even if it won’t change performance here, you can simplify your code and gain some elegance by not using a temporary file:
load the html data into a String, then process this string with loadData(myString)
i’ll let you dig into how, tell me if you need some help…

anyhow, i hope the predicted weather is sunny…
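
For reference, here is a minimal plain-Java sketch of the same deletion outside Processing. The class name DeleteTempFiles and the helper deleteNumbered are made up for illustration, and the explicit dir argument stands in for the folder that sketchPath() resolves inside a sketch. A likely reason the earlier attempt left the files in place is that a bare relative name passed to new File(...) is resolved against the JVM's working directory, which is not necessarily the sketch folder.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class DeleteTempFiles {
    // Hypothetical helper: deletes prefix0.txt .. prefix(count-1).txt inside dir
    // and returns how many files were actually removed.
    static int deleteNumbered(File dir, String prefix, int count) {
        int deleted = 0;
        for (int j = 0; j < count; j++) {
            File f = new File(dir, prefix + j + ".txt");
            // delete() returns false (it does not throw) when the file is missing
            if (f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory plays the role of the sketch folder here
        File dir = Files.createTempDirectory("sketch").toFile();
        for (int j = 0; j < 3; j++) {
            new File(dir, "weatherText" + j + ".txt").createNewFile();
        }
        System.out.println(deleteNumbered(dir, "weatherText", 3)); // prints 3
        dir.delete(); // remove the now-empty temporary directory
    }
}
```

Checking the boolean returned by delete(), as println(success) does above, is the quickest way to notice that the path did not point at the file you meant.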

Th3cG · April 18, 2021, 11:26am

Hello th75,
unfortunately here the weather is partly cloudy
I was missing the sketchPath() part, so I slightly modified your code as follows:

    File toBeDeleted = new File(sketchPath("weatherText" + j + ".txt"));
    toBeDeleted.delete();

and it works! Thank you for your help!!

As this is my first “serious” project, it’s probably kind of spaghetti code and I’d really appreciate your guidance towards a more elegant way. That said, when I started the project I had to find the htmlWorkaround() code on the web, so I had to figure out how to use the data I got that way. The most obvious solution seemed to be generating a text file for every day, but you’re welcome to share your idea with me.

th75 · April 18, 2021, 12:47pm

your code is clear to read, goes step by step, and does what you expected, it’s great!
i’m afraid my explanation will be less clear…

my suggestion was to avoid writing to a file, then reading it in the next step, and finally deleting it

when you do:
String htmlText;

  while ( (htmlText = in.readLine ()) != null) {
    output.println(htmlText);
  }

you add incoming lines, one by one, to a file (PrintWriter output)

then in loadData(), you read the lines (loadStrings)
and concatenate them into a string (String testoDaAnalizzareUnito = join(testoDaAnalizzare, ""); )

to avoid this:
you can store the incoming data in a String variable, then process it:

String myData = "";
while ( (htmlText = in.readLine ()) != null) {
  myData = myData + htmlText;
}
loadData(myData, i);

then you need to change loadData to handle this String and index, like this:
void loadData(String testoDaAnalizzareUnito, int j) {
…
}

i suggest you try it, then let me know
you need to remove printwriter output in htmlWorkaround()
and in loadData you don’t need for (int j = 0; j < 3; j++) { anymore,
as j is set in the loadData call:
void loadData(String testoDaAnalizzareUnito, int j)

you don’t need the “loadStrings” and “join” code anymore

then finally you only call htmlWorkaround(); during setup
and it will call loadData() when you do loadData(myData, i);

let’s translate all this into pseudo code:
you’re doing (htmlWorkaround):
for each url
  load data
  save to a file

then in loadData() you do:
for each file
  load data
  process data

and i suggest:
for each url
  load data and process data

the key point here is to change your function loadData to take inputs, and to call it for each url loaded
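
The "load data and process data" step can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the class name InMemoryPage and the helpers readAll and between are invented here (between mirrors what testoCompreso does), and a StringReader with a tiny hand-written page stands in for the network stream so the example runs offline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class InMemoryPage {
    // Reads every line from the source into one String, with no temporary file.
    // Line breaks are dropped, like join(lines, "") in the original sketch.
    static String readAll(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    // Returns the text between the first startTag and the next endTag,
    // or "" when either delimiter is missing.
    static String between(String s, String startTag, String endTag) {
        int start = s.indexOf(startTag);
        if (start == -1) {
            return "";
        }
        start += startTag.length();
        int end = s.indexOf(endTag, start);
        return end == -1 ? "" : s.substring(start, end);
    }

    public static void main(String[] args) throws IOException {
        // Hand-written stand-in for the downloaded page
        String html = "<ul class=\"mb-24\">\n<li><time>14</time></li>\n</ul>";
        String page = readAll(new StringReader(html));
        String list = between(page, "<ul class=\"mb-24\">", "</ul>");
        int time = Integer.parseInt(between(list, "<time>", "</time>"));
        System.out.println(time); // prints 14
    }
}
```

In the sketch itself, the Reader would be the InputStreamReader built from conn.getInputStream(). A StringBuilder is used instead of repeated String concatenation only because it avoids copying the growing text on every line; for three small pages either approach works.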

Th3cG · April 18, 2021, 4:02pm

Thank you for the kind words about my code, you’re casting some light on a topic that I have not completely mastered. However, even though I get your idea, when I follow your suggestion

you need to remove printwriter output in htmlWorkaround()

I cannot manage the two catch statements and, consequently, bypass the 403 error. If I do not remove PrintWriter output, the files are generated anyway and the code revision is useless.

The rest of your solution does the job and the tables containing data are correctly generated, saving 10 precious lines of code and making me pretty happy. I’m confident that you’ll show me a solution for the PrintWriter output issue.

th75 · April 18, 2021, 7:59pm

you’re close then:

you can remove printwriter output,
then you need to remove all references to “output”, including the ones in your catch statements
(like System.out.println(output.toString()); )

however, you can’t remove the try/catch itself, as it is required by other instructions you have there: “new URL” and “BufferedReader”

some instructions need to be wrapped in a try{…} catch(){}
especially ones where we can easily imagine how they can fail: eg new URL expects you to handle a “MalformedURLException”, triggered when the requested url is … malformed

it works like this:
try {
  …myCode…
}
catch (AException a) {
  Code to run in case AException happens
}
catch (BException b) {
  MyCode in case BException happens
}

so you can run different actions for different possible errors
if it’s not relevant to handle every possible error separately (and when reading a remote url, many things can go wrong)
you can replace them with
catch (Exception e) {println(e);}
this is generic for “any exception”, and printing e will give you the error details

in your case 2 different errors were handled inside the catch statements; put there what you want to do if an error occurs, eg:
catch (Exception e) {
  println(" i can’t do what you ask : ");
  println(e); // this will print cryptic details on the error if any
}

and, in this case, you don’t need the finally{} block; this part is always executed after try/catch (with or without errors)
in this code it was used to cleanly close the output PrintWriter, but as you don’t use it anymore… you can delete the block

may be a better explanation:
https://processing.org/reference/try.html
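
As a small runnable illustration of the try / catch / finally flow described above, here is a plain-Java sketch. The class name CatchOrderDemo, the method check and the log list are invented for the example, and example.com is only a placeholder address; nothing here touches the network, because new URL(...) only parses the text.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

class CatchOrderDemo {
    // Records that the finally block ran, whatever happened in try/catch
    static final List<String> log = new ArrayList<>();

    static String check(String address) {
        try {
            // new URL(String) is what the sketch uses; recent JDKs mark it
            // as deprecated, but it still compiles and behaves the same
            URL url = new URL(address);
            return "ok " + url.getHost();
        } catch (MalformedURLException e) {
            // Runs only when the text cannot be parsed as a URL
            return "malformed";
        } catch (Exception e) {
            // Generic fallback for anything not handled above
            return "other";
        } finally {
            // Always runs, with or without an error
            log.add("checked " + address);
        }
    }

    public static void main(String[] args) {
        System.out.println(check("https://example.com/")); // prints ok example.com
        System.out.println(check("WEBSITE-URL-1"));        // prints malformed
        System.out.println(log.size());                    // prints 2
    }
}
```

The more specific catch has to come first: Java rejects a catch for a subclass that is placed after a catch for its parent class, since it could never be reached.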

micuat · April 18, 2021, 8:21pm

Hi! your answers are cool but it would be even better if you format the code with </>

th75 · April 18, 2021, 8:22pm

oh! thanks, was wondering how to … i will

Th3cG · April 19, 2021, 8:13am

Thank you @th75 , it works.
As I said in one of my previous posts, I found the htmlWorkaround() code somewhere and put it into my sketch, which is to say that I was not completely aware of how it works.
Your explanation is quite clear, but I still can’t grasp what the catch() part does. To be more specific, if I use this code

    catch (MalformedURLException e) {
      println(e);
    }

    catch (IOException e) {
      println(e);
    }

when I run the sketch the magic happens without any line being printed in the console. So, what does println(e) do?

th75 · April 19, 2021, 11:27am

the catch part runs only if an error occurs while running the code inside the try{} part

i guess you have experienced this:
if you make a mistake in your code, when you try to run it, it just crashes with some red warning in the processing console

but even perfect code can fail for external reasons (internet is down, hard drive unplugged while writing a file…)

that’s where try{…} catch(){} is useful

instead of crashing the whole program or giving a false result, you can anticipate what to do if an error occurs: you run what is inside the catch part without stopping your program

for example, if one of the urls you read fails (they changed stuff on the server side), maybe you don’t want the whole thing to stop, but to go on with what you got…

usually processing hides all this complexity for us, but as you put some pure java code there, you have to deal with it… a good occasion to learn when to use it

i hope it’s clearer now?

just after each catch, you see the kind of error it will handle: (MalformedURLException e) or (IOException e) in your code, while (Exception e) handles any error
as you hard coded the urls you request, it’s quite unlikely that MalformedURLException will trigger once you get these urls right
but IOException is more likely, as it can be triggered by any external network error…

i guess it puzzled you because the comments around “catch” are about the 403 error, but if your code runs and doesn’t trigger the 403 error anymore, it’s not fixed by these catch statements (more likely by the setRequestProperty part)
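
To illustrate the setRequestProperty point, here is a minimal plain-Java sketch that sets a browser-like User-Agent before any request is sent. The class name UserAgentDemo, the helper prepare and the example.com address are placeholders, not part of the original sketch. Whether a given site answers 403 to Java's default User-Agent is an assumption about that server; the code only shows where the header is set. openConnection() by itself does not contact the server, so this runs offline.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class UserAgentDemo {
    // Hypothetical helper: builds a connection object and sets the header
    // that will be sent once the connection is actually opened.
    static URLConnection prepare(String address, String userAgent) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        // Must be called before getInputStream() or connect();
        // after connecting it throws IllegalStateException
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection conn = prepare("https://example.com/", "Mozilla/5.0");
        // Reads back the header stored on the not-yet-opened connection
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

The request only goes out when something like conn.getInputStream() is called, which is also the call that can throw the IOException the catch block handles.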

Th3cG · April 19, 2021, 4:39pm

i guess it puzzled you because the comments around “catch” are about the 403 error, but if your code runs and doesn’t trigger the 403 error anymore, it’s not fixed by these catch statements (more likely by the setRequestProperty part)

if I try commenting out

   catch (IOException e) {
      println(e);
    }

the console says

Unhandled exception type IOException

If I comment out the MalformedURLException statement instead, my code works.
I know, in the end my code works as it is; it’s pure curiosity to go deeper into the matter.
By the way, I’m still asking myself what println(e) prints…

th75 · April 19, 2021, 6:09pm

yes, IOException can catch the error for both (MalformedURLException is a subclass of IOException), but as the console says, it is mandatory to have at least one…
(Exception e) will do the job too, catching nearly everything
without spending too much time on it, if you look at a java doc like this one linked to bufferedReader, you will see for each instruction a “Throws” section telling you the kind of error you can catch

what is println(e):
the exception puts details about the error in “e”; it depends on the error, but usually it tells you which instruction or function threw the error, a problematic value… sometimes with a lot of details

you can try it: put a bad url in one of weatherURLs, run
you will see in the console something like

java.net.UnknownHostException: www.googlefr
java.net.MalformedURLException: no protocol: WEBSITE-URL-1

here is an advantage of try / catch: as the software didn’t crash, you can still print variables
eg, if you put:

    catch (IOException e) {
      println("error with url number "+i, weatherURLs[i]);
      println(e);
    }

then you get

error with url number 0 www.googlefr
java.net.UnknownHostException: www.googlefr
error with url number 1 WEBSITE-URL-1
java.net.MalformedURLException: no protocol: WEBSITE-URL-1

so… println(e); does nothing until an error occurs, then it prints the error details to the console
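
The same behaviour can be reproduced in plain Java without a network connection. In this sketch the class name PrintExceptionDemo and the method report are invented for illustration, and example.com is a placeholder. Concatenating e into a String calls e.toString(), which for exceptions is the class name followed by the message; that is the line println(e) shows.

```java
import java.net.MalformedURLException;
import java.net.URL;

class PrintExceptionDemo {
    // Returns the text that would be printed for one entry of weatherURLs
    static String report(int i, String address) {
        try {
            new URL(address); // only parses the text, nothing is downloaded
            return "url number " + i + " is well formed";
        } catch (MalformedURLException e) {
            // i and address are still available because the program did not crash
            return "error with url number " + i + " " + address + "\n" + e;
        }
    }

    public static void main(String[] args) {
        String[] weatherURLs = {"https://example.com/", "WEBSITE-URL-1"};
        for (int i = 0; i < weatherURLs.length; i++) {
            System.out.println(report(i, weatherURLs[i]));
        }
    }
}
```

For the second entry this prints the "error with url number 1 WEBSITE-URL-1" line followed by "java.net.MalformedURLException: no protocol: WEBSITE-URL-1", while the first entry produces no error text at all.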

Th3cG · April 20, 2021, 6:42pm

Hi @th75,
this is just to show my gratitude for your support.
I hope our roads can cross again in the future.
Cheers.
