Issue loading xml-feed in p5js

Hi

I am wondering how to correctly load an XML file from an external website in p5js. In Processing it is no issue using loadXML on the URL, but in p5js it seems more complicated, and just calling loadXML with the raw URL does not give any results.

As an example, I am interested in getting all of the content inside the <title> tags from this site: https://ekstrabladet.dk/rssfeed/sport/

I had @GoToLoop help me out in the old Processing forum, but the solution he helped me make (and thank you a lot for that!) stopped working earlier this week for some reason… I don’t really understand how the HerokuApp and CORS proxies work, so any help would be much appreciated.

Here is the link to the original thread on the old forum:


I don’t think the proxy being used on that old sketch is active anymore. :worried:

Also, that EkstraBladet.dk site seems to be now HTTPS exclusively. The old sketch uses HTTP. :roll_eyes:

Therefore we need to find another proxy & change the constant HTTP to 'https://'. :sweat_drops:

In that old forum thread I had posted a list of proxies: :postbox:

And on its comments, there are some more active CORS proxies listed. :spiral_notepad:

I’ve simply picked this 1 and it worked: :partying_face:

Basically I’ve mainly changed these lines from the old sketch and made some other cleanups too: :sunglasses:

const HTTP = 'https://',
      PROX = 'YaCDN.org/', MODE = 'proxy/',

Here’s the new link for the updated “loadXML() from EkstraBladet via Proxy” sketch: :link:

/**
 * loadXML() from EkstraBladet via Proxy (v2.0)
 * Andreas_Ref & GoToLoop (2019-Feb-08)
 *
 * https://Discourse.Processing.org/t/issue-loading-xml-feed-in-p5js/8212/2
 *
 * https://Bl.ocks.org/GoSubRoutine/5bdec67b4f70080bfddb6c0607cb6c70
 *
 * https://EkstraBladet.dk/rssfeed/sport/
 * https://Gist.GitHub.com/jimmywarting/ac1be6ea0297c16c477e17f8fbe51347
*/

"use strict";

const HTTP = 'https://',
      PROX = 'YaCDN.org/', MODE = 'proxy/',
      SITE = 'EkstraBladet.dk/', FOLD = 'rssfeed/', QRY = 'sport/',
      LINK = HTTP + PROX + MODE + HTTP + SITE + FOLD + QRY,
      FILE = 'sport.xml',
      REMOTE = true,
      TAG = 'title',
      LIST = 'ol', ITEM = 'li',
      titles = [];

let xml;

function preload() {
  console.info(LINK);
  xml = loadXML(REMOTE && LINK || FILE, print, console.warn);
}

function setup() {
  noCanvas(), noLoop();

  const items = xml.getChild('channel').getChildren('item');

  for (const item of items) {
    print(item.listChildren());
    titles.push(item.getChild(TAG).getContent());
  }

  const ol = createElement(LIST)
            .style('color', 'blue')
            .style('font-weight: bold')
            .style('font-size: 1.2em');

  for (const title of titles)  createElement(ITEM, title).parent(ol);
}
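
The LINK constant in the sketch above is simply the proxy prefix glued in front of the target URL. As a rough sketch of that idea, here is a small helper; `proxied` and the `proxy.example` host are made-up names for illustration, not part of p5.js or of any real proxy service. The `encode` option is an assumption on my part: it can matter for proxies that take the target as a query value, when the target URL carries its own `?` and `&`.

```javascript
// Hypothetical helper (not part of p5.js): prefix a target URL with a CORS proxy.
// Set `encode` to true for proxies that take the target as a query value.
function proxied(proxyPrefix, targetUrl, encode = false) {
  return proxyPrefix + (encode ? encodeURIComponent(targetUrl) : targetUrl);
}

const FEED = 'https://EkstraBladet.dk/rssfeed/sport/';

// Path-style proxy, as in the sketch above: the target is appended verbatim.
console.log(proxied('https://YaCDN.org/proxy/', FEED));

// Query-style proxy (placeholder host): percent-encode the target so its own
// "?" and "&" characters are not mistaken for the proxy's parameters.
console.log(proxied('https://proxy.example/fetch?url=', FEED, true));
```

The result can be passed straight to loadXML() in place of LINK, so switching proxies means changing only the prefix.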

Again you are my hero!
Thank you so much :star_struck:

Do you know why loading xml needs to go through a proxy in javaScript (p5js) but not in Java (Processing)?

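CORS access protection has got nothing to do w/ programming languages. :tongue:

CORS is a browser exclusive “feature”. The browser stops a page’s script from reading a response from another origin unless that server opts in, while Processing’s Java Mode runs outside any browser and so isn’t subject to it: :crazy_face: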
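
To make the browser's side of this a bit more concrete, here is a simplified model of the check it applies to a plain GET request without credentials. `corsAllows` is a hypothetical name, the headers are assumed to be given as an object with lowercase keys, and real browsers handle more cases (preflight requests, credentials), so treat it as an illustration only.

```javascript
// Simplified model of the browser's CORS check for a plain GET request
// without credentials. `corsAllows` is a hypothetical helper, not a p5.js API.
function corsAllows(responseHeaders, pageOrigin) {
  const allowed = responseHeaders['access-control-allow-origin'];
  if (allowed === undefined) return false; // no header: the page may not read the reply
  return allowed === '*' || allowed === pageOrigin;
}

// A server that sends no such header: fine for Java, blocked for a browser page.
console.log(corsAllows({}, 'https://editor.p5js.org')); // false

// A CORS proxy relays the same body but adds the header.
console.log(corsAllows({ 'access-control-allow-origin': '*' }, 'https://editor.p5js.org')); // true
```

A CORS proxy works because it fetches the feed server-side, where no such check exists, and then relays it with the permissive header attached.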


It stopped working again for some reason. I tried switching to thingproxy (Freeboard/thingproxy on GitHub) as the proxy, but without any luck. Any chance you can figure out a new solution @GoToLoop ?

Well, you’ve gotta try to search for another workable CORS proxy.
From the proxy list I’ve found out “All Origins” seems to still be working:
https://AllOrigins.win
Here’s the full URI using this other proxy:
http://api.AllOrigins.win/get?url=https://EkstraBladet.dk/rssfeed/sport/


Right thanks so much for helping once again!
I tried with http://api.AllOrigins.win/get?url=https://EkstraBladet.dk/rssfeed/sport/ but it gives me an error saying:
p5.js:78248 Mixed Content: The page was loaded over HTTPS, but requested an insecure resource 'http://api.allorigins.win/get?url=https://ekstrabladet.dk/rssfeed/sport/'. This request has been blocked; the content must be served over HTTPS.

When changing from http://api.AllOrigins… to https://api… (adding the s), I get no error, but I don’t seem to get access to anything?
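
The "Mixed Content" error above comes from an HTTPS page requesting a plain HTTP resource. A minimal sketch of that rule, assuming the sketch is served from https://editor.p5js.org; `isMixedContent` is a hypothetical helper, and real browsers make a few exceptions (localhost, for instance) that it ignores.

```javascript
// Hypothetical helper: would an HTTPS page be blocked from requesting this URL?
// Simplified; browsers exempt a few cases such as localhost.
function isMixedContent(pageUrl, resourceUrl) {
  const page = new URL(pageUrl);
  const resource = new URL(resourceUrl);
  return page.protocol === 'https:' && resource.protocol === 'http:';
}

const PAGE = 'https://editor.p5js.org/';
const TARGET = 'api.AllOrigins.win/get?url=https://EkstraBladet.dk/rssfeed/sport/';

console.log(isMixedContent(PAGE, 'http://' + TARGET));  // true: blocked
console.log(isMixedContent(PAGE, 'https://' + TARGET)); // false: allowed
```

Only the scheme of the outer (proxy) URL counts here; the target URL inside the query string is fetched by the proxy, not by the browser.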

Here is a simple sketch of my attempt: p5.js Web Editor

  • Sorry but I’m getting the same error. :bug:
  • Apparently it loads via loadXML(), but it gets an empty body! :astonished:
  • Seems like you’re gonna need to look for a better CORS proxy then. :telescope:
  • This is my current attempt adapted from my previous sketch: :woozy_face:
/**
 * loadXML() from EkstraBladet via Proxy (v2.0.2)
 * Andreas_Ref & GoToLoop (2019-Feb-08)
 *
 * https://Discourse.Processing.org/t/issue-loading-xml-feed-in-p5js/8212/9
 *
 * https://Bl.ocks.org/GoSubRoutine/5bdec67b4f70080bfddb6c0607cb6c70
 *
 * https://EkstraBladet.dk/rssfeed/sport/
 * https://Gist.GitHub.com/jimmywarting/ac1be6ea0297c16c477e17f8fbe51347
*/

"use strict";

const HTTP = 'https://',
      // PROX = 'YaCDN.org/', MODE = 'proxy/',
      PROX = 'api.AllOrigins.win/', MODE = 'raw?url=',
      SITE = 'EkstraBladet.dk/', FOLD = 'rssfeed/', QRY = 'sport/',
      LINK = HTTP + PROX + MODE + HTTP + SITE + FOLD + QRY,
      FILE = 'sport.xml',
      REMOTE = true,
      TAG = 'title',
      LIST = 'ol', ITEM = 'li',
      titles = [];

let xml;

function preload() {
  console.info(LINK);
  xml = loadXML(REMOTE && LINK || FILE, print, console.warn);
}

function setup() {
  noCanvas(), noLoop();

  const items = xml.getChild('channel').getChildren('item');

  for (const item of items) {
    print(item.listChildren());
    titles.push(item.getChild(TAG).getContent());
  }

  const ol = createElement(LIST)
            .style('color', 'blue')
            .style('font-weight: bold')
            .style('font-size: 1.2em');

  for (const title of titles)  createElement(ITEM, title).parent(ol);
}

P.S.: Never mind, it’s working now! :partying_face:

After reading further from https://AllOrigins.win I’ve spotted an alternative mode to “get” called “raw”. :flushed:

This new URI proxy should successfully work for p5js’ loadXML() now: :crossed_fingers:
https://api.AllOrigins.win/raw?url=https://EkstraBladet.dk/rssfeed/sport/
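
A likely reason “get” came back empty for loadXML(): as far as I can tell from the AllOrigins page, “get” wraps the fetched document in a JSON object, while “raw” returns the document itself. The sketch below assumes that wrapper has a `contents` field holding the document as a string; the sample reply is made up for illustration and `unwrapGetReply` is a hypothetical helper.

```javascript
// Hypothetical helper: pull the proxied document out of a "get"-style reply.
// Assumes the reply is JSON with the document text in a `contents` field.
function unwrapGetReply(bodyText) {
  const reply = JSON.parse(bodyText);
  return typeof reply.contents === 'string' ? reply.contents : '';
}

// Made-up sample of what such a reply might look like.
const sampleReply = JSON.stringify({
  contents: '<rss><channel><title>Sport</title></channel></rss>',
  status: { http_code: 200 }
});

const xmlText = unwrapGetReply(sampleReply);
console.log(xmlText.startsWith('<rss>')); // true
```

With “get” you would have to load the reply as JSON and parse the unwrapped string yourself (in a browser, for example with DOMParser), so “raw” is the simpler fit for loadXML().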


Awesome! Thanks so much again :pray: :star_struck: :grinning:

Hi again @GoToLoop , I hope it is okay I ask yet again about a related issue.

How might I correctly load the XML from the Google suggest XML endpoint?
api.AllOrigins.win worked (and still works :partying_face:) for the other RSS feed, but it does not seem to work for this particular website.

Indeed that CORS service isn’t compatible w/ SuggestQueries.Google.com.

I’ve changed it to CorsAnywhere.HerokuApp.com and it just worked:


Thanks so much @GoToLoop ! In Chrome I do get a warning about CorsAnywhere.HerokuApp.com potentially being unsafe (see screenshot). Are there any alternatives?

Hmm, it seems like I can use https://test.cors.workers.dev/? instead of CorsAnywhere.HerokuApp.com, will give this some tests…