Image to text (Tess4J?)

MTech · August 25, 2018, 7:48pm

Hey all,

So for the past few hours I’ve been desperately trying to get some form of ‘image-to-text’ working in Processing 3. I’ve browsed and read a lot of things for Java, and I think it should be possible to get something working. I just can’t seem to get it right though.

Background of the problem: I want to take frequent screenshots of a particular part of the screen, which contains the Health Points of the boss in the game I play. I’ve got a lot of plans for making an interesting tool out of this, but the big step “Screenshot/image -> int HP variable” is where I need help.

So I’ve tried to use the Tess4J library for this (see http://tess4j.sourceforge.net/ ). I’m not fully sure how to get things working though, and I’m down to this situation: http://prntscr.com/kmua58 .

Main question here: how can I get Tess4J to work with Processing? I seem to get stuck a lot. After hours of trying on my own I’m hoping someone here is able to help! Has anyone been able to get Tess4J image-to-text going in Processing 3?

Hope to get reactions -MTech

Kevin · August 25, 2018, 7:57pm

What you’re looking for is called optical character recognition, or OCR. Googling something like “java OCR library” will return a ton of results.

But to answer your question, the simplest way to use a Java library in Processing is to drag the .jar file onto the Processing editor. Then you need to add import statements to use the classes in the library .jar file.

Shameless self-promotion: here is a guide on using libraries in Processing:

MTech · August 25, 2018, 8:07pm

Thanks a lot for replying. Yeah, I found that out from reading articles like that. The thing is: in the Tess4J folder there is a ‘lib’ folder, which contains like 20 .jar files, all with kinda vague names (at least to me).

Kevin · August 25, 2018, 8:11pm

The setup is going to be a little different for every library. You could try narrowing down which .jar file you actually need. Or you could just drag them all onto your editor.

MTech · August 25, 2018, 8:14pm

The first .jar I even try comes up with an error. First it asks me whether I’d like to replace the current .jar, so I press yes…

Kevin · August 25, 2018, 8:17pm

Right, so it looks like you’ve already copied that .jar file.

Take a look at your sketch folder by going to Sketch -> Show Sketch Folder to see which library files have already been copied.

Kevin · August 25, 2018, 8:18pm

You also might just want to use a different library.

There are a couple OpenCV libraries that work nicely with Processing. Those might be worth looking into. Check out the libraries page on the Processing website for more info.

MTech · August 25, 2018, 8:20pm

So apparently when I drag all the .jar files in, it creates a ‘code’ folder in my sketch folder… see image.

MTech · August 25, 2018, 8:22pm

Well, I’ve been looking around for ages now. For example, http://www.yunmai.com/en/home.html is one I just downloaded. I get a random .zip though, with no clear library folder. For some reason everything I try has some kind of complicated structure and I just can’t get things to work. I’ve been looking on YouTube and whatnot as well for concrete steps I’d need to take to make it work, but can’t get there.

Do you know of a concrete library that is easy to get working?

MTech · August 25, 2018, 8:25pm

I’ve also taken a look at this one: https://github.com/bytedeco/javacpp-presets/tree/master/tesseract

But again (I guess it’s me being a total noob) I just can’t seem to follow the steps. Creating a pom file? Never seen one before. Using Maven 3? I just get confused quite fast.

Kevin · August 25, 2018, 8:26pm

Unfortunately that’s the case for a lot of libraries. Many come in different structures and formats, and it’s a bit of guesswork to figure out exactly what you need.

I think you’re on the right track with what you already have for Tess4J. My guess is you’re just missing an import statement now.

Yep like I mentioned, I’d try OpenCV. I don’t have a specific example, but it should at least be easier to import into Processing since it’s supported directly in the editor.

That’s talking about Maven, which is a way to handle libraries without manually copying over .jar files. You can’t really use Maven from the Processing editor though; you’d have to switch to a more advanced editor.

MTech · August 25, 2018, 8:29pm

Yeah, I see what you are saying. I just hope things get easier as I get more experienced. I’ve taken a look at OpenCV ( https://github.com/atduskgreg/opencv-processing ) but all the examples showed some photos, not really text recognition. You’re saying it actually is able to recognize text from images?

MTech · August 25, 2018, 8:32pm

As for Tess4J: I’ve been trying the same stuff as in this example ( http://tess4j.sourceforge.net/codesample.html ), i.e. import net.sourceforge.tess4j.*, but this seems kinda odd since I now have a folder named ‘code’ in my sketch folder after dragging the .jars in there. The error I get is: “The package net.sourceforge.tess4j does not exist. You might be missing a library”.

Kevin · August 25, 2018, 8:36pm

Like I mentioned I don’t have a specific OCR example for OpenCV in mind, but I know it handles a lot of similar problems. I’d recommend googling something like “OpenCV OCR” for a ton of results.

The code directory is expected. This is how Processing stores libraries used in a sketch. I’m not sure why the import statement isn’t working. Maybe you’re missing a core library .jar file? In other words, it could be that the .jar files you’ve seen so far are the libraries used by Tess4J, and not the Tess4J library itself?
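One way to test that guess without opening every .jar by hand is to scan a folder and report which archives actually contain classes from the package in the import statement. This is only a sketch using the standard java.util.jar API; the class name JarPackageFinder and the method jarsContaining are made up for illustration and are not part of Tess4J or Processing.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Tess4J or Processing.
class JarPackageFinder {

    // Returns the sorted names of the .jar files directly inside dir that
    // contain at least one class in packageName (or one of its sub-packages).
    static List<String> jarsContaining(File dir, String packageName) throws IOException {
        String prefix = packageName.replace('.', '/') + "/";
        List<String> matches = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".jar"));
        if (files == null) {
            return matches; // not a directory, or it could not be read
        }
        for (File file : files) {
            try (JarFile jar = new JarFile(file)) {
                Enumeration<JarEntry> entries = jar.entries();
                while (entries.hasMoreElements()) {
                    String entry = entries.nextElement().getName();
                    if (entry.startsWith(prefix) && entry.endsWith(".class")) {
                        matches.add(file.getName());
                        break; // one hit is enough for this archive
                    }
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: JarPackageFinder <folder> <package>");
            return;
        }
        System.out.println(jarsContaining(new File(args[0]), args[1]));
    }
}
```

Pointing it at the sketch’s code folder with the package net.sourceforge.tess4j should give an empty list if the core library .jar was never copied in.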

Kevin · August 25, 2018, 8:37pm

Take a look in the dist folder inside the Tess4J directory. It contains the core library .jar file you need.

MTech · August 25, 2018, 8:39pm

As you say this I’m just discovering the dist folder with a single tess.jar file myself lol. The program gets past this problem now, gonna check it out. Thanks a lot! What you’re saying was probably right: I assumed the ‘lib’ folder was the library folder and that I’d only need those libraries.

Kevin · August 25, 2018, 8:40pm

You probably do need those .jar files. You just need the tess.jar file as well. Basically the tess.jar file contains code that requires the other .jar files.

MTech · August 25, 2018, 9:03pm

I see, thanks. Well, I guess one final question: the results seem quite bad. Written text on a white background gets recognized pretty well. But reading the numbers in the health bar seems too hard for it? This is what it reads. If I set it to only recognize numbers it doesn’t read anything at all.

To me as a human it seems the numbers are pretty easy to distinguish from the background. Is it normal that a program like this finds it really hard though? Is that just a limit of computers? If so I’ll have to just give up on what I want, I guess.
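On the numbers-only setting: one alternative, which nobody in this thread has tried, is to leave recognition unrestricted and pull the digits out of whatever string comes back. The sketch below covers only that last “text -> int HP variable” part. HpParser and parseHp are hypothetical names, and the assumption that the HP value is the first number in the text (possibly with . or , as thousands separators) may not match the actual health bar.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for turning raw OCR output into an int.
class HpParser {

    // A digit followed by any mix of digits, dots and commas, e.g. "12,345".
    private static final Pattern NUMBER = Pattern.compile("\\d[\\d.,]*");

    // Returns the first number found in ocrText, or empty if there is none
    // (or if it has too many digits to fit in an int).
    static OptionalInt parseHp(String ocrText) {
        if (ocrText == null) {
            return OptionalInt.empty();
        }
        Matcher m = NUMBER.matcher(ocrText);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        // Assumption: dots and commas are thousands separators, not decimals.
        String digits = m.group().replaceAll("[.,]", "");
        try {
            return OptionalInt.of(Integer.parseInt(digits));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

Returning an OptionalInt rather than a plain int lets the caller skip a frame where the OCR found nothing, instead of treating it as 0 HP.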

Kevin · August 25, 2018, 9:06pm

One common approach is to do some pre-processing on the image first. For example, you could try converting to black and white, removing the background, or increasing the contrast. These are just examples; there are a ton of options you can try.

This is one reason I mentioned OpenCV: it comes with a bunch of features for cleaning up images before you actually process them.

But yes, in general OCR is not exactly a trivial problem.
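The “convert to black and white” idea above can be sketched with nothing but the standard java.awt.image.BufferedImage class. OcrPreprocess and binarize are hypothetical names, and the threshold (128 is a common starting point) and the invert flag are guesses that would need tuning against real screenshots. The invert option is there because game UIs often draw light text on a darker bar, while OCR engines generally do better with dark text on a light background.

```java
import java.awt.image.BufferedImage;

// Hypothetical pre-processing helper; plain Java, no OpenCV involved.
class OcrPreprocess {

    // Returns a pure black-and-white copy of src. Pixels whose luminance is
    // at or above threshold (0-255) become white and the rest black; with
    // invert set to true the two colours are swapped.
    static BufferedImage binarize(BufferedImage src, int threshold, boolean invert) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness.
                int luminance = (299 * r + 587 * g + 114 * b) / 1000;
                boolean bright = luminance >= threshold;
                out.setRGB(x, y, (bright != invert) ? 0xFFFFFF : 0x000000);
            }
        }
        return out;
    }
}
```

Inside a Processing sketch, the built-in PImage filter(GRAY) and filter(THRESHOLD, level) calls do much the same thing on a PImage without any extra code.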
