Help reading data from a game (Diablo II) or analyzing an image from it for text recognition

Hello all.

I’ve been using Processing for a couple of years now and it’s worked well for most of my needs, but I am having some trouble with one particular project. The project is to make physical health/mana indicators that mimic the in-game health and mana pools in Diablo II (although it could probably be used with other games too, eventually). Something like this, but without water, just LEDs:

I have the hardware covered, and I almost had the software working. I do not know how to get the data directly from the game, or whether there is a way to extract stuff like that. My current method is essentially having my Processing sketch take screenshots and analyze the pixels showing the health globe. For those unfamiliar, it looks like this:


So, I was basically looking at the vertical strip of pixels running down the center of the health globe and trying to determine the “redness” of them (which changes to green when you’re poisoned, but I can deal with that, I think). The problem, as you may have noticed, is that the globe is see-through on the top half, so the reading gets easily confused in busy environments (some of which are red). Also, the bottom of the globe is very dark, nearly black, so the reading gets a little biased. I thought about trying to do some character recognition to get the actual health values, but I don’t know a good way to go about that in Processing. Any suggestions for that or alternative solutions?
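For reference, the strip scan described above can be sketched roughly as follows. This is plain Java on a `BufferedImage` rather than the actual sketch, and the names `GlobeStrip`, `fillFraction`, `minRedness` and `gap` are made up for illustration. It assumes "redness" means red minus the stronger of green and blue, skips the dark rows at the bottom until the first red row, and stops at the first run of non-red rows so that red scenery higher up is not counted.

```java
import java.awt.image.BufferedImage;

class GlobeStrip {
    // "Redness" of a packed RGB pixel: how far red exceeds the stronger of green and blue.
    static int redness(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return r - Math.max(g, b);
    }

    // Scans column x from bottomY up to topY (both inclusive) and returns the
    // estimated fill level as a fraction of the strip height (0.0 to 1.0).
    // minRedness and gap are tuning values (assumptions, not taken from the game).
    static double fillFraction(BufferedImage img, int x, int topY, int bottomY,
                               int minRedness, int gap) {
        if (topY > bottomY) {
            throw new IllegalArgumentException("topY must not be below bottomY");
        }
        int height = bottomY - topY + 1;
        int lastRed = -1; // offset from the bottom of the highest red row seen so far
        int misses = 0;   // consecutive non-red rows since the last red row
        for (int i = 0; i < height; i++) {
            if (redness(img.getRGB(x, bottomY - i)) >= minRedness) {
                lastRed = i;
                misses = 0;
            } else if (lastRed >= 0 && ++misses >= gap) {
                break; // liquid surface passed; ignore anything red above it
            }
        }
        return (lastRed + 1) / (double) height;
    }
}
```

Dark rows below the first red row are counted as filled, which is one way to handle the nearly black bottom of the globe. It still fails when red scenery sits directly above the liquid surface, and the poisoned (green) state would need its own color test.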


@Randios – do you really want the health as a percentage?

429 / 746 = ~0.575 (58%), so that is “ground truth” for the graphic.

Or do you want the actual values – 429 and 746 (745?) – given that the total health changes in the game? In that case you should really be extracting the text instead.

I did a bunch of image-processing data extraction from video games like this in the late 2000s (color extraction from defined regions in games with fixed UIs / HUDs), and smoothing the values helped. Sometimes the screen flashes white. Sometimes the HUD interface “shakes.” A certain weapon causes glare effects that may cross the HUD. My experience was that some games are noisier than others, but that noise is a normal part of the data when using techniques like these.
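As one possible way to do that smoothing, here is a minimal sketch of a sliding-window median in plain Java. The class name `MedianSmoother` and the window size are just illustrative choices, not what I used back then. A median is handy here because a single bad frame (a white flash, a glare streak) gets dropped instead of being averaged into the reading.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class MedianSmoother {
    private final ArrayDeque<Double> window = new ArrayDeque<>();
    private final int size;

    // size is the number of most recent readings to keep (must be at least 1).
    MedianSmoother(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        this.size = size;
    }

    // Adds a raw reading and returns the median of the readings currently in the window.
    // For an even number of readings this returns the upper of the two middle values.
    double add(double reading) {
        window.addLast(reading);
        if (window.size() > size) {
            window.removeFirst();
        }
        double[] sorted = window.stream().mapToDouble(Double::doubleValue).toArray();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }
}
```

Feed it one reading per captured frame and drive the LEDs from the returned value. A larger window rejects longer glitches but makes the indicator lag further behind real changes in health.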

The percentage is what I’m after. I don’t currently have any plans that would need the raw values.

However, extracting the text would probably be the most accurate method (instead of trying to process the pixels of the globe, and assuming there isn’t an easy way to poll the game for information). So, if that is the case, I’m not really sure where to go from here. I’ve seen mention of Optical Character Recognition (OCR) during some searches, but it seemed pretty heavy. I was hoping there might be an easier way to do this in Processing, given that I know the font and size of the text. But I haven’t seen any good explanations of how to go about this.


Because you only care about 10 characters – 0-9 – and they are in a wacky low-res font, you could collect character streams using template matching.

BoofCV for Processing already supports template matching.

Use the provided Template example from the BoofCV library, crop copies of the numbers into 10 files with 10 masks, and bam – bespoke OCR data. Since your matches are on a single line, you can just sort hits by their x coordinates to reconstruct the string.
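The sort-by-x step might look something like the sketch below. It is plain Java and does not call BoofCV at all: `Hit` is a hypothetical container for "digit template N matched at x", which you would fill from whatever the matcher returns. Since only digits are matched, the sketch also assumes the current and maximum values can be told apart by a wide horizontal gap between hits; `maxGap` is a made-up tuning value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DigitHits {
    // Hypothetical record of one template match: which digit matched, and its x position.
    static final class Hit {
        final char digit;
        final int x;

        Hit(char digit, int x) {
            this.digit = digit;
            this.x = x;
        }
    }

    // Sorts hits left to right and starts a new number whenever the distance
    // between neighbouring hits is larger than maxGap pixels.
    static List<String> readNumbers(List<Hit> hits, int maxGap) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((Hit h) -> h.x));
        List<String> numbers = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int prevX = 0;
        for (Hit h : sorted) {
            if (current.length() > 0 && h.x - prevX > maxGap) {
                numbers.add(current.toString());
                current.setLength(0);
            }
            current.append(h.digit);
            prevX = h.x;
        }
        if (current.length() > 0) {
            numbers.add(current.toString());
        }
        return numbers;
    }

    // Converts a [current, maximum] pair such as ["429", "746"] into a fraction.
    static double fraction(List<String> numbers) {
        if (numbers.size() != 2) {
            throw new IllegalArgumentException("expected exactly two numbers, got " + numbers.size());
        }
        int max = Integer.parseInt(numbers.get(1));
        if (max == 0) {
            throw new IllegalArgumentException("maximum must not be zero");
        }
        return Integer.parseInt(numbers.get(0)) / (double) max;
    }
}
```

With hits for 4, 2, 9 close together and 7, 4, 6 further right, `readNumbers` gives the two strings "429" and "746", and `fraction` turns them into the 429 / 746 ratio. Throwing on anything other than two numbers lets the caller simply skip frames where a digit was missed.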


Example template (crop) and mask (black and white, e.g. in Gimp or Paint / Photoshop):


Edit, caveat: you need scores for template matching to work like this, and I haven’t tested whether they have been added to the library yet. See the thread I linked above.