Problems parsing a UTF-8 string backwards

LuckSmith · August 10, 2019, 3:37pm

If I go forward through the string like this:

String str = "This is a string containing different encodings. В этой строке есть кириллица, например (2 bytes per char)";
for (int i=0; i<str.length(); i++) {
  println(str.charAt(i));
}

it works fine.
But if I then need to check the previous char with str.charAt(i-1), and that char is encoded with more than 1 byte, I get garbage.

Any ideas how to solve this?

GoToLoop · August 10, 2019, 4:28pm

Docs.Oracle.com/en/java/javase/11/docs/api/java.base/java/lang/StringBuilder.html#reverse()

/**
 * Reverse Unicode String w/ Surrogate Characters (v1.1)
 * GoToLoop (2019/Aug/10)
 * Discourse.Processing.org/t/problems-parsing-an-utf-8-string-backwards/13294/2
 */

static final String ORIGINAL =
  "This is a string containing different encodings.\n" + 
  "В этой строке есть кириллица, например (2 bytes per char).";

static final String REVERSED = reverseString(ORIGINAL);

static final color FG = #FFFF00, BG = #0000FF;
static final int FONT_SIZE = 030; // octal literal, equals 24 in decimal

void setup() {
  size(900, 200);
  noLoop();

  fill(FG);
  textSize(FONT_SIZE);
  textAlign(CENTER, CENTER);

  println(ORIGINAL + ENTER + ENTER + REVERSED);
}

void draw() {
  final int cx = width >> 1, qy = height >> 2;
  background(BG);

  text(ORIGINAL, cx, qy);
  text(REVERSED, cx, 3*qy);
}

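// StringBuilder.reverse() treats each surrogate pair as a single character,
// so characters outside the Basic Multilingual Plane survive the reversal.
static final String reverseString(final String original) {
  return new StringBuilder(original).reverse().toString();
}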
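
If the goal is only to look at the previous character rather than to reverse the whole string, stepping backwards by code point may be enough. Java strings are stored as UTF-16, so a Cyrillic letter is a single char and charAt(i-1) returns it intact; only characters outside the Basic Multilingual Plane occupy two chars (a surrogate pair). Below is a minimal sketch in plain Java, not Processing-specific. The names BackwardScan and codePointsBackward are made up for illustration and are not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

class BackwardScan {
  // Hypothetical helper: returns the code points of s from last to first,
  // each one as its own String.
  static List<String> codePointsBackward(String s) {
    List<String> out = new ArrayList<>();
    int i = s.length();
    while (i > 0) {
      // Code point that ends just before index i (joins a surrogate pair if present).
      int cp = s.codePointBefore(i);
      out.add(new String(Character.toChars(cp)));
      // Step back by 1 char, or by 2 chars for a surrogate pair.
      i -= Character.charCount(cp);
    }
    return out;
  }

  public static void main(String[] args) {
    // 'a', Cyrillic 'В' (U+0412), and one character outside the BMP (U+1F600).
    String s = "a\u0412\uD83D\uDE00";
    for (String cp : codePointsBackward(s)) {
      System.out.println(cp + " uses " + cp.length() + " char(s)");
    }
  }
}
```

Whether the console shows these characters correctly still depends on its font and encoding settings, which is separate from how the string is indexed.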

LuckSmith · August 10, 2019, 7:07pm

I guess my problem is local then. Some setting for the console fonts must have gone wrong…

…I sense the flavor of «da Old School» in your code here: width >> 1 and height >> 2, as we used to write back in those days.
