Rosetta Challenge! -- Reverse a String

Nice catch @noel! :baseball:

I had simply assumed a solution with StringBuilder would be faster, but I was wrong. :woozy_face:

Actually reverseSlow() is more than 2x faster than reverseFast()! :open_mouth:

However the Java/Pjs cross-mode reverseStr() is absurdly slower than the Java-only versions. :snail:

// Discourse.Processing.org/t/rosetta-challenge-reverse-a-string/20644/14
// GoToLoop (2020/May/09)

static final String STR = "àéïõû";
static final int ITERS = 10_000_000, LOOPS = 3, FUNCTS = 4;
final IntList timers = new IntList(LOOPS * FUNCTS);

void setup() {
  println("reverseStr(), reverseUnicode(), reverseSlow(), reverseFast()");

  for (int i = 0; i < LOOPS; ++i) {
    reverseStrMillis();
    reverseUnicodeMillis();
    reverseSlowMillis();
    reverseFastMillis();

    println(timers);
  }

  exit();
}

void reverseStrMillis() { // slowest
  final int start = millis();
  String s;

  for (int i = 0; i < ITERS; ++i)  s = reverseStr(STR);
  timers.append(millis() - start);
}

void reverseUnicodeMillis() { // slow
  final int start = millis();
  String s;

  for (int i = 0; i < ITERS; ++i)  s = reverseUnicode(STR);
  timers.append(millis() - start);
}

void reverseSlowMillis() { // fastest
  final int start = millis();
  String s;

  for (int i = 0; i < ITERS; ++i)  s = reverseSlow(STR);
  timers.append(millis() - start);
}

void reverseFastMillis() { // fast
  final int start = millis();
  String s;

  for (int i = 0; i < ITERS; ++i)  s = reverseFast(STR);
  timers.append(millis() - start);
}

static final String reverseStr(final String s) { // slowest
  return join(reverse(s.split("")), "");
}

static final String reverseUnicode(final CharSequence s) { // slow
  final int[] reversedUnicodes = reverse(s.codePoints().toArray());
  return new String(reversedUnicodes, 0, reversedUnicodes.length);
}

static final String reverseSlow(final String s) { // fastest
  return new String(reverse(s.toCharArray()));
}

static final String reverseFast(final CharSequence s) { // fast
  return new StringBuilder(s).reverse().toString();
}

I found it frustrating too. I believe that in order to correctly do this you need to correctly parse the unicode string into graphemes and reverse the grapheme list. Clearly there are implementations that are able to do this,

but I haven’t seen one for Java 8 – let alone a simple one. StringBuilder may have built-in support for this in some Java version… but given that PDE 3 can’t display these characters anyway – either in the editor or in output – I honestly don’t see the point of that part of the task for most Processing users.
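One candidate that does exist in Java 8 is java.text.BreakIterator.getCharacterInstance(), which walks a string by user-perceived character boundaries instead of by char. Below is a minimal sketch in plain Java (not a Processing sketch). The class and method names are made up for illustration, and the exact boundaries may differ between JDK versions, so treat it as something to try rather than a definitive answer:

```java
import java.text.BreakIterator;

class GraphemeReverse {
  // Hypothetical helper: reverses a string one BreakIterator "character" at a time,
  // so a base letter keeps the combining marks that follow it.
  static String reverseGraphemes(String s) {
    BreakIterator it = BreakIterator.getCharacterInstance();
    it.setText(s);
    StringBuilder out = new StringBuilder(s.length());
    int end = it.last(); // start at the end of the text
    for (int start = it.previous(); start != BreakIterator.DONE; start = it.previous()) {
      out.append(s, start, end); // copy one whole cluster
      end = start;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // a, s + COMBINING ENCLOSING CIRCLE, d, f + COMBINING OVERLINE
    System.out.println(reverseGraphemes("as\u20DDdf\u0305"));
  }
}
```

Whether the PDE console or editor can display the result is a separate question from whether the string itself is reversed correctly.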

How so?

[Screenshot: ide screen]


Hmm. I may have launched 3.4 by habit.

So it looks like, in PDE 3.5.4,

as⃝df̅ … is the string
a sdf- … is what the editor can display, and
a sdf̅ … is what the window draws. So there is partial support.

In 3.4, the main difference is just the circle. The string in the editor separates the f and includes a broken-box image for the circle rather than silently dropping it, like this:

[Screenshot: Screen Shot 2020-05-09 at 10.09.46 AM]

and then draws with open boxes, like this:

[Screenshot: Screen Shot 2020-05-09 at 10.09.25 AM]

How about this one, relying on CharSequence::codePoints()? :wink:
Docs.Oracle.com/en/java/javase/11/docs/api/java.base/java/lang/CharSequence.html#codePoints()

// Discourse.Processing.org/t/rosetta-challenge-reverse-a-string/20644/18
// GoToLoop (2020/May/09)

static final String TXT = "I💖🎮!";

void setup() {
  noLoop();

  background(#0000FF);
  fill(#FFFF00);
  textAlign(CENTER, BASELINE);

  println(TXT);
  text(TXT, width >> 1, height >> 2);

  final String rev = reverseUnicode(TXT);
  println(rev);
  text(rev, width >> 1, 3 * height >> 2);
}

static final String reverseUnicode(final CharSequence s) {
  final int[] reversedUnicodes = reverse(s.codePoints().toArray());
  return new String(reversedUnicodes, 0, reversedUnicodes.length);
}

And apparently Python Mode doesn’t need any changes at all: :partying_face:

# Discourse.Processing.org/t/rosetta-challenge-reverse-a-string/20644/19
# GoToLoop (2020/May/09)

TEXTS = 'asdf', u'àéïõû', u'I💖🎮!'

def setup():
    print TEXTS
    print TEXTS[0], TEXTS[1], TEXTS[2]
    print reverseStr(TEXTS[0]), reverseStr(TEXTS[1]), reverseStr(TEXTS[2])
    exit()


def reverseStr(s): return s[::-1]

('asdf', u'\xe0\xe9\xef\xf5\xfb', u'I\U0001f496\U0001f3ae!')

asdf àéïõû I💖🎮!

fdsa ûõïéà !🎮💖I


Very cool! But not quite there yet. When I run that,
“as⃝df̅” incorrectly becomes “̅fd⃝sa” – with bar ahead of f, and circle on fd, not sa. That is the example of bad output that the wiki task gives.
It should output “f̅ds⃝a”.
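The reason seems to be that a code point is not the same thing as a visible character. Reversing by code points keeps surrogate pairs such as the emoji intact, but U+20DD (the circle) and U+0305 (the bar) are separate code points that attach to whatever precedes them. A small plain-Java illustration, with hypothetical class and method names, that prints the code points before and after reversal:

```java
class CodePointReverse {
  // Reverses by code point, the same idea as reverseUnicode() above.
  static String reverseCodePoints(String s) {
    int[] cps = s.codePoints().toArray();
    for (int i = 0, j = cps.length - 1; i < j; i++, j--) {
      int tmp = cps[i];
      cps[i] = cps[j];
      cps[j] = tmp;
    }
    return new String(cps, 0, cps.length);
  }

  // Lists the code points of a string: ASCII as-is, everything else as U+XXXX.
  static String describe(String s) {
    StringBuilder sb = new StringBuilder();
    for (int cp : s.codePoints().toArray()) {
      if (sb.length() > 0) sb.append(' ');
      if (cp < 0x80) sb.append((char) cp);
      else sb.append(String.format("U+%04X", cp));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String s = "as\u20DDdf\u0305";
    System.out.println(describe(s));                    // a s U+20DD d f U+0305
    System.out.println(describe(reverseCodePoints(s))); // U+0305 f d U+20DD s a
  }
}
```

After reversal the bar comes first with no base letter before it, and the circle now follows d instead of s, which matches the bad output described above.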

On Python mode it outputs: :snake:

(u'\xe0\xe9\xef\xf5\xfb', u'I\U0001f496\U0001f3ae!', u'as\u20dddf\u0305\u0305')

àéïõû I💖🎮! as⃝df̅̅

ûõïéà !🎮💖I ̅̅fd⃝sa

# Discourse.Processing.org/t/rosetta-challenge-reverse-a-string/20644/21
# GoToLoop (2020/May/09)

TEXTS = u'àéïõû', u'I💖🎮!', u'as⃝df̅̅'

def setup():
    print TEXTS
    print TEXTS[0], TEXTS[1], TEXTS[2]
    print reverseStr(TEXTS[0]), reverseStr(TEXTS[1]), reverseStr(TEXTS[2])
    exit()


def reverseStr(s): return s[::-1]

Great question. I think it is case-by-case. My personal approach has been “both are good, but simple first.”

So if something is easy to do in Processing – like draw a box – show the built-in, the same way that you would show a beginner who asked to draw a box how to use box().

Then, as a second example, a supplemental approach is to build your own box out of six planes. This is really useful for animated morphs, per-face texture control, etc., which you can’t do with box(). So ideally we’d have both DrawASphere and DrawASphereVertex: the simple but less flexible way, and then more complex, low-level / customizable ways.

This is kind of how the built-in Processing example set already works with examples like TextureCube – although they are not systematic.


In order to better understand what is happening, I started working on the related UTF-8 encode and decode task.
The task is:


My work in progress is below. I ran into problems again because the last character is also a combined character.
I can’t believe this isn’t possible in Java. It must be possible at the byte level.

import java.nio.charset.StandardCharsets;
import java.util.Formatter;
int t = 50; 
int tel;

Character[] chars = {'A', 'ö', 'Ж', '€', '?'}; // I changed the last character because "\u1D11E", the Musical Symbol G Clef,
                                               // can't be entered in the editor; it gives the error "Invalid character constant"

void setup() {
  size(740, 200);
  background(255);
  fill(0);
  textSize(15);
  text("Character      Name                                                            Unicode         UTF-8 encoding (hex)", 25, 30);
  text("-------------------------------------------------------------------------------", 25, 50);
  for (int codepoint : new int[]{0x0041, 0x00F6, 0x0416, 0x20AC, 0x1D11}) {
    byte[] encoded = utf8encode(codepoint);
    Formatter formatter = new Formatter();
    for (byte b : encoded) {
      formatter.format("%02X ", b);
    }
    String encodedHex = formatter.toString();
    int decoded = utf8decode(encoded);
    println(decoded);
    t += 25;    
    text(chars[tel], 30, t); 
    text(Character.getName(codepoint), 130, t);
    text("U+"+hex(chars[tel]), 450, t);
    text(encodedHex, 550, t);
    tel++;
  }
}

final byte[] utf8encode(int codepoint) {
  return new String(new int[]{codepoint}, 0, 1).getBytes(StandardCharsets.UTF_8);
}

int utf8decode(byte[] bytes) {
  return new String(bytes, StandardCharsets.UTF_8).codePointAt(0);
}
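For anyone who does want to see it at the byte level, here is a hedged plain-Java sketch of the UTF-8 bit layout written out by hand instead of going through String.getBytes(). The class and method names are invented for this example, and it does no validation of malformed input:

```java
class Utf8ByHand {
  // Encodes one code point into 1 to 4 UTF-8 bytes (no range or surrogate checks).
  static byte[] encode(int cp) {
    if (cp < 0x80) {            // 0xxxxxxx
      return new byte[] { (byte) cp };
    } else if (cp < 0x800) {    // 110xxxxx 10xxxxxx
      return new byte[] { (byte) (0xC0 | (cp >> 6)), (byte) (0x80 | (cp & 0x3F)) };
    } else if (cp < 0x10000) {  // 1110xxxx 10xxxxxx 10xxxxxx
      return new byte[] { (byte) (0xE0 | (cp >> 12)),
        (byte) (0x80 | ((cp >> 6) & 0x3F)), (byte) (0x80 | (cp & 0x3F)) };
    } else {                    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      return new byte[] { (byte) (0xF0 | (cp >> 18)), (byte) (0x80 | ((cp >> 12) & 0x3F)),
        (byte) (0x80 | ((cp >> 6) & 0x3F)), (byte) (0x80 | (cp & 0x3F)) };
    }
  }

  // Decodes the first code point of a well-formed UTF-8 byte sequence.
  static int decode(byte[] b) {
    int first = b[0] & 0xFF;
    if (first < 0x80) return first;                         // single-byte (ASCII)
    int extra = first >= 0xF0 ? 3 : first >= 0xE0 ? 2 : 1;  // number of continuation bytes
    int cp = first & (0x3F >> extra);                       // payload bits of the lead byte
    for (int i = 1; i <= extra; i++) {
      cp = (cp << 6) | (b[i] & 0x3F);                       // 6 payload bits per continuation byte
    }
    return cp;
  }

  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    for (int cp : new int[] { 0x41, 0xF6, 0x416, 0x20AC, 0x1D11E }) {
      byte[] enc = encode(cp);
      System.out.printf("U+%04X -> %s -> U+%04X%n", cp, hex(enc), decode(enc));
    }
  }
}
```

Working from int code points like this sidesteps the char literal problem, since U+1D11E does not fit in a single Java char.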

Still stuck.
Why does this code give

11 1D 45 00

instead of

D8 34 DD 1E

import java.nio.charset.StandardCharsets;

byte ar[] = "\u1D11E".getBytes(StandardCharsets.UTF_16LE) ;
println("byte length = "+ar.length);
for(int i = 0; i < ar.length; i++) print(hex(ar[i])+" ");
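A likely explanation: a Java \u escape takes exactly four hex digits, so "\u1D11E" is a two-character string, U+1D11 followed by the letter E (U+0045). In little-endian UTF-16 that is 11 1D 45 00. The expected D8 34 DD 1E is the surrogate pair for U+1D11E in big-endian order. A plain-Java sketch (class and method names are hypothetical) that builds the string from the code point instead:

```java
import java.nio.charset.StandardCharsets;

class ClefBytes {
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  static String hexLE(String s) {
    return hex(s.getBytes(StandardCharsets.UTF_16LE));
  }

  static String hexBE(String s) {
    return hex(s.getBytes(StandardCharsets.UTF_16BE));
  }

  public static void main(String[] args) {
    // Four hex digits only: this literal is U+1D11 followed by the letter E.
    String notAClef = "\u1D11E";
    // Build the G clef from its code point; this yields the pair D834 DD1E.
    String clef = new String(Character.toChars(0x1D11E));

    System.out.println(notAClef.length() + " chars: " + hexLE(notAClef)); // 2 chars: 11 1D 45 00
    System.out.println(clef.length() + " chars: " + hexBE(clef));         // 2 chars: D8 34 DD 1E
    System.out.println("little-endian: " + hexLE(clef));                  // little-endian: 34 D8 1E DD
  }
}
```

So two things differ from the expected output: the literal itself, and the byte order chosen with UTF_16LE.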

@jeremydouglass this is kind of a non-problem in JRubyArt (propane and PiCrate), because the ruby code in JrubyArt sketches is just ruby code.

str = 'àéïõû'
puts str.class
puts str.inspect
puts str.respond_to? :reverse
puts str.reverse
jstring = str.to_java
puts jstring.inspect
puts jstring.respond_to? :reverse

output:-

String
"àéïõû"
true
ûõïéà
#<Java::JavaLang::String:0x61ab6521>
false

So it is possible to do static analysis on Ruby code using such useful tools as rubocop, reek, etc. Further, rubocop has an autocorrect option, useful for correcting indentation and making other style changes; for example, it is normal to use single quotes for strings unless doing interpolation. Fortunately, most of the time we do not need to coerce Ruby to Java before applying Processing methods (generally JRuby just does the right thing), but occasionally it is necessary, usually where the Java method signature has been overloaded. This is one of the reasons I encourage JRubyArt users to ignore many Java convenience methods whose equivalents are generally available in Ruby (I despair at seeing people use for loops; that is certainly not the Ruby way), but equally I’m not very fond of perlisms.


I agree and disagree. If P5 is used as a jumping-off point to learn to code, sooner or later one will be confronted with Unicode.
I’ve spent quite a few hours digging into this, and I want to leave some links to tutorials for “dummies” for anyone who is interested: One, two, three, four.

With a better understanding, I was able to solve the UTF-8 encode and decode task and already posted it there. It was asked to start with the code-points, but I think it can be used as a nice tool to analyze Unicode. So

here
import java.nio.charset.StandardCharsets;

String str =  "ऒॐѬЩ∰⋫";
  
void setup() {
  size(850, 280);
  background(255);
  fill(0);
  textSize(16);
  int tel_1 = 80;
  int tel_2 = 50;
  text("Char     Name                                                            Unicode          UTF-8 (encoding)      Decoded", 40, 40);
  char[] myBuffer = str.toCharArray();
  int[] code_points = new int[myBuffer.length];
  for (int i = 0; i < str.length(); i++) {
    println(hex(myBuffer[i]));
    code_points[i] = int(myBuffer[i]);
  }
   printArray( code_points);
  for (int cp : code_points) {  
    byte[] encoded = new String(new int[]{cp}, 0, 1).getBytes(StandardCharsets.UTF_8);
    for (byte b : encoded) {                                                    
      text(hex(b), tel_2+530, tel_1);
      tel_2 += 30;
    }
    text(char(cp), 50, tel_1);
    text(Character.getName(cp), 100, tel_1);
    String unicode = hex(cp);
    while (unicode.length() > 4 && unicode.indexOf("0") == 0) unicode = unicode.substring(1);
    text("U+"+unicode, 450, tel_1);
    Character decoded = char(new String(encoded, StandardCharsets.UTF_8).codePointAt(0));
    text(decoded, 750, tel_1);
    tel_1 += 30;  tel_2 = 50;
  }
}

is a slightly modified version that lets you paste in a text string. Processing accepts more characters than I expected, but no emojis. (I don’t know what platform @GoToLoop used to display the emojis.) The Android version (at least APDE) does accept quite a few.

I started at the byte level, searching in the string and filling the space left after the leading and continuation bits, but I soon realized that the code would be quite large, so I went one level up, using UnicodeBlocks to filter the string. With this, I was able to solve the “Extra credit task” for P5.java as well with a reverseString() function. See

here
import java.lang.Character.UnicodeBlock;

String str =  "as⃝df̅";

void setup() {
  size(350, 150);
  background(255);
  fill(0);
  textSize(30);
  textAlign(CENTER);
  text(str, width/2, height/3);    
  String str1 = reverseString(str);
  text(str1, width/2, 2*height/3);
}   

String reverseString(String inStr) {
  char[] myBuffer = inStr.toCharArray();
  StringBuffer sb = new StringBuffer();
  String mb = "";
  for (int i = 0; i < inStr.length(); i++) {
    UnicodeBlock ub = UnicodeBlock.of(myBuffer[i]);
    if (ub == UnicodeBlock.COMBINING_DIACRITICAL_MARKS
    || ub == UnicodeBlock.COMBINING_MARKS_FOR_SYMBOLS  // needed for the enclosing circle (U+20DD)
    //|| ub == UnicodeBlock.COMBINING_DIACRITICAL_MARKS_SUPPLEMENT
    //|| ub == UnicodeBlock.COMBINING_HALF_MARKS
    ) mb = mb+'1';  // Storing 'mark' indexes
    else mb = mb+'0';   
    sb.append(myBuffer[i]);              
  }
  String nmb = new StringBuilder(mb).reverse().toString();
  sb = sb.reverse();
  Integer bl = myBuffer.length;
  for (int i = 0; i < bl; i++) {
    if(nmb.charAt(i) == '1') {
      char temp = sb.charAt(i+1);
      sb.setCharAt(i+1, sb.charAt(i));
      sb.setCharAt(i, temp);
    }  
  }  
  return sb.toString();
}

Whoever set up this task knew how to ‘bully’, because the character in the middle is also of a combining type, so when you comment out the “ub == UnicodeBlock.COMBINING_MARKS_FOR_SYMBOLS” line, the order is displayed wrongly again.
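Listing blocks one by one is easy to get wrong, as the circle shows. A possible alternative, sketched here on the assumption that “combining” means the Unicode general categories Mn, Mc and Me, is to ask Character.getType() and move each base character together with all the marks that follow it. The names below are hypothetical, and this is not a full grapheme-cluster implementation (it knows nothing about emoji sequences, Hangul jamo and the like):

```java
class MarkAwareReverse {
  // True for non-spacing, enclosing and spacing combining marks (Mn, Me, Mc).
  static boolean isMark(int cp) {
    int type = Character.getType(cp);
    return type == Character.NON_SPACING_MARK
        || type == Character.ENCLOSING_MARK
        || type == Character.COMBINING_SPACING_MARK;
  }

  // Walks the string from the end, copying each base code point plus its trailing marks.
  static String reverse(String s) {
    StringBuilder out = new StringBuilder(s.length());
    int end = s.length();
    while (end > 0) {
      int start = s.offsetByCodePoints(end, -1);   // step back one code point
      while (start > 0 && isMark(s.codePointAt(start))) {
        start = s.offsetByCodePoints(start, -1);   // keep stepping back past marks
      }
      out.append(s, start, end);
      end = start;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(reverse("as\u20DDdf\u0305"));        // one mark per base
    System.out.println(reverse("as\u20DDdf\u0305\u0305"));  // two marks on the f
  }
}
```

Because it steps by code point and collects any number of marks, this version also copes with the double overline used in the Python example and with surrogate pairs, two cases the swap-based loop above does not handle.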

[Image: reverse]

@monkstone How does the extra credit task play out in JrubyArt ?


@noel since ruby-2.4 default encoding in ruby is unicode, so once more a facile challenge:-

Given the following emojis, 🚶 🏎 🚔 🚐, we can do this in PiCrate (limitations of color etc.).

str = '🚶 🏎 🚔 🚐'
puts str.reverse

Output:-

🚐 🚔 🏎 🚶

Except it does not quite look that way in my monochrome terminal.


Hi @monkstone – great to see the unicode emoji support in ruby. I think maybe (?) @noel was wondering about if ruby can reverse the extra challenge from the Rosetta Wiki task, which isn’t just using individual unicode characters – it is using modifiers and combining marks, which then may fail in unique ways if they are simply reversed:

Ruby fails with that set of characters, but at least we can inspect why.

str = 'as⃝df̅'
chars = str.scan(/[[:graph:]]/)
puts chars.inspect

output:-

["a", "s", "⃝", "d", "f", "̅"]


I guess when you know the reason, there must be a solution. :slight_smile:

@noel You would think so, but I’ve trawled the internet, and none seem forthcoming. I found out lots of interesting stuff though, including this on string composition. Hey, I trawled some more, and this works:-

str = 'as⃝df̅'
puts str
chars = str.scan(/\X/) # splits into grapheme clusters
puts chars.inspect
rev_string = chars.reverse
puts rev_string.inspect
puts rev_string.join

output:-

as⃝df̅
["a", "s⃝", "d", "f̅"]
["f̅", "d", "s⃝", "a"]
f̅ds⃝a
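For comparison, java.util.regex also understands \X as a grapheme-cluster matcher, but as far as I know only from Java 9 onward, so this would not run on the Java 8 that ships with Processing 3. A hedged plain-Java sketch of the same split, reverse, join idea, with made-up class and method names:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexGraphemeReverse {
  // "\\X" matches one extended grapheme cluster (assumes Java 9 or later).
  private static final Pattern CLUSTER = Pattern.compile("\\X");

  // Splits a string into grapheme clusters, like str.scan(/\X/) in the Ruby code.
  static List<String> clusters(String s) {
    List<String> parts = new ArrayList<>();
    Matcher m = CLUSTER.matcher(s);
    while (m.find()) {
      parts.add(m.group());
    }
    return parts;
  }

  // Reverses the list of clusters and joins them back together.
  static String reverse(String s) {
    List<String> parts = clusters(s);
    Collections.reverse(parts);
    return String.join("", parts);
  }

  public static void main(String[] args) {
    String s = "as\u20DDdf\u0305";
    System.out.println(clusters(s));
    System.out.println(reverse(s));
  }
}
```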


Great! Now I believe that the challenge was fulfilled for all versions.

I’ve added the “DrawASphereVertex” as you suggested.