I agree and disagree. If P5 is used as a jumping-off point to learn to code, sooner or later one will be confronted with Unicode.
I’ve now spent quite a few hours digging into this, and I want to leave some links to tutorials for “dummies” for anyone who is interested: One, two, three, four.
With a better understanding, I was able to solve the UTF-8 encode and decode task, and I have already posted it there. The task asked to start from the code points, but I think it also makes a nice tool for analyzing Unicode. So
here
import java.nio.charset.StandardCharsets;

String str = "ऒॐѬЩ∰⋫";

void setup() {
  size(850, 280);
  background(255);
  fill(0);
  textSize(16);
  int tel_1 = 80;  // y position of the current row
  int tel_2 = 50;  // x offset of the next UTF-8 byte within the row
  text("Char Name Unicode UTF-8 (encoding) Decoded", 40, 40);

  // toCharArray() yields UTF-16 code units; these equal the code points
  // only for characters inside the Basic Multilingual Plane.
  char[] myBuffer = str.toCharArray();
  int[] code_points = new int[myBuffer.length];
  for (int i = 0; i < myBuffer.length; i++) {
    println(hex(myBuffer[i]));
    code_points[i] = int(myBuffer[i]);
  }
  printArray(code_points);

  for (int cp : code_points) {
    // Encode the code point to UTF-8 and draw each byte as hex
    byte[] encoded = new String(new int[]{cp}, 0, 1).getBytes(StandardCharsets.UTF_8);
    for (byte b : encoded) {
      text(hex(b), tel_2+530, tel_1);
      tel_2 += 30;
    }
    text(char(cp), 50, tel_1);
    text(Character.getName(cp), 100, tel_1);
    // hex(int) returns 8 digits; strip leading zeros down to 4 digits
    String unicode = hex(cp);
    while (unicode.length() > 4 && unicode.indexOf("0") == 0) unicode = unicode.substring(1);
    text("U+"+unicode, 450, tel_1);
    // Decode the bytes again to check the round trip
    Character decoded = char(new String(encoded, StandardCharsets.UTF_8).codePointAt(0));
    text(decoded, 750, tel_1);
    tel_1 += 30;
    tel_2 = 50;
  }
}
is a slightly modified version that lets you paste in a text string. Processing accepts more characters than I expected, but no emojis. (I don’t know what platform @GoToLoop used to display the emojis.) The Android version (at least APDE) does accept quite a few.
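For anyone who wants to see the bit layout that the UTF-8 bytes in the sketch above follow, here is a minimal sketch in plain Java (not a Processing sketch). The class name `Utf8ByHand` and the method `encode` are made up for illustration, and the code assumes it is given valid code points (it does not reject surrogates or values above U+10FFFF). Because it walks the string by code point instead of by `char`, a character outside the Basic Multilingual Plane such as an emoji comes out as one 4-byte sequence.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8ByHand {
    // Hypothetical helper: encode one code point into UTF-8 by placing the
    // payload bits after the leading-byte and continuation-byte markers.
    // Assumes a valid code point; no validation is done here.
    static byte[] encode(int cp) {
        if (cp < 0x80) {                        // 0xxxxxxx
            return new byte[] { (byte) cp };
        } else if (cp < 0x800) {                // 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else if (cp < 0x10000) {              // 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else {                                // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xF0 | (cp >> 18)),
                (byte) (0x80 | ((cp >> 12) & 0x3F)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        }
    }

    public static void main(String[] args) {
        // U+0912, U+046C, U+2230 and the emoji U+1F600 (a surrogate pair in UTF-16)
        String s = "\u0912\u046C\u2230\uD83D\uDE00";
        s.codePoints().forEach(cp -> {
            byte[] mine = encode(cp);
            byte[] lib = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            StringBuilder hex = new StringBuilder();
            for (byte b : mine) hex.append(String.format("%02X ", b & 0xFF));
            // Print the code point, the hand-made bytes, and whether they match the library
            System.out.println(String.format("U+%04X  %s matches library: %b",
                cp, hex, Arrays.equals(mine, lib)));
        });
    }
}
```

Comparing against `getBytes(StandardCharsets.UTF_8)` in `main` is only a sanity check; in a real sketch the library call is the simpler choice.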
I started at the byte level, searching the string and filling the space left after the leading and continuation bits, but I soon realized that the code would become quite large, so I went one level up and used UnicodeBlocks to filter the string. With this, I was also able to solve the “Extra credit task” for P5.java, using a reverseString() function. See
here
import java.lang.Character.UnicodeBlock;

String str = "as⃝df̅";

void setup() {
  size(350, 150);
  background(255);
  fill(0);
  textSize(30);
  textAlign(CENTER);
  text(str, width/2, height/3);
  String str1 = reverseString(str);
  text(str1, width/2, 2*height/3);
}

// Reverses the string while keeping each combining mark behind its base
// character. Handles a single mark per base character only.
String reverseString(String inStr) {
  char[] myBuffer = inStr.toCharArray();
  StringBuffer sb = new StringBuffer();
  String mb = "";
  for (int i = 0; i < inStr.length(); i++) {
    UnicodeBlock ub = UnicodeBlock.of(myBuffer[i]);
    if (ub == UnicodeBlock.COMBINING_DIACRITICAL_MARKS
      || ub == UnicodeBlock.COMBINING_MARKS_FOR_SYMBOLS
      //|| ub == UnicodeBlock.COMBINING_DIACRITICAL_MARKS_SUPPLEMENT
      //|| ub == UnicodeBlock.COMBINING_HALF_MARKS
      ) mb = mb+'1'; // Storing 'mark' indexes
    else mb = mb+'0';
    sb.append(myBuffer[i]);
  }
  // Reverse both the text and the mark flags
  String nmb = new StringBuilder(mb).reverse().toString();
  sb = sb.reverse();
  int bl = myBuffer.length;
  for (int i = 0; i < bl; i++) {
    // After reversing, a mark sits in front of its base character: swap them.
    // The bounds check guards against a string that starts with a mark.
    if (nmb.charAt(i) == '1' && i+1 < bl) {
      char temp = sb.charAt(i+1);
      sb.setCharAt(i+1, sb.charAt(i));
      sb.setCharAt(i, temp);
    }
  }
  return sb.toString();
}
Whoever set up this task knew how to ‘bully’: the character in the middle is also a combining mark (it is matched by the COMBINING_MARKS_FOR_SYMBOLS block), so if you comment out the “ub == UnicodeBlock.COMBINING_MARKS_FOR_SYMBOLS” line, the order is displayed wrongly again.
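A possible way to avoid listing the blocks one by one is to test the general category of each code point instead. The following is a minimal sketch in plain Java, not the approach used in the sketch above: the names `ReverseWithMarks`, `isMark` and `reverse` are made up for illustration, and it assumes that treating every nonspacing, enclosing or spacing combining mark as part of the preceding character is good enough. It is not full grapheme-cluster segmentation, so things like emoji joined with zero-width joiners are not covered.

```java
class ReverseWithMarks {
    // Hypothetical helper: true if the code point is a combining mark
    // according to its Unicode general category (Mn, Me or Mc).
    static boolean isMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
            || type == Character.ENCLOSING_MARK
            || type == Character.COMBINING_SPACING_MARK;
    }

    // Reverse by clusters: a base code point plus all marks that follow it
    // is copied as one unit, so marks stay attached to their base.
    static String reverse(String in) {
        int[] cps = in.codePoints().toArray();
        StringBuilder out = new StringBuilder();
        int end = cps.length;                 // exclusive end of the current cluster
        for (int i = cps.length - 1; i >= 0; i--) {
            // A cluster starts at a non-mark, or at index 0 if the
            // string happens to begin with a mark.
            if (!isMark(cps[i]) || i == 0) {
                for (int j = i; j < end; j++) {
                    out.appendCodePoint(cps[j]);
                }
                end = i;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "as" + U+20DD + "df" + U+0305, the string from the task
        String s = "as\u20DDdf\u0305";
        System.out.println(s);
        System.out.println(reverse(s));
    }
}
```

Because whole clusters are copied, several marks on one base character stay in their original order, which the swap-based version does not handle.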
@monkstone How does the extra credit task play out in JRubyArt?