Character Encoding Conversion Techniques in C++
// Convert UTF-8 to GB2312
char* ConvertUTF8ToGB(const char* utfInput) {
    int bufferSize = MultiByteToWideChar(CP_UTF8, 0, utfInput, -1, NULL, 0);
    wchar_t* wideBuffer = new wchar_t[bufferSize+1];
    memset(wideBuffer, 0, (bufferSize+1)*sizeof(wchar_t));
    MultiByteToWideChar(CP_UTF8, 0, utfInput, -1, wideBuffer, bufferSize);
...
Posted on Mon, 22 Jun 2026 17:59:29 +0000 by Reformed
Decoding CJK String Length and Display Width in Java
Java stores strings internally using UTF-16 encoding, where String.length() returns the number of 16-bit code units. While most common CJK characters occupy a single code unit, their visual representation in monospaced terminals or grid interfaces typically spans two horizontal cells. Consequently, one Chinese character functions as the equival ...
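The gap between code-unit count and display width can be sketched as follows. This is a minimal illustration, not a full width algorithm: the class name DisplayWidthSketch and the helpers isWide and displayWidth are hypothetical, and the rule that only Han-script code points and the fullwidth forms U+FF01 to U+FF60 take two cells is a simplifying assumption (kana, Hangul, combining marks, and emoji are not handled).

```java
final class DisplayWidthSketch {

    // Assumption: only Han-script code points and the fullwidth forms
    // U+FF01..U+FF60 occupy two cells; every other code point counts as one.
    static boolean isWide(int codePoint) {
        if (Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN) {
            return true;
        }
        return codePoint >= 0xFF01 && codePoint <= 0xFF60;
    }

    // Iterates by code point rather than by char, so a supplementary
    // character stored as a surrogate pair is counted once.
    static int displayWidth(String text) {
        return text.codePoints().map(cp -> isWide(cp) ? 2 : 1).sum();
    }

    public static void main(String[] args) {
        String mixed = "\u4E2D\u6587abc"; // two Han characters followed by "abc"
        System.out.println(mixed.length());      // 5 UTF-16 code units
        System.out.println(displayWidth(mixed)); // 7 cells under the assumption above
    }
}
```

Walking the string with codePoints() instead of charAt() also keeps characters outside the Basic Multilingual Plane from being counted twice, since each of them occupies two code units in String.length().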
Posted on Fri, 05 Jun 2026 18:01:01 +0000 by asolell
Java Basics: Characters and Strings
The char data type represents a single character. Character literals are enclosed in single quotes.
```
char a = 'A';
char b = '4';
char c = '\u0041'; // Unicode escape for the letter A (four hex digits are required)
```
String literals must be enclosed in double quotes, while character literals are single characters enclosed in single quotes. Thus, "A" is a string, and 'A' is ...
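As a small sketch of how the two kinds of literal behave differently, the following compares a char with a one-character String. The class name CharVersusString and the helper codeUnitOf are hypothetical names used only for illustration.

```java
final class CharVersusString {

    // A char widens implicitly to int, which exposes its UTF-16 code unit value.
    static int codeUnitOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char letter = 'A';   // single quotes: a primitive char
        String text = "A";   // double quotes: a String object of length 1

        System.out.println(codeUnitOf(letter));       // 65
        System.out.println(letter == '\u0041');       // true
        System.out.println(text.charAt(0) == letter); // true
        System.out.println(letter + 1);               // 66, because char + int is int arithmetic
        System.out.println(text + 1);                 // A1, because String + int is concatenation
    }
}
```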
Posted on Fri, 15 May 2026 15:18:46 +0000 by davey_b_