How to keep string literals "u8"

In ImGui, and probably in general, utf-8 string literals must be preceded by the prefix 'u8'. For example, consider the string literal u8"こんにちは!テスト %d" found at: https://github.com/ocornut/imgui/wiki/Loading-Font-Example.

Now, I would like to use .po files to implement my multilingual support, and under such a scheme, I would attempt calls like this: ImGui::Text(Translate("Hello! test"), 123), where the translation function would hopefully return u8"こんにちは!テスト %d" instead of just "こんにちは!テスト %d".

I believe I'm constrained by the following:
(1) PO files contain no u8 prefixes anywhere within the file; all strings are simply enclosed by double quotes
(2) I have no access to the translation function, which is provided by some library.

Within these constraints, how do I ensure that the translated string returned by the library function is still UTF-8, the same as a u8 literal would be, and therefore remains compatible with ImGui?

(I'm new to C++ and this is just a tinkering sort of project, so my question might not make very much sense, and I apologize in advance for that. I also tried searching these forums for "string literal", but the search function appears to be offline at the moment.)
You need to open the file in UTF mode, put a BOM at the start of the file, and write the UTF-8 string to the file.
To read from the file into a u8 variable, define either a char8_t array or a u8string.

In any case, you need the wide variants of everything, e.g. file streams, string streams, IO streams, etc., whatever "wide" means for your OS.
"u8" has no particular meaning beyond the encoding. It basically ensures that the string literal is compiled as an 8-bit, UTF-8 encoded string literal (as opposed to "L", which produces a wide one).

Within a file you don't have string literals (the compiler has nothing to do with it). Thus it is up to you to ensure that the file contains utf-8 characters.
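To illustrate the point above, here is a minimal sketch showing that a u8 literal and the same text arriving as plain bytes (from a UTF-8 file, or from a library returning char*) are byte-for-byte identical. The helper names bytes_of, from_literal and from_bytes are made up for this example. The cast is only there because, from C++20 on, u8 literals have the type const char8_t[] rather than const char[].

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the raw bytes of an 8-bit string literal.
// In C++11 to C++17 a u8 literal is const char[], so the cast does nothing;
// in C++20 and later it is const char8_t[], and the cast reinterprets it as bytes.
template <typename CharT, std::size_t N>
std::string bytes_of(const CharT (&literal)[N]) {
    static_assert(sizeof(CharT) == 1, "expects an 8-bit string literal");
    // N counts the trailing '\0', which is dropped here.
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}

// Two hiragana characters (U+3053 U+3093) written as a u8 literal. Universal
// character names are used so the result does not depend on how this source
// file itself is saved.
inline std::string from_literal() { return bytes_of(u8"\u3053\u3093"); }

// The same text as it would sit in a UTF-8 encoded file: six plain bytes,
// with no prefix of any kind.
inline std::string from_bytes() { return std::string("\xE3\x81\x93\xE3\x82\x93"); }
```

Comparing from_literal() == from_bytes() should hold, which is why no "u8" marker is needed (or possible) inside a .po file: what matters is only that the bytes are UTF-8.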
What do you get back from it without doing anything? What is the return type of Translate(..)? Does it have overloads or other versions?
@malibor, @coder777: I'll make sure that the .po files are in utf-8 and that only wide variant tools/variables are used when dealing with such files.

@jonnin: I've yet to test it (since I'm still working through other parts of the program), but I searched and saw that the return type is "char *" (the actual name of the translation function is GNU "gettext"). I suppose I could wrap a function around this return value from gettext and turn it into whatever type is necessary?

I say this because I noticed that in the original code, while most strings are u8"" literals, some are L"" literals. Consequently, when I use .po files and gettext() for translation, if the result is already compatible with "u8", then I might need to convert such a "u8" result to an "L" string? I've yet to find a way to do so (i.e., convert "u8" to "L").

A quick look at Google search results indicates that I haven't made much progress on understanding UTF-8-related strings. However, I decided to leave a quick reply here in case there are simple answers that will either show how nonsensical my questions are or point to relevant readings/solutions. Finally, thanks to everyone who's already responded.
Most normal C++ compilers treat char as an 8-bit type. You may already be fine (seems likely).
string s = (your char*); // turning the char* into a C++ object may be useful
Or leave it be and use it as a C string. The C tools work in C++, and if your needs are simple enough, the extra copy into a string object may or may not be worthwhile, especially if you turn around and cast it back to char* for another library call after you get it.
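As a rough sketch of that idea: copy the returned char* into a std::string and, while tinkering, check that the bytes really are well-formed UTF-8 before handing them to ImGui. The names to_utf8_string and is_valid_utf8 are hypothetical helpers, not part of gettext or ImGui. Separately, GNU gettext provides bind_textdomain_codeset(domainname, "UTF-8"), which asks the library to return translations in UTF-8; whether it is needed depends on the setup, so treat that as something to try rather than a guaranteed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical wrapper: copies a C string (for example, one returned by a
// translation function) into a std::string. No conversion happens here.
std::string to_utf8_string(const char* translated) {
    assert(translated != nullptr);
    return std::string(translated);
}

// Returns true if the bytes of s are well-formed UTF-8. Rejects stray
// continuation bytes, truncated or overlong sequences, surrogates, and
// code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                      // invalid lead byte
        if (i + len > s.size()) return false;   // sequence cut off at the end
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp[len]) return false;                 // overlong encoding
        if (cp > 0x10FFFF) return false;                    // beyond Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     // surrogate
        i += len;
    }
    return true;
}
```

If is_valid_utf8(to_utf8_string(result)) is false for a translated string, the catalog is probably being returned in some other encoding (Shift-JIS bytes such as "\x82\xB1" fail the check, for instance), and the fix belongs in the .po file or the gettext configuration rather than in the C++ code.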


if the result is already compatible with "u8", then I might need to convert such a "u8" result to a "L" literal ?
Yes, on Windows you need MultiByteToWideChar(CP_UTF8, ...). See:

https://docs.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar
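For comparison, here is a portable sketch of the same conversion that does not call MultiByteToWideChar at all; it is a hand-written decoder swapped in so the example runs on any platform, and utf8_to_wide is a made-up name. It assumes wchar_t holds UTF-16 when it is 2 bytes wide (as on Windows) and UTF-32 otherwise, and it only does basic error checking.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a UTF-8 byte string into a std::wstring.
// Throws std::invalid_argument on malformed input (basic checks only;
// overlong encodings are not rejected in this sketch).
std::wstring utf8_to_wide(const std::string& s) {
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > s.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
        i += len;
    }
    return out;
}
```

On Windows, the API linked above is the more natural choice; this version is mainly useful for seeing what the conversion does, or for code that also has to build elsewhere.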
@coder777

The OP is working with Linux code.
@jonnin, @coder777, @malibor: Thanks very much, and sorry for the delayed response.
(@malibor: I'm using gettext on Windows; sorry for not being clear previously.)