Prototype: unicode string support #1517
base: mc-1.19.x
tbh I do not believe that this has any chance of being merged into CC:T. Best of luck, but I feel like a lot of features like this have been rejected before because they would conflict with the mod's "feel".
I really hope that this can find some kind of compromise, because Unicode support is something you'd expect from a terminal. And it would make stuff so much easier.
Personally, I'm fully in support of such an addition. This is obviously an enormous change, but I believe the current status quo of only having Latin plus a few extra chars creates a pointless barrier to entry for people from different cultures and is not the way forward, even if this takes a while to polish. I think the potential to slightly "break the feel", as dev1955 pointed out, is worth it to allow people unfamiliar with English to use the mod.
I don't think the ROM even supports multiple languages, so it sounds kinda pointless as a way of making it more accessible for newbies. I'm not against this change, as I use older versions anyway, but this still feels too drastic.
I guess I worded this a bit poorly. I didn't mean that newbies would be able to set the ROM to their language or code in it, but that non-technical users could interact with CC programs others made in their native tongue, e.g. a shop or a dashboard.
O_o That's a lot of new functions to add and import...
Thank you for looking into this. I realise this is a bit of a pain, but I think it probably makes sense to do this work in two stages/two PRs:
Quick response to point 1:
As for the cloning hell in point 2, it is because we cannot distinguish when we expect a Latin-1 string and when we expect a UTF-8 string. I believe using
that could ruin the subsequent calls if someone forgets to restore state. As for the duplicated event issue, would it be better if we send two params for
FWIW, my approach for events in my test was to use the same event name, but adding a second parameter with the

I didn't do a good job at describing every change I made in my test back then, but you could take a look at the ROM patches for inspiration. I still think this is the most elegant solution, even if it ends up adding a bunch of extra Unicode options to functions.

Also, I don't really like the idea of making people interact with the UTF-8 representations of strings directly. It's really easy to slip up and end up putting it into a normal string function, which would destroy the codepoints and/or not function correctly. It's good that there's a usermode library to help, but IMO we shouldn't be exposing the raw encoding data to users unless they specifically ask for it (e.g. an
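To illustrate the concern about passing UTF-8 bytes to normal string functions, here is a minimal sketch in plain Lua. It does not use the PR's utflib; `utf8Len` is a hypothetical helper written only for this example, and it assumes its input is valid UTF-8.

```lua
-- "héllo" written as UTF-8 bytes: é is the two bytes 195, 169.
local s = "h\195\169llo"

-- The length operator counts bytes, not characters.
local byteLen = #s               -- 6 bytes for 5 visible characters

-- string.sub slices by byte, so this cuts é in half and leaves an
-- invalid UTF-8 sequence behind.
local cut = s:sub(1, 2)          -- "h\195"

-- Hypothetical helper: count code points by counting every byte that is
-- NOT a continuation byte (10xxxxxx, i.e. 128-191). Assumes valid UTF-8.
local function utf8Len(str)
    local _, n = str:gsub("[^\128-\191]", "")
    return n
end

local charLen = utf8Len(s)       -- 5
```

A wrapper type that hides the byte representation avoids this class of mistake, at the cost of a larger API surface.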
merge separated unicode modules/functions back to their normal variant.
How's this going? It would be a really great feature. I'm following the progress.
This aims to address issue #860 about reading and writing Unicode characters in terminals.
This pull request mostly adopts the first route in the discussion ("separate versions of methods for unicode"). (Edit: no longer valid since the commit on 12 July.)

Additions

utflib api
This API provides a UTFString "class" that wraps a UTF-8-encoded byte string and acts as a normal string. Functions that are provided in the standard string library, except string.dump, string.pack, string.packsize and string.unpack, are also provided in UTFString. Users can use this to adapt Unicode strings into their old system painlessly. If users want to get a Latin-1 string from a UTFString, they can use UTFString:toLatin(). Otherwise tostring will return the backend byte string.

Besides UTFString, the module also exports the following functions:
fromLatin(str): treats the string as entirely Latin-1 and converts it to UTF-8. This function is provided because UTFString(str) assumes the string is already UTF-8-encoded, and only treats invalid byte subsequences as Latin-1 and converts them.
isUTFString(v): returns true if v is a UTFString.
wrapStr(str): wraps a Lua string so that a normal string can be compared with a Unicode string.
isStringWrapper(v): returns true if v is a string wrapper from wrapStr(str).
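
The reason for having both fromLatin(str) and UTFString(str) can be shown with a minimal sketch in plain Lua. This is not the PR's implementation: `latin1ToUtf8` is a hypothetical stand-in for what a function like fromLatin(str) would need to do, namely encode every byte as the code point of the same value.

```lua
-- Hypothetical stand-in for fromLatin(str), not the utflib code itself.
-- Bytes 0-127 are identical in Latin-1 and UTF-8; bytes 128-255 become
-- the two-byte sequence 110000xx 10xxxxxx.
local function latin1ToUtf8(str)
    return (str:gsub("[\128-\255]", function(c)
        local b = c:byte()
        return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
    end))
end

-- "é" is the single byte 233 in Latin-1 and the bytes 195, 169 in UTF-8.
local converted = latin1ToUtf8("\233")

-- Feeding text that is already UTF-8 through the same function encodes
-- each byte again, which is the mojibake "Ã©".
local doubled = latin1ToUtf8("\195\169")
```

Because the unconditional conversion corrupts input that is already UTF-8, a separate entry point that accepts valid UTF-8 as-is and only falls back to Latin-1 for invalid bytes covers the other case.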

shell.unicode, edit.unicode, lua.unicode settings
New settings allow the shell, edit and lua programs to receive and print Unicode strings. These settings will not affect other programs, especially user-defined programs.

Changes

TermMethods.write and TermMethods.blit functions
Now they accept UTFString properly, and do not write "table: 0x??????" on screen.

edit program when Unicode text is present.

char and paste events
Now they also send the UTF-8-encoded string as the second parameter.

read function
Now it accepts _bReadUnicode as the 5th argument, indicating whether it should return a UTFString when true, or a normal string otherwise.

Roadmap

Edit
2023-07-12: merged the separate unicode modules/functions back into their normal variants to reduce code duplication hell.