Please explain the rationale behind the current alphabet #1

Open
mimi89999 opened this issue Sep 21, 2024 · 2 comments

Comments

@mimi89999

Hello,

I noticed that the alphabet you defined uses all printable characters except space. Why is that? It includes some very problematic characters, such as backslash and double quote, that need to be escaped in JSON and in many programming languages. Why were they included in the alphabet?

@vorakl
Owner

vorakl commented Sep 21, 2024

I explain the rationale in the "The key problem" section of this article: https://vorakl.com/articles/base94/

Basically, this encoding solves a different problem than "convenient embedding in JSON". Base94 uses all the printable characters that have the same ASCII codes in all character sets. That is why no whitespace is included (you can't see any visible difference between ASCII 32 and 10, for example). So the alphabet is limited to 7-bit codes only, excluding every code that has no printable symbol, which leaves codes 33 through 126, 94 characters in total.
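To make the trade-off concrete, here is a minimal Python sketch that builds such an alphabet and checks which of its characters a JSON serializer has to escape. The names `ALPHABET` and `NEEDS_JSON_ESCAPE` are made up for this illustration and are not taken from the project; the character order used by the actual implementation may differ.

```python
import json

# Printable ASCII without the space: codes 33 ('!') through 126 ('~').
# Assumption: plain ascending order, chosen only for this illustration.
ALPHABET = "".join(chr(code) for code in range(33, 127))

# Characters that json.dumps cannot emit verbatim inside a JSON string.
NEEDS_JSON_ESCAPE = [ch for ch in ALPHABET if json.dumps(ch) != f'"{ch}"']

print(len(ALPHABET))
print(NEEDS_JSON_ESCAPE)
```

With Python's standard `json` module, only the double quote and the backslash end up in the second list, which matches the two characters raised in the question.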

The main goal is to extend the alphabet as much as possible while limiting it to characters that are supported everywhere, all the time. For the case you mentioned, Base64 is widely used: its alphabet is limited to 6 bits and uses carefully chosen characters, with small adjustments (https://datatracker.ietf.org/doc/html/rfc4648#page-7) to the original set.
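As a rough illustration of what the larger alphabet buys, the sketch below compares the theoretical amount of information one character can carry for each alphabet size. The helper `bits_per_char` is hypothetical, and the figures are upper bounds only; the density a real encoder reaches depends on how it groups bytes.

```python
import math

def bits_per_char(alphabet_size: int) -> float:
    """Theoretical upper bound on the bits one character can carry."""
    return math.log2(alphabet_size)

for name, size in (("Base64", 64), ("Base94", 94)):
    print(f"{name}: {bits_per_char(size):.2f} bits per character")
```

A 64-character alphabet carries exactly 6 bits per character, while 94 characters allow a little over 6.5 bits at best.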

@vorakl
Owner

vorakl commented Sep 21, 2024

For more context, see another article: https://vorakl.com/articles/stream-encoding/
