Base 2048

When Base 64 is not enough

Encodes up to 11 bits of data per Unicode character, as counted by social media and chat platforms such as Twitter and Discord.
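To make the 11-bit figure concrete, here is a toy sketch of how a byte string can be cut into 11-bit groups, each of which could select one of 2^11 = 2048 characters. The helper name `to_11bit_groups` and the zero-padding of the final group are assumptions made for illustration; the library's real table lookup and tail handling are not shown and may differ.

```python
def to_11bit_groups(data: bytes) -> list[int]:
    """Split data into 11-bit integers (toy illustration, not the library's code)."""
    n_bits = len(data) * 8
    pad = (-n_bits) % 11                 # assumption: zero-pad the last group
    value = int.from_bytes(data, "big") << pad
    total = n_bits + pad
    # Walk from the most significant group down to the least significant one.
    return [(value >> shift) & 0x7FF for shift in range(total - 11, -1, -11)]


groups = to_11bit_groups(b"Hello!")
print(len(groups))                        # 6 bytes = 48 bits -> 5 groups
print(all(0 <= g < 2048 for g in groups))
```

Each group is an index in the range 0 to 2047, which is why a 2048-entry character table is enough to represent it.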

Uses a limited charset within the Basic Multilingual Plane.

Based on the Rust crate rust-base2048, and uses an encoding table compatible with it.

- Charset displayable on most locales and platforms

- No control sequences, punctuation, quotes, or RTL characters
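The properties listed above can be spot-checked with the standard library's `unicodedata` module. This is a minimal sketch, assuming that "no control sequences, punctuation, or RTL characters" corresponds to Unicode general categories starting with C or P and bidirectional classes R and AL. The helper `looks_chat_safe` is hypothetical and is not part of base2048.

```python
import unicodedata


def looks_chat_safe(text: str) -> bool:
    """Return True if every character passes the assumed charset checks."""
    for ch in text:
        if ord(ch) > 0xFFFF:                         # outside the Basic Multilingual Plane
            return False
        if unicodedata.category(ch)[0] in "CP":      # control/other or punctuation
            return False
        if unicodedata.bidirectional(ch) in ("R", "AL"):  # right-to-left scripts
            return False
    return True


print(looks_chat_safe("ϓțƘ໐µ"))   # sample encoder output from this README
print(looks_chat_safe('say "hi"'))  # contains quotes, which are punctuation
```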

Getting Started

pip install base2048

import base2048

base2048.encode(b'Hello!')
# => 'ϓțƘ໐µ'

base2048.decode('ϓțƘ໐µ')
# => b'Hello!'

Up to 2x fewer counted characters compared to Base 64

import zlib
import base64

import base2048

string = ('🐍 🦀' * 1000 + '🐕' * 1000).encode()
data = zlib.compress(string)

b64_data = base64.b64encode(data)
# => b'eJztxrEJACAQBLBVHNUFBBvr75zvRvgxBEkRSGqvkbozIiIiIiIiIiIiIiIiIiIiIiJf5wAAAABvNbM+EOk='
len(b64_data)
# => 84

b2048_data = base2048.encode(data)
# => 'ը྿Ԧҩ২ŀΏਬйཬΙāಽႩԷ࿋ႬॴŒǔ०яχσǑňॷβǑňॷβǑňॷβǯၰØØÀձӿօĴ༎'
len(b2048_data)
# => 46

unpacked = zlib.decompress(base2048.decode(b2048_data)).decode()
len(unpacked)
# => 4000
unpacked[2000:2002]
# => '🦀🐍'
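The lengths in the example above follow from simple arithmetic. The sketch below assumes that Base 2048 output needs one character per started 11-bit group, which matches the two examples in this README but is not taken from the library's source. The function names are hypothetical.

```python
def base64_length(n_bytes: int) -> int:
    """Padded Base 64 length: 4 characters per started 3-byte group."""
    return 4 * -(-n_bytes // 3)


def base2048_length(n_bytes: int) -> int:
    """Assumed Base 2048 length: 1 character per started 11-bit group."""
    return -(-n_bytes * 8 // 11)


# 84 Base 64 characters with one '=' of padding imply a 62-byte payload.
print(base64_length(62))    # 84
print(base2048_length(62))  # 46
print(base2048_length(6))   # 5, as in the 'Hello!' example
```

Under that assumption the ratio approaches 11/6, or about 1.83, for large inputs, which is where the "up to 2x" figure comes from.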

Decode errors report the character position of the failure

----> base2048.decode('༗ǥԢΝĒϧǰ༎ǥ')

DecodeError: Unexpected character 8: ['ǥ'] after termination sequence 7: ['༎']
To catch the error, use either base2048.DecodeError or its base exception, ValueError.

import base2048

try:
    base2048.decode('🤔')
except base2048.DecodeError as e:
    print(e)
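Because DecodeError derives from ValueError, code that accepts any decoder can handle failures without importing the library-specific exception. The sketch below shows one way to do that; `decode_or_none` is a hypothetical helper, and `bytes.fromhex` stands in for `base2048.decode` only so that the block runs with the standard library alone.

```python
from typing import Callable, Optional


def decode_or_none(text: str, decoder: Callable[[str], bytes]) -> Optional[bytes]:
    """Run decoder(text); return None instead of raising on a ValueError."""
    try:
        return decoder(text)
    except ValueError as exc:
        # base2048.DecodeError would also land here, since it subclasses ValueError.
        print(f"could not decode {text!r}: {exc}")
        return None


# Stand-in decoder from the standard library; with the package installed,
# the equivalent call would be decode_or_none('🤔', base2048.decode).
print(decode_or_none("48656c6c6f21", bytes.fromhex))  # b'Hello!'
print(decode_or_none("zz", bytes.fromhex))            # None
```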

License

The code in this project is released under the MIT License.

Related and prior works

JavaScript - base2048

Rust - rust-base2048