
Algorithm Optimizations #5

Open
wants to merge 29 commits into base: master
Conversation


@marcoonroad marcoonroad commented Nov 25, 2018

Planned optimizations are:

  • Deal directly with Cstruct/bytes rather than plain strings during the signing and verification processes.
  • Increase the encoding base from 16 (hexadecimal) to, for instance, base 32 or base 64. A larger encoding base makes signatures shorter (while private keys and public keys stay the same size).
  • Internal optimizations guided by profiling to catch the library's bottlenecks.
  • Possible parallelization of all processes?
    • Private Key Generation.
    • Public Key Derivation.
    • Message Signing.
    • Signature Verification.

(More discussion here is needed too.)


UPDATE:

New tasks:

  • Winternitz compression with w = 8.
  • Serialize and load from Base64 instead of hex.
  • Use scrypt or another KDF for encryption.
  • Use an authenticated encryption mode.

Bonus:

  • Parallel signing, private key generation, public key derivation and signature verification.

this is an ongoing refactoring process, aiming to optimize the
whole algorithm's performance and memory consumption as well.

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad marcoonroad added enhancement New feature or request question Further information is requested labels Nov 25, 2018
@marcoonroad

marcoonroad commented Mar 2, 2019

UPDATE:

By using Merkle tries (for the Master Public Key) and seed generation (for the Master Private Key), we can leverage and improve a bunch of things.

First and foremost, our one-time algorithm currently plays the PRNG many times to generate a huge number of bits for every piece/part of the one-time Private Key. This is done to reduce collisions, because the keys are used only once. But with Merkle tries and seed generation this is not needed. In fact, Private Key collisions are not a problem: the keys still carry enough bits to cover all practical cases, and Merkle tries maintain their own identity, avoiding identity collisions as well. That is, individual Private Keys can collide without problem; the only real problems would be a collision of all Private Keys at once, or a collision of the tree ordering of their respective Public Keys hashed into the Merkle trie of commitments.

So, in this sense, we can use HD-like "wallets", as found in Bitcoin, for one-time signatures/accounts. Instead of calling random(2ⁿ) (where n is the number of bits) for every piece of the Private Key, we can call randomNum := random(2ⁿ) only once and generate the Private Key's pieces with hash(randomNum || idx), where idx is the index/position of the piece. With our seed generator, we can derive a master/principal random number, generate m random numbers for m Private Keys, hash the master random number with the random number at position i to obtain the seed of Private Key i, and then hash that seed with idx to generate the Private Key's pieces.
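The derivation above can be sketched as follows. This is a minimal illustration only: OCaml's stdlib `Digest` (MD5) stands in for the real hash (BLAKE2B), the `||` concatenation is encoded with a simple separator, and all names are mine, not the library's API.

```ocaml
(* Sketch of the HD-wallet-alike derivation described above.
   Digest (MD5) is a stand-in for the real hash; names are illustrative. *)
let hash s = Digest.to_hex (Digest.string s)

(* Derive the seed of private key i from the single master random number. *)
let key_seed ~master ~key_index =
  hash (master ^ "|" ^ string_of_int key_index)

(* Derive private key piece idx on demand from its key's seed. *)
let key_piece ~seed ~piece_index =
  hash (seed ^ "|" ^ string_of_int piece_index)

let () =
  let master = "output-of-a-single-prng-call" in
  let seed = key_seed ~master ~key_index:0 in
  print_endline (key_piece ~seed ~piece_index:3)
```

The point is that the PRNG is consulted once for `master`; everything else is deterministic hashing, hence reproducible from the seed.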

Needs further discussion.

@marcoonroad

Continuation:

If we build a Merkle trie of depth 3 (with which we can sign 8 messages, 2³ = 8), we only need to play the PRNG 9 times (once for the master random number and once for each of the 8 random private key seeds). That is a huge improvement over playing the PRNG 128 × 16 times (where 128 is the length of the hexadecimal hash and 16 is the space/set of every hexadecimal character).
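For illustration, folding the 2³ one-time public keys into a single commitment root could look like this minimal sketch (stdlib MD5 as a placeholder hash; helper names are mine):

```ocaml
(* Sketch: folding 2^depth one-time public keys into one Merkle root. *)
let hash s = Digest.to_hex (Digest.string s)

let rec merkle_root = function
  | [] -> invalid_arg "merkle_root: no leaves"
  | [root] -> root
  | leaves ->
    (* Hash adjacent pairs; an odd leftover leaf is promoted as-is. *)
    let rec pairwise = function
      | l :: r :: rest -> hash (l ^ r) :: pairwise rest
      | [odd] -> [odd]
      | [] -> []
    in
    merkle_root (pairwise leaves)

let () =
  (* 8 hypothetical one-time public keys (depth 3). *)
  let pks = List.init 8 (fun i -> hash (Printf.sprintf "pk-%d" i)) in
  print_endline (merkle_root pks)
```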

it now uses HD-wallet-alike hashing to generate all the private key pieces

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad

BUMP:

It's also possible to reduce the size of our private keys, signatures and public keys. We only need to use MD5 hashing to fingerprint the message being signed, instead of full-blown BLAKE2B. It's quite nonsensical to use a strong hash merely to reduce the size of the message being signed (and thus sign arbitrary-length messages). A weak hash such as MD5 suffices, because:

  • The only strong invariant needed is the hardness of inverting the Public Key into the Private Key; that is what must make breaking the whole Signature Scheme infeasible.
  • It should be agnostic about whatever kind of message is passed to sign; the only required condition is not to break the deterministic & reproducible nature of the signing and verification operations.

MD5, therefore, would reduce the space and processor power needed to run this Signature Scheme algorithm - all without breaking our security guarantees. Encryption, on the other hand, must use huge key sizes, because Grover's quantum algorithm is said to break short key sizes for AES (in the same sense that Shor's quantum algorithm breaks even huge key sizes for RSA and ECDSA).

@marcoonroad

BUMP v2:

While the signature scheme itself would still be safe using MD5 to reduce the size of the message, applications relying on this library would be broken.

This is due to the weak collision resistance of MD5. And in the context of quantum computers, even SHA256 may finally be broken. In this sense, I should embrace BLAKE2B nevertheless.

Besides these facts, it is still interesting to decouple the hash used for public key verification from the one used for the message fingerprint. That would enable custom hashes for every kind of sub-operation here.

@marcoonroad

Oops, I forgot that! 😆
👉 See issue #4.

@marcoonroad

marcoonroad commented Mar 7, 2019

Due to the HD-wallet-alike nature of the private key pieces, it's possible to generate such pieces on demand. That is, instead of computing all of them beforehand, we can compute them just before the signing and public key derivation processes.

In the end we gain a reduced private key size (to store) and a shorter private key generation time. All of this comes at the cost of slower signing and public key derivation.

Last but not least, it makes it possible for the user to pass a custom private key generated by herself. It is quite common in the crypto-currency world for people to generate private keys using extremely high-quality entropy.


Honestly, as a warning, this breaks part of the security of one-time signing, because it is not known whether the private key was already used or not. It is a risk for the user; if she generates her own keys, she should assume and deal with the consequences of that risk.
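A minimal sketch of that on-demand generation, using OCaml's `lazy` values (names and sizes are illustrative, and stdlib MD5 stands in for the real hash):

```ocaml
(* Sketch: derive private key pieces only when they are actually needed. *)
let hash s = Digest.to_hex (Digest.string s)

let make_pieces ~seed ~count =
  Array.init count (fun i -> lazy (hash (seed ^ "|" ^ string_of_int i)))

let () =
  let pieces = make_pieces ~seed:"user-supplied-or-generated-seed" ~count:128 in
  (* Only the pieces touched by signing/derivation are ever computed. *)
  print_endline (Lazy.force pieces.(0))
```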

@marcoonroad

BUMP:

(diagram: one-time-hash-sig)

Another optimization, this time to reduce the public key size, is to have 4 kinds of keys:

  • Private Key (512-bits, 128-chars hex string)
  • Signing Key (128x16 matrix, made of 512-bits cells)
  • Verification Key (128x16 matrix, made of 512-bits cells)
  • Public Key (512-bits, 128-chars hex string)

The gain here is a short public key, which the user can publish even in a single Twitter post. It would also help with the migration to Merkle Tree Traversal & Authentication Paths, which will enable the few-time signature feature here.

The problem now is a huge signature (gulp)! I can deal with that later, perhaps using some sort of compression algorithm, and so on...
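The four-key pipeline above could be sketched roughly as follows, with sizes shrunk for brevity (4 × 4 instead of 128 × 16) and stdlib MD5 standing in for the real hash; all names here are illustrative, not the library's API:

```ocaml
(* Sketch of the Private -> Signing -> Verification -> Public key pipeline. *)
let hash s = Digest.to_hex (Digest.string s)

let rows = 4 and cols = 4 (* the comment above proposes 128 x 16 *)

(* Private key -> signing key: derive every matrix cell from the private key. *)
let signing_key priv =
  Array.init rows (fun r ->
      Array.init cols (fun c -> hash (Printf.sprintf "%s|%d|%d" priv r c)))

(* Signing key -> verification key: hash each cell once more. *)
let verification_key sk = Array.map (Array.map hash) sk

(* Verification key -> public key: one short fingerprint of the whole matrix. *)
let public_key vk =
  Array.to_list vk
  |> List.map (fun row -> String.concat "" (Array.to_list row))
  |> String.concat ""
  |> hash

let () = print_endline (public_key (verification_key (signing_key "secret")))
```

Only the last fingerprint is published; the big matrices stay internal, which is what keeps the published key short.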

@marcoonroad marcoonroad force-pushed the refactor/algorithm-optimizations branch from 8cb8cb7 to 6efc8b0 Compare March 17, 2019 01:45

@marcoonroad marcoonroad left a comment


In general, I must keep things clean and separated. That means the core library/domain layer must not deal with serialization/parsing, and the tests must be refactored for that. I must also test specific failure cases, and for that I will need specifically defined exceptions (a module called Errors would aggregate all of them). So far I think that's the only thing needed for now.

Bonus points if I parametrize the implementation over any kind of hash algorithm, so it would be a functor injecting hash : bytes -> bytes, and I would refactor everything to be paired with that. Future plans might include separating the verification key hashing, the message hashing and the root hashing (Merkle tree traversal).
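A sketch of that functor-based parametrization (module names are hypothetical, and stdlib MD5 serves as an example injected hash):

```ocaml
(* Sketch: injecting the hash algorithm through a functor. *)
module type HASH = sig
  val digest : bytes -> bytes
end

module Make (Hash : HASH) = struct
  (* The signing/verification internals would all go through Hash.digest. *)
  let fingerprint pub = Hash.digest pub
end

(* Example instantiation with stdlib MD5 as a stand-in hash. *)
module Md5_hash = struct
  let digest b = Bytes.of_string (Digest.to_hex (Digest.bytes b))
end

module Impl = Make (Md5_hash)

let () =
  print_endline (Bytes.to_string (Impl.fingerprint (Bytes.of_string "pub")))
```

Separating the verification-key hash, message hash and root hash would then just mean taking three `HASH` arguments instead of one.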

let fingerprint = Serialization.digest ver_key_bytes in
fingerprint = Utils.bytes_to_string pub
else false
with _ -> false

I should improve the error reporting of the verify function here. A good approach might be to define standard exceptions that are thrown for very specific kinds of errors during verification. The planned CLI on top of this core library layer would just decode the thrown exceptions and print the corresponding human-friendly message on stderr. Kinds of errors could be:

  • Invalid signature format.
  • Signature mismatch (when the signature doesn't match the verification key).
  • Verification failed (when the verification key doesn't match the public key fingerprint).
  • etc...
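That Errors module plus the CLI-side decoding could look like this sketch (the exception and function names are hypothetical, not the library's existing API):

```ocaml
(* Sketch of an Errors module aggregating verification exceptions. *)
exception Invalid_signature_format
exception Signature_mismatch   (* signature doesn't match the verification key *)
exception Verification_failed  (* verification key doesn't match the public key *)

(* The CLI layer would decode exceptions into human-friendly stderr messages. *)
let report = function
  | Invalid_signature_format ->
    prerr_endline "error: invalid signature format"
  | Signature_mismatch ->
    prerr_endline "error: signature does not match the verification key"
  | Verification_failed ->
    prerr_endline "error: verification key does not match the public key"
  | exn -> raise exn
```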

lib/verification.ml (outdated)
let result = cipher |> Cstruct.of_string |> Base64.decode in
let open Option in
result
>>= fun msg ->
msg |> decrypt ~iv ~key |> Cstruct.to_string |> Utils.unpad |> some

A reminder that I should drop PoW as a PBKDF and use a real one instead. 😠 💢 😡 👿

|> Hash.digest
|> Utils.to_hex

let digest pub = pub |> show |> Hash.digest

@marcoonroad marcoonroad Apr 8, 2019


I should rethink the serialization of the public key and signature (text signature + verification key). By using Base64 instead of hex codes to reduce the signature size, I will end up using only newlines as separators/markers, so neither "-" nor "@" would help me (Base64's only non-alphanumeric symbols are +, / and the = padding) - I need the core library layer to accept a list of bytes/strings instead. The responsibility to serialize and parse should be deferred to the topmost layer possible, so as not to pollute my domain logic nor to confuse/mix things, potentially introducing bugs/security loopholes.

It will possibly break some of my tests, so they will need some refactoring to be paired too.

Signed-off-by: Marco Aurélio da Silva <[email protected]>
…of-work pass on Encryption module

Signed-off-by: Marco Aurélio da Silva <[email protected]>
… base64 characters

Signed-off-by: Marco Aurélio da Silva <[email protected]>
… hex payload pieces

Signed-off-by: Marco Aurélio da Silva <[email protected]>
this all was because the FOSSA service was parsing the project as public domain, due to the
text '... <given the public> key ...' in this file

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad

marcoonroad commented Jul 26, 2019

BUMP:

Because public key derivation is deterministic, it's possible to pre-compute the public key alongside the non-deterministic private key generation. So, in simpler terms, the stored private key would be a pair of the actual private key (to sign) and the pre-computed public key (to verify). Public key derivation thus becomes O(1): simply extracting the public key from the runtime-opaque / storage-encrypted private key.


UPDATE:

✔️ Done.
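In sketch form (illustrative names, stdlib MD5 as a placeholder for both the PRNG step and the real derivation):

```ocaml
(* Sketch: pre-compute the public key during private key generation,
   so public key derivation is a mere field access. *)
let hash s = Digest.to_hex (Digest.string s)

type keypair = { priv : string; pub : string }

let generate seed =
  let priv = hash ("priv|" ^ seed) in  (* stands in for PRNG-based generation *)
  let pub = hash ("derive|" ^ priv) in (* the expensive derivation, done once *)
  { priv; pub }

(* O(1): no re-derivation, just extraction. *)
let derive_public kp = kp.pub
```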

…f - public key derivation is O(1)

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad marcoonroad force-pushed the refactor/algorithm-optimizations branch from 855272b to d6f7b50 Compare September 15, 2019 01:31
…d environment variables

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad

BUMP:

To increase reuse, I should refactor things and create libraries/main modules within this hieroglyphs package:

  • hieroglyphs.common library, Hieroglyphs_common module: provides helper functions consumed by all other libraries.
  • hieroglyphs.unsafe_core library, Hieroglyphs_unsafe_core module: the OTS core without state tracking.
  • hieroglyphs.ots library, Hieroglyphs_ots module: provides the core One-Time Signature algorithm with blacklist state.
  • hieroglyphs.merkle library, Hieroglyphs_merkle module: provides the classic few-time Merkle Tries signature.
  • hieroglyphs.links library, Hieroglyphs_links module: provides a transformation of OTS into an unlimited signature scheme by signing the next public key, embedded in the signature - during verification we extract this part.

…ls module into Verification module itself

Signed-off-by: Marco Aurélio da Silva <[email protected]>
… needs winternitz checksum to protect against byte flip attacks)

Signed-off-by: Marco Aurélio da Silva <[email protected]>
…nd-hashing against secret seed

Signed-off-by: Marco Aurélio da Silva <[email protected]>
…keys, signature and verification keys (64 bytes of message plus 2 checksum bytes to protect against byte flip attacks)

Signed-off-by: Marco Aurélio da Silva <[email protected]>
@marcoonroad

UPDATE:

The algorithm now uses Winternitz compression with the w parameter set to 8 bits. The signature is pretty short: just 66 hashes (64 hashes, one for each byte of the message digest, plus 2 hashes for the checksum of the message digest - just to protect against byte flipping attacks). The checksum works in the opposite direction of the digest: if someone flips a byte upward in the digest, the checksum decreases, and if someone flips a byte upward in the checksum, the digest would have to decrease too. Thus it's impossible to forge signatures in this compression mode without knowing the preimages beforehand, something made infeasible by the strong hash used (BLAKE2B in this case).
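The opposite-direction checksum can be illustrated with this sketch (stdlib MD5 stands in for BLAKE2B, so the digest here is 16 bytes instead of 64; names are mine):

```ocaml
(* Sketch of the Winternitz (w = 8) checksum working against the digest. *)
let digest_bytes msg =
  let d = Digest.string msg in (* 16-byte MD5, placeholder for BLAKE2B *)
  List.init (String.length d) (fun i -> Char.code d.[i])

(* Raising a digest byte makes this checksum go down, and vice versa, so a
   forger would need hash preimages to fix both sides at once. *)
let checksum bytes = List.fold_left (fun acc b -> acc + (255 - b)) 0 bytes

let () =
  let d = digest_bytes "hello" in
  Printf.printf "digest bytes: %d, checksum: %d\n" (List.length d) (checksum d)
```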

@marcoonroad

TODO:

I should use some kind of authenticated encryption to hide, store and protect private keys.

marcoonroad and others added 10 commits September 29, 2019 20:00
…ternitz compression

Signed-off-by: Marco Aurélio da Silva <[email protected]>
Signed-off-by: Marco Aurélio da Silva <[email protected]>
… implementation being more GC-friendly

Signed-off-by: Marco Aurélio da Silva <[email protected]>
…ts instead of manually hashing them after string concatenation

Signed-off-by: Marco Aurélio da Silva <[email protected]>
…generation from private key/seed

Signed-off-by: Marco Aurélio da Silva <[email protected]>
Signed-off-by: Marco Aurélio da Silva <[email protected]>
… keys and now signature is under base-64 format instead concatenated hex-strings

Signed-off-by: Marco Aurélio da Silva <[email protected]>