Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stdlib)!: Seeded hashing #2170

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

spotandjake
Copy link
Member

@spotandjake spotandjake commented Oct 4, 2024

This pr refactors the hashing function so we do not keep global state, it also implements a new Hash.makeSeeded and Hash.make function which returns a HashInstance. If we want a different api I am happy to change things around.

API:

let globalHash = Hash.make()
assert Hash.hash(globalhash, 1) == Hash.hash(globalHash, 1)
assert Hash.hash(globalhash, "a") == Hash.hash(globalHash, "a")
let seeded1 = Hash.makeSeeded(1)
assert Hash.hash(seeded1, 1) == Hash.hash(seeded1, 1)
assert Hash.hash(seeded1, "a") == Hash.hash(seeded1, "a")
let seeded2 = Hash.makeSeeded(2)
assert Hash.hash(seeded1, 1) != Hash.hash(seeded2, 1)

Questions:

  • Should we generate a seperate seed on Hash.make every time or do we want to use the globalSeed? Maybe if we want a globalSeed we should provide it a makeGlobal.
  • Should we use makeSeeded with a set seed on Map and Set so we can get rid of the possible exception.

My original implementation had the signature of Hash.make: (seed: Number) => (anything: a) => Number but this causes our inference engine to lock the hashing function a particular data type when you do something like:

let hash = Hash.make()
assert hash("H") == hash("H")
assert hash(1) == hash(1) // fails to typecheck because `a` is now `String` so you would need to make a separate instance for each dataType.

The alternative approach suggested was to use a submodule for seeded but as the api's would differ this did not seem ideal.

One benefit to this approach is because we call Random.random() on make we will not throw a possible exception on Hash.hash which means all of our Set and Map functions besides make wont throw.

Succeeds: #1952

This is breaking as we changed the type of Hash.hash from (a) => Number to (HashInstance, a) => Number.

stdlib/hash.gr Outdated Show resolved Hide resolved
@spotandjake
Copy link
Member Author

I applied that change.

Another small question on this pr, should we fix the seed used for Map and Set to keep the order consistent between runs on operations like toList this helps to preserve same input == same output semantics.

@ospencer
Copy link
Member

Another small question on this pr, should we fix the seed used for Map and Set to keep the order consistent between runs on operations like toList this helps to preserve same input == same output semantics.

We don't guarantee order currently and we don't want to make a change that makes people think the order is guaranteed, unless we decide we want to guarantee order. Either way, that's not a decision for this PR.

@ospencer ospencer changed the title feat(stdlib)!: Add seededHash feat(stdlib)!: Seeded hashing Nov 7, 2024
@ospencer ospencer added this pull request to the merge queue Nov 7, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants