Duplicate word(s) #7

delitescere · 2020-08-30T10:34:27Z

"dusty" appears twice, maybe others? ¯_(ツ)_/¯

DannyBen · 2020-08-30T10:58:41Z

Absolutely right...

There is more than one non-unique adjectives:

thirsty (2 times)
tense (2 times)
sore (2 times)
silent (2 times)
petite (2 times)
lively (2 times)
dusty (2 times)
cute (2 times)
bright (2 times)

The nouns list seems to be ok.

I confirmed that removing these duplicates will cause a different random result (tested on concode).

So, not sure what is the best course of action:

Keep as is.
Change and break the persistence of previous codes.

jjmontesl · 2020-08-31T00:47:34Z

It definitely will.

I think it's reasonable to introduce a backwards-compatibility breaking change. My proposal:

Instead of removing the adjectives, replace them by new adjectives and document this. This wouldn't break current hashes backwards, except for the 50% of the affected words.
Optionally, we can use a comparative/superlative version of the adjectives so it's easy to realize that they were formerly the duplicated ones (/ej thirsty/thristier, tense/tenser, sore/... I am unsure about some of them). I vote yes to this.
Optionally, keep the current list and offer a compatibility mode (I don't think this is necessarily good). I vote no to this.

What do you think?

DannyBen · 2020-08-31T07:13:59Z

What do you think?

Well. I am always in favor of simple solutions that do not leave you with a "bad taste" and definitely ones that do not create additional or alternative technical debt.

So, my opinion is to do one of these:

Option 1: Avoid the breaking change and leave as is.
Option 2: Accept the breaking change and simply remove the duplicates.
Option 3: Embrace the breaking change and remove the duplicates, and use this opportunity to add or update the dictionaries even further.

akram-hameed · 2021-01-06T03:19:47Z

Any further movement on this? I've ported Codenamize to Kotlin and while dealing with a bug caused by the white space in the 'ad hoc' adjective, I found a duplicate for dusty. I am not sure I can see how using the comparative/superlative would work in this case, since it means using additional characters which breaks part of the algorithm, no?

jjmontesl · 2021-01-06T20:11:24Z

I feel so bad for this bug.

After considering this again I'll vote for @DannyBen 's option 1 above.

This effectively reduces the hashing space and increases the chance of collisions a bit, but I think it's the least breaking change.

If this approach caused a limitation to any client, I propose to start working on option 3 and please open a new Issue in order to update the list of adjectives for a future codenamize hash version (removing duplicates and adding a few adjectives).

In any case, for compatibility with current hashes, we need to keep the current list as-is at least for the current version.

jjmontesl · 2021-01-06T20:12:20Z

(I'll keep this issue open for anyone hitting this until we settle on a new future codenamize hashing format).

jjmontesl · 2021-01-06T20:16:51Z

Please let me know of your Kotlin port when ready to link it from the home page. I'll also add notes to README and code to inform about the reduced hashing spaces and non-linear collision chances. Again, being a hashing library, though, I think that maintaining backwards compatibility is more important than reducing collision chances.

xkortex · 2021-04-07T20:10:58Z

My 0.02: I use it primarily for human-readable identifiers for what is backed by full hashes elsewhere. The risk of collisions on short prefixes is decently high to begin with - I don't expect it to be collision resistant even on human scales.

I definitely think they should be changed at some point.

I don't really like the idea of keeping them in the same semantic-space, I'd pick wholly different ones. In fact you might want to check with a tool like editdistance for near-collisions.

Regardless of path taken, changing the word list is definitely a semver-major-breaking-change.

jjmontesl assigned jjmontesl and unassigned jjmontesl Aug 31, 2020

jjmontesl added the bug label Aug 31, 2020

jjmontesl added this to the 2.0 milestone Jan 6, 2021

DannyBen mentioned this issue Feb 3, 2022

Removed duplicate adjective: bright #8

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Duplicate word(s) #7

Duplicate word(s) #7

delitescere commented Aug 30, 2020

DannyBen commented Aug 30, 2020 •

edited

jjmontesl commented Aug 31, 2020 •

edited

DannyBen commented Aug 31, 2020

akram-hameed commented Jan 6, 2021

jjmontesl commented Jan 6, 2021 •

edited

jjmontesl commented Jan 6, 2021 •

edited

jjmontesl commented Jan 6, 2021

xkortex commented Apr 7, 2021

Duplicate word(s) #7

Duplicate word(s) #7

Comments

delitescere commented Aug 30, 2020

DannyBen commented Aug 30, 2020 • edited

jjmontesl commented Aug 31, 2020 • edited

DannyBen commented Aug 31, 2020

akram-hameed commented Jan 6, 2021

jjmontesl commented Jan 6, 2021 • edited

jjmontesl commented Jan 6, 2021 • edited

jjmontesl commented Jan 6, 2021

xkortex commented Apr 7, 2021

DannyBen commented Aug 30, 2020 •

edited

jjmontesl commented Aug 31, 2020 •

edited

jjmontesl commented Jan 6, 2021 •

edited

jjmontesl commented Jan 6, 2021 •

edited