New service reference is not unique enough #411

Closed
mngan opened this issue May 9, 2021 · 12 comments · May be fixed by #428

@mngan
Contributor

mngan commented May 9, 2021

Environment

  • Python version: 3.8.5
  • Peering Manager version: main

Steps to Reproduce

Run manage.py migrate on the main branch

Expected Behavior

Migration should run without errors

Observed Behavior

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, devices, extras, net, peering, peeringdb, sessions, taggit, users, utils, webhooks
Running migrations:
  Applying peering.0076_auto_20210426_1944...Traceback (most recent call last):
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: could not create unique index "peering_internetexchange_service_reference_40f6403f_uniq"
DETAIL:  Key (service_reference)=(IX32590-FE3D39S) is duplicated.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 89, in wrapped
    res = handle_func(*args, **kwargs)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/commands/migrate.py", line 246, in handle
    fake_initial=fake_initial,
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/migration.py", line 126, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/operations/fields.py", line 244, in database_forwards
    schema_editor.alter_field(from_model, from_field, to_field)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 595, in alter_field
    old_db_params, new_db_params, strict)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/postgresql/schema.py", line 198, in _alter_field
    new_db_params, strict,
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 786, in _alter_field
    self.execute(self._create_unique_sql(model, [new_field.column]))
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 145, in execute
    cursor.execute(sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: could not create unique index "peering_internetexchange_service_reference_40f6403f_uniq"
DETAIL:  Key (service_reference)=(IX32590-FE3D39S) is duplicated.

As you can see, there are multiple collisions just within the IX peering sessions. I think the UUID should be longer, or perhaps a different section of the UUID should be used? With 6 hex digits this is only 24 bits, or about 16M possible values, which seems a little small given the number of sessions you could have. This is an instance of the birthday problem: if I understand it correctly, with 10,000 sessions the chance of a collision is around 94% with the number of bits you have.
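
For reference, the birthday-bound arithmetic behind those percentages can be checked with the standard approximation p ≈ 1 - exp(-n² / 2N); this snippet is only a back-of-the-envelope check, not part of the original report:

    import math

    def collision_probability(n, bits):
        # Birthday-problem approximation: p ~= 1 - exp(-n^2 / (2 * 2^bits))
        return 1 - math.exp(-(n * n) / (2 * 2 ** bits))

    print(collision_probability(10_000, 24))  # ~0.949   -> the ~94% quoted above
    print(collision_probability(10_000, 32))  # ~0.0116  -> the ~1.15% quoted later in the thread
    print(collision_probability(10_000, 48))  # ~1.8e-07 -> the ~0.0000178% quoted later in the thread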

>>> sr=dict()
>>> for i in InternetExchangePeeringSession.objects.all():
...     if i.service_reference not in sr:
...             sr[i.service_reference] = i
...     else:
...             print("COLLISION:", sr[i.service_reference], i, i.service_reference)
...
COLLISION: Equinix Los Angeles - AS11164 - IP 206.223.123.199 SGIX: Peering LAN - AS55518 - IP 103.16.102.13 IX32590-0571AAS
COLLISION: VIX - AS34347 - IP 193.203.0.124 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS41998 - IP 80.81.192.35 IX32590-5B2C89S
COLLISION: VIX - AS42 - IP 2001:7f8:30:0:1:1:0:42 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS28876 - IP 2001:7f8::70cc:0:1 IX32590-9F6012S
COLLISION: Equinix Ashburn - AS13489 - IP 206.126.237.53 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS64049 - IP 2001:7f8::fa31:0:1 IX32590-C4C037S
COLLISION: NAPAfrica IX Johannesburg: Peering - AS26415 - IP 2001:43f8:6d0::93 SGIX: Peering LAN - AS45758 - IP 103.16.102.137 IX32590-F0EBC6S
COLLISION: SIX Seattle: MTU 1500 - AS6456 - IP 206.81.81.41 InterLAN: InterLAN Peering Network - AS39347 - IP 2001:7f8:64:225:0:3:9347:1 IX32590-FE3D39S
@jamesditrapani jamesditrapani self-assigned this May 9, 2021
@jamesditrapani jamesditrapani added this to To do in Core via automation May 9, 2021
@gmazoyer
Member

gmazoyer commented May 9, 2021

To be honest, I was afraid this was going to happen. Two ways of fixing this:

  1. Keep generating a reference until we get a unique one, but it's probably going to be costly (a rough sketch follows below)
  2. Generate a reference and use a salt that could be a combination of local ASN, remote ASN and IP address
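
For illustration only, a minimal sketch of option 1, assuming a hypothetical generate-and-check helper and the existing service_reference field; not the project's actual code:

    import secrets

    def generate_unique_reference(model_cls, prefix="IX", hex_digits=6, max_attempts=10):
        # Option 1: re-roll the random suffix until no existing row already uses it.
        # The per-attempt lookup is what makes this increasingly costly as rows grow.
        for _ in range(max_attempts):
            candidate = f"{prefix}-{secrets.token_hex(hex_digits // 2).upper()}S"
            if not model_cls.objects.filter(service_reference=candidate).exists():
                return candidate
        raise RuntimeError("could not generate a unique service reference")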

@jamesditrapani
Contributor

Nice pickup @mngan. If we were to go with an 8-digit hexadecimal value we would reduce our chance of collision to about 1.15% across 10,000 sessions. If we were to expand this to 12 digits we would only have a 0.00001776% chance of a collision, allowing for much more confidence in the uniqueness.

@jamesditrapani
Contributor

jamesditrapani commented May 9, 2021

  1. Keep generating a reference until we get a unique one, but it's probably going to be costly
  2. Generate a reference and use a salt that could be a combination of local ASN, remote ASN and IP address

Replying in order:

  1. Will get more costly as the total session count increases; probably not scalable.
  2. Could work, however your hash will more than likely end up being 32+ digits, which gets a bit large for a reference.

@gmazoyer
Member

gmazoyer commented May 9, 2021

  2. Could work, however your hash will more than likely end up being 32+ digits, which gets a bit large for a reference.

I did not mean to use the values as strings but rather as a seed for a random value generator. We could then get a random integer (hopefully unique) that could be displayed in its hexadecimal representation. Not sure it's the best way (welcome to the world of pseudo-random uniqueness), just a midnight thought :)
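
A rough illustration of that seeding idea, with hypothetical parameter and helper names; it derives the hex value from a hash of the session's own attributes rather than from a pure random draw:

    import hashlib

    def seeded_reference(local_asn, remote_asn, ip_address, prefix="IX", hex_digits=12):
        # Deriving the value from the session's attributes makes it reproducible;
        # collisions can then only happen for identical (local ASN, remote ASN, IP) tuples.
        seed = f"{local_asn}:{remote_asn}:{ip_address}".encode()
        digest = hashlib.sha256(seed).hexdigest()[:hex_digits].upper()
        return f"{prefix}{local_asn}-{digest}S"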

@mngan
Contributor Author

mngan commented May 10, 2021

Based on the probability table on the Wikipedia page, a 12-digit hex number (48 bits) will have about a 1 in a million chance of collision at 24,000 entries, or roughly a 1 in 1,000 chance at 750,000 entries. Those seem pretty reasonable odds for having to re-roll the UUID in the case of a collision. In terms of the length of the string, you can reduce it by using a different encoding: instead of hexadecimal, what if you used base32 or base64? That could reduce it down to around 8 to 10 characters. Base32 is readable because you can use 22 upper-case letters plus digits; with base64 you have to use mixed case and symbols, which isn't great.
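
A quick standard-library illustration of the length difference (48 random bits rendered as hex, base32 and base64):

    import base64
    import secrets

    raw = secrets.token_bytes(6)                        # 48 random bits
    print(raw.hex().upper())                            # 12 hex characters
    print(base64.b32encode(raw).decode().rstrip("="))   # 10 base32 characters (A-Z, 2-7)
    print(base64.b64encode(raw).decode())               # 8 base64 characters (mixed case, +, /)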

The other way is to calculate a monotonically increasing UUID: start at a (random?) point and then keep adding a random value to the last value. It is unlikely you will have enough entries to wrap a 32-bit number, especially if you limit the starting value range and don't make your random offset too large. You may still want collision detection just in case you do wrap, but that seems highly unlikely.
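
A sketch of that monotonic idea; the starting range, step bound and 32-bit wrap-around are illustrative assumptions, not the project's design:

    import secrets

    class MonotonicReference:
        """Start at a random point and add a small random step per allocation."""

        def __init__(self, max_step=1024):
            self.value = secrets.randbelow(2 ** 24)  # limited starting range
            self.max_step = max_step

        def next(self):
            # Always moves forward; collision checking is only needed if the
            # 32-bit space ever wraps, which is unlikely at realistic volumes.
            self.value = (self.value + 1 + secrets.randbelow(self.max_step)) % (2 ** 32)
            return f"{self.value:08X}"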

Using random numbers is easy to code but has surprising edge cases. The monotonically increasing number you have to code yourself, but it should be fairly robust if you give yourself enough space.

@jamesditrapani
Contributor

jamesditrapani commented May 10, 2021

Based on the probability table on the Wikipedia page, a 12-digit hex number (48 bits) will have about a 1 in a million chance of collision at 24,000 entries, or roughly a 1 in 1,000 chance at 750,000 entries. Those seem pretty reasonable odds for having to re-roll the UUID in the case of a collision.

I too would agree these are reasonable odds, and this is the easy fix.

Base32/base64 is a valid option; the only roadblock would be finding enough unique information to encode. Local/remote ASN + peer IP may be enough, but there is a possibility one may have duplicates (e.g. customer sessions using RFC 1918 addresses).

I do like the monotonic approach, and it is something @gmazoyer and I spoke about; I cannot remember why we didn't agree on it, possibly to do with the handling of deleted sessions. In theory, we could just use InternetExchangePeeringSession.objects.count() + 1 and DirectPeeringSession.objects.count() + 1 to increment an integer (padded with zeros up to 8 digits) that makes up the service reference, e.g. IX9268-00000001S (we might see issues once sessions begin to be deleted; I am not 100% sure on how/if we store deleted objects). Migrations would be easy as well: we are already loading InternetExchangePeeringSession.objects.all() and iterating, so a simple enumerate() on that loop would be enough to get a unique integer for each row.

@mngan, @gmazoyer keen to hear both of your thoughts.

@mngan
Contributor Author

mngan commented May 10, 2021

count() + 1 will not work if things get deleted. It needs to be a counter that keeps moving forward regardless of the number of active sessions. I have been meaning to build a system like this for cable/service identifiers. Basically you want a table that tracks a counter based on a key; you can then use the counter in whatever identifier format to make a unique service identifier. The key could be a combination of IX name + peer ASN, or a BGP group name. It could be anything, so long as the key is unique enough to create a distinct namespace for the counter.
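
A rough Django sketch of that counter-table idea, with hypothetical model and field names; the row is locked so concurrent allocations within the same namespace stay sequential:

    from django.db import models, transaction

    class ReferenceCounter(models.Model):
        # One row per namespace, e.g. "IX name + peer ASN" or a BGP group name
        namespace = models.CharField(max_length=255, unique=True)
        value = models.BigIntegerField(default=0)

        @classmethod
        def next_value(cls, namespace):
            # Lock the row so two sessions created at once cannot get the same number
            with transaction.atomic():
                counter, _ = cls.objects.select_for_update().get_or_create(namespace=namespace)
                counter.value += 1
                counter.save()
                return counter.value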

Deleted objects are just deleted, so you will get holes in your data. The soft-delete PR I had would keep the deleted objects around, but there was a clean-up process that could be run to flush them out. Also, when using count(), I believe the soft-delete PR would not count deleted objects unless you explicitly asked it to.

@jamesditrapani
Contributor

We actually already have a unique primary key stored in item.id; we could potentially use this as the unique part of the service ID. Something like the below would force the ID to be generated prior to generating the service reference:

    def save(self, *args, **kwargs):
        """
        Overrides default `save()` to set the service reference if left blank.
        """
        # Save once so the database assigns the primary key
        super().save(*args, **kwargs)

        if not self.service_reference:
            self.service_reference = self.generate_service_reference()

            # Save again to persist the reference derived from the new ID
            return super().save(*args, **kwargs)

The reference could then just pad the ID out to 8 digits or something similar.
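
Following on from that, a hypothetical generate_service_reference() built on the primary key could look like the snippet below; the prefix and local_asn parameter are placeholders for illustration, not the project's actual fields:

    def generate_service_reference(self, local_asn, prefix="IX"):
        # Zero-pad the already-unique primary key instead of rolling a random value,
        # giving references such as IX9268-00000001S
        return f"{prefix}{local_asn}-{self.pk:08d}S"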

@mngan
Contributor Author

mngan commented May 11, 2021

I would be wary about using database IDs directly; it could be fragile unless you were using a table with an auto-increment column explicitly meant to generate the service IDs.

@jamesditrapani
Contributor

I would be wary about using database IDs directly; it could be fragile unless you were using a table with an auto-increment column explicitly meant to generate the service IDs.

Given the ID is the primary key, I cannot see a case where it wouldn't be auto-incremented or would be non-unique, unless you can think of a specific case we may hit?

@mngan
Contributor Author

mngan commented May 11, 2021

I think mainly I am concerned about cases where someone exports the data (without IDs) and re-imports it: the rows get assigned new IDs, and so all the service IDs will change. You want these to be stable.

@jamesditrapani
Contributor

The service reference has been set to a free-text field for the 1.4.0 release. Generating references automatically may be addressed again in the future. For now, I'm going to close this one down.

Core automation moved this from In progress to Done Jun 5, 2021