New service reference is not unique enough #411

Closed
mngan opened this issue May 9, 2021 · 12 comments · May be fixed by #428

@mngan
Contributor

mngan commented May 9, 2021

Environment

  • Python version: 3.8.5
  • Peering Manager version: main

Steps to Reproduce

Run manage.py migrate on the main branch

Expected Behavior

Migration should run without errors

Observed Behavior

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, devices, extras, net, peering, peeringdb, sessions, taggit, users, utils, webhooks
Running migrations:
  Applying peering.0076_auto_20210426_1944...Traceback (most recent call last):
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: could not create unique index "peering_internetexchange_service_reference_40f6403f_uniq"
DETAIL:  Key (service_reference)=(IX32590-FE3D39S) is duplicated.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/base.py", line 89, in wrapped
    res = handle_func(*args, **kwargs)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/core/management/commands/migrate.py", line 246, in handle
    fake_initial=fake_initial,
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/migration.py", line 126, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/migrations/operations/fields.py", line 244, in database_forwards
    schema_editor.alter_field(from_model, from_field, to_field)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 595, in alter_field
    old_db_params, new_db_params, strict)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/postgresql/schema.py", line 198, in _alter_field
    new_db_params, strict,
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 786, in _alter_field
    self.execute(self._create_unique_sql(model, [new_field.column]))
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 145, in execute
    cursor.execute(sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/milton/peering-manager/venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: could not create unique index "peering_internetexchange_service_reference_40f6403f_uniq"
DETAIL:  Key (service_reference)=(IX32590-FE3D39S) is duplicated.

As you can see, there are multiple collisions just within the IX peering sessions. I think the UUID should be longer, or perhaps a different section of the UUID should be used? With 6 hex digits this is only 24 bits, or about 16M possible values, which seems a little small given the number of sessions you could have. This is an instance of the birthday problem: if I understand it correctly, with 10,000 sessions the chance of a collision is around 94% with the number of bits you have.
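
For reference, the birthday-bound arithmetic behind those percentages can be checked with the standard approximation p ≈ 1 - exp(-n² / 2N); this snippet is only a back-of-the-envelope check, not part of the original report:

    import math

    def collision_probability(n, bits):
        # Birthday-problem approximation: p ~= 1 - exp(-n^2 / (2 * 2^bits))
        return 1 - math.exp(-(n * n) / (2 * 2 ** bits))

    print(collision_probability(10_000, 24))  # ~0.949   -> the ~94% quoted above
    print(collision_probability(10_000, 32))  # ~0.0116  -> the ~1.15% quoted later in the thread
    print(collision_probability(10_000, 48))  # ~1.8e-07 -> the ~0.0000178% quoted later in the thread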

>>> sr=dict()
>>> for i in InternetExchangePeeringSession.objects.all():
...     if i.service_reference not in sr:
...             sr[i.service_reference] = i
...     else:
...             print("COLLISION:", sr[i.service_reference], i, i.service_reference)
...
COLLISION: Equinix Los Angeles - AS11164 - IP 206.223.123.199 SGIX: Peering LAN - AS55518 - IP 103.16.102.13 IX32590-0571AAS
COLLISION: VIX - AS34347 - IP 193.203.0.124 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS41998 - IP 80.81.192.35 IX32590-5B2C89S
COLLISION: VIX - AS42 - IP 2001:7f8:30:0:1:1:0:42 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS28876 - IP 2001:7f8::70cc:0:1 IX32590-9F6012S
COLLISION: Equinix Ashburn - AS13489 - IP 206.126.237.53 DE-CIX Frankfurt: DE-CIX Frankfurt Peering LAN - AS64049 - IP 2001:7f8::fa31:0:1 IX32590-C4C037S
COLLISION: NAPAfrica IX Johannesburg: Peering - AS26415 - IP 2001:43f8:6d0::93 SGIX: Peering LAN - AS45758 - IP 103.16.102.137 IX32590-F0EBC6S
COLLISION: SIX Seattle: MTU 1500 - AS6456 - IP 206.81.81.41 InterLAN: InterLAN Peering Network - AS39347 - IP 2001:7f8:64:225:0:3:9347:1 IX32590-FE3D39S
@jamesditrapani jamesditrapani self-assigned this May 9, 2021
@jamesditrapani jamesditrapani added this to To do in Core via automation May 9, 2021
@gmazoyer
Member

gmazoyer commented May 9, 2021

To be honest, I was afraid this was going to happen. Two ways of fixing this:

  1. Keep generating a reference until we get a unique one, but it's probably going to be costly (a rough sketch follows below)
  2. Generate a reference and use a salt that could be a combination of local ASN, remote ASN and IP address
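
For illustration only, a minimal sketch of option 1, assuming a hypothetical generate-and-check helper and the existing service_reference field; not the project's actual code:

    import secrets

    def generate_unique_reference(model_cls, prefix="IX", hex_digits=6, max_attempts=10):
        # Option 1: re-roll the random suffix until no existing row already uses it.
        # The per-attempt lookup is what makes this increasingly costly as rows grow.
        for _ in range(max_attempts):
            candidate = f"{prefix}-{secrets.token_hex(hex_digits // 2).upper()}S"
            if not model_cls.objects.filter(service_reference=candidate).exists():
                return candidate
        raise RuntimeError("could not generate a unique service reference")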

@jamesditrapani
Contributor

Nice pickup @mngan. If we were to go with an 8-digit hexadecimal value we would reduce our chance of collision to about 1.15% across 10,000 sessions. If we were to expand this to 12 digits we would only have a 0.00001776% chance of a collision, allowing for much more confidence in the uniqueness.

@jamesditrapani
Contributor

jamesditrapani commented May 9, 2021

  1. Keep generating a reference until we get a unique one, but it's probably going to be costly
  2. Generate a reference and use a salt that could be a combination of local ASN, remote ASN and IP address

Replying in order:

  1. Will get more costly as the total session count increases; probably not scalable.
  2. Could work, however your hash will more than likely end up being 32+ digits, which gets a bit large for a reference.

@gmazoyer
Member

gmazoyer commented May 9, 2021

  2. Could work, however your hash will more than likely end up being 32+ digits, which gets a bit large for a reference.

I did not mean to use the values as strings but rather as a seed for a random value generator. We could then get a random integer (hopefully unique) that could be displayed in its hexadecimal representation. Not sure it's the best way (welcome to the world of pseudo-random uniqueness), just a midnight thought :)
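
A rough illustration of that seeding idea, with hypothetical parameter and helper names; it derives the hex value from a hash of the session's own attributes rather than from a pure random draw:

    import hashlib

    def seeded_reference(local_asn, remote_asn, ip_address, prefix="IX", hex_digits=12):
        # Deriving the value from the session's attributes makes it reproducible;
        # collisions can then only happen for identical (local ASN, remote ASN, IP) tuples.
        seed = f"{local_asn}:{remote_asn}:{ip_address}".encode()
        digest = hashlib.sha256(seed).hexdigest()[:hex_digits].upper()
        return f"{prefix}{local_asn}-{digest}S"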

@mngan
Contributor Author

mngan commented May 10, 2021

Based on the probability table on the Wikipedia page, a 12-digit hex number (48 bits) will have about a 1 in a million chance of collision at 24,000 entries, or roughly a 1 in 1,000 chance at 750,000 entries. Those seem pretty reasonable odds for having to re-roll the UUID in the case of a collision. In terms of the length of the string, you can reduce it by using a different encoding: instead of hexadecimal, what if you used base32 or base64? That could reduce it down to around 8 to 10 characters. Base32 is readable because you can use 22 upper-case letters plus digits; with base64 you have to use mixed case and symbols, which isn't great.
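
A quick standard-library illustration of the length difference (48 random bits rendered as hex, base32 and base64):

    import base64
    import secrets

    raw = secrets.token_bytes(6)                        # 48 random bits
    print(raw.hex().upper())                            # 12 hex characters
    print(base64.b32encode(raw).decode().rstrip("="))   # 10 base32 characters (A-Z, 2-7)
    print(base64.b64encode(raw).decode())               # 8 base64 characters (mixed case, +, /)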

The other way is to calculate a monotonically increasing UUID: start at a (random?) point and then keep adding a random value to the last value. It is unlikely you will have enough entries to wrap a 32-bit number, especially if you limit the starting value range and don't make your random offset too large. You may still want collision detection just in case you do wrap, but that seems highly unlikely.
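
A sketch of that monotonic idea; the starting range, step bound and 32-bit wrap-around are illustrative assumptions, not the project's design:

    import secrets

    class MonotonicReference:
        """Start at a random point and add a small random step per allocation."""

        def __init__(self, max_step=1024):
            self.value = secrets.randbelow(2 ** 24)  # limited starting range
            self.max_step = max_step

        def next(self):
            # Always moves forward; collision checking is only needed if the
            # 32-bit space ever wraps, which is unlikely at realistic volumes.
            self.value = (self.value + 1 + secrets.randbelow(self.max_step)) % (2 ** 32)
            return f"{self.value:08X}"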

Using random numbers is easy to code but has surprising edge cases. The monotonically increasing number you have to code yourself, but it should be fairly robust if you give yourself enough space.

@jamesditrapani
Contributor

jamesditrapani commented May 10, 2021

Based on the probability table on the Wikipedia page, a 12-digit hex number (48 bits) will have about a 1 in a million chance of collision at 24,000 entries, or roughly a 1 in 1,000 chance at 750,000 entries. Those seem pretty reasonable odds for having to re-roll the UUID in the case of a collision.

I too would agree these are reasonable odds, and this is the easy fix.

Base32/base64 is a valid option; the only roadblock would be finding enough unique information to encode. Local/remote ASN + peer IP may be enough, but there is a possibility one may have duplicates (e.g. customer sessions using RFC 1918 addresses).

I do like the monotonic approach, and it is something @gmazoyer and I spoke about; I cannot remember why we didn't agree on it, possibly to do with the handling of deleted sessions. In theory, we could just use InternetExchangePeeringSession.objects.count() + 1 and DirectPeeringSession.objects.count() + 1 to increment an integer (padded with zeros up to 8 digits) that makes up the service reference, e.g. IX9268-00000001S (we might see issues once sessions begin to be deleted; I am not 100% sure on how/if we store deleted objects). Migrations would be easy as well: we are already loading InternetExchangePeeringSession.objects.all() and iterating, so a simple enumerate() on that loop would be enough to get a unique integer for each row.

@mngan, @gmazoyer keen to hear both of your thoughts.

@mngan
Contributor Author

mngan commented May 10, 2021

count() + 1 will not work if things get deleted. It needs to be a counter that keeps moving forward regardless of the number of active sessions. I have been meaning to build a system like this for cable/service identifiers. Basically you want a table that tracks a counter based on a key; you can then use the counter in whatever identifier format to make a unique service identifier. The key could be a combination of IX name + peer ASN, or a BGP group name. It could be anything, so long as the key is unique enough to create a distinct namespace for the counter.
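
A rough Django sketch of that counter-table idea, with hypothetical model and field names; the row is locked so concurrent allocations within the same namespace stay sequential:

    from django.db import models, transaction

    class ReferenceCounter(models.Model):
        # One row per namespace, e.g. "IX name + peer ASN" or a BGP group name
        namespace = models.CharField(max_length=255, unique=True)
        value = models.BigIntegerField(default=0)

        @classmethod
        def next_value(cls, namespace):
            # Lock the row so two sessions created at once cannot get the same number
            with transaction.atomic():
                counter, _ = cls.objects.select_for_update().get_or_create(namespace=namespace)
                counter.value += 1
                counter.save()
                return counter.value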

Deleted objects are just deleted, so you will get holes in your data. The soft-delete PR I had would keep the deleted objects around, but there was a clean-up process that could be run to flush them out. Also, when using count(), I believe the soft-delete PR would not count deleted objects unless you explicitly asked it to.

@jamesditrapani
Contributor

We actually already have a unique primary key stored in item.id; we could potentially use this as the unique part of the service ID. Something like the below would force the ID to be generated prior to generating the service reference:

    def save(self, *args, **kwargs):
        """
        Overrides default `save()` to set the service reference if left blank.
        """
        # Save once so the database assigns the primary key
        super().save(*args, **kwargs)

        if not self.service_reference:
            self.service_reference = self.generate_service_reference()

            # Save again to persist the reference derived from the new ID
            return super().save(*args, **kwargs)

The reference could then just pad the ID out to 8 digits or something similar.
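
Following on from that, a hypothetical generate_service_reference() built on the primary key could look like the snippet below; the prefix and local_asn parameter are placeholders for illustration, not the project's actual fields:

    def generate_service_reference(self, local_asn, prefix="IX"):
        # Zero-pad the already-unique primary key instead of rolling a random value,
        # giving references such as IX9268-00000001S
        return f"{prefix}{local_asn}-{self.pk:08d}S"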

@mngan
Contributor Author

mngan commented May 11, 2021

I would be wary about using database IDs directly; it could be fragile unless you were using a table with an auto-increment column explicitly meant to generate the service IDs.

@jamesditrapani
Contributor

I would be wary about using database IDs directly; it could be fragile unless you were using a table with an auto-increment column explicitly meant to generate the service IDs.

Given the ID is the primary key, I cannot see a case where it wouldn't be auto-incremented or would be non-unique, unless you can think of a specific case we may hit?

@mngan
Contributor Author

mngan commented May 11, 2021

I think mainly I am concerned about cases where someone exports the data (without IDs) and re-imports it: the rows get assigned new IDs, and so all the service IDs will change. You want these to be stable.

@jamesditrapani
Contributor

The service reference has been set to a free-text field for the 1.4.0 release. Generating references automatically may be addressed again in the future. For now, I'm going to close this one down.

Core automation moved this from In progress to Done Jun 5, 2021