New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add OrgAuthToken
model
#50409
feat: add OrgAuthToken
model
#50409
Conversation
src/sentry/models/orgauthtoken.py
Outdated
"sentry.Project", related_name="projects", through=OrgAuthTokenProject | ||
) | ||
# max JWT token length is 2048, +7 for our custom prefix (sntrys_) | ||
token = models.CharField(max_length=2055, unique=True, null=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sure what I should make the max length here. on the RFC there is a comment that we should store this in hashed form, if we do this I guess we need a more generic length here...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mitsuhiko why would we need to store it hashed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great thanks! I supposed it was for security reasons then it makes sense.
Here we should make the length, the length of the hash. I don't expect we will need to store the token in any other format besides the hashed one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning on storing the encoded JWT? or only the important claims within the token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this to be explicit that we store token_hashed
in the DB.
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #50409 +/- ##
==========================================
+ Coverage 80.14% 81.15% +1.01%
==========================================
Files 4843 4845 +2
Lines 203733 203799 +66
Branches 11130 11130
==========================================
+ Hits 163275 165401 +2126
+ Misses 40212 38152 -2060
Partials 246 246
|
Note: I put these new models into the control silo, mirroring the api token model, I think that is the correct place? |
|
||
|
||
@control_silo_only_model | ||
class OrgAuthTokenProject(Model): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should put this model in orgauthtoken.py
since it belongs to the same cluster of models related to org auth tokens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after talking this through with @mitsuhiko , we decided to skip the projects for now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the nature of the relationship, it's not a big deal at all for the db.
src/sentry/models/orgauthtoken.py
Outdated
"sentry.Project", related_name="projects", through=OrgAuthTokenProject | ||
) | ||
# max JWT token length is 2048, +7 for our custom prefix (sntrys_) | ||
token = models.CharField(max_length=2055, unique=True, null=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mitsuhiko why would we need to store it hashed?
src/sentry/models/orgauthtoken.py
Outdated
# max JWT token length is 2048, +7 for our custom prefix (sntrys_) | ||
token = models.CharField(max_length=2055, unique=True, null=False) | ||
label = models.CharField(max_length=255, null=False) | ||
scope_list = ArrayField(of=models.TextField) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we statically type the list of scopes? Having such a generic field is a bit suspicious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for reference, this is how the scope list is currently stored in other models. But we may improve on this here :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible i would like to bound it list of elements, maybe like:
from django.db import models
from django.core.exceptions import ValidationError
class MyModel(models.Model):
OPTIONS = [
('option1', 'Option 1'),
('option2', 'Option 2'),
('option3', 'Option 3'),
]
choices_array = models.ArrayField(
models.TextField(),
validators=[],
)
def validate_choices_array(value):
for choice in value:
if choice not in dict(MyModel.OPTIONS).keys():
raise ValidationError(f"{choice} is not a valid choice.")
choices_array.validators.append(validate_choices_array)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I implemented this!
src/sentry/models/orgauthtoken.py
Outdated
label = models.CharField(max_length=255, null=False) | ||
scope_list = ArrayField(of=models.TextField) | ||
|
||
user_added = FlexibleForeignKey("sentry.User", null=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
user_added
refers to the User
that created this token? If so, I would call this user
or added_by
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's the idea - I'll rename it to added_by
(IMHO user
is a potentially confusing as it may imply this token belongs to that user, which it doesn't).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, that's a good point!
src/sentry/models/orgauthtoken.py
Outdated
|
||
user_added = FlexibleForeignKey("sentry.User", null=True) | ||
date_added = models.DateTimeField(default=timezone.now, null=False) | ||
date_last_used = models.DateTimeField(null=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having a null
field, we could have it default to timezone.now
and update it when the user uses it, but this changes the semantics. I don't know which ones are more representative of the situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO it's useful to know that the token was never used, I'd say?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could technically infer it if the two dates are equal but ofc it's not the best way. My main concern was that having nullability in the db is not very nice but I understand your point.
src/sentry/models/orgauthtoken.py
Outdated
"sentry.Project", related_name="projects", through=OrgAuthTokenProject | ||
) | ||
# max JWT token length is 2048, +7 for our custom prefix (sntrys_) | ||
token = models.CharField(max_length=2055, unique=True, null=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning on storing the encoded JWT? or only the important claims within the token?
src/sentry/models/orgauthtoken.py
Outdated
project_last_used = HybridCloudForeignKey( | ||
"sentry.Project", null=True, blank=True, on_delete="cascade" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you want cascading here? If the last used project is deleted should the org token be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point @markstory, I missed this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mydea we can just put the project id here as a BoundedBigIntegerField
and forget about possible accidental cascading deletions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good point, true! So it should def. not cascade, but should we use HubridCloudForeignKey
or BoundedBigIntegerField
then? Happy to use BoundedBigIntegerField, just wanting to make sure to get the hybrid could stuff right...!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(BTW also created_by
should not cascade but set to null as well, adjusted that too!)
|
||
def generate_token(org_slug: str): | ||
jwt_payload = { | ||
"iss": "sentry.io", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be use even when the instance is self-hosted/single-tenant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, valid question... took this from here: https://github.com/getsentry/rfcs/pull/91/files#diff-3109d1f30b4b81085e85d841deafefb0d1b43cc5345c64514475512a4ce9fdeeR94
IMHO it should be something consistent that says "this is from sentry", and not differ when it is self-hosted/single-tenant. Not sure if sentry.io
is the best value for this then, but I'd say it's fine, maybe (just "Sentry" may also be ambiguous, ...)
"iat": datetime.utcnow(), | ||
"nonce": uuid4().hex, | ||
"sentry_site": settings.SENTRY_OPTIONS["system.url-prefix"], | ||
"sentry_org": org_slug, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll also need the organizations 'root' URL as in the future, we'll have multiple regions (with different API urls). While customer traffic can continue using https://sentry.io/api/0/**
there will be latency overhead (and cost to us) so we should encourage using regional domains instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this different from "sentry_site": settings.SENTRY_OPTIONS["system.url-prefix"],
? that is what this is designed to do (but honestly I am not 100% sure what fields are what there, @mitsuhiko pointed me to this settings field)
So to be clear, the only point of the sentry_site
field is exactly what you mentioned, if we should store something else in there, happy to be pointed to it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, organizations will have a couple URLs:
sentryUrl
which is our root domain and is the same assystem.url-prefix
. ex.sentry.io
organizationUrl
which is domain for the organization's UI. exacme.sentry.io
regionUrl
Which is the domain that the organization's API endpoints are available on. exus.sentry.io
Any API requests sent to sentryUrl
will be routed to the correct region, but will suffer a latency penalty (because we're proxying the request to the region) and the request will go to the US (this can matter for EU customers).
For this scenario, I think you'll want both the sentryUrl
and regionUrl
available in the JWT. There are a small number of API endpoints that are only available on sentryUrl
(users, integrations, sentry apps).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do I get regionUrl
from? (I left a comment in the RFC as well to make sure this is aligned - I'd go with sentry_url
and sentry_region_url
then)
src/sentry/models/orgauthtoken.py
Outdated
project_last_used = HybridCloudForeignKey( | ||
"sentry.Project", null=True, blank=True, on_delete="cascade" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
project_last_used = HybridCloudForeignKey( | |
"sentry.Project", null=True, blank=True, on_delete="cascade" | |
) | |
project_last_used = BoundedBigIntegerFieldd(db_index=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this preferred over the HybridCouldForeignKey
? 🤔 Feels like that is designed for exactly this use case, but honestly not entirely sure... 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option would be to use HybridCloudForeignKey("sentry.Project", null=True, blank=True, on_delete="SET_NULL")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have the expertise to judge what is better, since I lack in-depth knowledge of hybrid cloud. @markstory could you give an input here, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying my comment from the issue:
#50144 (comment)
Basically, is there a tech spec for this? And have we properly investigated what introducing a new token type means?
This PR has a migration; here is the generated SQL for --
-- Create model OrgAuthToken
--
CREATE TABLE "sentry_orgauthtoken" ("id" bigserial NOT NULL PRIMARY KEY, "organization_id" bigint NOT NULL, "token_hashed" varchar(3000) NOT NULL UNIQUE, "token_last_characters" varchar(4) NULL, "name" varchar(255) NOT NULL, "scope_list" text[] NULL, "date_added" timestamp with time zone NOT NULL, "date_last_used" timestamp with time zone NULL, "project_last_used_id" bigint NULL, "date_deactivated" timestamp with time zone NULL, "created_by_id" integer NULL);
ALTER TABLE "sentry_orgauthtoken" ADD CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id" FOREIGN KEY ("created_by_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_orgauthtoken" VALIDATE CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id";
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_organization_id_17552a60" ON "sentry_orgauthtoken" ("organization_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_token_hashed_4c36306d_like" ON "sentry_orgauthtoken" ("token_hashed" varchar_pattern_ops);
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_project_last_used_id_0b9aa639" ON "sentry_orgauthtoken" ("project_last_used_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_created_by_id_5e2288f9" ON "sentry_orgauthtoken" ("created_by_id"); |
Hey, there is an RFC getsentry/rfcs#91 by @mitsuhiko. If you have any concerns about this on a fundamental level, that's the place to discuss this, I guess. My understanding is that the basic decision that we want to add new org-level tokens has been taken, but @mitsuhiko can probably give you more information if you have more fundamental questions! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR LGTM but I can't judge the usage of the HybridCloudForeignKey
. The only thing I can say is that semantically it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR LGTM but I can't judge the usage of the HybridCloudForeignKey
. The only thing I can say is that semantically it makes sense.
This PR has a migration; here is the generated SQL for --
-- Create model OrgAuthToken
--
CREATE TABLE "sentry_orgauthtoken" ("id" bigserial NOT NULL PRIMARY KEY, "organization_id" bigint NOT NULL, "token_hashed" text NOT NULL UNIQUE, "token_last_characters" varchar(4) NULL, "name" varchar(255) NOT NULL, "scope_list" text[] NULL, "date_added" timestamp with time zone NOT NULL, "date_last_used" timestamp with time zone NULL, "project_last_used_id" bigint NULL, "date_deactivated" timestamp with time zone NULL, "created_by_id" integer NULL);
ALTER TABLE "sentry_orgauthtoken" ADD CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id" FOREIGN KEY ("created_by_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_orgauthtoken" VALIDATE CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id";
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_organization_id_17552a60" ON "sentry_orgauthtoken" ("organization_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_token_hashed_4c36306d_like" ON "sentry_orgauthtoken" ("token_hashed" text_pattern_ops);
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_project_last_used_id_0b9aa639" ON "sentry_orgauthtoken" ("project_last_used_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_created_by_id_5e2288f9" ON "sentry_orgauthtoken" ("created_by_id"); |
This PR has a migration; here is the generated SQL for --
-- Create model OrgAuthToken
--
CREATE TABLE "sentry_orgauthtoken" ("id" bigserial NOT NULL PRIMARY KEY, "organization_id" bigint NOT NULL, "token_hashed" text NOT NULL UNIQUE, "token_last_characters" varchar(4) NULL, "name" varchar(255) NOT NULL, "scope_list" text[] NULL, "date_added" timestamp with time zone NOT NULL, "date_last_used" timestamp with time zone NULL, "project_last_used_id" bigint NULL, "date_deactivated" timestamp with time zone NULL, "created_by_id" integer NULL);
ALTER TABLE "sentry_orgauthtoken" ADD CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id" FOREIGN KEY ("created_by_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_orgauthtoken" VALIDATE CONSTRAINT "sentry_orgauthtoken_created_by_id_5e2288f9_fk_auth_user_id";
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_organization_id_17552a60" ON "sentry_orgauthtoken" ("organization_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_token_hashed_4c36306d_like" ON "sentry_orgauthtoken" ("token_hashed" text_pattern_ops);
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_project_last_used_id_0b9aa639" ON "sentry_orgauthtoken" ("project_last_used_id");
CREATE INDEX CONCURRENTLY "sentry_orgauthtoken_created_by_id_5e2288f9" ON "sentry_orgauthtoken" ("created_by_id"); |
This adds a new
OrgAuthToken
model:token
which is expected to be a JWT tokenref #50144
based on RFC getsentry/rfcs#91