-
-
Notifications
You must be signed in to change notification settings - Fork 586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SA: Implement schema and methods for (account, hostname) pausing #7490
base: main
Are you sure you want to change the base?
Conversation
@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few high-level suggestions to simplify the new gRPC API methods:
- Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.
- CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated
message Identifier { string type = 1; string value = 2 }
. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA. - I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.
Doing all three of these would reduce the new API surface to just three methods: CheckPairsPaused, PausePair, UnpauseAccount.
There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.
On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependant on business logic inside of the storage layer.
Fair. I'll fix this.
You're right, UnpausePair was an oversight on my part.
Thanks, I'll add this. |
I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction. |
I agree and as long as I've got support from you I'm happy to implement this as described above. |
ec34ece
to
588a35d
Compare
8918fe4
to
8e95c04
Compare
8e95c04
to
7adf408
Compare
// provided account. If no matches are found, an empty slice is returned. If a | ||
// non-DNS identifier is provided, an error is returned. | ||
func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) { | ||
if len(ids) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ids
is a []pauseId
so this could just be a nil check because a slice zero value is nil. @aarongable brought up a good point in a different PR that this catches the case where the []pauseids
slice exists but has an obviously-invalid (zero) length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta question, what is the most safe/correct way of checking? https://go.dev/play/p/m4QpBnr6DP5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idiomatic thing to do for slices is to check their length, because it is a single highly-readable check that works on both nil and non-nil-but-empty slices.
func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) { | ||
if len(ids) == 0 { | ||
// No identifier values to check. | ||
return []string{}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could return nil, nil
here since the function returns []string
anyways and you save allocating an slice containing nothing.
|
||
// pausedModel represents a row in the paused table. The pausedAt and unpausedAt | ||
// fields are pointers because they are NULL-able columns. At no point should | ||
// both pausedAt and unpausedAt be non-NULL. Valid states are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At no point should both pausedAt and unpausedAt be non-NULL.
How does this square with the example directly below it?
// - identifier unpaused: pausedAt is non-NULL, unpausedAt is non-NULL
|
||
// pausedModel represents a row in the paused table. The pausedAt and unpausedAt | ||
// fields are pointers because they are NULL-able columns. At no point should | ||
// both pausedAt and unpausedAt be non-NULL. Valid states are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// both pausedAt and unpausedAt be non-NULL. Valid states are: | |
// both pausedAt and unpausedAt be NULL. Valid states are: |
// checkIdentifiersPaused takes a slice of identifiers of type "dns" and returns | ||
// a slice of the first 15 identifier values which are currently paused for the | ||
// provided account. If no matches are found, an empty slice is returned. If a | ||
// non-DNS identifier is provided, an error is returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, for future readers, we should note that this decision to reject non-DNS identifiers is not about the existence of non-DNS identifiers themselves, but rather about the fact that we don't (yet) want this function to eat the complexity of supporting more than one identifier type at a time.
rpc CheckIdentifiersPaused (CheckIdentifiersPausedRequest) returns (Hostnames) {} | ||
rpc GetPausedIdentifiersForAccount (RegistrationID) returns (Hostnames) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not have these return Identifiers? We're being flexible enough to support taking different kinds of identifiers as input, we should do the same for results.
@@ -96,6 +100,8 @@ service StorageAuthority { | |||
rpc UpdateRevokedCertificate(RevokeCertificateRequest) returns (google.protobuf.Empty) {} | |||
rpc LeaseCRLShard(LeaseCRLShardRequest) returns (LeaseCRLShardResponse) {} | |||
rpc UpdateCRLShard(UpdateCRLShardRequest) returns (google.protobuf.Empty) {} | |||
rpc PauseIdentifier(PauseIdentifierRequest) returns (PauseIdentifierResponse) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
truly a question: would it make sense for this to take a plural set of identifiers to pause? If the RA detects that they've hit the limit for many names at the same time (which wouldn't surprise me, in the median case of requesting foo.co and www.foo.co) then we could pause all of them with a single request.
// provided account. If no matches are found, an empty slice is returned. If a | ||
// non-DNS identifier is provided, an error is returned. | ||
func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) { | ||
if len(ids) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idiomatic thing to do for slices is to check their length, because it is a single highly-readable check that works on both nil and non-nil-but-empty slices.
func (ssa *SQLStorageAuthority) GetSerialsByAccount(req *sapb.RegistrationID, stream sapb.StorageAuthority_GetSerialsByAccountServer) error { | ||
return ssa.SQLStorageAuthorityRO.GetSerialsByAccount(req, stream) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wrapper has been erroneously re-added, and the similar wrappers below are no longer necessary, thanks to #7501
// a slice of the first 15 identifier values which are currently paused for the | ||
// provided account. If no matches are found, an empty slice is returned. If a | ||
// non-DNS identifier is provided, an error is returned. | ||
func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this function (and the similar complex helper functions below) in this file? These are each called in only one place, and they have no direct tests (all of the tests operate at the SA level). Why not inline their logic directly into the SA methods? This would also prevent having to deal with the annoyance of ensuring the executor they're called with is actually a transaction.
Add the storage implementation for our new (account, hostname) pair pausing feature.
This PR adds the functionality, logic will be implemented in the WFE and RA, respectively.
Part of #7406
Part of #7475